RIDX

Depositor-defined Reference IDs

RIDX Requirements

  • Depositor-defined Reference IDs (RIDXs) are defined in the REFERENCE deposition file.

  • An RIDX may refer to the results from a given publication, or a single unpublished dataset.

  • An RIDX should be a string that is meaningful to the depositor.

  • RIDXs must be unique within a source. Where a source contains submissions from multiple datasets or sites, using a unique RIDX for each one will make it possible to distinguish between them in ChEMBL.

  • An entry in the REFERENCE file may include URLs, or simply text descriptions of the collection of data.

  • RIDXs map 1:1 to an internal DOC_ID field. This is enforced by DOC_ID being the Primary Key on the DOCS table. A DOC_ID can never be shared between SRC_IDs.

  • In the ACTIVITIES file, RIDX is referred to as CRIDX; this is for historical programmatic reasons. Its value should be the same as the depositor-defined RIDX used in the REFERENCE, COMPOUND_RECORD and ASSAY files.
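The uniqueness requirement above can be checked before submission. The following is a minimal sketch, not part of the ChEMBL loader: it assumes the REFERENCE file has already been parsed into a list of dictionaries with an `RIDX` key, and the function name `find_duplicate_ridxs` is hypothetical.

```python
from collections import Counter


def find_duplicate_ridxs(reference_rows):
    """Return the RIDX values that occur more than once, sorted alphabetically.

    `reference_rows` is assumed to be a list of dicts, one per REFERENCE entry,
    each with an "RIDX" key (an assumption made for this illustration).
    """
    counts = Counter(row["RIDX"].strip() for row in reference_rows)
    return sorted(ridx for ridx, n in counts.items() if n > 1)


# Example rows for a single source; the second wave is fine, the repeat is not.
rows = [
    {"RIDX": "Smith_KinaseXYZ_wave1"},
    {"RIDX": "Smith_KinaseXYZ_wave2"},
    {"RIDX": "Smith_KinaseXYZ_wave1"},
]
duplicates = find_duplicate_ridxs(rows)
```

An empty result means every RIDX in the parsed rows is unique within that source.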

Suggested format:

Smith_KinaseXYZ_wave1 (abbreviation of the group leader, subject area, specific identifier)

This RIDX is easy to differentiate from others, and easy to track through the deposition files. It makes any data issues during loading easier to resolve, and makes your data easier to search when loaded.

Relations to compounds and assays

  • Each Compound IDX (CIDX) or Assay IDX (AIDX) defined by a depositor can only be assigned to a SINGLE RIDX. The same RIDX may be assigned to multiple CIDXs and AIDXs.

  • A depositor may only assign RIDXs defined by themselves, and may only assign them to their own CIDXs and AIDXs.
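The single-RIDX rule for CIDXs and AIDXs can be sketched as a pre-submission check. This is an illustrative example only: the input is assumed to be (identifier, RIDX) pairs already extracted from the COMPOUND_RECORD or ASSAY file, and the function name `find_multi_ridx_ids` is hypothetical.

```python
from collections import defaultdict


def find_multi_ridx_ids(pairs):
    """Return identifiers (CIDX or AIDX) that are assigned to more than one RIDX.

    `pairs` is an iterable of (identifier, ridx) tuples. The result maps each
    offending identifier to the sorted list of RIDXs it was assigned to.
    """
    seen = defaultdict(set)
    for identifier, ridx in pairs:
        seen[identifier].add(ridx)
    return {i: sorted(r) for i, r in seen.items() if len(r) > 1}


# One RIDX shared by several CIDXs is allowed; one CIDX with two RIDXs is not.
pairs = [
    ("CPD001", "Smith_KinaseXYZ_wave1"),
    ("CPD002", "Smith_KinaseXYZ_wave1"),
    ("CPD003", "Smith_KinaseXYZ_wave1"),
    ("CPD003", "Smith_KinaseXYZ_wave2"),
]
conflicts = find_multi_ridx_ids(pairs)
```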

Default DOC_IDs and RIDXs

Every SRC_ID is associated with a single ‘default’ DOC_ID, which is created alongside the new source. The DOCS table enforces a unique constraint on the ‘SRC_ID / RIDX’ combination. Because multi-field unique constraints in Oracle cannot accept ‘null’ in any field, the string ‘default’ is used for the RIDX field when no RIDX has been specified.

If a depositor chooses not to associate an RIDX with one of their CIDXs, AIDXs or Activities by leaving the field blank, the loader automatically assigns the default RIDX and DOC_ID for that SRC_ID. Using an RIDX that does not match any RIDX defined in the source has the same effect as leaving the field blank. If this was the result of a data error rather than a deliberate choice, the ChEMBL administrator can manually edit the default DOC_ID in consultation with the depositor.
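The fallback behaviour described above can be sketched as follows. This is a simplified illustration, not the loader's actual code: the function name `resolve_ridx` and the whitespace trimming are assumptions.

```python
DEFAULT_RIDX = "default"


def resolve_ridx(value, defined_ridxs):
    """Return the RIDX a record would end up with under the rules above.

    A blank value, or a value that matches no RIDX defined for the source,
    falls back to the default RIDX. Trimming whitespace is an assumption.
    """
    ridx = (value or "").strip()
    if ridx in defined_ridxs:
        return ridx
    return DEFAULT_RIDX


defined = {"Smith_KinaseXYZ_wave1"}
matched = resolve_ridx("Smith_KinaseXYZ_wave1", defined)
blank = resolve_ridx("", defined)
unmatched = resolve_ridx("Smith_KinaseXYZ_wave9", defined)
```

Because an unmatched RIDX silently falls back to the default, a typo in an RIDX will not necessarily surface as an error, so comparing the RIDXs used in each file against those defined in the REFERENCE file before submission is worthwhile.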

Attempting to create or edit an RIDX called ‘default’ causes the loader to raise an error. Putting ‘default’ in the RIDX field of the ACTIVITY file, to confirm that the Activity data set should be assigned to the default, is permitted.
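A depositor-side check for the reserved name might look like the sketch below. It is an assumption-laden illustration: the function name `check_reference_ridx` is hypothetical, and the comparison is an exact, case-sensitive match, which may differ from how the loader compares values.

```python
RESERVED_RIDX = "default"


def check_reference_ridx(ridx):
    """Return the trimmed RIDX, or raise ValueError if it is the reserved name.

    Intended for RIDXs being defined in the REFERENCE file; using the reserved
    name in the ACTIVITY file is permitted and is not checked here.
    """
    cleaned = ridx.strip()
    if cleaned == RESERVED_RIDX:
        raise ValueError("RIDX 'default' is reserved and cannot be defined")
    return cleaned


accepted = check_reference_ridx("Smith_KinaseXYZ_wave1")
```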
