Example dataset

An example dataset, showing the structure of all required files and advice on avoiding common issues.

A simple set of example data files, suitable for use as a template, is available below:

The dataset below is a truncated example showing only the first SDF record and first three rows of each file type. The full files are available in the "Simple Example Dataset" download. Mandatory fields are in bold.

PLEASE NOTE: For ease of viewing, the column headers are shown as rows in this example. When submitting data to ChEMBL, rotate the layout by 90 degrees so that the fields are column headers and each row beneath them is one data entry.
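The rotation described in the note can be sketched in a few lines of Python. This is only an illustration: the function name rotate_to_tsv is hypothetical, the values are copied from the ASSAY_PARAM.tsv example in this document, and the assumption that empty fields should become empty cells is ours rather than something stated by the submission guidelines.

```python
import csv
import io

def rotate_to_tsv(field_rows):
    """Turn [field, value1, value2, ...] rows into header-first TSV text."""
    width = max(len(row) for row in field_rows)
    # Pad short rows so that fields without values still produce empty cells.
    padded = [row + [""] * (width - len(row)) for row in field_rows]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # zip(*rows) swaps rows and columns: the first output row is the header.
    writer.writerows(zip(*padded))
    return out.getvalue()

# Row-oriented view, as displayed on this page (ASSAY_PARAM.tsv example).
assay_param_view = [
    ["AIDX", "PB_FECH", "PB_HMBS"],
    ["TYPE", "CONC", "CONC"],
    ["RELATION", "=", "="],
    ["VALUE", "10", "100"],
    ["UNITS", "uM", "uM"],
    ["TEXT_VALUE"],
    ["COMMENTS"],
]

tsv_text = rotate_to_tsv(assay_param_view)
print(tsv_text)
```

The printed text has one header line followed by one line per assay parameter, which is the orientation the note asks for.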

REFERENCE.tsv

RIDX | Pathogen_Box_Bloggs
PUBMED_ID |
DOI | 10.1016/S0960-894X(01)81052-0
PATENT_ID |
JOURNAL_NAME |
YEAR | 2011
VOLUME |
ISSUE |
FIRST_PAGE |
LAST_PAGE |
REF_TYPE | Dataset
TITLE | Pathogen Box Compounds Screening
AUTHORS | Bloggs, J; Smith, M
ABSTRACT | 400 compounds from the Pathogen box were screened for inhibitory activity against two human proteins.

COMPOUND_RECORD.tsv

CIDX | MMV010764 | MMV026468 | MMV011229
RIDX | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs
COMPOUND_NAME | MMV010764 | MMV026468 | MMV011229
COMPOUND_KEY | MMV010764 | MMV026468 | MMV011229
COMPOUND_SOURCE | | |

COMPOUND_CTAB.sdf


  Mrv0541 04191616472D          

 27 28  0  0  0  0            999 V2000
   -5.0013   -1.2375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2868   -0.8250    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2868    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5724   -1.2375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8579   -0.8250    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1434   -1.2375    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0
   -2.5559   -1.9520    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7309   -0.5230    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4289   -1.6500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.7145   -1.2375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000   -1.6500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000   -2.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7145   -2.8875    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.4289   -2.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1434   -2.8875    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1434   -3.7125    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.8579   -2.4750    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.5724   -2.8875    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.2868   -2.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0013   -2.8875    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0013   -3.7125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.7158   -4.1250    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    4.2868   -4.1250    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.5724   -3.7125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8579   -4.1250    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.7145   -2.8875    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4289   -2.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
  4  5  1  0  0  0  0
  5  6  1  0  0  0  0
  6  7  2  0  0  0  0
  6  8  2  0  0  0  0
  6  9  1  0  0  0  0
  9 10  2  0  0  0  0
 10 11  1  0  0  0  0
 11 12  2  0  0  0  0
 12 13  1  0  0  0  0
 13 14  1  0  0  0  0
 14 15  1  0  0  0  0
 15 16  2  0  0  0  0
 15 17  1  0  0  0  0
 17 18  1  0  0  0  0
 18 19  2  0  0  0  0
 19 20  1  0  0  0  0
 20 21  2  0  0  0  0
 21 22  1  0  0  0  0
 21 23  1  0  0  0  0
 23 24  2  0  0  0  0
 18 24  1  0  0  0  0
 24 25  1  0  0  0  0
 12 26  1  0  0  0  0
 26 27  2  0  0  0  0
  9 27  1  0  0  0  0
M  END
> <CIDX>
MMV161996

$$$$
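Before submitting an SDF file it can help to read back the CIDX tag of every record and compare it with the identifiers used in the other files. The sketch below is a minimal reader written for illustration, not a full SDF parser: the function names are hypothetical, it assumes V2000 records that end with $$$$ and single-line tag values, and the two-atom sample record is made up.

```python
import re

# Matches a data header line such as "> <CIDX>".
TAG_RE = re.compile(r"^>\s*<([^>]+)>")

def parse_record(lines):
    """Extract declared atom/bond counts and data tags from one record."""
    record = {"atoms": None, "bonds": None, "tags": {}}
    tag = None
    for line in lines:
        if line.rstrip().endswith("V2000"):
            # V2000 counts line: atoms in columns 1-3, bonds in columns 4-6.
            record["atoms"] = int(line[0:3])
            record["bonds"] = int(line[3:6])
            continue
        match = TAG_RE.match(line)
        if match:
            tag = match.group(1)
            record["tags"][tag] = ""
        elif tag is not None:
            # The line after a tag header holds its value (blank means empty).
            record["tags"][tag] = line.strip()
            tag = None
    return record

def parse_sdf(sdf_text):
    """Return one dict per record, splitting the text on $$$$ lines."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return records

# Made-up two-atom record, used only to exercise the reader.
SAMPLE = "\n".join([
    "",
    "  example-only header",
    "",
    "  2  1  0  0  0  0            999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.8250    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "M  END",
    "> <CIDX>",
    "HYPOTHETICAL_ID",
    "",
    "$$$$",
])

records = parse_sdf(SAMPLE)
print(records[0]["atoms"], records[0]["bonds"], records[0]["tags"])
```

Applied to the record shown above, the same reader would report 27 atoms, 28 bonds and the CIDX MMV161996. That identifier is not among the three COMPOUND_RECORD.tsv rows shown earlier, presumably because each file is truncated independently in this view.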

ASSAY.tsv

AIDX | PB_FECH | PB_HMBS
RIDX | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs
ASSAY_DESCRIPTION | Compound was evaluated for the inhibition of human FECH at 10uM | Compound was evaluated for the inhibition of human HMBS at 100uM
ASSAY_TYPE | B | B
ASSAY_ORGANISM | Homo sapiens | Homo sapiens
ASSAY_STRAIN | |
ASSAY_TAX_ID | 9606 | 9606
ASSAY_TISSUE | |
ASSAY_CELL_TYPE | |
ASSAY_SUBCELLULAR_FRACTION | |
ASSAY_SOURCE | |
TARGET_TYPE | PROTEIN | PROTEIN
TARGET_NAME | FECH | HMBS
TARGET_ACCESSION | P22830 | P08397
TARGET_ORGANISM | Homo sapiens | Homo sapiens
TARGET_TAX_ID | 9606 | 9606

ASSAY_PARAM.tsv

AIDX | PB_FECH | PB_HMBS
TYPE | CONC | CONC
RELATION | = | =
VALUE | 10 | 100
UNITS | uM | uM
TEXT_VALUE | |
COMMENTS | |

ACTIVITY.tsv

RIDX | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs
CRIDX | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs | Pathogen_Box_Bloggs
CRIDX_DOCID | | |
CRIDX_CHEMBLID | | |
CIDX | MMV161996 | MMV202458 | MMV676395
SRC_ID_CIDX | | |
AIDX | 1 | 1 | 1
TYPE | Inhibition | Inhibition | Inhibition
ACTION_TYPE | ANTAGONIST | ANTAGONIST | ANTAGONIST
TEXT_VALUE | | |
RELATION | = | = | =
VALUE | 0 | 1 | 61
UPPER_VALUE | | |
UNITS | % | % | %
SD_PLUS | | |
SD_MINUS | | |
ACTIVITY_COMMENT | Not active | Not active | Active
ACT_ID | PB_FECH_MMV161996 | PB_FECH_MMV202458 | PB_FECH_MMV676395
TEOID | | |
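The files share key columns (RIDX, CIDX, AIDX, ACT_ID), so one plausible pre-submission check is that every key used in one file also appears in the file that defines it. The sketch below assumes that an AIDX in ACTIVITY.tsv is meant to refer to an AIDX in ASSAY.tsv; that rule is our reading of the example, not a quoted requirement. The helper names are hypothetical, the tables are cut down to a few columns, and PB_FECHX is a deliberately misspelled, made-up value.

```python
import csv
import io

def column_values(tsv_text, column):
    """Return the set of non-empty values found in one column of TSV text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[column] for row in reader if row.get(column)}

def missing_keys(child_tsv, parent_tsv, column):
    """List values of `column` used in child_tsv but absent from parent_tsv."""
    used = column_values(child_tsv, column)
    defined = column_values(parent_tsv, column)
    return sorted(used - defined)

# Cut-down ASSAY.tsv using the two AIDX values from the example.
assay_tsv = (
    "AIDX\tRIDX\n"
    "PB_FECH\tPathogen_Box_Bloggs\n"
    "PB_HMBS\tPathogen_Box_Bloggs\n"
)

# Cut-down ACTIVITY.tsv; the second AIDX is a made-up typo for illustration.
activity_tsv = (
    "CIDX\tAIDX\tACT_ID\n"
    "MMV161996\tPB_FECH\tPB_FECH_MMV161996\n"
    "MMV202458\tPB_FECHX\tPB_FECH_MMV202458\n"
)

unmatched = missing_keys(activity_tsv, assay_tsv, "AIDX")
print(unmatched)
```

The same helper could be pointed at other key pairs, for example CIDX in ACTIVITY.tsv against COMPOUND_RECORD.tsv, or ACT_ID in ACTIVITY_PROPERTIES.tsv against ACTIVITY.tsv.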

ACTIVITY_PROPERTIES.tsv

ACT_ID | PB_FECH_MMV161996
TYPE | HILL_SLOPE
RELATION | =
VALUE |
UNITS |
TEXT_VALUE | 1.1
COMMENTS |
RESULTS_FLAG |

ACTIVITY_SUPP.tsv

REGID

0

1

2

2

SAMID

0

1

2

2

TYPE

Treatment

Treatment

Treatment

"Side Effect (Y,N)"

RELATION

=

=

=

=

ACT_ID

PB_FECH_MMV161996

PB_FECH_MMV202458

PB_FECH_MMV676395

PB_FECH_MMV676395

TEXT_VALUE

Y

UNITS

VALUE

COMMENTS

1

1

1

"Y,N"

ACTIVITY_SUPP_MAP.tsv

ACT_ID | PB_FECH_MMV161996 | PB_FECH_MMV202458 | PB_FECH_MMV676395 | PB_FECH_MMV676395
SAMID | 0 | 1 | 2 | 2
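ACTIVITY_SUPP_MAP.tsv links each ACT_ID to a SAMID, and ACTIVITY_SUPP.tsv holds one or more rows per SAMID. A short sketch of following that link is given below. It assumes the mapping works as ACT_ID to SAMID to supplementary rows, which is inferred from the example rather than stated in it. The function name is hypothetical, and the pairs are copied from the SAMID, TYPE and ACT_ID values shown above.

```python
from collections import defaultdict

def supp_types_by_activity(map_rows, supp_rows):
    """Resolve ACT_ID -> SAMID -> supplementary TYPE values."""
    types_by_samid = defaultdict(list)
    for samid, supp_type in supp_rows:
        types_by_samid[samid].append(supp_type)

    seen = set()
    result = defaultdict(list)
    for act_id, samid in map_rows:
        # Skip repeated (ACT_ID, SAMID) pairs so types are not listed twice.
        if (act_id, samid) in seen:
            continue
        seen.add((act_id, samid))
        result[act_id].extend(types_by_samid.get(samid, []))
    return dict(result)

# (ACT_ID, SAMID) pairs from ACTIVITY_SUPP_MAP.tsv.
map_rows = [
    ("PB_FECH_MMV161996", "0"),
    ("PB_FECH_MMV202458", "1"),
    ("PB_FECH_MMV676395", "2"),
    ("PB_FECH_MMV676395", "2"),
]

# (SAMID, TYPE) pairs from ACTIVITY_SUPP.tsv.
supp_rows = [
    ("0", "Treatment"),
    ("1", "Treatment"),
    ("2", "Treatment"),
    ("2", "Side Effect (Y,N)"),
]

linked = supp_types_by_activity(map_rows, supp_rows)
for act_id, types in linked.items():
    print(act_id, types)
```

In this example the activity PB_FECH_MMV676395 resolves to two supplementary rows, Treatment and Side Effect (Y,N), because SAMID 2 appears twice in ACTIVITY_SUPP.tsv.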
