Example dataset
An example dataset, showing the structure of all required files and advice on avoiding common issues.
A simple set of example data files, suitable for use as a template, is available below:
The dataset below is a truncated example showing only the first SDF record and first rows of each file type. The full files are available in the "Simple_Full_Example" download. Mandatory fields are in bold.
PLEASE NOTE: For ease of viewing, the column headers are shown as rows in this example. When submitting data to ChEMBL, rotate the layout by 90 degrees so that the field names form the header row and each row below it is one data entry.
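The rotation described in the note can be sketched in Python. This is a minimal illustration, not an official ChEMBL tool: the function name `fields_to_tsv` and the dict-of-lists input are assumptions made for the example, while the field names and values are copied from the COMPOUND_RECORD.tsv example on this page. Padding short columns with empty strings is a simplification that only suits columns that are blank throughout or blank at the end.

```python
import csv
import io


def fields_to_tsv(fields):
    """Rotate a field-per-row layout into a TSV string with one header row.

    `fields` maps each field name to its list of values, one per data entry.
    Shorter lists are padded with empty strings so every row has all columns.
    """
    headers = list(fields)
    n_rows = max((len(values) for values in fields.values()), default=0)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(headers)
    for i in range(n_rows):
        writer.writerow(
            [fields[name][i] if i < len(fields[name]) else "" for name in headers]
        )
    return buffer.getvalue()


# Values copied from the COMPOUND_RECORD.tsv example on this page.
compound_record = {
    "CIDX": ["MMV010764", "MMV026468", "MMV011229"],
    "RIDX": ["Pathogen_Box_Bloggs"] * 3,
    "COMPOUND_NAME": ["MMV010764", "MMV026468", "MMV011229"],
    "COMPOUND_KEY": ["1a", "2b", "MMV011229"],
    "COMPOUND_SOURCE": [],  # left blank in the example
}

tsv_text = fields_to_tsv(compound_record)
print(tsv_text)
```

The first line of the output is the tab-separated header row, and each following line is one compound record.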
REFERENCE.tsv
RIDX
Pathogen_Box_Bloggs
PUBMED_ID
DOI
10.1016/S0960-894X(01)81052-0
PATENT_ID
JOURNAL_NAME
YEAR
2022
VOLUME
ISSUE
FIRST_PAGE
LAST_PAGE
REF_TYPE
Dataset
TITLE
Pathogen Box Compounds Screening
AUTHORS
Bloggs, J; Smith, M
ABSTRACT
400 compounds from the Pathogen box were screened for inhibitory activity against two human proteins.
DATA_LICENCE
This has been left blank. For your data to be included, this field must be filled with "CC0" to show that you consent to your data being released as Public Domain.
If this is not possible for you, contact us, and we may be able to permit an alternative free/open licence.
CONTACT
https://orcid.org/0000-0002-0343-0319
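The DATA_LICENCE requirement lends itself to a quick check before submission. The following is a minimal sketch, assuming REFERENCE.tsv has already been rotated into column-header form. The function name `missing_licence_rows` and the cut-down three-column input are hypothetical, and this is not part of any ChEMBL tooling.

```python
import csv
import io


def missing_licence_rows(tsv_text, required="CC0"):
    """Return the RIDX of every row whose DATA_LICENCE is not `required`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row.get("RIDX", "")
        for row in reader
        if (row.get("DATA_LICENCE") or "").strip() != required
    ]


# Cut-down REFERENCE.tsv contents: one with the licence blank, one filled in.
blank = "RIDX\tYEAR\tDATA_LICENCE\nPathogen_Box_Bloggs\t2022\t\n"
filled = "RIDX\tYEAR\tDATA_LICENCE\nPathogen_Box_Bloggs\t2022\tCC0\n"

print(missing_licence_rows(blank))
print(missing_licence_rows(filled))
```

If an alternative licence has been agreed, the `required` argument can be changed to match it.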
COMPOUND_RECORD.tsv
CIDX
MMV010764
MMV026468
MMV011229
RIDX
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
COMPOUND_NAME
MMV010764
MMV026468
MMV011229
COMPOUND_KEY
1a
2b
MMV011229
COMPOUND_SOURCE
COMPOUND_CTAB.sdf
Mrv0541 04191616472D
27 28 0 0 0 0 999 V2000
-5.0013 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.2868 -0.8250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.2868 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.5724 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.8579 -0.8250 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-2.1434 -1.2375 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0
-2.5559 -1.9520 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.7309 -0.5230 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.4289 -1.6500 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.7145 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 -1.6500 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 -2.4750 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7145 -2.8875 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4289 -2.4750 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.1434 -2.8875 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.1434 -3.7125 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.8579 -2.4750 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
3.5724 -2.8875 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.2868 -2.4750 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.0013 -2.8875 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.0013 -3.7125 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.7158 -4.1250 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
4.2868 -4.1250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.5724 -3.7125 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.8579 -4.1250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.7145 -2.8875 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4289 -2.4750 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
2 4 1 0 0 0 0
4 5 1 0 0 0 0
5 6 1 0 0 0 0
6 7 2 0 0 0 0
6 8 2 0 0 0 0
6 9 1 0 0 0 0
9 10 2 0 0 0 0
10 11 1 0 0 0 0
11 12 2 0 0 0 0
12 13 1 0 0 0 0
13 14 1 0 0 0 0
14 15 1 0 0 0 0
15 16 2 0 0 0 0
15 17 1 0 0 0 0
17 18 1 0 0 0 0
18 19 2 0 0 0 0
19 20 1 0 0 0 0
20 21 2 0 0 0 0
21 22 1 0 0 0 0
21 23 1 0 0 0 0
23 24 2 0 0 0 0
18 24 1 0 0 0 0
24 25 1 0 0 0 0
12 26 1 0 0 0 0
26 27 2 0 0 0 0
9 27 1 0 0 0 0
M END
> <CIDX>
MMV161996
$$$$
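Each SDF record ends with a `> <CIDX>` data item, the identifier on the next line, and the `$$$$` record separator. As a rough sketch of how the identifiers could be listed for checking against the other files, the snippet below reads them with plain string handling. The function name `cidx_values` is an assumption for illustration, and the sample text replaces each molblock with a placeholder.

```python
def cidx_values(sdf_text):
    """Return the value following each "> <CIDX>" header, in file order."""
    values = []
    lines = sdf_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        # The identifier sits on the line directly after the header.
        if line.startswith(">") and "<CIDX>" in line:
            values.append(lines[i + 1].strip())
    return values


# Two heavily shortened records; identifiers taken from this page's examples.
sdf_text = """(molblock omitted)
M  END
> <CIDX>
MMV161996

$$$$
(molblock omitted)
M  END
> <CIDX>
MMV202458

$$$$
"""

print(cidx_values(sdf_text))
```

For real structure handling a cheminformatics toolkit would be the better choice. This version only reads the identifier lines.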
ASSAY.tsv
AIDX
PB_FECH
PB_CAVIA
RIDX
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
ASSAY_DESCRIPTION
Compound was evaluated for the inhibition of human FECH at 10uM
Half life in Dunkin-Hartley guinea pig plasma at 5 mM by HPLC analysis
ASSAY_TYPE
B
A
ASSAY_ORGANISM
Homo sapiens
Cavia porcellus
ASSAY_STRAIN
Dunkin-Hartley
ASSAY_TAX_ID
9606
10141
ASSAY_TISSUE
Plasma
ASSAY_CELL_TYPE
ASSAY_SUBCELLULAR_FRACTION
ASSAY_SOURCE
TARGET_TYPE
PROTEIN
ADMET
TARGET_NAME
FECH
TARGET_ACCESSION
P22830
TARGET_ORGANISM
Homo sapiens
TARGET_TAX_ID
9606
ASSAY_PARAM.tsv
AIDX
PB_VIRUS
PB_VIRUS
TYPE
CONC
TIMEPOINT
RELATION
=
=
VALUE
10
2
UNITS
uM
hr
TEXT_VALUE
COMMENTS
ACTIVITY.tsv
CIDX
MMV161996
MMV161996
MMV202458
MMV202458
MMV676395
MMV676395
CRIDX
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
Pathogen_Box_Bloggs
AIDX
1
1
1
TYPE
Inhibition
Inhibition
Inhibition
Inhibition
Inhibition
Inhibition
ACTION_TYPE
ANTAGONIST
ANTAGONIST
ANTAGONIST
ANTAGONIST
ANTAGONIST
ANTAGONIST
TEXT_VALUE
Active
Not Active
Not Active
RELATION
=
=
=
VALUE
2
2
0
UNITS
%
%
%
ACTIVITY_COMMENT
ACT_ID
PB_FECH_MMV161996
PB_FECH_MMV161996
PB_FECH_MMV202458
PB_FECH_MMV202458
PB_FECH_MMV676395
PB_FECH_MMV676395
ACTIVITY_PROPERTIES.tsv
ACT_ID
PB_FECH_MMV161996
TYPE
HILL_SLOPE
RELATION
=
VALUE
1.1
UNITS
TEXT_VALUE
COMMENTS
RESULTS_FLAG
ACTIVITY_SUPP.tsv
REGID
0
1
2
2
SAMID
0
1
2
2
TYPE
Treatment
Treatment
Treatment
"Side Effect (Y,N)"
RELATION
=
=
=
TEXT_VALUE
Y
UNITS
VALUE
COMMENTS
1
1
1
"Y,N"
ACTIVITY_SUPP_MAP.tsv
ACT_ID
PB_FECH_MMV010764
PB_FECH_MMV675997
PB_FECH_MMV026468
SAMID
0
1
2