ASSAY.tsv
The ASSAY file provides a brief description of the assay, along with the target organism, tissue, cellular fraction etc.
RIDX is optional, but if it is excluded the data will be assigned to the default DOC_ID for the Source.
ASSAY_TAX_ID and TARGET_TAX_ID must be the NCBI Taxon ID for the organism, not the strain. The strain for the assay organism can be given in ASSAY_STRAIN.
ACT_ID is mandatory if providing an ACTIVITY_PROPERTIES or ACTIVITY_SUPPLEMENTARY record.
AIDX, ASSAY_DESCRIPTION and ASSAY_TYPE are all mandatory.
For guidance on writing the ASSAY_DESCRIPTION see below.
TARGET_TYPE must be one from the defined list.
An AIDX must be created for all the activity points you wish to record in the ACTIVITY file. There is a one-to-many relationship between assays and activities. ACTIVITY files must cite only existing AIDXs.
An ASSAY record is a single instance of an assay, not an assay protocol record. If the same assay protocol is used in multiple datasets, it is still a new ASSAY in each dataset. Each assay can only have a single target; in the case of a panel assay, there will need to be one assay record per target.
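The record-level rules above can be sketched in code. This is a minimal illustration, not part of any ChEMBL tooling: it builds one ASSAY record per target for a panel assay and then lists any AIDX cited by ACTIVITY rows that has no ASSAY record. It assumes the TSV files carry a header row using the column names from this page. The AIDX values, target names, and helper names (`panel_assay_rows`, `missing_aidx`, `to_tsv`) are hypothetical.

```python
import csv
import io


def panel_assay_rows(aidx_prefix, description_template, assay_type, targets):
    """Return one ASSAY record per target, since each assay has a single target."""
    rows = []
    for target in targets:
        rows.append({
            "AIDX": f"{aidx_prefix}_{target}",
            "ASSAY_DESCRIPTION": description_template.format(target=target),
            "ASSAY_TYPE": assay_type,
            "TARGET_NAME": target,
        })
    return rows


def missing_aidx(assay_rows, activity_rows):
    """Return the AIDX values cited by activities that have no ASSAY record."""
    known = {row["AIDX"] for row in assay_rows}
    cited = {row["AIDX"] for row in activity_rows}
    return sorted(cited - known)


def to_tsv(rows):
    """Serialise records as tab-separated text with a header row (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical three-target panel: three ASSAY records, not one.
assays = panel_assay_rows(
    "KinasePanel",
    "Inhibition of human {target} at 10 uM after 60 mins by TR-FRET assay",
    "B",
    ["ABL1", "SRC", "LCK"],
)

# Many activities may cite the same AIDX (one-to-many), but each must exist.
activities = [
    {"AIDX": "KinasePanel_ABL1"},
    {"AIDX": "KinasePanel_ABL1"},
    {"AIDX": "KinasePanel_EGFR"},  # no matching ASSAY record
]
unresolved = missing_aidx(assays, activities)
```

Running a check like `missing_aidx` before submission is one way to catch ACTIVITY rows that point at an assay that was never declared.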
Guidance on the ASSAY_DESCRIPTION
Assay descriptions should capture the method, overall assay aim, target plus isoform/mutation details and any other biological entities present in the assay.
We have a comprehensive breakdown of this field here.
These examples all cover protein-based functional and binding assays
Examples:
Activation of GAL4 fused human LXR alpha expressed in HEK293 cells after 24 hrs by luciferase reporter gene assay
Agonist activity at PPARgamma in human HepG2 cells without THRA knockdown assessed as induction of CD36 expression at 1 uM incubated for 8 hrs by Western blot analysis
Biased agonist activity at human 5-HT1A receptor stably expressed in CHO-K1 cells assessed as increase in ERK1/2 phosphorylation after 15 mins by Alphalisa assay
Binding affinity against glycine binding site associated with N-methyl-D-aspartate glutamate receptor from rat synaptic plasma membrane (SPM) determined using [3H]glycine as radioligand
Binding affinity against nicotinic acetylcholine receptor alpha4-beta2 expressed in HEK293 cells or tsA cells using [3H]epibatidine as radioligand
Noncompetitive inhibition of histidine-tagged human recombinant ABL T315I mutant expressed in baculovirus system by Lineweaver-Burk plot analysis using ATP as substrate
Inhibition of human Abl T315I mutant at 10 uM after 60 mins by TR-FRET assay
ASSAY_CELL_TYPE rules
The ASSAY_CELL_TYPE is either a cell-line name (e.g. MRC-5) or an endogenous cell type (e.g. fibroblasts).
This is an example cell report card from ChEMBL for a cell line: https://www.ebi.ac.uk/chembl/cell_line_report_card/CHEMBL3308499/
This is an example cell report card from ChEMBL for an endogenous cell type: https://www.ebi.ac.uk/chembl/explore/cell_line/CHEMBL4295351
For cell lines and endogenous cells that are within ChEMBL, the cell report card Name can be used to populate the 'ASSAY_CELL_TYPE' field. It is especially important for cell-based assays that you consider the ASSAY_TAX_ID, which might be different to the TARGET_TAX_ID if, for example, a human protein is being expressed in a non-human cell line (ASSAY_TAX_ID non-human, TARGET_TAX_ID human).
For cell lines that are not found within ChEMBL, Cellosaurus is a good resource. For endogenous cells, please try to use the Experimental Factor Ontology (EFO), which can be searched on the Ontology Lookup Service. Otherwise, a depositor-defined name can be provided as the ASSAY_CELL_TYPE.
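As an illustration of the ASSAY_TAX_ID versus TARGET_TAX_ID distinction, here is a minimal sketch of a single ASSAY record for a human receptor expressed in a hamster cell line, based on the 5-HT1A example description above. The AIDX value, the TARGET_NAME wording, the column order, and the helper names are hypothetical, and the ASSAY_CELL_TYPE value assumes the cell report card name is "CHO-K1". The taxonomy IDs are the NCBI IDs for Cricetulus griseus (10029) and Homo sapiens (9606).

```python
import csv
import io

# Hypothetical record: human target, Chinese hamster (CHO-K1) assay system.
assay_row = {
    "AIDX": "HTR1A_CHO-K1_pERK_AlphaLISA",
    "ASSAY_DESCRIPTION": (
        "Biased agonist activity at human 5-HT1A receptor stably expressed in "
        "CHO-K1 cells assessed as increase in ERK1/2 phosphorylation after "
        "15 mins by Alphalisa assay"
    ),
    "ASSAY_TYPE": "F",
    "ASSAY_ORGANISM": "Cricetulus griseus",  # organism of the cell line
    "ASSAY_TAX_ID": "10029",
    "ASSAY_CELL_TYPE": "CHO-K1",
    "TARGET_NAME": "5-HT1A receptor",
    "TARGET_ORGANISM": "Homo sapiens",  # organism of the expressed protein
    "TARGET_TAX_ID": "9606",
}


def expressed_in_other_species(row):
    """True when the assay organism differs from the target organism."""
    return row["ASSAY_TAX_ID"] != row["TARGET_TAX_ID"]


def row_to_tsv(row):
    """Write one record as tab-separated text with a header row (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(row), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


tsv_text = row_to_tsv(assay_row)
```

For an assay on a human protein in a human cell line such as HEK293, both taxonomy IDs would be 9606 and `expressed_in_other_species` would return False.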
| Header | Description | Existence | Data type |
| --- | --- | --- | --- |
| AIDX | The AIDX set by the depositor - a primary key | Mandatory | Any character up to a length of 200. Should be a meaningful unique identifier for each assay, not just a number. |
| RIDX | The RIDX cited by the depositor. MUST be owned by the depositor. | Optional | Any character up to a length of 200 |
| ASSAY_DESCRIPTION | A description of the assay (see the guidance above) | Mandatory | Any character up to a length of 4000 |
| ASSAY_TYPE | The type of the assay: B (Binding), F (Functional), A (ADMET), U (Unassigned), P (Physicochemical) or T (Toxicity) | Mandatory | Accepted assay types: ADMET, A, Functional, F, Binding, B, Unassigned, U, Physicochemical, P, Toxicity, T |
| ASSAY_GROUP | Assays deposited across multiple datasets/releases that are considered comparable by the depositor are mapped to the same ASSAY_GROUP. Allows depositors to group comparable assays, and users to identify assays that can be considered identical. | Optional, unless you plan to submit further data using the same assays | Any character up to a length of 200 |
| ASSAY_ORGANISM | The assay organism | Optional (mandatory if ASSAY_TAX_ID is populated) | Any character up to a length of 250 |
| ASSAY_STRAIN | The strain of the assay organism | Optional | Any character up to a length of 200 |
| ASSAY_TAX_ID | NCBI taxonomy ID for the assay organism | Optional | Any positive integer up to a length of 11, or '0' |
| ASSAY_SOURCE | The original source of the assay | Optional | Any character up to a length of 100 |
| ASSAY_TISSUE | The type of tissue used in the assay | Optional | Any character up to a length of 100 |
| ASSAY_SUBCELLULAR_FRACTION | The subcellular fraction used in the assay | Optional | Any character up to a length of 100 |
| TARGET_NAME | The name of the target | Optional | Any character up to a length of 400 |
| TARGET_ACCESSION | A UniProt accession for the target where the target is a protein. Otherwise to be left blank. | Optional | An integer or a UniProt ID up to a length of 255 |
| TARGET_ORGANISM | The target organism | Optional | Any character up to a length of 100 |
| TARGET_TAX_ID | The NCBI taxonomy ID of the target organism | Optional | Any positive integer up to a length of 11, or '0' |
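The field constraints listed above lend themselves to a pre-submission check. The following is a minimal sketch under stated assumptions, not the validation that the loader itself performs: it reads "up to a length of 11" as at most 11 digits, treats ASSAY_TYPE values as case-sensitive, spells the long type names as in the ASSAY_TYPE description, and uses ASSAY_TAX_ID as the column that triggers the ASSAY_ORGANISM requirement. The function name `validate_row` and the error messages are hypothetical.

```python
import re

# Maximum lengths for free-text columns, taken from the field list.
MAX_LENGTHS = {
    "AIDX": 200, "RIDX": 200, "ASSAY_DESCRIPTION": 4000, "ASSAY_GROUP": 200,
    "ASSAY_ORGANISM": 250, "ASSAY_STRAIN": 200, "ASSAY_SOURCE": 100,
    "ASSAY_TISSUE": 100, "ASSAY_SUBCELLULAR_FRACTION": 100,
    "TARGET_NAME": 400, "TARGET_ACCESSION": 255, "TARGET_ORGANISM": 100,
}
MANDATORY = ("AIDX", "ASSAY_DESCRIPTION", "ASSAY_TYPE")
ASSAY_TYPES = {
    "ADMET", "A", "Functional", "F", "Binding", "B",
    "Unassigned", "U", "Physicochemical", "P", "Toxicity", "T",
}
# '0', or a positive integer of at most 11 digits (assumed interpretation).
TAX_ID_PATTERN = re.compile(r"0|[1-9][0-9]{0,10}")


def _text(row, column):
    """Return the stripped cell value, treating a missing column as empty."""
    return (row.get(column) or "").strip()


def validate_row(row):
    """Return a list of human-readable problems found in one ASSAY record."""
    errors = []
    for column in MANDATORY:
        if not _text(row, column):
            errors.append(f"{column} is mandatory")
    for column, limit in MAX_LENGTHS.items():
        if len(row.get(column) or "") > limit:
            errors.append(f"{column} exceeds {limit} characters")
    assay_type = _text(row, "ASSAY_TYPE")
    if assay_type and assay_type not in ASSAY_TYPES:
        errors.append(f"ASSAY_TYPE {assay_type!r} is not an accepted value")
    for column in ("ASSAY_TAX_ID", "TARGET_TAX_ID"):
        value = _text(row, column)
        if value and not TAX_ID_PATTERN.fullmatch(value):
            errors.append(f"{column} must be a positive integer or '0'")
    if _text(row, "ASSAY_TAX_ID") and not _text(row, "ASSAY_ORGANISM"):
        errors.append("ASSAY_ORGANISM is mandatory when ASSAY_TAX_ID is populated")
    return errors


# A record with only the mandatory fields passes; a malformed one does not.
ok_errors = validate_row(
    {"AIDX": "LXRa_HEK293_luc", "ASSAY_DESCRIPTION": "d", "ASSAY_TYPE": "F"}
)
bad_errors = validate_row(
    {"AIDX": "", "ASSAY_DESCRIPTION": "d", "ASSAY_TYPE": "X", "ASSAY_TAX_ID": "-5"}
)
```

Returning a list of problems rather than raising on the first one lets a depositor fix a whole file in one pass.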