ASSAY.tsv

The ASSAY file provides a brief description of the assay, along with the target organism, tissue, cellular fraction etc.

  • RIDX is optional, but if it is excluded the data will be assigned to the default DOC_ID for the Source.

  • ASSAY_TAX_ID and TARGET_TAX_ID must be the NCBI Taxon ID for the organism, not the strain. The strain for the assay organism can be given in ASSAY_STRAIN.

  • ACT_ID is mandatory if providing an ACTIVITY_PROPERTIES or ACTIVITY_SUPPLEMENTARY record

  • AIDX, ASSAY_DESCRIPTION and ASSAY_TYPE are all mandatory.

  • For guidance on writing the ASSAY_DESCRIPTION see below.

  • TARGET_TYPE must be one from the defined list.

  • An AIDX must be created for every activity point you wish to record in the ACTIVITY file; there is a one-to-many relationship between assays and activities. ACTIVITY files must cite only existing AIDXs.

An ASSAY record is a single instance of an assay, not an assay protocol record. If the same assay protocol is used in multiple datasets, it is still a new ASSAY in each dataset. Each assay can only have a single target; in the case of a panel assay, there will need to be one assay record per target.
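
The one-to-many rule above can be checked before submission. The following is a minimal sketch, assuming both files are tab-separated with a header row and that the ACTIVITY file carries an AIDX column. The function name `missing_aidx`, the sample AIDX values, and the TYPE and VALUE columns in the ACTIVITY sample are illustrative assumptions, not part of the specification.

```python
import csv
import io


def missing_aidx(assay_tsv: str, activity_tsv: str) -> set[str]:
    """Return the AIDX values cited in ACTIVITY that are absent from ASSAY."""
    assays = csv.DictReader(io.StringIO(assay_tsv), delimiter="\t")
    known = {row["AIDX"] for row in assays}
    activities = csv.DictReader(io.StringIO(activity_tsv), delimiter="\t")
    cited = {row["AIDX"] for row in activities}
    return cited - known


# Hypothetical file contents: one assay, two activities citing different AIDXs.
assay_text = (
    "AIDX\tASSAY_DESCRIPTION\tASSAY_TYPE\n"
    "abl_t315i_trfret\tInhibition of human Abl T315I mutant by TR-FRET assay\tB\n"
)
activity_text = (
    "AIDX\tTYPE\tVALUE\n"
    "abl_t315i_trfret\tIC50\t12\n"
    "abl_wt_trfret\tIC50\t30\n"
)

print(missing_aidx(assay_text, activity_text))  # prints {'abl_wt_trfret'}
```

An empty result means every activity points at an assay record that exists in the same submission.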

Guidance on the ASSAY_DESCRIPTION

Assay descriptions should capture the method, overall assay aim, target plus isoform/mutation details and any other biological entities present in the assay.

We have a comprehensive breakdown of this field here.

These examples all cover protein-based functional and binding assays.

Examples:

Activation of GAL4 fused human LXR alpha expressed in HEK293 cells after 24 hrs by luciferase reporter gene assay

Agonist activity at PPARgamma in human HepG2 cells without THRA knockdown assessed as induction of CD36 expression at 1 uM incubated for 8 hrs by Western blot analysis

Biased agonist activity at human 5-HT1A receptor stably expressed in CHO-K1 cells assessed as increase in ERK1/2 phosphorylation after 15 mins by Alphalisa assay

Binding affinity against glycine binding site associated with N-methyl-D-aspartate glutamate receptor from rat synaptic plasma membrane (SPM) determined using [3H]glycine as radioligand

Binding affinity against nicotinic acetylcholine receptor alpha4-beta2 using [3H]epibatidine as radioligand expressed in HEK293 cells or tsA cells

Noncompetitive inhibition of histidine-tagged human recombinant ABL T315I mutant expressed in baculovirus system by Lineweaver-Burk plot analysis using ATP as substrate

Inhibition of human Abl T315I mutant at 10 uM after 60 mins by TR-FRET assay

ASSAY_CELL_TYPE rules

The ASSAY_CELL_TYPE is either a cell-line name (e.g. MRC-5) or an endogenous cell type (e.g. fibroblasts).

This is an example cell report card from ChEMBL for a cell line: https://www.ebi.ac.uk/chembl/cell_line_report_card/CHEMBL3308499/

This is an example cell report card from ChEMBL for an endogenous cell type: https://www.ebi.ac.uk/chembl/explore/cell_line/CHEMBL4295351

For cell lines and endogenous cells that are within ChEMBL, the cell report card Name can be used to populate the ASSAY_CELL_TYPE field. For cell-based assays it is especially important to consider the ASSAY_TAX_ID, which may differ from the TARGET_TAX_ID if, for example, a human protein is expressed in a non-human cell line (ASSAY_TAX_ID non-human, TARGET_TAX_ID human).

For cell lines that are not found within ChEMBL, Cellosaurus is a good resource. For endogenous cells, please try to use the Experimental Factor Ontology (EFO), which can be searched on the Ontology Lookup Service. Otherwise, a depositor-defined name can be provided as the ASSAY_CELL_TYPE.
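
To illustrate the case where the assay organism and the target organism differ, here is a minimal sketch that writes a one-row ASSAY.tsv for a human receptor expressed in CHO-K1 cells. It reuses one of the example descriptions above. The AIDX and TARGET_NAME values are hypothetical, the column order is not taken from the specification, and the tax IDs are the NCBI entries for Chinese hamster (10029) and human (9606).

```python
import csv
import io

# One assay record: a human target expressed in a Chinese hamster cell line.
row = {
    "AIDX": "5ht1a_cho_erk_alphalisa",  # hypothetical depositor-defined identifier
    "ASSAY_DESCRIPTION": (
        "Biased agonist activity at human 5-HT1A receptor stably expressed in "
        "CHO-K1 cells assessed as increase in ERK1/2 phosphorylation after "
        "15 mins by Alphalisa assay"
    ),
    "ASSAY_TYPE": "F",
    "ASSAY_ORGANISM": "Cricetulus griseus",  # organism of the cell line
    "ASSAY_TAX_ID": "10029",
    "ASSAY_CELL_TYPE": "CHO-K1",
    "TARGET_NAME": "5-HT1A receptor",  # hypothetical wording
    "TARGET_ORGANISM": "Homo sapiens",  # organism of the expressed protein
    "TARGET_TAX_ID": "9606",
}

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=list(row), delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerow(row)
assay_tsv = buffer.getvalue()
print(assay_tsv)
```

Optional columns that do not apply to the assay are simply left out of this sketch.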

| Header | Description | Existence | Data type |
| --- | --- | --- | --- |
| AIDX | The AIDX set by the depositor; a primary key | Mandatory | Any character up to a length of 200. Should be a meaningful unique identifier for each assay, not just a number. |
| RIDX | The RIDX cited by the depositor. Must be owned by the depositor. | Optional | Any character up to a length of 200 |
| ASSAY_DESCRIPTION | A description of the assay (see Guidance on the ASSAY_DESCRIPTION) | Mandatory | Any character up to a length of 4000 |
| ASSAY_TYPE | The type of the assay: B (Binding), F (Functional), A (ADMET), U (Unassigned), P (Physicochemical) or T (Toxicity) | Mandatory | Accepted assay types: ADMET, A, Functional, F, Binding, B, Unassigned, U, Physiochemical, P, Toxicity, T |
| ASSAY_GROUP | Assays deposited across multiple datasets/releases that are considered comparable by the depositor are mapped to the same ASSAY_GROUP. Allows depositors to group comparable assays, and users to identify assays that can be considered identical. | Optional, unless you plan to submit further data using the same assays | Any character up to a length of 200 |
| ASSAY_ORGANISM | The assay organism | Optional (mandatory if ASSAY_TAX_ID is populated) | Any character up to a length of 250 |
| ASSAY_STRAIN | The strain of the assay organism | Optional | Any character up to a length of 200 |
| ASSAY_TAX_ID | NCBI taxonomy ID for the assay organism | Optional | Any positive integer up to a length of 11, or '0' |
| ASSAY_SOURCE | The original source of the assay | Optional | Any character up to a length of 100 |
| ASSAY_TISSUE | The type of tissue used in the assay | Optional | Any character up to a length of 100 |
| ASSAY_CELL_TYPE | The cell line or endogenous cell type (see ASSAY_CELL_TYPE rules) | Optional | Any character up to a length of 100 |
| ASSAY_SUBCELLULAR_FRACTION | The subcellular fraction used in the assay | Optional | Any character up to a length of 100 |
| TARGET_TYPE | The type of target | Optional | Any character up to a length of 25. See the Target type list. |
| TARGET_NAME | The name of the target | Optional | Any character up to a length of 400 |
| TARGET_ACCESSION | A UniProt accession for the target where the target is a protein. Otherwise leave blank. | Optional | An integer or a UniProt ID up to a length of 255 |
| TARGET_ORGANISM | The target organism | Optional | Any character up to a length of 100 |
| TARGET_TAX_ID | The NCBI taxonomy ID of the target organism | Optional | Any positive integer up to a length of 11, or '0' |
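
The field rules listed above lend themselves to a pre-submission check. The sketch below is one possible way to encode them, not an official validator: the function name `validate_assay_row` and the error wording are hypothetical, the length limits and accepted assay types are copied from the field list as written, and the organism rule is assumed to refer to ASSAY_TAX_ID. It does not check TARGET_TYPE against the defined list, since that list is not reproduced here.

```python
MANDATORY = ("AIDX", "ASSAY_DESCRIPTION", "ASSAY_TYPE")

# Maximum lengths copied from the field list.
MAX_LENGTHS = {
    "AIDX": 200, "RIDX": 200, "ASSAY_DESCRIPTION": 4000, "ASSAY_GROUP": 200,
    "ASSAY_ORGANISM": 250, "ASSAY_STRAIN": 200, "ASSAY_TAX_ID": 11,
    "ASSAY_SOURCE": 100, "ASSAY_TISSUE": 100, "ASSAY_CELL_TYPE": 100,
    "ASSAY_SUBCELLULAR_FRACTION": 100, "TARGET_TYPE": 25, "TARGET_NAME": 400,
    "TARGET_ACCESSION": 255, "TARGET_ORGANISM": 100, "TARGET_TAX_ID": 11,
}

# Accepted values copied verbatim from the ASSAY_TYPE entry.
ASSAY_TYPES = {
    "ADMET", "A", "Functional", "F", "Binding", "B",
    "Unassigned", "U", "Physiochemical", "P", "Toxicity", "T",
}

TAX_ID_FIELDS = ("ASSAY_TAX_ID", "TARGET_TAX_ID")


def validate_assay_row(row: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems found in one ASSAY row."""
    errors = []
    for field in MANDATORY:
        if not row.get(field):
            errors.append(f"{field} is mandatory")
    for field, limit in MAX_LENGTHS.items():
        if len(row.get(field, "")) > limit:
            errors.append(f"{field} exceeds {limit} characters")
    assay_type = row.get("ASSAY_TYPE")
    if assay_type and assay_type not in ASSAY_TYPES:
        errors.append(f"ASSAY_TYPE {assay_type!r} is not an accepted value")
    for field in TAX_ID_FIELDS:
        value = row.get(field, "")
        # Tax IDs must be written as plain ASCII digits (a positive integer or 0).
        if value and not (value.isascii() and value.isdigit()):
            errors.append(f"{field} must be a positive integer or 0")
    if row.get("ASSAY_TAX_ID") and not row.get("ASSAY_ORGANISM"):
        errors.append("ASSAY_ORGANISM is mandatory when ASSAY_TAX_ID is populated")
    return errors


example = {"AIDX": "abl_t315i_trfret", "ASSAY_DESCRIPTION": "", "ASSAY_TYPE": "B"}
print(validate_assay_row(example))  # prints ['ASSAY_DESCRIPTION is mandatory']
```

Returning a list of messages rather than raising on the first problem lets a depositor fix every issue in a row in one pass.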
