General Questions

How often do you update the data?

The data is updated regularly, with releases approximately 2-3 times a year.

What is the ChEMBLID and why do both Assays and Documents have ChEMBLIDs?

The ChEMBLID is a unique ID that has been assigned to compounds, targets, assays, documents, tissues and cell types in ChEMBL. It can be used to retrieve a Report Card page for these entities, or to search for them using the keyword search.

What is the difference between Molregno and ChEMBLID?

Molregno is the internal ChEMBL identifier assigned to each compound. The ChEMBLID is the external, public-facing identifier for each compound.
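As a rough illustration of how the two identifiers relate, the Python sketch below looks up the molregno for a given ChEMBLID. It assumes a local copy of the database in which a table named molecule_dictionary carries both a molregno and a chembl_id column; a tiny in-memory SQLite table stands in for the real one, and the rows are made-up placeholders rather than real ChEMBL records.

```python
import sqlite3


def molregno_for(conn, chembl_id):
    """Return the internal molregno for an external ChEMBLID, or None if absent."""
    row = conn.execute(
        "SELECT molregno FROM molecule_dictionary WHERE chembl_id = ?",
        (chembl_id,),
    ).fetchone()
    return row[0] if row else None


# Toy stand-in for a local ChEMBL instance (assumed table and column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecule_dictionary ("
    "molregno INTEGER PRIMARY KEY, chembl_id TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO molecule_dictionary VALUES (?, ?)",
    [(1, "CHEMBL_EXAMPLE_A"), (2, "CHEMBL_EXAMPLE_B")],  # placeholder rows
)

print(molregno_for(conn, "CHEMBL_EXAMPLE_B"))
```

The parameterised query (the `?` placeholder) avoids building SQL by string concatenation, which matters once identifiers come from user input.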

How can I interconvert PubChem and ChEMBL_IDs?

1) UniChem can be used to map between two sets of identifiers.

The source mapping tables are on the FTP website and include a ChEMBL_ID to PubChem_ID mapping table (source mapping table for src1 to src22).

2) In the ChEMBL database, identifiers may be found in the compound_records table (as the compound_key).
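For option 1, a downloaded mapping table can be loaded into a dictionary for lookups in either direction. The sketch below assumes the file is tab-separated with one header row and two columns (ChEMBL_ID, then PubChem_ID); that layout is an assumption and should be checked against the actual file. The rows shown are placeholders, not real identifiers.

```python
import csv
import io


def load_mapping(handle):
    """Read a two-column, tab-separated mapping into a dict (column 1 -> column 2)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the assumed header row
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# Placeholder content standing in for a downloaded mapping file.
sample = io.StringIO(
    "chembl_id\tpubchem_id\n"
    "CHEMBL_EXAMPLE_A\t111\n"
    "CHEMBL_EXAMPLE_B\t222\n"
)

chembl_to_pubchem = load_mapping(sample)
# Invert for the reverse direction; this assumes the mapping is one-to-one.
pubchem_to_chembl = {pubchem: chembl for chembl, pubchem in chembl_to_pubchem.items()}

print(chembl_to_pubchem["CHEMBL_EXAMPLE_A"], pubchem_to_chembl["222"])
```

If one identifier can map to several on the other side, a plain dictionary keeps only the last match, so a dictionary of lists would be a safer structure in that case.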

What literature coverage is there in ChEMBL?

Data are routinely extracted from seven core journals:

However, we also have data from selected articles in more than 200 journals including Antimicrob Agents Chemother, Med Chem Res, J Agric Food Chem and Drug Metab Dispos. ChEMBL currently contains data from more than 86,000 journal articles, and we also now include data from selected patents.

How can I be informed of updates in ChEMBL?

You can sign up to our ChEMBL Announce mailing list (http://listserver.ebi.ac.uk/mailman/listinfo/chembl-announce), where you will be kept up to date with all new releases and changes to the database.

Do you show which compounds and/or bioactivity data has been changed or deleted with each release of ChEMBL?

As the data and the compounds are continually being curated, it is not possible to keep track of every alteration or addition. The CHEMBL_ID_LOOKUP table lists all active and inactive (obsolete) entities, distinguished by the value of the status field (ACTIVE or OBS). It is also possible to download previous releases and identify changes by comparing the data.

If you need help, please contact us and we will be able to let you know what, if anything, has happened to the data you are interested in.
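A comparison between two releases can be as simple as a set difference on the identifiers that are marked ACTIVE in each. The sketch below assumes a CHEMBL_ID_LOOKUP table with chembl_id, entity_type and status columns; two in-memory SQLite databases stand in for an older and a newer release, and all identifiers are placeholders.

```python
import sqlite3


def make_release(rows):
    """Build a toy release containing only an (assumed) chembl_id_lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chembl_id_lookup ("
        "chembl_id TEXT PRIMARY KEY, entity_type TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO chembl_id_lookup VALUES (?, ?, ?)", rows)
    return conn


def active_ids(conn):
    """Return the set of identifiers whose status is ACTIVE."""
    rows = conn.execute(
        "SELECT chembl_id FROM chembl_id_lookup WHERE status = 'ACTIVE'"
    )
    return {row[0] for row in rows}


# Placeholder data: ID_B becomes obsolete and ID_C is new in the later release.
old = make_release([
    ("ID_A", "COMPOUND", "ACTIVE"),
    ("ID_B", "COMPOUND", "ACTIVE"),
])
new = make_release([
    ("ID_A", "COMPOUND", "ACTIVE"),
    ("ID_B", "COMPOUND", "OBS"),
    ("ID_C", "ASSAY", "ACTIVE"),
])

no_longer_active = sorted(active_ids(old) - active_ids(new))
newly_active = sorted(active_ids(new) - active_ids(old))
print(no_longer_active, newly_active)
```

This only detects entities that appear or disappear; finding changed values for an entity present in both releases would need a column-by-column comparison of the relevant tables.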

The publications used to cite ChEMBL are:

ChEMBL Database:

Zdrazil B, Felix E, Hunter F, Manners EJ, Blackshaw J, Corbett S, de Veij M, Ioannidis H, Lopez DM, Mosquera JF, Magarinos MP, Bosc N, Arcila R, Kizilören T, Gaulton A, Bento AP, Adasme MF, Monecke P, Landrum GA, Leach AR. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res. 2024 Jan 5;52(D1):D1180-D1192.

doi: 10.1093/nar/gkad1004. PMID: 37933841; PMCID: PMC10767899.

ChEMBL Web Services:

Davies M, Nowotka M, Papadatos G, Dedman N, Gaulton A, Atkinson F, Bellis L, Overington JP. (2015) 'ChEMBL web services: streamlining access to drug discovery data and utilities.' Nucleic Acids Res., 43(W1) W612-W620.

DOI: 10.1093/nar/gkv352 PMC: PMC4489243

ChEMBL RDF:

S. Jupp, J. Malone, J. Bolleman, M. Brandizi, M. Davies, L. Garcia, A. Gaulton, S. Gehant, C. Laibe, N. Redaschi, S.M. Wimalaratne, M. Martin, N. Le Novère, H. Parkinson, E. Birney and A.M. Jenkinson (2014) 'The EBI RDF Platform: Linked Open Data for the Life Sciences.' Bioinformatics, 30, 1338-1339.

DOI: 10.1093/bioinformatics/btt765 PMID: 24413672

How do I report errors or make suggestions for the interface?

  1. We have a dedicated email address for data queries, error reporting or help requests. This is: chembl-help@ebi.ac.uk

  2. You can create an issue in our GitHub Interface Repository to report issues with the interface only.

  3. You can create an issue in our GitHub Data Repository to report issues with the data only.

  4. You can create an issue in our GitHub API Repository to report issues with the ChEMBL API.

  5. You can create an issue in our GitHub Python Client Repository to report issues with the Python client library.

What is the best way to access large amounts of data in ChEMBL?

The best way to access large amounts of data is to install a database instance on your own computer using MySQL.

Is there a ChEMBL RDF?

Yes, there is a ChEMBL RDF. It is stored on the FTP site.

At present we do not maintain the RDF, and therefore any recently changed data fields in the ChEMBL database will not be included in the existing RDF version.

What is the difference between the different releases of ChEMBL, e.g. ChEMBL_14 and ChEMBL_15?

ChEMBL_14 is an earlier release of the data, with ChEMBL_15 being an updated version. Subsequent updates will have consecutive numbers, so the one with the highest number will be the most recent full version of the data.
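When picking the most recent release programmatically, the release number should be compared as an integer, not as text: as plain strings, ChEMBL_9 would sort above ChEMBL_15. The sketch below assumes release names of the plain form ChEMBL_<number>, as in the question; the helper names are hypothetical.

```python
import re


def release_number(name):
    """Extract the integer release number from a name such as 'ChEMBL_15'."""
    match = re.fullmatch(r"ChEMBL_(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised release name: {name!r}")
    return int(match.group(1))


def latest_release(names):
    """Return the name with the highest release number."""
    return max(names, key=release_number)


print(latest_release(["ChEMBL_9", "ChEMBL_15", "ChEMBL_14"]))
```

Names that carry a suffix would be rejected by this pattern and would need their own parsing rule.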

What is the difference between ChEMBL and ChEMBL-NTD?

ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs.

ChEMBL-NTD is a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases: endemic tropical diseases of the developing regions of Africa, Asia, and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. ChEMBL-NTD is a subset of the data in the free medicinal chemistry and drug discovery database ChEMBLdb.

How do you use the Web Services?

Web Service documentation and example queries can be found at: https://www.ebi.ac.uk/chembl/ws
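As a minimal sketch of calling the web services from Python, the snippet below builds the URL for a single resource and, optionally, fetches it as JSON. The base path https://www.ebi.ac.uk/chembl/api/data and the resource/ID.format URL pattern are assumptions here and should be checked against the documentation linked above; CHEMBL25 is used only as an example identifier.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base path of the REST API; verify against the official documentation.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def resource_url(resource, chembl_id, fmt="json"):
    """Build the URL for one entity, e.g. resource='molecule'."""
    return f"{BASE_URL}/{quote(resource)}/{quote(chembl_id)}.{fmt}"


def fetch(resource, chembl_id):
    """Download and decode one entity as JSON (requires network access)."""
    with urlopen(resource_url(resource, chembl_id), timeout=30) as response:
        return json.load(response)


# Only the URL is printed here, so the example runs without network access.
print(resource_url("molecule", "CHEMBL25"))
```

Calling `fetch("molecule", "CHEMBL25")` would perform the actual request. For bulk retrieval, a local database installation is likely to be more practical than many individual requests.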

Can I purchase compounds from ChEMBL?

ChEMBL is a database of bioactivity data and we do not supply compounds.

However, you can use the cross-references on the compound report cards to view commercial chemical providers.

Can you provide more details on licensing?

The ChEMBL database is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License (https://creativecommons.org/licenses/by-sa/3.0/). This allows use, redistribution and adaptation of ChEMBL, provided that appropriate attribution is given (e.g., by citing the current ChEMBL paper) and that any adaptations are redistributed under the same license.

ChEMBL also includes compound property calculations derived from commercial software. When using these commercial calculations, users should ensure that they refer to and abide by commercial licensing agreements. For example, calculations should not be extracted in isolation with the aim of training models intended to replicate the commercial process.

Can you provide more details on ChEMBL releases?

Each new ChEMBL release includes literature data from the previous 12-18 months. The precise cutoff point for extraction of literature data varies slightly between ChEMBL releases and for different journals. Occasionally, selected articles are added to ChEMBL at a later date, so the ChEMBL version and the publication date do not always correspond. A new release may also contain new data for existing compounds.

ChEMBL provides periodic updates to the database that include new data, corrections, additional curation and new features. Subsequent updates will have consecutive numbers, so the version with the highest number will be the most recent full version of the data. Version information is found in the VERSION table.

The API and interface always reflect the data in the current release and cannot currently be used to extract data from previous versions. However, old data dumps remain available for download and can be queried with a database tool and SQL to obtain version-specific data.
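When working with several downloaded dumps, it helps to confirm which release a given database file holds before querying it. The sketch below reads the release name from the VERSION table, assuming it has name, creation_date and comments columns; an in-memory SQLite table with a placeholder row stands in for a real dump.

```python
import sqlite3


def release_name(conn):
    """Return the most recent name recorded in the (assumed) version table."""
    row = conn.execute(
        "SELECT name FROM version ORDER BY creation_date DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


# Toy stand-in for a downloaded dump; the row below is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE version (name TEXT, creation_date TEXT, comments TEXT)")
conn.execute(
    "INSERT INTO version VALUES (?, ?, ?)",
    ("EXAMPLE_RELEASE", "2000-01-01", "placeholder row"),
)

print(release_name(conn))
```

With a real dump, `sqlite3.connect(":memory:")` would be replaced by a connection to the downloaded database, using whichever driver matches the dump format.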

What type of data does ChEMBL accept in depositions and how can I deposit data?

ChEMBL is a database of bioactivity data for drug-like compounds and includes data from seven core medicinal chemistry journals, patents and deposited data that fits within the scope of ChEMBL. Deposited data is primarily dose-response and ADMET type data. Please get in touch if you believe your data could be included in ChEMBL.

Regular depositors can sign up to the chembl_depositor mailing list (https://listserver.ebi.ac.uk/mailman/listinfo/chembl-depositors) to receive updates on ChEMBL deposition timelines.

Documentation for depositors, including the information required to submit to ChEMBL, can be found at https://chembl.gitbook.io/chembl-loader/

How can I access ChEMBL through KNIME?

The ChEMBL KNIME nodes are no longer maintained. We recommend using the generic KNIME REST nodes to access the ChEMBL web services instead, as this is simpler to maintain and to adapt to schema changes. Some example workflows using this approach have been provided by Daria Goldmann, e.g.:

https://www.knime.com/blog/a-restful-way-to-find-and-retrieve-data

https://hub.knime.com/knime/workflows/Examples/50_Applications/30_RESTful_ChEMBL/01_ChEMBL_REST_Services*qBCbzaFZ85qUMoWZ

https://hub.knime.com/knime/workflows/Examples/50_Applications/30_RESTful_ChEMBL/02_ChEMBL_Structure_Search*wCygb7lTvPqnYXuB

https://hub.knime.com/knime/workflows/Examples/50_Applications/30_RESTful_ChEMBL/03_ChEMBL_Bioactivity_Search*tGSG1nPdGg4cW7Qp
