The data is updated regularly, with releases approximately 2-3 times a year.
The ChEMBLID is a unique ID that has been assigned to compounds, targets, assays, documents, tissues and cell types in ChEMBL. It can be used to retrieve a Report Card page for these entities, or to search for them using the keyword search.
The molregno is the internal ChEMBL identifier assigned to each compound. The ChEMBLID is the external, publicly visible identifier for the same compound.
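As a minimal sketch of how the two identifiers relate, the snippet below maps between a molregno and a ChEMBLID using a toy in-memory SQLite table. It assumes a table named molecule_dictionary with molregno and chembl_id columns; the rows and the helper function names are made up for illustration and are not real ChEMBL records.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory table with made-up rows (not real ChEMBL data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecule_dictionary ("
        "molregno INTEGER PRIMARY KEY, "
        "chembl_id TEXT UNIQUE NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO molecule_dictionary (molregno, chembl_id) VALUES (?, ?)",
        [(101, "CHEMBL_EXAMPLE_A"), (102, "CHEMBL_EXAMPLE_B")],  # placeholders
    )
    return conn


def chembl_id_for_molregno(conn, molregno):
    """Return the external ChEMBLID for an internal molregno, or None."""
    row = conn.execute(
        "SELECT chembl_id FROM molecule_dictionary WHERE molregno = ?",
        (molregno,),
    ).fetchone()
    return row[0] if row else None


def molregno_for_chembl_id(conn, chembl_id):
    """Return the internal molregno for an external ChEMBLID, or None."""
    row = conn.execute(
        "SELECT molregno FROM molecule_dictionary WHERE chembl_id = ?",
        (chembl_id,),
    ).fetchone()
    return row[0] if row else None
```

Against a locally installed copy of the database, the same queries would presumably work once the connection points at the real instance, provided the table and column names match the release in use.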
The source mapping tables are on the FTP website and include a ChEMBL_ID to PubChem_ID mapping table (source mapping table for src1 to src22).
2) In the ChEMBL database, identifiers may be found in the compound_records table (as the compound_key).
Data are routinely extracted from seven core journals:
- Bioorg Med Chem Lett (https://www.sciencedirect.com/journal/bioorganic-and-medicinal-chemistry-letters)
However, we also have data from selected articles in more than 200 journals including Antimicrob Agents Chemother, Med Chem Res, J Agric Food Chem and Drug Metab Dispos. ChEMBL currently contains data from more than 86,000 journal articles, and we also now include data from selected patents.
You can sign up to our ChEMBL Announce mailing list (http://listserver.ebi.ac.uk/mailman/listinfo/chembl-announce), where you will be kept up to date with all new releases and changes to the database.
Because the data and the compounds are continually being curated, it is not possible to keep track of every alteration or addition. The CHEMBL_ID_LOOKUP table lists all active and inactive (obsolete) entities; the status field records which is which (ACTIVE or OBS). It is also possible to download previous releases and identify changes by comparing the data.
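One way to compare two releases is sketched below. It assumes you have already extracted, from each release, a mapping of ChEMBLID to status (for example from CHEMBL_ID_LOOKUP); the function name diff_releases and the input format are illustrative choices, not part of ChEMBL.

```python
def diff_releases(old, new):
    """Compare two {chembl_id: status} mappings taken from two releases.

    Returns the IDs that appear only in the newer release, the IDs that
    appear only in the older one, and the IDs whose status changed.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "status_changed": sorted(
            cid for cid in old_ids & new_ids if old[cid] != new[cid]
        ),
    }
```

An ID that moves from ACTIVE to OBS between releases shows up under status_changed rather than removed, because obsolete entries remain in the lookup table.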
If you need help, please contact us and we will be able to let you know what, if anything, has happened to the data you are interested in.
Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Félix E, Magariños MP, Mosquera JF, Mutowo P, Nowotka M, Gordillo-Marañón M, Hunter F, Junco L, Mugumbate G, Rodriguez-Lopez M, Atkinson F, Bosc N, Radoux CJ, Segura-Cabrera A, Hersey A, Leach AR. (2019) 'ChEMBL: towards direct deposition of bioassay data.' Nucleic Acids Res., 47(D1) D930-D940.
ChEMBL Web Services:
Davies M, Nowotka M, Papadatos G, Dedman N, Gaulton A, Atkinson F, Bellis L, Overington JP. (2015) 'ChEMBL web services: streamlining access to drug discovery data and utilities.' Nucleic Acids Res., 43(W1) W612-W620.
Jupp S, Malone J, Bolleman J, Brandizi M, Davies M, Garcia L, Gaulton A, Gehant S, Laibe C, Redaschi N, Wimalaratne SM, Martin M, Le Novère N, Parkinson H, Birney E, Jenkinson AM. (2014) 'The EBI RDF Platform: Linked Open Data for the Life Sciences.' Bioinformatics, 30 1338-1339.
- 1. We have a dedicated email address for data queries, error reporting or help requests. This is: [email protected]
The best way to access large amounts of data is to install a database instance on your own computer using MySQL.
Yes, there is a ChEMBL RDF. It is stored on the FTP site.
ChEMBL_14 is an earlier release of the data, with ChEMBL_15 being an updated version. Subsequent updates will have consecutive numbers, so the one with the highest number will be the most recent full version of the data.
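Because the release number is what determines recency, release names should be compared numerically rather than as text (as strings, "ChEMBL_9" would sort after "ChEMBL_15"). The sketch below assumes names of the plain form ChEMBL_<integer>; the function names are illustrative.

```python
import re

# Assumes plain names such as "ChEMBL_14"; other naming patterns are rejected.
_RELEASE_RE = re.compile(r"^ChEMBL_(\d+)$", re.IGNORECASE)


def release_number(name):
    """Extract the integer release number from a name like 'ChEMBL_15'."""
    match = _RELEASE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"Unrecognised release name: {name!r}")
    return int(match.group(1))


def latest_release(names):
    """Return the name with the highest release number."""
    return max(names, key=release_number)
```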
ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs.
ChEMBL-NTD is a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases: endemic tropical diseases of the developing regions of Africa, Asia, and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. ChEMBL-NTD is a subset of the data in the free medicinal chemistry and drug discovery database ChEMBLdb.
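To illustrate what normalising bioactivities into uniform units can mean in practice, here is a minimal sketch that converts molar concentrations to nanomolar. It is not the procedure ChEMBL itself uses; the choice of nM as the target unit, the set of accepted unit labels and the function name are all assumptions made for the example.

```python
# Conversion factors from each accepted unit label to nanomolar (nM).
_TO_NANOMOLAR = {
    "pM": 1e-3,
    "nM": 1.0,
    "uM": 1e3,
    "mM": 1e6,
    "M": 1e9,
}


def to_nanomolar(value, unit):
    """Convert a concentration in the given unit to nM.

    Raises ValueError for units outside the small table above, so that
    unexpected labels are flagged rather than silently passed through.
    """
    try:
        factor = _TO_NANOMOLAR[unit]
    except KeyError:
        raise ValueError(f"Unsupported unit: {unit!r}") from None
    return value * factor
```

With a helper like this, values reported as 1.5 uM and 1500 nM for the same end-point would end up directly comparable.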
ChEMBL consists of data from a wide variety of data sources including scientific literature and patents, deposited data sets, PubChem BioAssay and BindingDB databases, toxicology data sets and drug/clinical candidate resources. The current list of sources is:
GSK Malaria Screening
Novartis Malaria Screening
St Jude Malaria Screening
Sanger Institute Genomics of Drug Sensitivity in Cancer
Guide to Receptors and Channels (DEPRECATED)
Manually Added Drugs
USP Dictionary of USAN and International Drug Names
Drugs for Neglected Diseases Initiative (DNDi)
GSK Published Kinase Inhibitor Set
MMV Malaria Box
TP-search Transporter Database
Harvard Malaria Screening
WHO-TDR Malaria Screening
Deposited Supplementary Bioactivity Data
GSK Tuberculosis Screening
Open Source Malaria Screening
Millipore Kinase Screening (DEPRECATED - MERGED WITH SRC_ID = 1
External Project Compounds
Gene Expression Atlas Compounds
AstraZeneca Deposited Data
FDA Approval Packages
GSK Kinetoplastid Screening
Curated Drug Metabolism Pathways
St Jude Leishmania Screening
Gates Library compound collection
MMV Pathogen Box
Patent Bioactivity Data
Curated Drug Pharmacokinetic Data
CO-ADD antimicrobial screening data
WHO Anatomical Therapeutic Chemical Classification
British National Formulary
Published Kinase Inhibitor Set 2
Kuster lab chemical proteomics drug profiling
Winzeler Lab Plasmodium Screening Data
SARS-CoV-2 Screening Data 2020-21
Prodrug active ingredients
Donated Chemical Probes - SGC Frankfurt
EUbOPEN Chemogenomic Library
Salvensis and LSHTM Schistosomiasis screening data
IMI-CARE SARS-CoV-2 Data
MMV Malaria HGL
International Nonproprietary Names
Links to webinar slides and PDFs: