General Questions
The data is updated regularly, with releases approximately 2-3 times a year.
The ChEMBLID is a unique ID that has been assigned to compounds, targets, assays, documents, tissues and cell types in ChEMBL. It can be used to retrieve a Report Card page for these entities, or to search for them using the keyword search.
Molregno is the internal ChEMBL identifier assigned to each compound. The ChEMBLID is the externally visible identifier for each compound.
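As an illustration of how the two identifiers relate, here is a minimal sketch that looks up the internal molregno for an external ChEMBL ID. It assumes a table called molecule_dictionary with molregno and chembl_id columns, which is not described in this FAQ, so check the schema documentation for your release. The table is a two-column in-memory stand-in and every value in it is made up.

```python
import sqlite3

# Toy stand-in for the compound table: two columns only, made-up values.
# The table and column names are assumptions, not taken from this FAQ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecule_dictionary ("
    "molregno INTEGER PRIMARY KEY, chembl_id TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO molecule_dictionary VALUES (?, ?)",
    [(101, "CHEMBL_DEMO_A"), (102, "CHEMBL_DEMO_B")],
)


def molregno_for(conn, chembl_id):
    """Return the internal molregno for an external ChEMBL ID, or None."""
    row = conn.execute(
        "SELECT molregno FROM molecule_dictionary WHERE chembl_id = ?",
        (chembl_id,),
    ).fetchone()
    return None if row is None else row[0]


print(molregno_for(conn, "CHEMBL_DEMO_B"))
```

Against a local database instance the same query would apply, with the connection swapped for one to that instance.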
The source mapping tables are on the FTP website and include a ChEMBL_ID to PubChem_ID mapping table (source mapping table for src1 to src22).
2) In the ChEMBL database, identifiers may be found in the compound_records table (as the compound_key).
Data are routinely extracted from seven core journals:
- Bioorg Med Chem Lett (https://www.sciencedirect.com/journal/bioorganic-and-medicinal-chemistry-letters)
However, we also have data from selected articles in more than 200 journals including Antimicrob Agents Chemother, Med Chem Res, J Agric Food Chem and Drug Metab Dispos. ChEMBL currently contains data from more than 86,000 journal articles, and we also now include data from selected patents.
You can sign up to our ChEMBL Announce mailing list (http://listserver.ebi.ac.uk/mailman/listinfo/chembl-announce), where you will be kept up to date with all new releases and changes to the database.
As the data and the compounds are continually being curated, it is not possible to keep track of these alterations or additions. However, the CHEMBL_ID_LOOKUP table lists all active and inactive (obsolete) entities, flagged in its status field as ACTIVE or OBS. It is also possible to download previous releases and identify changes by comparing the data.
If you need help, please contact us and we will be able to let you know what, if anything, has happened to the data you are interested in.
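The two approaches mentioned above, checking the status field and comparing releases, can be sketched roughly as follows. This is an illustration under stated assumptions, not an official procedure: the in-memory table is a three-column stand-in for CHEMBL_ID_LOOKUP (the column names chembl_id and entity_type are assumptions; only the status field is described here), the rows are made up, and diff_releases is a hypothetical helper.

```python
import sqlite3


def build_demo_lookup(rows):
    """Create an in-memory stand-in for CHEMBL_ID_LOOKUP (toy schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chembl_id_lookup ("
        "chembl_id TEXT PRIMARY KEY, entity_type TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO chembl_id_lookup VALUES (?, ?, ?)", rows)
    return conn


def ids_with_status(conn, status):
    """Return the ChEMBL IDs whose status field equals `status`, sorted."""
    cur = conn.execute(
        "SELECT chembl_id FROM chembl_id_lookup "
        "WHERE status = ? ORDER BY chembl_id",
        (status,),
    )
    return [row[0] for row in cur]


def diff_releases(old_ids, new_ids):
    """Compare the ID sets of two releases: IDs added and IDs removed."""
    old_ids, new_ids = set(old_ids), set(new_ids)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
    }


# Made-up rows, for illustration only.
demo_rows = [
    ("CHEMBL_DEMO_1", "COMPOUND", "ACTIVE"),
    ("CHEMBL_DEMO_2", "COMPOUND", "OBS"),
    ("CHEMBL_DEMO_3", "TARGET", "ACTIVE"),
]
conn = build_demo_lookup(demo_rows)
obsolete = ids_with_status(conn, "OBS")
changes = diff_releases(
    ["CHEMBL_DEMO_1", "CHEMBL_DEMO_2"],  # IDs exported from an older release
    ["CHEMBL_DEMO_1", "CHEMBL_DEMO_3"],  # IDs exported from a newer release
)
print(obsolete)
print(changes)
```

In practice the ID lists passed to diff_releases would come from the same query run against two downloaded releases.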
ChEMBL Database:
Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Félix E, Magariños MP, Mosquera JF, Mutowo P, Nowotka M, Gordillo-Marañón M, Hunter F, Junco L, Mugumbate G, Rodriguez-Lopez M, Atkinson F, Bosc N, Radoux CJ, Segura-Cabrera A, Hersey A, Leach AR. (2019) 'ChEMBL: towards direct deposition of bioassay data.' Nucleic Acids Res., 47(D1) D930-D940.
ChEMBL Web Services:
Davies M, Nowotka M, Papadatos G, Dedman N, Gaulton A, Atkinson F, Bellis L, Overington JP. (2015) 'ChEMBL web services: streamlining access to drug discovery data and utilities.' Nucleic Acids Res., 43(W1) W612-W620.
ChEMBL RDF:
Jupp S, Malone J, Bolleman J, Brandizi M, Davies M, Garcia L, Gaulton A, Gehant S, Laibe C, Redaschi N, Wimalaratne SM, Martin M, Le Novère N, Parkinson H, Birney E, Jenkinson AM. (2014) 'The EBI RDF Platform: Linked Open Data for the Life Sciences.' Bioinformatics, 30 1338-1339.
- 1. We have a dedicated email address for data queries, error reporting or help requests: [email protected]
- 2.
The best way to access large amounts of data is to install a database instance on your own computer using MySQL.
Yes, there is a ChEMBL RDF. It is stored on the FTP site.
ChEMBL_14 is an earlier release of the data, with ChEMBL_15 being an updated version. Subsequent updates will have consecutive numbers, so the one with the highest number will be the most recent full version of the data.
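Because release labels are ordered by number, picking the most recent one from a list needs a numeric comparison rather than a plain string sort. A minimal sketch, assuming labels of the form ChEMBL_<number> as used above (release_number and latest_release are hypothetical helper names):

```python
import re


def release_number(name):
    """Extract the integer from a release label such as 'ChEMBL_15'."""
    match = re.fullmatch(r"ChEMBL_(\d+)", name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised release label: {name!r}")
    return int(match.group(1))


def latest_release(names):
    """Return the label with the highest release number."""
    return max(names, key=release_number)


releases = ["ChEMBL_9", "ChEMBL_14", "ChEMBL_15"]
print(latest_release(releases))
```

Comparing the labels as strings would rank ChEMBL_9 above ChEMBL_15, which is why the number is parsed out first.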
ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, molecular weight, Lipinski parameters) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and to tag the links between a molecular target and a published assay with a set of varying confidence levels. The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the SAR and discovery of modern drugs.
ChEMBL-NTD is a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases - endemic tropical diseases of the developing regions of Africa, Asia and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. ChEMBL-NTD is a subset of the data in the free medicinal chemistry and drug discovery database ChEMBLdb.
ChEMBL consists of data from a wide variety of data sources including scientific literature and patents, deposited data sets, PubChem BioAssay and BindingDB databases, toxicology data sets and drug/clinical candidate resources. The current list of sources is:
Source | Source ID |
---|---|
Scientific Literature | 1 |
GSK Malaria Screening | 2 |
Novartis Malaria Screening | 3 |
St Jude Malaria Screening | 4 |
Sanger Institute Genomics of Drug Sensitivity in Cancer | 5 |
PDBe Ligands | 6 |
PubChem BioAssays | 7 |
Clinical Candidates | 8 |
Orange Book | 9 |
Guide to Receptors and Channels (DEPRECATED) | 10 |
Open TG-GATEs | 11 |
Manually Added Drugs | 12 |
USP Dictionary of USAN and International Drug Names | 13 |
Drugs for Neglected Diseases Initiative (DNDi) | 14 |
DrugMatrix | 15 |
GSK Published Kinase Inhibitor Set | 16 |
MMV Malaria Box | 17 |
TP-search Transporter Database | 18 |
Harvard Malaria Screening | 19 |
WHO-TDR Malaria Screening | 20 |
Deposited Supplementary Bioactivity Data | 21 |
GSK Tuberculosis Screening | 22 |
Open Source Malaria Screening | 23 |
Millipore Kinase Screening (DEPRECATED - MERGED WITH SRC_ID = 1) | 24 |
External Project Compounds | 25 |
Gene Expression Atlas Compounds | 26 |
AstraZeneca Deposited Data | 27 |
FDA Approval Packages | 28 |
GSK Kinetoplastid Screening | 29 |
K4DD Project | 30 |
Curated Drug Metabolism Pathways | 31 |
St Jude Leishmania Screening | 32 |
Gates Library compound collection | 33 |
MMV Pathogen Box | 34 |
HeCaToS Compounds | 35 |
Withdrawn Drugs | 36 |
BindingDB Database | 37 |
Patent Bioactivity Data | 38 |
Curated Drug Pharmacokinetic Data | 39 |
CO-ADD antimicrobial screening data | 40 |
WHO Anatomical Therapeutic Chemical Classification | 41 |
British National Formulary | 42 |
Published Kinase Inhibitor Set 2 | 43 |
Kuster lab chemical proteomics drug profiling | 48 |
HESi | 49 |
Winzeler Lab Plasmodium Screening Data | 51 |
SARS-CoV-2 Screening Data 2020-21 | 52 |
Prodrug active ingredients | 53 |
Donated Chemical Probes - SGC Frankfurt | 54 |
EUbOPEN Chemogenomic Library | 55 |
Salvensis and LSHTM Schistosomiasis screening data | 56 |
IMI-CARE SARS-CoV-2 Data | 57 |
Fraunhofer HDAC6 | 59 |
MMV Malaria HGL | 60 |
International Nonproprietary Names | 63 |
Links to webinar slides and PDFs: