Molecule standardisation

This is the public version of a tool designed to provide a simple way of standardising molecules as a prelude to e.g. molecular modelling exercises.


Last updated 4 years ago


A Python module performs the complete standardisation procedure. In addition, the modules that implement the individual steps in this procedure may be accessed separately if required, for example as part of a custom standardisation pipeline.

The tool is open source and is available from GitHub.

A slide set describing some of the background to the project is available (standardiser.pdf, 1MB).

In summary, the general procedure for standardising a molecule (with the documentation for the appropriate module named in brackets) is:

  • Break bonds to Group I or II metals [break_bonds]

  • Neutralise charges by adding/removing protons [neutralise]

  • Apply standardisation rules [rules]

  • Apply tautomerism rules [tautomerism]

  • Re-run neutralisation (in case any charges are exposed by rules)

  • Discard any salt/solvate components [unsalt]

  • Return standardised parent

The complete procedure is implemented by the standardise module; a bare-bones alternative workflow using the individual modules is shown here.
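The ordering of the steps listed above can be sketched as a simple chain of functions. This is only a toy illustration: every function below is a hypothetical stand-in that manipulates SMILES text with string replacement, not the standardiser API and not real cheminformatics. It is meant to show why neutralisation runs twice and why salt removal comes last.

```python
from typing import Callable, List

# Toy salt dictionary (an assumption for this sketch, not the ChEMBL-based one).
TOY_SALTS = {"[Na+]", "[K+]", "[Cl-]"}


def break_bonds_toy(smiles: str) -> str:
    """Toy step: turn a covalent O-Na bond into an ionic pair."""
    return smiles.replace("O[Na]", "[O-].[Na+]")


def neutralise_toy(smiles: str) -> str:
    """Toy step: protonate an anionic oxygen."""
    return smiles.replace("[O-]", "O")


def identity_toy(smiles: str) -> str:
    """Placeholder for the rule-based and tautomerism steps."""
    return smiles


def unsalt_toy(smiles: str) -> str:
    """Toy step: drop any component that appears in the toy salt dictionary."""
    kept = [part for part in smiles.split(".") if part not in TOY_SALTS]
    return ".".join(kept)


# Same order as the procedure described in the text.
STEPS: List[Callable[[str], str]] = [
    break_bonds_toy,   # break bonds to Group I or II metals
    neutralise_toy,    # neutralise charges
    identity_toy,      # apply standardisation rules (placeholder)
    identity_toy,      # apply tautomerism rules (placeholder)
    neutralise_toy,    # re-run neutralisation
    unsalt_toy,        # discard salt/solvate components
]


def standardise_toy(smiles: str) -> str:
    """Apply each toy step in turn and return the resulting parent."""
    for step in STEPS:
        smiles = step(smiles)
    return smiles


parent = standardise_toy("CC(=O)O[Na]")  # sodium acetate drawn covalently
```

Keeping the steps in a plain list makes it easy to drop or reorder stages when assembling a custom pipeline, which is the use case the individual modules are intended for.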

The documentation is contained in the project docs/ directory, and consists of a set of Jupyter Notebooks, which can be viewed (and run and edited) by starting a notebook server in that directory.

A simple command-line driver program, standardiser_mol.py, is available in the project bin/ directory. It takes SD or SMILES files as input and writes two output files: one containing the structures that were successfully standardised and one containing the structures for which the procedure failed.
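The pass/fail split that the driver performs can be sketched as follows. This is not the code of standardiser_mol.py: the function `split_by_outcome`, the tab-separated output layout, and the stand-in `fake_standardise` (which rejects any input containing `*`) are all assumptions made for illustration. In-memory buffers are used in place of real output files.

```python
import io
from typing import Callable, Iterable, TextIO, Tuple


def split_by_outcome(
    records: Iterable[Tuple[str, str]],
    standardise: Callable[[str], str],
    passed: TextIO,
    failed: TextIO,
) -> Tuple[int, int]:
    """Write standardised structures to `passed` and failures to `failed`.

    `records` yields (smiles, name) pairs; `standardise` is any callable
    that returns a parent SMILES or raises ValueError on failure.
    """
    n_ok = n_fail = 0
    for smiles, name in records:
        try:
            parent = standardise(smiles)
        except ValueError as err:
            # Keep the original structure and the reason for the failure.
            failed.write(f"{smiles}\t{name}\t{err}\n")
            n_fail += 1
        else:
            passed.write(f"{parent}\t{name}\n")
            n_ok += 1
    return n_ok, n_fail


def fake_standardise(smiles: str) -> str:
    """Hypothetical stand-in for a real standardisation call."""
    if "*" in smiles:
        raise ValueError("unsupported atom")
    return smiles


passed_out, failed_out = io.StringIO(), io.StringIO()
counts = split_by_outcome(
    [("CCO", "ethanol"), ("C*C", "bad")],
    fake_standardise,
    passed_out,
    failed_out,
)
```

Recording the failure reason alongside the original structure makes the second file useful for triage rather than just a list of rejects.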

Acknowledgements

This work was funded by the IMI eTOX project.

The salt dictionary used is based on that used in the ChEMBL database; this was compiled by L.J. Bellis, A. Hersey and others, and was in turn based on that used in the USAN nomenclature.

Some of the standardisation rules were inspired by those used in the InChI software.

This project is built using the RDKit chemistry toolkit.

Licensing

This code is released under the Apache 2.0 license. Copyright [2020] is retained by EMBL-EBI.
