
Using MAIP results

How to use the MAIP results for your dataset


Last updated 3 years ago


MAIP returns a score for each compound that passed the standardisation procedure. Compounds failing standardisation are included in the output file but with no score.

Usage

MAIP returns the model score. To help users with their decision making, we provide performance metrics calculated for three different validation sets. For different fractions of each dataset, we calculated the enrichment factor and report the model score threshold that delimits each fraction.

Score calculation

The score (active class logarithmic joint probability) is generated by the Naïve Bayes model described in Xia et al. (2004). For a given compound it is defined by:

$$
\mathrm{Score} = \sum_{i} \log P(\mathrm{active} \mid F_i)
$$

where P(active|Fi) is the model posterior probability for each feature Fi. It is defined by:

$$
P(\mathrm{active} \mid F_i) = \frac{A_{F_i} + 1}{T_{F_i} \cdot \frac{A}{T} + 1}
$$

where A is the number of active molecules in the set, T the total number of molecules in the set, TFi the total number of molecules that contain feature Fi and AFi the number of active molecules that contain feature Fi.
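As an illustration of how the counts A, T, TFi and AFi combine into a score, here is a minimal Python sketch. It assumes the Laplacian-corrected estimator of Xia et al. (2004), (AFi + 1) / (TFi · A/T + 1), a natural logarithm, and that features absent from the training counts are skipped. The function names and the `feature_counts` dictionary are hypothetical and are not part of MAIP.

```python
import math


def feature_posterior(a_fi, t_fi, a, t):
    """Assumed Laplacian-corrected estimate for one feature:
    (A_Fi + 1) / (T_Fi * A / T + 1)."""
    return (a_fi + 1) / (t_fi * a / t + 1)


def compound_score(features, feature_counts, a, t):
    """Sum log P(active|F_i) over the features present in a compound.

    feature_counts maps a feature to (A_Fi, T_Fi). Features never seen in
    training are skipped (an assumption of this sketch).
    """
    score = 0.0
    for feature in features:
        if feature in feature_counts:
            a_fi, t_fi = feature_counts[feature]
            score += math.log(feature_posterior(a_fi, t_fi, a, t))
    return score


# Toy training counts: 10 actives among 100 molecules.
counts = {"x": (5, 10), "y": (0, 20)}
print(compound_score(["x"], counts, a=10, t=100))
```

In this toy set, feature "x" is found mostly in actives and raises the score, while feature "y" is found only in inactives and lowers it.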

Model validation

Performance metrics

We used two metrics to measure the model performance: the enrichment factor and the ROC AUC score.

Enrichment factor. The enrichment factor (EF) is the hit rate (the proportion of active compounds) within a defined fraction of the score-sorted dataset, divided by the hit rate of the whole dataset. It is defined by:

$$
\mathrm{EF}[X] = \frac{P[X] / N[X]}{P / N}
$$

where X is the user-defined fraction, P[X] the number of actives in the fraction, N[X] the total number of compounds in the fraction, P the total number of actives and N the total number of compounds. The EF is frequently used as a pragmatic measure of performance, reflecting the common use of in silico models to identify a subset of compounds for experimental evaluation. In this work, we calculated this metric at 1%, 10% and 50%.
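The enrichment factor can be computed directly from a list of scores and activity labels. The sketch below is one possible implementation, not the code used to produce the figures on this page. It assumes the fraction size is rounded down to a whole number of compounds (with a minimum of one) and that ties are broken by the sort order; `enrichment_factor` is a hypothetical helper name.

```python
def enrichment_factor(scores, labels, fraction):
    """EF[X] = (P[X] / N[X]) / (P / N) for the top `fraction` of compounds.

    scores: model scores, one per compound.
    labels: 1 for active, 0 for inactive, in the same order as scores.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    p = sum(labels)
    if n == 0 or p == 0:
        raise ValueError("need at least one compound and one active")
    # Rank compounds from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_x = max(1, int(n * fraction))
    p_x = sum(label for _, label in ranked[:n_x])
    return (p_x / n_x) / (p / n)


scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(enrichment_factor(scores, labels, 0.2))
```

An EF of 1 means the selected fraction is no richer in actives than the whole dataset, which is what selecting the full dataset (fraction 1.0) always gives.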

Validation sets

Our model was validated with three experimental validation sets. Here are some details:

| Dataset name | Dataset size | Number of actives | Number of inactives | Hit rate |
|---|---|---|---|---|
| St. Jude Screening Set | 220,691 | 9,082 | 211,609 | 4.12% |
| MMV test set | 5,869 | 1,198 | 4,671 | 20.41% |
| PubChem | 91,796 | 384 | 91,412 | 0.42% |

Note that the proportion of active compounds differs widely between the sets. The MMV test set in particular has a much higher hit rate (20.41%) than the other two.
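The hit rate matters when comparing enrichment factors across sets, because it caps the EF that any model can reach. The following sketch recomputes the hit rates from the counts in the table and derives that ceiling from the EF definition: if every compound in the selected fraction were active, the EF would be N/P, and once the fraction is large enough to hold all actives it cannot exceed 1/X. The helper names are hypothetical.

```python
# (number of actives, number of inactives) taken from the table above.
VALIDATION_SETS = {
    "St. Jude Screening Set": (9082, 211609),
    "MMV test set": (1198, 4671),
    "PubChem": (384, 91412),
}


def hit_rate(actives, inactives):
    """Proportion of active compounds in a dataset."""
    return actives / (actives + inactives)


def max_enrichment_factor(actives, inactives, fraction):
    """Upper bound on EF[X]: min(N / P, 1 / X)."""
    total = actives + inactives
    return min(total / actives, 1 / fraction)


for name, (actives, inactives) in VALIDATION_SETS.items():
    rate = hit_rate(actives, inactives)
    ceiling = max_enrichment_factor(actives, inactives, 0.01)
    print(f"{name}: hit rate {rate:.2%}, maximum EF[1%] {ceiling:.1f}")
```

Under these definitions the EF at 1% for the MMV test set cannot exceed about 4.9, whereas the other two sets allow values far higher, so raw EF values are not directly comparable between sets.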

Results

St. Jude Screening Set

MMV test set

PubChem

Result summary

| Dataset | ROC AUC score | EF[1%] | EF[10%] | EF[50%] | #actives | #inactives |
|---|---|---|---|---|---|---|
| St. Jude Screening Set | 0.81 | 12.1 (71) | 4.8 (36) | 1.8 (15) | 9,082 | 211,609 |
| MMV test set | 0.67 | 3.5 (60) | 2.1 (41) | 1.4 (23) | 1,198 | 4,671 |
| PubChem | 0.69 | 7.0 (56) | 2.8 (47) | 1.5 (34) | 384 | 91,412 |

(The number in parentheses is the model score threshold at which the corresponding x% fraction of the dataset is reached.)
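One way to use these thresholds is to look up which reference fraction a given score would fall into for each validation set. The sketch below hard-codes the thresholds and enrichment factors from the summary table. Treating the threshold as inclusive (score greater than or equal to the threshold) is an assumption, and `smallest_fraction_reached` is a hypothetical helper.

```python
# fraction -> (score threshold, enrichment factor), from the result summary.
REFERENCE = {
    "St. Jude Screening Set": {0.01: (71, 12.1), 0.10: (36, 4.8), 0.50: (15, 1.8)},
    "MMV test set": {0.01: (60, 3.5), 0.10: (41, 2.1), 0.50: (23, 1.4)},
    "PubChem": {0.01: (56, 7.0), 0.10: (47, 2.8), 0.50: (34, 1.5)},
}


def smallest_fraction_reached(score, dataset):
    """Return the smallest reference fraction whose threshold the score meets.

    Returns None when the score is below the 50% threshold of that dataset.
    """
    for fraction in sorted(REFERENCE[dataset]):
        threshold, _ = REFERENCE[dataset][fraction]
        if score >= threshold:
            return fraction
    return None


for dataset in REFERENCE:
    fraction = smallest_fraction_reached(65, dataset)
    if fraction is None:
        print(f"{dataset}: below the 50% threshold")
    else:
        ef = REFERENCE[dataset][fraction][1]
        print(f"{dataset}: within the top {fraction:.0%} (EF {ef})")
```

The same score maps to different fractions depending on the reference set, so it is worth checking a score against the set that most resembles the library being screened.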

Wrap-up

The model shows enrichment when tested against all three validation sets. For each dataset, the higher the model score, the greater the observed enrichment. Also, the thresholds needed to pick 1%, 10% and 50% of the predictions correlate with the dataset size.

We recommend that users refer to these data when analysing their results. For instance, if a library of 1 million compounds is scored and only a few of them have a high score (say, only 1% of them score above 15), we would expect the enrichment for that subset to be modest.
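A quick way to make this check on your own results is to compute what fraction of the scored compounds lies above a chosen score. The sketch below assumes the scores have already been read from the MAIP output into a Python list, with `None` standing for compounds that failed standardisation and therefore have no score. How the output file is parsed is not shown, and `fraction_above` is a hypothetical helper.

```python
def fraction_above(scores, threshold):
    """Fraction of scored compounds whose score is strictly above `threshold`.

    Compounds without a score (None) are ignored.
    """
    scored = [s for s in scores if s is not None]
    if not scored:
        raise ValueError("no scored compounds")
    return sum(1 for s in scored if s > threshold) / len(scored)


library_scores = [20, 16, 10, None, 3]
print(f"{fraction_above(library_scores, 15):.0%} of scored compounds are above 15")
```

Comparing this fraction with the thresholds reported for the validation sets indicates whether the top of your ranking sits in a score range where strong or only modest enrichment was observed.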

Finally, users are advised to apply additional in silico filters to assess the suitability of any hits from MAIP prior to screening. High-scoring compounds may have physicochemical properties and/or substructures that are unsuitable as starting points for a malaria drug discovery programme. In addition, some of the training sets used in MAIP contain examples of known anti-malarial compounds (e.g. aminoquinolines). Thus, molecules with a high score in the model may have already been worked on extensively in anti-malarial programmes. Public bioactivity resources such as ChEMBL can be used to check whether anti-malarial activity is already known for particular structural classes.

ROC AUC score. The receiver operating characteristic (ROC) curve is calculated by plotting the fraction of false positives on the x-axis and the fraction of true positives on the y-axis. The area under the curve (AUC) provides a measure of how well a model is able to classify binary data. A value of 0.5 corresponds to selecting compounds at random, while a perfect model will return an ROC AUC value of 1. This metric is often used to measure the performance of classification models as it is insensitive to class imbalance. The figure below shows an example of an ROC curve.

Figure: Example of ROC curve.

Figure: Individual distribution of model scores for the active (orange) and inactive (blue) compounds for the St. Jude Screening Set. Grey lines illustrate the fractions of compounds used to calculate the enrichment factors.

Figure: Individual distribution of model scores for the active (orange) and inactive (blue) compounds for the MMV test set. Grey lines illustrate the fractions of compounds used to calculate the enrichment factors.

Figure: Individual distribution of model scores for the active (orange) and inactive (blue) compounds for the PubChem set. Grey lines illustrate the fractions of compounds used to calculate the enrichment factors.

Reference for the Naïve Bayes model: Xia et al. (2004)
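The ROC AUC score can also be read as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one, with ties counted as one half. The sketch below computes it that way by comparing every active with every inactive. It is meant for illustration on small sets only, since the pairwise loop scales with the product of the two class sizes; for large datasets, `sklearn.metrics.roc_auc_score(y_true, y_score)` from scikit-learn is a common choice. The function name `roc_auc` is hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive. Ties count as half a correct pair.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    correct = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                correct += 1.0
            elif a == i:
                correct += 0.5
    return correct / (len(actives) * len(inactives))


print(roc_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Because the value depends only on how actives rank relative to inactives, it does not change when the proportion of actives in the set changes, which is why it complements the enrichment factor.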