# FAQs

**UniChem Contents**

**What sort of sources are used in UniChem?**

* Any chemically aware database whose compounds have been assigned identifiers (called src\_compound\_ids in UniChem) and structures. A source may store its structures in a variety of ways.
* If the source provides standard InChIs, UniChem uses them.
* If the source does not provide standard InChIs, UniChem generates them while loading the data.
* There need not be a 1:1 relationship between src\_compound\_ids and InChIs in a source. Many sources define chemical uniqueness by means other than the standard InChI. UniChem handles 1:many and many:1 relationships between src\_compound\_ids and standard InChIs.
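
The 1:many and many:1 relationships described above can be pictured as two lookup tables built from the same set of pairs. The following is a minimal sketch only, not UniChem's actual loader: the `SRC-…` identifiers, the `records` list and the `build_indexes` helper are hypothetical and exist purely for illustration.

```python
from collections import defaultdict

# Hypothetical (src_compound_id, standard InChI) pairs for a single source.
# These are illustrative values, not real UniChem records.
records = [
    ("SRC-001", "InChI=1S/CH4/h1H4"),
    ("SRC-002", "InChI=1S/CH4/h1H4"),                  # many:1, two ids share one InChI
    ("SRC-003", "InChI=1S/C2H6/c1-2/h1-2H3"),
    ("SRC-003", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),  # 1:many, one id has two InChIs
]


def build_indexes(pairs):
    """Index the pairs in both directions without assuming a 1:1 mapping."""
    id_to_inchis = defaultdict(set)
    inchi_to_ids = defaultdict(set)
    for src_compound_id, inchi in pairs:
        id_to_inchis[src_compound_id].add(inchi)
        inchi_to_ids[inchi].add(src_compound_id)
    return id_to_inchis, inchi_to_ids


id_to_inchis, inchi_to_ids = build_indexes(records)
print(sorted(inchi_to_ids["InChI=1S/CH4/h1H4"]))  # ids that share one InChI
print(len(id_to_inchis["SRC-003"]))               # number of InChIs for one id
```

Storing sets on both sides means neither direction is forced to be unique, which is the property the last bullet describes.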

**What sources are currently used in UniChem?**\
These are listed on the 'Sources' page. Go [here](https://www.ebi.ac.uk/unichem/sources).

**Some src\_compound\_ids are missing from a source in UniChem. Why is this?**\
This is explained on the 'Reasons for data omission' page. Go [here](/unichem/submission-of-data-to-unichem/rules-for-loading.md).

**The source I am trying to create hyperlinks to does not use the src\_compound\_id to create the URL for compound-specific pages. How does UniChem deal with this?**\
In these circumstances, UniChem uses 'auxiliary data' to create links, as described immediately below.

**What is 'auxiliary data' for a src\_compound\_id and when would I need to use it?**

* Most sources within UniChem create URLs for compound-specific pages by simply appending a src\_compound\_id to a 'base URL' (e.g. appending 'CHEMBL59' to '<https://www.ebi.ac.uk/chembldb/compound/inspect/>' gives <https://www.ebi.ac.uk/chembldb/compound/inspect/CHEMBL59>).
* However, a small number of sources in UniChem create URLs for compound-specific pages from strings or identifiers ('auxiliary data') that differ from the source's src\_compound\_ids. This is uncommon, and UniChem handles it with an additional mapping step for these sources, in which the src\_compound\_ids are mapped to the 'auxiliary data'.
* Sources where this step is necessary are marked with a '1' in the 'AUX\_FOR\_URL' field of the UC\_SOURCES table.
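
The two URL-building cases above can be sketched as a single function. This is an illustration under stated assumptions, not UniChem's implementation: `build_compound_url`, its parameters and the `aux_data` mapping are hypothetical names, and the sketch assumes that the auxiliary value is appended to the base URL in the same way a src\_compound\_id would be, which the text above does not confirm.

```python
from typing import Mapping, Optional


def build_compound_url(
    base_url: str,
    src_compound_id: str,
    aux_for_url: bool,
    aux_data: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a compound-specific URL for one source.

    aux_for_url mirrors the AUX_FOR_URL flag: when it is set, the
    src_compound_id is first mapped to its auxiliary data.
    """
    if not aux_for_url:
        # Common case: append the src_compound_id to the base URL.
        return base_url + src_compound_id
    if aux_data is None or src_compound_id not in aux_data:
        raise KeyError(f"no auxiliary data for {src_compound_id!r}")
    # Assumption: the auxiliary value takes the place of the id in the URL.
    return base_url + aux_data[src_compound_id]


# Common case, using the base URL quoted in the text above.
print(build_compound_url(
    "https://www.ebi.ac.uk/chembldb/compound/inspect/", "CHEMBL59", False
))

# Auxiliary case, with a made-up source and a made-up mapping.
print(build_compound_url(
    "https://example.org/compound/", "X1", True, {"X1": "aux-42"}
))
```

Raising an error when the mapping is missing, rather than falling back to the src\_compound\_id, is a design choice in this sketch: a silently wrong link is harder to spot than a failed lookup.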

**It looks like some data is replicated in multiple sources in UniChem. For example, 'pubchem', 'pubchem\_tpharma' and 'pubchem\_dotf' all come from PubChem. Why is this?**

* Some users wish to create hyperlinks to an entire data source, others only to sub-sets of the data within a source. For this reason, UniChem maintains some data both as a source covering the entire data source and as separate sources covering sub-sets of it.
* Some sources in UniChem, such as PubChem, have integrated structures from a wide variety of depositing primary sources. In the case of PubChem, the structures as originally deposited are assigned a different set of identifiers (SIDs) from the 'integrated' equivalent of the molecule (which is assigned a CID). An SID therefore represents the original depositor-defined version of the structure, and the CID represents PubChem's normalized, integrated form of the molecule.
* Sometimes these structures differ from one another, because the normalization process may change the structure.
* In the case of PubChem, UniChem has included some sub-sets on the basis of the original depositor, and has therefore adopted the following policy:
  * For the entire PubChem data source, CIDs are used.
  * For depositor-defined sub-sets of PubChem, SIDs are used instead.
* Please go [here](http://www.jcheminf.com/content/5/1/3) for a full discussion of provenance in relation to UniChem.

