
Structure search type



There are several approaches to locating chemical structures within patent documents. SureChEMBL focuses on the most popular methods.

By convention, a molecule examined within a patent is called a target, the structure being sought is called a query, and a target that matches the query is called a hit.

The following chemical searches are available:

Search type
Description

Substructure

Chemists are most often interested in Substructure search, which tests whether a target structure contains the query structure.

Note: If special molecular features are present on the query (e.g. stereochemistry or charge), only targets containing those features are considered hits. However, if a feature is missing from the query, it is not checked, so targets with or without that feature may appear as hits.
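The rule in the note above can be illustrated with a small sketch. This is not SureChEMBL's implementation; it is a toy backtracking matcher written for illustration, assuming molecules are reduced to heavy atoms with an element and a formal charge, and ignoring bond orders, hydrogens, and stereochemistry. The names `atom_matches` and `is_substructure` are hypothetical. A charge of `None` on a query atom stands for "feature not drawn on the query".

```python
from typing import List, Optional, Set, Tuple

# (element symbol, formal charge). On a QUERY atom, a charge of None means
# the feature was not drawn, so it is not checked against the target.
Atom = Tuple[str, Optional[int]]
Bond = Tuple[int, int]


def atom_matches(query_atom: Atom, target_atom: Atom) -> bool:
    """Elements must agree; charge is compared only if the query specifies it."""
    q_element, q_charge = query_atom
    t_element, t_charge = target_atom
    if q_element != t_element:
        return False
    return q_charge is None or q_charge == t_charge


def adjacency(n_atoms: int, bonds: List[Bond]) -> List[Set[int]]:
    """Build a neighbour set per atom from an undirected bond list."""
    neighbours: List[Set[int]] = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def is_substructure(q_atoms, q_bonds, t_atoms, t_bonds) -> bool:
    """Return True if the query graph can be embedded in the target graph."""
    q_adj = adjacency(len(q_atoms), q_bonds)
    t_adj = adjacency(len(t_atoms), t_bonds)

    def extend(mapping):
        # mapping: query atom index -> target atom index, filled in index order
        if len(mapping) == len(q_atoms):
            return True
        qi = len(mapping)
        for ti in range(len(t_atoms)):
            if ti in mapping.values():
                continue
            if not atom_matches(q_atoms[qi], t_atoms[ti]):
                continue
            # Every query bond to an already-mapped atom must exist in the target.
            if all(mapping[qj] in t_adj[ti] for qj in q_adj[qi] if qj in mapping):
                mapping[qi] = ti
                if extend(mapping):
                    return True
                del mapping[qi]
        return False

    return extend({})


bond = [(0, 1)]
methylamine = [("C", 0), ("N", 0)]       # neutral target
methylammonium = [("C", 0), ("N", 1)]    # charged target
query_plain = [("C", None), ("N", None)]  # no charge drawn on the query
query_charged = [("C", None), ("N", 1)]   # charge drawn on the query

print(is_substructure(query_plain, bond, methylamine, bond))       # True
print(is_substructure(query_plain, bond, methylammonium, bond))    # True
print(is_substructure(query_charged, bond, methylammonium, bond))  # True
print(is_substructure(query_charged, bond, methylamine, bond))     # False
```

The query without a charge hits both targets, while the query with a charge hits only the charged target, which mirrors the behaviour described in the note.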

Similarity

A Similarity structure search looks for target structures that are similar to the query structure. The similarity measure is based on hashed binary chemical fingerprints that are compared using the Tanimoto metric. That is to say, the presence of molecular features is recorded for both the query and the target, and the two records are then compared using a standard formula.

Note: See Similarity Search, Tanimoto Coefficient and Fingerprint Generation for a complete description of the concept and method used.

Note: If you choose Similarity as your search type, you will be prompted to provide a Tanimoto coefficient between 50% and 100%.
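As a rough illustration of the comparison described above, the Tanimoto coefficient of two binary fingerprints is the number of bits set in both divided by the number of bits set in either, T = c / (a + b - c). The sketch below is not SureChEMBL's fingerprinting method: the feature strings, the 1024-bit width, the SHA-256 hashing, and the function names `fingerprint`, `tanimoto`, and `is_hit` are all assumptions chosen for the example.

```python
import hashlib

N_BITS = 1024  # assumed fingerprint width, for illustration only


def fingerprint(features, n_bits=N_BITS):
    """Hash each feature string to one bit of an n_bits-wide integer bitset."""
    bits = 0
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % n_bits
        bits |= 1 << position
    return bits


def tanimoto(fp_a, fp_b):
    """Bits set in both fingerprints divided by bits set in either."""
    common = bin(fp_a & fp_b).count("1")
    total = bin(fp_a | fp_b).count("1")
    return common / total if total else 0.0


def is_hit(query_fp, target_fp, threshold_percent):
    """Apply a user-chosen similarity threshold in the 50-100% range."""
    if not 50 <= threshold_percent <= 100:
        raise ValueError("threshold must be between 50 and 100 percent")
    return tanimoto(query_fp, target_fp) * 100 >= threshold_percent


# Explicit bit patterns: 2 bits in common, 4 bits set overall.
print(tanimoto(0b1110, 0b0111))  # 0.5

# Hypothetical feature lists standing in for real molecular features.
query_fp = fingerprint(["C-C", "C-O", "C=O"])
target_fp = fingerprint(["C-C", "C-O", "C-N"])
print(is_hit(query_fp, target_fp, 50))
```

Because features are hashed into a fixed number of bits, two different features can occasionally land on the same bit, so a fingerprint comparison is an approximation of feature overlap rather than an exact count.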

Identical

In an Identical structure search, all molecular features of the query and the target must match (e.g. a non-stereo query will only match a non-stereo target).

Connectivity

A Connectivity search retrieves compounds that share the query's atomic connectivity but may differ in stereochemistry or isotopic form.
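The difference between Identical and Connectivity matching can be sketched with standard InChIKeys. This is only an analogy, since this page does not say how SureChEMBL implements either search. In a standard InChIKey, the first 14-character block is derived from the connectivity layers of the InChI, and the second block covers stereochemistry and isotopic layers. The helper names below are hypothetical.

```python
def connectivity_block(inchikey: str) -> str:
    """Return the first InChIKey block, which reflects atomic connectivity."""
    return inchikey.split("-")[0]


def identical_match(query_key: str, target_key: str) -> bool:
    """Every feature must agree, so the whole key has to be equal."""
    return query_key == target_key


def connectivity_match(query_key: str, target_key: str) -> bool:
    """Only the connectivity block has to agree; stereo and isotopes may differ."""
    return connectivity_block(query_key) == connectivity_block(target_key)


ALANINE = "QNAYBMKLOCPYGJ-UHFFFAOYSA-N"    # no stereochemistry specified
L_ALANINE = "QNAYBMKLOCPYGJ-REOHZZPHSA-N"
D_ALANINE = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"

print(identical_match(ALANINE, L_ALANINE))       # False
print(identical_match(ALANINE, ALANINE))         # True
print(connectivity_match(ALANINE, L_ALANINE))    # True
print(connectivity_match(L_ALANINE, D_ALANINE))  # True
```

Under this analogy, a non-stereo query is identical only to the non-stereo record, while a connectivity comparison groups the non-stereo form and both enantiomers together.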
