SureChEMBL

Structure search type

Last updated 2 days ago

There are several ways to locate chemical structures within patent documents. SureChEMBL supports the four search types listed below.

By convention, a molecule examined within a patent is called a target, the structure being sought is called a query, and a target that matches the query is called a hit.

The following chemical searches are available:

Search type
Description

Substructure

A Substructure search checks whether a target structure contains the query structure within it. This is the search type chemists use most often.

Note: If the query carries special molecular features (e.g. stereochemistry or charge), only targets that contain those features are considered hits. If a feature is absent from the query, it is not checked, so targets with or without that feature may appear as hits.
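The matching rule in the note above can be illustrated with a small sketch. It is a toy backtracking matcher written for this page, not SureChEMBL's implementation: the molecule representation (a list of atom dictionaries plus a list of bonded index pairs) and the names `atom_matches` and `is_substructure` are assumptions, hydrogens are left out, and bond orders and stereochemistry are ignored for brevity. Charge is the only optional feature modelled.

```python
def atom_matches(query_atom, target_atom):
    """Elements must agree, and every feature the query specifies must be
    equal on the target. Features absent from the query are not checked."""
    if query_atom["element"] != target_atom["element"]:
        return False
    return all(target_atom.get(key) == value
               for key, value in query_atom.items() if key != "element")


def is_substructure(query, target):
    """Return True if the query graph can be embedded in the target graph."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    q_adj = {frozenset(b) for b in q_bonds}
    t_adj = {frozenset(b) for b in t_bonds}

    def extend(mapping):
        i = len(mapping)                 # index of the next query atom to place
        if i == len(q_atoms):
            return True                  # every query atom has a partner
        for j, t_atom in enumerate(t_atoms):
            if j in mapping or not atom_matches(q_atoms[i], t_atom):
                continue
            # Each query bond to an already placed atom must exist in the target.
            bonds_ok = all(frozenset((mapping[a], j)) in t_adj
                           for a in range(i) if frozenset((a, i)) in q_adj)
            if bonds_ok and extend(mapping + [j]):
                return True
        return False

    return extend([])


# Heavy-atom skeletons only (hypothetical toy data).
carboxyl_bonds = [(0, 1), (0, 2)]
query_plain = ([{"element": "C"}, {"element": "O"}, {"element": "O"}],
               carboxyl_bonds)
query_charged = ([{"element": "C"}, {"element": "O"},
                  {"element": "O", "charge": -1}], carboxyl_bonds)

acid_bonds = [(0, 1), (1, 2), (1, 3)]
acetic_acid = ([{"element": "C", "charge": 0}, {"element": "C", "charge": 0},
                {"element": "O", "charge": 0}, {"element": "O", "charge": 0}],
               acid_bonds)
acetate = ([{"element": "C", "charge": 0}, {"element": "C", "charge": 0},
            {"element": "O", "charge": 0}, {"element": "O", "charge": -1}],
           acid_bonds)

print(is_substructure(query_plain, acetic_acid))    # charge not specified
print(is_substructure(query_charged, acetic_acid))  # charge required by query
```

In this sketch the uncharged query matches both acetic acid and acetate, while the query that specifies a negative charge matches only acetate.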

Similarity

A Similarity search looks for target structures that resemble the query structure. Similarity is computed from hashed binary chemical fingerprints that are compared using the Tanimoto coefficient. That is, the presence or absence of molecular features is recorded for both the query and the target, and the two records are then compared using a standard formula.

Note: See "Similarity Search, Tanimoto Coefficient and Fingerprint Generation" for a complete description of the concept and method used.

Note: If you choose Similarity as your search type, you will be prompted to provide a Tanimoto coefficient between 50% and 100%.
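As a rough illustration of the comparison described above, the sketch below hashes feature strings into a fixed-size bit set and scores two bit sets with the usual Tanimoto formula for binary fingerprints, shared bits divided by bits set in either. The feature strings, the SHA-256 hashing, the 1024-bit size and the function names are all assumptions made for this example; the fingerprints SureChEMBL actually generates are not described on this page.

```python
import hashlib


def fingerprint(features, n_bits=1024):
    """Hash each feature string to one bit position; return the set of 'on' bits.
    (Hypothetical scheme, for illustration only.)"""
    bits = set()
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        bits.add(int.from_bytes(digest[:4], "big") % n_bits)
    return bits


def tanimoto(a, b):
    """Tanimoto coefficient of two bit sets: |A & B| / |A | B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_hits(query_fp, targets, min_percent=70):
    """Return the names of targets scoring at or above the chosen threshold."""
    if not 50 <= min_percent <= 100:
        raise ValueError("threshold must be between 50 and 100 percent")
    cutoff = min_percent / 100
    return [name for name, fp in targets.items()
            if tanimoto(query_fp, fp) >= cutoff]


# Explicit bit sets keep the arithmetic easy to follow.
query_fp = {1, 2, 3, 4}
targets = {
    "A": {1, 2, 3, 4, 5},   # 4 shared / 5 total = 0.8
    "B": {1, 2, 9, 10},     # 2 shared / 6 total, about 0.33
    "C": {1, 2, 3, 4},      # identical bits = 1.0
}
print(similarity_hits(query_fp, targets, min_percent=70))  # ['A', 'C']
```

Raising the threshold narrows the hit list: at 100% only targets whose fingerprint has exactly the same bits as the query remain.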

Identical

In an Identical structure search, all molecular features must match exactly (e.g. a non-stereo query will only match a non-stereo target).

Connectivity

A Connectivity search retrieves compounds that share the same atomic connectivity but differ in stereochemistry or isotopic form.
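The difference between the Identical and Connectivity search types can be sketched with a toy record that keeps the atom and bond skeleton separate from the stereo and isotope annotations. The `Structure` class, its fields and the atom labels are hypothetical and say nothing about how SureChEMBL stores structures. This sketch also treats a fully identical target as a connectivity hit, which the description above does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    connectivity: str      # hypothetical canonical string for the skeleton
    stereo: tuple = ()     # e.g. (("C2", "S"),); empty means none specified
    isotopes: tuple = ()   # e.g. (("C1", 13),); empty means natural abundance


def identical_match(query, target):
    """All features must be equal, including unspecified stereo and isotopes."""
    return query == target


def connectivity_match(query, target):
    """Only the skeleton is compared; stereo and isotope layers are ignored."""
    return query.connectivity == target.connectivity


alanine = Structure("CC(N)C(=O)O")                            # no stereo given
l_alanine = Structure("CC(N)C(=O)O", stereo=(("C2", "S"),))
d_alanine = Structure("CC(N)C(=O)O", stereo=(("C2", "R"),))
alanine_13c = Structure("CC(N)C(=O)O", isotopes=(("C1", 13),))
glycine = Structure("NCC(=O)O")

for target in (alanine, l_alanine, d_alanine, alanine_13c, glycine):
    print(target, identical_match(alanine, target),
          connectivity_match(alanine, target))
```

With the non-stereo alanine record as the query, only the non-stereo alanine target passes `identical_match`, whereas `connectivity_match` also accepts the L, D and 13C-labelled forms and rejects glycine.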
