REFERENCE.tsv
The REFERENCE file describes the provenance of the deposition, including the title, authors and abstract. It can hold the details of a published reference or of a paper that has been submitted but not yet published, or it can describe a dataset that is not associated with any publication.
The file should contain sufficient details for a user to locate your publication or project by reading the reference. For example, for data associated with a project, the ABSTRACT field could contain a short description of the dataset and a link to the project site.
RIDX, TITLE, YEAR, ABSTRACT, AUTHORS and REF_TYPE are always mandatory. The other fields are optional or conditionally mandatory, as listed in the field table below.
For a deposited dataset, YEAR should be the year we received the file.
It is mandatory to have either a PMID or DOI. If your data do not have one please contact us before submission. We can provide a ChEMBL DOI for datasets until an external DOI has been generated.
We strongly recommend adding the information of up to 3 stable points of contact in the CONTACT field.
The TITLE, ABSTRACT and AUTHORS fields of the REFERENCE file should be populated for datasets as well as for publications. These can be brief (e.g. an organisation name can be provided for the AUTHORS field), and the abstract can be a simple summary of the experiments and their overall purpose. The reference should contain sufficient details for a user to locate your publication or project by reading it; the ABSTRACT field could contain a short description of the dataset and a link to the project site.
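As an illustration of the points above, the following is a minimal Python sketch that writes a REFERENCE.tsv containing a single dataset reference. It is not an official template. The column order (taken from the order of the field table in this guide), the presence of a header row, the function name `build_reference_tsv`, and every value in the example row are assumptions made for illustration only.

```python
import csv
import io

# Assumption: columns follow the order of the field table in this guide.
COLUMNS = [
    "RIDX", "PUBMED_ID", "JOURNAL_NAME", "YEAR", "VOLUME", "ISSUE",
    "FIRST_PAGE", "LAST_PAGE", "REF_TYPE", "TITLE", "DOI", "PATENT_ID",
    "ABSTRACT", "AUTHORS", "CONTACT",
]


def build_reference_tsv(rows):
    """Serialise reference rows (dicts keyed by column name) to TSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        delimiter="\t",
        restval="",           # columns missing from a row are written empty
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical dataset reference: all values are placeholders.
dataset_row = {
    "RIDX": "Example_Org_Kinase_Screen",
    "YEAR": "2024",
    "REF_TYPE": "Dataset",
    "TITLE": "Example kinase inhibition screen",
    "DOI": "10.1234/example-doi",
    "ABSTRACT": (
        "Short description of the dataset. "
        "Project site: https://example.org/project"
    ),
    "AUTHORS": "Example Organisation",
}

tsv_text = build_reference_tsv([dataset_row])
print(tsv_text)
```

Optional fields that do not apply to a dataset (journal name, volume, pages) are simply left empty in this sketch.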
Guidance on the use of the CONTACT field
The CONTACT field should contain a contact profile of someone willing to be contacted about details of the dataset (ideally an ORCID iD or ResearcherID, or potentially a LinkedIn or Academia.edu profile, etc.). Up to 3 contacts can be provided.
This information is not equivalent to a corresponding author, but should be a stable point of contact at the main institution(s) involved. To simplify everyone’s data protection needs, we need it to be a link to a profile, rather than something directly personally contactable like an email address. See https://support.orcid.org/hc/en-us/articles/360006897674-Structure-of-the-ORCID-Identifier
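The guidance above can be sketched as a small check on a single contact entry. This is a hedged illustration, not part of any official tooling: the function names `orcid_checksum_ok` and `check_contact` are hypothetical, and the rule that a contact must start with `http://` or `https://` is an assumption about what counts as "a link to a profile". The check digit calculation follows the ISO 7064 MOD 11-2 scheme described on the ORCID page linked above.

```python
import re

# Assumption: ORCID profile links use the https://orcid.org/ form.
ORCID_URL = re.compile(r"^https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_checksum_ok(orcid_id):
    """Verify the last character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid_id.replace("-", "")
    total = 0
    for ch in chars[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[-1] == expected


def check_contact(value):
    """Return a list of problems found in one CONTACT entry."""
    problems = []
    if not value.startswith(("http://", "https://")):
        if "@" in value:
            problems.append("looks like an email address; use a profile link")
        else:
            problems.append("not a link to a profile")
    if "orcid.org" in value:
        match = ORCID_URL.match(value)
        if match is None:
            problems.append("ORCID link is not of the form "
                            "https://orcid.org/0000-0000-0000-0000")
        elif not orcid_checksum_ok(match.group(1)):
            problems.append("ORCID check digit does not match")
    return problems


# 0000-0002-1825-0097 is the iD of ORCID's fictitious test researcher.
print(check_contact("https://orcid.org/0000-0002-1825-0097"))
print(check_contact("someone@example.org"))
```

An empty list means no problem was detected; it does not confirm that the profile exists or belongs to the right person.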
| Header | Description | Existence | Data Type |
| --- | --- | --- | --- |
| RIDX | The RIDX set by the depositor - a primary key | Mandatory | Any character up to a length of 200. Should not start with 0. Should be a meaningful unique identifier for each Reference, not just a number. |
| PUBMED_ID | PubMed ID | Mandatory (if the document has a PMID) | Any positive integer up to a length of 11. |
| JOURNAL_NAME | Journal name | Optional | Any character up to a length of 50. Use the standard NIH NLM Catalog abbreviated name of the journal. |
| YEAR | Year of publication | Mandatory | Any integer up to a length of 4 between 1900 and 2050 |
| VOLUME | The volume of the publication | Optional | Any character up to a length of 50 |
| ISSUE | The issue of the publication | Optional | Any character up to a length of 50 |
| FIRST_PAGE | The first page of the article | Mandatory if it is a Publication | Any positive integer up to a length of 50 |
| LAST_PAGE | The last page of the article | Optional | Any positive integer up to a length of 50 |
| REF_TYPE | The type of reference (Publication, Patent, Dataset, Book) | Mandatory | One of Patent, Publication, Dataset or Book |
| TITLE | The title of the reference | Mandatory | Any character up to a length of 500 |
| DOI | The Digital Object Identifier | Mandatory - Must be present, may be empty if there is a PUBMED_ID | Any DOI up to a length of 200 |
| PATENT_ID | The Patent Identifier | Optional | Any Patent Identifier up to a length of 200 |
| ABSTRACT | The abstract of the article. For a dataset, include a description of the dataset here | Mandatory | A very large text field |
| AUTHORS | A list of the authors of the publication | Mandatory | Any character up to a length of 4000 |
| CONTACT | A contact profile of someone willing to be contacted about details of the dataset (see the guidance on the use of the CONTACT field) | Recommended | Any character up to a length of 200 |
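The constraints in the field table can be checked mechanically before submission. The Python sketch below is one possible reading of them, not an official validator. The function name `validate_reference_row`, the inclusive reading of the 1900 to 2050 year range, and the treatment of whitespace-only values as empty are assumptions.

```python
import re

REF_TYPES = {"Patent", "Publication", "Dataset", "Book"}

# Maximum lengths from the field table (ABSTRACT has no stated limit).
MAX_LENGTH = {
    "RIDX": 200, "PUBMED_ID": 11, "JOURNAL_NAME": 50, "YEAR": 4,
    "VOLUME": 50, "ISSUE": 50, "FIRST_PAGE": 50, "LAST_PAGE": 50,
    "TITLE": 500, "DOI": 200, "PATENT_ID": 200, "AUTHORS": 4000,
    "CONTACT": 200,
}
ALWAYS_MANDATORY = ("RIDX", "YEAR", "REF_TYPE", "TITLE", "ABSTRACT", "AUTHORS")
POSITIVE_INTEGER_FIELDS = ("PUBMED_ID", "FIRST_PAGE", "LAST_PAGE")


def validate_reference_row(row):
    """Return a list of problems for one REFERENCE row (dict of strings)."""
    def get(name):
        return (row.get(name) or "").strip()

    problems = []
    for name in ALWAYS_MANDATORY:
        if not get(name):
            problems.append(f"{name} is mandatory")
    for name, limit in MAX_LENGTH.items():
        if len(get(name)) > limit:
            problems.append(f"{name} is longer than {limit} characters")
    if get("RIDX").startswith("0"):
        problems.append("RIDX should not start with 0")
    for name in POSITIVE_INTEGER_FIELDS:
        value = get(name)
        if value and not (re.fullmatch(r"[0-9]+", value) and int(value) > 0):
            problems.append(f"{name} must be a positive integer")
    year = get("YEAR")
    if year and not (re.fullmatch(r"[0-9]+", year)
                     and 1900 <= int(year) <= 2050):
        problems.append("YEAR must be an integer between 1900 and 2050")
    ref_type = get("REF_TYPE")
    if ref_type and ref_type not in REF_TYPES:
        problems.append("REF_TYPE must be Patent, Publication, Dataset or Book")
    if ref_type == "Publication" and not get("FIRST_PAGE"):
        problems.append("FIRST_PAGE is mandatory for a Publication")
    if not get("PUBMED_ID") and not get("DOI"):
        problems.append("either PUBMED_ID or DOI must be provided")
    return problems


# Hypothetical row: all values are placeholders.
example_row = {
    "RIDX": "Example_Org_Kinase_Screen",
    "YEAR": "2024",
    "REF_TYPE": "Dataset",
    "TITLE": "Example kinase inhibition screen",
    "DOI": "10.1234/example-doi",
    "ABSTRACT": "Short description of the dataset.",
    "AUTHORS": "Example Organisation",
}
print(validate_reference_row(example_row))
```

The check does not enforce that RIDX values are unique across rows or that they are meaningful; those points still need to be reviewed by the depositor.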