# Linking supplementary data through SAMID and REGID

Supporting data are stored in the ACTIVITY\_SUPP table and require the use of two additional depositor-defined identifiers: the supplementary activity mapping ID (SAMID) and the record grouping ID (REGID). The table is used for the bulk storage of raw data relating to ACTIVITY values. This includes both the data used directly to calculate the ACTIVITY value and miscellaneous data from the same experiment that were not used in the calculation but may be of interest to the specialist user.


| \*IDX | Description | Primary File |
| --- | --- | --- |
| SAMID | Supplementary Activity Mapping ID | ACTIVITY\_SUPP |
| REGID | REcord Grouping ID | ACTIVITY\_SUPP |

### SAMID identifiers in the ACTIVITY\_SUPP\_MAP file

The ACTIVITY\_SUPP\_MAP file acts as an intermediate file that connects data in the ACTIVITY file, represented by ACT\_IDs, to data in the ACTIVITY\_SUPP file, represented by SAMIDs. The ACTIVITY\_SUPP\_MAP file should therefore contain only two columns, ACT\_ID and SAMID, which map the records in the ACTIVITY\_SUPP table to the records in the ACTIVITY table for which they are supporting evidence.

SAMIDs are mandatory in the ACTIVITY\_SUPP\_MAP file, and every SAMID used in a deposition must be present at least once in the ACTIVITY\_SUPP file. A SAMID can be left as null in the ACTIVITY\_SUPP file if the supplementary record does not point directly to any particular ACTIVITY value.

### REGID identifiers

A REGID is used to cluster or group together related records in the ACTIVITY\_SUPP file, just as the TEOID is used in the ACTIVITY table. A SAMID usually refers to a single specific measurement, such as a time point, rather than to a group, animal or well.

A REGID must be a positive integer such as 1, 2, 3, etc. The REGID only has meaning within the deposition, so values can be re-used between depositions without the risk of overwriting previous depositions, a risk that does exist with some other depositor-defined identifiers. When preparing data for loading, it can be useful to think of REGIDs as 'row numbers' in a 2D table of data.
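
The SAMID and REGID rules above lend themselves to a quick check before loading. The following is a minimal sketch, not an official ChEMBL tool: it assumes the two files have already been read into lists of dictionaries, that every ACTIVITY\_SUPP row carries a REGID, and the function names `is_positive_int` and `check_supp_links` are hypothetical.

```python
def is_positive_int(value):
    """Return True if value is, or parses as, an integer greater than zero."""
    if isinstance(value, bool):  # bool is a subclass of int; reject it explicitly
        return False
    try:
        return int(str(value)) > 0
    except ValueError:
        return False


def check_supp_links(map_rows, supp_rows):
    """Collect problems in ACTIVITY_SUPP_MAP / ACTIVITY_SUPP rows.

    map_rows:  dicts expected to hold exactly ACT_ID and SAMID.
    supp_rows: dicts holding at least SAMID (may be null) and REGID.
    """
    problems = []
    # SAMIDs that actually occur in ACTIVITY_SUPP (nulls are allowed there).
    supp_samids = {r.get("SAMID") for r in supp_rows} - {None, ""}

    for i, row in enumerate(map_rows, start=1):
        extra = sorted(set(row) - {"ACT_ID", "SAMID"})
        if extra:
            problems.append(f"ACTIVITY_SUPP_MAP row {i}: unexpected columns {extra}")
        samid = row.get("SAMID")
        if samid in (None, ""):
            problems.append(f"ACTIVITY_SUPP_MAP row {i}: SAMID is mandatory")
        elif samid not in supp_samids:
            problems.append(
                f"ACTIVITY_SUPP_MAP row {i}: SAMID {samid!r} is not present in ACTIVITY_SUPP"
            )

    for i, row in enumerate(supp_rows, start=1):
        if not is_positive_int(row.get("REGID")):
            problems.append(f"ACTIVITY_SUPP row {i}: REGID must be a positive integer")
    return problems


# Illustrative identifiers only; they are not taken from a real deposition.
map_rows = [{"ACT_ID": "ACT1", "SAMID": "SAM1"}, {"ACT_ID": "ACT2", "SAMID": "SAM9"}]
supp_rows = [
    {"SAMID": "SAM1", "REGID": 1},
    {"SAMID": None, "REGID": 1},  # allowed: not tied to a particular ACTIVITY value
]
for problem in check_supp_links(map_rows, supp_rows):
    print(problem)
```

With the illustrative rows above, the only problem reported is the mapping to `SAM9`, which never appears in the ACTIVITY\_SUPP rows.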


|  |  |
| --- | --- |
| A SAMID (Single Measurement ID) uniquely identifies a single data point in ACTIVITY\_SUPP | One individual measurement at one compound, dose, time point and in one entity (animal, plate well etc). So biological replicates (animals) have differing SAMIDs. |
| A REGID (Regimen ID) groups different measurements for the same entity and conditions | E.g. All measurements for hematology, clinical chemistry, organ weight AND pathology observed for one animal at the same compound, dose & time point. |

### Complex result sets as a 2D matrix

For many scientists one of the most convenient ways of browsing the values in a highly complex result set is in the form of a 2D matrix, such as a spreadsheet. Indeed, this is the most common format for complex result sets presented to ChEMBL administrators for loading into ChEMBL. The ACTIVITY\_SUPP table can be thought of as a transformation of a 2D matrix where REGIDs represent rows, the TYPEs are the column headers, and the VALUEs are the cells.

In fact, one of the best ways of creating the ACTIVITY\_SUPP file can be to start with a 2D matrix of all the data to be loaded, and transform these data into the ACTIVITY\_SUPP file with a script. By doing this, reconstructing the original data set is straightforward when such data are subsequently exported from ChEMBL.
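
The matrix-to-ACTIVITY\_SUPP transformation described in this section could be scripted along the following lines. This is a minimal sketch under stated assumptions: only the REGID, TYPE and VALUE columns named in the text are produced (a real ACTIVITY\_SUPP file has further columns, such as SAMID, that are not shown), REGIDs are assigned as 1-based row numbers, empty cells are skipped, and the function names and example values are hypothetical.

```python
def matrix_to_activity_supp(matrix_rows, type_columns):
    """Unpivot a 2D matrix into one record per non-empty cell.

    matrix_rows:  list of dicts, one per spreadsheet row.
    type_columns: column headers that should become TYPE values.
    """
    long_rows = []
    for regid, matrix_row in enumerate(matrix_rows, start=1):  # REGID = row number
        for type_name in type_columns:
            value = matrix_row.get(type_name)
            if value in (None, ""):
                continue  # empty cell: nothing to record
            long_rows.append({"REGID": regid, "TYPE": type_name, "VALUE": value})
    return long_rows


def activity_supp_to_matrix(long_rows):
    """Rebuild the matrix as {REGID: {TYPE: VALUE}} from the long-format records."""
    matrix = {}
    for row in long_rows:
        matrix.setdefault(row["REGID"], {})[row["TYPE"]] = row["VALUE"]
    return matrix


# Invented example values, for illustration only.
spreadsheet = [
    {"animal": "A1", "ALT": "35", "AST": "40"},
    {"animal": "A2", "ALT": "38", "AST": ""},
]
supp_records = matrix_to_activity_supp(spreadsheet, ["ALT", "AST"])
for record in supp_records:
    print(record["REGID"], record["TYPE"], record["VALUE"])
```

Because each record keeps its REGID (row) and TYPE (column header), `activity_supp_to_matrix` can regroup exported records into the original layout, which is the round trip the text refers to.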