The majority of models within the Passports have undergone extensive characterisation, including sequencing, copy number, methylation, gene expression and drug screening. These datasets enable researchers to identify and understand the underlying molecular causes of cancer. Systematic comparisons have demonstrated that cancer cell lines and organoids effectively represent clinical tumour samples.
These large-scale genomic and functional datasets are available through the website as processed downloads and via the API. In addition, links to the raw data enable users to analyse each dataset independently.
Table of data sets and key information.
Models | Dataset | Type | Details | Data/Link | Publication |
---|---|---|---|---|---|
Cell Lines | Whole Exome Sequencing | BAM | Illumina HiSeq 2000 | EGAS00001000978 | 1 |
Cell Lines | Copy Number Variation | | Affymetrix SNP6 | EGAS00001000978 | 1 |
Cell Lines | Expression | CEL | Affymetrix Human Genome U219 Array | E-MTAB-3610 | 1 |
Cell Lines | RNASeq | BAM | Illumina HiSeq 2000 | EGAS00001000828 | 2 |
Cell Lines | Methylation | TAR (of IDAT) | Illumina Human Methylation 450 BeadChip | GSE68379 | 1 |
Organoids | Targeted Sequencing | CRAM | Illumina HiSeq 4000 | EGAS00001002221 | |
Organoids | Whole Genome Sequencing | CRAM | Illumina HiSeq 4000 | EGAS00001002222 | |
Publications:
All genes have an internal ID, allowing mapping to current and previous HGNC gene symbols, Ensembl Gene IDs (v91) and other external gene identifiers. All genes with an HGNC-approved symbol as of April 2018 are currently included in the Passports, including those without a protein product. Any dataset values that were mapped to genes without an official gene symbol have been discarded from processing, but continue to be available in raw data downloads.
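The mapping described above can be illustrated with a minimal Python sketch. It is not the Passports' actual pipeline: the record layout (`gene_id`, `symbol`, `previous_symbols`), the `G1`-style identifiers and both function names are hypothetical, and the sketch assumes that a current symbol takes precedence when a previous symbol collides with it.

```python
def build_symbol_index(genes):
    """Map current and previous gene symbols to an internal gene ID.

    `genes` is a list of dicts with the hypothetical keys
    'gene_id', 'symbol' and 'previous_symbols'.
    """
    index = {}
    # Register previous symbols first...
    for gene in genes:
        for old_symbol in gene["previous_symbols"]:
            index.setdefault(old_symbol, gene["gene_id"])
    # ...so that current symbols overwrite them on a collision (assumption).
    for gene in genes:
        index[gene["symbol"]] = gene["gene_id"]
    return index


def map_values(values, index):
    """Split dataset values into mapped (by gene ID) and discarded symbols."""
    mapped, discarded = {}, []
    for symbol, value in values.items():
        gene_id = index.get(symbol)
        if gene_id is None:
            # No official symbol: dropped from processing, kept only in raw data.
            discarded.append(symbol)
        else:
            mapped[gene_id] = value
    return mapped, discarded


# Illustrative records only; the IDs are made up for this sketch.
genes = [
    {"gene_id": "G1", "symbol": "KMT2A", "previous_symbols": ["MLL"]},
    {"gene_id": "G2", "symbol": "TP53", "previous_symbols": []},
]
index = build_symbol_index(genes)
mapped, discarded = map_values({"MLL": 1.5, "TP53": 2.0, "LOC123": 0.1}, index)
```

Here a value reported under the previous symbol `MLL` is attributed to the same internal ID as `KMT2A`, while the unrecognised `LOC123` entry is set aside rather than mapped.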
To annotate cancer drivers, the list of cancer driver variants from the above Cell paper was used (Table S2C), excluding any fusion genes.
Only mutations listed in this table can receive cancer-related annotations. Mutations not found in this table are considered technical artefacts or non-oncogenic (passenger) mutations. From this table, genes that pass the 'Recurrence Filter' and are present in one of the annotated driver gene lists are marked as cancer mutations.
Among the cancer genes (those that pass the recurrence filter), all genes with 'Truncating' mutations are annotated as tumour suppressor genes, while genes without such mutations are designated as oncogenes. This results in the Driver Gene List, which can be found on the Datasets & Downloads page.
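The annotation rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used to build the Driver Gene List: the field names (`gene`, `effect`, `passes_recurrence`) and the function name are hypothetical, and the sketch assumes that only variants passing the recurrence filter count when checking a gene for truncating mutations.

```python
def classify_driver_genes(variants, driver_genes):
    """Classify cancer genes as tumour suppressors or oncogenes.

    `variants` is an iterable of dicts with the hypothetical keys
    'gene', 'effect' and 'passes_recurrence'; `driver_genes` is the set
    of symbols found in any of the annotated driver gene lists.
    """
    has_truncating = {}
    for variant in variants:
        gene = variant["gene"]
        # Keep only variants that pass the recurrence filter in a listed driver gene.
        if not variant["passes_recurrence"] or gene not in driver_genes:
            continue
        is_truncating = variant["effect"] == "Truncating"
        has_truncating[gene] = has_truncating.get(gene, False) or is_truncating
    # Any truncating mutation marks the gene as a tumour suppressor.
    return {
        gene: "tumour suppressor" if truncating else "oncogene"
        for gene, truncating in has_truncating.items()
    }


# Illustrative input only; not taken from Table S2C.
example_variants = [
    {"gene": "TP53", "effect": "Truncating", "passes_recurrence": True},
    {"gene": "TP53", "effect": "Missense", "passes_recurrence": True},
    {"gene": "KRAS", "effect": "Missense", "passes_recurrence": True},
    {"gene": "TTN", "effect": "Truncating", "passes_recurrence": True},
    {"gene": "BRAF", "effect": "Missense", "passes_recurrence": False},
]
example_roles = classify_driver_genes(example_variants, {"TP53", "KRAS", "BRAF"})
```

In this toy input, `TTN` is ignored because it is not in the driver gene set, and `BRAF` is ignored because its only variant fails the recurrence filter.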
Fusions are annotated from RNASeq data as detailed in Picco et al., 2019. Specifically, the fusion events, validation information and patient annotation are obtained from Supplementary Table 2 of that paper. The COSMIC fusion list is obtained from the COSMIC Fusions page and matched by gene symbol to the Cell Model Passports.
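Matching by gene symbol can be pictured with the short Python sketch below. The exact matching rules are not described here, so the sketch makes two labelled assumptions: symbols are compared case-insensitively, and the order of the two partner genes matters. The function name and the tuple layout are hypothetical.

```python
def flag_cosmic_fusions(passport_fusions, cosmic_fusions):
    """Flag each Passports fusion that also appears in a COSMIC fusion list.

    Both arguments are iterables of (first_partner, second_partner)
    gene symbol pairs. Returns a list of (first, second, in_cosmic) tuples.
    """
    # Normalise case so that the comparison is by symbol only (assumption).
    cosmic_pairs = {(a.upper(), b.upper()) for a, b in cosmic_fusions}
    return [
        (a, b, (a.upper(), b.upper()) in cosmic_pairs)
        for a, b in passport_fusions
    ]


# Illustrative pairs only.
flags = flag_cosmic_fusions(
    [("BCR", "ABL1"), ("GENE_A", "GENE_B")],
    [("BCR", "ABL1"), ("EML4", "ALK")],
)
```

If partner order should not matter, each pair could be stored as a `frozenset` instead of a tuple.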
For published datasets, we recommend that users refer to the original publication for details of the model authentication within that dataset. During dataset integration, names and identifiers have been cross-referenced to ensure that the data is attributed to the correct model.
Newly generated organoid sequencing data available through the Passports has been authenticated against primary tumour samples obtained from clinical sites, using a panel of 95 SNPs assayed on the Fluidigm 96.96 Dynamic Array IFC.