import React, {Fragment} from 'react';
import { Link } from "react-router-dom";
import Col from "reactstrap/es/Col";

export default function(){
    return(
        <Fragment>
            <h1>Datasets Overview</h1>
            <p>
                The majority of models in the Cell Model Passports have undergone extensive characterisation,
                including sequencing, copy number, methylation, gene expression and drug screening.
                These datasets enable researchers to identify and understand the underlying molecular causes of
                cancer. Systematic comparisons have demonstrated that cancer cell lines and organoids effectively
                represent clinical tumour samples.
            </p>

            <p>
                These large-scale genomic and functional datasets are available through the website as
                processed downloads and via the API. In addition, links to the raw data enable users to independently
                analyse each dataset.
            </p>
            <p className="mt-4">
                Table of datasets and key information.
            </p>
            <table className="table">
                <tbody>
                <tr>
                    <th>Models</th>
                    <th>Dataset</th>
                    <th>Type</th>
                    <th>Details</th>
                    <th>Data/Link</th>
                    <th>Publication</th>
                </tr>
                <tr>
                    <td>Cell Lines</td>
                    <td>Whole Exome Sequencing</td>
                    <td>BAM</td>
                    <td>Illumina HiSeq 2000</td>
                    <td><a href="https://www.ebi.ac.uk/ega/studies/EGAS00001000978">EGAS00001000978</a></td>
                    <td>1</td>
                </tr>
                <tr>
                    <td>Cell Lines</td>
                    <td>Copy Number Variation</td>
                    <td></td>
                    <td>Affymetrix SNP6</td>
                    <td><a href="https://www.ebi.ac.uk/ega/studies/EGAS00001000978">EGAS00001000978</a></td>
                    <td>1</td>
                </tr>
                <tr>
                    <td>Cell Lines</td>
                    <td>Expression</td>
                    <td>CEL</td>
                    <td>Affymetrix Human Genome U219 Array</td>
                    <td><a href="https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3610/">E-MTAB-3610</a></td>
                    <td>1</td>
                </tr>
                <tr>
                    <td>Cell Lines</td>
                    <td>RNASeq</td>
                    <td>BAM</td>
                    <td>Illumina HiSeq 2000</td>
                    <td><a href="https://www.ebi.ac.uk/ega/studies/EGAS00001000828">EGAS00001000828</a></td>
                    <td>2</td>
                </tr>
                <tr>
                    <td>Cell Lines</td>
                    <td>Methylation</td>
                    <td>TAR (of IDAT)</td>
                    <td>Illumina Human Methylation 450 BeadChip</td>
                    <td><a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE68379">GSE68379</a></td>
                    <td>1</td>
                </tr>
                <tr>
                    <td>Organoids</td>
                    <td>Targeted Sequencing</td>
                    <td>CRAM</td>
                    <td>Illumina HiSeq 4000</td>
                    <td><a href="https://www.ebi.ac.uk/ega/studies/EGAS00001002221">EGAS00001002221</a></td>
                    <td></td>
                </tr>
                <tr>
                    <td>Organoids</td>
                    <td>Whole Genome Sequencing</td>
                    <td>CRAM</td>
                    <td>Illumina HiSeq 4000</td>
                    <td><a href="https://www.ebi.ac.uk/ega/studies/EGAS00001002222">EGAS00001002222</a></td>
                    <td></td>
                </tr>
                </tbody>
            </table>

            <p>Publications:</p>
            <ol>
                <li><a href="https://doi.org/10.1016/j.cell.2016.06.017">Iorio et al. A Landscape of Pharmacogenomic Interactions in Cancer. Cell, 2016.</a></li>
                <li><a href="https://doi.org/10.1158/0008-5472.CAN-17-1679">Garcia-Alonso et al. Transcription factor activities enhance markers of drug sensitivity in cancer. Cancer Res, 2017.</a></li>
            </ol>

            <br />
            <h5>Gene Annotation & Mapping</h5>
            <p>
                Each gene has an internal ID that maps to current and previous HGNC gene symbols, Ensembl Gene
                IDs (v91) and other external gene identifiers. All genes with an HGNC-approved symbol as of April 2018
                are currently included in the Passports, including those without a protein product. Any dataset values
                that were mapped to genes without an official gene symbol have been excluded from processing,
                but remain available in the raw data downloads.
            </p>
            <br />
            <h5 id="driver_list" >Cancer Driver List</h5>
            <p>Cancer drivers are annotated using the list of cancer driver variants from Iorio et al., 2016 (
                <a href="https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/suppData/TableS2C.xlsx">Table S2C</a>),
                excluding any fusion genes.
            </p>
            <Col xl={{size: 11, offset: 1}} className="pt-3">
                <h6>Variants</h6>
                <p>
                    Only mutations listed in this table can receive cancer-related annotations. Mutations not found in this table are
                    considered technical artefacts or non-oncogenic (passenger) mutations. Within this table,
                    mutations in genes that pass the 'Recurrence Filter' and are present in one of the annotated driver gene lists are marked as cancer mutations.
                </p>
                <br />
                <h6>Oncogene / Tumour Suppressor Gene annotation</h6>
                <p>
                    Among the cancer genes (those that pass the recurrence filter), genes with
                    'Truncating' mutations are annotated as tumour suppressor genes, while genes without such mutations are
                    designated oncogenes. This results in the <Link to={"/downloads"}>Driver Gene List</Link>, which can be found
                    on the Datasets & Downloads page.
                </p>
                <br />
            </Col>

            <h5>Gene Fusions</h5>
            <p>
                Fusions are annotated from RNASeq data as detailed in <a href="https://www.nature.com/articles/s41467-019-09940-1" target="_blank" rel="noreferrer noopener">Picco <em>et al.</em>, 2019</a>.
                Specifically, the fusion events, validation information and patient annotation are obtained from Supplementary Table 2 of this paper.
                The COSMIC fusion list is obtained from the <a href="https://cancer.sanger.ac.uk/cosmic/fusion" target="_blank" rel="noreferrer noopener">COSMIC Fusions page</a> and matched by gene symbol to
                the Cell Model Passports.
            </p>
            <br />

            <h5>Model Authentication for Datasets</h5>
            <p>For published datasets, we recommend that users refer to the original publication for details of the model
                authentication within that dataset. During dataset integration, names and identifiers have been
                cross-referenced to ensure that the data is attributed to the correct model.
            </p>
            <p>Newly generated organoid sequencing data available through the Passports has been authenticated against
                primary tumour samples obtained from clinical sites, using a panel of 95 SNPs assayed on the Fluidigm
                96.96 Dynamic Array IFC.
            </p>
            <br />
        </Fragment>
    )
}