Updates at the Kids First Data Resource Portal – January, 2019
Since the launch of the Kids First Data Resource Portal on September 10, 2018, the NIH Common Fund's Gabriella Miller Kids First Data Resource Center (Kids First DRC) has continued breaking ground and forging new avenues for patient families, researchers, and clinicians to share important information and resources about childhood diseases, including childhood cancers and structural birth defects.
In the months following its launch, the Data Resource Portal has amassed the largest collection of pediatric genomic data in the world – combining whole genome sequences (WGS), RNA-seq, clinical, imaging, and histology data. This data is drawn from over 30,000 files (or 809.1 TB) representing over 9,000 tissue samples, gathered from 5,902 patients representing 1,694 family research participants.
The data in the Data Resource Portal is a collection of datasets from various investigators who are performing disease-specific research. By bringing these datasets together in one location, the Portal allows additional investigators to devise new studies and research based on data that is already curated and available. Information on the studies and datasets in the Data Resource Portal can be found on the Kids First DRC's Study Updates page and the NIH Common Fund's Gabriella Miller Kids First X01 Study page.
In September 2018, the following principal investigators were selected to have samples from their structural birth defect and/or childhood cancer cohorts sequenced at one of two NIH-supported sequencing centers: the Broad Institute or the HudsonAlpha Institute for Biotechnology, in collaboration with St. Jude Children's Research Hospital. Once sequenced, the data generated from their disease cohort samples will be made available on the Kids First Data Resource Portal.
2018 X01 Projects* for the Gabriella Miller Kids First Program
Contact PI Name | Institution Name | Title | Anticipated Number of Samples |
---|---|---|---|
Christina Chambers | University of California, San Diego | Discovery of Genetic Basis of Fetal Alcohol Spectrum Disorders | 236 |
Wendy Chung | Columbia University Health Sciences | Genomic Analysis of Esophageal Atresia and Tracheoesophageal Fistulas and Associated Congenital Anomalies | 425 |
Beth Drolet | Medical College of Wisconsin | Analyzing the Genetic Spectrum of Vascular Anomalies, Overgrowth and Structural Birth Defects | 300 |
Ali Gharavi | Columbia University Health Sciences | Genetics of Structural Defects of the Kidney and Urinary Tract | 141 |
Angie Jelin | Johns Hopkins University | Genetics of Structural Defects of the Kidney and Urinary Tract | 425 |
Ian Krantz | Children's Hospital of Philadelphia | Genomic Diagnostics in Cornelia de Lange Syndrome, Related Diagnoses and Structural Birth Defects | 400 |
Ching Lau | The Jackson Laboratory | Genetic Predisposition to Intracranial Germ Cell Tumors | 800 |
Philip Lupo | Baylor College of Medicine | Genomic Analysis of Congenital Heart Defects and Acute Lymphoblastic Leukemia in Children with Down Syndrome | 408 |
Soheil Meshinchi | Fred Hutchinson Cancer Research Center | Germline and Somatic Variants in Myeloid Malignancies in Children | 1,440 |
Christine Seidman | Harvard Medical School | Germline Mutations in CHD | 300 |
Jonathan Seidman | Harvard Medical School | The Genetics of Microtia in Hispanic Populations | 400 |
*All projects are pending sequencing completion
Kids First data analytics and engineering specialists continuously refine, monitor, and perform quality control across all seven currently available datasets:
- Pediatric Brain Tumor Atlas (PBTA)
- Orofacial Cleft: European Ancestry
- Ewing Sarcoma: Genetic Risk
- Syndromic Cranial Dysinnervation
- Congenital Heart Defects
- Congenital Diaphragmatic Hernia
- Adolescent Idiopathic Scoliosis
Our most recently released dataset, on Adolescent Idiopathic Scoliosis, was added to the Portal on October 12, 2018. The data comprises 1,984 files relating to 299 study participants and 73 families, and includes aligned reads and gVCF files. Tissue samples for this dataset were sequenced at HudsonAlpha, with data harmonization conducted by the Kids First DRC.
While continuous improvements are made to the data available to Portal users, our data engineering, analytics, and bioinformatics experts have also made significant improvements to the Data Resource Portal User Interface and Tools.
On November 29, 2018, a new Portal Dashboard was launched, featuring additional information fields and a new stylized design. Improvements over the previous design give users an at-a-glance overview of their current projects on Cavatica, as well as approved access for controlled study files on Gen3. Users are also now able to save queries, view research interests of other users, and access easy-to-review breakdowns of all Kids First DRC datasets and most frequently occurring diagnoses.
Additional functional improvements include mapping text-based values within the file repository to Human Phenotype Ontology (HPO) identifying numbers, allowing for easier identification within the Portal. We've also refined our data harmonization pipelines, allowing for improved downstream analysis. On December 20, 2018, our Data Harmonization Pipeline was officially launched as a public application on Cavatica, ensuring that researchers and clinicians anywhere can have access to this valuable tool.
To learn more about the Kids First Data Resource Portal, and to register as a new user, visit us at https://kidsfirstdrc.org/portal/portal-features/