61 Biopolis Dr, Singapore 138673

NGS Data Integration

Multi-Omics
Case Study
Advances in next-generation sequencing and computational technologies have enabled large-scale profiling of biological systems across different dimensions, such as the genome, transcriptome, chromatin-protein interactions and chromatin conformation. How can we leverage these large data sets to gain greater insight into a biological system?

Together with statistical analysis, different systematic approaches can be applied to integrate omics data and translate them into meaningful insights. For example, in the data integration figure shown below, differentially expressed genes (DEGs) between tumor and adjacent normal tissue were identified by transcriptome sequencing (A). Motif enrichment analysis is then carried out on the promoter sequences of the down-regulated genes (B), which can identify potential regulators of gene expression such as the transcription factor HNF4a. To further understand the regulatory mechanism underlying tumor progression, we can compare the chromatin accessibility profiles of tumor and adjacent normal tissues using ATAC-Seq (C). Again, motif enrichment analysis of the differentially accessible regions finds the HNF4a motif to be enriched. Subsequently, using ChIP-Seq data, we can identify the binding regions of HNF4a across the genome and study their activation state through histone modification profiling (D). Downstream analysis can then be performed on the candidate regulatory regions to decipher the HNF4a regulatory network in tumorigenesis. Furthermore, other downstream integrative analyses can be carried out to highlight the role of a regulatory pathway or mechanism (E). These include functional enrichment analysis, motif enrichment analysis, subtyping analysis and integrated data visualisation.
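As a rough illustration of the integration step, the sketch below intersects the promoters of differentially expressed genes with differentially accessible ATAC-Seq regions and ChIP-Seq peaks, keeping genes supported by both. The coordinates, gene names and the helper `genes_with_overlap` are hypothetical and chosen only for illustration; a production workflow would typically rely on dedicated interval tools rather than this simple scan.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def genes_with_overlap(promoters, peaks):
    """Return the set of genes whose promoter overlaps at least one peak.

    promoters: list of (chrom, start, end, gene)
    peaks:     list of (chrom, start, end)
    """
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end))

    hits = set()
    for chrom, start, end, gene in promoters:
        if any(overlaps(start, end, ps, pe) for ps, pe in peaks_by_chrom[chrom]):
            hits.add(gene)
    return hits


# Hypothetical toy data (not taken from the case study).
deg_promoters = [
    ("chr1", 1000, 2000, "GENE_A"),
    ("chr1", 5000, 6000, "GENE_B"),
    ("chr2", 300, 1300, "GENE_C"),
]
atac_diff_regions = [("chr1", 1500, 1700), ("chr2", 1200, 1400), ("chr1", 8000, 8200)]
chip_peaks = [("chr1", 1900, 2100), ("chr1", 5500, 5600)]

atac_genes = genes_with_overlap(deg_promoters, atac_diff_regions)
chip_genes = genes_with_overlap(deg_promoters, chip_peaks)

# Candidate genes: promoter is differentially accessible AND bound by the factor.
candidates = sorted(atac_genes & chip_genes)
print(candidates)
```

Requiring support from both assays is one possible design choice; taking the union instead would give a more permissive candidate list.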
Motif Enrichment Analysis
A DNA sequence motif is a short sequence pattern that carries important functional properties in the genome; one of the best known is sequence-specific protein binding in the cis-regulation of gene expression. By analysing regulatory sequences, such as the promoter sequences upstream of a gene set, we can profile the footprint of regulatory proteins, such as transcription factors, that are involved in regulating expression. This analysis can be performed with data from different assays, such as:

• Sequences containing mutations or SNPs from resequencing data
• Differentially expressed genes identified in RNA-Seq or expression arrays
• Chromatin regions with differential accessibility signal identified in ATAC-Seq
• Chromatin regions with histone modification signal from ChIP-Seq
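One simple way to test for motif enrichment, sketched here under stated assumptions, is to count how many target sequences contain the motif compared with a background set and apply a one-sided hypergeometric test. The pattern `CAAAG`, the sequences and the function names are hypothetical placeholders, not the actual HNF4a motif or the method used in the case study; dedicated motif tools use position weight matrices and more careful background models.

```python
import re
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_motif(seq, pattern):
    """True if the motif occurs on either strand of seq."""
    return bool(pattern.search(seq) or pattern.search(revcomp(seq)))


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are successes."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def motif_enrichment(target_seqs, background_seqs, motif_regex):
    """Return (hits in target, total hits, p-value) for one motif."""
    pattern = re.compile(motif_regex)
    k = sum(has_motif(s, pattern) for s in target_seqs)
    K = k + sum(has_motif(s, pattern) for s in background_seqs)
    N = len(target_seqs) + len(background_seqs)
    return k, K, hypergeom_sf(k, N, K, len(target_seqs))


# Hypothetical toy sequences (e.g. promoters of down-regulated genes vs. others).
target = ["TTCAAAGTCCAT", "GGGCAAAGGA", "ACGTACGTAC", "TTCTTTGAA"]
background = ["ACGTTGCAAC", "GGGGCCCC", "ATATATAT", "CAAAGTTT", "TTGACCTT", "AGAGAGAG"]

k, K, p_value = motif_enrichment(target, background, "CAAAG")
print(f"target hits={k}, total hits={K}, p={p_value:.4f}")
```

With real data, the same test would be repeated for every motif in a library and the p-values corrected for multiple testing.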
Functional Enrichment Analysis
To understand the biological mechanisms within an organism, we must first know the functions of the components that make up a genome, such as genes and proteins. As biological knowledgebases have evolved, a vast number of genes have been systematically annotated and classified according to their associated functions. By annotating a target gene set with functional terms such as Gene Ontology terms and pathways, together with statistical analysis, we can profile the gene network and identify the functions or pathways that are potentially regulated or disrupted in a given disease. This analysis can be done on various kinds of datasets, such as:

• Differentially expressed genes identified in RNA-Seq or expression arrays
• Chromatin regions with differential accessibility signal identified in ATAC-Seq
• Protein-bound chromatin regions identified in ChIP-Seq
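A common form of functional enrichment is over-representation analysis: for each functional term, test whether the target gene set contains more annotated genes than expected by chance, then correct for testing many terms. The sketch below uses a hypergeometric test with Benjamini-Hochberg adjustment; the gene identifiers and term names are hypothetical, and real analyses would draw annotations from a knowledgebase such as Gene Ontology or a pathway database.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are successes."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def over_representation(target_genes, gene_sets, universe):
    """Test each term in gene_sets for over-representation in target_genes."""
    universe = set(universe)
    target = set(target_genes) & universe
    terms = sorted(gene_sets)
    overlaps, pvalues = [], []
    for term in terms:
        members = set(gene_sets[term]) & universe
        k = len(members & target)
        overlaps.append(k)
        pvalues.append(hypergeom_sf(k, len(universe), len(members), len(target)))
    adjusted = benjamini_hochberg(pvalues)
    return {
        t: {"overlap": k, "p": p, "p_adj": q}
        for t, k, p, q in zip(terms, overlaps, pvalues, adjusted)
    }


# Hypothetical toy annotation: 20 background genes, 5 target genes, 2 terms.
universe = {f"g{i}" for i in range(1, 21)}
target_genes = ["g1", "g2", "g3", "g4", "g5"]
gene_sets = {
    "term_A": ["g1", "g2", "g3", "g4", "g10"],
    "term_B": ["g5", "g11", "g12", "g13", "g14", "g15"],
}

results = over_representation(target_genes, gene_sets, universe)
for term, stats in results.items():
    print(term, stats["overlap"], f"{stats['p']:.4g}", f"{stats['p_adj']:.4g}")
```

The choice of background universe strongly affects the result; here it is assumed to be all genes measured in the experiment.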
Expression Subtyping Analysis
Many clinical conditions have been shown to be associated with misregulation of gene expression resulting from upstream genetic disruption such as mutation. Beyond direct transcriptional regulation, which in turn controls protein expression, gene expression is also subject to indirect regulation such as chromatin conformational changes and cofactor binding. Hence, identifying the gene module(s) with perturbed expression is important for understanding a given phenotype and for designing therapeutic strategies. By applying a machine learning approach to gene expression data together with phenotypic or clinical data, we can identify candidate gene sets that potentially control a disease condition. These candidate gene sets can then serve as candidate biomarker(s) or therapeutic target(s) for the clinical condition. This analysis can be done with essentially any expression data coupled with phenotype data. In the example below, using protein expression data and clinical subtype information, a set of 37 genes was identified to classify breast cancer subtypes. Further functional analysis of this gene set would help in understanding cancer progression and predicting therapeutic outcome.
Subtype         Basal-like  HER2-enriched  Luminal A  Luminal B
Basal-like          24            0            1          0
HER2-enriched        0           12            0          6
Luminal A            1            1           21          6
Luminal B            0            2            2         29

Overall classification accuracy: 81.9%
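The overall accuracy reported above can be reproduced from the confusion matrix as the sum of the diagonal divided by the total number of samples. The short sketch below performs that arithmetic; the per-subtype figures additionally assume that rows correspond to the true subtype, which the source does not state.

```python
labels = ["Basal-like", "HER2-enriched", "Luminal A", "Luminal B"]

# Confusion matrix as given in the table (row/column orientation is an assumption).
matrix = [
    [24, 0, 1, 0],
    [0, 12, 0, 6],
    [1, 1, 21, 6],
    [0, 2, 2, 29],
]

correct = sum(matrix[i][i] for i in range(len(labels)))  # diagonal entries
total = sum(sum(row) for row in matrix)                   # all samples
accuracy = correct / total
print(f"Overall accuracy: {accuracy:.1%}")  # Overall accuracy: 81.9%

# Fraction of each row that falls on the diagonal
# (per-subtype recall IF rows are the true subtypes).
row_agreement = {
    label: matrix[i][i] / sum(matrix[i]) for i, label in enumerate(labels)
}
for label, value in row_agreement.items():
    print(f"{label}: {value:.1%}")
```

Most off-diagonal counts fall between the HER2-enriched, Luminal A and Luminal B rows and the Luminal B column, which is where a follow-up analysis might focus.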

 

 
 

 

 

PRICE PLANS

BASIC: $625
Decipher (best): $1100
Decipher X: $1925

Items listed under each plan:
• FREE initial consultation
• Standard data analysis
• Publication ready figures/plots
• Bioinformatics methodology writing
• Publication submission consultation
• Data integration
• TCGA Analysis
  • FREE initial consultation includes discussion of project objectives and expected results, analysis method proposal, and timeline and cost estimation.
  • Standard data analysis covers primary sequencing read quality control through downstream read quantification and functional profiling for up to 12 samples. Please refer to the analysis catalog for the detailed analysis workflow and result description.
  • Publication ready figures/plots includes customized processing of analysis result plots, such as splitting by sample group/condition, color selection and highlighting specific components within the plot. Applies only to analysis results for samples in the package analysis.
  • Publication submission consultation includes compilation of publication-ready analysis results, scientific writing review, scientific plot review or customization, and consultation on reviewer queries, for up to 5 hours in total.
  • Data integration includes integration of existing pre-processed data for up to two applications among RNA-Seq, ChIP-Seq and ATAC-Seq.
  • TCGA differential expression and correlation analysis profiles various cancer types and identifies the genes and pathways most strongly associated with the gene of interest. Please refer to our TCGA Analysis page for more information.
  • Customized bioinformatics analysis: USD 370 / hour
  • Standard data analysis: USD 110 / sample
  • Bioinformatics consultation: USD 110 / hour
 
 

 

 

Please enter your query below:

We provide customized NGS analysis services to our customers. Our bioinformatics team generates publication-quality data based on your requirements.

Please briefly let us know your requirements. Our scientific team will respond to you shortly to address your query.
