Computational Methods for Quality Check, Preprocessing and Normalization of RNA-Seq Data for Systems Biology and Analysis

Research output: Chapter in Book/Report/Conference proceeding, Book chapter, Research, peer-review

  • Gianluca Mazzoni
  • Haja N. Kadarmideen
The use of RNA sequencing (RNA-Seq) technologies is increasing mainly due to the development of new next-generation sequencing machines that have reduced the costs and the time needed for data generation.
Nevertheless, microarrays are still the more common choice, and one of the reasons is the complexity of RNA-Seq data analysis. Furthermore, RNA-Seq technology can introduce numerous biases, which have to be identified and properly removed in order to obtain accurate results.
Many tools have now been developed that allow each step to be performed without high-level programming skills. However, each step of the pipeline needs to be performed carefully and requires a good knowledge of both the technology and the algorithms.
In this comprehensive review, we describe the fundamental steps of the pipeline for RNA-Seq analysis to identify differentially expressed genes: raw data quality control, trimming and filtering procedures, alignment, postmapping quality control, counting, normalization and differential expression test.
For each step, we present the most common tools and describe their main characteristics and advantages, focusing on the statistics that they perform and the assumptions that they make about the data.
The choice of tool can have a large impact on the final results. To date, no gold standard has been established for this type of analysis.
In conclusion, this review can be useful both for educational purposes and for less experienced practitioners of animal genomic research. In the absence of a commonly accepted standard procedure, the general overview presented in this review can help to make the best choices during the implementation of an RNA-Seq pipeline.
Original language: English
Title of host publication: Systems Biology in Animal Production and Health
Editors: Haja N. Kadarmideen
Number of pages: 17
Volume: 2
Place of Publication: Switzerland
Publisher: Springer
Publication date: 29 Oct 2016
Pages: 61-77
Chapter: 3
ISBN (Print): 978-3-319-43330-1
ISBN (Electronic): 978-3-319-43332-5
DOIs
Publication status: Published - 29 Oct 2016

ID: 169060714