Author Information

Douglas Baumann
R. W. Doerge

Abstract

Next-generation sequencing (NGS) technologies have opened the door to a wealth of knowledge and information about biological systems, particularly in genomics and epigenomics. These tools, although useful, carry with them additional technological and statistical challenges that need to be understood and addressed. One such issue is amplification bias. Specifically, the majority of NGS technologies effectively sample small amounts of DNA or RNA that are amplified (i.e., copied) prior to sequencing. The amplification process is not perfect, and thus sequenced read counts can be extremely biased. Unfortunately, current amplification bias controlling procedures introduce a dependence of gene expression on gene length, which effectively masks the effects of short genes with high transcription rates. In this work we present a novel procedure to account for amplification bias and demonstrate its effectiveness in estimating true gene expression independent of gene length.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Apr 29th, 9:30 AM

CORRECTING FOR AMPLIFICATION BIAS IN NEXT-GENERATION SEQUENCING DATA
