A Comprehensive Analysis Workflow for Genome-Wide Screening Data from ChIP-Sequencing Experiments

Ozer, Hatice Gulcin; Bozdağ, Doruk; Camerlengo, Terry; Wu, Jiejun; Huang, Yi-Wen; Hartley, Tim; Parvin, Jeffrey D.; Huang, Tim; Catalyurek, Umit V.; Huang, Kun

doi:10.1007/978-3-642-00727-9_30


Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5462)

Included in the following conference series:

International Conference on Bioinformatics and Computational Biology

Abstract

ChIP-sequencing is a new technique for generating short DNA sequences that are useful for analyzing DNA-protein interactions and carrying out genome-wide studies. Although several studies have addressed the processing and analysis of ChIP-sequencing data, a complete workflow has not yet been reported. The size of the data and the broad range of biological questions are the main challenges in establishing a data analysis workflow for ChIP-sequencing data. In this paper, we present the ChIP-sequencing data analysis workflow that we developed at the Ohio State University Comprehensive Cancer Center Bioinformatics Shared Resources. The pipeline combines 1) different mapping algorithms, such as Eland, MapReads, SeqMap, and RMAP, to align short sequence reads to the reference genome; 2) a novel normalization algorithm to detect significant binding densities and to compare binding densities across experiments; 3) gene database mapping and 3D binding density visualization; and 4) distributed computing and high performance computing (HPC) support.
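The abstract does not describe the normalization algorithm itself, so the sketch below is only an illustration of the general idea behind comparing binding densities between experiments, not the authors' method. It assumes reads have already been aligned to a reference, bins their start positions into fixed-width windows, and scales each bin to reads per million mapped reads, a common baseline normalization. The function names, the bin size, and the toy positions are all hypothetical.

```python
from collections import Counter


def bin_counts(positions, bin_size):
    """Count mapped read start positions per fixed-width genomic bin."""
    return Counter(pos // bin_size for pos in positions)


def reads_per_million(counts, total_reads):
    """Scale raw bin counts to reads per million mapped reads (RPM).

    This is a generic depth normalization used here as a stand-in;
    it is not the novel algorithm mentioned in the abstract.
    """
    scale = 1_000_000 / total_reads
    return {bin_index: count * scale for bin_index, count in counts.items()}


# Hypothetical read start positions on one chromosome for two experiments
# sequenced at different depths.
sample_a = [120, 180, 950, 1010, 1020, 1990]
sample_b = [100, 1005, 1500]

rpm_a = reads_per_million(bin_counts(sample_a, 1000), len(sample_a))
rpm_b = reads_per_million(bin_counts(sample_b, 1000), len(sample_b))

# After scaling, densities in the same bin can be compared directly
# even though the two experiments have different numbers of reads.
for bin_index in sorted(set(rpm_a) | set(rpm_b)):
    print(bin_index, rpm_a.get(bin_index, 0.0), rpm_b.get(bin_index, 0.0))
```

Depth scaling of this kind only corrects for differences in the total number of mapped reads; detecting significant binding, as the abstract describes, would additionally need a background or control model.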

Author information

Authors and Affiliations

Department of Biomedical Informatics, The Ohio State University, USA
Hatice Gulcin Ozer, Doruk Bozdağ, Terry Camerlengo, Tim Hartley, Jeffrey D. Parvin, Umit V. Catalyurek & Kun Huang
The Ohio State University Comprehensive Cancer Center Biomedical Informatics Shared Resource, USA
Hatice Gulcin Ozer, Jeffrey D. Parvin & Kun Huang
Department of Electrical & Computer Engineering, The Ohio State University, USA
Doruk Bozdağ
Department of Molecular Virology, Immunology, The Ohio State University, Columbus, OH 43210, USA
Jiejun Wu, Yi-Wen Huang & Tim Huang

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Connecticut, 257 ITE Building, 371 Fairfield Way, Storrs, CT 06269-2155, USA
Sanguthevar Rajasekaran

About this paper

Cite this paper

Ozer, H.G. et al. (2009). A Comprehensive Analysis Workflow for Genome-Wide Screening Data from ChIP-Sequencing Experiments. In: Rajasekaran, S. (eds) Bioinformatics and Computational Biology. BICoB 2009. Lecture Notes in Computer Science, vol 5462. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00727-9_30

DOI: https://doi.org/10.1007/978-3-642-00727-9_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00726-2
Online ISBN: 978-3-642-00727-9
eBook Packages: Computer Science, Computer Science (R0)
