Selecting Oligonucleotide Probes for Whole-Genome Tiling Arrays with a Cross-Hybridization Potential

Published: 01 November 2011

    Abstract

    Popular current methods for designing oligonucleotide tiling arrays still rely on simple criteria such as Hamming distance or longest common factors, neglecting base-stacking effects, which contribute strongly to binding energies. Consequently, probes are often prone to cross-hybridization, which reduces the signal-to-noise ratio and complicates downstream analysis. We propose the first computationally efficient method that uses hybridization energy to identify specific oligonucleotide probes. Our Cross-Hybridization Potential (CHP) is computed with a Nearest Neighbor Alignment, which efficiently estimates a lower bound for the Gibbs free energy of the duplex formed by two DNA sequences of bounded length. It is derived from our simplified reformulation of t-gap insertion-deletion-like metrics. The computations are accelerated by a filter that uses weighted ungapped q-grams to arrive at seeds. The computation of the CHP is implemented in our software OSProbes, available under the GPL, which computes sets of viable probe candidates. The user can choose a trade-off between running time and the quality of the selected probes. Compared with prior approaches, we obtain very favorable results with respect to specificity and sensitivity for cross-hybridization and with respect to genome coverage with high-specificity probes. The combination of OSProbes and our Tileomatic method, which computes optimal tiling paths from candidate sets, yields globally optimal tiling arrays, balancing probe distance, hybridization conditions, and uniqueness of hybridization.
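
    To make the "simple criteria" named in the abstract concrete, the following is a minimal Python sketch of Hamming distance, the longest common factor (longest common substring), and a q-gram seed lookup. It is not the OSProbes implementation and does not compute the CHP. The function names are hypothetical, the q-grams here are unweighted (the abstract describes weighted ungapped q-grams), and the sequences are compared as plain strings without taking reverse complements.

    ```python
    from typing import Dict, List, Tuple


    def hamming_distance(a: str, b: str) -> int:
        """Number of mismatching positions between two equal-length sequences."""
        if len(a) != len(b):
            raise ValueError("sequences must have equal length")
        return sum(x != y for x, y in zip(a, b))


    def longest_common_factor(a: str, b: str) -> int:
        """Length of the longest contiguous substring shared by a and b."""
        best = 0
        prev = [0] * (len(b) + 1)  # common-suffix lengths for the previous row
        for i in range(1, len(a) + 1):
            cur = [0] * (len(b) + 1)
            for j in range(1, len(b) + 1):
                if a[i - 1] == b[j - 1]:
                    cur[j] = prev[j - 1] + 1
                    best = max(best, cur[j])
            prev = cur
        return best


    def shared_qgram_seeds(probe: str, target: str, q: int) -> List[Tuple[int, int]]:
        """Return (probe_pos, target_pos) pairs where an exact q-gram is shared.

        Illustrative, unweighted seed filter: only pairs with at least one
        seed would be passed on to a more expensive comparison.
        """
        index: Dict[str, List[int]] = {}
        for j in range(len(target) - q + 1):
            index.setdefault(target[j:j + q], []).append(j)
        seeds = []
        for i in range(len(probe) - q + 1):
            for j in index.get(probe[i:i + q], []):
                seeds.append((i, j))
        return seeds
    ```

    Neither of the first two measures depends on which bases are adjacent in the matching region, which is the limitation the abstract points to when it says that base-stacking effects are neglected.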


            Published In

            IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 8, Issue 6
            November 2011
            286 pages

            Publisher

            IEEE Computer Society Press

            Washington, DC, United States

            Author Tags

            1. Biology and genetics
            2. DNA microarrays
            3. cross hybridization
            4. oligonucleotide probes
            5. tiling arrays
