
Discovery of spatially cohesive itemsets in three-dimensional protein structures

Published: 01 September 2014

Abstract

In this paper, we present a cohesive structural itemset miner that discovers interesting patterns in a set of data objects within a multidimensional spatial structure by combining the cohesion and the support of each pattern. We propose two ways to build the itemset miner, VertexOne and VertexAll, in an attempt to balance accuracy against run time. The experiments show that VertexOne performs better: it finds almost the same itemsets as VertexAll in a much shorter time. We demonstrate the usefulness of the method by applying it to find interesting patterns of amino acids in spatial proximity within a set of proteins, based on their atomic coordinates in the protein molecular structure. Several patterns found by the cohesive structural itemset miner contain amino acids that frequently co-occur in the spatial structure even though they are distant in the primary protein sequence and are brought together only by protein folding. Furthermore, there were various indications that some of the discovered patterns seem to represent common underlying support structures within the proteins.
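The abstract does not define cohesion or describe how VertexOne and VertexAll work, so the following is only an illustrative sketch of the general idea of combining support with a spatial-proximity measure. Everything in it is an assumption made for the example: the function names, the toy coordinates, the thresholds, and in particular the proximity measure (the radius of the smallest sphere centred on a residue that reaches one residue of every type in the itemset), which is a simple stand-in and not the paper's definition.

```python
import math

def covering_radius(residues, index, itemset):
    """Radius of the smallest sphere centred on residue `index` that
    contains at least one residue of every type in `itemset`."""
    _, centre = residues[index]
    radius = 0.0
    for item in itemset:
        # Distance from the centre to the nearest residue of this type.
        nearest = min(
            (math.dist(centre, pos) for label, pos in residues if label == item),
            default=math.inf,
        )
        radius = max(radius, nearest)
    return radius

def mean_radius(residues, itemset):
    """Average covering radius over all residues whose type is in `itemset`."""
    radii = [covering_radius(residues, i, itemset)
             for i, (label, _) in enumerate(residues) if label in itemset]
    return sum(radii) / len(radii) if radii else math.inf

def support(proteins, itemset):
    """Fraction of proteins that contain every residue type in `itemset`."""
    hits = sum(1 for residues in proteins
               if itemset <= {label for label, _ in residues})
    return hits / len(proteins)

def is_interesting(proteins, itemset, min_support, max_radius):
    """Hypothetical filter: frequent enough AND spatially tight on average."""
    if support(proteins, itemset) < min_support:
        return False
    containing = [r for r in proteins if itemset <= {l for l, _ in r}]
    if not containing:
        return False
    avg = sum(mean_radius(r, itemset) for r in containing) / len(containing)
    return avg <= max_radius

# Toy data (made up): each protein is a list of (residue type, (x, y, z)).
protein_a = [("CYS", (0, 0, 0)), ("HIS", (3, 0, 0)), ("GLY", (30, 0, 0))]
protein_b = [("CYS", (0, 0, 0)), ("HIS", (0, 4, 0))]
protein_c = [("GLY", (0, 0, 0))]
proteins = [protein_a, protein_b, protein_c]

print(is_interesting(proteins, {"CYS", "HIS"}, min_support=0.5, max_radius=5.0))
print(is_interesting(proteins, {"CYS", "GLY"}, min_support=0.3, max_radius=5.0))
```

In this toy data, CYS and HIS occur together in two of the three proteins and lie within a few units of each other, whereas CYS and GLY co-occur only once and far apart, so only the first itemset passes the filter. Note that sequence position plays no role here: residues count as close purely on their 3D coordinates, which is the situation the abstract describes for amino acids brought together by folding.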


          Published In

          IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 11, Issue 5
          September/October 2014
          206 pages
          ISSN:1545-5963
          • Editor: Ying Xu

          Publisher

          IEEE Computer Society Press

          Washington, DC, United States

          Publication History

          Published: 01 September 2014
          Accepted: 06 March 2014
          Revised: 05 February 2014
          Received: 04 November 2013
          Published in TCBB Volume 11, Issue 5

          Author Tags

          1. cohesion
          2. itemset mining
          3. multidimensional data
          4. protein structure

          Qualifiers

          • Article
