Peano Count Trees (P-Trees) and Rule Association Mining for Gene Expression Profiling of Microarray Data

Valdivia-Granda, Willy; Perrizo, William; Deckard, Edward; Larson, Francis

arXiv:cs/0610076 (cs)

[Submitted on 12 Oct 2006 (v1), last revised 13 Oct 2006 (this version, v2)]

Abstract: The greatest challenge in maximizing the use of gene expression data is to develop new computational tools capable of interconnecting and interpreting the results from different organisms and experimental settings. We propose an integrative and comprehensive approach that includes a super-chip containing data from microarray experiments collected on different species subjected to hypoxic and anoxic stress. A data mining technology called Peano count trees (P-trees) is used to represent genomic data in multiple dimensions. Each microarray spot is represented as a pixel with its corresponding red/green intensity feature bands. Each band is stored separately in a reorganized 8-separate (bSQ) file format. Each bSQ file is converted to a quadrant-based tree structure (P-tree), from which a super-chip is represented as expression P-trees (EP-trees) and repression P-trees (RP-trees). Association rule mining is proposed as a way to meaningfully organize signal transduction pathways while taking evolutionary considerations into account. We argue that the genetic constitution of an organism (K) can be represented by the total number of genes belonging to two groups. Group X comprises genes (X1, ..., Xn), each of which can be represented as 1 or 0 depending on whether the gene was expressed or not. The second group, of Y genes (Y1, ..., Yn), is expressed at different levels. These genes are categorized as having very high repression, high expression, very repressed, or highly repressed levels. However, many genes of group Y are species-specific and modulated by the products and combinations of genes of group X. In this paper, we introduce the bSQ and P-tree technology, the biological implications of association rule mining using X and Y gene groups, and some advances in the integration of this information using the BRAIN architecture.
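As a rough illustration of the band-to-P-tree pipeline the abstract describes, the sketch below splits one intensity band into binary bit planes and builds a quadrant count tree over one plane. It is not the authors' implementation. It assumes 8-bit intensities, a square grid whose side is a power of two, and one binary plane per bit; the names `bit_planes` and `build_count_tree` and the tuple-based node layout are hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]
# Hypothetical node layout: (count_of_ones, children).
# children is None for a pure quadrant (all 0s or all 1s),
# otherwise a list of four sub-nodes in top-left, top-right,
# bottom-left, bottom-right order.
Node = Tuple[int, Optional[list]]


def bit_planes(band: Grid, bits: int = 8) -> List[Grid]:
    """Split a band of integer intensities into `bits` binary grids.

    Plane 0 holds the most significant bit of every value.
    Assumes every value fits in `bits` bits.
    """
    return [
        [[(value >> (bits - 1 - b)) & 1 for value in row] for row in band]
        for b in range(bits)
    ]


def build_count_tree(plane: Grid, r0: int = 0, c0: int = 0,
                     size: Optional[int] = None) -> Node:
    """Recursively count the 1-bits in each quadrant of a binary grid.

    Assumes `plane` is square with a power-of-two side length.
    Recursion stops at pure quadrants, which is where the
    structure saves space on uniform regions.
    """
    if size is None:
        size = len(plane)
    count = sum(plane[r][c]
                for r in range(r0, r0 + size)
                for c in range(c0, c0 + size))
    if count == 0 or count == size * size or size == 1:
        return (count, None)
    half = size // 2
    children = [build_count_tree(plane, r0 + dr, c0 + dc, half)
                for dr in (0, half) for dc in (0, half)]
    return (count, children)


# Made-up 4x4 band of spot intensities, used only for illustration.
band = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [128, 0, 0, 0],
    [0, 127, 0, 0],
]
planes = bit_planes(band)
tree = build_count_tree(planes[0])  # tree over the most significant bit
print(tree)
```

The root count gives the number of spots whose intensity has the chosen bit set, and each child repeats that count for one quadrant, so a uniform quadrant collapses to a single node.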

Subjects:	Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Molecular Networks (q-bio.MN)
Cite as:	arXiv:cs/0610076 [cs.DS]
	(or arXiv:cs/0610076v2 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.cs/0610076
Journal reference:	2002 International Conference in Bioinformatics. Bangkok, Thailand

Submission history

From: Willy Valdivia-Granda
[v1] Thu, 12 Oct 2006 19:55:32 UTC (272 KB)
[v2] Fri, 13 Oct 2006 18:27:58 UTC (272 KB)
