Abstract
In this paper, we propose a decomposition method for binary tensors: the generalized multilinear model for principal component analysis (GMLPCA). To the best of our knowledge, no other principled, systematic framework currently exists for the decomposition or topographic mapping of binary tensors. In the model formulation, we constrain the natural parameters of the Bernoulli distributions for each tensor element to lie in a subspace spanned by a reduced set of basis (principal) tensors. We evaluate and compare the proposed GMLPCA technique with existing real-valued tensor decomposition methods in two scenarios: (1) a series of controlled experiments on synthetic data and (2) a real-world biological dataset of DNA sub-sequences from different functional regions, with each sequence represented as a binary tensor. The experiments suggest that the GMLPCA model is better suited to modelling binary tensors than its real-valued counterparts. Furthermore, we extend the GMLPCA model to the semi-supervised setting by forcing it to search for a natural parameter subspace that represents a user-specified compromise between modelling quality and the degree of class separation.
Notes
The updating formulas of CSA and MPCA are similar, the only difference being that MPCA subtracts the mean from the data tensors.
By this term, we denote a short, widespread sequence of nucleotides that has, or may have, biological significance.
R = [3 × 3] denotes a natural parameter subspace spanned by 3 row and 3 column basis vectors.
The \(\theta\)'s are the fixed current values of the parameters and should be treated as constants.
References
Lu H, Plataniotis KN, Venetsanopoulos AN (2008) MPCA: multilinear principal component analysis of tensor objects. IEEE Trans Neural Netw 19:18–39
Nolker C, Ritter H (2002) Visual recognition of continuous hand postures. IEEE Trans Neural Netw 13:983–994
Jia K, Gong S (2005) Multi-modal tensor face for simultaneous super-resolution and recognition. In: 10th IEEE international conference on computer vision, vol 2. Beijing, pp 1683–1690
Renard N, Bourennane S (2008) An ICA-based multilinear algebra tools for dimensionality reduction in hyperspectral imagery. In: IEEE international conference on acoustics, speech and signal processing, vol 4. Las Vegas, NV, pp 1345–1348
Cai D, He X, Han J (2006) Tensor space model for document analysis. In: Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval. Seattle, WA, pp 625–626
Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 2(6):559–572
Lu H, Plataniotis KN, Venetsanopoulos AN (2009) Uncorrelated multilinear discriminant analysis with regularization and aggregation for tensor object recognition. IEEE Trans Neural Netw 20:103–123
Zafeiriou S (2009) Discriminant nonnegative tensor factorization algorithms. IEEE Trans Neural Netw 20:217–235
Panagakis Y, Kotropoulos C, Arce G (2010) Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification. IEEE Trans Audio Speech Lang Process 18:576–588
Brachat J, Comon P, Mourrain B, Tsigaridas E (2010) Symmetric tensor decomposition. Linear Algebra Appl (in press)
Schein A, Saul L, Ungar L (2003) A generalized linear model for principal component analysis of binary data. In: 9th international workshop artificial intelligence and statistics. Key West, FL
Acar E, Yener B (2009) Unsupervised multiway data analysis: a literature survey. IEEE Trans Knowl Data Eng 21:6–20
Wang H, Ahuja N (2005) Rank-r approximation of tensors: using image-as-matrix representation. In: IEEE conference on computer vision and pattern recognition, pp 346–353
De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21 (4):1253–1278
Kofidis E, Regalia PA (2001) On the best rank-1 approximation of higher-order supersymmetric tensors. SIAM J Matrix Anal Appl 23(3):863–884
Wang H, Ahuja N (2004) Compact representation of multidimensional data using tensor rank-one decomposition. In: Proceedings 17th international conference pattern recognition. Cambridge, UK, pp 44–47
Ye J, Janardan R, Li Q (2004) GPCA: an efficient dimension reduction scheme for image compression and retrieval. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining (KDD '04). New York, NY, pp 354–363
Xu D, Yan S, Zhang L, Lin S, Zhang H-J, Huang T (2008) Reconstruction and recognition of tensor-based objects with concurrent subspaces analysis. IEEE Trans Circuits Syst Video Technol 18:36–47
Lu H, Plataniotis KN, Venetsanopoulos AN (2009) Uncorrelated multilinear principal component analysis for unsupervised multilinear subspace learning. IEEE Trans Neural Netw 20(11):1820–1836
Inoue K, Hara K, Urahama K (2009) Robust multilinear principal component analysis. In: 12th IEEE international conference on computer vision, pp 591–597
Lu H, Plataniotis KN, Venetsanopoulos AN (2011) A survey of multilinear subspace learning for tensor data. Pattern Recognit 44:1540–1551
Cortes C, Mohri M (2003) AUC optimization vs. error rate minimization. In: Advances in neural information processing systems, vol 16. Banff, AL, Canada, pp 313–320
Li X, Zeng J, Yan H (2008) PCA-HPR: a principle component analysis model for human promoter recognition. Bioinformation 2(9):373–378
Sonnenburg S, Zien A, Philips P, Ratsch G (2008) POIMs: positional oligomer importance matrices–understanding support vector machine-based signal detectors. Bioinformatics 24(13): i6–i14
Baldi P, Brunak S (2001) Bioinformatics: the machine learning approach, 2nd edn. MIT Press, Cambridge, MA
Isaev A (2007) Introduction to mathematical methods in Bioinformatics. Springer, Secaucus
Wakaguri H, Yamashita R, Suzuki Y, Sugano S, Nakai K (2008) DBTSS: database of transcription start sites. Nucleic Acids Res 36:97–101 (Database issue)
Saxonov S, Daizadeh I, Fedorov A, Gilbert W (2000) EID: the exon-intron database—an exhaustive database of protein-coding intron-containing genes. Nucleic Acids Res 28(1):185–190
Ron D, Singer Y, Tishby N (1996) The power of amnesia: learning probabilistic automata with variable memory length. Mach Learn 25(2–3):117–149
Tino P, Dorffner G (2001) Predicting the future of discrete sequences from fractal representations of the past. Mach Learn 45(2):187–217
Cross S, Clark V, Bird A (1999) Isolation of CpG islands from large genomic clones. Nucleic Acids Res 27(10):2099–2107
Bodén M, Bailey TL (2007) Associating transcription factor-binding site motifs with target GO terms and target genes. Nucleic Acids Res 36(12):4108–4117
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25(1):25–29
Straussman R, Nejman D, Roberts D, Steinfeld I, Blum B, Benvenisty N, Simon I, Yakhini Z, Cedar H (2009) Developmental programming of CpG island methylation profiles in the human genome. Nat Struct Mol Biol 16(5): 564–571
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS (2009) MEME suite: tools for motif discovery and searching. Nucleic Acids Res 37:W202–W208 (Web Server issue)
Gupta S, Stamatoyannopoulos J, Bailey T, Noble W (2007) Quantifying similarity between motifs. Genome Biol 8(2):R24
Globerson A, Roweis S (2006) Metric learning by collapsing classes. Adv Neural Inf Process Syst 18:451–458
Acknowledgements
Jakub Mažgut was supported by the Slovak Research and Development Agency under the contract No. APVV-0208-10 and by the Scientific Grant Agency of Slovak Republic, Grant No. VG1/0971/11. Peter Tiňo was supported by the DfES UK/Hong Kong Fellowship for Excellence and a BBSRC Grant (No. BB/H012508/1). Mikael Bodén was supported by the ARC Centre of Excellence in Bioinformatics and the 2009 University of Birmingham Ramsay Research Scholarship Award. Hong Yan was supported by a grant from City University of Hong Kong (Project 7002843).
Appendix: Parameter estimation
To obtain analytical parameter updates, we use the trick of [11] and take advantage of the fact that while the model log-likelihood (7) is not concave in all parameters jointly, it is concave in each parameter when the others are kept fixed. This leads to the iterative estimation scheme detailed below.
The analytical updates will be derived from a lower bound on the log-likelihood (7) using [11]:
where \(\theta\) stands for the current value of an individual natural parameter \(\theta_{m,{\bf i}}\) of the Bernoulli noise model \(P({\mathcal A}_{m,{\bf i}} \mid \theta_{m,{\bf i}})\) and \(\hat \theta\) stands for the new estimate of that parameter, given the current parameter values. Hence, from (7) we obtain
Denote \((\tanh \frac{\theta_{m,{\bf i}}}{2}) / \theta_{m,{\bf i}}\) by \(\Uppsi_{m,{\bf i}}\). Grouping together the constant terms in (32) leads to
Note that \(H(\hat \Uptheta, \Uptheta) = {\mathcal {L}}(\hat \Uptheta)\) only if \(\hat \Uptheta = \Uptheta\). Therefore, by choosing \(\hat \Uptheta\) to maximize \(H(\hat \Uptheta, \Uptheta)\), we guarantee \({\mathcal {L}}(\hat \Uptheta) \ge H(\hat \Uptheta, \Uptheta) \ge H(\Uptheta, \Uptheta) = {\mathcal {L}}(\Uptheta)\) [11].
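To make the bounding step concrete, the following is a minimal numerical sketch, assuming \(\pm 1\) coding of the binary entries and the standard quadratic (Jaakkola–Jordan-type) bound on the logistic log-likelihood used in [11]; the exact arrangement of constants in (32) may differ. The \(\Uppsi/4\) curvature term and the tightness at \(\hat\theta = \theta\) are exactly the properties exploited above.

```python
import numpy as np

def log_sigmoid(x):
    # numerically stable log(1 / (1 + exp(-x)))
    return -np.logaddexp(0.0, -x)

def psi(theta):
    # Psi = tanh(theta / 2) / theta, with limiting value 1/2 as theta -> 0
    safe = np.where(theta == 0.0, 1.0, theta)
    return np.where(theta == 0.0, 0.5, np.tanh(safe / 2.0) / safe)

def log_lik(a, theta):
    # Bernoulli log-likelihood with a in {-1, +1}: log P(a | theta) = log sigma(a * theta)
    return log_sigmoid(a * theta)

def lower_bound(a, theta_hat, theta):
    # H(theta_hat, theta): quadratic in theta_hat, tight at theta_hat = theta
    return (log_sigmoid(a * theta)
            + a * (theta_hat - theta) / 2.0
            - psi(theta) / 4.0 * (theta_hat**2 - theta**2))

rng = np.random.default_rng(0)
a = rng.choice([-1.0, 1.0], size=1000)
theta = rng.normal(size=1000)
theta_hat = rng.normal(size=1000)

# the bound never exceeds the log-likelihood ...
assert np.all(log_lik(a, theta_hat) >= lower_bound(a, theta_hat, theta) - 1e-12)
# ... and touches it exactly at theta_hat = theta
assert np.allclose(log_lik(a, theta), lower_bound(a, theta, theta))
```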
We are now ready to constrain the Bernoulli parameters to be optimized [see (9)]:
We will update the model parameters so as to maximize
where
with \(\hat \theta_{m,{\bf i}}\) given by (35).
1.1 Updates for n-mode space basis
When updating the n-mode space basis \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2, \ldots, {\bf u}^{(n)}_{R_n} \}\), the bias tensor \(\Updelta\) and the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}}\), \(m = 1, 2, \ldots, M\), \({\bf r} \in \rho\), are kept fixed to their current values.
For n = 1, 2, ..., N, define
with the obvious interpretation in the boundary cases. Given \({\bf i} \in \Upupsilon_{-n}\) and an n-mode index \(j \in \{1,2,\ldots,I_{n}\}\), the index N-tuple \((i_1, \ldots, i_{n-1}, j, i_{n+1}, \ldots, i_N)\) formed by inserting \(j\) at the n-th place of \({\bf i}\) is denoted by \([{\bf i}, j|n]\).
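As a concrete illustration of this notation (a hypothetical helper, not from the paper), the insertion \([{\bf i}, j|n]\) can be written as:

```python
# Insert the n-mode index j at the n-th place (1-based, as in the text)
# of the (N-1)-tuple i from Upsilon_{-n}.
def insert_at_mode(i, j, n):
    return i[:n - 1] + (j,) + i[n - 1:]

# e.g. N = 3, n = 2: i = (i_1, i_3) = (1, 4), j = 7 -> (1, 7, 4)
assert insert_at_mode((1, 4), 7, 2) == (1, 7, 4)
```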
In order to evaluate
we realize that \(u^{(n)}_{q,j}\) is involved in expressing all \(\hat \theta_{m,[{\bf i}, j|n]}\), \(m = 1, 2, \ldots, M\), with \({\bf i} \in \Upupsilon_{-n}\). Therefore,
where
and from (35),
Here, the index set \(\rho_{-n}\) is defined analogously to \(\Upupsilon_{-n}:\)
Setting the derivative (39) to zero results in
Rewriting (35) as
and applying to (43) we obtain
where
and
For each n-mode coordinate \(j \in \{ 1,2,\ldots,I_n\}\), collect the j-th coordinate values of all n-mode basis vectors into a column vector \({\bf u}^{(n)}_{:,j} = (u^{(n)}_{1,j}, u^{(n)}_{2,j}, \ldots, u^{(n)}_{R_n,j} )^T\). Analogously, stack all the \({\mathcal {S}}^{(n)}_{q,j}\) values into a column vector \({\mathcal {S}}^{(n)}_{:,j} = ({\mathcal {S}}^{(n)}_{1,j}, {\mathcal {S}}^{(n)}_{2,j}, \ldots, {\mathcal {S}}^{(n)}_{R_n,j})^T\). Finally, we construct an \(R_n \times R_n\) matrix \({\mathcal {K}}^{(n)}_{:,:,j}\) whose q-th row is \(({\mathcal {K}}^{(n)}_{q,1,j}, {\mathcal {K}}^{(n)}_{q,2,j},\ldots, {\mathcal {K}}^{(n)}_{q,R_n,j})\), \(q = 1,2,\ldots,R_n\). The n-mode basis vectors are updated by solving \(I_n\) linear systems of size \(R_n \times R_n\):
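In code, this final step might look as follows: a minimal sketch assuming the arrays holding \({\mathcal {K}}^{(n)}_{:,:,j}\) and \({\mathcal {S}}^{(n)}_{:,j}\) have already been assembled from the fixed quantities (their defining sums are not reproduced here); the names update_mode_basis, K and S are illustrative.

```python
import numpy as np

def update_mode_basis(K, S):
    # K: array of shape (R_n, R_n, I_n) holding the matrices K^(n)_{:,:,j};
    # S: array of shape (R_n, I_n) holding the vectors S^(n)_{:,j}.
    R_n, _, I_n = K.shape
    U = np.empty((R_n, I_n))   # row q will hold basis vector u^(n)_q
    for j in range(I_n):
        # one R_n x R_n linear solve per n-mode coordinate j
        U[:, j] = np.linalg.solve(K[:, :, j], S[:, j])
    return U
```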
1.2 Updates for expansion coefficients
When updating the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}}\), the bias tensor \(\Updelta\) and the basis sets \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2, \ldots, {\bf u}^{(n)}_{R_n} \}\) for all modes \(n = 1, 2, \ldots, N\) are kept fixed to their current values.
For \({\bf r} \in \rho\) and \({\bf i} \in \Upupsilon\), denote \(\prod_{n=1}^N u^{(n)}_{r_n,i_n}\) by \(C_{{\bf r},{\bf i}}\). For data index \(\ell = 1,2,\ldots,M\) and basis index \({\bf v} \in \rho\), we have
where
and \(\frac{\partial \hat \theta_{m,{\bf i}}}{\partial {\mathcal {Q}}_{\ell,{\bf v}}} = C_{{\bf v},{\bf i}}\) if \(m=\ell\) and \(\frac{\partial \hat \theta_{m,{\bf i}}}{\partial {\mathcal {Q}}_{\ell,{\bf v}}} = 0\) otherwise.
By imposing \({\frac{\partial \ \mathcal{H}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}} = 0, }\) we get
where
and
To solve for expansion coefficients using the tools of matrix algebra, we need to vectorize tensor indices. Consider any one-to-one function \(\kappa\) from \(\rho\) to \(\{ 1,2,\ldots,\prod_{n=1}^N R_n \}.\) For each input tensor index \(\ell = 1,2,\ldots,M,\)
- create a square \((\prod_{n=1}^N R_n) \times (\prod_{n=1}^N R_n)\) matrix \({\mathcal {P}}_{:,:,\ell}\) whose \((\kappa({\bf v}), \kappa({\bf r}))\)-th element is equal to \({\mathcal {P}}_{{\bf v},{\bf r},\ell}\),
- stack the values of \({\mathcal {T}}_{{\bf v},\ell}\) into a column vector \({\mathcal {T}}_{:,\ell}\) whose \(\kappa({\bf v})\)-th coordinate is \({\mathcal {T}}_{{\bf v},\ell}\),
- collect the expansion coefficients \({\mathcal {Q}}_{\ell,{\bf r}}\) in a column vector \({\mathcal {Q}}_{\ell,:}\) with \(\kappa({\bf r})\)-th coordinate equal to \({\mathcal {Q}}_{\ell,{\bf r}}\).
The expansion coefficients for the \(\ell\)-th input tensor \({{\mathcal A}_\ell}\) can be obtained by solving
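A minimal sketch of this solve, assuming the arrays holding \({\mathcal {P}}_{{\bf v},{\bf r},\ell}\) and \({\mathcal {T}}_{{\bf v},\ell}\) are precomputed (their defining sums are not reproduced here); NumPy's row-major flattening serves as one valid choice of the one-to-one map \(\kappa\).

```python
import numpy as np

def update_coefficients(P_ell, T_ell, ranks):
    # P_ell: array of shape ranks + ranks holding P_{v,r,ell};
    # T_ell: array of shape ranks holding T_{v,ell};
    # ranks = (R_1, ..., R_N).
    D = int(np.prod(ranks))
    P_mat = P_ell.reshape(D, D)   # kappa applied to both v and r
    T_vec = T_ell.reshape(D)      # kappa applied to v
    # solve the linear system and map back to tensor index r
    return np.linalg.solve(P_mat, T_vec).reshape(ranks)
```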
1.3 Updates for the bias tensor
As before, when updating the bias tensor \(\Updelta\), the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}}\), \(m = 1, 2, \ldots, M\), \({\bf r} \in \rho\), and the basis sets \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2, \ldots, {\bf u}^{(n)}_{R_n} \}\) for all modes \(n = 1, 2, \ldots, N\) are kept fixed to their current values.
Fix \({\bf j} \in \Upupsilon\). We evaluate
where \(\frac{\partial \hat \theta_{m,{\bf i}}}{\partial \Updelta_{{\bf j}}}\) is equal to 1 if \({\bf i} = {\bf j}\) and 0 otherwise.
Solving for \(\frac{\partial \ \mathcal{H}}{\partial \ \Updelta_{{\bf j}}} = 0\) leads to
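Under the same \(\pm 1\) coding and quadratic bound assumed in the first sketch, setting this derivative to zero yields an element-wise weighted-average update for \(\Updelta\); the following hypothetical implementation illustrates the shape of the computation (the paper's own closed form is not reproduced above).

```python
import numpy as np

def update_bias(A, B, Psi):
    # A:   data tensors with entries in {-1, +1}, shape (M,) + (I_1, ..., I_N)
    # B:   subspace part of theta_hat (so theta_hat = B + Delta), same shape as A
    # Psi: current values tanh(theta / 2) / theta, same shape as A
    # Element-wise closed form obtained by setting dH/dDelta_j to zero
    return (A.sum(axis=0) - (Psi * B).sum(axis=0)) / Psi.sum(axis=0)
```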
Cite this article
Mažgut, J., Tiňo, P., Bodén, M. et al. Dimensionality reduction and topographic mapping of binary tensors. Pattern Anal Applic 17, 497–515 (2014). https://doi.org/10.1007/s10044-013-0317-y