- Article, September 2024
What is the Intrinsic Dimension of Your Binary Data?—and How to Compute it Quickly
Abstract: Dimensionality is an important aspect for analyzing and understanding (high-dimensional) data. In their 2006 ICDM paper Tatti et al. answered the question for a (interpretable) dimension of binary data tables by introducing a normalized ...
- short-paper, July 2024
BRB-KMeans: Enhancing Binary Data Clustering for Binary Product Quantization
SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 2306–2310, https://doi.org/10.1145/3626772.3657898
In Binary Product Quantization (BPQ), where product quantization is applied to binary data, the traditional k-majority method is used for clustering, with centroids determined based on Hamming distance and majority vote for each bit. However, this ...
- research-article, March 2024
Statistical inference for noisy incomplete binary matrix
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 95, Pages 4368–4433
We consider the statistical inference for noisy incomplete binary (or 1-bit) matrix. Despite the importance of uncertainty quantification to matrix completion, most of the categorical matrix completion literature focuses on point estimation and ...
- research-article, April 2021
GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder
Journal of Computer Science and Technology (JCST), Volume 36, Issue 2, Pages 299–309, https://doi.org/10.1007/s11390-021-0804-3
Abstract: Unlike traditional clustering analysis, the biclustering algorithm works simultaneously on two dimensions of samples (row) and variables (column). In recent years, biclustering methods have been developed rapidly and widely applied in biological ...
- research-article, November 2020
Automatic Discovery and Synthesis of Checksum Algorithms from Binary Data Samples
PLAS'20: Proceedings of the 15th Workshop on Programming Languages and Analysis for Security, Pages 25–34, https://doi.org/10.1145/3411506.3417599
Reverse engineering unknown binary message formats is an important part of security research. Error detecting codes such as checksums and Cyclic Redundancy Check codes (CRCs) are commonly added to messages as a guard against corrupt or untrusted input. ...
- research-article, April 2015
On the discovery of fake binary ratings
SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing, Pages 901–907, https://doi.org/10.1145/2695664.2695866
Privacy-preserving collaborative filtering methods promise to preserve privacy of individuals. In general, privacy has two aspects, preserving the rating values of users and masking who rated which items. In this study, we analyze a privacy-preserving ...
- research-article, June 2014
ftTRACK: Fault-Tolerant Target Tracking in Binary Sensor Networks
ACM Transactions on Sensor Networks (TOSN), Volume 10, Issue 4, Article No.: 64, Pages 1–28, https://doi.org/10.1145/2538509
The provision of accurate and reliable localization and tracking information for a target moving inside a binary Wireless Sensor Network (WSN) is quite challenging, especially when sensor failures due to hardware and/or software malfunctions or ...
- Article, October 2013
Fault Isolation by Comparison of Event Lists Using a Weighted Distance
SMC '13: Proceedings of the 2013 IEEE International Conference on Systems, Man, and Cybernetics, Pages 595–600, https://doi.org/10.1109/SMC.2013.107
A decision support system for operators monitoring complex systems is proposed. It consists in a fault isolation method based on pattern matching using binary information, in this case event lists. A training set composed of faults is used to create ...
- research-article, March 2012
Feature Selection Based on Class-Dependent Densities for High-Dimensional Binary Data
IEEE Transactions on Knowledge and Data Engineering (IEEECS_TKDE), Volume 24, Issue 3, Pages 465–477, https://doi.org/10.1109/TKDE.2010.263
Data and knowledge management systems employ feature selection algorithms for removing irrelevant, redundant, and noisy information from the data. There are two well-known approaches to feature selection, feature ranking (FR) and feature subset ...
- Article, November 2011
Co-clustering for binary data with maximum modularity
ICONIP'11: Proceedings of the 18th international conference on Neural Information Processing - Volume Part II, Pages 700–708, https://doi.org/10.1007/978-3-642-24958-7_81
The modularity measure has been recently proposed for graph clustering which allows automatic selection of the number of clusters. Empirically, higher values of the modularity measure have been shown to correlate well with graph clustering. In order to ...
- Article, October 2011
Using dimensionality reduction method for binary data to questionnaire analysis
MEMICS'11: Proceedings of the 7th international conference on Mathematical and Engineering Methods in Computer Science, Pages 146–154, https://doi.org/10.1007/978-3-642-25929-6_14
In this paper we introduce a modified version of existing dimensionality reduction method for binary data, weighted logistic principal component analysis (WLPCA). We propose to fit the basis vectors of the latent natural parameter subspace in a ...
- article, July 2011
Bayesian process optimization using failure amplification method
Applied Stochastic Models in Business and Industry (ASMBI), Volume 27, Issue 4, Pages 402–409, https://doi.org/10.1002/asmb.846
This work is motivated by a problem of optimizing printed circuit board manufacturing using design of experiments. The data are binary, which poses challenges in model fitting and optimization. We use the idea of failure amplification method to increase ...
- Article, September 2010
Multilinear decomposition and topographic mapping of binary tensors
ICANN'10: Proceedings of the 20th international conference on Artificial neural networks: Part I, Pages 317–326
Current methods capable of processing tensor objects in their natural higher-order structure have been introduced for real-valued tensors. Such techniques, however, are not suitable for processing binary tensors which arise in many real world problems, ...
- research-article, July 2010
Patterns from multiresolution 0-1 data
UP '10: Proceedings of the ACM SIGKDD Workshop on Useful Patterns, Pages 8–16, https://doi.org/10.1145/1816112.1816115
Biological systems are complex systems and often the biological data is available in different resolutions. Computational algorithms are often designed to work with only specific resolution of data. Hence, upsampling or downsampling is necessary before ...
- article, December 2009
Similarity measures for binary and numerical data: a survey
International Journal of Knowledge Engineering and Soft Data Paradigms (IJKESDP), Volume 1, Issue 1, Pages 63–84, https://doi.org/10.1504/IJKESDP.2009.021985
Similarity measures aim at quantifying the extent to which objects resemble each other. Many techniques in data mining, data analysis or information retrieval require a similarity measure, and selecting an appropriate measure for a given problem is a ...
- Article, October 2009
Text Fusion Watermarking in Medical Image with Semi-reversible for Secure Transfer and Authentication
ARTCOM '09: Proceedings of the 2009 International Conference on Advances in Recent Technologies in Communication and Computing, Pages 585–589, https://doi.org/10.1109/ARTCom.2009.18
Nowadays, the transmission of digitized medical information has become very convenient due to the generality of Internet. Internet has created the biggest benefit to achieve the transmission of patient information efficiently. However, it is easier that ...
- research-article, September 2009
SNAP: Fault Tolerant Event Location Estimation in Sensor Networks Using Binary Data
IEEE Transactions on Computers (ITCO), Volume 58, Issue 9, Pages 1185–1197, https://doi.org/10.1109/TC.2009.60
This paper investigates the use of wireless sensor networks for estimating the location of an event that emits a signal that propagates over a large region. In this context, we assume that the sensors make binary observations and report the event (...
- Article, August 2006
Maximally informative k-itemsets and their efficient discovery
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 237–244, https://doi.org/10.1145/1150402.1150431
In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting from the total set of items an itemset of specified size, such that the ...
- article, March 2006
A Unified View on Clustering Binary Data
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data. Binary data have been ...
- rfc, October 2005
RFC 4194: The S Hexdump Format
This document specifies the S Hexdump Format (SHF), a new, XML-based open format for describing binary data in hexadecimal notation. SHF provides the ability to describe both small and large, simple and complex hexadecimal data dumps in an open, modern, ...