Showing 1–2 of 2 results for author: Mesa, D A

Search v0.5.6 released 2020-02-24

arXiv:2003.07921

cs.LG stat.ML

Semi-supervised Contrastive Learning Using Partial Label Information

Authors: Colin B. Hansen, Vishwesh Nath, Diego A. Mesa, Yuankai Huo, Bennett A. Landman, Thomas A. Lasko

Abstract: In semi-supervised learning, information from unlabeled examples is used to improve the model learned from labeled examples. In some learning problems, partial label information can be inferred from otherwise unlabeled examples and used to further improve the model. In particular, partial label information exists when subsets of training examples are known to have the same label, even though the label itself is missing. By encouraging the model to give the same label to all such examples through contrastive learning objectives, we can potentially improve its performance. We call this encouragement Nullspace Tuning because the difference vector between any pair of examples with the same label should lie in the nullspace of a linear model. In this paper, we investigate the benefit of using partial label information using a careful comparison framework over well-characterized public datasets. We show that the additional information provided by partial labels reduces test error over good semi-supervised methods usually by a factor of 2, up to a factor of 5.5 in the best case. We also show that adding Nullspace Tuning to the newer and state-of-the-art MixMatch method decreases its test error by up to a factor of 1.8.

Submitted 3 June, 2024; v1 submitted 17 March, 2020; originally announced March 2020.

arXiv:1901.02523

cs.IT

Construction and Analysis of Posterior Matching in Arbitrary Dimensions via Optimal Transport

Authors: Diego A. Mesa, Rui Ma, Siva K. Gorantla, Todd P. Coleman

Abstract: The posterior matching scheme, for feedback encoding of a message point lying on the unit interval over memoryless channels, maximizes mutual information for an arbitrary number of channel uses. However, it in general does not always achieve any positive rate; so far, elaborate analyses have been required to show that it achieves any positive rate below capacity. More recent efforts have introduced a random "dither" shared by the encoder and decoder to the problem formulation, to simplify analyses and guarantee that the randomized scheme achieves any rate below capacity. Motivated by applications (e.g. human-computer interfaces) where (a) common randomness shared by the encoder and decoder may not be feasible and (b) the message point lies in a higher dimensional space, we focus here on the original formulation without common randomness, and use optimal transport theory to generalize the scheme for a message point in a higher dimensional space. By defining a stricter, almost sure, notion of message decoding, we use classical probabilistic techniques (e.g. change of measure and martingale convergence) to establish succinct necessary and sufficient conditions on when the message point can be recovered from infinite observations: Birkhoff ergodicity of a random process sequentially generated by the encoder. We also show a surprising "all or nothing" result: the same ergodicity condition is necessary and sufficient to achieve any rate below capacity. We provide applications of this message point framework in human-computer interfaces and multi-antenna communications.

Submitted 8 January, 2019; originally announced January 2019.

Comments: Submitted to the IEEE Transactions on Information Theory
