-
Automated Rewards via LLM-Generated Progress Functions
Authors:
Vishnu Sarukkai,
Brennan Shacklett,
Zander Majercik,
Kush Bhatia,
Christopher Ré,
Kayvon Fatahalian
Abstract:
Large Language Models (LLMs) have the potential to automate reward engineering by leveraging their broad domain knowledge across various tasks. However, they often need many iterations of trial-and-error to generate effective reward functions. This process is costly because evaluating every sampled reward function requires completing the full policy optimization process for each function. In this…
▽ More
Large Language Models (LLMs) have the potential to automate reward engineering by leveraging their broad domain knowledge across various tasks. However, they often need many iterations of trial-and-error to generate effective reward functions. This process is costly because evaluating every sampled reward function requires completing the full policy optimization process for each function. In this paper, we introduce an LLM-driven reward generation framework that is able to produce state-of-the-art policies on the challenging Bi-DexHands benchmark \textbf{with 20$\times$ fewer reward function samples} than the prior state-of-the-art work. Our key insight is that we reduce the problem of generating task-specific rewards to the problem of coarsely estimating \emph{task progress}. Our two-step solution leverages the task domain knowledge and the code synthesis abilities of LLMs to author \emph{progress functions} that estimate task progress from a given state. Then, we use this notion of progress to discretize states, and generate count-based intrinsic rewards using the low-dimensional state space. We show that the combination of LLM-generated progress functions and count-based intrinsic rewards is essential for our performance gains, while alternatives such as generic hash-based counts or using progress directly as a reward function fall short.
△ Less
Submitted 11 October, 2024;
originally announced October 2024.
-
Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates
Authors:
Avanika Narayan,
Mayee F. Chen,
Kush Bhatia,
Christopher Ré
Abstract:
Fine-tuning large language models (LLMs) on instruction datasets is a common way to improve their generative capabilities. However, instruction datasets can be expensive and time-consuming to manually curate, and while LLM-generated data is less labor-intensive, it may violate user privacy agreements or terms of service of LLM providers. Therefore, we seek a way of constructing instruction dataset…
▽ More
Fine-tuning large language models (LLMs) on instruction datasets is a common way to improve their generative capabilities. However, instruction datasets can be expensive and time-consuming to manually curate, and while LLM-generated data is less labor-intensive, it may violate user privacy agreements or terms of service of LLM providers. Therefore, we seek a way of constructing instruction datasets with samples that are not generated by humans or LLMs but still improve LLM generative capabilities. In this work, we introduce Cookbook, a framework that programmatically generates training data consisting of simple patterns over random tokens, resulting in a scalable, cost-effective approach that avoids legal and privacy issues. First, Cookbook uses a template -- a data generating Python function -- to produce training data that encourages the model to learn an explicit pattern-based rule that corresponds to a desired task. We find that fine-tuning on Cookbook-generated data is able to improve performance on its corresponding task by up to 52.7 accuracy points. Second, since instruction datasets improve performance on multiple downstream tasks simultaneously, Cookbook algorithmically learns how to mix data from various templates to optimize performance on multiple tasks. On the standard multi-task GPT4ALL evaluation suite, Mistral-7B fine-tuned using a Cookbook-generated dataset attains the best accuracy on average compared to other 7B parameter instruction-tuned models and is the best performing model on 3 out of 8 tasks. Finally, we analyze when and why Cookbook improves performance and present a metric that allows us to verify that the improvement is largely explained by the model's generations adhering better to template rules.
△ Less
Submitted 7 October, 2024;
originally announced October 2024.
-
Robustness Testing of Black-Box Models Against CT Degradation Through Test-Time Augmentation
Authors:
Jack Highton,
Quok Zong Chong,
Samuel Finestone,
Arian Beqiri,
Julia A. Schnabel,
Kanwal K. Bhatia
Abstract:
Deep learning models for medical image segmentation and object detection are becoming increasingly available as clinical products. However, as details are rarely provided about the training data, models may unexpectedly fail when cases differ from those in the training distribution. An approach allowing potential users to independently test the robustness of a model, treating it as a black box and…
▽ More
Deep learning models for medical image segmentation and object detection are becoming increasingly available as clinical products. However, as details are rarely provided about the training data, models may unexpectedly fail when cases differ from those in the training distribution. An approach allowing potential users to independently test the robustness of a model, treating it as a black box and using only a few cases from their own site, is key for adoption. To address this, a method to test the robustness of these models against CT image quality variation is presented. In this work we present this framework by demonstrating that given the same training data, the model architecture and data pre processing greatly affect the robustness of several frequently used segmentation and object detection methods to simulated CT imaging artifacts and degradation. Our framework also addresses the concern about the sustainability of deep learning models in clinical use, by considering future shifts in image quality due to scanner deterioration or imaging protocol changes which are not reflected in a limited local test dataset.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry
Authors:
Michael Zhang,
Kush Bhatia,
Hermann Kumbong,
Christopher Ré
Abstract:
Linear attentions have shown potential for improving Transformer efficiency, reducing attention's quadratic complexity to linear in sequence length. This holds exciting promise for (1) training linear Transformers from scratch, (2) "finetuned-conversion" of task-specific Transformers into linear versions that recover task performance, and (3) "pretrained-conversion" of Transformers such as large l…
▽ More
Linear attentions have shown potential for improving Transformer efficiency, reducing attention's quadratic complexity to linear in sequence length. This holds exciting promise for (1) training linear Transformers from scratch, (2) "finetuned-conversion" of task-specific Transformers into linear versions that recover task performance, and (3) "pretrained-conversion" of Transformers such as large language models into linear versions finetunable on downstream tasks. However, linear attentions often underperform standard softmax attention in quality. To close this performance gap, we find prior linear attentions lack key properties of softmax attention tied to good performance: low-entropy (or "spiky") weights and dot-product monotonicity. We further observe surprisingly simple feature maps that retain these properties and match softmax performance, but are inefficient to compute in linear attention. We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity. Hedgehog uses simple trainable MLPs to produce attention weights mimicking softmax attention. Experiments show Hedgehog recovers over 99% of standard Transformer quality in train-from-scratch and finetuned-conversion settings, outperforming prior linear attentions up to 6 perplexity points on WikiText-103 with causal GPTs, and up to 8.7 GLUE score points on finetuned bidirectional BERTs. Hedgehog also enables pretrained-conversion. Converting a pretrained GPT-2 into a linear attention variant achieves state-of-the-art 16.7 perplexity on WikiText-103 for 125M subquadratic decoder models. We finally turn a pretrained Llama-2 7B into a viable linear attention Llama. With low-rank adaptation, Hedgehog-Llama2 7B achieves 28.1 higher ROUGE-1 points over the base standard attention model, where prior linear attentions lead to 16.5 point drops.
△ Less
Submitted 6 February, 2024;
originally announced February 2024.
-
SuperHF: Supervised Iterative Learning from Human Feedback
Authors:
Gabriel Mukobi,
Peter Chatain,
Su Fong,
Robert Windesheim,
Gitta Kutyniok,
Kush Bhatia,
Silas Alberti
Abstract:
While large language models demonstrate remarkable capabilities, they often present challenges in terms of safety, alignment with human values, and stability during training. Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). SFT is simple and robust, powering a host of open-source models, while RL…
▽ More
While large language models demonstrate remarkable capabilities, they often present challenges in terms of safety, alignment with human values, and stability during training. Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). SFT is simple and robust, powering a host of open-source models, while RLHF is a more sophisticated method used in top-tier models like ChatGPT but also suffers from instability and susceptibility to reward hacking. We propose a novel approach, Supervised Iterative Learning from Human Feedback (SuperHF), which seeks to leverage the strengths of both methods. Our hypothesis is two-fold: that the reward model used in RLHF is critical for efficient data use and model generalization and that the use of Proximal Policy Optimization (PPO) in RLHF may not be necessary and could contribute to instability issues. SuperHF replaces PPO with a simple supervised loss and a Kullback-Leibler (KL) divergence prior. It creates its own training data by repeatedly sampling a batch of model outputs and filtering them through the reward model in an online learning regime. We then break down the reward optimization problem into three components: robustly optimizing the training rewards themselves, preventing reward hacking-exploitation of the reward model that degrades model performance-as measured by a novel METEOR similarity metric, and maintaining good performance on downstream evaluations. Our experimental results show SuperHF exceeds PPO-based RLHF on the training objective, easily and favorably trades off high reward with low reward hacking, improves downstream calibration, and performs the same on our GPT-4 based qualitative evaluation scheme all the while being significantly simpler to implement, highlighting SuperHF's potential as a competitive language model alignment technique.
△ Less
Submitted 25 October, 2023;
originally announced October 2023.
-
Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models
Authors:
Mayee F. Chen,
Nicholas Roberts,
Kush Bhatia,
Jue Wang,
Ce Zhang,
Frederic Sala,
Christopher Ré
Abstract:
The quality of training data impacts the performance of pre-trained large language models (LMs). Given a fixed budget of tokens, we study how to best select data that leads to good downstream model performance across tasks. We develop a new framework based on a simple hypothesis: just as humans acquire interdependent skills in a deliberate order, language models also follow a natural order when le…
▽ More
The quality of training data impacts the performance of pre-trained large language models (LMs). Given a fixed budget of tokens, we study how to best select data that leads to good downstream model performance across tasks. We develop a new framework based on a simple hypothesis: just as humans acquire interdependent skills in a deliberate order, language models also follow a natural order when learning a set of skills from their training data. If such an order exists, it can be utilized for improved understanding of LMs and for data-efficient training. Using this intuition, our framework formalizes the notion of a skill and of an ordered set of skills in terms of the associated data. First, using both synthetic and real data, we demonstrate that these ordered skill sets exist, and that their existence enables more advanced skills to be learned with less data when we train on their prerequisite skills. Second, using our proposed framework, we introduce an online data sampling algorithm, Skill-It, over mixtures of skills for both continual pre-training and fine-tuning regimes, where the objective is to efficiently learn multiple skills in the former and an individual skill in the latter. On the LEGO synthetic in the continual pre-training setting, Skill-It obtains 36.5 points higher accuracy than random sampling. On the Natural Instructions dataset in the fine-tuning setting, Skill-It reduces the validation loss on the target skill by 13.6% versus training on data associated with the target skill itself. We apply our skills framework on the recent RedPajama dataset to continually pre-train a 3B-parameter LM, achieving higher accuracy on the LM Evaluation Harness with 1B tokens than the baseline approach of sampling uniformly over data sources with 3B tokens.
△ Less
Submitted 26 July, 2023;
originally announced July 2023.
-
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
Authors:
Neel Guha,
Mayee F. Chen,
Kush Bhatia,
Azalia Mirhoseini,
Frederic Sala,
Christopher Ré
Abstract:
Recent work has shown that language models' (LMs) prompt-based learning capabilities make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly -- practitioners often require significant labeled data in order to evaluate the impact of prompt modifications. Our work…
▽ More
Recent work has shown that language models' (LMs) prompt-based learning capabilities make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly -- practitioners often require significant labeled data in order to evaluate the impact of prompt modifications. Our work asks whether it is possible to improve prompt-based learning without additional labeled data. We approach this problem by attempting to modify the predictions of a prompt, rather than the prompt itself. Our intuition is that accurate predictions should also be consistent: samples which are similar under some feature representation should receive the same prompt prediction. We propose Embroid, a method which computes multiple representations of a dataset under different embedding functions, and uses the consistency between the LM predictions for neighboring samples to identify mispredictions. Embroid then uses these neighborhoods to create additional predictions for each sample, and combines these predictions with a simple latent variable graphical model in order to generate a final corrected prediction. In addition to providing a theoretical analysis of Embroid, we conduct a rigorous empirical evaluation across six different LMs and up to 95 different tasks. We find that (1) Embroid substantially improves performance over original prompts (e.g., by an average of 7.3 points on GPT-JT), (2) also realizes improvements for more sophisticated prompting strategies (e.g., chain-of-thought), and (3) can be specialized to domains like law through the embedding functions.
△ Less
Submitted 20 July, 2023;
originally announced July 2023.
-
TART: A plug-and-play Transformer module for task-agnostic reasoning
Authors:
Kush Bhatia,
Avanika Narayan,
Christopher De Sa,
Christopher Ré
Abstract:
Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. In contrast, traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task. In-context learning, however, consistently underperforms task-specific tuning approaches even when presented with the same…
▽ More
Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. In contrast, traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task. In-context learning, however, consistently underperforms task-specific tuning approaches even when presented with the same examples. While most existing approaches (e.g., prompt engineering) focus on the LLM's learned representations to patch this performance gap, our analysis actually reveal that LLM representations contain sufficient information to make good predictions. As such, we focus on the LLM's reasoning abilities and demonstrate that this performance gap exists due to their inability to perform simple probabilistic reasoning tasks. This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and propose TART which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module. TART trains this reasoning module in a task-agnostic manner using only synthetic logistic regression tasks and composes it with an arbitrary real-world pre-trained model without any additional training. With a single inference module, TART improves performance across different model families (GPT-Neo, Pythia, BLOOM), model sizes (100M - 6B), tasks (14 NLP binary classification tasks), and even across different modalities (audio and vision). Additionally, on the RAFT Benchmark, TART improves GPT-Neo (125M)'s performance such that it outperforms BLOOM (176B), and is within 4% of GPT-3 (175B). Our code and models are available at https://github.com/HazyResearch/TART .
△ Less
Submitted 13 June, 2023;
originally announced June 2023.
-
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws
Authors:
Kush Bhatia,
Wenshuo Guo,
Jacob Steinhardt
Abstract:
Specifying reward functions for complex tasks like object manipulation or driving is challenging to do by hand. Reward learning seeks to address this by learning a reward model using human feedback on selected query policies. This shifts the burden of reward specification to the optimal design of the queries. We propose a theoretical framework for studying reward learning and the associated optima…
▽ More
Specifying reward functions for complex tasks like object manipulation or driving is challenging to do by hand. Reward learning seeks to address this by learning a reward model using human feedback on selected query policies. This shifts the burden of reward specification to the optimal design of the queries. We propose a theoretical framework for studying reward learning and the associated optimal experiment design problem. Our framework models rewards and policies as nonparametric functions belonging to subsets of Reproducing Kernel Hilbert Spaces (RKHSs). The learner receives (noisy) oracle access to a true reward and must output a policy that performs well under the true reward. For this setting, we first derive non-asymptotic excess risk bounds for a simple plug-in estimator based on ridge regression. We then solve the query design problem by optimizing these risk bounds with respect to the choice of query set and obtain a finite sample statistical rate, which depends primarily on the eigenvalue spectrum of a certain linear operator on the RKHSs. Despite the generality of these results, our bounds are stronger than previous bounds developed for more specialized problems. We specifically show that the well-studied problem of Gaussian process (GP) bandit optimization is a special case of our framework, and that our bounds either improve or are competitive with known regret guarantees for the Matérn kernel.
△ Less
Submitted 23 February, 2023;
originally announced February 2023.
-
Congested Bandits: Optimal Routing via Short-term Resets
Authors:
Pranjal Awasthi,
Kush Bhatia,
Sreenivas Gollapudi,
Kostas Kollias
Abstract:
For traffic routing platforms, the choice of which route to recommend to a user depends on the congestion on these routes -- indeed, an individual's utility depends on the number of people using the recommended route at that instance. Motivated by this, we introduce the problem of Congested Bandits where each arm's reward is allowed to depend on the number of times it was played in the past $Δ$ ti…
▽ More
For traffic routing platforms, the choice of which route to recommend to a user depends on the congestion on these routes -- indeed, an individual's utility depends on the number of people using the recommended route at that instance. Motivated by this, we introduce the problem of Congested Bandits where each arm's reward is allowed to depend on the number of times it was played in the past $Δ$ timesteps. This dependence on past history of actions leads to a dynamical system where an algorithm's present choices also affect its future pay-offs, and requires an algorithm to plan for this. We study the congestion aware formulation in the multi-armed bandit (MAB) setup and in the contextual bandit setup with linear rewards. For the multi-armed setup, we propose a UCB style algorithm and show that its policy regret scales as $\tilde{O}(\sqrt{K ΔT})$. For the linear contextual bandit setup, our algorithm, based on an iterative least squares planner, achieves policy regret $\tilde{O}(\sqrt{dT} + Δ)$. From an experimental standpoint, we corroborate the no-regret properties of our algorithms via a simulation study.
△ Less
Submitted 22 January, 2023;
originally announced January 2023.
-
On the Sensitivity of Reward Inference to Misspecified Human Models
Authors:
Joey Hong,
Kush Bhatia,
Anca Dragan
Abstract:
Inferring reward functions from human behavior is at the center of value alignment - aligning AI objectives with what we, humans, actually want. But doing so relies on models of how humans behave given their objectives. After decades of research in cognitive science, neuroscience, and behavioral economics, obtaining accurate human models remains an open research topic. This begs the question: how…
▽ More
Inferring reward functions from human behavior is at the center of value alignment - aligning AI objectives with what we, humans, actually want. But doing so relies on models of how humans behave given their objectives. After decades of research in cognitive science, neuroscience, and behavioral economics, obtaining accurate human models remains an open research topic. This begs the question: how accurate do these models need to be in order for the reward inference to be accurate? On the one hand, if small errors in the model can lead to catastrophic error in inference, the entire framework of reward learning seems ill-fated, as we will never have perfect models of human behavior. On the other hand, if as our models improve, we can have a guarantee that reward accuracy also improves, this would show the benefit of more work on the modeling side. We study this question both theoretically and empirically. We do show that it is unfortunately possible to construct small adversarial biases in behavior that lead to arbitrarily large errors in the inferred reward. However, and arguably more importantly, we are also able to identify reasonable assumptions under which the reward inference error can be bounded linearly in the error in the human model. Finally, we verify our theoretical insights in discrete and continuous control tasks with simulated and human data.
△ Less
Submitted 30 October, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Ask Me Anything: A simple strategy for prompting language models
Authors:
Simran Arora,
Avanika Narayan,
Mayee F. Chen,
Laurel Orr,
Neel Guha,
Kush Bhatia,
Ines Chami,
Frederic Sala,
Christopher Ré
Abstract:
Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect promp…
▽ More
Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect prompt" for a task. To mitigate the high degree of effort involved in prompt-design, we instead ask whether producing multiple effective, yet imperfect, prompts and aggregating them can lead to a high quality prompting strategy. Our observations motivate our proposed prompting method, ASK ME ANYTHING (AMA). We first develop an understanding of the effective prompt formats, finding that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?") tend to outperform those that restrict the model outputs ("John went to the park. Output True or False."). Our approach recursively uses the LLM itself to transform task inputs to the effective QA format. We apply the collected prompts to obtain several noisy votes for the input's true label. We find that the prompts can have very different accuracies and complex dependencies and thus propose to use weak supervision, a procedure for combining the noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters), demonstrating an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 popular benchmarks. Averaged across these tasks, the GPT-J-6B model outperforms few-shot GPT3-175B. We release our code here: https://github.com/HazyResearch/ama_prompting
△ Less
Submitted 19 November, 2022; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Statistical and Computational Trade-offs in Variational Inference: A Case Study in Inferential Model Selection
Authors:
Kush Bhatia,
Nikki Lijing Kuang,
Yi-An Ma,
Yixin Wang
Abstract:
Variational inference has recently emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea is to trade statistical accuracy for computational efficiency. In this work, we study these statistical and computational trade-offs in variational inference via a case study in inferential model selection. Focusing on Gaussian infere…
▽ More
Variational inference has recently emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea is to trade statistical accuracy for computational efficiency. In this work, we study these statistical and computational trade-offs in variational inference via a case study in inferential model selection. Focusing on Gaussian inferential models (or variational approximating families) with diagonal plus low-rank precision matrices, we initiate a theoretical study of the trade-offs in two aspects, Bayesian posterior inference error and frequentist uncertainty quantification error. From the Bayesian posterior inference perspective, we characterize the error of the variational posterior relative to the exact posterior. We prove that, given a fixed computation budget, a lower-rank inferential model produces variational posteriors with a higher statistical approximation error, but a lower computational error; it reduces variance in stochastic optimization and, in turn, accelerates convergence. From the frequentist uncertainty quantification perspective, we consider the precision matrix of the variational posterior as an uncertainty estimate, which involves an additional statistical error originating from the sampling uncertainty of the data. As a consequence, for small datasets, the inferential model need not be full-rank to achieve optimal estimation error (even with unlimited computation budget).
△ Less
Submitted 6 August, 2023; v1 submitted 22 July, 2022;
originally announced July 2022.
-
Detection of AI Synthesized Hindi Speech
Authors:
Karan Bhatia,
Ansh Agrawal,
Priyanka Singh,
Arun Kumar Singh
Abstract:
The recent advancements in generative artificial speech models have made possible the generation of highly realistic speech signals. At first, it seems exciting to obtain these artificially synthesized signals such as speech clones or deep fakes but if left unchecked, it may lead us to digital dystopia. One of the primary focus in audio forensics is validating the authenticity of a speech. Though…
▽ More
The recent advancements in generative artificial speech models have made possible the generation of highly realistic speech signals. At first, it seems exciting to obtain these artificially synthesized signals such as speech clones or deep fakes but if left unchecked, it may lead us to digital dystopia. One of the primary focus in audio forensics is validating the authenticity of a speech. Though some solutions are proposed for English speeches but the detection of synthetic Hindi speeches have not gained much attention. Here, we propose an approach for discrimination of AI synthesized Hindi speech from an actual human speech. We have exploited the Bicoherence Phase, Bicoherence Magnitude, Mel Frequency Cepstral Coefficient (MFCC), Delta Cepstral, and Delta Square Cepstral as the discriminating features for machine learning models. Also, we extend the study to using deep neural networks for extensive experiments, specifically VGG16 and homemade CNN as the architecture models. We obtained an accuracy of 99.83% with VGG16 and 99.99% with homemade CNN models.
△ Less
Submitted 7 March, 2022;
originally announced March 2022.
-
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Authors:
Alexander Pan,
Kush Bhatia,
Jacob Steinhardt
Abstract:
Reward hacking -- where RL agents exploit gaps in misspecified reward functions -- has been widely observed, but not yet systematically studied. To understand how reward hacking arises, we construct four RL environments with misspecified rewards. We investigate reward hacking as a function of agent capabilities: model capacity, action space resolution, observation space noise, and training time. M…
▽ More
Reward hacking -- where RL agents exploit gaps in misspecified reward functions -- has been widely observed, but not yet systematically studied. To understand how reward hacking arises, we construct four RL environments with misspecified rewards. We investigate reward hacking as a function of agent capabilities: model capacity, action space resolution, observation space noise, and training time. More capable agents often exploit reward misspecifications, achieving higher proxy reward and lower true reward than less capable agents. Moreover, we find instances of phase transitions: capability thresholds at which the agent's behavior qualitatively shifts, leading to a sharp decrease in the true reward. Such phase transitions pose challenges to monitoring the safety of ML systems. To address this, we propose an anomaly detection task for aberrant policies and offer several baseline detectors.
△ Less
Submitted 14 February, 2022; v1 submitted 10 January, 2022;
originally announced January 2022.
-
Postulating Exoplanetary Habitability via a Novel Anomaly Detection Method
Authors:
Jyotirmoy Sarkar,
Kartik Bhatia,
Snehanshu Saha,
Margarita Safonova,
Santonu Sarkar
Abstract:
A profound shift in the study of cosmology came with the discovery of thousands of exoplanets and the possibility of the existence of billions of them in our Galaxy. The biggest goal in these searches is whether there are other life-harbouring planets. However, the question which of these detected planets are habitable, potentially-habitable, or maybe even inhabited, is still not answered. Some po…
▽ More
A profound shift in the study of cosmology came with the discovery of thousands of exoplanets and the possibility of the existence of billions of them in our Galaxy. The biggest goal in these searches is whether there are other life-harbouring planets. However, the question which of these detected planets are habitable, potentially-habitable, or maybe even inhabited, is still not answered. Some potentially habitable exoplanets have been hypothesized, but since Earth is the only known habitable planet, measures of habitability are necessarily determined with Earth as the reference. Several recent works introduced new habitability metrics based on optimization methods. Classification of potentially habitable exoplanets using supervised learning is another emerging area of study. However, both modeling and supervised learning approaches suffer from drawbacks. We propose an anomaly detection method, the Multi-Stage Memetic Algorithm (MSMA), to detect anomalies and extend it to an unsupervised clustering algorithm MSMVMCA to use it to detect potentially habitable exoplanets as anomalies. The algorithm is based on the postulate that Earth is an anomaly, with the possibility of existence of few other anomalies among thousands of data points. We describe an MSMA-based clustering approach with a novel distance function to detect habitable candidates as anomalies (including Earth). The results are cross-matched with the habitable exoplanet catalog (PHL-HEC) of the Planetary Habitability Laboratory (PHL) with both optimistic and conservative lists of potentially habitable exoplanets.
△ Less
Submitted 6 September, 2021;
originally announced September 2021.
-
Contrastive Learning for View Classification of Echocardiograms
Authors:
Agisilaos Chartsias,
Shan Gao,
Angela Mumith,
Jorge Oliveira,
Kanwal Bhatia,
Bernhard Kainz,
Arian Beqiri
Abstract:
Analysis of cardiac ultrasound images is commonly performed in routine clinical practice for quantification of cardiac function. Its increasing automation frequently employs deep learning networks that are trained to predict disease or detect image features. However, such models are extremely data-hungry and training requires labelling of many thousands of images by experienced clinicians. Here we…
▽ More
Analysis of cardiac ultrasound images is commonly performed in routine clinical practice for quantification of cardiac function. Its increasing automation frequently employs deep learning networks that are trained to predict disease or detect image features. However, such models are extremely data-hungry and training requires labelling of many thousands of images by experienced clinicians. Here we propose the use of contrastive learning to mitigate the labelling bottleneck. We train view classification models for imbalanced cardiac ultrasound datasets and show improved performance for views/classes for which minimal labelled data is available. Compared to a naive baseline model, we achieve an improvement in F1 score of up to 26% in those views while maintaining state-of-the-art performance for the views with sufficiently many labelled training observations.
△ Less
Submitted 6 August, 2021;
originally announced August 2021.
-
Preference learning along multiple criteria: A game-theoretic perspective
Authors:
Kush Bhatia,
Ashwin Pananjady,
Peter L. Bartlett,
Anca D. Dragan,
Martin J. Wainwright
Abstract:
The literature on ranking from ordinal data is vast, and there are several ways to aggregate overall preferences from pairwise comparisons between objects. In particular, it is well known that any Nash equilibrium of the zero sum game induced by the preference matrix defines a natural solution concept (winning distribution over objects) known as a von Neumann winner. Many real-world problems, howe…
▽ More
The literature on ranking from ordinal data is vast, and there are several ways to aggregate overall preferences from pairwise comparisons between objects. In particular, it is well known that any Nash equilibrium of the zero sum game induced by the preference matrix defines a natural solution concept (winning distribution over objects) known as a von Neumann winner. Many real-world problems, however, are inevitably multi-criteria, with different pairwise preferences governing the different criteria. In this work, we generalize the notion of a von Neumann winner to the multi-criteria setting by taking inspiration from Blackwell's approachability. Our framework allows for non-linear aggregation of preferences across criteria, and generalizes the linearization-based approach from multi-objective optimization.
From a theoretical standpoint, we show that the Blackwell winner of a multi-criteria problem instance can be computed as the solution to a convex optimization problem. Furthermore, given random samples of pairwise comparisons, we show that a simple plug-in estimator achieves near-optimal minimax sample complexity. Finally, we showcase the practical utility of our framework in a user study on autonomous driving, where we find that the Blackwell winner outperforms the von Neumann winner for the overall preferences.
△ Less
Submitted 4 May, 2021;
originally announced May 2021.
-
Agnostic learning with unknown utilities
Authors:
Kush Bhatia,
Peter L. Bartlett,
Anca D. Dragan,
Jacob Steinhardt
Abstract:
Traditional learning approaches for classification implicitly assume that each mistake has the same cost. In many real-world problems though, the utility of a decision depends on the underlying context $x$ and decision $y$. However, directly incorporating these utilities into the learning objective is often infeasible since these can be quite complex and difficult for humans to specify.
We forma…
▽ More
Traditional learning approaches for classification implicitly assume that each mistake has the same cost. In many real-world problems though, the utility of a decision depends on the underlying context $x$ and decision $y$. However, directly incorporating these utilities into the learning objective is often infeasible since these can be quite complex and difficult for humans to specify.
We formally study this as agnostic learning with unknown utilities: given a dataset $S = \{x_1, \ldots, x_n\}$ where each data point $x_i \sim \mathcal{D}$, the objective of the learner is to output a function $f$ in some class of decision functions $\mathcal{F}$ with small excess risk. This risk measures the performance of the output predictor $f$ with respect to the best predictor in the class $\mathcal{F}$ on the unknown underlying utility $u^*$. This utility $u^*$ is not assumed to have any specific structure. This raises an interesting question whether learning is even possible in our setup, given that obtaining a generalizable estimate of utility $u^*$ might not be possible from finitely many samples. Surprisingly, we show that estimating the utilities of only the sampled points~$S$ suffices to learn a decision function which generalizes well.
We study mechanisms for eliciting information which allow a learner to estimate the utilities $u^*$ on the set $S$. We introduce a family of elicitation mechanisms by generalizing comparisons, called the $k$-comparison oracle, which enables the learner to ask for comparisons across $k$ different inputs $x$ at once. We show that the excess risk in our agnostic learning framework decreases at a rate of $O\left(\frac{1}{k} \right)$. This result brings out an interesting accuracy-elicitation trade-off -- as the order $k$ of the oracle increases, the comparative queries become harder to elicit from humans but allow for more accurate learning.
△ Less
Submitted 17 April, 2021;
originally announced April 2021.
-
Online learning with dynamics: A minimax perspective
Authors:
Kush Bhatia,
Karthik Sridharan
Abstract:
We study the problem of online learning with dynamics, where a learner interacts with a stateful environment over multiple rounds. In each round of the interaction, the learner selects a policy to deploy and incurs a cost that depends on both the chosen policy and current state of the world. The state-evolution dynamics and the costs are allowed to be time-varying, in a possibly adversarial way. I…
▽ More
We study the problem of online learning with dynamics, where a learner interacts with a stateful environment over multiple rounds. In each round of the interaction, the learner selects a policy to deploy and incurs a cost that depends on both the chosen policy and current state of the world. The state-evolution dynamics and the costs are allowed to be time-varying, in a possibly adversarial way. In this setting, we study the problem of minimizing policy regret and provide non-constructive upper bounds on the minimax rate for the problem.
Our main results provide sufficient conditions for online learnability for this setup with corresponding rates. The rates are characterized by 1) a complexity term capturing the expressiveness of the underlying policy class under the dynamics of state change, and 2) a dynamics stability term measuring the deviation of the instantaneous loss from a certain counterfactual loss. Further, we provide matching lower bounds which show that both the complexity terms are indeed necessary.
Our approach provides a unifying analysis that recovers regret bounds for several well studied problems including online learning with memory, online control of linear quadratic regulators, online Markov decision processes, and tracking adversarial targets. In addition, we show how our tools help obtain tight regret bounds for a new problems (with non-linear dynamics and non-convex losses) for which such bounds were not known prior to our work.
△ Less
Submitted 3 December, 2020;
originally announced December 2020.
-
WorkerRep: Immutable Reputation System For Crowdsourcing Platform Based on Blockchain
Authors:
Gurpriya Kaur Bhatia,
Shubham Gupta,
Alpana Dubey,
Ponnurangam Kumaraguru
Abstract:
Crowdsourcing is a process wherein an individual or an organisation utilizes the talent pool present over the Internet to accomplish their task. The existing crowdsourcing platforms and their reputation computation are centralised and hence prone to various attacks or malicious manipulation of the data by the central entity. A few distributed crowdsourcing platforms have been proposed but they lac…
▽ More
Crowdsourcing is a process wherein an individual or an organisation utilizes the talent pool present over the Internet to accomplish their task. The existing crowdsourcing platforms and their reputation computation are centralised and hence prone to various attacks or malicious manipulation of the data by the central entity. A few distributed crowdsourcing platforms have been proposed but they lack a robust reputation mechanism. So we propose a decentralised crowdsourcing platform having an immutable reputation mechanism to tackle these problems. It is built on top of Ethereum network and does not require the user to trust a third party for a non malicious experience. It also utilizes IOTAs consensus mechanism which reduces the cost for task evaluation significantly.
△ Less
Submitted 25 June, 2020;
originally announced June 2020.
-
Bayesian Robustness: A Nonasymptotic Viewpoint
Authors:
Kush Bhatia,
Yi-An Ma,
Anca D. Dragan,
Peter L. Bartlett,
Michael I. Jordan
Abstract:
We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after…
▽ More
We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after $T= \tilde{\mathcal{O}}(d/\varepsilon_{\textsf{acc}})$ iterations, we can sample from $p_T$ such that $\text{dist}(p_T, p^*) \leq \varepsilon_{\textsf{acc}} + \tilde{\mathcal{O}}(ε)$, where $ε$ is the fraction of corruptions. We corroborate our theoretical analysis with experiments on both synthetic and real-world data sets for mean estimation, regression and binary classification.
△ Less
Submitted 26 July, 2019;
originally announced July 2019.
-
Disease classification of macular Optical Coherence Tomography scans using deep learning software: validation on independent, multi-centre data
Authors:
Kanwal K. Bhatia,
Mark S. Graham,
Louise Terry,
Ashley Wood,
Paris Tranos,
Sameer Trikha,
Nicolas Jaccard
Abstract:
Purpose: To evaluate Pegasus-OCT, a clinical decision support software for the identification of features of retinal disease from macula OCT scans, across heterogenous populations involving varying patient demographics, device manufacturers, acquisition sites and operators.
Methods: 5,588 normal and anomalous macular OCT volumes (162,721 B-scans), acquired at independent centres in five countrie…
▽ More
Purpose: To evaluate Pegasus-OCT, a clinical decision support software for the identification of features of retinal disease from macula OCT scans, across heterogenous populations involving varying patient demographics, device manufacturers, acquisition sites and operators.
Methods: 5,588 normal and anomalous macular OCT volumes (162,721 B-scans), acquired at independent centres in five countries, were processed using the software. Results were evaluated against ground truth provided by the dataset owners.
Results: Pegasus-OCT performed with AUROCs of at least 98% for all datasets in the detection of general macular anomalies. For scans of sufficient quality, the AUROCs for general AMD and DME detection were found to be at least 99% and 98%, respectively.
Conclusions: The ability of a clinical decision support system to cater for different populations is key to its adoption. Pegasus-OCT was shown to be able to detect AMD, DME and general anomalies in OCT volumes acquired across multiple independent sites with high performance. Its use thus offers substantial promise, with the potential to alleviate the burden of growing demand in eye care services caused by retinal disease.
△ Less
Submitted 11 July, 2019;
originally announced July 2019.
-
Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression
Authors:
Arun Sai Suggala,
Kush Bhatia,
Pradeep Ravikumar,
Prateek Jain
Abstract:
We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting eith…
▽ More
We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting either don't guarantee consistent estimates or can only handle a small fraction of corruptions. We also extend our estimator to robust sparse linear regression and show that similar guarantees hold in this setting. Finally, we apply our estimator to the problem of linear regression with heavy-tailed noise and show that our estimator consistently estimates the regression vector even when the noise has unbounded variance (e.g., Cauchy distribution), for which most existing results don't even apply. Our estimator is based on a novel variant of outlier removal via hard thresholding in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.
△ Less
Submitted 19 March, 2019;
originally announced March 2019.
-
FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
Authors:
Aditya Kusupati,
Manish Singh,
Kush Bhatia,
Ashish Kumar,
Prateek Jain,
Manik Varma
Abstract:
This paper develops the FastRNN and FastGRNN algorithms to address the twin RNN limitations of inaccurate training and inefficient prediction. Previous approaches have improved accuracy at the expense of prediction costs making them infeasible for resource-constrained and real-time applications. Unitary RNNs have increased accuracy somewhat by restricting the range of the state transition matrix's…
▽ More
This paper develops the FastRNN and FastGRNN algorithms to address the twin RNN limitations of inaccurate training and inefficient prediction. Previous approaches have improved accuracy at the expense of prediction costs making them infeasible for resource-constrained and real-time applications. Unitary RNNs have increased accuracy somewhat by restricting the range of the state transition matrix's singular values but have also increased the model size as they require a larger number of hidden units to make up for the loss in expressive power. Gated RNNs have obtained state-of-the-art accuracies by adding extra parameters thereby resulting in even larger models. FastRNN addresses these limitations by adding a residual connection that does not constrain the range of the singular values explicitly and has only two extra scalar parameters. FastGRNN then extends the residual connection to a gate by reusing the RNN matrices to match state-of-the-art gated RNN accuracies but with a 2-4x smaller model. Enforcing FastGRNN's matrices to be low-rank, sparse and quantized resulted in accurate models that could be up to 35x smaller than leading gated and unitary RNNs. This allowed FastGRNN to accurately recognize the "Hey Cortana" wakeword with a 1 KB model and to be deployed on severely resource-constrained IoT microcontrollers too tiny to store other RNN models. FastGRNN's code is available at https://github.com/Microsoft/EdgeML/.
△ Less
Submitted 8 January, 2019;
originally announced January 2019.
-
Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems
Authors:
Dhruv Malik,
Ashwin Pananjady,
Kush Bhatia,
Koulik Khamaru,
Peter L. Bartlett,
Martin J. Wainwright
Abstract:
We study derivative-free methods for policy optimization over the class of linear policies. We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback. We show that these methods provably converge to within any pre-specified tolerance of the optimal policy with a number of zero-order eva…
▽ More
We study derivative-free methods for policy optimization over the class of linear policies. We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback. We show that these methods provably converge to within any pre-specified tolerance of the optimal policy with a number of zero-order evaluations that is an explicit polynomial of the error tolerance, dimension, and curvature properties of the problem. Our analysis reveals some interesting differences between the settings of additive driving noise and random initialization, as well as the settings of one-point and two-point reward feedback. Our theory is corroborated by extensive simulations of derivative-free methods on these systems. Along the way, we derive convergence rates for stochastic zero-order optimization algorithms when applied to a certain class of non-convex problems.
△ Less
Submitted 18 May, 2020; v1 submitted 19 December, 2018;
originally announced December 2018.
-
Gen-Oja: A Two-time-scale approach for Streaming CCA
Authors:
Kush Bhatia,
Aldo Pacchiano,
Nicolas Flammarion,
Peter L. Bartlett,
Michael I. Jordan
Abstract:
In this paper, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting. We propose a simple and efficient algorithm, Gen-Oja, for these problems. We prove the global convergence of our algorithm, borrowing ideas from the theory of fast-mixing Markov chains and two-time-scale stochastic approximation, showing that it achiev…
▽ More
In this paper, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting. We propose a simple and efficient algorithm, Gen-Oja, for these problems. We prove the global convergence of our algorithm, borrowing ideas from the theory of fast-mixing Markov chains and two-time-scale stochastic approximation, showing that it achieves the optimal rate of convergence. In the process, we develop tools for understanding stochastic processes with Markovian noise which might be of independent interest.
△ Less
Submitted 31 January, 2020; v1 submitted 20 November, 2018;
originally announced November 2018.
-
Establishing Appropriate Trust via Critical States
Authors:
Sandy H. Huang,
Kush Bhatia,
Pieter Abbeel,
Anca D. Dragan
Abstract:
In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts. Learned neural network policies make that particularly challenging. We propose an approach for helping end-users build a mental model of such policies. Our key observation is that for most tasks, the essence of the policy is captured in a few critical states…
▽ More
In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts. Learned neural network policies make that particularly challenging. We propose an approach for helping end-users build a mental model of such policies. Our key observation is that for most tasks, the essence of the policy is captured in a few critical states: states in which it is very important to take a certain action. Our user studies show that if the robot shows a human what its understanding of the task's critical states is, then the human can make a more informed decision about whether to deploy the policy, and if she does deploy it, when she needs to take control from it at execution time.
△ Less
Submitted 18 October, 2018;
originally announced October 2018.
-
Efficient and Consistent Robust Time Series Analysis
Authors:
Kush Bhatia,
Prateek Jain,
Parameswaran Kamalaruban,
Purushottam Kar
Abstract:
We study the problem of robust time series analysis under the standard auto-regressive (AR) time series model in the presence of arbitrary outliers. We devise an efficient hard thresholding based algorithm which can obtain a consistent estimate of the optimal AR model despite a large fraction of the time series points being corrupted. Our algorithm alternately estimates the corrupted set of points…
▽ More
We study the problem of robust time series analysis under the standard auto-regressive (AR) time series model in the presence of arbitrary outliers. We devise an efficient hard thresholding based algorithm which can obtain a consistent estimate of the optimal AR model despite a large fraction of the time series points being corrupted. Our algorithm alternately estimates the corrupted set of points and the model parameters, and is inspired by recent advances in robust regression and hard-thresholding methods. However, a direct application of existing techniques is hindered by a critical difference in the time-series domain: each point is correlated with all previous points rendering existing tools inapplicable directly. We show how to overcome this hurdle using novel proof techniques. Using our techniques, we are also able to provide the first efficient and provably consistent estimator for the robust regression problem where a standard linear observation model with white additive noise is corrupted arbitrarily. We illustrate our methods on synthetic datasets and show that our methods indeed are able to consistently recover the optimal parameters despite a large fraction of points being corrupted.
△ Less
Submitted 1 July, 2016;
originally announced July 2016.
-
Design and Implementation of Domain based Semantic Hidden Web Crawler
Authors:
Manvi,
Komal Kumar Bhatia,
Ashutosh Dixit
Abstract:
Web is a wide term which mainly consists of surface web and hidden web. One can easily access the surface web using traditional web crawlers, but they are not able to crawl the hidden portion of the web. These traditional crawlers retrieve contents from web pages, which are linked by hyperlinks ignoring the information hidden behind form pages, which cannot be extracted using simple hyperlink stru…
▽ More
Web is a wide term which mainly consists of surface web and hidden web. One can easily access the surface web using traditional web crawlers, but they are not able to crawl the hidden portion of the web. These traditional crawlers retrieve contents from web pages, which are linked by hyperlinks ignoring the information hidden behind form pages, which cannot be extracted using simple hyperlink structure. Thus, they ignore large amount of data hidden behind search forms. This paper emphasizes on the extraction of hidden data behind html search forms. The proposed technique makes use of semantic mapping to fill the html search form using domain specific database. Using semantics to fill various fields of a form leads to more accurate and qualitative data extraction.
△ Less
Submitted 23 September, 2015;
originally announced September 2015.
-
A novel design of hidden web crawler using ontology
Authors:
Manvi,
Komal Kumar Bhatia,
Ashutosh Dixit
Abstract:
Deep Web is content hidden behind HTML forms. Since it represents a large portion of the structured, unstructured and dynamic data on the Web, accessing Deep-Web content has been a long challenge for the database community. This paper describes a crawler for accessing Deep-Web using Ontologies. Performance evaluation of the proposed work showed that this new approach has promising results.
Deep Web is content hidden behind HTML forms. Since it represents a large portion of the structured, unstructured and dynamic data on the Web, accessing Deep-Web content has been a long challenge for the database community. This paper describes a crawler for accessing Deep-Web using Ontologies. Performance evaluation of the proposed work showed that this new approach has promising results.
△ Less
Submitted 10 August, 2015;
originally announced August 2015.
-
Locally Non-linear Embeddings for Extreme Multi-label Learning
Authors:
Kush Bhatia,
Himanshu Jain,
Purushottam Kar,
Prateek Jain,
Manik Varma
Abstract:
The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set. Embedding based approaches make training and prediction tractable by assuming that the training label matrix is low-rank and hence the effective number of labels can be reduced by projecting the high dimensio…
▽ More
The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set. Embedding based approaches make training and prediction tractable by assuming that the training label matrix is low-rank and hence the effective number of labels can be reduced by projecting the high dimensional label vectors onto a low dimensional linear subspace. Still, leading embedding approaches have been unable to deliver high prediction accuracies or scale to large problems as the low rank assumption is violated in most real world applications.
This paper develops the X-One classifier to address both limitations. The main technical contribution in X-One is a formulation for learning a small ensemble of local distance preserving embeddings which can accurately predict infrequently occurring (tail) labels. This allows X-One to break free of the traditional low-rank assumption and boost classification accuracy by learning embeddings which preserve pairwise distances between only the nearest label vectors.
We conducted extensive experiments on several real-world as well as benchmark data sets and compared our method against state-of-the-art methods for extreme multi-label classification. Experiments reveal that X-One can make significantly more accurate predictions then the state-of-the-art methods including both embeddings (by as much as 35%) as well as trees (by as much as 6%). X-One can also scale efficiently to data sets with a million labels which are beyond the pale of leading embedding methods.
△ Less
Submitted 9 July, 2015;
originally announced July 2015.
-
Robust Regression via Hard Thresholding
Authors:
Kush Bhatia,
Prateek Jain,
Purushottam Kar
Abstract:
We study the problem of Robust Least Squares Regression (RLSR) where several response variables can be adversarially corrupted. More specifically, for a data matrix X \in R^{p x n} and an underlying model w*, the response vector is generated as y = X'w* + b where b \in R^n is the corruption vector supported over at most C.n coordinates. Existing exact recovery results for RLSR focus solely on L1-p…
▽ More
We study the problem of Robust Least Squares Regression (RLSR) where several response variables can be adversarially corrupted. More specifically, for a data matrix X \in R^{p x n} and an underlying model w*, the response vector is generated as y = X'w* + b where b \in R^n is the corruption vector supported over at most C.n coordinates. Existing exact recovery results for RLSR focus solely on L1-penalty based convex formulations and impose relatively strict model assumptions such as requiring the corruptions b to be selected independently of X.
In this work, we study a simple hard-thresholding algorithm called TORRENT which, under mild conditions on X, can recover w* exactly even if b corrupts the response variables in an adversarial manner, i.e. both the support and entries of b are selected adversarially after observing X and w*. Our results hold under deterministic assumptions which are satisfied if X is sampled from any sub-Gaussian distribution. Finally unlike existing results that apply only to a fixed w*, generated independently of X, our results are universal and hold for any w* \in R^p.
Next, we propose gradient descent-based extensions of TORRENT that can scale efficiently to large scale problems, such as high dimensional sparse recovery and prove similar recovery guarantees for these extensions. Empirically we find TORRENT, and more so its extensions, offering significantly faster recovery than the state-of-the-art L1 solvers. For instance, even on moderate-sized datasets (with p = 50K) with around 40% corrupted responses, a variant of our proposed method called TORRENT-HYB is more than 20x faster than the best L1 solver.
△ Less
Submitted 8 June, 2015;
originally announced June 2015.
-
A Comparative Study of Hidden Web Crawlers
Authors:
Sonali Gupta,
Komal Kumar Bhatia
Abstract:
A large amount of data on the WWW remains inaccessible to crawlers of Web search engines because it can only be exposed on demand as users fill out and submit forms. The Hidden web refers to the collection of Web data which can be accessed by the crawler only through an interaction with the Web-based search form and not simply by traversing hyperlinks. Research on Hidden Web has emerged almost a d…
▽ More
A large amount of data on the WWW remains inaccessible to crawlers of Web search engines because it can only be exposed on demand as users fill out and submit forms. The Hidden web refers to the collection of Web data which can be accessed by the crawler only through an interaction with the Web-based search form and not simply by traversing hyperlinks. Research on Hidden Web has emerged almost a decade ago with the main line being exploring ways to access the content in online databases that are usually hidden behind search forms. The efforts in the area mainly focus on designing hidden Web crawlers that focus on learning forms and filling them with meaningful values. The paper gives an insight into the various Hidden Web crawlers developed for the purpose giving a mention to the advantages and shortcoming of the techniques employed in each.
△ Less
Submitted 22 July, 2014;
originally announced July 2014.
-
WebParF: A Web partitioning framework for Parallel Crawlers
Authors:
Sonali Gupta,
Komal kumar Bhatia,
Pikakshi Manchanda
Abstract:
With the ever proliferating size and scale of the WWW [1] efficient ways of exploring content are of increasing importance. How can we efficiently retrieve information from it through crawling? And in this era of tera and multi-core processors, we ought to think of multi-threaded processes as a serving solution. So, even better how can we improve the crawling performance by using parallel crawlers…
▽ More
With the ever proliferating size and scale of the WWW [1] efficient ways of exploring content are of increasing importance. How can we efficiently retrieve information from it through crawling? And in this era of tera and multi-core processors, we ought to think of multi-threaded processes as a serving solution. So, even better how can we improve the crawling performance by using parallel crawlers that work independently? The paper devotes to the fundamental development in the field of parallel crawlers [4] highlighting the advantages and challenges arising from its design. The paper also focuses on the aspect of URL distribution among the various parallel crawling processes or threads and ordering the URLs within each distributed set of URLs. How to distribute URLs from the URL frontier to the various concurrently executing crawling process threads is an orthogonal problem. The paper provides a solution to the problem by designing a framework WebParF that partitions the URL frontier into a several URL queues while considering the various design issues.
△ Less
Submitted 22 June, 2014;
originally announced June 2014.
-
Query Interface Integrator For Domain Specific Hidden Web
Authors:
Sudhakar Ranjan,
Komal K. Bhatia
Abstract:
Web is title admittance today mainly relies on search engines. A large amount of data is hidden in the databases behind the search interfaces referred to as Hidden web, which needs to be indexed so in order to serve user query. In this paper database and data mining techniques are used for query interface integration. The query interface must resemble the look and feel of local interface as much a…
▽ More
Web is title admittance today mainly relies on search engines. A large amount of data is hidden in the databases behind the search interfaces referred to as Hidden web, which needs to be indexed so in order to serve user query. In this paper database and data mining techniques are used for query interface integration. The query interface must resemble the look and feel of local interface as much as possible despite being automatically generated without human support.This technique keeps the related documents in the same domain so that searching of documents becomes more efficient in terms of time complexity.
△ Less
Submitted 16 November, 2013;
originally announced November 2013.
-
A Novel Term Weighing Scheme Towards Efficient Crawl of Textual Databases
Authors:
Sonali Gupta,
Komal Kumar Bhatia
Abstract:
The Hidden Web is the vast repository of informational databases available only through search form interfaces, accessible by therein typing a set of keywords in the search forms. Typically, a Hidden Web crawler is employed to autonomously discover and download pages from the Hidden Web. Traditional hidden web crawlers do not provide the search engines with an optimal search experience because of…
▽ More
The Hidden Web is the vast repository of informational databases available only through search form interfaces, accessible by therein typing a set of keywords in the search forms. Typically, a Hidden Web crawler is employed to autonomously discover and download pages from the Hidden Web. Traditional hidden web crawlers do not provide the search engines with an optimal search experience because of the excessive number of search requests posed through the form interface so as to exhaustively crawl and retrieve the contents of the target hidden web database. Here in our work, we provide a framework to investigate the problem of optimal search and curtail it by proposing an effective query term selection approach based on the frequency & distribution of terms in the document database. The paper focuses on developing a term-weighing scheme called VarDF (acronym for variable document frequency) that can ease the identification of optimal terms to be used as queries on the interface for maximizing the achieved coverage of the crawler which in turn will facilitate the search engine to have a diversified and expanded index. We experimentally evaluate the effectiveness of our approach on a manually created database of documents in the area of Information Retrieval.
△ Less
Submitted 2 November, 2013;
originally announced November 2013.
-
A Propound Method for the Improvement of Cluster Quality
Authors:
Shveta Kundra Bhatia,
V. S. Dixit
Abstract:
In this paper Knockout Refinement Algorithm (KRA) is proposed to refine original clusters obtained by applying SOM and K-Means clustering algorithms. KRA Algorithm is based on Contingency Table concepts. Metrics are computed for the Original and Refined Clusters. Quality of Original and Refined Clusters are compared in terms of metrics. The proposed algorithm (KRA) is tested in the educational dom…
▽ More
In this paper Knockout Refinement Algorithm (KRA) is proposed to refine original clusters obtained by applying SOM and K-Means clustering algorithms. KRA Algorithm is based on Contingency Table concepts. Metrics are computed for the Original and Refined Clusters. Quality of Original and Refined Clusters are compared in terms of metrics. The proposed algorithm (KRA) is tested in the educational domain and results show that it generates better quality clusters in terms of improved metric values.
△ Less
Submitted 25 July, 2013;
originally announced July 2013.
-
Collaborative Personalized Web Recommender System using Entropy based Similarity Measure
Authors:
Harita Mehta,
Shveta Kundra Bhatia,
Punam Bedi,
V. S. Dixit
Abstract:
On the internet, web surfers, in the search of information, always strive for recommendations. The solutions for generating recommendations become more difficult because of exponential increase in information domain day by day. In this paper, we have calculated entropy based similarity between users to achieve solution for scalability problem. Using this concept, we have implemented an online user…
▽ More
On the internet, web surfers, in the search of information, always strive for recommendations. The solutions for generating recommendations become more difficult because of exponential increase in information domain day by day. In this paper, we have calculated entropy based similarity between users to achieve solution for scalability problem. Using this concept, we have implemented an online user based collaborative web recommender system. In this model based collaborative system, the user session is divided into two levels. Entropy is calculated at both the levels. It is shown that from the set of valuable recommenders obtained at level I; only those recommenders having lower entropy at level II than entropy at level I, served as trustworthy recommenders. Finally, top N recommendations are generated from such trustworthy recommenders for an online user.
△ Less
Submitted 20 January, 2012;
originally announced January 2012.