Search | arXiv e-print repository

Deep Learning for Prediction and Classifying the Dynamical behaviour of Piecewise Smooth Maps

Authors: Vismaya V S, Bharath V Nair, Sishu Shankar Muni

Abstract: This paper explores the prediction of the dynamics of piecewise smooth maps using various deep learning models. We have shown various novel ways of predicting the dynamics of piecewise smooth maps using deep learning models. Moreover, we have used machine learning models such as Decision Tree Classifier, Logistic Regression, K-Nearest Neighbor, Random Forest, and Support Vector Machine for predict… ▽ More This paper explores the prediction of the dynamics of piecewise smooth maps using various deep learning models. We have shown various novel ways of predicting the dynamics of piecewise smooth maps using deep learning models. Moreover, we have used machine learning models such as Decision Tree Classifier, Logistic Regression, K-Nearest Neighbor, Random Forest, and Support Vector Machine for predicting the border collision bifurcation in the 1D normal form map and the 1D tent map. Further, we classified the regular and chaotic behaviour of the 1D tent map and the 2D Lozi map using deep learning models like Convolutional Neural Network (CNN), ResNet50, and ConvLSTM via cobweb diagram and phase portraits. We also classified the chaotic and hyperchaotic behaviour of the 3D piecewise smooth map using deep learning models such as the Feed Forward Neural Network (FNN), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN). Finally, deep learning models such as Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN) are used for reconstructing the two parametric charts of 2D border collision bifurcation normal form map. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: 32 pages, 22 figures

arXiv:2406.16792 [pdf, other]

Deep Learning and Chaos: A combined Approach To Image Encryption and Decryption

Authors: Bharath V Nair, Vismaya V S, Sishu Shankar Muni, Ali Durdu

Abstract: In this paper, we introduce a novel image encryption and decryption algorithm using hyperchaotic signals from the novel 3D hyperchaotic map, 2D memristor map, Convolutional Neural Network (CNN), and key sensitivity analysis to achieve robust security and high efficiency. The encryption starts with the scrambling of gray images by using a 3D hyperchaotic map to yield complex sequences under disrupt… ▽ More In this paper, we introduce a novel image encryption and decryption algorithm using hyperchaotic signals from the novel 3D hyperchaotic map, 2D memristor map, Convolutional Neural Network (CNN), and key sensitivity analysis to achieve robust security and high efficiency. The encryption starts with the scrambling of gray images by using a 3D hyperchaotic map to yield complex sequences under disruption of pixel values; the robustness of this original encryption is further reinforced by employing a CNN to learn the intricate patterns and add the safety layer. The robustness of the encryption algorithm is shown by key sensitivity analysis, i.e., the average sensitivity of the algorithm to key elements. The other factors and systems of unauthorized decryption, even with slight variations in the keys, can alter the decryption procedure, resulting in the ineffective recreation of the decrypted image. Statistical analysis includes entropy analysis, correlation analysis, histogram analysis, and other security analyses like anomaly detection, all of which confirm the high security and effectiveness of the proposed encryption method. Testing of the algorithm under various noisy conditions is carried out to test robustness against Gaussian noise. Metrics for differential analysis, such as the NPCR (Number of Pixel Change Rate)and UACI (Unified Average Change Intensity), are also used to determine the strength of encryption. At the same time, the empirical validation was performed on several test images, which showed that the proposed encryption techniques have practical applicability and are robust to noise. Simulation results and comparative analyses illustrate that our encryption scheme possesses excellent visual security, decryption quality, and computational efficiency, and thus, it is efficient for secure image transmission and storage in big data applications. △ Less

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.14861 [pdf, other]

Resilience of the Electric Grid through Trustable IoT-Coordinated Assets

Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, H. Vincent Poor, Anuradha M. Annaswamy

Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. Ho… ▽ More The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. However, they can introduce new vulnerabilities in the form of cyberattacks, which can cause significant challenges in ensuring grid resilience. %, i.e. the ability to rapidly restore grid services in the face of severe disruptions. We propose a framework in this paper for achieving grid resilience through suitably coordinated assets including a network of Internet of Things (IoT) devices. A local electricity market is proposed to identify trustable assets and carry out this coordination. Situational Awareness (SA) of locally available DERs with the ability to inject power or reduce consumption is enabled by the market, together with a monitoring procedure for their trustability and commitment. With this SA, we show that a variety of cyberattacks can be mitigated using local trustable resources without stressing the bulk grid. The demonstrations are carried out using a variety of platforms with a high-fidelity co-simulation platform, real-time hardware-in-the-loop validation, and a utility-friendly simulator. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: Submitted to the Proceedings of the National Academy of Sciences (PNAS), under review

arXiv:2405.11511 [pdf, other]

Online Action Representation using Change Detection and Symbolic Programming

Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam

Abstract: This paper addresses the critical need for online action representation, which is essential for various applications like rehabilitation, surveillance, etc. The task can be defined as representation of actions as soon as they happen in a streaming video without access to video frames in the future. Most of the existing methods use predefined window sizes for video segments, which is a restrictive… ▽ More This paper addresses the critical need for online action representation, which is essential for various applications like rehabilitation, surveillance, etc. The task can be defined as representation of actions as soon as they happen in a streaming video without access to video frames in the future. Most of the existing methods use predefined window sizes for video segments, which is a restrictive assumption on the dynamics. The proposed method employs a change detection algorithm to automatically segment action sequences, which form meaningful sub-actions and subsequently fit symbolic generative motion programs to the clipped segments. We determine the start time and end time of segments using change detection followed by a piece-wise linear fit algorithm on joint angle and bone length sequences. Domain-specific symbolic primitives are fit to pose keypoint trajectories of those extracted segments in order to obtain a higher level semantic representation. Since this representation is part-based, it is complementary to the compositional nature of human actions, i.e., a complex activity can be broken down into elementary sub-actions. We show the effectiveness of this representation in the downstream task of class agnostic repetition detection. We propose a repetition counting algorithm based on consecutive similarity matching of primitives, which can do online repetition counting. We also compare the results with a similar but offline repetition counting algorithm. The results of the experiments demonstrate that, despite operating online, the proposed method performs better or on par with the existing method. △ Less

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.10190 [pdf, other]

Comparative Analysis of Predicting Subsequent Steps in Hénon Map

Authors: Vismaya V S, Alok Hareendran, Bharath V Nair, Sishu Shankar Muni, Martin Lellep

Abstract: This paper explores the prediction of subsequent steps in Hénon Map using various machine learning techniques. The Hénon map, well known for its chaotic behaviour, finds applications in various fields including cryptography, image encryption, and pattern recognition. Machine learning methods, particularly deep learning, are increasingly essential for understanding and predicting chaotic phenomena.… ▽ More This paper explores the prediction of subsequent steps in Hénon Map using various machine learning techniques. The Hénon map, well known for its chaotic behaviour, finds applications in various fields including cryptography, image encryption, and pattern recognition. Machine learning methods, particularly deep learning, are increasingly essential for understanding and predicting chaotic phenomena. This study evaluates the performance of different machine learning models including Random Forest, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM) networks, Support Vector Machines (SVM), and Feed Forward Neural Networks (FNN) in predicting the evolution of the Hénon map. Results indicate that LSTM network demonstrate superior predictive accuracy, particularly in extreme event prediction. Furthermore, a comparison between LSTM and FNN models reveals the LSTM's advantage, especially for longer prediction horizons and larger datasets. This research underscores the significance of machine learning in elucidating chaotic dynamics and highlights the importance of model selection and dataset size in forecasting subsequent steps in chaotic systems. △ Less

Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures

arXiv:2403.10512 [pdf, other]

doi 10.1145/3613904.3642615

Surveyor: Facilitating Discovery Within Video Games for Blind and Low Vision Players

Authors: Vishnu Nair, Hanxiu 'Hazel' Zhu, Peize Song, Jizhong Wang, Brian A. Smith

Abstract: Video games are increasingly accessible to blind and low vision (BLV) players, yet many aspects remain inaccessible. One aspect is the joy players feel when they explore environments and make new discoveries, which is integral to many games. Sighted players experience discovery by surveying environments and identifying unexplored areas. Current accessibility tools, however, guide BLV players direc… ▽ More Video games are increasingly accessible to blind and low vision (BLV) players, yet many aspects remain inaccessible. One aspect is the joy players feel when they explore environments and make new discoveries, which is integral to many games. Sighted players experience discovery by surveying environments and identifying unexplored areas. Current accessibility tools, however, guide BLV players directly to items and places, robbing them of that experience. Thus, a crucial challenge is to develop navigation assistance tools that also foster exploration and discovery. To address this challenge, we propose the concept of exploration assistance in games and design Surveyor, an in-game exploration assistance tool that enhances discovery by tracking where BLV players look and highlighting unexplored areas. We designed Surveyor using insights from a formative study and compared Surveyor's effectiveness to approaches found in existing accessible games. Our findings reveal implications for facilitating richer play experiences for BLV users within games. △ Less

Submitted 15 March, 2024; originally announced March 2024.

Journal ref: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI '24), May 2024

arXiv:2311.08303 [pdf, other]

Extrinsically-Focused Evaluation of Omissions in Medical Summarization

Authors: Elliot Schumacher, Daniel Rosenthal, Varun Nair, Luladay Price, Geoffrey Tso, Anitha Kannan

Abstract: The goal of automated summarization techniques (Paice, 1990; Kupiec et al, 1995) is to condense text by focusing on the most critical information. Generative large language models (LLMs) have shown to be robust summarizers, yet traditional metrics struggle to capture resulting performance (Goyal et al, 2022) in more powerful LLMs. In safety-critical domains such as medicine, more rigorous evaluati… ▽ More The goal of automated summarization techniques (Paice, 1990; Kupiec et al, 1995) is to condense text by focusing on the most critical information. Generative large language models (LLMs) have shown to be robust summarizers, yet traditional metrics struggle to capture resulting performance (Goyal et al, 2022) in more powerful LLMs. In safety-critical domains such as medicine, more rigorous evaluation is required, especially given the potential for LLMs to omit important information in the resulting summary. We propose MED-OMIT, a new omission benchmark for medical summarization. Given a doctor-patient conversation and a generated summary, MED-OMIT categorizes the chat into a set of facts and identifies which are omitted from the summary. We further propose to determine fact importance by simulating the impact of each fact on a downstream clinical task: differential diagnosis (DDx) generation. MED-OMIT leverages LLM prompt-based approaches which categorize the importance of facts and cluster them as supporting or negating evidence to the diagnosis. We evaluate MED-OMIT on a publicly-released dataset of patient-doctor conversations and find that MED-OMIT captures omissions better than alternative metrics. △ Less

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.05090 [pdf, other]

Deep Motion Masking for Secure, Usable, and Scalable Real-Time Anonymization of Virtual Reality Motion Data

Authors: Vivek Nair, Wenbo Guo, James F. O'Brien, Louis Rosenberg, Dawn Song

Abstract: Virtual reality (VR) and "metaverse" systems have recently seen a resurgence in interest and investment as major technology companies continue to enter the space. However, recent studies have demonstrated that the motion tracking "telemetry" data used by nearly all VR applications is as uniquely identifiable as a fingerprint scan, raising significant privacy concerns surrounding metaverse technolo… ▽ More Virtual reality (VR) and "metaverse" systems have recently seen a resurgence in interest and investment as major technology companies continue to enter the space. However, recent studies have demonstrated that the motion tracking "telemetry" data used by nearly all VR applications is as uniquely identifiable as a fingerprint scan, raising significant privacy concerns surrounding metaverse technologies. Although previous attempts have been made to anonymize VR motion data, we present in this paper a state-of-the-art VR identification model that can convincingly bypass known defensive countermeasures. We then propose a new "deep motion masking" approach that scalably facilitates the real-time anonymization of VR telemetry data. Through a large-scale user study (N=182), we demonstrate that our method is significantly more usable and private than existing VR anonymity systems. △ Less

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.02101 [pdf, other]

Solving MaxSAT with Matrix Multiplication

Authors: David Warde-Farley, Vinod Nair, Yujia Li, Ivan Lobov, Felix Gimeno, Simon Osindero

Abstract: We propose an incomplete algorithm for Maximum Satisfiability (MaxSAT) specifically designed to run on neural network accelerators such as GPUs and TPUs. Given a MaxSAT problem instance in conjunctive normal form, our procedure constructs a Restricted Boltzmann Machine (RBM) with an equilibrium distribution wherein the probability of a Boolean assignment is exponential in the number of clauses it… ▽ More We propose an incomplete algorithm for Maximum Satisfiability (MaxSAT) specifically designed to run on neural network accelerators such as GPUs and TPUs. Given a MaxSAT problem instance in conjunctive normal form, our procedure constructs a Restricted Boltzmann Machine (RBM) with an equilibrium distribution wherein the probability of a Boolean assignment is exponential in the number of clauses it satisfies. Block Gibbs sampling is used to stochastically search the space of assignments with parallel Markov chains. Since matrix multiplication is the main computational primitive for block Gibbs sampling in an RBM, our approach leads to an elegantly simple algorithm (40 lines of JAX) well-suited for neural network accelerators. Theoretical results about RBMs guarantee that the required number of visible and hidden units of the RBM scale only linearly with the number of variables and constant-sized clauses in the MaxSAT instance, ensuring that the computational cost of a Gibbs step scales reasonably with the instance size. Search throughput can be increased by batching parallel chains within a single accelerator as well as by distributing them across multiple accelerators. As a further enhancement, a heuristic based on unit propagation running on CPU is periodically applied to the sampled assignments. Our approach, which we term RbmSAT, is a new design point in the algorithm-hardware co-design space for MaxSAT. We present timed results on a subset of problem instances from the annual MaxSAT Evaluation's Incomplete Unweighted Track for the years 2018 to 2021. When allotted the same running time and CPU compute budget (but no TPUs), RbmSAT outperforms other participating solvers on problems drawn from three out of the four years' competitions. Given the same running time on a TPU cluster for which RbmSAT is uniquely designed, it outperforms all solvers on problems drawn from all four years. △ Less

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.00430 [pdf, other]

doi 10.1109/TVCG.2024.3372087

Berkeley Open Extended Reality Recordings 2023 (BOXRR-23): 4.7 Million Motion Capture Recordings from 105,852 Extended Reality Device Users

Authors: Vivek Nair, Wenbo Guo, Rui Wang, James F. O'Brien, Louis Rosenberg, Dawn Song

Abstract: Extended reality (XR) devices such as the Meta Quest and Apple Vision Pro have seen a recent surge in attention, with motion tracking "telemetry" data lying at the core of nearly all XR and metaverse experiences. Researchers are just beginning to understand the implications of this data for security, privacy, usability, and more, but currently lack large-scale human motion datasets to study. The B… ▽ More Extended reality (XR) devices such as the Meta Quest and Apple Vision Pro have seen a recent surge in attention, with motion tracking "telemetry" data lying at the core of nearly all XR and metaverse experiences. Researchers are just beginning to understand the implications of this data for security, privacy, usability, and more, but currently lack large-scale human motion datasets to study. The BOXRR-23 dataset contains 4,717,215 motion capture recordings, voluntarily submitted by 105,852 XR device users from over 50 countries. BOXRR-23 is over 200 times larger than the largest existing motion capture research dataset and uses a new, highly efficient purpose-built XR Open Recording (XROR) file format. △ Less

Submitted 30 September, 2023; originally announced October 2023.

Comments: Learn more at https://rdi.berkeley.edu/metaverse/boxrr-23

Journal ref: IEEE Transactions on Visualization and Computer Graphics, pages 1-8, March 2024. IEEE VR 2024, Orlando, FL March 16-21, 2024. Best Paper Honorable Mention

arXiv:2309.02426 [pdf]

Monotone Tree-Based GAMI Models by Adapting XGBoost

Authors: Linwei Hu, Soroush Aramideh, Jie Chen, Vijayan N. Nair

Abstract: Recent papers have used machine learning architecture to fit low-order functional ANOVA models with main effects and second-order interactions. These GAMI (GAM + Interaction) models are directly interpretable as the functional main effects and interactions can be easily plotted and visualized. Unfortunately, it is not easy to incorporate the monotonicity requirement into the existing GAMI models b… ▽ More Recent papers have used machine learning architecture to fit low-order functional ANOVA models with main effects and second-order interactions. These GAMI (GAM + Interaction) models are directly interpretable as the functional main effects and interactions can be easily plotted and visualized. Unfortunately, it is not easy to incorporate the monotonicity requirement into the existing GAMI models based on boosted trees, such as EBM (Lou et al. 2013) and GAMI-Lin-T (Hu et al. 2022). This paper considers models of the form $f(x)=\sum_{j,k}f_{j,k}(x_j, x_k)$ and develops monotone tree-based GAMI models, called monotone GAMI-Tree, by adapting the XGBoost algorithm. It is straightforward to fit a monotone model to $f(x)$ using the options in XGBoost. However, the fitted model is still a black box. We take a different approach: i) use a filtering technique to determine the important interactions, ii) fit a monotone XGBoost algorithm with the selected interactions, and finally iii) parse and purify the results to get a monotone GAMI model. Simulated datasets are used to demonstrate the behaviors of mono-GAMI-Tree and EBM, both of which use piecewise constant fits. Note that the monotonicity requirement is for the full model. Under certain situations, the main effects will also be monotone. But, as seen in the examples, the interactions will not be monotone. △ Less

Submitted 5 September, 2023; originally announced September 2023.

Comments: 12 pages

arXiv:2308.09341 [pdf]

Document Automation Architectures: Updated Survey in Light of Large Language Models

Authors: Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair

Abstract: This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and assembling documents conforming to defined templates. There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there h… ▽ More This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and assembling documents conforming to defined templates. There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies. The current survey of DA reviews the academic literature and provides a clearer definition and characterization of DA and its features, identifies state-of-the-art DA architectures and technologies in academic research, and provides ideas that can lead to new research opportunities within the DA field in light of recent advances in generative AI and large language models. △ Less

Submitted 18 August, 2023; originally announced August 2023.

Comments: The current paper is the updated version of an earlier survey on document automation [Ahmadi Achachlouei et al. 2021]. Updates in the current paper are as follows: We shortened almost all sections to reduce the size of the main paper (without references) from 28 pages to 10 pages, added a review of selected papers on large language models, removed certain sections and most of diagrams. arXiv admin note: substantial text overlap with arXiv:2109.11603

MSC Class: 68T50 ACM Class: I.7.0; I.2.7; I.2.4

arXiv:2306.14746 [pdf, other]

MFDPG: Multi-Factor Authenticated Password Management With Zero Stored Secrets

Authors: Vivek Nair, Dawn Song

Abstract: While password managers are a vital tool for internet security, they can also create a massive central point of failure, as evidenced by several major recent data breaches. For over 20 years, deterministic password generators (DPGs) have been proposed, and largely rejected, as a viable alternative to password management tools. In this paper, we survey 45 existing DPGs to asses the main security, p… ▽ More While password managers are a vital tool for internet security, they can also create a massive central point of failure, as evidenced by several major recent data breaches. For over 20 years, deterministic password generators (DPGs) have been proposed, and largely rejected, as a viable alternative to password management tools. In this paper, we survey 45 existing DPGs to asses the main security, privacy, and usability issues hindering their adoption. We then present a new multi-factor deterministic password generator (MFDPG) design that aims to address these shortcomings. The result not only achieves strong, practical password management with zero credential storage, but also effectively serves as a progressive client-side upgrade of weak password-only websites to strong multi-factor authentication. △ Less

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2306.08169 [pdf, other]

doi 10.1109/EuroSP57164.2023.00013

Multi-Factor Credential Hashing for Asymmetric Brute-Force Attack Resistance

Authors: Vivek Nair, Dawn Song

Abstract: Since the introduction of bcrypt in 1999, adaptive password hashing functions, whereby brute-force resistance increases symmetrically with computational difficulty for legitimate users, have been our most powerful post-breach countermeasure against credential disclosure. Unfortunately, the relatively low tolerance of users to added latency places an upper bound on the deployment of this technique… ▽ More Since the introduction of bcrypt in 1999, adaptive password hashing functions, whereby brute-force resistance increases symmetrically with computational difficulty for legitimate users, have been our most powerful post-breach countermeasure against credential disclosure. Unfortunately, the relatively low tolerance of users to added latency places an upper bound on the deployment of this technique in most applications. In this paper, we present a multi-factor credential hashing function (MFCHF) that incorporates the additional entropy of multi-factor authentication into password hashes to provide asymmetric resistance to brute-force attacks. MFCHF provides full backward compatibility with existing authentication software (e.g., Google Authenticator) and hardware (e.g., YubiKeys), with support for common usability features like factor recovery. The result is a 10^6 to 10^48 times increase in the difficulty of cracking hashed credentials, with little added latency or usability impact. △ Less

Submitted 13 June, 2023; originally announced June 2023.

Journal ref: 8th IEEE European Symposium on Security and Privacy (2023) 56-72

arXiv:2306.08168 [pdf, other]

doi 10.1109/ICBC56567.2023.10174998

Decentralizing Custodial Wallets with MFKDF

Authors: Vivek Nair, Dawn Song

Abstract: The average cryptocurrency user today faces a difficult choice between centralized custodial wallets, which are notoriously prone to spontaneous collapse, or cumbersome self-custody solutions, which if not managed properly can cause a total loss of funds. In this paper, we present a "best of both worlds" cryptocurrency wallet design that looks like, and inherits the user experience of, a centraliz… ▽ More The average cryptocurrency user today faces a difficult choice between centralized custodial wallets, which are notoriously prone to spontaneous collapse, or cumbersome self-custody solutions, which if not managed properly can cause a total loss of funds. In this paper, we present a "best of both worlds" cryptocurrency wallet design that looks like, and inherits the user experience of, a centralized custodial solution, while in fact being entirely decentralized in design and implementation. In our design, private keys are not stored on any device, but are instead derived directly from a user's authentication factors, such as passwords, soft tokens (e.g., Google Authenticator), hard tokens (e.g., YubiKey), or out-of-band authentication (e.g., SMS). Public parameters (salts, one-time pads, etc.) needed to access the wallet can be safely stored in public view, such as on a public blockchain, thereby providing strong availability guarantees. Users can then simply "log in" to their decentralized wallet on any device using standard credentials and even recover from lost credentials, thereby providing the usability of a custodial wallet with the trust and security of a decentralized approach. △ Less

Submitted 13 June, 2023; originally announced June 2023.

Journal ref: 5th IEEE International Conference on Blockchain and Cryptocurrency (2023) 1-9

arXiv:2306.06459 [pdf, other]

doi 10.1109/MSEC.2023.3330392

Truth in Motion: The Unprecedented Risks and Opportunities of Extended Reality Motion Data

Authors: Vivek Nair, Louis Rosenberg, James F. O'Brien, Dawn Song

Abstract: Motion tracking "telemetry" data lies at the core of nearly all modern extended reality (XR) and metaverse experiences. While generally presumed innocuous, recent studies have demonstrated that motion data actually has the potential to profile and deanonymize XR users, posing a significant threat to security and privacy in the metaverse. Motion tracking "telemetry" data lies at the core of nearly all modern extended reality (XR) and metaverse experiences. While generally presumed innocuous, recent studies have demonstrated that motion data actually has the potential to profile and deanonymize XR users, posing a significant threat to security and privacy in the metaverse. △ Less

Submitted 10 June, 2023; originally announced June 2023.

Journal ref: IEEE Security & Privacy (2024)

arXiv:2305.19198 [pdf, other]

Inferring Private Personal Attributes of Virtual Reality Users from Head and Hand Motion Data

Authors: Vivek Nair, Christian Rack, Wenbo Guo, Rui Wang, Shuixian Li, Brandon Huang, Atticus Cull, James F. O'Brien, Marc Latoschik, Louis Rosenberg, Dawn Song

Abstract: Motion tracking "telemetry" data lies at the core of nearly all modern virtual reality (VR) and metaverse experiences. While generally presumed innocuous, recent studies have demonstrated that motion data actually has the potential to uniquely identify VR users. In this study, we go a step further, showing that a variety of private user information can be inferred just by analyzing motion data rec… ▽ More Motion tracking "telemetry" data lies at the core of nearly all modern virtual reality (VR) and metaverse experiences. While generally presumed innocuous, recent studies have demonstrated that motion data actually has the potential to uniquely identify VR users. In this study, we go a step further, showing that a variety of private user information can be inferred just by analyzing motion data recorded from VR devices. We conducted a large-scale survey of VR users (N=1,006) with dozens of questions ranging from background and demographics to behavioral patterns and health information. We then obtained VR motion samples of each user playing the game "Beat Saber," and attempted to infer their survey responses using just their head and hand motion patterns. Using simple machine learning models, over 40 personal attributes could be accurately and consistently inferred from VR motion data alone. Despite this significant observed leakage, there remains limited awareness of the privacy implications of VR motion data, highlighting the pressing need for privacy-preserving mechanisms in multi-user VR applications. △ Less

Submitted 10 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.15670 [pdf]

Interpretable Machine Learning based on Functional ANOVA Framework: Algorithms and Comparisons

Authors: Linwei Hu, Vijayan N. Nair, Agus Sudjianto, Aijun Zhang, Jie Chen

Abstract: In the early days of machine learning (ML), the emphasis was on developing complex algorithms to achieve best predictive performance. To understand and explain the model results, one had to rely on post hoc explainability techniques, which are known to have limitations. Recently, with the recognition that interpretability is just as important, researchers are compromising on small increases in pre… ▽ More In the early days of machine learning (ML), the emphasis was on developing complex algorithms to achieve best predictive performance. To understand and explain the model results, one had to rely on post hoc explainability techniques, which are known to have limitations. Recently, with the recognition that interpretability is just as important, researchers are compromising on small increases in predictive performance to develop algorithms that are inherently interpretable. While doing so, the ML community has rediscovered the use of low-order functional ANOVA (fANOVA) models that have been known in the statistical literature for some time. This paper starts with a description of challenges with post hoc explainability and reviews the fANOVA framework with a focus on main effects and second-order interactions. This is followed by an overview of two recently developed techniques: Explainable Boosting Machines or EBM (Lou et al., 2013) and GAMI-Net (Yang et al., 2021b). The paper proposes a new algorithm, called GAMI-Lin-T, that also uses trees like EBM, but it does linear fits instead of piecewise constants within the partitions. There are many other differences, including the development of a new interaction filtering algorithm. Finally, the paper uses simulated and real datasets to compare selected ML algorithms. The results show that GAMI-Lin-T and GAMI-Net have comparable performances, and both are generally better than EBM. △ Less

Submitted 24 May, 2023; originally announced May 2023.

Comments: 24 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2207.06950

arXiv:2305.14320 [pdf, other]

Results of the 2023 Census of Beat Saber Users: Virtual Reality Gaming Population Insights and Factors Affecting Virtual Reality E-Sports Performance

Authors: Vivek Nair, Viktor Radulov, James F. O'Brien

Abstract: The emergence of affordable standalone virtual reality (VR) devices has allowed VR technology to reach mass-market adoption in recent years, driven primarily by the popularity of VR gaming applications such as Beat Saber. However, despite being the top-grossing VR application to date and the most popular VR e-sport, the population of over 6 million Beat Saber users has not yet been widely studied.… ▽ More The emergence of affordable standalone virtual reality (VR) devices has allowed VR technology to reach mass-market adoption in recent years, driven primarily by the popularity of VR gaming applications such as Beat Saber. However, despite being the top-grossing VR application to date and the most popular VR e-sport, the population of over 6 million Beat Saber users has not yet been widely studied. In this report, we present a large-scale comprehensive survey of Beat Saber players (N=1,006) that sheds light on several important aspects of this population, including their background, biometrics, demographics, health information, behavioral patterns, and technical device specifications. We further provide insights into the emerging field of VR e-sports by analyzing correlations between responses and an authoritative measure of in-game performance. △ Less

Submitted 30 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: for interactive version, see https://www.beatleader.xyz/census2023

arXiv:2305.05982 [pdf, other]

Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

Authors: Varun Nair, Elliot Schumacher, Anitha Kannan

Abstract: A medical provider's summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies… ▽ More A medical provider's summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies in visit summaries (for example, summarizing "patient does not have a fever" when a fever is present) can be detrimental to the outcome of care for the patient. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks that are sequentially built upon. First, we identify medical entities and their affirmations within the conversation to serve as building blocks. We study dynamically constructing few-shot prompts for tasks by conditioning on relevant patient information and use GPT-3 as the backbone for our experiments. We also develop GPT-derived summarization metrics to measure performance against reference summaries quantitatively. Both our human evaluation study and metrics for medical correctness show that summaries generated using this approach are clinically accurate and outperform the baseline approach of summarizing the dialog in a zero-shot, single-prompt setting. △ Less

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2304.14364 [pdf, other]

CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants

Authors: Albert Yu Sun, Varun Nair, Elliot Schumacher, Anitha Kannan

Abstract: A wave of new task-based virtual assistants has been fueled by increasingly powerful large language models (LLMs), such as GPT-4 (OpenAI, 2023). A major challenge in deploying LLM-based virtual conversational assistants in real world settings is ensuring they operate within what is admissible for the task. To overcome this challenge, the designers of these virtual assistants rely on an independent… ▽ More A wave of new task-based virtual assistants has been fueled by increasingly powerful large language models (LLMs), such as GPT-4 (OpenAI, 2023). A major challenge in deploying LLM-based virtual conversational assistants in real world settings is ensuring they operate within what is admissible for the task. To overcome this challenge, the designers of these virtual assistants rely on an independent guardrail system that verifies the virtual assistant's output aligns with the constraints required for the task. However, relying on commonly used, prompt-based guardrails can be difficult to engineer correctly and comprehensively. To address these challenges, we propose CONSCENDI. We use CONSCENDI to exhaustively generate training data with two key LLM-powered components: scenario-augmented generation and contrastive training examples. When generating conversational data, we generate a set of rule-breaking scenarios, which enumerate a diverse set of high-level ways a rule can be violated. This scenario-guided approach produces a diverse training set and provides chatbot designers greater control. To generate contrastive examples, we prompt the LLM to alter conversations with violations into acceptable conversations to enable fine-grained distinctions. We then use this data, generated by CONSCENDI, to train a smaller model. We find that CONSCENDI results in guardrail models that improve over baselines in multiple dialogue domains. △ Less

Submitted 3 April, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

Comments: To appear in NAACL 2024

arXiv:2303.17071 [pdf, other]

DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents

Authors: Varun Nair, Elliot Schumacher, Geoffrey Tso, Anitha Kannan

Abstract: Large language models (LLMs) have emerged as valuable tools for many natural language understanding tasks. In safety-critical applications such as healthcare, the utility of these models is governed by their ability to generate outputs that are factually accurate and complete. In this work, we present dialog-enabled resolving agents (DERA). DERA is a paradigm made possible by the increased convers… ▽ More Large language models (LLMs) have emerged as valuable tools for many natural language understanding tasks. In safety-critical applications such as healthcare, the utility of these models is governed by their ability to generate outputs that are factually accurate and complete. In this work, we present dialog-enabled resolving agents (DERA). DERA is a paradigm made possible by the increased conversational abilities of LLMs, namely GPT-4. It provides a simple, interpretable forum for models to communicate feedback and iteratively improve output. We frame our dialog as a discussion between two agent types - a Researcher, who processes information and identifies crucial problem components, and a Decider, who has the autonomy to integrate the Researcher's information and makes judgments on the final output. We test DERA against three clinically-focused tasks. For medical conversation summarization and care plan generation, DERA shows significant improvement over the base GPT-4 performance in both human expert preference evaluations and quantitative metrics. In a new finding, we also show that GPT-4's performance (70%) on an open-ended version of the MedQA question-answering (QA) dataset (Jin et al. 2021, USMLE) is well above the passing level (60%), with DERA showing similar performance. We release the open-ended MEDQA dataset at https://github.com/curai/curai-research/tree/main/DERA. △ Less

Submitted 29 March, 2023; originally announced March 2023.

arXiv:2303.16205 [pdf]

doi 10.1093/pnasnexus/pgad111

mHealth hyperspectral learning for instantaneous spatiospectral imaging of hemodynamics

Authors: Yuhyun Ji, Sang Mok Park, Semin Kwon, Jung Woo Leem, Vidhya Vijayakrishnan Nair, Yunjie Tong, Young L. Kim

Abstract: Hyperspectral imaging acquires data in both the spatial and frequency domains to offer abundant physical or biological information. However, conventional hyperspectral imaging has intrinsic limitations of bulky instruments, slow data acquisition rate, and spatiospectral tradeoff. Here we introduce hyperspectral learning for snapshot hyperspectral imaging in which sampled hyperspectral data in a sm… ▽ More Hyperspectral imaging acquires data in both the spatial and frequency domains to offer abundant physical or biological information. However, conventional hyperspectral imaging has intrinsic limitations of bulky instruments, slow data acquisition rate, and spatiospectral tradeoff. Here we introduce hyperspectral learning for snapshot hyperspectral imaging in which sampled hyperspectral data in a small subarea are incorporated into a learning algorithm to recover the hypercube. Hyperspectral learning exploits the idea that a photograph is more than merely a picture and contains detailed spectral information. A small sampling of hyperspectral data enables spectrally informed learning to recover a hypercube from an RGB image. Hyperspectral learning is capable of recovering full spectroscopic resolution in the hypercube, comparable to high spectral resolutions of scientific spectrometers. Hyperspectral learning also enables ultrafast dynamic imaging, leveraging ultraslow video recording in an off-the-shelf smartphone, given that a video comprises a time series of multiple RGB images. To demonstrate its versatility, an experimental model of vascular development is used to extract hemodynamic parameters via statistical and deep-learning approaches. Subsequently, the hemodynamics of peripheral microcirculation is assessed at an ultrafast temporal resolution up to a millisecond, using a conventional smartphone camera. This spectrally informed learning method is analogous to compressed sensing; however, it further allows for reliable hypercube recovery and key feature extractions with a transparent learning algorithm. This learning-powered snapshot hyperspectral imaging method yields high spectral and temporal resolutions and eliminates the spatiospectral tradeoff, offering simple hardware requirements and potential applications of various machine-learning techniques. △ Less

Submitted 5 April, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Journal ref: PNAS Nexus, pgad111, 2023

arXiv:2302.09124 [pdf, other]

doi 10.1145/3544548.3581302

ImageAssist: Tools for Enhancing Touchscreen-Based Image Exploration Systems for Blind and Low Vision Users

Authors: Vishnu Nair, Hanxiu 'Hazel' Zhu, Brian A. Smith

Abstract: Blind and low vision (BLV) users often rely on alt text to understand what a digital image is showing. However, recent research has investigated how touch-based image exploration on touchscreens can supplement alt text. Touchscreen-based image exploration systems allow BLV users to deeply understand images while granting a strong sense of agency. Yet, prior work has found that these systems requir… ▽ More Blind and low vision (BLV) users often rely on alt text to understand what a digital image is showing. However, recent research has investigated how touch-based image exploration on touchscreens can supplement alt text. Touchscreen-based image exploration systems allow BLV users to deeply understand images while granting a strong sense of agency. Yet, prior work has found that these systems require a lot of effort to use, and little work has been done to explore these systems' bottlenecks on a deeper level and propose solutions to these issues. To address this, we present ImageAssist, a set of three tools that assist BLV users through the process of exploring images by touch -- scaffolding the exploration process. We perform a series of studies with BLV users to design and evaluate ImageAssist, and our findings reveal several implications for image exploration tools for BLV users. △ Less

Submitted 17 February, 2023; originally announced February 2023.

Journal ref: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 2023

arXiv:2302.08927 [pdf, other]

Unique Identification of 50,000+ Virtual Reality Users from Head & Hand Motion Data

Authors: Vivek Nair, Wenbo Guo, Justus Mattern, Rui Wang, James F. O'Brien, Louis Rosenberg, Dawn Song

Abstract: With the recent explosive growth of interest and investment in virtual reality (VR) and the so-called "metaverse," public attention has rightly shifted toward the unique security and privacy threats that these platforms may pose. While it has long been known that people reveal information about themselves via their motion, the extent to which this makes an individual globally identifiable within v… ▽ More With the recent explosive growth of interest and investment in virtual reality (VR) and the so-called "metaverse," public attention has rightly shifted toward the unique security and privacy threats that these platforms may pose. While it has long been known that people reveal information about themselves via their motion, the extent to which this makes an individual globally identifiable within virtual reality has not yet been widely understood. In this study, we show that a large number of real VR users (N=55,541) can be uniquely and reliably identified across multiple sessions using just their head and hand motion relative to virtual objects. After training a classification model on 5 minutes of data per person, a user can be uniquely identified amongst the entire pool of 50,000+ with 94.33% accuracy from 100 seconds of motion, and with 73.20% accuracy from just 10 seconds of motion. This work is the first to truly demonstrate the extent to which biomechanics may serve as a unique identifier in VR, on par with widely used biometrics such as facial or fingerprint recognition. △ Less

Submitted 17 February, 2023; originally announced February 2023.

Journal ref: 32nd USENIX Security Symposium (2023) 895-910

arXiv:2302.06591 [pdf, other]

doi 10.1109/CCTA54093.2023.10253334

Local retail electricity markets for distribution grid services

Authors: Vineet Jagadeesan Nair, Anuradha Annaswamy

Abstract: We propose a hierarchical local electricity market (LEM) at the primary and secondary feeder levels in a distribution grid, to optimally coordinate and schedule distributed energy resources (DER) and provide valuable grid services like voltage control. At the primary level, we use a current injection-based model that is valid for both radial and meshed, balanced and unbalanced, multi-phase systems… ▽ More We propose a hierarchical local electricity market (LEM) at the primary and secondary feeder levels in a distribution grid, to optimally coordinate and schedule distributed energy resources (DER) and provide valuable grid services like voltage control. At the primary level, we use a current injection-based model that is valid for both radial and meshed, balanced and unbalanced, multi-phase systems. The primary and secondary markets leverage the flexibility offered by DERs to optimize grid operation and maximize social welfare. Numerical simulations on an IEEE-123 bus modified to include DERs, show that the LEM successfully achieves voltage control and reduces overall network costs, while also allowing us to decompose the price and value associated with different grid services so as to accurately compensate DERs. △ Less

Submitted 11 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: 9 pages, 13 figures, Accepted to the 7th IEEE Conference on Control Technology and Applications (CCTA) 2023

arXiv:2301.05940 [pdf, other]

doi 10.56553/popets-2024-0003

SoK: Data Privacy in Virtual Reality

Authors: Gonzalo Munilla Garrido, Vivek Nair, Dawn Song

Abstract: The adoption of virtual reality (VR) technologies has rapidly gained momentum in recent years as companies around the world begin to position the so-called "metaverse" as the next major medium for accessing and interacting with the internet. While consumers have become accustomed to a degree of data harvesting on the web, the real-time nature of data sharing in the metaverse indicates that privacy… ▽ More The adoption of virtual reality (VR) technologies has rapidly gained momentum in recent years as companies around the world begin to position the so-called "metaverse" as the next major medium for accessing and interacting with the internet. While consumers have become accustomed to a degree of data harvesting on the web, the real-time nature of data sharing in the metaverse indicates that privacy concerns are likely to be even more prevalent in the new "Web 3.0." Research into VR privacy has demonstrated that a plethora of sensitive personal information is observable by various would-be adversaries from just a few minutes of telemetry data. On the other hand, we have yet to see VR parallels for many privacy-preserving tools aimed at mitigating threats on conventional platforms. This paper aims to systematize knowledge on the landscape of VR privacy threats and countermeasures by proposing a comprehensive taxonomy of data attributes, protections, and adversaries based on the study of 68 collected publications. We complement our qualitative discussion with a statistical analysis of the risk associated with various data sources inherent to VR in consideration of the known attacks and defenses. By focusing on highlighting the clear outstanding opportunities, we hope to motivate and guide further research into this increasingly important field. △ Less

Submitted 18 May, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

Journal ref: 24th Privacy Enhancing Technologies Symposium (2024) 21-40

arXiv:2211.12796 [pdf, other]

IMaSC -- ICFOSS Malayalam Speech Corpus

Authors: Deepa P Gopinath, Thennal D K, Vrinda V Nair, Swaraj K S, Sachin G

Abstract: Modern text-to-speech (TTS) systems use deep learning to synthesize speech increasingly approaching human quality, but they require a database of high quality audio-text sentence pairs for training. Malayalam, the official language of the Indian state of Kerala and spoken by 35+ million people, is a low resource language in terms of available corpora for TTS systems. In this paper, we present IMaS… ▽ More Modern text-to-speech (TTS) systems use deep learning to synthesize speech increasingly approaching human quality, but they require a database of high quality audio-text sentence pairs for training. Malayalam, the official language of the Indian state of Kerala and spoken by 35+ million people, is a low resource language in terms of available corpora for TTS systems. In this paper, we present IMaSC, a Malayalam text and speech corpora containing approximately 50 hours of recorded speech. With 8 speakers and a total of 34,473 text-audio pairs, IMaSC is larger than every other publicly available alternative. We evaluated the database by using it to train TTS models for each speaker based on a modern deep learning architecture. Via subjective evaluation, we show that our models perform significantly better in terms of naturalness compared to previous studies and publicly available models, with an average mean opinion score of 4.50, indicating that the synthesized speech is close to human quality. △ Less

Submitted 23 November, 2022; originally announced November 2022.

Comments: 18 pages, 8 figures

arXiv:2211.08536 [pdf]

Behavior of Hyper-Parameters for Selected Machine Learning Algorithms: An Empirical Investigation

Authors: Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair

Abstract: Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance. This paper studies their behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest (RF), and Feedforward Neural Network (FFNN) with structured data. Our empirical investigation examines the qualitative behavior of model performance as the HPs vary, quan… ▽ More Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance. This paper studies their behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest (RF), and Feedforward Neural Network (FFNN) with structured data. Our empirical investigation examines the qualitative behavior of model performance as the HPs vary, quantifies the importance of each HP for different ML algorithms, and stability of the performance near the optimal region. Based on the findings, we propose a set of guidelines for efficient HP tuning by reducing the search space. △ Less

Submitted 15 November, 2022; originally announced November 2022.

arXiv:2208.14573 [pdf, other]

doi 10.1145/3517428.3544802

Uncovering Visually Impaired Gamers' Preferences for Spatial Awareness Tools Within Video Games

Authors: Vishnu Nair, Shao-en Ma, Ricardo E. Gonzalez Penuela, Yicheng He, Karen Lin, Mason Hayes, Hannah Huddleston, Matthew Donnelly, Brian A. Smith

Abstract: Sighted players gain spatial awareness within video games through sight and spatial awareness tools (SATs) such as minimaps. Visually impaired players (VIPs), however, must often rely heavily on SATs to gain spatial awareness, especially in complex environments where using rich ambient sound design alone may be insufficient. Researchers have developed many SATs for facilitating spatial awareness w… ▽ More Sighted players gain spatial awareness within video games through sight and spatial awareness tools (SATs) such as minimaps. Visually impaired players (VIPs), however, must often rely heavily on SATs to gain spatial awareness, especially in complex environments where using rich ambient sound design alone may be insufficient. Researchers have developed many SATs for facilitating spatial awareness within VIPs. Yet this abundance disguises a gap in our understanding about how exactly these approaches assist VIPs in gaining spatial awareness and what their relative merits and limitations are. To address this, we investigate four leading approaches to facilitating spatial awareness for VIPs within a 3D video game context. Our findings uncover new insights into SATs for VIPs within video games, including that VIPs value position and orientation information the most from an SAT; that none of the approaches we investigated convey position and orientation effectively; and that VIPs highly value the ability to customize SATs. △ Less

Submitted 30 August, 2022; originally announced August 2022.

Journal ref: Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 2022

arXiv:2208.09254 [pdf, other]

Mitigating Disparity while Maximizing Reward: Tight Anytime Guarantee for Improving Bandits

Authors: Vishakha Patil, Vineet Nair, Ganesh Ghalme, Arindam Khan

Abstract: We study the Improving Multi-Armed Bandit (IMAB) problem, where the reward obtained from an arm increases with the number of pulls it receives. This model provides an elegant abstraction for many real-world problems in domains such as education and employment, where decisions about the distribution of opportunities can affect the future capabilities of communities and the disparity between them. A… ▽ More We study the Improving Multi-Armed Bandit (IMAB) problem, where the reward obtained from an arm increases with the number of pulls it receives. This model provides an elegant abstraction for many real-world problems in domains such as education and employment, where decisions about the distribution of opportunities can affect the future capabilities of communities and the disparity between them. A decision-maker in such settings must consider the impact of her decisions on future rewards in addition to the standard objective of maximizing her cumulative reward at any time. In many of these applications, the time horizon is unknown to the decision-maker beforehand, which motivates the study of the IMAB problem in the technically more challenging horizon-unaware setting. We study the tension that arises between two seemingly conflicting objectives in the horizon-unaware setting: a) maximizing the cumulative reward at any time based on current rewards of the arms, and b) ensuring that arms with better long-term rewards get sufficient opportunities even if they initially have low rewards. We show that, surprisingly, the two objectives are aligned with each other in this setting. Our main contribution is an anytime algorithm for the IMAB problem that achieves the best possible cumulative reward while ensuring that the arms reach their true potential given sufficient time. Our algorithm mitigates the initial disparity due to lack of opportunity and continues pulling an arm till it stops improving. We prove the optimality of our algorithm by showing that a) any algorithm for the IMAB problem, no matter how utilitarian, must suffer $Ω(T)$ policy regret and $Ω(k)$ competitive ratio with respect to the optimal offline policy, and b) the competitive ratio of our algorithm is $O(k)$. △ Less

Submitted 19 August, 2022; originally announced August 2022.

arXiv:2208.06096 [pdf]

Comparing Baseline Shapley and Integrated Gradients for Local Explanation: Some Additional Insights

Authors: Tianshu Feng, Zhipu Zhou, Joshi Tarun, Vijayan N. Nair

Abstract: There are many different methods in the literature for local explanation of machine learning results. However, the methods differ in their approaches and often do not provide same explanations. In this paper, we consider two recent methods: Integrated Gradients (Sundararajan, Taly, & Yan, 2017) and Baseline Shapley (Sundararajan and Najmi, 2020). The original authors have already studied the axiom… ▽ More There are many different methods in the literature for local explanation of machine learning results. However, the methods differ in their approaches and often do not provide same explanations. In this paper, we consider two recent methods: Integrated Gradients (Sundararajan, Taly, & Yan, 2017) and Baseline Shapley (Sundararajan and Najmi, 2020). The original authors have already studied the axiomatic properties of the two methods and provided some comparisons. Our work provides some additional insights on their comparative behavior for tabular data. We discuss common situations where the two provide identical explanations and where they differ. We also use simulation studies to examine the differences when neural networks with ReLU activation function is used to fit the models. △ Less

Submitted 11 August, 2022; originally announced August 2022.

arXiv:2208.05604 [pdf, other]

doi 10.1145/3586183.3606754

Going Incognito in the Metaverse: Achieving Theoretically Optimal Privacy-Usability Tradeoffs in VR

Authors: Vivek Nair, Gonzalo Munilla Garrido, Dawn Song

Abstract: Virtual reality (VR) telepresence applications and the so-called "metaverse" promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease at which VR users can be profiled and deanonymized, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive utilities tha… ▽ More Virtual reality (VR) telepresence applications and the so-called "metaverse" promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease at which VR users can be profiled and deanonymized, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive utilities that users are accustomed to having access to. To remedy this, we present the first known method of implementing an "incognito mode" for VR. Our technique leverages local differential privacy to quantifiably obscure sensitive user data attributes, with a focus on intelligently adding noise when and where it is needed most to maximize privacy while minimizing usability impact. Our system is capable of flexibly adapting to the unique needs of each VR application to further optimize this trade-off. We implement our solution as a universal Unity (C#) plugin that we then evaluate using several popular VR applications. Upon faithfully replicating the most well-known VR privacy attack studies, we show a significant degradation of attacker capabilities when using our solution. △ Less

Submitted 23 October, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: Learn more at https://rdi.berkeley.edu/metaverse/metaguard/

Journal ref: 36th Annual ACM Symposium on User Interface Software and Technology (2023)

arXiv:2208.05586 [pdf, other]

Multi-Factor Key Derivation Function (MFKDF) for Fast, Flexible, Secure, & Practical Key Management

Authors: Vivek Nair, Dawn Song

Abstract: We present the first general construction of a Multi-Factor Key Derivation Function (MFKDF). Our function expands upon password-based key derivation functions (PBKDFs) with support for using other popular authentication factors like TOTP, HOTP, and hardware tokens in the key derivation process. In doing so, it provides an exponential security improvement over PBKDFs with less than 12 ms of additio… ▽ More We present the first general construction of a Multi-Factor Key Derivation Function (MFKDF). Our function expands upon password-based key derivation functions (PBKDFs) with support for using other popular authentication factors like TOTP, HOTP, and hardware tokens in the key derivation process. In doing so, it provides an exponential security improvement over PBKDFs with less than 12 ms of additional computational overhead in a typical web browser. We further present a threshold MFKDF construction, allowing for client-side key recovery and reconstitution if a factor is lost. Finally, by "stacking" derived keys, we provide a means of cryptographically enforcing arbitrarily specific key derivation policies. The result is a paradigm shift toward direct cryptographic protection of user data using all available authentication factors, with no noticeable change to the user experience. We demonstrate the ability of our solution to not only significantly improve the security of existing systems implementing PBKDFs, but also to enable new applications where PBKDFs would not be considered a feasible approach. △ Less

Submitted 16 February, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: To appear in USENIX Security '23

Journal ref: 32nd USENIX Security Symposium (2023) 2097-2114

arXiv:2208.04351 [pdf, other]

Learning to Learn to Predict Performance Regressions in Production at Meta

Authors: Moritz Beller, Hongyu Li, Vivek Nair, Vijayaraghavan Murali, Imad Ahmad, Jürgen Cito, Drew Carlson, Ari Aye, Wes Dyer

Abstract: Catching and attributing code change-induced performance regressions in production is hard; predicting them beforehand, even harder. A primer on automatically learning to predict performance regressions in software, this article gives an account of the experiences we gained when researching and deploying an ML-based regression prediction pipeline at Meta. In this paper, we report on a comparative… ▽ More Catching and attributing code change-induced performance regressions in production is hard; predicting them beforehand, even harder. A primer on automatically learning to predict performance regressions in software, this article gives an account of the experiences we gained when researching and deploying an ML-based regression prediction pipeline at Meta. In this paper, we report on a comparative study with four ML models of increasing complexity, from (1) code-opaque, over (2) Bag of Words, (3) off-the-shelve Transformer-based, to (4) a bespoke Transformer-based model, coined SuperPerforator. Our investigation shows the inherent difficulty of the performance prediction problem, which is characterized by a large imbalance of benign onto regressing changes. Our results also call into question the general applicability of Transformer-based architectures for performance prediction: an off-the-shelve CodeBERT-based approach had surprisingly poor performance; our highly customized SuperPerforator architecture initially achieved prediction performance that was just on par with simpler Bag of Words models, and only outperformed them for down-stream use cases. This ability of SuperPerforator to transfer to an application with few learning examples afforded an opportunity to deploy it in practice at Meta: it can act as a pre-filter to sort out changes that are unlikely to introduce a regression, truncating the space of changes to search a regression in by up to 43%, a 45x improvement over a random baseline. To gain further insight into SuperPerforator, we explored it via a series of experiments computing counterfactual explanations. These highlight which parts of a code change the model deems important, thereby validating the learned black-box model. △ Less

Submitted 22 May, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

arXiv:2207.13176 [pdf, other]

doi 10.56553/popets-2023-0108

Exploring the Privacy Risks of Adversarial VR Game Design

Authors: Vivek Nair, Gonzalo Munilla Garrido, Dawn Song, James F. O'Brien

Abstract: Fifty study participants playtested an innocent-looking "escape room" game in virtual reality (VR). Within just a few minutes, an adversarial program had accurately inferred over 25 of their personal data attributes, from anthropometrics like height and wingspan to demographics like age and gender. As notoriously data-hungry companies become increasingly involved in VR development, this experiment… ▽ More Fifty study participants playtested an innocent-looking "escape room" game in virtual reality (VR). Within just a few minutes, an adversarial program had accurately inferred over 25 of their personal data attributes, from anthropometrics like height and wingspan to demographics like age and gender. As notoriously data-hungry companies become increasingly involved in VR development, this experimental scenario may soon represent a typical VR user experience. Since the Cambridge Analytica scandal of 2018, adversarially designed gamified elements have been known to constitute a significant privacy threat in conventional social platforms. In this work, we present a case study of how metaverse environments can similarly be adversarially constructed to covertly infer dozens of personal data attributes from seemingly anonymous users. While existing VR privacy research largely focuses on passive observation, we argue that because individuals subconsciously reveal personal information via their motion in response to specific stimuli, active attacks pose an outsized risk in VR environments. △ Less

Submitted 13 December, 2023; v1 submitted 26 July, 2022; originally announced July 2022.

Comments: Learn more at https://rdi.berkeley.edu/metaverse/metadata

Journal ref: 23rd Privacy Enhancing Technologies Symposium (2023) 238-256

arXiv:2207.06950 [pdf]

Using Model-Based Trees with Boosting to Fit Low-Order Functional ANOVA Models

Authors: Linwei Hu, Jie Chen, Vijayan N. Nair

Abstract: Low-order functional ANOVA (fANOVA) models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning. Explainable Boosting Machines or EBM (Lou et al. 2013) and GAMI-Net (Yang et al. 2021) are two recently proposed ML algorithms for fitting functional main effects and second-order interactions. We propose a new algorithm, called GAMI… ▽ More Low-order functional ANOVA (fANOVA) models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning. Explainable Boosting Machines or EBM (Lou et al. 2013) and GAMI-Net (Yang et al. 2021) are two recently proposed ML algorithms for fitting functional main effects and second-order interactions. We propose a new algorithm, called GAMI-Tree, that is similar to EBM, but has a number of features that lead to better performance. It uses model-based trees as base learners and incorporates a new interaction filtering method that is better at capturing the underlying interactions. In addition, our iterative training method converges to a model with better predictive performance, and the embedded purification ensures that interactions are hierarchically orthogonal to main effects. The algorithm does not need extensive tuning, and our implementation is fast and efficient. We use simulated and real datasets to compare the performance and interpretability of GAMI-Tree with EBM and GAMI-Net. △ Less

Submitted 15 December, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

Comments: 25 pages plus appendix

arXiv:2206.12353 [pdf]

Quantifying Inherent Randomness in Machine Learning Algorithms

Authors: Soham Raste, Rahul Singh, Joel Vaughan, Vijayan N. Nair

Abstract: Most machine learning (ML) algorithms have several stochastic elements, and their performances are affected by these sources of randomness. This paper uses an empirical study to systematically examine the effects of two sources: randomness in model training and randomness in the partitioning of a dataset into training and test subsets. We quantify and compare the magnitude of the variation in pred… ▽ More Most machine learning (ML) algorithms have several stochastic elements, and their performances are affected by these sources of randomness. This paper uses an empirical study to systematically examine the effects of two sources: randomness in model training and randomness in the partitioning of a dataset into training and test subsets. We quantify and compare the magnitude of the variation in predictive performance for the following ML algorithms: Random Forests (RFs), Gradient Boosting Machines (GBMs), and Feedforward Neural Networks (FFNNs). Among the different algorithms, randomness in model training causes larger variation for FFNNs compared to tree-based methods. This is to be expected as FFNNs have more stochastic elements that are part of their model initialization and training. We also found that random splitting of datasets leads to higher variation compared to the inherent randomness from model training. The variation from data splitting can be a major issue if the original dataset has considerable heterogeneity. Keywords: Model Training, Reproducibility, Variation △ Less

Submitted 24 June, 2022; originally announced June 2022.

Comments: 14 pages, 4 Figures, 5 tables

arXiv:2206.08542 [pdf, other]

Strategic Representation

Authors: Vineet Nair, Ganesh Ghalme, Inbal Talgam-Cohen, Nir Rosenfeld

Abstract: Humans have come to rely on machines for reducing excessive information to manageable representations. But this reliance can be abused -- strategic machines might craft representations that manipulate their users. How can a user make good choices based on strategic representations? We formalize this as a learning problem, and pursue algorithms for decision-making that are robust to manipulation. I… ▽ More Humans have come to rely on machines for reducing excessive information to manageable representations. But this reliance can be abused -- strategic machines might craft representations that manipulate their users. How can a user make good choices based on strategic representations? We formalize this as a learning problem, and pursue algorithms for decision-making that are robust to manipulation. In our main setting of interest, the system represents attributes of an item to the user, who then decides whether or not to consume. We model this interaction through the lens of strategic classification (Hardt et al. 2016), reversed: the user, who learns, plays first; and the system, which responds, plays second. The system must respond with representations that reveal `nothing but the truth' but need not reveal the entire truth. Thus, the user faces the problem of learning set functions under strategic subset selection, which presents distinct algorithmic and statistical challenges. Our main result is a learning algorithm that minimizes error despite strategic representations, and our theoretical analysis sheds light on the trade-off between learning effort and susceptibility to manipulation. △ Less

Submitted 17 June, 2022; originally announced June 2022.

Comments: ICML 2022

arXiv:2205.12723 [pdf]

Interpretable Feature Engineering for Time Series Predictors using Attention Networks

Authors: Tianjie Wang, Jie Chen, Joel Vaughan, Vijayan N. Nair

Abstract: Regression problems with time-series predictors are common in banking and many other areas of application. In this paper, we use multi-head attention networks to develop interpretable features and use them to achieve good predictive performance. The customized attention layer explicitly uses multiplicative interactions and builds feature-engineering heads that capture temporal dynamics in a parsim… ▽ More Regression problems with time-series predictors are common in banking and many other areas of application. In this paper, we use multi-head attention networks to develop interpretable features and use them to achieve good predictive performance. The customized attention layer explicitly uses multiplicative interactions and builds feature-engineering heads that capture temporal dynamics in a parsimonious manner. Convolutional layers are used to combine multivariate time series. We also discuss methods for handling static covariates in the modeling process. Visualization and explanation tools are used to interpret the results and explain the relationship between the inputs and the extracted features. Both simulation and real dataset are used to illustrate the usefulness of the methodology. Keyword: Attention heads, Deep neural networks, Interpretable feature engineering △ Less

Submitted 23 May, 2022; originally announced May 2022.

arXiv:2204.13663 [pdf, other]

ADVISER: AI-Driven Vaccination Intervention Optimiser for Increasing Vaccine Uptake in Nigeria

Authors: Vineet Nair, Kritika Prakash, Michael Wilbur, Aparna Taneja, Corinne Namblard, Oyindamola Adeyemo, Abhishek Dubey, Abiodun Adereni, Milind Tambe, Ayan Mukhopadhyay

Abstract: More than 5 million children under five years die from largely preventable or treatable medical conditions every year, with an overwhelmingly large proportion of deaths occurring in under-developed countries with low vaccination uptake. One of the United Nations' sustainable development goals (SDG 3) aims to end preventable deaths of newborns and children under five years of age. We focus on Niger… ▽ More More than 5 million children under five years die from largely preventable or treatable medical conditions every year, with an overwhelmingly large proportion of deaths occurring in under-developed countries with low vaccination uptake. One of the United Nations' sustainable development goals (SDG 3) aims to end preventable deaths of newborns and children under five years of age. We focus on Nigeria, where the rate of infant mortality is appalling. We collaborate with HelpMum, a large non-profit organization in Nigeria to design and optimize the allocation of heterogeneous health interventions under uncertainty to increase vaccination uptake, the first such collaboration in Nigeria. Our framework, ADVISER: AI-Driven Vaccination Intervention Optimiser, is based on an integer linear program that seeks to maximize the cumulative probability of successful vaccination. Our optimization formulation is intractable in practice. We present a heuristic approach that enables us to solve the problem for real-world use-cases. We also present theoretical bounds for the heuristic method. Finally, we show that the proposed approach outperforms baseline methods in terms of vaccination uptake through experimental evaluation. HelpMum is currently planning a pilot program based on our approach to be deployed in the largest city of Nigeria, which would be the first deployment of an AI-driven vaccination uptake program in the country and hopefully, pave the way for other data-driven programs to improve health outcomes in Nigeria. △ Less

Submitted 5 July, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Accepted for publication at International Joint Conference on Artificial Intelligence 2022, AI for Good Track (IJCAI-22)

arXiv:2204.12868 [pdf]

Performance and Interpretability Comparisons of Supervised Machine Learning Algorithms: An Empirical Study

Authors: Alice J. Liu, Arpita Mukherjee, Linwei Hu, Jie Chen, Vijayan N. Nair

Abstract: This paper compares the performances of three supervised machine learning algorithms in terms of predictive ability and model interpretation on structured or tabular data. The algorithms considered were scikit-learn implementations of extreme gradient boosting machines (XGB) and random forests (RFs), and feedforward neural networks (FFNNs) from TensorFlow. The paper is organized in a findings-base… ▽ More This paper compares the performances of three supervised machine learning algorithms in terms of predictive ability and model interpretation on structured or tabular data. The algorithms considered were scikit-learn implementations of extreme gradient boosting machines (XGB) and random forests (RFs), and feedforward neural networks (FFNNs) from TensorFlow. The paper is organized in a findings-based manner, with each section providing general conclusions supported by empirical results from simulation studies that cover a wide range of model complexity and correlation structures among predictors. We considered both continuous and binary responses of different sample sizes. Overall, XGB and FFNNs were competitive, with FFNNs showing better performance in smooth models and tree-based boosting algorithms performing better in non-smooth models. This conclusion held generally for predictive performance, identification of important variables, and determining correct input-output relationships as measured by partial dependence plots (PDPs). FFNNs generally had less over-fitting, as measured by the difference in performance between training and testing datasets. However, the difference with XGB was often small. RFs did not perform well in general, confirming the findings in the literature. All models exhibited different degrees of bias seen in PDPs, but the bias was especially problematic for RFs. The extent of the biases varied with correlation among predictors, response type, and data set sample size. In general, tree-based models tended to over-regularize the fitted model in the tails of predictor distributions. Finally, as to be expected, performances were better for continuous responses compared to binary data and with larger samples. △ Less

Submitted 5 May, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

arXiv:2204.12365 [pdf]

Explaining Adverse Actions in Credit Decisions Using Shapley Decomposition

Authors: Vijayan N. Nair, Tianshu Feng, Linwei Hu, Zach Zhang, Jie Chen, Agus Sudjianto

Abstract: When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a predictive model for probability of default and proposes a methodology for AA explanation. The problem involves identifying the important predictors responsible for the… ▽ More When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a predictive model for probability of default and proposes a methodology for AA explanation. The problem involves identifying the important predictors responsible for the negative decision and is straightforward when the underlying model is additive. However, it becomes non-trivial even for linear models with interactions. We consider models with low-order interactions and develop a simple and intuitive approach based on first principles. We then show how the methodology generalizes to the well-known Shapely decomposition and the recently proposed concept of Baseline Shapley (B-Shap). Unlike other Shapley techniques in the literature for local interpretability of machine learning results, B-Shap is computationally tractable since it involves just function evaluations. An illustrative case study is used to demonstrate the usefulness of the method. The paper also discusses situations with highly correlated predictors and desirable properties of fitted models in the credit-lending context, such as monotonicity and continuity. △ Less

Submitted 26 April, 2022; originally announced April 2022.

Comments: 20 pages, 8 figures

arXiv:2112.02958 [pdf, other]

Automap: Towards Ergonomic Automated Parallelism for ML Models

Authors: Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov

Abstract: The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype o… ▽ More The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype of an automated partitioner that seamlessly integrates into existing compilers and existing user workflows. Our partitioner enables SPMD-style parallelism that encompasses data parallelism and parameter/activation sharding. Through a combination of inductive tactics and search in a platform-independent partitioning IR, automap can recover expert partitioning strategies such as Megatron sharding for transformer layers. △ Less

Submitted 6 December, 2021; originally announced December 2021.

Comments: Workshop on ML for Systems at NeurIPS 2021

arXiv:2111.09839 [pdf, other]

Training Neural Networks with Fixed Sparse Masks

Authors: Yi-Lin Sung, Varun Nair, Colin Raffel

Abstract: During typical gradient-based training of deep neural networks, all of the model's parameters are updated at each iteration. Recent work has shown that it is possible to update only a small subset of the model's parameters during training, which can alleviate storage and communication requirements. In this paper, we show that it is possible to induce a fixed sparse mask on the model's parameters t… ▽ More During typical gradient-based training of deep neural networks, all of the model's parameters are updated at each iteration. Recent work has shown that it is possible to update only a small subset of the model's parameters during training, which can alleviate storage and communication requirements. In this paper, we show that it is possible to induce a fixed sparse mask on the model's parameters that selects a subset to update over many iterations. Our method constructs the mask out of the $k$ parameters with the largest Fisher information as a simple approximation as to which parameters are most important for the task at hand. In experiments on parameter-efficient transfer learning and distributed training, we show that our approach matches or exceeds the performance of other methods for training with sparse updates while being more efficient in terms of memory usage and communication costs. We release our code publicly to promote further applications of our approach. △ Less

Submitted 18 November, 2021; originally announced November 2021.

arXiv:2111.07564 [pdf, other]

Adding more data does not always help: A study in medical conversation summarization with PEGASUS

Authors: Varun Nair, Namit Katariya, Xavier Amatriain, Ilya Valmianski, Anitha Kannan

Abstract: Medical conversation summarization is integral in capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians, and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have e… ▽ More Medical conversation summarization is integral in capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians, and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have emerged as capable alternatives to human summarization, reaching state-of-the-art performance on many summarization benchmarks. However, many downstream tasks still require at least moderately sized datasets to achieve satisfactory performance. In this work we (1) explore the effect of dataset size on transfer learning medical conversation summarization using PEGASUS and (2) evaluate various iterative labeling strategies in the low-data regime, following their success in the classification setting. We find that model performance saturates with increase in dataset size and that the various active-learning strategies evaluated all show equivalent performance consistent with simple dataset size increase. We also find that naive iterative pseudo-labeling is on-par or slightly worse than no pseudo-labeling. Our work sheds light on the successes and challenges of translating low-data regime techniques in classification to medical conversation summarization and helps guides future work in this space. Relevant code available at \url{https://github.com/curai/curai-research/tree/main/medical-summarization-ML4H-2021}. △ Less

Submitted 28 November, 2021; v1 submitted 15 November, 2021; originally announced November 2021.

Comments: Accepted to Machine Learning for Healthcare Workshop, NeurIPS 2021

arXiv:2109.11603 [pdf]

Document Automation Architectures and Technologies: A Survey

Authors: Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair

Abstract: This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically integrating input from different sources and assembling documents conforming to defined templates. There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no co… ▽ More This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically integrating input from different sources and assembling documents conforming to defined templates. There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies. The current survey of DA reviews the academic literature and provides a clearer definition and characterization of DA and its features, identifies state-of-the-art DA architectures and technologies in academic research, and provides ideas that can lead to new research opportunities within the DA field in light of recent advances in artificial intelligence and deep neural networks. △ Less

Submitted 23 September, 2021; originally announced September 2021.

Comments: 34 pages, 11 figures, 5 tables

MSC Class: 68T50 ACM Class: I.7.0; I.2.7; I.2.4

arXiv:2109.04244 [pdf, other]

Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

Abstract: Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dim… ▽ More Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dimensional space of predictors is reduced to a smaller, more manageable, set before conducting regression analysis. However, this approach does not incorporate information in the response during the dimension-reduction stage and hence can have poor predictive performance. To address this concern, several supervised linear dimension-reduction techniques have been proposed in the literature. This paper reviews selected techniques, extends some of them, and compares their performance through simulations. Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study. △ Less

Submitted 9 September, 2021; originally announced September 2021.

arXiv:2109.01202 [pdf, other]

doi 10.1145/3472749.3474768

NavStick: Making Video Games Blind-Accessible via the Ability to Look Around

Authors: Vishnu Nair, Jay L. Karp, Samuel Silverman, Mohar Kalra, Hollis Lehv, Faizan Jamil, Brian A. Smith

Abstract: Video games remain largely inaccessible to visually impaired people (VIPs). Today's blind-accessible games are highly simplified renditions of what sighted players enjoy, and they do not give VIPs the same freedom to look around and explore game worlds on their own terms. In this work, we introduce NavStick, an audio-based tool for looking around within virtual environments, with the aim of making… ▽ More Video games remain largely inaccessible to visually impaired people (VIPs). Today's blind-accessible games are highly simplified renditions of what sighted players enjoy, and they do not give VIPs the same freedom to look around and explore game worlds on their own terms. In this work, we introduce NavStick, an audio-based tool for looking around within virtual environments, with the aim of making 3D adventure video games more blind-accessible. NavStick repurposes a game controller's thumbstick to allow VIPs to survey what is around them via line-of-sight. In a user study, we compare NavStick with traditional menu-based surveying for different navigation tasks and find that VIPs were able to form more accurate mental maps of their environment with NavStick than with menu-based surveying. In an additional exploratory study, we investigate NavStick in the context of a representative 3D adventure game. Our findings reveal several implications for blind-accessible games, and we close by discussing these. △ Less

Submitted 2 September, 2021; originally announced September 2021.

Comments: 14 pages with 13 figures and 2 tables

Journal ref: Proceedings of the 34th Annual ACM Symposium on User Interface Software and Technology (UIST '21), October 2021

arXiv:2107.10201 [pdf, other]

Learning a Large Neighborhood Search Algorithm for Mixed Integer Programs

Authors: Nicolas Sonnerat, Pengming Wang, Ira Ktena, Sergey Bartunov, Vinod Nair

Abstract: Large Neighborhood Search (LNS) is a combinatorial optimization heuristic that starts with an assignment of values for the variables to be optimized, and iteratively improves it by searching a large neighborhood around the current assignment. In this paper we consider a learning-based LNS approach for mixed integer programs (MIPs). We train a Neural Diving model to represent a probability distribu… ▽ More Large Neighborhood Search (LNS) is a combinatorial optimization heuristic that starts with an assignment of values for the variables to be optimized, and iteratively improves it by searching a large neighborhood around the current assignment. In this paper we consider a learning-based LNS approach for mixed integer programs (MIPs). We train a Neural Diving model to represent a probability distribution over assignments, which, together with an off-the-shelf MIP solver, generates an initial assignment. Formulating the subsequent search steps as a Markov Decision Process, we train a Neural Neighborhood Selection policy to select a search neighborhood at each step, which is searched using a MIP solver to find the next assignment. The policy network is trained using imitation learning. We propose a target policy for imitation that, given enough compute resources, is guaranteed to select the neighborhood containing the optimal next assignment amongst all possible choices for the neighborhood of a specified size. Our approach matches or outperforms all the baselines on five real-world MIP datasets with large-scale instances from diverse applications, including two production applications at Google. It achieves $2\times$ to $37.8\times$ better average primal gap than the best baseline on three of the datasets at large running times. △ Less

Submitted 20 May, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Showing 1–50 of 100 results for author: Nair, V