Showing 1–50 of 132 results for author: Misha

Searching in archive cs.
  1. arXiv:2406.16176  [pdf, other]

    cs.AI cs.CL cs.LG

    GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets

    Authors: Qiming Wu, Zichen Chen, Will Corcoran, Misha Sra, Ambuj K. Singh

    Abstract: Large language models (LLMs) have achieved remarkable success in natural language processing (NLP), demonstrating significant capabilities in processing and understanding text data. However, recent studies have identified limitations in LLMs' ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph da…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Dataset and Benchmark track, under review

    MSC Class: H.2.8; I.2.6; I.2.7

  2. EntangleVR++: Evaluating the Potential of using Entanglement in an Interactive VR Scene Creation System

    Authors: Mengyu Chen, Marko Peljhan, Misha Sra

    Abstract: Interactive digital stories provide a sense of flexibility and freedom to players by allowing them to make choices at key junctions. These choices advance the narrative and determine, to some degree, how the story evolves for that player. As shown in prior work, the ability to control or participate in the construction of the narrative can give the player a high level of agency that results in a s…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Preprint for Frontiers in Virtual Reality, December 2023

    ACM Class: H.5.1

    Journal ref: Front. Virtual Real. 4:1252551 (2023)

  3. ConnectVR: A Trigger-Action Interface for Creating Agent-based Interactive VR Stories

    Authors: Mengyu Chen, Marko Peljhan, Misha Sra

    Abstract: The demand for interactive narratives is growing with the increasing popularity of VR and video gaming. This presents an opportunity to create interactive storytelling experiences that allow players to engage with a narrative from a first-person perspective, both immersively in VR and in 3D on a computer. However, for artists and storytellers without programming experience, authoring such experiences…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Preprint for 2024 IEEE Conference Virtual Reality and 3D User Interfaces (VR)

    ACM Class: H.5.1

    Journal ref: in 2024 IEEE Conference Virtual Reality and 3D User Interfaces (VR), Orlando, FL, USA, 2024 pp. 286-297

  4. arXiv:2406.14373  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC cs.MA

    Artificial Leviathan: Exploring Social Evolution of LLM Agents Through the Lens of Hobbesian Social Contract Theory

    Authors: Gordon Dai, Weijia Zhang, Jinhan Li, Siqi Yang, Chidera Onochie Ibe, Srihas Rao, Arthur Caetano, Misha Sra

    Abstract: The emergence of Large Language Models (LLMs) and advancements in Artificial Intelligence (AI) offer an opportunity for computational social science research at scale. Building upon prior explorations of LLM agent design, our work introduces a simulated agent society where complex social relationships dynamically form and evolve over time. Agents are imbued with psychological drives and placed in…

    Submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.11704  [pdf, other]

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and their outputs. These models perform competitively to open access models on a wide range of evaluation be…

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.11583  [pdf]

    cs.DL cs.CY

    Where there's a will there's a way: ChatGPT is used more for science in countries where it is prohibited

    Authors: Honglin Bao, Mengyi Sun, Misha Teplitskiy

    Abstract: Regulating AI is a key societal challenge, but which regulation methods are effective is unclear. This study measures the effectiveness of restricting AI services geographically, focusing on ChatGPT. OpenAI restricts ChatGPT access in several countries, including China and Russia. If restrictions are effective, ChatGPT use should be minimal in these countries. We measured use with a classifier bas…

    Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Three figures, two tables, 21 pages, and a 19-page appendix

  7. DanceGen: Supporting Choreography Ideation and Prototyping with Generative AI

    Authors: Yimeng Liu, Misha Sra

    Abstract: Choreography creation requires high proficiency in artistic and technical skills. Choreographers typically go through four stages to create a dance piece: preparation, studio, performance, and reflection. This process is often individualized, complicated, and challenging due to multiple constraints at each stage. To assist choreographers, most prior work has focused on designing digital tools to s…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: ACM Conference on Designing Interactive Systems (DIS '24)

  8. arXiv:2404.14219  [pdf, other]

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, et al. (90 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset…

    Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

  9. arXiv:2404.11120  [pdf, other]

    cs.CV

    TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing

    Authors: Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Kuo-Chin Lien, Misha Sra, Pradeep Sen

    Abstract: Despite many attempts to leverage pre-trained text-to-image models (T2I) like Stable Diffusion (SD) for controllable image editing, producing good predictable results remains a challenge. Previous approaches have focused on either fine-tuning pre-trained T2I models on specific datasets to generate certain kinds of images (e.g., with a specific object or person), or on optimizing the weights, text…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2024

  10. arXiv:2403.16369  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Action-based Representations Using Invariance

    Authors: Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang

    Abstract: Robust reinforcement learning agents using high-dimensional observations must be able to identify relevant state features amidst many exogenous distractors. A representation that captures controllability identifies these state elements by determining what affects agent control. While methods such as inverse dynamics and mutual information capture controllability for a limited number of timesteps,…

    Submitted 24 June, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: Published at the Reinforcement Learning Conference 2024

  11. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  12. arXiv:2402.17937  [pdf, other]

    cs.RO

    Can an LLM-Powered Socially Assistive Robot Effectively and Safely Deliver Cognitive Behavioral Therapy? A Study With University Students

    Authors: Mina J. Kian, Mingyu Zong, Katrin Fischer, Abhyuday Singh, Anna-Maria Velentza, Pau Sang, Shriya Upadhyay, Anika Gupta, Misha A. Faruki, Wallace Browning, Sebastien M. R. Arnold, Bhaskar Krishnamachari, Maja J. Mataric

    Abstract: Cognitive behavioral therapy (CBT) is a widely used therapeutic method for guiding individuals toward restructuring their thinking patterns as a means of addressing anxiety, depression, and other challenges. We developed a large language model (LLM)-powered prompt-engineered socially assistive robot (SAR) that guides participants through interactive CBT at-home exercises. We evaluated the performa…

    Submitted 27 February, 2024; originally announced February 2024.

  13. Exploring AI-assisted Ideation and Prototyping for Choreography

    Authors: Yimeng Liu, Misha Sra

    Abstract: Choreography creation is a multimodal endeavor, demanding cognitive abilities to develop creative ideas and technical expertise to convert choreographic ideas into physical dance movements. Previous endeavors have sought to reduce the complexities in the choreography creation process in both dimensions. Among them, non-AI-based systems have focused on reinforcing cognitive activities by helping an…

    Submitted 20 February, 2024; originally announced February 2024.

  14. arXiv:2402.04792  [pdf, other]

    cs.AI cs.CL cs.HC

    Direct Language Model Alignment from Online AI Feedback

    Authors: Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel

    Abstract: Direct alignment from preferences (DAP) methods, such as DPO, have recently emerged as efficient alternatives to reinforcement learning from human feedback (RLHF), that do not require a separate reward model. However, the preference datasets used in DAP methods are usually collected ahead of training and never updated, thus the feedback is purely offline. Moreover, responses in these datasets are…

    Submitted 29 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 18 pages, 9 figures, 4 tables

  15. arXiv:2402.04464  [pdf]

    cs.AI cs.CY

    Ten Hard Problems in Artificial Intelligence We Must Get Right

    Authors: Gavin Leech, Simson Garfinkel, Misha Yagudin, Alexander Briand, Aleksandr Zhuravlev

    Abstract: We explore the AI2050 "hard problems" that block the promise of AI and cause AI risks: (1) developing general capabilities of the systems; (2) assuring the performance of AI systems and their training processes; (3) aligning system goals with human goals; (4) enabling great applications of AI in real life; (5) addressing economic disruptions; (6) ensuring the participation of all; (7) at the same…

    Submitted 19 April, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 75 + 19 pages

  16. arXiv:2402.01878  [pdf, other]

    cs.CL cs.LG

    LiPO: Listwise Preference Optimization through Learning-to-Rank

    Authors: Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang

    Abstract: Aligning language models (LMs) with curated human feedback is critical to control their behaviors in real-world applications. Several recent policy optimization methods, such as DPO and SLiC, serve as promising alternatives to the traditional Reinforcement Learning from Human Feedback (RLHF) approach. In practice, human feedback often comes in a format of a ranked list over multiple responses to a…

    Submitted 22 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  17. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  18. arXiv:2311.09017  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Semidefinite programs simulate approximate message passing robustly

    Authors: Misha Ivkov, Tselil Schramm

    Abstract: Approximate message passing (AMP) is a family of iterative algorithms that generalize matrix power iteration. AMP algorithms are known to optimally solve many average-case optimization problems. In this paper, we show that a large class of AMP algorithms can be simulated in polynomial time by \emph{local statistics hierarchy} semidefinite programs (SDPs), even when an unknown principal minor of me…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 50 pages

  19. arXiv:2311.08614  [pdf, other]

    cs.CL cs.AI

    XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making

    Authors: Zichen Chen, Jianda Chen, Mitali Gaidhani, Ambuj Singh, Misha Sra

    Abstract: Large Language Models (LLMs) have recently made impressive strides in natural language understanding tasks. Despite their remarkable performance, understanding their decision-making process remains a big challenge. In this paper, we look into bringing some transparency to this process by introducing a new explanation dataset for question answering (QA) tasks that integrates knowledge graphs (KGs)…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 17 pages, 6 figures, 7 tables. Our dataset is available at: https://github.com/chen-zichen/XplainLLM_dataset.git

  20. arXiv:2311.04742  [pdf, other]

    cs.CL q-bio.NC

    Using large language models to study human memory for meaningful narratives

    Authors: Antonios Georgiou, Tankut Can, Mikhail Katkov, Misha Tsodyks

    Abstract: One of the most impressive achievements of the AI revolution is the development of large language models that can generate meaningful text and respond to instructions in plain English with no additional training necessary. Here we show that language models can be used as a scientific instrument for studying human memory for meaningful material. We developed a pipeline for designing large scale mem…

    Submitted 28 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: v2: 43 pages, with added discussion and a new appendix C

  21. arXiv:2310.08764  [pdf, other]

    cs.CL cs.LG

    Calibrating Likelihoods towards Consistency in Summarization Models

    Authors: Polina Zablotskaia, Misha Khalman, Rishabh Joshi, Livio Baldini Soares, Shoshana Jakobovits, Joshua Maynez, Shashi Narayan

    Abstract: Despite the recent advances in abstractive text summarization, current summarization models still suffer from generating factually inconsistent summaries, reducing their utility for real-world application. We argue that the main reason for such behavior is that the summarization models trained with maximum likelihood objective assign high probability to plausible sequences given the context, but t…

    Submitted 12 October, 2023; originally announced October 2023.

  22. arXiv:2310.00639  [pdf, other]

    eess.IV cs.CV

    Segmentation-based Assessment of Tumor-Vessel Involvement for Surgical Resectability Prediction of Pancreatic Ductal Adenocarcinoma

    Authors: Christiaan Viviers, Mark Ramaekers, Amaan Valiuddin, Terese Hellström, Nick Tasios, John van der Ven, Igor Jacobs, Lotte Ewals, Joost Nederend, Peter de With, Misha Luyer, Fons van der Sommen

    Abstract: Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive cancer with limited treatment options. This research proposes a workflow and deep learning-based segmentation models to automatically assess tumor-vessel involvement, a key factor in determining tumor resectability. Correct assessment of resectability is vital to determine treatment options. The proposed workflow involves processing CT…

    Submitted 1 October, 2023; originally announced October 2023.

    Comments: ICCV CVAMD 2023

  23. arXiv:2309.06657  [pdf, other]

    cs.CL

    Statistical Rejection Sampling Improves Preference Optimization

    Authors: Tianqi Liu, Yao Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu

    Abstract: Improving the alignment of language models with human preferences remains an active research challenge. Previous approaches have primarily utilized Reinforcement Learning from Human Feedback (RLHF) via online RL methods such as Proximal Policy Optimization (PPO). Recently, offline methods such as Sequence Likelihood Calibration (SLiC) and Direct Preference Optimization (DPO) have emerged as attrac…

    Submitted 23 January, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: Accepted in ICLR 2024

  24. arXiv:2306.11706  [pdf, other]

    cs.RO cs.LG

    RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

    Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, et al. (14 additional authors not shown)

    Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned de…

    Submitted 22 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Transactions on Machine Learning Research (12/2023)

  25. arXiv:2306.09800  [pdf, other]

    cs.LG cs.RO

    $\pi2\text{vec}$: Policy Representations with Successor Features

    Authors: Gianluca Scarpellini, Ksenia Konyushkova, Claudio Fantacci, Tom Le Paine, Yutian Chen, Misha Denil

    Abstract: This paper describes $\pi2\text{vec}$, a method for representing behaviors of black box policies as feature vectors. The policy representations capture how the statistics of foundation model features change in response to the policy behavior in a task agnostic way, and can be trained from offline data, allowing them to be used in offline policy selection. This work provides a key piece of a recipe…

    Submitted 24 January, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted paper at ICLR2024

  26. arXiv:2306.09646  [pdf, other]

    stat.ML cs.LG

    Vacant Holes for Unsupervised Detection of the Outliers in Compact Latent Representation

    Authors: Misha Glazunov, Apostolis Zarras

    Abstract: Detection of outliers is pivotal for any machine learning model deployed and operated in the real world. It is essential for Deep Neural Networks, which were shown to be overconfident with such inputs. Moreover, even deep generative models that allow estimation of the probability density of the input fail in achieving this task. In this work, we concentrate on the specific type of these models:…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted for the 39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)

  27. arXiv:2305.10425  [pdf, other]

    cs.CL cs.AI

    SLiC-HF: Sequence Likelihood Calibration with Human Feedback

    Authors: Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, Peter J. Liu

    Abstract: Learning from human feedback has been shown to be effective at aligning language models with human preferences. Past work has often relied on Reinforcement Learning from Human Feedback (RLHF), which optimizes the language model using reward scores assigned from a reward model trained on human preference data. In this work we show how the recently introduced Sequence Likelihood Calibration (SLiC),…

    Submitted 17 May, 2023; originally announced May 2023.

  28. arXiv:2305.03425  [pdf, other]

    cs.CV

    GAANet: Ghost Auto Anchor Network for Detecting Varying Size Drones in Dark

    Authors: Misha Urooj Khan, Maham Misbah, Zeeshan Kaleem, Yansha Deng, Abbas Jamalipour

    Abstract: The usage of drones has tremendously increased in different sectors spanning from military to industrial applications. Despite all the benefits they offer, their misuse can lead to mishaps, and tackling them becomes more challenging particularly at night due to their small size and low visibility conditions. To overcome those limitations and improve the detection accuracy at night, we propose an o…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Accepted @ IEEE VTC2023-Spring, Florence, Italy

  29. arXiv:2304.06190  [pdf]

    cs.DL cs.CY cs.MA nlin.AO

    Do "bad" citations have "good" effects?

    Authors: Honglin Bao, Misha Teplitskiy

    Abstract: The scientific community discourages authors of research papers from citing papers that did not influence them. Such "rhetorical" citations are assumed to degrade the literature and incentives for good work. While a world where authors cite only substantively appears attractive, we argue that mandating substantive citing may have underappreciated consequences on the allocation of attention and dyn…

    Submitted 16 April, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Main: 28 pages, one table, 5 figures; Appendix: 11 pages, 13 figures

  30. arXiv:2303.16537  [pdf, other]

    cs.CL

    LMExplainer: a Knowledge-Enhanced Explainer for Language Models

    Authors: Zichen Chen, Ambuj K Singh, Misha Sra

    Abstract: Large language models (LLMs) such as GPT-4 are very powerful and can process different kinds of natural language processing (NLP) tasks. However, it can be difficult to interpret the results due to the multi-layer nonlinear model structure and millions of parameters. A lack of clarity and understanding of how the language models (LMs) work can make them unreliable, difficult to trust, and potentia…

    Submitted 3 August, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: 12 pages, 1 figure, 7 tables, and 3 case studies

  31. arXiv:2303.07280  [pdf, other]

    cs.CV cs.AI cs.LG

    Vision-Language Models as Success Detectors

    Authors: Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, Serkan Cabi

    Abstract: Detecting successful behaviour is crucial for training intelligent agents. As such, generalisable reward models are a prerequisite for agents that can learn to generalise their behaviour. In this work we focus on developing robust success detectors that leverage large, pretrained vision-language models (Flamingo, Alayrac et al. (2022)) and human reward annotations. Concretely, we treat success det…

    Submitted 13 March, 2023; originally announced March 2023.

  32. arXiv:2303.07206  [pdf]

    cs.CY eess.SY

    Toward A Dynamic Comfort Model for Human-Building Interaction in Grid-Interactive Efficient Buildings: Supported by Field Data

    Authors: SungKu Kang, Kunind Sharma, Maharshi Pathak, Emily Casavant, Katherine Bassett, Misha Pavel, David Fannon, Michael Kane

    Abstract: Controlling building electric loads could alleviate the increasing grid strain caused by the adoption of renewables and electrification. However, current approaches that automatically setback thermostats on the hottest day compromise their efficacy by neglecting human-building interaction (HBI). This study aims to define challenges and opportunities for developing engineering models of HBI to be u…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: 17 pages, 11 figures

  33. arXiv:2303.06277  [pdf, other]

    cs.CV

    SPOTR: Spatio-temporal Pose Transformers for Human Motion Prediction

    Authors: Avinash Ajit Nargund, Misha Sra

    Abstract: 3D human motion prediction is a research area of high significance and a challenge in computer vision. It is useful for the design of many applications including robotics and autonomous driving. Traditionally, autoregressive models have been used to predict human motion. However, these models have high computation needs and error accumulation that make it difficult to use them for realtime applic…

    Submitted 10 March, 2023; originally announced March 2023.

  34. arXiv:2212.14272  [pdf, other]

    stat.ML cs.LG

    Do Bayesian Variational Autoencoders Know What They Don't Know?

    Authors: Misha Glazunov, Apostolis Zarras

    Abstract: The problem of detecting the Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks. It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable and often tend to make over-confident predictions for OoDs, assigning to them a higher density than to the in-distribution data. This over-confidence in a…

    Submitted 29 December, 2022; originally announced December 2022.

    Comments: Accepted for the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022)

  35. arXiv:2211.16785  [pdf, other]

    cs.CV

    SafeSpace MFNet: Precise and Efficient MultiFeature Drone Detection Network

    Authors: Misha Urooj Khan, Mahnoor Dil, Muhammad Zeshan Alam, Farooq Alam Orakazi, Abdullah M. Almasoud, Zeeshan Kaleem, Chau Yuen

    Abstract: The increasing prevalence of unmanned aerial vehicles (UAVs), commonly known as drones, has generated a demand for reliable detection systems. The inappropriate use of drones presents potential security and privacy hazards, particularly concerning sensitive facilities. To overcome those obstacles, we proposed the concept of MultiFeatureNet (MFNet), a solution that enhances feature representation b…

    Submitted 6 October, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Paper accepted in IEEE TVT

  36. arXiv:2211.16317  [pdf, other]

    cs.CV

    TF-Net: Deep Learning Empowered Tiny Feature Network for Night-time UAV Detection

    Authors: Maham Misbah, Misha Urooj Khan, Zhaohui Yang, Zeeshan Kaleem

    Abstract: Technological advancements have normalized the usage of unmanned aerial vehicles (UAVs) in every sector, spanning from military to commercial, but they also pose serious security concerns due to their enhanced functionalities and easy access to private and highly secured areas. Several instances related to UAVs have raised security concerns, leading to UAV detection research studies. Visual techniq…

    Submitted 29 November, 2022; originally announced November 2022.

  37. arXiv:2211.07035  [pdf, other]

    econ.GN cs.SI

    Elementary Bitcoin economics: from production and transaction demand to values

    Authors: Misha Perepelitsa

    Abstract: In this paper we give an elementary analysis of economics of Bitcoin that combines the transaction demand by the consumers and the supply of hashrate by miners. We argue that the decreasing block reward will have no significant effect on the exchange rate (price) of Bitcoin and thus the network will be transitioning to a regime where transaction fees will play a bigger part of miners' revenue. We…

    Submitted 13 November, 2022; originally announced November 2022.

  38. CardsVR: A Two-Person VR Experience with Passive Haptic Feedback from a Deck of Playing Cards

    Authors: Andrew Huard, Mengyu Chen, Misha Sra

    Abstract: Presence in virtual reality (VR) is meaningful for remotely connecting with others and facilitating social interactions despite great distance while providing a sense of "being there." This work presents CardsVR, a two-person VR experience that allows remote participants to play a game of cards together. An entire deck of tracked cards is used to recreate the sense of playing cards in-person. Pri…

    Submitted 30 October, 2022; originally announced October 2022.

  39. arXiv:2210.03713  [pdf, other]

    cs.RO

    Integration of Riemannian Motion Policy with Whole-Body Control for Collision-Free Legged Locomotion

    Authors: Daniel Marew, Misha Lvovsky, Shangqun Yu, Shotaro Sessions, Donghyun Kim

    Abstract: In this paper, we present a Riemannian Motion Policy (RMP)flow-based whole-body control framework for improved dynamic legged locomotion. RMPflow is a differential geometry-inspired algorithm for fusing multiple task-space policies (RMPs) into a configuration space policy in a geometrically consistent manner. RMP-based approaches are especially suited for designing simultaneous tracking and collis…

    Submitted 6 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: 8 pages, 6 figures, 2023 IEEE-RAS International Conference on Humanoid Robots

  40. arXiv:2210.00045  [pdf, other]

    cs.CL

    Calibrating Sequence likelihood Improves Conditional Language Generation

    Authors: Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu

    Abstract: Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences. While MLE trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality. This has been empirically observed in beam search decoding…

    Submitted 30 September, 2022; originally announced October 2022.

  41. arXiv:2209.05581  [pdf]

    cs.LG cs.AI

    BayesLDM: A Domain-Specific Language for Probabilistic Modeling of Longitudinal Data

    Authors: Karine Tung, Steven De La Torre, Mohamed El Mistiri, Rebecca Braga De Braganca, Eric Hekler, Misha Pavel, Daniel Rivera, Pedja Klasnja, Donna Spruijt-Metz, Benjamin M. Marlin

    Abstract: In this paper we present BayesLDM, a system for Bayesian longitudinal data modeling consisting of a high-level modeling language with specific features for modeling complex multivariate time series data coupled with a compiler that can produce optimized probabilistic program code for performing inference in the specified model. BayesLDM supports modeling of Bayesian network models with a specific…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: Accepted at IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 2022

  42. arXiv:2209.01175  [pdf]

    cs.DL econ.GN

    Intentional and serendipitous diffusion of ideas: Evidence from academic conferences

    Authors: Misha Teplitskiy, Soya Park, Neil Thompson, David Karger

    Abstract: This paper investigates the effects of seeing ideas presented in-person when they are easily accessible online. Presentations may increase the diffusion of ideas intentionally (when one attends the presentation of an idea of interest) and serendipitously (when one sees other ideas presented in the same session). We measure these effects in the context of 25 computer science conferences using data…

    Submitted 19 January, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

  43. arXiv:2208.03581  [pdf, other]

    cs.CV cs.LG

    Improved Pancreatic Tumor Detection by Utilizing Clinically-Relevant Secondary Features

    Authors: Christiaan G. A. Viviers, Mark Ramaekers, Peter H. N. de With, Dimitrios Mavroeidis, Joost Nederend, Misha Luyer, Fons van der Sommen

    Abstract: Pancreatic cancer is one of the global leading causes of cancer-related deaths. Despite the success of Deep Learning in computer-aided diagnosis and detection (CAD) methods, little attention has been paid to the detection of Pancreatic Cancer. We propose a method for detecting pancreatic tumor that utilizes clinically-relevant features in the surrounding anatomical structures, thereby better aimin…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Published at MICCAI 2022 CaPTion Workshop on Cancer Prevention through early detecTion

  44. arXiv:2207.04508  [pdf]

    cs.HC

    Adaptive Virtual Neuroarchitecture

    Authors: Abhinandan Jain, Pattie Maes, Misha Sra

    Abstract: Our surrounding environment impacts our cognitive-emotional processes on a daily basis and shapes our physical, psychological and social wellbeing. Although the effects of the built environment on our psycho-physiological processes are well studied, virtual environment design with a potentially similar impact on the user, has received limited attention. Based on the influence of space design on a…

    Submitted 10 July, 2022; originally announced July 2022.

  45. arXiv:2206.10942  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    List-Decodable Covariance Estimation

    Authors: Misha Ivkov, Pravesh K. Kothari

    Abstract: We give the first polynomial time algorithm for \emph{list-decodable covariance estimation}. For any $α> 0$, our algorithm takes input a sample $Y \subseteq \mathbb{R}^d$ of size $n\geq d^{\mathsf{poly}(1/α)}$ obtained by adversarially corrupting an $(1-α)n$ points in an i.i.d. sample $X$ of size $n$ from the Gaussian distribution with unknown mean $μ_*$ and covariance $Σ_*$. In…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: Abstract slightly clipped. To appear at STOC 2022

    ACM Class: F.2.1

  46. arXiv:2206.05330  [pdf, other]

    cs.DL cs.SI

    The Gender Gap in Scholarly Self-Promotion on Social Media

    Authors: Hao Peng, Misha Teplitskiy, Daniel M. Romero, Emőke-Ágnes Horvát

    Abstract: Self-promotion in science is ubiquitous but may not be exercised equally by men and women. Research on self-promotion in other domains suggests that, due to bias in self-assessment and adverse reactions to non-gender-conforming behaviors ("pushback"), women tend to self-promote less often than men. We test whether this pattern extends to scholars by examining self-promotion over six years using…

    Submitted 10 October, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

  47. arXiv:2205.15659  [pdf, other]

    cs.LG cs.DS stat.ML

    The CLRS Algorithmic Reasoning Benchmark

    Authors: Petar Veličković, Adrià Puigdomènech Badia, David Budden, Razvan Pascanu, Andrea Banino, Misha Dashevskiy, Raia Hadsell, Charles Blundell

    Abstract: Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms. Several important works have investigated whether neural networks can effectively reason like algorithms, typically by learning to execute them. The common trend in the area, however, is to generate targeted kinds of algorithmic data to evaluate…

    Submitted 4 June, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: To appear in ICML 2022. 19 pages, 4 figures

  48. arXiv:2204.09815  [pdf, other]

    math.NA cs.CV eess.IV

    Parametric Level-sets Enhanced To Improve Reconstruction (PaLEnTIR)

    Authors: Ege Ozsar, Misha Kilmer, Eric Miller, Eric de Sturler, Arvind Saibaba

    Abstract: We introduce PaLEnTIR, a significantly enhanced parametric level-set (PaLS) method addressing the restoration and reconstruction of piecewise constant objects. Our key contribution involves a unique PaLS formulation utilizing a single level-set function to restore scenes containing multi-contrast piecewise-constant objects without requiring knowledge of the number of objects or their contrasts. Un…

    Submitted 13 February, 2024; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: 28 pages, 35 figures

    MSC Class: 65F22; 65F99; 65N21

  49. arXiv:2204.08582  [pdf, other]

    cs.CL cs.AI cs.LG

    MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

    Authors: Jack FitzGerald, Christopher Hench, Charith Peris, Scott Mackie, Kay Rottmann, Ana Sanchez, Aaron Nash, Liam Urbach, Vishesh Kakarala, Richa Singh, Swetha Ranganath, Laurie Crist, Misha Britan, Wouter Leeuwis, Gokhan Tur, Prem Natarajan

    Abstract: We present the MASSIVE dataset--Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M realistic, parallel, labeled virtual assistant utterances spanning 51 languages, 18 domains, 60 intents, and 55 slots. MASSIVE was created by tasking professional translators to localize the English-only SLURP dataset into 5…

    Submitted 17 June, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

    Comments: Preprint; 8 pages

  50. arXiv:2202.01863  [pdf]

    eess.IV cs.CV cs.LG

    Best Practices and Scoring System on Reviewing A.I. based Medical Imaging Papers: Part 1 Classification

    Authors: Timothy L. Kline, Felipe Kitamura, Ian Pan, Amine M. Korchi, Neil Tenenholtz, Linda Moy, Judy Wawira Gichoya, Igor Santos, Steven Blumer, Misha Ysabel Hwang, Kim-Ann Git, Abishek Shroff, Elad Walach, George Shih, Steve Langer

    Abstract: With the recent advances in A.I. methodologies and their application to medical imaging, there has been an explosion of related research programs utilizing these techniques to produce state-of-the-art classification performance. Ultimately, these research programs culminate in submission of their work for consideration in peer reviewed journals. To date, the criteria for acceptance vs. rejection i…

    Submitted 3 February, 2022; originally announced February 2022.