
Showing 1–14 of 14 results for author: Kelly, C

Searching in archive cs.
  1. arXiv:2405.03162  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Advancing Multimodal Medical Capabilities of Gemini

    Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng, et al. (22 additional authors not shown)

    Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histop…

    Submitted 6 May, 2024; originally announced May 2024.

  2. arXiv:2403.09530  [pdf, other]

    cs.CV cs.AI cs.CL cs.GR

    VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

    Authors: Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou

    Abstract: The evolution of text to visual components facilitates people's daily lives, such as generating image, videos from text and identifying the desired elements within the images. Computer vision models involving the multimodal abilities in the previous days are focused on image detection, classification based on well-defined objects. Large language models (LLMs) introduces the transformation from nat…

    Submitted 22 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: 12 pages, 7 figures, pending conference

  3. arXiv:2403.09027  [pdf, other]

    cs.CV

    VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

    Authors: Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou

    Abstract: With the emergence of large language models (LLMs) and vision foundation models, how to combine the intelligence and capacity of these open-sourced or API-available models to achieve open-world visual perception remains an open question. In this paper, we introduce VisionGPT to consolidate and automate the integration of state-of-the-art foundation models, thereby facilitating vision-language unde…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 17 pages, 5 figures, and 1 table. arXiv admin note: substantial text overlap with arXiv:2311.10125

  4. arXiv:2403.07944  [pdf, other]

    cs.CV cs.AI

    WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs

    Authors: Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou

    Abstract: Several text-to-video diffusion models have demonstrated commendable capabilities in synthesizing high-quality video content. However, it remains a formidable challenge pertaining to maintaining temporal consistency and ensuring action smoothness throughout the generated sequences. In this paper, we present an innovative video generation AI agent that harnesses the power of Sora-inspired multimoda…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 11 pages, 2 figures, 2 tables

  5. arXiv:2402.10291  [pdf, other]

    cs.LG stat.ML

    An Evaluation of Real-time Adaptive Sampling Change Point Detection Algorithm using KCUSUM

    Authors: Vijayalakshmi Saravanan, Perry Siehien, Shinjae Yoo, Hubertus Van Dam, Thomas Flynn, Christopher Kelly, Khaled Z Ibrahim

    Abstract: Detecting abrupt changes in real-time data streams from scientific simulations presents a challenging task, demanding the deployment of accurate and efficient algorithms. Identifying change points in live data stream involves continuous scrutiny of incoming observations for deviations in their statistical characteristics, particularly in high-volume data scenarios. Maintaining a balance between su…

    Submitted 4 April, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 16 pages. arXiv admin note: text overlap with arXiv:1903.01661

    MSC Class: CCS

  6. arXiv:2401.07390  [pdf, other]

    cs.LG cs.CV

    Knee or ROC

    Authors: Veronica Wendt, Byunggu Yu, Caleb Kelly, Junwhan Kim

    Abstract: Self-attention transformers have demonstrated accuracy for image classification with smaller data sets. However, a limitation is that tests to-date are based upon single class image detection with known representation of image populations. For instances where the input image classes may be greater than one and test sets that lack full information on representation of image populations, accuracy ca…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 9 pages

  7. arXiv:2311.10125  [pdf, other]

    cs.CV

    UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

    Authors: Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

    Abstract: In the current landscape of artificial intelligence, foundation models serve as the bedrock for advancements in both language and vision domains. OpenAI GPT-4 has emerged as the pinnacle in large language models (LLMs), while the computer vision (CV) domain boasts a plethora of state-of-the-art (SOTA) models such as Meta's SAM and DINO, and YOLOS. However, the financial and computational burdens o…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, 29 figures

  8. arXiv:2308.01317  [pdf]

    cs.CV eess.IV

    ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders

    Authors: Shawn Xu, Lin Yang, Christopher Kelly, Marcin Sieniek, Timo Kohlberger, Martin Ma, Wei-Hung Weng, Atilla Kiraly, Sahar Kazemzadeh, Zakkai Melamed, Jungyeon Park, Patricia Strachan, Yun Liu, Chuck Lau, Preeti Singh, Christina Chen, Mozziyar Etemadi, Sreenivasa Raju Kalidindi, Yossi Matias, Katherine Chou, Greg S. Corrado, Shravya Shetty, Daniel Tse, Shruthi Prabhakara, Daniel Golden, et al. (3 additional authors not shown)

    Abstract: In this work, we present an approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, that leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR ach…

    Submitted 7 September, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  9. arXiv:2212.13138  [pdf, other]

    cs.CL

    Large Language Models Encode Clinical Knowledge

    Authors: Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, Perry Payne, Martin Seneviratne, Paul Gamble, Chris Kelly, Nathaneal Scharli, Aakanksha Chowdhery, Philip Mansfield, Blaise Aguera y Arcas, Dale Webster, Greg S. Corrado, Yossi Matias, Katherine Chou, Juraj Gottweis, Nenad Tomasev, Yun Liu, et al. (5 additional authors not shown)

    Abstract: Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To a…

    Submitted 26 December, 2022; originally announced December 2022.

  10. arXiv:2204.11669  [pdf]

    eess.IV cs.AI physics.med-ph

    Deep-learning-enabled Brain Hemodynamic Mapping Using Resting-state fMRI

    Authors: Xirui Hou, Pengfei Guo, Puyang Wang, Peiying Liu, Doris D. M. Lin, Hongli Fan, Yang Li, Zhiliang Wei, Zixuan Lin, Dengrong Jiang, Jin Jin, Catherine Kelly, Jay J. Pillai, Judy Huang, Marco C. Pinho, Binu P. Thomas, Babu G. Welch, Denise C. Park, Vishal M. Patel, Argye E. Hillis, Hanzhang Lu

    Abstract: Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promises for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mappin…

    Submitted 25 April, 2022; originally announced April 2022.

    Journal ref: npj Digital Medicine (2023) 116

  11. arXiv:2102.00063  [pdf, other]

    physics.comp-ph cond-mat.dis-nn cond-mat.mtrl-sci cs.LG

    Recurrent Localization Networks applied to the Lippmann-Schwinger Equation

    Authors: Conlain Kelly, Surya R. Kalidindi

    Abstract: The bulk of computational approaches for modeling physical systems in materials science derive from either analytical (i.e. physics based) or data-driven (i.e. machine-learning based) origins. In order to combine the strengths of these two approaches, we advance a novel machine learning approach for solving equations of the generalized Lippmann-Schwinger (L-S) type. In this paradigm, a given probl…

    Submitted 21 September, 2021; v1 submitted 29 January, 2021; originally announced February 2021.

    Comments: 20 pages, 10 figures. Accepted to Computational Materials Science

    Journal ref: Computational Materials Science Volume 192, May 2021, 110356

  12. Testing And Hardening IoT Devices Against the Mirai Botnet

    Authors: Christopher Kelly, Nikolaos Pitropakis, Sean McKeown, Costas Lambrinoudakis

    Abstract: A large majority of cheap Internet of Things (IoT) devices that arrive brand new, and are configured with out-of-the-box settings, are not being properly secured by the manufactures, and are vulnerable to existing malware lurking on the Internet. Among them is the Mirai botnet which has had its source code leaked to the world, allowing any malicious actor to configure and unleash it. A combination…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: 8 pages, conference paper

  13. arXiv:1809.04430  [pdf, other]

    cs.CV cs.LG cs.NE physics.med-ph stat.ML

    Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy

    Authors: Stanislav Nikolov, Sam Blackwell, Alexei Zverovitch, Ruheena Mendes, Michelle Livne, Jeffrey De Fauw, Yojan Patel, Clemens Meyer, Harry Askham, Bernardino Romera-Paredes, Christopher Kelly, Alan Karthikesalingam, Carlton Chu, Dawn Carnell, Cheng Boon, Derek D'Souza, Syed Ali Moinuddin, Bethany Garie, Yasmin McQuinlan, Sarah Ireland, Kiarna Hampton, Krystle Fuller, Hugh Montgomery, Geraint Rees, Mustafa Suleyman, et al. (4 additional authors not shown)

    Abstract: Over half a million individuals are diagnosed with head and neck cancer each year worldwide. Radiotherapy is an important curative treatment for this disease, but it requires manual time consuming delineation of radio-sensitive organs at risk (OARs). This planning process can delay treatment, while also introducing inter-operator variability with resulting downstream radiation dose differences. Wh…

    Submitted 13 January, 2021; v1 submitted 12 September, 2018; originally announced September 2018.

  14. arXiv:1711.04883  [pdf, other]

    cs.DC cs.AI hep-lat

    Accelerating HPC codes on Intel(R) Omni-Path Architecture networks: From particle physics to Machine Learning

    Authors: Peter Boyle, Michael Chuvelev, Guido Cossu, Christopher Kelly, Christoph Lehner, Lawrence Meadows

    Abstract: We discuss practical methods to ensure near wirespeed performance from clusters with either one or two Intel(R) Omni-Path host fabric interfaces (HFI) per node, and Intel(R) Xeon Phi(TM) 72xx (Knight's Landing) processors, and using the Linux operating system. The study evaluates the performance improvements achievable and the required programming approaches in two distinct example problems: fir…

    Submitted 13 November, 2017; originally announced November 2017.

    Comments: 17 pages, 5 figures