
Showing 1–40 of 40 results for author: Patel, Y

Searching in archive cs.
  1. arXiv:2406.16012  [pdf]

    eess.IV cs.CV

    Wound Tissue Segmentation in Diabetic Foot Ulcer Images Using Deep Learning: A Pilot Study

    Authors: Mrinal Kanti Dhar, Chuanbo Wang, Yash Patel, Taiyu Zhang, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Keke Chen, Zeyun Yu

    Abstract: Identifying individual tissues, so-called tissue segmentation, in diabetic foot ulcer (DFU) images is a challenging task and little work has been published, largely due to the limited availability of a clinical image dataset. To address this gap, we have created a DFUTissue dataset for the research community to evaluate wound tissue segmentation algorithms. The dataset contains 110 images with tis…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.06474  [pdf, other]

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  3. arXiv:2402.03500  [pdf, other]

    quant-ph cs.AI cs.LG

    Curriculum reinforcement learning for quantum architecture search under hardware errors

    Authors: Yash J. Patel, Akash Kundu, Mateusz Ostaszewski, Xavier Bonet-Monroig, Vedran Dunjko, Onur Danaci

    Abstract: The key challenge in the noisy intermediate-scale quantum era is finding useful circuits compatible with current device limitations. Variational quantum algorithms (VQAs) offer a potential solution by fixing the circuit architecture and optimizing individual gate parameters in an external loop. However, parameter optimization can become intractable, and the overall performance of the algorithm dep…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 32 pages, 11 figures, 6 tables. Accepted at ICLR 2024

  4. arXiv:2401.14351  [pdf, other]

    cs.LG cs.DC

    ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

    Authors: Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

    Abstract: This paper presents ServerlessLLM, a locality-enhanced serverless inference system for Large Language Models (LLMs). ServerlessLLM exploits the substantial capacity and bandwidth of storage and memory devices available on GPU servers, thereby reducing costly remote checkpoint downloads and achieving efficient checkpoint loading. ServerlessLLM achieves this through three main contributions: (i) fas…

    Submitted 25 January, 2024; originally announced January 2024.

  5. arXiv:2310.10003  [pdf, other]

    stat.ME cs.LG stat.ML

    Conformal Contextual Robust Optimization

    Authors: Yash Patel, Sahana Rayan, Ambuj Tewari

    Abstract: Data-driven approaches to predict-then-optimize decision-making problems seek to mitigate the risk of uncertainty region misspecification in safety-critical settings. Current approaches, however, suffer from considering overly conservative uncertainty regions, often resulting in suboptimal decisionmaking. To this end, we propose Conformal-Predict-Then-Optimize (CPO), a framework for leveraging hig…

    Submitted 15 October, 2023; originally announced October 2023.

  6. arXiv:2308.11877  [pdf]

    cs.CV cs.AI

    Integrated Image and Location Analysis for Wound Classification: A Deep Learning Approach

    Authors: Yash Patel, Tirth Shah, Mrinal Kanti Dhar, Taiyu Zhang, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Zeyun Yu

    Abstract: The global burden of acute and chronic wounds presents a compelling case for enhancing wound classification methods, a vital step in diagnosing and determining optimal treatments. Recognizing this need, we introduce an innovative multi-modal network based on a deep convolutional neural network for categorizing wounds into four categories: diabetic, pressure, surgical, and venous ulcers. Our multi-…

    Submitted 23 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

  7. arXiv:2307.11122  [pdf, other]

    astro-ph.IM cs.LG stat.AP

    Diffusion Models for Probabilistic Deconvolution of Galaxy Images

    Authors: Zhiwei Xue, Yuhang Li, Yash Patel, Jeffrey Regier

    Abstract: Telescopes capture images with a particular point spread function (PSF). Inferring what an image would have looked like with a much sharper PSF, a problem known as PSF deconvolution, is ill-posed because PSF convolution is not an invertible transformation. Deep generative models are appealing for PSF deconvolution because they can infer a posterior distribution over candidate images that, if convo…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted to the ICML 2023 Workshop on Machine Learning for Astrophysics

  8. WiscSort: External Sorting For Byte-Addressable Storage

    Authors: Vinay Banakar, Kan Wu, Yuvraj Patel, Kimberly Keeton, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

    Abstract: We present WiscSort, a new approach to high-performance concurrent sorting for existing and future byte-addressable storage (BAS) devices. WiscSort carefully reduces writes, exploits random reads by splitting keys and values during sorting, and performs interference-aware scheduling with thread pool sizing to avoid I/O bandwidth degradation. We introduce the BRAID model which encompasses the uniqu…

    Submitted 12 July, 2023; originally announced July 2023.

  9. arXiv:2306.11086  [pdf, other]

    quant-ph cs.AI cs.LG

    Enhancing variational quantum state diagonalization using reinforcement learning techniques

    Authors: Akash Kundu, Przemysław Bedełek, Mateusz Ostaszewski, Onur Danaci, Yash J. Patel, Vedran Dunjko, Jarosław A. Miszczak

    Abstract: The variational quantum algorithms are crucial for the application of NISQ computers. Such algorithms require short quantum circuits, which are more amenable to implementation on near-term hardware, and many such methods have been developed. One of particular interest is the so-called variational quantum state diagonalization method, which constitutes an important algorithmic subroutine and can be…

    Submitted 11 January, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: 24 pages with 13 figures, accepted in the New Journal of Physics, code available at https://github.com/iitis/RL_for_VQSD_ansatz_optimization

    Journal ref: New Journal of Physics, 26, 013034 (2024)

  10. arXiv:2305.14275  [pdf, other]

    stat.ME cs.LG

    Amortized Variational Inference with Coverage Guarantees

    Authors: Yash Patel, Declan McNamara, Jackson Loper, Jeffrey Regier, Ambuj Tewari

    Abstract: Amortized variational inference produces a posterior approximation that can be rapidly computed given any new observation. Unfortunately, there are few guarantees about the quality of these approximate posteriors. We propose Conformalized Amortized Neural Variational Inference (CANVI), a procedure that is scalable, easily implemented, and provides guaranteed marginal coverage. Given a collection o…

    Submitted 15 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  11. arXiv:2305.02961  [pdf]

    cs.CV

    FUSegNet: A Deep Convolutional Neural Network for Foot Ulcer Segmentation

    Authors: Mrinal Kanti Dhar, Taiyu Zhang, Yash Patel, Sandeep Gopalakrishnan, Zeyun Yu

    Abstract: This paper presents FUSegNet, a new model for foot ulcer segmentation in diabetes patients, which uses the pre-trained EfficientNet-b7 as a backbone to address the issue of limited training samples. A modified spatial and channel squeeze-and-excitation (scSE) module called parallel scSE or P-scSE is proposed that combines additive and max-out scSE. A new arrangement is introduced for the module by…

    Submitted 26 January, 2024; v1 submitted 4 May, 2023; originally announced May 2023.

  12. arXiv:2305.02024  [pdf, other]

    cs.CV

    Neural Network Training and Non-Differentiable Objective Functions

    Authors: Yash Patel

    Abstract: Many important computer vision tasks are naturally formulated to have a non-differentiable objective. Therefore, the standard, dominant training procedure of a neural network is not applicable since back-propagation requires the gradients of the objective with respect to the output of the model. Most deep learning methods side-step the problem sub-optimally by using a proxy loss for training, whic…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: Ph.D. Dissertation (under review). Supervisor: Prof. Jiri Matas

  13. arXiv:2302.05658  [pdf, other]

    cs.CL cs.AI cs.LG

    DocILE Benchmark for Document Information Localization and Extraction

    Authors: Štěpán Šimsa, Milan Šulc, Michal Uřičář, Yash Patel, Ahmed Hamdi, Matěj Kocián, Matyáš Skalický, Jiří Matas, Antoine Doucet, Mickaël Coustaty, Dimosthenis Karatzas

    Abstract: This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition. It contains 6.7k annotated business documents, 100k synthetically generated documents, and nearly 1M unlabeled documents for unsupervised pre-training. The dataset has been built with knowledge of domain- and task-specific…

    Submitted 3 May, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted to ICDAR 2023

  14. arXiv:2302.03432  [pdf, other]

    cs.CV

    SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation

    Authors: Yash Patel, Yusheng Xie, Yi Zhu, Srikar Appalaraju, R. Manmatha

    Abstract: Learning to segment images purely by relying on the image-text alignment from web data can lead to sub-optimal performance due to noise in the data. The noise comes from the samples where the associated text does not correlate with the image's visual content. Instead of purely relying on the alignment from the noisy data, this paper proposes a novel loss function termed SimCon, which accounts for…

    Submitted 7 February, 2023; originally announced February 2023.

  15. arXiv:2301.12394  [pdf, other]

    cs.LG cs.AI

    DocILE 2023 Teaser: Document Information Localization and Extraction

    Authors: Štěpán Šimsa, Milan Šulc, Matyáš Skalický, Yash Patel, Ahmed Hamdi

    Abstract: The lack of data for information extraction (IE) from semi-structured business documents is a real problem for the IE community. Publications relying on large-scale datasets use only proprietary, unpublished data due to the sensitive nature of such documents. Publicly available datasets are mostly small and domain-specific. The absence of a large-scale public dataset or benchmark hinders the repro…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: Accepted to ECIR 2023

  16. arXiv:2301.09007  [pdf, other]

    cs.CV

    MultiNet with Transformers: A Model for Cancer Diagnosis Using Images

    Authors: Hosein Barzekar, Yash Patel, Ling Tong, Zeyun Yu

    Abstract: Cancer is a leading cause of death in many countries. An early diagnosis of cancer based on biomedical imaging ensures effective treatment and a better prognosis. However, biomedical imaging presents challenges to both clinical institutions and researchers. Physiological anomalies are often characterized by slight abnormalities in individual cells or tissues, making them difficult to detect visual…

    Submitted 21 January, 2023; originally announced January 2023.

  17. arXiv:2301.02280  [pdf, other]

    cs.CV

    Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

    Authors: Filip Radenovic, Abhimanyu Dubey, Abhishek Kadian, Todor Mihaylov, Simon Vandenhende, Yash Patel, Yi Wen, Vignesh Ramanathan, Dhruv Mahajan

    Abstract: Vision-language models trained with contrastive learning on large-scale noisy data are becoming increasingly popular for zero-shot recognition problems. In this paper we improve the following three aspects of the contrastive pre-training pipeline: dataset noise, model initialization and the training objective. First, we propose a straightforward filtering strategy titled Complexity, Action, and Te…

    Submitted 29 March, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: CVPR 2023

  18. arXiv:2212.13185  [pdf, other]

    cs.CV

    Generalized Differentiable RANSAC

    Authors: Tong Wei, Yash Patel, Alexander Shekhovtsov, Jiri Matas, Daniel Barath

    Abstract: We propose $\nabla$-RANSAC, a generalized differentiable RANSAC that allows learning the entire randomized robust estimation pipeline. The proposed approach enables the use of relaxation techniques for estimating the gradients in the sampling distribution, which are then propagated through a differentiable solver. The trainable quality function marginalizes over the scores from all the models esti…

    Submitted 8 September, 2023; v1 submitted 26 December, 2022; originally announced December 2022.

  19. arXiv:2211.03646  [pdf, other]

    cs.CV cs.LG

    Contrastive Classification and Representation Learning with Probabilistic Interpretation

    Authors: Rahaf Aljundi, Yash Patel, Milan Sulc, Daniel Olmeda, Nikolay Chumerin

    Abstract: Cross entropy loss has served as the main objective function for classification-based tasks. Widely deployed for learning neural network classifiers, it shows both effectiveness and a probabilistic interpretation. Recently, after the success of self supervised contrastive representation learning methods, supervised contrastive methods have been proposed to learn representations and have shown supe…

    Submitted 7 November, 2022; originally announced November 2022.

  20. Reinforcement Learning Assisted Recursive QAOA

    Authors: Yash J. Patel, Sofiene Jerbi, Thomas Bäck, Vedran Dunjko

    Abstract: Variational quantum algorithms such as the Quantum Approximation Optimization Algorithm (QAOA) in recent years have gained popularity as they provide the hope of using NISQ devices to tackle hard combinatorial optimization problems. It is, however, known that at low depth, certain locality constraints of QAOA limit its performance. To go beyond these limitations, a non-local variant of QAOA, namel…

    Submitted 5 February, 2024; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: 17 pages, 6 figures. EPJ Quantum Technology journal version

    Journal ref: EPJ Quantum Technol. 11, 6 (2024)

  21. arXiv:2204.07942  [pdf]

    eess.IV cs.CV cs.LG

    Wound Severity Classification using Deep Neural Network

    Authors: D. M. Anisuzzaman, Yash Patel, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Zeyun Yu

    Abstract: The classification of wound severity is a critical step in wound diagnosis. An effective classifier can help wound professionals categorize wound conditions more quickly and affordably, allowing them to choose the best treatment option. This study used wound photos to construct a deep neural network-based wound severity classifier that classified them into one of three classes: green, yellow, or r…

    Submitted 17 April, 2022; originally announced April 2022.

    Comments: 19 pages, 5 figures, 5 tables

  22. Multi-modal Wound Classification using Wound Image and Location by Deep Neural Network

    Authors: D. M. Anisuzzaman, Yash Patel, Behrouz Rostami, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Zeyun Yu

    Abstract: Wound classification is an essential step of wound diagnosis. An efficient classifier can assist wound specialists in classifying wound types with less financial and time costs and help them decide an optimal treatment procedure. This study developed a deep neural network-based multi-modal classifier using wound images and their corresponding locations to categorize wound images into multiple clas…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: 30 pages, 10 figures, 15 tables

    Journal ref: Sci Rep 12, 20057 (2022)

  23. arXiv:2108.11179  [pdf, other]

    cs.CV

    Recall@k Surrogate Loss with Large Batches and Similarity Mixup

    Authors: Yash Patel, Giorgos Tolias, Jiri Matas

    Abstract: This work focuses on learning deep visual representation models for retrieval by exploring the interplay between a new loss function, the batch size, and a new regularization approach. Direct optimization, by gradient descent, of an evaluation metric, is not possible when it is non-differentiable, which is the case for recall in retrieval. A differentiable surrogate loss for the recall is proposed…

    Submitted 25 March, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: CVPR 2022 camera-ready version

  24. arXiv:2103.04635  [pdf, other]

    cs.CV

    FEDS -- Filtered Edit Distance Surrogate

    Authors: Yash Patel, Jiri Matas

    Abstract: This paper proposes a procedure to train a scene text recognition model using a robust learned surrogate of edit distance. The proposed method borrows from self-paced learning and filters out the training examples that are hard for the surrogate. The filtering is performed by judging the quality of the approximation, using a ramp function, enabling end-to-end training. Following the literature, th…

    Submitted 26 May, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: ICDAR 2021 camera-ready version

  25. arXiv:2010.11659  [pdf, other]

    cs.SD cs.LG eess.AS

    Neural Network-based Acoustic Vehicle Counting

    Authors: Slobodan Djukanović, Yash Patel, Jiří Matas, Tuomas Virtanen

    Abstract: This paper addresses acoustic vehicle counting using one-channel audio. We predict the pass-by instants of vehicles from local minima of clipped vehicle-to-microphone distance. This distance is predicted from audio using a two-stage (coarse-fine) regression, with both stages realised via neural networks (NNs). Experiments show that the NN-based distance regression outperforms by far the previously…

    Submitted 27 March, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

  26. arXiv:2009.14799  [pdf, other]

    cs.LG stat.ML

    MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention

    Authors: Carson Eisenach, Yagna Patel, Dhruv Madeka

    Abstract: Recent advances in neural forecasting have produced major improvements in accuracy for probabilistic demand prediction. In this work, we propose novel improvements to the current state of the art by incorporating changes inspired by recent advances in Transformer architectures for Natural Language Processing. We develop a novel decoder-encoder attention for context-alignment, improving forecasting…

    Submitted 26 January, 2022; v1 submitted 30 September, 2020; originally announced September 2020.

  27. A Mobile App for Wound Localization using Deep Learning

    Authors: D. M. Anisuzzaman, Yash Patel, Jeffrey Niezgoda, Sandeep Gopalakrishnan, Zeyun Yu

    Abstract: We present an automated wound localizer from 2D wound and ulcer images by using deep neural network, as the first step towards building an automated and complete wound diagnostic system. The wound localizer has been developed by using YOLOv3 model, which is then turned into an iOS mobile application. The developed localizer can detect the wound and its surrounding tissues and isolate the localized…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: 8 pages, 5 figures, 1 table

    Journal ref: IEEE Access. 30 May 2022

  28. arXiv:2007.00799  [pdf, other]

    cs.CV

    Learning Surrogates via Deep Embedding

    Authors: Yash Patel, Tomas Hodan, Jiri Matas

    Abstract: This paper proposes a technique for training a neural network by minimizing a surrogate loss that approximates the target evaluation metric, which may be non-differentiable. The surrogate is learned via a deep embedding where the Euclidean distance between the prediction and the ground truth corresponds to the value of the evaluation metric. The effectiveness of the proposed technique is demonstra…

    Submitted 17 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 camera-ready version

  29. arXiv:2002.04988  [pdf, other]

    eess.IV cs.CV

    Saliency Driven Perceptual Image Compression

    Authors: Yash Patel, Srikar Appalaraju, R. Manmatha

    Abstract: This paper proposes a new end-to-end trainable model for lossy image compression, which includes several novel components. The method incorporates 1) an adequate perceptual similarity metric; 2) saliency in the images; 3) a hierarchical auto-regressive model. This paper demonstrates that the popularly used evaluations metrics such as MS-SSIM and PSNR are inadequate for judging the performance of i…

    Submitted 8 November, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: WACV 2021 camera-ready version

  30. arXiv:1909.10699  [pdf, other]

    cs.CL cs.IR cs.LG

    LitGen: Genetic Literature Recommendation Guided by Human Explanations

    Authors: Allen Nie, Arturo L. Pineda, Matt W. Wright, Hannah Wand, Bryan Wulf, Helio A. Costa, Ronak Y. Patel, Carlos D. Bustamante, James Zou

    Abstract: As genetic sequencing costs decrease, the lack of clinical interpretation of variants has become the bottleneck in using genetics data. A major rate limiting step in clinical interpretation is the manual curation of evidence in the genetic literature by highly trained biocurators. What makes curation particularly time-consuming is that the curator needs to identify papers that study variant pathog…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: 12 pages; 5 figures. Accepted by PSB 2020 (Pacific Symposium on Biocomputing) track: Artificial Intelligence for Enhancing Clinical Medicine

  31. arXiv:1908.04187  [pdf, other]

    eess.IV cs.CV

    Human Perceptual Evaluations for Image Compression

    Authors: Yash Patel, Srikar Appalaraju, R. Manmatha

    Abstract: Recently, there has been much interest in deep learning techniques to do image compression and there have been claims that several of these produce better results than engineered compression schemes (such as JPEG, JPEG2000 or BPG). A standard way of comparing image compression schemes today is to use perceptual similarity metrics such as PSNR or MS-SSIM (multi-scale structural similarity). This ha…

    Submitted 9 August, 2019; originally announced August 2019.

    Comments: arXiv admin note: text overlap with arXiv:1907.08310

  32. arXiv:1907.08310  [pdf, other]

    eess.IV cs.CV

    Deep Perceptual Compression

    Authors: Yash Patel, Srikar Appalaraju, R. Manmatha

    Abstract: Several deep learned lossy compression techniques have been proposed in the recent literature. Most of these are optimized by using either MS-SSIM (multi-scale structural similarity) or MSE (mean squared error) as a loss function. Unfortunately, neither of these correlate well with human perception and this is clearly visible from the resulting compressed images. In several cases, the MS-SSIM for…

    Submitted 31 July, 2019; v1 submitted 18 July, 2019; originally announced July 2019.

  33. arXiv:1907.00945  [pdf, ps, other]

    cs.CV

    ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition -- RRC-MLT-2019

    Authors: Nibal Nayef, Yash Patel, Michal Busta, Pinaki Nath Chowdhury, Dimosthenis Karatzas, Wafa Khlif, Jiri Matas, Umapada Pal, Jean-Christophe Burie, Cheng-lin Liu, Jean-Marc Ogier

    Abstract: With the growing cosmopolitan culture of modern cities, the need of robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been more immense. With the goal to systematically benchmark and push the state-of-the-art forward, the proposed competition builds on top of the RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a la…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: ICDAR'19 camera-ready version. Competition available at https://rrc.cvc.uab.es/?ch=15. The first two authors contributed equally

  34. arXiv:1902.00378  [pdf, other]

    cs.CV

    Self-Supervised Visual Representations for Cross-Modal Retrieval

    Authors: Yash Patel, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas, C. V. Jawahar

    Abstract: Cross-modal retrieval methods have been significantly improved in last years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are usually limited to discrete sets of popular visual classes that may not be representative of…

    Submitted 31 January, 2019; originally announced February 2019.

    Comments: arXiv admin note: text overlap with arXiv:1807.02110

  35. arXiv:1812.10252  [pdf, other]

    q-fin.TR cs.LG stat.ML

    Optimizing Market Making using Multi-Agent Reinforcement Learning

    Authors: Yagna Patel

    Abstract: In this paper, reinforcement learning is applied to the problem of optimizing market making. A multi-agent reinforcement learning framework is used to optimally place limit orders that lead to successful trades. The framework consists of two agents. The macro-agent optimizes on making the decision to buy, sell, or hold an asset. The micro-agent optimizes on placing limit orders within the limit or…

    Submitted 26 December, 2018; originally announced December 2018.

    Comments: 10 pages, 12 figures

  36. arXiv:1809.04430  [pdf, other]

    cs.CV cs.LG cs.NE physics.med-ph stat.ML

    Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy

    Authors: Stanislav Nikolov, Sam Blackwell, Alexei Zverovitch, Ruheena Mendes, Michelle Livne, Jeffrey De Fauw, Yojan Patel, Clemens Meyer, Harry Askham, Bernardino Romera-Paredes, Christopher Kelly, Alan Karthikesalingam, Carlton Chu, Dawn Carnell, Cheng Boon, Derek D'Souza, Syed Ali Moinuddin, Bethany Garie, Yasmin McQuinlan, Sarah Ireland, Kiarna Hampton, Krystle Fuller, Hugh Montgomery, Geraint Rees, Mustafa Suleyman, et al. (4 additional authors not shown)

    Abstract: Over half a million individuals are diagnosed with head and neck cancer each year worldwide. Radiotherapy is an important curative treatment for this disease, but it requires manual time consuming delineation of radio-sensitive organs at risk (OARs). This planning process can delay treatment, while also introducing inter-operator variability with resulting downstream radiation dose differences. Wh…

    Submitted 13 January, 2021; v1 submitted 12 September, 2018; originally announced September 2018.

  37. arXiv:1807.02110  [pdf, other]

    cs.CV

    TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces

    Authors: Yash Patel, Lluis Gomez, Raul Gomez, Marçal Rusiñol, Dimosthenis Karatzas, C. V. Jawahar

    Abstract: The immense success of deep learning based methods in computer vision heavily relies on large scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort and annotations are limited to popular set of classes. As an alternative, learning visual features by designi…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: arXiv admin note: text overlap with arXiv:1705.08631

  38. arXiv:1805.07641  [pdf, other]

    cs.CV cs.LG

    Learning Sampling Policies for Domain Adaptation

    Authors: Yash Patel, Kashyap Chitta, Bhavan Jasani

    Abstract: We address the problem of semi-supervised domain adaptation of classification algorithms through deep Q-learning. The core idea is to consider the predictions of a source domain network on target domain data as noisy labels, and learn a policy to sample from this data so as to maximize classification accuracy on a small annotated reward partition of the target domain. Our experiments show that lea…

    Submitted 19 May, 2018; originally announced May 2018.

  39. arXiv:1801.09919  [pdf]

    cs.CV

    E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text

    Authors: Michal Bušta, Yash Patel, Jiri Matas

    Abstract: An end-to-end trainable (fully differentiable) method for multi-language scene text localization and recognition is proposed. The approach is based on a single fully convolutional network (FCN) with shared layers for both tasks. E2E-MLT is the first published multi-language OCR for scene text. While trained in multi-language setup, E2E-MLT demonstrates competitive performance when compared to ot…

    Submitted 5 December, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

  40. arXiv:1705.08631  [pdf, other]

    cs.CV

    Self-supervised learning of visual features through embedding images into text topic spaces

    Authors: Lluis Gomez, Yash Patel, Marçal Rusiñol, Dimosthenis Karatzas, C. V. Jawahar

    Abstract: End-to-end training from scratch of current deep architectures for new computer vision problems would require Imagenet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning o…

    Submitted 24 May, 2017; originally announced May 2017.

    Comments: Accepted CVPR 2017 paper