
Showing 1–44 of 44 results for author: Huo, Z

Searching in archive cs.
  1. arXiv:2406.06512  [pdf, other]

    cs.CV cs.AI

    Merlin: A Vision Language Foundation Model for 3D Computed Tomography

    Authors: Louis Blankemeier, Joseph Paul Cohen, Ashwin Kumar, Dave Van Veen, Syed Jamal Safdar Gardezi, Magdalini Paschali, Zhihong Chen, Jean-Benoit Delbrouck, Eduardo Reis, Cesar Truyts, Christian Bluethgen, Malte Engmann Kjeldskov Jensen, Sophie Ostmeier, Maya Varma, Jeya Maria Jose Valanarasu, Zhongnan Fang, Zepeng Huo, Zaid Nabulsi, Diego Ardila, Wei-Hung Weng, Edson Amaro Junior, Neera Ahuja, Jason Fries, Nigam H. Shah, Andrew Johnston, et al. (6 additional authors not shown)

    Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current radiologist shortage, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies. Prior state-of-the-art approaches for automated medical image interpretation leverage vision la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures

  2. arXiv:2404.09173  [pdf, other]

    cs.LG cs.AI cs.CL

    TransformerFAM: Feedback attention is working memory

    Authors: Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, Khe Chai Sim, Pedro Moreno Mengibar

    Abstract: While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that leverages a feedback loop to enable the network to attend to its own latent representations. This design fosters the emergence of working memory within the Transformer, a…

    Submitted 7 May, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: 26 pages, 12 figures, 14 tables

  3. arXiv:2311.10798  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV

    INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

    Authors: Shih-Cheng Huang, Zepeng Huo, Ethan Steinberg, Chia-Chun Chiang, Matthew P. Lungren, Curtis P. Langlotz, Serena Yeung, Nigam H. Shah, Jason A. Fries

    Abstract: Synthesizing information from multiple data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patien…

    Submitted 17 November, 2023; originally announced November 2023.

  4. arXiv:2309.09996  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Improving Speech Recognition for African American English With Audio Classification

    Authors: Shefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara Sainath, Françoise Beaufays, Pedro Moreno Mengibar

    Abstract: Automatic speech recognition (ASR) systems have been shown to have large quality disparities between the language varieties they are intended or expected to recognize. One way to mitigate this is to train or fine-tune models with more representative datasets. But this approach can be hindered by limited in-domain data for training and evaluation. We propose a new way to improve the robustness of a…

    Submitted 16 September, 2023; originally announced September 2023.

  5. arXiv:2308.14089  [pdf, other]

    cs.CL cs.AI cs.LG

    MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

    Authors: Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo P. Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju S. Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott J. Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, et al. (5 additional authors not shown)

    Abstract: The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture…

    Submitted 24 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

  6. arXiv:2302.01496  [pdf, ps, other]

    cs.CL cs.LG cs.SD eess.AS

    Efficient Domain Adaptation for Speech Foundation Models

    Authors: Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Francoise Beaufays

    Abstract: Foundation models (FMs), that are trained on broad data at scale and are adaptable to a wide range of downstream tasks, have brought large interest in the research community. Benefiting from the diverse data sources such as different modalities, languages and application domains, foundation models have demonstrated strong generalization and knowledge transfer capabilities. In this paper, we presen…

    Submitted 2 February, 2023; originally announced February 2023.

  7. arXiv:2211.02712  [pdf, other]

    cs.LG cs.SD eess.AS

    Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion

    Authors: Zhouyuan Huo, Khe Chai Sim, Bo Li, Dongseong Hwang, Tara N. Sainath, Trevor Strohman

    Abstract: Self-supervised pre-training of a speech foundation model, followed by supervised fine-tuning, has shown impressive quality improvements on automatic speech recognition (ASR) tasks. Fine-tuning separate foundation models for many downstream tasks are expensive since the foundation model is usually very big. Parameter-efficient fine-tuning methods (e.g. adapter, sparse update methods) offer an alte…

    Submitted 4 November, 2022; originally announced November 2022.

  8. arXiv:2210.07353  [pdf, other]

    cs.CL cs.SD eess.AS

    JOIST: A Joint Speech and Text Streaming Model For ASR

    Authors: Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

    Abstract: We present JOIST, an algorithm to train a streaming, cascaded, encoder end-to-end (E2E) model with both speech-text paired inputs, and text-only unpaired inputs. Unlike previous works, we explore joint training with both modalities, rather than pre-training and fine-tuning. In addition, we explore JOIST using a streaming E2E model with an order of magnitude more data, which are also novelties comp…

    Submitted 13 October, 2022; originally announced October 2022.

  9. DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness

    Authors: Zepeng Huo, Taowei Ji, Yifei Liang, Shuai Huang, Zhangyang Wang, Xiaoning Qian, Bobak Mortazavi

    Abstract: In wearable sensing applications, data is inevitable to be irregularly sampled or partially missing, which pose challenges for any downstream application. An unique aspect of wearable data is that it is time-series data and each channel can be correlated to another one, such as x, y, z axis of accelerometer. We argue that traditional methods have rarely made use of both times-series dynamics of th…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 5 pages, 2 figures, accepted in ICASSP'2022

  10. arXiv:2207.11382  [pdf, other]

    cs.LG

    Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data

    Authors: Zepeng Huo, Xiaoning Qian, Shuai Huang, Zhangyang Wang, Bobak J. Mortazavi

    Abstract: Medical events of interest, such as mortality, often happen at a low rate in electronic medical records, as most admitted patients survive. Training models with this imbalance rate (class density discrepancy) may lead to suboptimal prediction. Traditionally this problem is addressed through ad-hoc methods such as resampling or reweighting but performance in many cases is still limited. We propose…

    Submitted 29 July, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

  11. Predicting the meal macronutrient composition from continuous glucose monitors

    Authors: Zepeng Huo, Bobak J. Mortazavi, Theodora Chaspari, Nicolaas Deutz, Laura Ruebush, Ricardo Gutierrez-Osuna

    Abstract: Sustained high levels of blood glucose in type 2 diabetes (T2DM) can have disastrous long-term health consequences. An essential component of clinical interventions for T2DM is monitoring dietary intake to keep plasma glucose levels within an acceptable range. Yet, current techniques to monitor food intake are time intensive and error prone. To address this issue, we are developing techniques to a…

    Submitted 23 June, 2022; originally announced June 2022.

    Journal ref: In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), pp. 1-4. IEEE, 2019

  12. arXiv:2203.12668  [pdf, other]

    cs.LG cs.CL

    Pseudo Label Is Better Than Human Label

    Authors: Dongseong Hwang, Khe Chai Sim, Zhouyuan Huo, Trevor Strohman

    Abstract: State-of-the-art automatic speech recognition (ASR) systems are trained with tens of thousands of hours of labeled speech data. Human transcription is expensive and time consuming. Factors such as the quality and consistency of the transcription can greatly affect the performance of the ASR models trained with these data. In this paper, we show that we can train a strong teacher model to produce h…

    Submitted 1 July, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: 6 pages, 2 figures, 9 tables, Proceedings of INTERSPEECH 2022

  13. arXiv:2110.00165  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Large-scale ASR Domain Adaptation using Self- and Semi-supervised Learning

    Authors: Dongseong Hwang, Ananya Misra, Zhouyuan Huo, Nikhil Siddhartha, Shefali Garg, David Qiu, Khe Chai Sim, Trevor Strohman, Françoise Beaufays, Yanzhang He

    Abstract: Self- and semi-supervised learning methods have been actively investigated to reduce labeled training data or enhance the model performance. However, the approach mostly focus on in-domain performance for public datasets. In this study, we utilize the combination of self- and semi-supervised learning methods to solve unseen domain adaptation problem in a large-scale production setting for online A…

    Submitted 15 February, 2022; v1 submitted 30 September, 2021; originally announced October 2021.

    Comments: ICASSP 2022 accepted, 5 pages, 2 figures, 5 tables

  14. arXiv:2110.00155  [pdf, other]

    cs.SD cs.LG eess.AS

    Incremental Layer-wise Self-Supervised Learning for Efficient Speech Domain Adaptation On Device

    Authors: Zhouyuan Huo, Dongseong Hwang, Khe Chai Sim, Shefali Garg, Ananya Misra, Nikhil Siddhartha, Trevor Strohman, Françoise Beaufays

    Abstract: Streaming end-to-end speech recognition models have been widely applied to mobile devices and show significant improvement in efficiency. These models are typically trained on the server using transcribed speech data. However, the server data distribution can be very different from the data distribution on user devices, which could affect the model performance. There are two main challenges for on…

    Submitted 30 September, 2021; originally announced October 2021.

    Comments: 5 pages

  15. arXiv:2107.06917  [pdf, other]

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and…

    Submitted 14 July, 2021; originally announced July 2021.

  16. arXiv:2106.07820  [pdf, other]

    cs.LG cs.DC

    On Large-Cohort Training for Federated Learning

    Authors: Zachary Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith

    Abstract: Federated learning methods typically learn a model by iteratively sampling updates from a population of clients. In this work, we explore how the number of clients sampled at each round (the cohort size) impacts the quality of the learned model and the training dynamics of federated learning algorithms. Our work poses three fundamental questions. First, what challenges arise when trying to scale f…

    Submitted 14 June, 2021; originally announced June 2021.

  17. arXiv:2008.06233  [pdf, other]

    cs.LG stat.ML

    Privacy-Preserving Asynchronous Federated Learning Algorithms for Multi-Party Vertically Collaborative Learning

    Authors: Bin Gu, An Xu, Zhouyuan Huo, Cheng Deng, Heng Huang

    Abstract: The privacy-preserving federated learning for vertically partitioned data has shown promising results as the solution of the emerging multi-party joint modeling application, in which the data holders (such as government branches, private finance and e-business companies) collaborate throughout the learning process rather than relying on a trusted third party to hold data. However, existing federat…

    Submitted 14 August, 2020; originally announced August 2020.

  18. arXiv:2008.05823  [pdf, other]

    cs.LG cs.DC stat.ML

    Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

    Authors: An Xu, Zhouyuan Huo, Heng Huang

    Abstract: Although the distributed machine learning methods can speed up the training of large deep neural networks, the communication cost has become the non-negligible bottleneck to constrain the performance. To address this challenge, the gradient compression based communication-efficient distributed learning methods were designed to reduce the communication cost, and more recently the local error feedba…

    Submitted 24 January, 2022; v1 submitted 13 August, 2020; originally announced August 2020.

  19. Julia Language in Machine Learning: Algorithms, Applications, and Open Issues

    Authors: Kaifeng Gao, Gang Mei, Francesco Piccialli, Salvatore Cuomo, Jingzhi Tu, Zenan Huo

    Abstract: Machine learning is driving development across many fields in science and engineering. A simple and efficient programming language could accelerate applications of machine learning in various fields. Currently, the programming languages most commonly used to develop machine learning algorithms include Python, MATLAB, and C/C++. However, none of these languages well balance both efficiency and sim…

    Submitted 17 May, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

    Comments: Published in Computer Science Review

    Journal ref: Computer Science Review, Volume 37, 2020, 100254

  20. arXiv:2003.01753  [pdf, other]

    cs.LG stat.ML

    Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery

    Authors: Zepeng Huo, Arash PakBin, Xiaohan Chen, Nathan Hurley, Ye Yuan, Xiaoning Qian, Zhangyang Wang, Shuai Huang, Bobak Mortazavi

    Abstract: Activity recognition in wearable computing faces two key challenges: i) activity characteristics may be context-dependent and change under different contexts or situations; ii) unknown contexts and activities may occur from time to time, requiring flexibility and adaptability of the algorithm. We develop a context-aware mixture of deep models termed the α-β network coupled with uncertainty q…

    Submitted 3 March, 2020; originally announced March 2020.

    Comments: 10 pages, 5 figures, accepted by AISTATS 2020

  21. arXiv:2002.11082  [pdf, other]

    cs.LG cs.DC stat.ML

    Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training

    Authors: An Xu, Zhouyuan Huo, Heng Huang

    Abstract: The communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications. In particular, the growing size of deep learning models leads to higher communication overheads that defy the ideal linear training speedup regarding the number of devices. Gradient quantization is one of the common methods to reduce communication costs. However, it can…

    Submitted 25 February, 2020; originally announced February 2020.

  22. arXiv:2002.02090  [pdf, other]

    cs.LG cs.DC stat.ML

    Faster On-Device Training Using New Federated Momentum Algorithm

    Authors: Zhouyuan Huo, Qian Yang, Bin Gu, Lawrence Carin, Heng Huang

    Abstract: Mobile crowdsensing has gained significant attention in recent years and has become a critical paradigm for emerging Internet of Things applications. The sensing devices continuously generate a significant quantity of data, which provide tremendous opportunities to develop innovative intelligent applications. To utilize these data to train machine learning models while not compromising user privac…

    Submitted 5 February, 2020; originally announced February 2020.

  23. arXiv:2002.01576  [pdf, other]

    cs.LG stat.ML

    Large Batch Training Does Not Need Warmup

    Authors: Zhouyuan Huo, Bin Gu, Heng Huang

    Abstract: Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications. However, the optimizer converges slowly at early epochs and there is a gap between large-batch deep learning optimization heuristics and theoretical underpinnings. In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch…

    Submitted 4 February, 2020; originally announced February 2020.

  24. juSFEM: A Julia-based Open-source Package of Parallel Smoothed Finite Element Method (S-FEM) for Elastic Problems

    Authors: Zenan Huo, Gang Mei, Nengxiong Xu

    Abstract: The Smoothed Finite Element Method (S-FEM) proposed by Liu G.R. can achieve more accurate results than the conventional FEM. Currently, much commercial software and many open-source packages have been developed to analyze various science and engineering problems using the FEM. However, there is little work focusing on designing and developing software or packages for the S-FEM. In this paper, we d…

    Submitted 23 January, 2020; originally announced January 2020.

    Comments: Revised version submitted to Computers & Mathematics with Applications on Dec. 4, 2019

    Journal ref: Computers & Mathematics with Applications, 2020

  25. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  26. arXiv:1910.04235  [pdf, other]

    cs.LG stat.ML

    Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining

    Authors: Zhouyuan Huo, Heng Huang

    Abstract: Recently, reducing communication time between machines becomes the main focus of distributed data mining. Previous methods propose to make workers do more computation locally before aggregating local solutions in the server such that fewer communication rounds between server and workers are required. However, these methods do not consider reducing the communication time per round and work very poo…

    Submitted 9 October, 2019; originally announced October 2019.

  27. arXiv:1909.06695  [pdf, other]

    cs.CL cs.LG stat.ML

    Ouroboros: On Accelerating Training of Transformer-Based Language Models

    Authors: Qian Yang, Zhouyuan Huo, Wenlin Wang, Heng Huang, Lawrence Carin

    Abstract: Language models are essential for natural language processing (NLP) tasks, such as machine translation and text summarization. Remarkable performance has been demonstrated recently across many NLP domains via a Transformer-based language model with over a billion parameters, verifying the benefits of model size. Model parallelism is required if a model is too large to fit in a single computing dev…

    Submitted 14 September, 2019; originally announced September 2019.

    Comments: To appear in the proceedings of Neural Information Processing Systems Conference (2019)

  28. arXiv:1909.02625  [pdf, other]

    cs.LG cs.DC stat.ML

    On the Acceleration of Deep Learning Model Parallelism with Staleness

    Authors: An Xu, Zhouyuan Huo, Heng Huang

    Abstract: Training the deep convolutional neural network for computer vision problems is slow and inefficient, especially when it is large and distributed across multiple devices. The inefficiency is caused by the backpropagation algorithm's forward locking, backward locking, and update locking problems. Existing solutions for acceleration either can only handle one locking problem or lead to severe accurac…

    Submitted 19 January, 2022; v1 submitted 5 September, 2019; originally announced September 2019.

  29. arXiv:1907.00294  [pdf, other]

    eess.IV cs.CV

    Generative Mask Pyramid Network for CT/CBCT Metal Artifact Reduction with Joint Projection-Sinogram Correction

    Authors: Haofu Liao, Wei-An Lin, Zhimin Huo, Levon Vogelsang, William J. Sehnert, S. Kevin Zhou, Jiebo Luo

    Abstract: A conventional approach to computed tomography (CT) or cone beam CT (CBCT) metal artifact reduction is to replace the X-ray projection data within the metal trace with synthesized data. However, existing projection or sinogram completion methods cannot always produce anatomically consistent information to fill the metal trace, and thus, when the metallic implant is large, significant secondary art…

    Submitted 23 March, 2022; v1 submitted 29 June, 2019; originally announced July 2019.

    Comments: This paper is accepted to MICCAI 2019

  30. arXiv:1902.06158  [pdf, other]

    math.OC cs.LG stat.ML

    Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization

    Authors: Feihu Huang, Bin Gu, Zhouyuan Huo, Songcan Chen, Heng Huang

    Abstract: Proximal gradient method has been playing an important role to solve many machine learning tasks, especially for the nonsmooth problems. However, in some machine learning problems such as the bandit model and the black-box learning problem, proximal gradient method could fail because the explicit gradients of these problems are difficult or infeasible to obtain. The gradient-free (zeroth-order) me…

    Submitted 16 February, 2019; originally announced February 2019.

    Comments: AAAI-2019, 22 pages

  31. Adversarial Sparse-View CBCT Artifact Reduction

    Authors: Haofu Liao, Zhimin Huo, William J. Sehnert, Shaohua Kevin Zhou, Jiebo Luo

    Abstract: We present an effective post-processing method to reduce the artifacts from sparsely reconstructed cone-beam CT (CBCT) images. The proposed method is based on the state-of-the-art, image-to-image generative models with a perceptual loss as regulation. Unlike the traditional CT artifact-reduction approaches, our method is trained in an adversarial fashion that yields more perceptually realistic out…

    Submitted 9 December, 2018; originally announced December 2018.

    Journal ref: Medical Image Computing and Computer Assisted Intervention (MICCAI) 2018. Lecture Notes in Computer Science, vol 11070. Springer, Cham

  32. arXiv:1812.00477  [pdf, other]

    cs.CV

    Ego-Downward and Ambient Video based Person Location Association

    Authors: Liang Yang, Hao Jiang, Jizhong Xiao, Zhouyuan Huo

    Abstract: Using an ego-centric camera to do localization and tracking is highly needed for urban navigation and indoor assistive system when GPS is not available or not accurate enough. The traditional hand-designed feature tracking and estimation approach would fail without visible features. Recently, there are several works exploring to use context features to do localization. However, all of these suffer…

    Submitted 2 December, 2018; originally announced December 2018.

  33. arXiv:1807.04511  [pdf, other]

    cs.LG stat.ML

    Training Neural Networks Using Features Replay

    Authors: Zhouyuan Huo, Bin Gu, Heng Huang

    Abstract: Training a neural network using backpropagation algorithm requires passing error gradients sequentially through the network. The backward locking prevents us from updating network layers in parallel and fully leveraging the computing resources. Recently, there are several works trying to decouple and parallelize the backpropagation algorithm. However, all of them suffer from severe accuracy loss o…

    Submitted 29 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: NeurIPS 2018 Spotlight, Training deep learning faster, Convergence guarantee for Pipeline-based methods

  34. arXiv:1805.04634  [pdf, other]

    q-bio.QM cs.CV stat.AP stat.ML

    Image-derived generative modeling of pseudo-macromolecular structures - towards the statistical assessment of Electron CryoTomography template matching

    Authors: Kai Wen Wang, Xiangrui Zeng, Xiaodan Liang, Zhiguang Huo, Eric P. Xing, Min Xu

    Abstract: Cellular Electron CryoTomography (CECT) is a 3D imaging technique that captures information about the structure and spatial organization of macromolecular complexes within single cells, in near-native state and at sub-molecular resolution. Although template matching is often used to locate macromolecules in a CECT image, it is insufficient as it only measures the relative structural similarity. Th…

    Submitted 11 May, 2018; originally announced May 2018.

    Journal ref: British Machine Vision Conference (BMVC) 2018

  35. arXiv:1804.10574  [pdf, other]

    cs.LG stat.ML

    Decoupled Parallel Backpropagation with Convergence Guarantee

    Authors: Zhouyuan Huo, Bin Gu, Qian Yang, Heng Huang

    Abstract: Backpropagation algorithm is indispensable for the training of feedforward neural networks. It requires propagating error gradients sequentially from the output layer all the way back to the input layer. The backward locking in backpropagation algorithm constrains us from updating network layers in parallel and fully leveraging the computing resources. Recently, several algorithms have been propos…

    Submitted 21 July, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

    Comments: ICML 2018

  36. arXiv:1711.03937  [pdf, ps, other]

    cs.LG math.OC

    Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization

    Authors: Zhouyuan Huo, Bin Gu, Ji Liu, Heng Huang

    Abstract: Stochastic composition optimization draws much attention recently and has been successful in many emerging applications of machine learning, statistical analysis, and reinforcement learning. In this paper, we focus on the composition problem with nonsmooth regularization penalty. Previous works either have slow convergence rate or do not provide complete convergence analysis for the general proble…

    Submitted 28 December, 2017; v1 submitted 10 November, 2017; originally announced November 2017.

    Comments: AAAI 2018

  37. arXiv:1612.06003  [pdf, ps, other]

    cs.LG stat.ML

    Inexact Proximal Gradient Methods for Non-convex and Non-smooth Optimization

    Authors: Bin Gu, De Wang, Zhouyuan Huo, Heng Huang

    Abstract: In machine learning research, the proximal gradient methods are popular for solving various optimization problems with non-smooth regularization. Inexact proximal gradient methods are extremely important when exactly solving the proximal operator is time-consuming, or the proximal operator does not have an analytic solution. However, existing inexact proximal gradient methods only consider convex…

    Submitted 8 September, 2018; v1 submitted 18 December, 2016; originally announced December 2016.

    Comments: AAAI 2018

  38. arXiv:1612.01425  [pdf, ps, other]

    cs.LG

    Zeroth-order Asynchronous Doubly Stochastic Algorithm with Variance Reduction

    Authors: Bin Gu, Zhouyuan Huo, Heng Huang

    Abstract: Zeroth-order (derivative-free) optimization attracts a lot of attention in machine learning, because explicit gradient calculations may be computationally expensive or infeasible. To handle large scale problems both in volume and dimension, recently asynchronous doubly stochastic zeroth-order algorithms were proposed. The convergence rate of existing asynchronous doubly stochastic zeroth order alg…

    Submitted 5 December, 2016; originally announced December 2016.

  39. arXiv:1611.07174  [pdf, other]

    cs.CL cs.LG

    Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

    Authors: Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang

    Abstract: A deep learning approach has been widely applied in sequence modeling problems. In terms of automatic speech recognition (ASR), its performance has significantly been improved by increasing large speech corpus and deeper neural network. Especially, recurrent neural network and deep convolutional neural network have been applied in ASR successfully. Given the arising problem of training speed, we b…

    Submitted 26 December, 2016; v1 submitted 22 November, 2016; originally announced November 2016.

    Comments: 11 pages, 13 figures

  40. Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding

    Authors: Zheng Sun, Jiaqi Liu, Zewang Zhang, Jingwen Chen, Zhao Huo, Ching Hua Lee, Xiao Zhang

    Abstract: Creating aesthetically pleasing pieces of art, including music, has been a long-term goal for artificial intelligence research. Despite recent successes of long-short term memory (LSTM) recurrent neural networks (RNNs) in sequential learning, LSTM neural networks have not, by themselves, been able to generate natural-sounding music conforming to music theory. To transcend this inadequacy, we put f…

    Submitted 7 December, 2016; v1 submitted 16 November, 2016; originally announced November 2016.

    Comments: 6 pages, 4 figures

    Journal ref: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), p1864-1867

  41. arXiv:1610.09447  [pdf, ps, other]

    cs.LG

    Asynchronous Stochastic Block Coordinate Descent with Variance Reduction

    Authors: Bin Gu, Zhouyuan Huo, Heng Huang

    Abstract: Asynchronous parallel implementations for stochastic optimization have received huge successes in theory and practice recently. Asynchronous implementations with lock-free are more efficient than the one with writing or reading lock. In this paper, we focus on a composite objective function consisting of a smooth convex function $f$ and a block separable convex function, which widely exists in mac…

    Submitted 13 November, 2016; v1 submitted 28 October, 2016; originally announced October 2016.

  42. arXiv:1609.06804  [pdf, ps, other]

    cs.LG math.OC

    Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction

    Authors: Zhouyuan Huo, Bin Gu, Heng Huang

    Abstract: In the era of big data, optimizing large scale machine learning problems becomes a challenging task and draws significant attention. Asynchronous optimization algorithms come out as a promising solution. Recently, decoupled asynchronous proximal stochastic gradient descent (DAP-SGD) is proposed to minimize a composite function. It is claimed to be able to off-loads the computation bottleneck from…

    Submitted 28 September, 2016; v1 submitted 21 September, 2016; originally announced September 2016.

  43. arXiv:1605.09066  [pdf, ps, other]

    cs.LG

    Distributed Asynchronous Dual Free Stochastic Dual Coordinate Ascent

    Authors: Zhouyuan Huo, Heng Huang

    Abstract: The primal-dual distributed optimization methods have broad large-scale machine learning applications. Previous primal-dual distributed methods are not applicable when the dual formulation is not available, e.g. the sum-of-non-convex objectives. Moreover, these algorithms and theoretical analysis are based on the fundamental assumption that the computing speeds of multiple machines in a cluster ar…

    Submitted 26 October, 2017; v1 submitted 29 May, 2016; originally announced May 2016.

  44. arXiv:1604.03584  [pdf, other]

    cs.LG math.OC

    Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization

    Authors: Zhouyuan Huo, Heng Huang

    Abstract: We provide the first theoretical analysis on the convergence rate of the asynchronous stochastic variance reduced gradient (SVRG) descent algorithm on non-convex optimization. Recent studies have shown that the asynchronous stochastic gradient descent (SGD) based algorithms with variance reduction converge with a linear convergent rate on convex problems. However, there is no work to analyze async…

    Submitted 20 December, 2016; v1 submitted 12 April, 2016; originally announced April 2016.

    Comments: V1,v2,v3 have been withdrawn due to reference issue, because arXiv policy, we can't delete them. Please refer the newest version v4