Speech Processing
- August 21, 2024
  Restoring speaker voices with zero-shot cross-lingual voice transfer for TTS
- July 9, 2024
  Assessing ASR performance with meaning preservation
- April 17, 2024
  Robust speech recognition in AR through infinite virtual rooms with acoustic modeling
- December 1, 2023
  Unsupervised speech-to-speech translation from monolingual data
- October 26, 2023
  Spoken question answering and speech continuation using a spectrogram-powered LLM
- October 19, 2023
  English learners can now practice speaking on Search
- June 22, 2023
  SoundStorm: Efficient parallel audio generation
- June 21, 2023
  Responsible AI at Google Research: AI for Social Good
- June 7, 2023
  Evaluating speech synthesis in many languages with SQuId
- June 2, 2023
  AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR
- March 6, 2023
  Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages
- December 14, 2022
  Who said what? Recorder's on-device solution for labeling speakers