Kevin W. Wilson
Person information
- affiliation: Google
2020 – today
- 2024
- [c33] Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Unsupervised Multi-Channel Separation And Adaptation. ICASSP 2024: 721-725
- 2023
- [i13] Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Unsupervised Multi-channel Separation and Adaptation. CoRR abs/2305.11151 (2023)
- 2022
- [c32] Katharine Patterson, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Distance-Based Sound Separation. INTERSPEECH 2022: 901-905
- [i12] Katharine Patterson, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Distance-Based Sound Separation. CoRR abs/2207.00562 (2022)
- 2021
- [c31] Soumi Maiti, Hakan Erdogan, Kevin W. Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey:
End-To-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings. ICASSP 2021: 7183-7187
- [c30] Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin W. Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey:
Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement. SLT 2021: 905-911
- [i11] Soumi Maiti, Hakan Erdogan, Kevin W. Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey:
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings. CoRR abs/2105.02096 (2021)
- 2020
- [c29] Quan Wang, Ignacio López-Moreno, Mert Saglam, Kevin W. Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein:
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition. INTERSPEECH 2020: 2677-2681
- [c28] Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin W. Wilson, John R. Hershey:
Unsupervised Sound Separation Using Mixture Invariant Training. NeurIPS 2020
- [i10] Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin W. Wilson, John R. Hershey:
Unsupervised Sound Separation Using Mixtures of Mixtures. CoRR abs/2006.12701 (2020)
- [i9] Quan Wang, Ignacio López-Moreno, Mert Saglam, Kevin W. Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein:
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition. CoRR abs/2009.04323 (2020)
2010 – 2019
- 2019
- [c27] Scott Wisdom, John R. Hershey, Kevin W. Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous:
Differentiable Consistency Constraints for Improved Deep Speech Enhancement. ICASSP 2019: 900-904
- [c26] Quan Wang, Hannah Muckenhirn, Kevin W. Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio López-Moreno:
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. INTERSPEECH 2019: 2728-2732
- [c25] Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. WASPAA 2019: 175-179
- [i8] Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. CoRR abs/1905.03330 (2019)
- [i7] Zhong-Qiu Wang, Scott Wisdom, Kevin W. Wilson, John R. Hershey:
Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement. CoRR abs/1911.07953 (2019)
- 2018
- [c24] Sourish Chaudhuri, Joseph Roth, Daniel P. W. Ellis, Andrew C. Gallagher, Liat Kaver, Radhika Marvin, Caroline Pantofaru, Nathan Reale, Loretta Guarino Reid, Kevin W. Wilson, Zhonghua Xi:
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies. INTERSPEECH 2018: 1239-1243
- [c23] Kevin W. Wilson, Michael Chinen, Jeremy Thorpe, Brian Patton, John R. Hershey, Rif A. Saurous, Jan Skoglund, Richard F. Lyon:
Exploring Tradeoffs in Models for Low-Latency Speech Enhancement. IWAENC 2018: 366-370
- [i6] Sourish Chaudhuri, Joseph Roth, Daniel P. W. Ellis, Andrew C. Gallagher, Liat Kaver, Radhika Marvin, Caroline Pantofaru, Nathan Reale, Loretta Guarino Reid, Kevin W. Wilson, Zhonghua Xi:
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies. CoRR abs/1808.00606 (2018)
- [i5] Quan Wang, Hannah Muckenhirn, Kevin W. Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio López-Moreno:
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. CoRR abs/1810.04826 (2018)
- [i4] Kevin W. Wilson, Michael Chinen, Jeremy Thorpe, Brian Patton, John R. Hershey, Rif A. Saurous, Jan Skoglund, Richard F. Lyon:
Exploring Tradeoffs in Models for Low-latency Speech Enhancement. CoRR abs/1811.07030 (2018)
- [i3] Scott Wisdom, John R. Hershey, Kevin W. Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous:
Differentiable Consistency Constraints for Improved Deep Speech Enhancement. CoRR abs/1811.08521 (2018)
- 2017
- [j2] Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Bo Li, Arun Narayanan, Ehsan Variani, Michiel Bacchiani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim:
Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 25(5): 965-979 (2017)
- [c22] Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin W. Wilson:
CNN architectures for large-scale audio classification. ICASSP 2017: 131-135
- [c21] Bo Li, Tara N. Sainath, Arun Narayanan, Joe Caroselli, Michiel Bacchiani, Ananya Misra, Izhak Shafran, Hasim Sak, Golan Pundak, Kean K. Chin, Khe Chai Sim, Ron J. Weiss, Kevin W. Wilson, Ehsan Variani, Chanwoo Kim, Olivier Siohan, Mitchel Weintraub, Erik McDermott, Richard Rose, Matt Shannon:
Acoustic Modeling for Google Home. INTERSPEECH 2017: 399-403
- [p1] Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li, Ehsan Variani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim:
Raw Multichannel Processing Using Deep Neural Networks. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 105-133
- 2016
- [c20] Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani:
Factored spatial and spectral multichannel raw waveform CLDNNs. ICASSP 2016: 5075-5079
- [c19] Tara N. Sainath, Arun Narayanan, Ron J. Weiss, Ehsan Variani, Kevin W. Wilson, Michiel Bacchiani, Izhak Shafran:
Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction. INTERSPEECH 2016: 1971-1975
- [c18] Bo Li, Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Michiel Bacchiani:
Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition. INTERSPEECH 2016: 1976-1980
- [i2] Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin W. Wilson:
CNN Architectures for Large-Scale Audio Classification. CoRR abs/1609.09430 (2016)
- [i1] Brian Patton, Yannis Agiomyrgiannakis, Michael Terry, Kevin W. Wilson, Rif A. Saurous, D. Sculley:
AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech. CoRR abs/1611.09207 (2016)
- 2015
- [c17] Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Andrew W. Senior:
Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms. ASRU 2015: 30-36
- [c16] Yedid Hoshen, Ron J. Weiss, Kevin W. Wilson:
Speech acoustic modeling from raw multichannel waveforms. ICASSP 2015: 4624-4628
- [c15] Tara N. Sainath, Ron J. Weiss, Andrew W. Senior, Kevin W. Wilson, Oriol Vinyals:
Learning the speech front-end with raw waveform CLDNNs. INTERSPEECH 2015: 1-5
- 2010
- [c14] Kevin W. Wilson, Bhiksha Raj:
Spectrogram dimensionality reduction with independence constraints. ICASSP 2010: 1938-1941
- [c13] Bhiksha Raj, Kevin W. Wilson, Alexander Krueger, Reinhold Haeb-Umbach:
Ungrounded independent non-negative factor analysis. INTERSPEECH 2010: 330-333
2000 – 2009
- 2008
- [c12] Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, Ajay Divakaran:
Speech denoising using nonnegative matrix factorization with priors. ICASSP 2008: 4029-4032
- [c11] Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis:
Regularized non-negative matrix factorization with temporal dependencies for speech denoising. INTERSPEECH 2008: 411-414
- 2007
- [c10] Naveen Goela, Kevin W. Wilson, Feng Niu, Ajay Divakaran, Isao Otsuka:
An SVM Framework for Genre-Independent Scene Change Detection. ICME 2007: 532-535
- 2006
- [b1] Kevin W. Wilson:
Estimating uncertainty models for speech source localization in real-world environments. Massachusetts Institute of Technology, Cambridge, MA, USA, 2006
- [j1] Kevin W. Wilson, Trevor Darrell:
Learning a Precedence Effect-Like Weighting Function for the Generalized Cross-Correlation Framework. IEEE Trans. Speech Audio Process. 14(6): 2156-2164 (2006)
- 2005
- [c9] Kevin W. Wilson, Trevor Darrell:
Improving audio source localization by learning the precedence effect. ICASSP (4) 2005: 1125-1128
- [c8] Kate Saenko, Karen Livescu, Michael Siracusa, Kevin W. Wilson, James R. Glass, Trevor Darrell:
Visual Speech Recognition with Loosely Synchronized Feature Streams. ICCV 2005: 1424-1431
- 2004
- [c7] Neal Checka, Kevin W. Wilson, Michael Siracusa, Trevor Darrell:
Multiple person and speaker activity tracking with a particle filter. ICASSP (5) 2004: 881-884
- [c6] David Demirdjian, Kevin W. Wilson, Michael Siracusa, Trevor Darrell:
Real-time audio-visual tracking for meeting analysis. ICMI 2004: 331-332
- 2003
- [c5] Neal Checka, Kevin W. Wilson, Vibhav Rangarajan, Trevor Darrell:
A Probabilistic Framework for Multi-modal Multi-Person Tracking. CVPR Workshops 2003: 100
- [c4] Michael Siracusa, Louis-Philippe Morency, Kevin W. Wilson, John W. Fisher III, Trevor Darrell:
A multi-modal approach for determining speaker location and focus. ICMI 2003: 77-80
- 2002
- [c3] Kevin W. Wilson, Trevor Darrell:
Audio-video array source localization for intelligent environments. ICASSP 2002: 2109-2112
- [c2] Kevin W. Wilson, Vibhav Rangarajan, Neal Checka, Trevor Darrell:
Audiovisual Arrays for Untethered Spoken Interfaces. ICMI 2002: 389-394
- 2001
- [c1] Kevin W. Wilson, Neal Checka, David Demirdjian, Trevor Darrell:
Audio-video array source separation for perceptual user interfaces. PUI 2001: 4:1-4:7
last updated on 2024-11-15 19:35 CET by the dblp team
all metadata released as open data under CC0 1.0 license