Marc Delcroix
2020 – today
- 2024
- [j31] Takanori Ashihara, Marc Delcroix, Yusuke Ijima, Makio Kashino: Unveiling the Linguistic Capabilities of a Self-Supervised Speech Model Through Cross-Lingual Benchmark and Layer-Wise Similarity Analysis. IEEE Access 12: 98835-98855 (2024)
- [j30] Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri: Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3589-3602 (2024)
- [c160] Rino Kimura, Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki, Tetsuya Ueda, Shoji Makino: Diffusion Model-Based MIMO Speech Denoising and Dereverberation. ICASSP Workshops 2024: 455-459
- [c159] Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocký: Probing Self-Supervised Learning Models With Target Speech Extraction. ICASSP Workshops 2024: 535-539
- [c158] Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki, Akira Nakayama: Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher. ICASSP 2024: 561-565
- [c157] Hao Shi, Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani, Shoko Araki: Ensemble Inference for Diffusion Model-Based Speech Enhancement. ICASSP Workshops 2024: 735-739
- [c156] Thilo von Neumann, Christoph Böddeker, Tobias Cord-Landwehr, Marc Delcroix, Reinhold Haeb-Umbach: Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization. ICASSP Workshops 2024: 775-779
- [c155] Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima: What Do Self-Supervised Speech and Speaker Models Learn? New Findings from a Cross Model Layer-Wise Analysis. ICASSP 2024: 10166-10170
- [c154] Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký: Target Speech Extraction with Pre-Trained Self-Supervised Learning Models. ICASSP 2024: 10421-10425
- [c153] Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino: Neural Network-Based Virtual Microphone Estimation with Virtual Microphone and Beamformer-Level Multi-Task Loss. ICASSP 2024: 11021-11025
- [c152] Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri: How Does End-To-End Speech Recognition Training Impact Speech Enhancement Artifacts? ICASSP 2024: 11031-11035
- [c151] Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa: NTT Speaker Diarization System for Chime-7: Multi-Domain, Multi-Microphone end-to-end and Vector Clustering Diarization. ICASSP 2024: 11281-11285
- [c150] Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima: Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with Adapters. ICASSP 2024: 11471-11475
- [c149] Dominik Klement, Mireia Díez, Federico Landini, Lukás Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara: Discriminative Training of VBx Diarization. ICASSP 2024: 11871-11875
- [c148] William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe: Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing. ICASSP 2024: 13066-13070
- [c147] Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki: Multi-Stream Diffusion Model for Probabilistic Integration of Model-Based and Data-Driven Speech Enhancement. IWAENC 2024: 65-69
- [c146] Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki: Interaural Time Difference Loss for Binaural Target Sound Extraction. IWAENC 2024: 210-214
- [i73] Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima: Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters. CoRR abs/2401.05111 (2024)
- [i72] Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima: What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis. CoRR abs/2401.17632 (2024)
- [i71] Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, Simon Doclo: Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers. CoRR abs/2402.03058 (2024)
- [i70] Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký: Target Speech Extraction with Pre-trained Self-supervised Learning Models. CoRR abs/2402.13199 (2024)
- [i69] Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocký: Probing Self-supervised Learning Models with Target Speech Extraction. CoRR abs/2402.13200 (2024)
- [i68] Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri: Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance. CoRR abs/2404.14860 (2024)
- [i67] Atsunori Ogawa, Naoyuki Kamo, Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Takatomo Kano, Naohiro Tawara, Marc Delcroix: Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over. CoRR abs/2406.18972 (2024)
- [i66] Kenichi Fujita, Takanori Ashihara, Marc Delcroix, Yusuke Ijima: Lightweight Zero-shot Text-to-Speech with Mixture of Adapters. CoRR abs/2407.01291 (2024)
- [i65] Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, Marc Delcroix: SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling. CoRR abs/2407.01857 (2024)
- [i64] Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Masato Mimura, Takatomo Kano, Atsunori Ogawa, Marc Delcroix: Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation. CoRR abs/2408.00205 (2024)
- [i63] Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki: Interaural time difference loss for binaural target sound extraction. CoRR abs/2408.00344 (2024)
- [i62] Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix: Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings. CoRR abs/2408.17142 (2024)
- [i61] Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Daisuke Niizumi, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki: SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model. CoRR abs/2409.12528 (2024)
- [i60] Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Masato Mimura: Alignment-Free Training for Transducer-based Multi-Talker ASR. CoRR abs/2409.20301 (2024)
- [i59] Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, Shoko Araki: Mamba-based Segmentation Model for Speaker Diarization. CoRR abs/2410.06459 (2024)
- [i58] Takanori Ashihara, Takafumi Moriya, Shota Horiguchi, Junyi Peng, Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, Hiroshi Sato: Investigation of Speaker Representation for Target-Speaker Speech Processing. CoRR abs/2410.11243 (2024)
- [i57] Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix: Guided Speaker Embedding. CoRR abs/2410.12182 (2024)
- 2023
- [j29] Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takahiro Shinozaki: Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection. IEEE Access 11: 13906-13917 (2023)
- [j28] Katerina Zmolíková, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocký, Dong Yu: Neural Target Speech Extraction: An overview. IEEE Signal Process. Mag. 40(3): 8-29 (2023)
- [j27] Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi, Shoko Araki: SoundBeam: Target Sound Extraction Conditioned on Sound-Class Labels and Enrollment Clues for Increased Performance and Continuous Learning. IEEE ACM Trans. Audio Speech Lang. Process. 31: 121-136 (2023)
- [j26] Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: Segment-Less Continuous Speech Separation of Meetings: Training and Evaluation Criteria. IEEE ACM Trans. Audio Speech Lang. Process. 31: 576-589 (2023)
- [j25] Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki: Mask-Based Neural Beamforming for Moving Speakers With Self-Attention-Based Tracking. IEEE ACM Trans. Audio Speech Lang. Process. 31: 835-848 (2023)
- [c145] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Kohei Matsuura, Takanori Ashihara, William Chen, Shinji Watanabe: Summarize While Translating: Universal Model With Parallel Decoding for Summarization and Translation. ASRU 2023: 1-8
- [c144] Roshan S. Sharma, William Chen, Takatomo Kano, Ruchira Sharma, Siddhant Arora, Shinji Watanabe, Atsunori Ogawa, Marc Delcroix, Rita Singh, Bhiksha Raj: Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems. ASRU 2023: 1-8
- [c143] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Roshan S. Sharma, Kohei Matsuura, Shinji Watanabe: Speech Summarization of Long Spoken Document: Improving Memory Efficiency of Speech/Text Encoders. ICASSP 2023: 1-5
- [c142] Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura: Leveraging Large Text Corpora For End-To-End Speech Summarization. ICASSP 2023: 1-5
- [c141] Thilo von Neumann, Christoph Böddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach: On Word Error Rate Definitions and Their Efficient Computation for Multi-Speaker Speech Recognition Systems. ICASSP 2023: 1-5
- [c140] Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix: Iterative Shallow Fusion of Backward Language Model for End-To-End Speech Recognition. ICASSP 2023: 1-5
- [c139] Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani: Target Speech Extraction with Conditional Diffusion Model. INTERSPEECH 2023: 176-180
- [c138] Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo: Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss. INTERSPEECH 2023: 854-858
- [c137] Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami: Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data. INTERSPEECH 2023: 899-903
- [c136] Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma: SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? INTERSPEECH 2023: 2888-2892
- [c135] Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix: Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization. INTERSPEECH 2023: 2943-2947
- [c134] Marc Delcroix, Naohiro Tawara, Mireia Díez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukás Burget, Shoko Araki: Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. INTERSPEECH 2023: 3477-3481
- [i56] Katerina Zmolíková, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocký, Dong Yu: Neural Target Speech Extraction: An Overview. CoRR abs/2301.13341 (2023)
- [i55] Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura: Leveraging Large Text Corpora for End-to-End Speech Summarization. CoRR abs/2303.00978 (2023)
- [i54] Marc Delcroix, Naohiro Tawara, Mireia Díez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukás Burget, Shoko Araki: Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. CoRR abs/2305.13580 (2023)
- [i53] Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo: Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss. CoRR abs/2305.14723 (2023)
- [i52] Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix: Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization. CoRR abs/2306.04233 (2023)
- [i51] Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma: SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? CoRR abs/2306.08374 (2023)
- [i50] Thilo von Neumann, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems. CoRR abs/2307.11394 (2023)
- [i49] Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani: Target Speech Extraction with Conditional Diffusion Model. CoRR abs/2308.03987 (2023)
- [i48] Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa: NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization. CoRR abs/2309.12656 (2023)
- [i47] Thilo von Neumann, Christoph Böddeker, Tobias Cord-Landwehr, Marc Delcroix, Reinhold Haeb-Umbach: Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization. CoRR abs/2309.16482 (2023)
- [i46] Dominik Klement, Mireia Díez, Federico Landini, Lukás Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara: Discriminative Training of VBx Diarization. CoRR abs/2310.02732 (2023)
- [i45] Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix: Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition. CoRR abs/2310.11010 (2023)
- [i44] Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki: Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. CoRR abs/2312.12764 (2023)
- [i43] Atsunori Ogawa, Naohiro Tawara, Takatomo Kano, Marc Delcroix: BLSTM-Based Confidence Estimation for End-to-End Speech Recognition. CoRR abs/2312.14609 (2023)
- 2022
- [j24] Zili Huang, Marc Delcroix, Leibny Paola García-Perera, Shinji Watanabe, Desh Raj, Sanjeev Khudanpur: Joint speaker diarization and speech recognition based on region proposal networks. Comput. Speech Lang. 72: 101316 (2022)
- [c133] Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: SA-SDR: A Novel Loss Function for Separation of Meeting Style Data. ICASSP 2022: 6022-6026
- [c132] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe: Integrating Multiple ASR Systems into NLP Backend with Attention Fusion. ICASSP 2022: 6237-6241
- [c131] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Naoyuki Kamo, Takafumi Moriya: Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition. ICASSP 2022: 6287-6291
- [c130] Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki: Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. ICASSP 2022: 6517-6521
- [c129] Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix, Takahiro Shinozaki: Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration. ICASSP 2022: 8282-8286
- [c128] Keisuke Kinoshita, Marc Delcroix, Tomoharu Iwata: Tight Integration Of Neural- And Clustering-Based Diarization Through Deep Unfolding Of Infinite Gaussian Mixture Model. ICASSP 2022: 8382-8386
- [c127] Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolíková, Hiroshi Sato, Tomohiro Nakatani: Listen only to me! How well can target speech extraction handle false alarms? INTERSPEECH 2022: 216-220
- [c126] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura: Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations. INTERSPEECH 2022: 996-1000
- [c125] Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Christoph Böddeker, Reinhold Haeb-Umbach: Utterance-by-utterance overlap-aware neural diarization with Graph-PIT. INTERSPEECH 2022: 1486-1490
- [c124] Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takahiro Shinozaki: Streaming Target-Speaker ASR with Neural Transducer. INTERSPEECH 2022: 2673-2677
- [c123] Martin Kocour, Katerina Zmolíková, Lucas Ondel, Jan Svec, Marc Delcroix, Tsubasa Ochiai, Lukás Burget, Jan Cernocký: Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model. INTERSPEECH 2022: 4955-4959
- [c122] Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri: How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR. INTERSPEECH 2022: 5418-5422
- [c121] Jan Svec, Katerina Zmolíková, Martin Kocour, Marc Delcroix, Tsubasa Ochiai, Ladislav Mosner, Jan Honza Cernocký: Analysis of Impact of Emotions on Target Speech Extraction and Speech Separation. IWAENC 2022: 1-5
- [c120] Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino: ConceptBeam: Concept Driven Target Speech Extraction. ACM Multimedia 2022: 4252-4260
- [i42] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Naoyuki Kamo, Takafumi Moriya: Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition. CoRR abs/2201.03881 (2022)
- [i41] Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri: How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR. CoRR abs/2201.06685 (2022)
- [i40] Keisuke Kinoshita, Marc Delcroix, Tomoharu Iwata: Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model. CoRR abs/2202.06524 (2022)
- [i39] Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi, Shoko Araki: SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning. CoRR abs/2204.03895 (2022)
- [i38] Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolíková, Hiroshi Sato, Tomohiro Nakatani: Listen only to me! How well can target speech extraction handle false alarms? CoRR abs/2204.04811 (2022)
- [i37] Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki: Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking. CoRR abs/2205.03568 (2022)
- [i36] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura: Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations. CoRR abs/2206.08174 (2022)
- [i35] Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino: ConceptBeam: Concept Driven Target Speech Extraction. CoRR abs/2207.11964 (2022)
- [i34] Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Christoph Böddeker, Reinhold Haeb-Umbach: Utterance-by-utterance overlap-aware neural diarization with Graph-PIT. CoRR abs/2207.13888 (2022)
- [i33] Jan Svec, Katerina Zmolíková, Martin Kocour, Marc Delcroix, Tsubasa Ochiai, Ladislav Mosner, Jan Cernocký: Analysis of impact of emotions on target speech extraction and speech separation. CoRR abs/2208.07091 (2022)
- [i32] Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takahiro Shinozaki: Streaming Target-Speaker ASR with Neural Transducer. CoRR abs/2209.04175 (2022)
- [i31] Thilo von Neumann, Christoph Böddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach: On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems. CoRR abs/2211.16112 (2022)
- 2021
- [j23] Reinhold Haeb-Umbach, Jahn Heymann, Lukas Drude, Shinji Watanabe, Marc Delcroix, Tomohiro Nakatani: Far-Field Automatic Speech Recognition. Proc. IEEE 109(2): 124-148 (2021)
- [c119] Thilo von Neumann, Christoph Böddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach: Speeding Up Permutation Invariant Training for Source Separation. ITG Conference on Speech Communication 2021: 1-5
- [c118] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe: Attention-Based Multi-Hypothesis Fusion for Speech Summarization. ASRU 2021: 487-494
- [c117] Julio Wissing, Benedikt T. Boenninghoff, Dorothea Kolossa, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Christopher Schymura: Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain. ICASSP 2021: 4705-4709
- [c116] Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian: Dual-Path Modeling for Long Recording Speech Separation in Meetings. ICASSP 2021: 5739-5743
- [c115] Marc Delcroix, Katerina Zmolíková, Tsubasa Ochiai, Keisuke Kinoshita, Tomohiro Nakatani: Speaker Activity Driven Neural Speech Extraction. ICASSP 2021: 6099-6103
- [c114] Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki: Neural Network-Based Virtual Microphone Estimator. ICASSP 2021: 6114-6118
- [c113] Atsunori Ogawa, Naohiro Tawara, Takatomo Kano, Marc Delcroix: BLSTM-Based Confidence Estimation for End-to-End Speech Recognition. ICASSP 2021: 6383-6387
- [c112] Wangyou Zhang, Christoph Böddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian: End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend. ICASSP 2021: 6898-6902
- [c111] Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara: Integrating End-to-End Neural and Clustering-Based Diarization: Getting the Best of Both Worlds. ICASSP 2021: 7198-7202
- [c110] Christoph Böddeker, Wangyou Zhang, Tomohiro Nakatani, Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Yanmin Qian, Reinhold Haeb-Umbach: Convolutive Transfer Function Invariant SDR Training Criteria for Multi-Channel Reverberant Speech Separation. ICASSP 2021: 8428-8432
- [c109] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoyuki Kamo: Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition. Interspeech 2021: 1149-1153
- [c108] Katerina Zmolíková, Marc Delcroix, Desh Raj, Shinji Watanabe, Jan Cernocký: Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics. Interspeech 2021: 1464-1468
- [c107] Takafumi Moriya, Tomohiro Tanaka, Takanori Ashihara, Tsubasa Ochiai, Hiroshi Sato, Atsushi Ando, Ryo Masumura, Marc Delcroix, Taichi Asami: Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture. Interspeech 2021: 1787-1791
- [c106] Christopher Schymura, Benedikt T. Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa: PILOT: Introducing Transformers for Probabilistic Sound Event Localization. Interspeech 2021: 2117-2121
- [c105] Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen: Continuous Speech Separation Using Speaker Inventory for Long Recording. Interspeech 2021: 3036-3040
- [c104] Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: Graph-PIT: Generalized Permutation Invariant Training for Continuous Separation of Arbitrary Numbers of Speakers. Interspeech 2021: 3490-3494
- [c103] Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki: Few-Shot Learning of New Sound Classes for Target Sound Extraction. Interspeech 2021: 3500-3504
- [c102] Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara: Advances in Integration of End-to-End Neural and Clustering-Based Diarization for Real Conversational Speech. Interspeech 2021: 3565-3569
- [c101] Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki: Multimodal Attention Fusion for Target Speaker Extraction. SLT 2021: 778-784
- [c100] Chenda Li, Yi Luo, Cong Han, Jinyu Li, Takuya Yoshioka, Tianyan Zhou, Marc Delcroix, Keisuke Kinoshita, Christoph Böddeker, Yanmin Qian, Shinji Watanabe, Zhuo Chen: Dual-Path RNN for Long Recording Speech Separation. SLT 2021: 865-872
- [c99] Katerina Zmolíková, Marc Delcroix, Lukás Burget, Tomohiro Nakatani, Jan Honza Cernocký: Integration of Variational Autoencoder and Spatial Clustering for Adaptive Multi-Channel Neural Speech Separation. SLT 2021: 889-896
- [i30] Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki: Neural Network-based Virtual Microphone Estimator. CoRR abs/2101.04315 (2021)
- [i29] Marc Delcroix, Katerina Zmolíková, Tsubasa Ochiai, Keisuke Kinoshita, Tomohiro Nakatani: Speaker activity driven neural speech extraction. CoRR abs/2101.05516 (2021)
- [i28] Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki: Multimodal Attention Fusion for Target Speaker Extraction. CoRR abs/2102.01326 (2021)
- [i27] Wangyou Zhang, Christoph Böddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian: End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend. CoRR abs/2102.11525 (2021)
- [i26] Julio Wissing, Benedikt T. Boenninghoff, Dorothea Kolossa, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Christopher Schymura: Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain. CoRR abs/2102.11588 (2021)
- [i25] Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian: Dual-Path Modeling for Long Recording Speech Separation in Meetings. CoRR abs/2102.11634 (2021)
- [i24] Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa: Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization. CoRR abs/2103.00417 (2021)
- [i23] Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara: Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech. CoRR abs/2105.09040 (2021)
- [i22] Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoyuki Kamo: Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition. CoRR abs/2106.00949 (2021)
- [i21] Christopher Schymura, Benedikt T. Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa: PILOT: Introducing Transformers for Probabilistic Sound Event Localization. CoRR abs/2106.03903 (2021)
- [i20] Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki: Few-shot learning of new sound classes for target sound extraction. CoRR abs/2106.07144 (2021)
- [i19] Thilo von Neumann, Christoph Böddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach: Speeding Up Permutation Invariant Training for Source Separation. CoRR abs/2107.14445 (2021)
- [i18] Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers. CoRR abs/2107.14446 (2021)
- [i17] Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach: SA-SDR: A novel loss function for separation of meeting style data. CoRR abs/2110.15581 (2021)
- [i16] Martin Kocour, Katerina Zmolíková, Lucas Ondel, Jan Svec, Marc Delcroix, Tsubasa Ochiai, Lukás Burget, Jan Cernocký: Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model. CoRR abs/2111.00009 (2021)
- [i15] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe: Attention-based Multi-hypothesis Fusion for Speech Summarization. CoRR abs/2111.08201 (2021)
- 2020
- [j22] Tomohiro Nakatani, Christoph Böddeker, Keisuke Kinoshita, Rintaro Ikeshita, Marc Delcroix, Reinhold Haeb-Umbach: Jointly Optimal Denoising, Dereverberation, and Source Separation. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2267-2282 (2020)
- [c98] Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa: Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization. EUSIPCO 2020: 231-235
- [c97] Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi: Speech Enhancement Using Self-Adaptation and Multi-Head Self-Attention. ICASSP 2020: 181-185
- [c96] Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani: Tackling Real Noisy Reverberant Meetings with All-Neural Source Separation, Counting, and Diarization System. ICASSP 2020: 381-385
- [c95] Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa: A Dynamic Stream Weight Backprop Kalman Filter for Audiovisual Speaker Tracking. ICASSP 2020: 581-585
- [c94] Marc Delcroix, Tsubasa Ochiai, Katerina Zmolíková, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki: Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam. ICASSP 2020: 691-695
- [c93] Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki: Beam-TasNet: Time-domain Audio Separation Network Meets Frequency-domain Beamformer. ICASSP 2020: 6384-6388
- [c92] Tomohiro Nakatani, Riki Takahashi, Tsubasa Ochiai, Keisuke Kinoshita, Rintaro Ikeshita, Marc Delcroix, Shoko Araki: DNN-supported Mask-based Convolutional Beamforming for Simultaneous Denoising, Dereverberation, and Source Separation. ICASSP 2020: 6399-6403
- [c91] Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Marc Delcroix, Tetsuji Ogawa: Frame-Level Phoneme-Invariant Speaker Embedding for Text-Independent Speaker Recognition on Extremely Short Utterances. ICASSP 2020: 6799-6803
- [c90] Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Böddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach: End-to-End Training of Time Domain Audio Separation and Recognition. ICASSP 2020: 7004-7008
- [c89] Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani: Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network. ICASSP 2020: 7009-7013
- [c88] Takafumi Moriya, Tsubasa Ochiai, Shigeki Karita, Hiroshi Sato, Tomohiro Tanaka, Takanori Ashihara, Ryo Masumura, Yusuke Shinohara, Marc Delcroix: Self-Distillation for Improving CTC-Transformer-Based ASR Systems. INTERSPEECH 2020: 546-550
- [c87] Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki: Listen to What You Want: Neural Network-Based Universal Sound Selector. INTERSPEECH 2020: 1441-1445
- [c86] Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach: Multi-Path RNN for Hierarchical Modeling of Long Sequential Data and its Application to Speaker Stream Separation. INTERSPEECH 2020: 2652-2656
- [c85] Thilo von Neumann, Christoph Böddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach: Multi-Talker ASR for an Unknown Number of Sources: Joint Training of Source Counting, Separation and ASR. INTERSPEECH 2020: 3097-3101
- [c84] Atsunori Ogawa, Naohiro Tawara, Marc Delcroix: Language Model Data Augmentation Based on Text Domain Transfer. INTERSPEECH 2020: 4926-4930
- [c83] Ali Aroudi, Marc Delcroix, Tomohiro Nakatani, Keisuke Kinoshita, Shoko Araki, Simon Doclo: Cognitive-Driven Convolutional Beamforming Using EEG-Based Auditory Attention Decoding. MLSP 2020: 1-6
- [i14] Marc Delcroix, Tsubasa Ochiai, Katerina Zmolíková, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki: Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam. CoRR abs/2001.08378 (2020)
- [i13] Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi: Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention. CoRR abs/2002.05873 (2020)
- [i12] Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani: Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system. CoRR abs/2003.03987 (2020)
- [i11] Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani: Improving noise robust automatic speech recognition with single-channel time-domain enhancement network. CoRR abs/2003.03998 (2020)
- [i10] Ali Aroudi, Marc Delcroix, Tomohiro Nakatani, Keisuke Kinoshita, Shoko Araki, Simon Doclo:
Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding. CoRR abs/2005.04669 (2020) - [i9]Tomohiro Nakatani, Christoph Böddeker, Keisuke Kinoshita, Rintaro Ikeshita, Marc Delcroix, Reinhold Haeb-Umbach:
Jointly optimal denoising, dereverberation, and source separation. CoRR abs/2005.09843 (2020) - [i8]Thilo von Neumann, Christoph Böddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach:
Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR. CoRR abs/2006.02786 (2020) - [i7]Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki:
Listen to What You Want: Neural Network-based Universal Sound Selector. CoRR abs/2006.05712 (2020) - [i6]Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach:
Multi-path RNN for hierarchical modeling of long sequential data and its application to speaker stream separation. CoRR abs/2006.13579 (2020) - [i5]Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara:
Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds. CoRR abs/2010.13366 (2020) - [i4]Christoph Böddeker, Wangyou Zhang, Tomohiro Nakatani, Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Yanmin Qian, Shinji Watanabe, Reinhold Haeb-Umbach:
Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation. CoRR abs/2011.15003 (2020) - [i3]Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen:
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording. CoRR abs/2012.09727 (2020)
2010 – 2019
- 2019
- [j21]Michael Hentschel, Marc Delcroix, Atsunori Ogawa, Tomoharu Iwata, Tomohiro Nakatani:
Feature Based Domain Adaptation for Neural Network Language Models with Factorised Hidden Layers. IEICE Trans. Inf. Syst. 102-D(3): 598-608 (2019) - [j20]Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Tomohiro Nakatani, Lukás Burget, Jan Cernocký:
SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures. IEEE J. Sel. Top. Signal Process. 13(4): 800-814 (2019) - [c82]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Projection Back onto Filtered Observations for Speech Separation with Distributed Microphone Array. CAMSAP 2019: 291-295 - [c81]Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach:
All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis. ICASSP 2019: 91-95 - [c80]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Estimation of Sampling Frequency Mismatch between Distributed Asynchronous Microphones under Existence of Source Movements with Stationary Time Periods Detection. ICASSP 2019: 785-789 - [c79]Shigeki Karita, Shinji Watanabe, Tomoharu Iwata, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Semi-supervised End-to-end Speech Recognition Using Text-to-speech and Autoencoders. ICASSP 2019: 6166-6170 - [c78]Yuki Kubo, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Shoko Araki:
Mask-based MVDR Beamformer for Noisy Multisource Environments: Introduction of Time-varying Spatial Covariance Model. ICASSP 2019: 6855-6859 - [c77]Marc Delcroix, Katerina Zmolíková, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki, Tomohiro Nakatani:
Compact Network for Speakerbeam Target Speaker Extraction. ICASSP 2019: 6965-6969 - [c76]Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani:
A Unified Framework for Neural Speech Separation and Extraction. ICASSP 2019: 6975-6979 - [c75]Michael Hentschel, Marc Delcroix, Atsunori Ogawa, Tomoharu Iwata, Tomohiro Nakatani:
A Unified Framework for Feature-based Domain Adaptation of Neural Network Language Models. ICASSP 2019: 7250-7254 - [c74]Marc Delcroix, Shinji Watanabe, Tsubasa Ochiai, Keisuke Kinoshita, Shigeki Karita, Atsunori Ogawa, Tomohiro Nakatani:
End-to-End SpeakerBeam for Single Channel Target Speech Recognition. INTERSPEECH 2019: 451-455 - [c73]Shigeki Karita, Nelson Enrique Yalta Soplin, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration. INTERSPEECH 2019: 1408-1412 - [c72]Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani:
Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues. INTERSPEECH 2019: 2718-2722 - [c71]Atsunori Ogawa, Marc Delcroix, Shigeki Karita, Tomohiro Nakatani:
Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders. INTERSPEECH 2019: 3900-3904 - [i2]Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach:
All-neural online source separation, counting, and diarization for meeting analysis. CoRR abs/1902.07881 (2019) - [i1]Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Böddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach:
End-to-end training of time domain audio separation and recognition. CoRR abs/1912.08462 (2019) - 2018
- [j19]Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Christian Huemmer, Tomohiro Nakatani:
Context Adaptive Neural Network Based Acoustic Models for Rapid Adaptation. IEEE ACM Trans. Audio Speech Lang. Process. 26(5): 895-908 (2018) - [c70]Takafumi Moriya, Ryo Masumura, Taichi Asami, Yusuke Shinohara, Marc Delcroix, Yoshikazu Yamaguchi, Yushi Aono:
Progressive Neural Network-based Knowledge Transfer in Acoustic Models. APSIPA 2018: 998-1002 - [c69]Michael Hentschel, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Feature-Based Learning Hidden Unit Contributions for Domain Adaptation of RNN-LMs. APSIPA 2018: 1692-1696 - [c68]Michael Hentschel, Marc Delcroix, Atsunori Ogawa, Tomoharu Iwata, Tomohiro Nakatani:
Factorised Hidden Layer Based Domain Adaptation for Recurrent Neural Network Language Models. APSIPA 2018: 1940-1944 - [c67]Keisuke Kinoshita, Lukas Drude, Marc Delcroix, Tomohiro Nakatani:
Listening to Each Speaker One by One with Recurrent Selective Hearing Networks. ICASSP 2018: 5064-5068 - [c66]Marc Delcroix, Katerina Zmolíková, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani:
Single Channel Target Speaker Extraction and Recognition with Speaker Beam. ICASSP 2018: 5554-5558 - [c65]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Meeting Recognition with Asynchronous Distributed Microphone Array Using Block-Wise Refinement of Mask-Based MVDR Beamformer. ICASSP 2018: 5694-5698 - [c64]Shigeki Karita, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani:
Sequence Training of Encoder-Decoder Model Using Policy Gradient for End-to-End Speech Recognition. ICASSP 2018: 5839-5843 - [c63]Atsunori Ogawa, Marc Delcroix, Shigeki Karita, Tomohiro Nakatani:
Rescoring N-Best Speech Recognition List Based on One-on-One Hypothesis Comparison Using Encoder-Classifier Model. ICASSP 2018: 6099-6103 - [c62]Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Tomohiro Nakatani, Jan Cernocký:
Optimization of Speaker-Aware Multichannel Speech Extraction with ASR Criterion. ICASSP 2018: 6702-6706 - [c61]Shigeki Karita, Shinji Watanabe, Tomoharu Iwata, Atsunori Ogawa, Marc Delcroix:
Semi-Supervised End-to-End Speech Recognition. INTERSPEECH 2018: 2-6 - [c60]Takafumi Moriya, Sei Ueno, Yusuke Shinohara, Marc Delcroix, Yoshikazu Yamaguchi, Yushi Aono:
Multi-task Learning with Augmentation Strategy for Acoustic-to-word Attention-based Encoder-decoder Speech Recognition. INTERSPEECH 2018: 2399-2403 - [c59]Marc Delcroix, Shinji Watanabe, Atsunori Ogawa, Shigeki Karita, Tomohiro Nakatani:
Auxiliary Feature Based Adaptation of End-to-end ASR Systems. INTERSPEECH 2018: 2444-2448 - [c58]Lukas Drude, Christoph Böddeker, Jahn Heymann, Reinhold Haeb-Umbach, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Integrating Neural Network Based Beamforming and Weighted Prediction Error Dereverberation. INTERSPEECH 2018: 3043-3047 - [c57]Yutaro Matsui, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Nobutaka Ito, Shoko Araki, Shoji Makino:
Online Integration of DNN-Based and Spatial Clustering-Based Mask Estimation for Robust MVDR Beamforming. IWAENC 2018: 71-75 - [c56]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Comparison of Reference Microphone Selection Algorithms for Distributed Microphone Array Based Speech Enhancement in Meeting Recognition Scenarios. IWAENC 2018: 316-320 - 2017
- [j18]Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani:
Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR. IEEE ACM Trans. Audio Speech Lang. Process. 25(4): 780-793 (2017) - [c55]Michael Hentschel, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani, Yuji Matsumoto:
Exploiting imbalanced textual and acoustic data for training prosodically-enhanced RNNLMs. APSIPA 2017: 618-621 - [c54]Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani:
Learning speaker representation for neural network based multichannel speaker extraction. ASRU 2017: 8-15 - [c53]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Meeting recognition with asynchronous distributed microphone array. ASRU 2017: 32-39 - [c52]Takuya Higuchi, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Adversarial training for data-driven speech enhancement without parallel corpus. ASRU 2017: 40-47 - [c51]Shoko Araki, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Takuya Higuchi, Takuya Yoshioka, Dung T. Tran, Shigeki Karita, Tomohiro Nakatani:
Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming. HSCMA 2017: 16-20 - [c50]Keisuke Kinoshita, Marc Delcroix, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani:
Deep mixture density network for statistical model-based feature enhancement. ICASSP 2017: 251-255 - [c49]Nobutaka Ito, Shoko Araki, Marc Delcroix, Tomohiro Nakatani:
Probabilistic spatial dictionary based online adaptive beamforming for meeting recognition in noisy and reverberant environments. ICASSP 2017: 681-685 - [c48]Christian Huemmer, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani, Walter Kellermann:
Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features. ICASSP 2017: 4875-4879 - [c47]Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Taichi Asami, Shigeru Katagiri, Tomohiro Nakatani:
Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models. ICASSP 2017: 5175-5179 - [c46]Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Christian Huemmer, Tomohiro Nakatani:
Feedback connection for deep neural network-based acoustic modeling. ICASSP 2017: 5240-5244 - [c45]Keisuke Kinoshita, Marc Delcroix, Haeyong Kwon, Takuma Mori, Tomohiro Nakatani:
Neural Network-Based Spectrum Estimation for Online WPE Dereverberation. INTERSPEECH 2017: 384-388 - [c44]Takuya Higuchi, Keisuke Kinoshita, Marc Delcroix, Katerina Zmolíková, Tomohiro Nakatani:
Deep Clustering-Based Beamforming for Separation with Unknown Number of Sources. INTERSPEECH 2017: 1183-1187 - [c43]Dung T. Tran, Marc Delcroix, Shigeki Karita, Michael Hentschel, Atsunori Ogawa, Tomohiro Nakatani:
Unfolded Deep Recurrent Convolutional Neural Network with Jump Ahead Connections for Acoustic Modeling. INTERSPEECH 2017: 1596-1600 - [c42]Shigeki Karita, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani:
Forward-Backward Convolutional LSTM for Acoustic Modeling. INTERSPEECH 2017: 1601-1605 - [c41]Atsunori Ogawa, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Improved Example-Based Speech Enhancement by Using Deep Neural Network Acoustic Model for Noise Robust Example Search. INTERSPEECH 2017: 1963-1967 - [c40]Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani:
Speaker-Aware Neural Network Based Beamformer for Speaker Extraction in Speech Mixtures. INTERSPEECH 2017: 2655-2659 - [c39]Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling. INTERSPEECH 2017: 3852-3856 - [p7]Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
Preliminaries. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 3-17 - [p6]Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani:
Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 21-49 - [p5]Martin Karafiát, Karel Veselý, Katerina Zmolíková, Marc Delcroix, Shinji Watanabe, Lukás Burget, Jan Honza Cernocký, Igor Szöke:
Training Data Augmentation and Data Selection. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 245-260 - [p4]Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann, Volker Leutnant, Roland Maas, Tomohiro Nakatani, Bhiksha Raj, Armin Sehr, Takuya Yoshioka:
The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 345-354 - [p3]Shinji Watanabe, Takaaki Hori, Yajie Miao, Marc Delcroix, Florian Metze, John R. Hershey:
Toolkits for Robust Speech Processing. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 369-382 - [e1]Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
New Era for Robust Speech Recognition, Exploiting Deep Learning. Springer 2017, ISBN 978-3-319-64679-4 [contents] - 2016
- [j17]Marc Delcroix, Atsunori Ogawa, Seong-Jun Hahm, Tomohiro Nakatani, Atsushi Nakamura:
Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation. Comput. Speech Lang. 36: 24-41 (2016) - [j16]Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann, Volker Leutnant, Roland Maas, Tomohiro Nakatani, Bhiksha Raj, Armin Sehr, Takuya Yoshioka:
A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research. EURASIP J. Adv. Signal Process. 2016: 7 (2016) - [c38]Souvik Kundu, Gautam Mantena, Yanmin Qian, Tian Tan, Marc Delcroix, Khe Chai Sim:
Joint acoustic factor learning for robust deep neural network based automatic speech recognition. ICASSP 2016: 5025-5029 - [c37]Marc Delcroix, Keisuke Kinoshita, Chengzhu Yu, Atsunori Ogawa, Takuya Yoshioka, Tomohiro Nakatani:
Context adaptive deep neural networks for fast acoustic model adaptation in noisy conditions. ICASSP 2016: 5270-5274 - [c36]Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Yoshioka, Dung T. Tran, Tomohiro Nakatani:
Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models. INTERSPEECH 2016: 1573-1577 - [c35]Katerina Zmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukás Burget, Jan Cernocký:
Data Selection by Sequence Summarizing Neural Network in Mismatch Condition Training. INTERSPEECH 2016: 2354-2358 - [c34]Atsunori Ogawa, Shogo Seki, Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Kazuya Takeda:
Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement. INTERSPEECH 2016: 3733-3737 - [c33]Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions. INTERSPEECH 2016: 3813-3817 - 2015
- [j15]Marc Delcroix, Takuya Yoshioka, Atsunori Ogawa, Yotaro Kubo, Masakiyo Fujimoto, Nobutaka Ito, Keisuke Kinoshita, Miquel Espi, Shoko Araki, Takaaki Hori, Tomohiro Nakatani:
Strategies for distant speech recognition in reverberant environments. EURASIP J. Adv. Signal Process. 2015: 60 (2015) - [c32]Takuya Yoshioka, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Chengzhu Yu, Wojciech J. Fabian, Miquel Espi, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani:
The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices. ASRU 2015: 436-443 - [c31]Shoko Araki, Tomoki Hayashi, Marc Delcroix, Masakiyo Fujimoto, Kazuya Takeda, Tomohiro Nakatani:
Exploring multi-channel features for denoising-autoencoder-based speech enhancement. ICASSP 2015: 116-120 - [c30]Marc Delcroix, Keisuke Kinoshita, Takaaki Hori, Tomohiro Nakatani:
Context adaptive deep neural networks for fast acoustic model adaptation. ICASSP 2015: 4535-4539 - [c29]Quoc Truong Do, Satoshi Nakamura, Marc Delcroix, Takaaki Hori:
WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition. ICASSP 2015: 4959-4963 - [c28]Keisuke Kinoshita, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Text-informed speech enhancement with deep neural networks. INTERSPEECH 2015: 1760-1764 - [c27]Chengzhu Yu, Atsunori Ogawa, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, John H. L. Hansen:
Robust i-vector extraction for neural network adaptation in noisy environment. INTERSPEECH 2015: 2854-2857 - 2014
- [j14]Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays. IEEE ACM Trans. Audio Speech Lang. Process. 22(2): 354-367 (2014) - [c26]Marc Delcroix, Takuya Yoshioka, Atsunori Ogawa, Yotaro Kubo, Masakiyo Fujimoto, Nobutaka Ito, Keisuke Kinoshita, Miquel Espi, Shoko Araki, Takaaki Hori, Tomohiro Nakatani:
Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition. GlobalSIP 2014: 522-526 - 2013
- [j13]Marc Delcroix, Shinji Watanabe, Tomohiro Nakatani, Atsushi Nakamura:
Cluster-based dynamic variance adaptation for interconnecting speech enhancement pre-processor and speech recognizer. Comput. Speech Lang. 27(1): 350-368 (2013) - [j12]Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Atsunori Ogawa, Takaaki Hori, Shinji Watanabe, Masakiyo Fujimoto, Takuya Yoshioka, Takanobu Oba, Yotaro Kubo, Mehrez Souden, Seong-Jun Hahm, Atsushi Nakamura:
Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds. Comput. Speech Lang. 27(3): 851-873 (2013) - [j11]Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto:
Dominance Based Integration of Spatial and Spectral Features for Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 21(12): 2516-2531 (2013) - [c25]Marc Delcroix, Atsunori Ogawa, Seong-Jun Hahm, Tomohiro Nakatani, Atsushi Nakamura:
Unsupervised discriminative adaptation using differenced maximum mutual information based linear regression. ICASSP 2013: 7888-7892 - [c24]Seong-Jun Hahm, Atsunori Ogawa, Marc Delcroix, Masakiyo Fujimoto, Takaaki Hori, Atsushi Nakamura:
Feature space variational Bayesian linear regression and its combination with model space VBLR. ICASSP 2013: 7898-7902 - [c23]Roland Maas, Walter Kellermann, Armin Sehr, Takuya Yoshioka, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani:
Formulation of the REMOS concept from an uncertainty decoding perspective. DSP 2013: 1-6 - [c22]Marc Delcroix, Yotaro Kubo, Tomohiro Nakatani, Atsushi Nakamura:
Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling? INTERSPEECH 2013: 2992-2996 - [c21]Armin Sehr, Takuya Yoshioka, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Roland Maas, Walter Kellermann:
Conditional emission densities for combining speech enhancement and recognition systems. INTERSPEECH 2013: 3502-3506 - [c20]Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Armin Sehr, Walter Kellermann, Roland Maas:
The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech. WASPAA 2013: 1-4 - 2012
- [j10]Mehrez Souden, Marc Delcroix, Keisuke Kinoshita, Takuya Yoshioka, Tomohiro Nakatani:
Noise Power Spectral Density Tracking: A Maximum Likelihood Perspective. IEEE Signal Process. Lett. 19(8): 495-498 (2012) - [j9]Takuya Yoshioka, Armin Sehr, Marc Delcroix, Keisuke Kinoshita, Roland Maas, Tomohiro Nakatani, Walter Kellermann:
Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition. IEEE Signal Process. Mag. 29(6): 114-126 (2012) - [c19]Takuya Yoshioka, Armin Sehr, Marc Delcroix, Keisuke Kinoshita, Roland Maas, Tomohiro Nakatani, Walter Kellermann:
Survey on approaches to speech recognition in reverberant environments. APSIPA 2012: 1-4 - [c18]Tomohiro Nakatani, Takuya Yoshioka, Shoko Araki, Marc Delcroix, Masakiyo Fujimoto:
LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise. ICASSP 2012: 4029-4032 - [c17]Marc Delcroix, Atsunori Ogawa, Shinji Watanabe, Tomohiro Nakatani, Atsushi Nakamura:
Discriminative feature transforms using differenced maximum mutual information. ICASSP 2012: 4753-4756 - [c16]Keisuke Kinoshita, Marc Delcroix, Mehrez Souden, Tomohiro Nakatani:
Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise. INTERSPEECH 2012: 1926-1929 - [c15]Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani, Atsushi Nakamura:
Dynamic variance adaptation using differenced maximum mutual information. MLSLP 2012: 9-12 - [c14]Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Distributed microphone array processing for speech source separation with classifier fusion. MLSP 2012: 1-6 - 2011
- [c13]Keisuke Kinoshita, Mehrez Souden, Marc Delcroix, Tomohiro Nakatani:
Single Channel Dereverberation Using Example-Based Speech Enhancement with Uncertainty Decoding Technique. INTERSPEECH 2011: 197-200 - [c12]Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
A Multichannel Feature-Based Processing for Robust Speech Recognition. INTERSPEECH 2011: 689-692 - [c11]Tomohiro Nakatani, Shoko Araki, Marc Delcroix, Takuya Yoshioka, Masakiyo Fujimoto:
Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASR. INTERSPEECH 2011: 1785-1788 - [p2]Marc Delcroix, Shinji Watanabe, Tomohiro Nakatani:
Variance Compensation for Recognition of Reverberant Speech with Dereverberation Preprocessing. Robust Speech Recognition of Uncertain or Missing Data 2011: 225-255 - 2010
- [p1]Masato Miyoshi, Marc Delcroix, Keisuke Kinoshita, Takuya Yoshioka, Tomohiro Nakatani, Takafumi Hikichi:
Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information. Speech Dereverberation 2010: 271-310
2000 – 2009
- 2009
- [j8]Marc Delcroix, Tomohiro Nakatani, Shinji Watanabe:
Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing. IEEE Trans. Speech Audio Process. 17(2): 324-334 (2009) - [j7]Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi:
Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-step Linear Prediction. IEEE Trans. Speech Audio Process. 17(4): 534-545 (2009) - 2008
- [j6]Masato Miyoshi, Marc Delcroix, Keisuke Kinoshita:
Calculating Inverse Filters for Speech Dereverberation. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 91-A(6): 1303-1309 (2008) - [j5]Tomohiro Nakatani, Biing-Hwang Juang, Takuya Yoshioka, Keisuke Kinoshita, Marc Delcroix, Masato Miyoshi:
Speech Dereverberation Based on Maximum-Likelihood Estimation With Time-Varying Gaussian Source Model. IEEE Trans. Speech Audio Process. 16(8): 1512-1527 (2008) - [c10]Marc Delcroix, Tomohiro Nakatani, Shinji Watanabe:
Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer. ICASSP 2008: 4073-4076 - [c9]Dorothea Kolossa, Shoko Araki, Marc Delcroix, Tomohiro Nakatani, Reinhold Orglmeister, Shoji Makino:
Missing feature speech recognition in a meeting situation with maximum SNR beamforming. ISCAS 2008: 3218-3221 - 2007
- [j4]Takafumi Hikichi, Marc Delcroix, Masato Miyoshi:
Inverse Filtering for Speech Dereverberation Less Sensitive to Noise and Room Transfer Function Fluctuations. EURASIP J. Adv. Signal Process. 2007 (2007) - [j3]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Precise Dereverberation Using Multichannel Linear Prediction. IEEE Trans. Speech Audio Process. 15(2): 430-440 (2007) - [j2]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Dereverberation and Denoising Using Multichannel Linear Prediction. IEEE Trans. Speech Audio Process. 15(6): 1791-1801 (2007) - [c8]Tomohiro Nakatani, Biing-Hwang Juang, Takafumi Hikichi, Takuya Yoshioka, Keisuke Kinoshita, Marc Delcroix, Masato Miyoshi:
Study on Speech Dereverberation with Autocorrelation Codebook. ICASSP (1) 2007: 193-196 - [c7]Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi:
Multi-step linear prediction based speech dereverberation in noisy reverberant environment. INTERSPEECH 2007: 854-857 - [c6]Tomohiro Nakatani, Takafumi Hikichi, Keisuke Kinoshita, Takuya Yoshioka, Marc Delcroix, Masato Miyoshi, Biing-Hwang Juang:
Robust blind dereverberation of speech signals based on characteristics of short-time speech segments. ISCAS 2007: 2986-2989 - 2006
- [j1]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
On a Blind Speech Dereverberation Algorithm Using Multi-Channel Linear Prediction. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 89-A(10): 2837-2846 (2006) - [c5]Takafumi Hikichi, Marc Delcroix, Masato Miyoshi:
On robust inverse filter design for room transfer function fluctuations. EUSIPCO 2006: 1-5 - [c4]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
On the Use of Lime Dereverberation Algorithm in an Acoustic Environment With a Noise Source. ICASSP (1) 2006: 825-828 - 2005
- [c3]Takafumi Hikichi, Marc Delcroix, Masato Miyoshi:
Blind Dereverberation based on Estimates of Signal Transmission Channels without Precise Information of Channel Order. ICASSP (1) 2005: 1069-1072 - [c2]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Improved blind dereverberation performance by using spatial information. INTERSPEECH 2005: 2309-2312 - 2004
- [c1]Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Dereverberation of speech signals based on linear prediction. INTERSPEECH 2004: 877-880
last updated on 2024-12-01 00:13 CET by the dblp team
all metadata released as open data under CC0 1.0 license