Tian Tan

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

Jul 05, 2024
Via arXiv

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Jun 22, 2024
Via arXiv

Text-aware Speech Separation for Multi-talker Keyword Spotting

Jun 18, 2024
Via arXiv

Can Large Language Models Understand Spatial Audio?

Jun 12, 2024
Via arXiv

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Oct 20, 2023
Via arXiv

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Oct 10, 2023
Via arXiv

Connecting Speech Encoder and Large Language Model for ASR

Sep 26, 2023
Via arXiv

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Sep 14, 2023
Via arXiv

Multi-Modality Deep Network for Extreme Learned Image Compression

Apr 26, 2023
Via arXiv

Adjacency constraint for efficient hierarchical reinforcement learning

Oct 30, 2021
Via arXiv