A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Updated Nov 1, 2024 - Python
Large-scale pretrained models for goal-directed dialog
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Official PyTorch implementation of ReXNet (Rank eXpansion Network) with pretrained models
PyTorch implementation of a collection of scalable Video Transformer Benchmarks.
A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.
Real-time hand pose estimation and gesture classification using TensorRT
Image Synthesis + Corgis = <3
PyTorch implementation of "Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning" with DDP and Apex AMP
Mining Discourse Markers for Unsupervised Sentence Representation Learning
Code and released pre-trained model for our ACL 2022 paper: "DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation"
ALBERT trained on Mongolian text corpus
Implementation of Selective Kernel Networks in PyTorch with pretrained weights
An implementation of ELECTRA following the paper "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators"
Forest Fire Detection By Convolutional Neural Network
This API utilizes a pre-trained model for emotion recognition from audio files. It accepts audio files as input, processes them using the pre-trained model, and returns the predicted emotion along with the confidence score. The API leverages the FastAPI framework for easy development and deployment.
Source code of our SIGIR'24 paper titled "Leave No Patient Behind: Enhancing Medication Recommendation for Rare Disease Patients".
Real-time object detection with a pretrained YOLOv3 model loaded through OpenCV. (faster)
A repository containing a dataset and a pre-trained Snips model for Automotive Grade Linux's NLU intent engine.
BLIP Image Captioning + GPT-2 Happy Model: Generate joyful responses to image captions using state-of-the-art NLP and computer vision. Pretrained models and data preprocessing included for seamless integration. Explore the intersection of deep learning, sentiment analysis, and language generation.