View vealocia's full-sized avatar

Highlights

  • Pro

Organizations

@hustvl

Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Python 2,334 130 Updated Sep 24, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,695 112 Updated Sep 19, 2024

[ICLR 2024 Spotlight] Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments

Python 18 1 Updated Mar 21, 2024

Official code for Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking

C++ 40 1 Updated Sep 11, 2024

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,415 105 Updated Jul 5, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 4,858 372 Updated Aug 7, 2024

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 13,548 1,103 Updated Sep 24, 2024

[CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958)

Python 43 Updated Sep 5, 2023

Bottom-up Object Detection by Grouping Extreme and Center Points

Python 1,031 172 Updated Apr 19, 2019

Multi-Scale Spatio-Temporal Attention based Video Instance Segmentation

Python 39 3 Updated Sep 2, 2022

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Python 1,336 134 Updated Dec 8, 2023

MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.

C++ 4,254 703 Updated Jul 29, 2024

Real-time Object Detection for Streaming Perception, CVPR 2022

Python 303 40 Updated Sep 21, 2022

[Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)

Python 204 20 Updated Aug 3, 2022

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

Python 391 24 Updated Jan 27, 2023

[CVPR 2022] SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation

Python 588 71 Updated Oct 20, 2023

Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral

Python 237 17 Updated Mar 4, 2023

Official Implementation of DE-DETR and DELA-DETR in "Towards Data-Efficient Detection Transformers"

Python 80 8 Updated Mar 10, 2024

Best Practices, code samples, and documentation for Computer Vision.

Jupyter Notebook 9,467 1,170 Updated Feb 16, 2024

EDTER: Edge Detection with Transformer, in CVPR 2022

MATLAB 275 34 Updated Nov 23, 2023

[CVPR2023] All in One: Exploring Unified Video-Language Pre-training

Python 280 16 Updated Mar 25, 2023

Code for "Deep Snake for Real-Time Instance Segmentation" CVPR 2020 oral

Jupyter Notebook 1,155 229 Updated May 3, 2024

A general and accurate MACs / FLOPs profiler for PyTorch models

Python 559 38 Updated May 5, 2024

Official MegEngine implementation of RepLKNet

Python 269 19 Updated Apr 17, 2022

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs (CVPR 2022)

Python 862 86 Updated Apr 24, 2024

[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"

Jupyter Notebook 506 87 Updated Jun 2, 2023

[TPAMI 2024 & CVPR 2022] Attention Concatenation Volume for Accurate and Efficient Stereo Matching

Python 453 63 Updated Aug 8, 2024

(AAAI 2023 Oral) PyTorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer"

Python 102 7 Updated Jul 4, 2023

[ECCV 2022] Official implementation of the paper "RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation".

Python 74 3 Updated Feb 12, 2023

maximal update parametrization (µP)

Jupyter Notebook 1,353 93 Updated Jul 17, 2024