Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Python, 2.9k stars, 264 forks
Official implementation of "LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models"
Python, 579 stars, 37 forks
Fully Convolutional Networks for Panoptic Segmentation (CVPR 2021 Oral)
Python, 388 stars, 53 forks
Learning Dynamic Routing for Semantic Segmentation
Python, 377 stars, 46 forks
Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)
Python, 217 stars, 14 forks
Voxel Field Fusion for 3D Object Detection (CVPR 2022)
Python, 92 stars, 11 forks