# large-multimodal-models
Here are 24 public repositories matching this topic...
AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
Updated Jun 12, 2024 - Python
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Updated Feb 1, 2024 - Python
A Framework of Small-scale Large Multimodal Models
Updated Jun 11, 2024 - Python
A collection of resources on applications of multi-modal learning in medical imaging.
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Updated May 31, 2024 - Python
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Updated May 22, 2024 - Python
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Updated Mar 27, 2024 - Python
Open Platform for Embodied Agents
Updated Jun 12, 2024 - Python
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Updated Jun 12, 2024 - Python
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
Updated Apr 17, 2024 - Python
The official evaluation suite and dynamic data release for MixEval.
Updated Jun 12, 2024 - Python
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
Updated Dec 25, 2023 - Python
An open-source implementation of LLaVA-NeXT.
Updated Jun 12, 2024 - Python
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Updated May 31, 2024 - Python
A curated list of awesome Multimodal studies.
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
Updated Apr 16, 2024 - Python
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
Updated May 19, 2024 - Python
The official implementation of "Instruction-Guided Visual Masking"
Updated Jun 3, 2024 - Jupyter Notebook
An official implementation of ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Updated Jun 6, 2024 - Python
Awesome multi-modal large language model papers and projects, plus collections of popular training strategies, e.g., PEFT, LoRA.