Best 27 Multi Modality Open Source Projects

☁️ Build multimodal AI applications with cloud-native stack

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V...

🏄 Scalable embedding, reasoning, ranking for images and sentences with ...

✨✨ Latest Advances on Multimodal Large Language Models

Simple command line tool for text to image generation using OpenAI's CLI...

Extract markdown and images from URLs, PDFs, docs, slides, and more, rea...

Algorithms and Publications on 3D Object Tracking

Collaborative Diffusion (CVPR 2023)

Effortless plug-and-play optimizer to cut model training costs by 50%...

[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint ...

[ICCV2019] Robust Multi-Modality Multi-Object Tracking

Unifying Voxel-based Representation with Transformer for 3D Object Detec...

This repo contains the official code of our work SAM-SLR which won the C...

[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from ...

Official code for NeurIPS2023 paper: CoDA: Collaborative Novel Box Disco...