| Model | Modality | Description | Resources |
| --- | --- | --- | --- |
| E5 | Language | Deploy E5, a text embedding model series. | Colab, Model card |
| Instant ID | Language, Vision | Deploy Instant ID, an identity-preserving text-to-image generation model. | Colab, Model card |
| Llama 3 | Language | Explore and build with Meta's Llama 3 models (8B, 70B) on Vertex AI. | Model card |
| Gemma | Language | Open weight models (2B, 7B) built from the same research and technology used to create Google's Gemini models. | Model card |
| CodeGemma | Language | Open weight models (2B, 7B) designed for code generation and code completion, built from the same research and technology used to create Google's Gemini models. | Model card |
| PaliGemma | Language, Vision | Open weight 3B model designed for image captioning and visual question answering tasks, built from the same research and technology used to create Google's Gemini models. | Model card |
| Vicuna v1.5 | Language | Deploy Vicuna v1.5 series models, foundation models fine-tuned from Llama 2 for text generation. | Model card |
| NLLB | Language | Deploy NLLB series models for multi-language translation. | Model card, Colab |
| Mistral-7B | Language | Deploy Mistral-7B, a foundational model for text generation. | Model card, Colab |
| BioGPT | Language | Deploy BioGPT, a generative text model for the biomedical domain. | Model card, Colab |
| BiomedCLIP | Language, Vision | Deploy BiomedCLIP, a multimodal foundation model for the biomedical domain. | Model card, Colab |
| ImageBind | Language, Vision, Audio | Deploy ImageBind, a foundational model for multimodal embedding. | Model card, Colab |
| DITO | Language, Vision | Finetune and deploy DITO, a multimodal foundation model for open vocabulary object detection tasks. | Model card, Colab |
| OWL-ViT v2 | Language, Vision | Deploy OWL-ViT v2, a multimodal foundation model for open vocabulary object detection tasks. | Model card, Colab |
| FaceStylizer (MediaPipe) | Vision | A generative pipeline to transform human face images into a new style. | Model card, Colab |
| Llama 2 | Language | Finetune and deploy Meta's Llama 2 foundation models (7B, 13B, 70B) on Vertex AI. | Model card |
| Code Llama | Language | Deploy Meta's Code Llama foundation models (7B, 13B, 34B) on Vertex AI. | Model card |
| Falcon-instruct | Language | Finetune and deploy Falcon-instruct models (7B, 40B) by using PEFT. | Colab, Model card |
| OpenLLaMA | Language | Finetune and deploy OpenLLaMA models (3B, 7B, 13B) by using PEFT. | Colab, Model card |
| T5-FLAN | Language | Finetune and deploy T5-FLAN (base, small, large). | Model card (fine-tuning pipeline included) |
| BERT | Language | Finetune and deploy BERT by using PEFT. | Colab, Model card |
| BART-large-cnn | Language | Deploy BART, a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. | Colab, Model card |
| RoBERTa-large | Language | Finetune and deploy RoBERTa-large by using PEFT. | Colab, Model card |
| XLM-RoBERTa-large | Language | Finetune and deploy XLM-RoBERTa-large (a multilingual version of RoBERTa) by using PEFT. | Colab, Model card |
| Dolly-v2-7b | Language | Deploy Dolly-v2-7b, an instruction-following large language model with 6.9 billion parameters. | Colab, Model card |
| Stable Diffusion XL v1.0 | Language, Vision | Deploy Stable Diffusion XL v1.0, which supports text-to-image generation. | Colab, Model card |
| Stable Diffusion XL Lightning | Language, Vision | Deploy Stable Diffusion XL Lightning, a text-to-image generation model. | Colab, Model card |
| Stable Diffusion v2.1 | Language, Vision | Finetune and deploy Stable Diffusion v2.1 (supports text-to-image generation) by using DreamBooth. | Colab, Model card |
| Stable Diffusion 4x upscaler | Language, Vision | Deploy Stable Diffusion 4x upscaler, which supports text-conditioned image super-resolution. | Colab, Model card |
| InstructPix2Pix | Language, Vision | Deploy InstructPix2Pix, which supports image editing by using a text prompt. | Colab, Model card |
| Stable Diffusion Inpainting | Language, Vision | Finetune and deploy Stable Diffusion Inpainting, which supports inpainting a masked image by using a text prompt. | Colab, Model card |
| SAM | Language, Vision | Deploy Segment Anything (SAM), which supports zero-shot image segmentation. | Colab, Model card |
| Text-to-video (ModelScope) | Language, Vision | Deploy ModelScope text-to-video, which supports text-to-video generation. | Colab, Model card |
| Text-to-video zero-shot | Language, Vision | Deploy Stable Diffusion text-to-video generators, which support zero-shot text-to-video generation. | Colab, Model card |
| Pic2Word Composed Image Retrieval | Language, Vision | Deploy Pic2Word, which supports multimodal composed image retrieval. | Colab, Model card |
| BLIP2 | Language, Vision | Deploy BLIP2, which supports image captioning and visual question answering. | Colab, Model card |
| Open-CLIP | Language, Vision | Finetune and deploy Open-CLIP, which supports zero-shot classification. | Colab, Model card |
| F-VLM | Language, Vision | Deploy F-VLM, which supports open vocabulary image object detection. | Colab, Model card |
| tfhub/EfficientNetV2 | Vision | Finetune and deploy the TensorFlow Vision implementation of the EfficientNetV2 image classification model. | Colab, Model card |
| EfficientNetV2 (TIMM) | Vision | Finetune and deploy the PyTorch implementation of the EfficientNetV2 image classification model. | Colab, Model card |
| Proprietary/EfficientNetV2 | Vision | Finetune and deploy the Google proprietary checkpoint of the EfficientNetV2 image classification model. | Colab, Model card |
| EfficientNetLite (MediaPipe) | Vision | Finetune the EfficientNetLite image classification model by using MediaPipe Model Maker. | Colab, Model card |
| tfvision/vit | Vision | Finetune and deploy the TensorFlow Vision implementation of the ViT image classification model. | Colab, Model card |
| ViT (TIMM) | Vision | Finetune and deploy the PyTorch implementation of the ViT image classification model. | Colab, Model card |
| Proprietary/ViT | Vision | Finetune and deploy the Google proprietary checkpoint of the ViT image classification model. | Colab, Model card |
| Proprietary/MaxViT | Vision | Finetune and deploy the Google proprietary checkpoint of the MaxViT hybrid (CNN + ViT) image classification model. | Colab, Model card |
| ViT (JAX) | Vision | Finetune and deploy the JAX implementation of the ViT image classification model. | Colab, Model card |
| tfvision/SpineNet | Vision | Finetune and deploy the TensorFlow Vision implementation of the SpineNet object detection model. | Colab, Model card |
| Proprietary/SpineNet | Vision | Finetune and deploy the Google proprietary checkpoint of the SpineNet object detection model. | Colab, Model card |
| tfvision/YOLO | Vision | Finetune and deploy the TensorFlow Vision implementation of the YOLO one-stage object detection model. | Colab, Model card |
| Proprietary/YOLO | Vision | Finetune and deploy the Google proprietary checkpoint of the YOLO one-stage object detection model. | Colab, Model card |
| YOLOv8 (Keras) | Vision | Finetune and deploy the Keras implementation of the YOLOv8 model for object detection. | Colab, Model card |
| tfvision/YOLOv7 | Vision | Finetune and deploy the YOLOv7 model for object detection. | Colab, Model card |
| ByteTrack Video Object Tracking | Vision | Run batch prediction for video object tracking by using the ByteTrack tracker. | Colab, Model card |
| ResNeSt (TIMM) | Vision | Finetune and deploy the PyTorch implementation of the ResNeSt image classification model. | Colab, Model card |
| ConvNeXt (TIMM) | Vision | Finetune and deploy ConvNeXt, a pure convolutional model for image classification inspired by the design of Vision Transformers. | Colab, Model card |
| CspNet (TIMM) | Vision | Finetune and deploy the CSPNet (Cross Stage Partial Network) image classification model. | Colab, Model card |
| Inception (TIMM) | Vision | Finetune and deploy the Inception image classification model. | Colab, Model card |
| DeepLabv3+ (with checkpoint) | Vision | Finetune and deploy the DeepLabv3+ model for semantic image segmentation. | Colab, Model card |
| Faster R-CNN (Detectron2) | Vision | Finetune and deploy the Detectron2 implementation of the Faster R-CNN model for image object detection. | Colab, Model card |
| RetinaNet (Detectron2) | Vision | Finetune and deploy the Detectron2 implementation of the RetinaNet model for image object detection. | Colab, Model card |
| Mask R-CNN (Detectron2) | Vision | Finetune and deploy the Detectron2 implementation of the Mask R-CNN model for image object detection and segmentation. | Colab, Model card |
| ControlNet | Vision | Finetune and deploy the ControlNet text-to-image generation model. | Colab, Model card |
| MobileNet (TIMM) | Vision | Finetune and deploy the PyTorch implementation of the MobileNet image classification model. | Colab, Model card |
| MobileNetV2 (MediaPipe) Image Classification | Vision | Finetune the MobileNetV2 image classification model by using MediaPipe Model Maker. | Colab, Model card |
| MobileNetV2 (MediaPipe) Object Detection | Vision | Finetune the MobileNetV2 object detection model by using MediaPipe Model Maker. | Colab, Model card |
| MobileNet-MultiHW-AVG (MediaPipe) | Vision | Finetune the MobileNet-MultiHW-AVG object detection model by using MediaPipe Model Maker. | Colab, Model card |
| DeiT | Vision | Finetune and deploy the DeiT (Data-efficient Image Transformers) model for image classification. | Colab, Model card |
| BEiT | Vision | Finetune and deploy the BEiT (Bidirectional Encoder representation from Image Transformers) model for image classification. | Colab, Model card |
| Hand Gesture Recognition (MediaPipe) | Vision | Finetune and deploy the Hand Gesture Recognition models on-device by using MediaPipe. | Colab, Model card |
| Average Word Embedding Classifier (MediaPipe) | Language | Finetune and deploy the Average Word Embedding Classifier models on-device by using MediaPipe. | Colab, Model card |
| MobileBERT Classifier (MediaPipe) | Language | Finetune and deploy the MobileBERT Classifier models on-device by using MediaPipe. | Colab, Model card |
| MoViNet Video Clip Classification | Video | Finetune and deploy MoViNet video clip classification models. | Colab, Model card |
| MoViNet Video Action Recognition | Video | Finetune and deploy MoViNet models for action recognition inference. | Colab, Model card |
| Stable Diffusion XL LCM | Vision | Deploy Stable Diffusion XL LCM, which applies the Latent Consistency Model (LCM) to latent diffusion, enabling faster, high-quality text-to-image generation with fewer steps. | Colab, Model card |
| LLaVA 1.5 | Vision, Language | Deploy LLaVA 1.5 models. | Colab, Model card |
| Pytorch-ZipNeRF | Vision, Video | Train Pytorch-ZipNeRF, a state-of-the-art implementation of the ZipNeRF algorithm in the PyTorch framework, designed for efficient and accurate 3D reconstruction from 2D images. | Colab, Model card |
| WizardLM | Language | Deploy WizardLM, a large language model (LLM) developed by Microsoft and fine-tuned on complex instructions by adapting the Evol-Instruct method. | Colab, Model card |
| WizardCoder | Language | Deploy WizardCoder, a large language model (LLM) developed by Microsoft and fine-tuned on complex instructions by adapting the Evol-Instruct method to the domain of code. | Colab, Model card |
| Mixtral 8x7B | Language | Deploy Mixtral 8x7B, a Mixture of Experts (MoE) large language model (LLM) developed by Mistral AI. It is a decoder-only model with 46.7B parameters, reported to match or outperform Llama 2 70B and GPT-3.5 on many benchmarks. | Colab, Model card |
| Llama 2 (Quantized) | Language | Finetune and deploy a quantized version of Meta's Llama 2 models. | Colab, Model card |
| LaMa (Large Mask Inpainting) | Vision | Deploy LaMa, which uses fast Fourier convolutions (FFCs), a high receptive field perceptual loss, and large training masks for resolution-robust image inpainting. | Colab, Model card |
| AutoGluon | Tabular | Train and deploy high-accuracy machine learning and deep learning models for tabular data with AutoGluon. | Colab, Model card |
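The Colabs linked above typically address each catalog entry as a Vertex AI publisher-model resource of the form `publishers/{publisher}/models/{model_id}`. The helper below is a minimal sketch of that naming convention; the function name and the example publisher and model IDs are illustrative assumptions, not values taken from the table.

```python
def publisher_model_resource(publisher: str, model_id: str) -> str:
    """Build a Vertex AI Model Garden publisher-model resource name.

    Publisher models are addressed as
    'publishers/{publisher}/models/{model_id}'. This helper and the
    example IDs below are illustrative assumptions, not identifiers
    taken from the catalog above.
    """
    return f"publishers/{publisher}/models/{model_id}"

# Hypothetical IDs for two rows of the table above:
print(publisher_model_resource("google", "gemma"))  # publishers/google/models/gemma
print(publisher_model_resource("meta", "llama3"))   # publishers/meta/models/llama3
```

In practice the resulting string is passed to the Vertex AI SDK (the `google-cloud-aiplatform` package) or to the REST API when instantiating or deploying the model, with the exact ID spelled out in each entry's model card.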