Introduction to vector search
To provide feedback or request support for this feature, send email to
bq-vector-search@google.com.
This document provides an overview of vector search in BigQuery. Vector
search lets you search embeddings to identify semantically similar entities.
Embeddings are high-dimensional numerical vectors that represent a given entity,
like a piece of text or an audio file. Machine learning (ML) models use
embeddings to encode semantics about such entities to make it easier to
reason about and compare them. For example, a common operation in clustering,
classification, and recommendation models is to measure the distance between
vectors in an embedding space to find items that are most semantically similar.
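As a rough illustration of measuring distance in an embedding space, the following Python sketch computes Euclidean and cosine distance and picks the closest item. The three-dimensional vectors and the names `cat`, `kitten`, and `truck` are hypothetical values chosen for readability; real embeddings are produced by a model and usually have hundreds of dimensions. This is plain Python, not BigQuery code.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical embeddings, made up for this example.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}


def most_similar(name, embeddings):
    """Return the other entity whose embedding is closest by cosine distance."""
    query = embeddings[name]
    others = (key for key in embeddings if key != name)
    return min(others, key=lambda key: cosine_distance(query, embeddings[key]))


print(most_similar("cat", embeddings))  # prints "kitten"
```

A smaller distance means the two entities are treated as more semantically similar, which is why the search returns the item with the minimum distance.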
To perform a vector search, you use the VECTOR_SEARCH function and optionally a
vector index. When a vector index is used, VECTOR_SEARCH uses the Approximate
Nearest Neighbor search technique to help improve vector search performance,
with the trade-off of reducing recall and so returning more approximate results.
Brute force is used to return exact results when a vector index isn't available,
and you can choose to use brute force to get exact results even when a vector
index is available.
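The trade-off between brute force and approximate search can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BigQuery's implementation: the "index" here simply assigns each vector to the nearest of two hand-picked centroids, and the approximate search scans only the partition closest to the query. All data, centroids, and function names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(query, vectors, top_k):
    """Exact search: compare the query against every vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: euclidean(query, vectors[i]))
    return ranked[:top_k]


def build_partitions(vectors, centroids):
    """Toy index: assign each vector to its nearest centroid."""
    partitions = {c: [] for c in range(len(centroids))}
    for i, vector in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: euclidean(vector, centroids[c]))
        partitions[nearest].append(i)
    return partitions


def approximate_search(query, vectors, centroids, partitions, top_k):
    """Approximate search: scan only the partition nearest to the query."""
    nearest = min(range(len(centroids)), key=lambda c: euclidean(query, centroids[c]))
    ranked = sorted(partitions[nearest], key=lambda i: euclidean(query, vectors[i]))
    return ranked[:top_k]


def recall(approx_ids, exact_ids):
    """Fraction of the exact results that the approximate search also found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical 2-dimensional data, made up for this example.
vectors = [
    [0.0, 0.0], [1.0, 1.0], [0.0, 1.0],      # assigned to centroid 0
    [10.0, 10.0], [9.0, 10.0], [6.0, 6.0],   # assigned to centroid 1
]
centroids = [[0.0, 0.0], [10.0, 10.0]]
partitions = build_partitions(vectors, centroids)

query = [4.0, 4.0]
exact = brute_force_search(query, vectors, top_k=2)                           # [5, 1]
approx = approximate_search(query, vectors, centroids, partitions, top_k=2)   # [1, 2]
print(recall(approx, exact))  # prints 0.5
```

In this example the approximate search compares the query against three vectors instead of six, but it misses the true nearest neighbor (index 5) because that vector sits in the other partition. That is the recall cost the paragraph above describes; brute force avoids it by scanning everything.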
Pricing
The CREATE VECTOR INDEX statement and the VECTOR_SEARCH function use BigQuery
compute pricing. For the CREATE VECTOR INDEX statement, only the indexed column
is considered in the bytes processed.
There is no charge for the processing required to build and refresh your vector
indexes when the total size of indexed table data in your organization is below
the 20 TB limit. To support indexing beyond this limit, you must provide your
own reservation for handling the index management jobs. Vector indexes incur
storage costs when they are active. You can find the index storage size in the
INFORMATION_SCHEMA.VECTOR_INDEXES view. If the vector index is not yet at 100%
coverage, you are still charged for all index storage that is reported in the
INFORMATION_SCHEMA.VECTOR_INDEXES view.
Quotas and limits
For more information, see Vector index limits.
Limitations
- Queries that contain the VECTOR_SEARCH function aren't accelerated by
  BigQuery BI Engine.
- BigQuery data security and governance rules apply to the use of
  VECTOR_SEARCH. For more information, see the Limitations section in
  VECTOR_SEARCH. These rules don't apply to vector index generation.
What's next