Use the best practices listed here as a quick reference
when building an application that uses Cloud Firestore.
Database location
When you create your database instance, select the database location
closest to your users and compute resources.
Far-reaching network hops are more error-prone and increase query latency.
To maximize the availability and durability of your application, select a
multi-region location and place critical compute resources in at least two
regions.
Select a regional location for lower costs, for lower write latency if your
application is sensitive to latency, or for co-location with other GCP
resources.
Document IDs
Field Names
Indexes
Reduce write latency
The main contributor to write latency is index fanout. The best practices to
reduce index fanout are:
- Set collection-level index exemptions. An easy default is to disable
  Descending & Array indexing. Removing unused indexed values will also lower
  storage costs.
- Reduce the number of documents in a transaction. For writing a large number
  of documents, consider using a bulk writer instead of the atomic batch
  writer.
Index exemptions
For most apps, you can rely on automatic indexing as well as any error message
links to manage your indexes. However, you may want to add single-field
exemptions in the following cases:
| Case | Description |
| --- | --- |
| Large string fields | If you have a string field that often holds long string values that you don't use for querying, you can cut storage costs by exempting the field from indexing. |
| High write rates to a collection containing documents with sequential values | If you index a field that increases or decreases sequentially between documents in a collection, like a timestamp, then the maximum write rate to the collection is 500 writes per second. If you don't query based on the field with sequential values, you can exempt the field from indexing to bypass this limit. In an IoT use case with a high write rate, for example, a collection containing documents with a timestamp field might approach the 500 writes per second limit. |
| TTL fields | If you use TTL (time-to-live) policies, note that the TTL field must be a timestamp. Indexing on TTL fields is enabled by default and can affect performance at higher traffic rates. As a best practice, add single-field exemptions for your TTL fields. |
| Large array or map fields | Large array or map fields can approach the limit of 40,000 index entries per document. If you are not querying based on a large array or map field, you should exempt it from indexing. |
Read and write operations
The exact maximum rate at which an app can update a single document depends
highly on the workload. For more information, see Updates to a single document.
Use asynchronous calls where available instead of synchronous calls.
Asynchronous calls minimize latency impact. For example, consider an application
that needs the result of a document lookup and the results of a query before
rendering a response. If the lookup and the query do not have a data dependency,
there is no need to synchronously wait until the lookup completes before
initiating the query.
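As a minimal sketch of the lookup-plus-query case, the following uses Python's `asyncio` to start both operations before waiting on either. The coroutines `lookup_document` and `run_query` are hypothetical stand-ins that only simulate a network round trip; they are not Cloud Firestore calls.

```python
import asyncio
import time


async def lookup_document(doc_id):
    """Hypothetical stand-in for an asynchronous document lookup."""
    await asyncio.sleep(0.2)  # simulated network round trip
    return {"id": doc_id, "name": "example"}


async def run_query(field, value):
    """Hypothetical stand-in for an asynchronous query."""
    await asyncio.sleep(0.2)  # simulated network round trip
    return [{"id": "a", field: value}, {"id": "b", field: value}]


async def render_inputs(doc_id):
    # The lookup and the query have no data dependency, so start both
    # and wait for them together instead of one after the other.
    document, results = await asyncio.gather(
        lookup_document(doc_id),
        run_query("owner", doc_id),
    )
    return document, results


start = time.perf_counter()
document, results = asyncio.run(render_inputs("user-1"))
elapsed = time.perf_counter() - start
```

Because the two simulated calls overlap, the total wait is close to the slower of the two rather than their sum.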
Do not use offsets. Instead, use cursors. Using an offset only avoids
returning the skipped documents to your application, but these documents are
still retrieved internally. The skipped documents affect the latency of the
query, and your application is billed for the read operations required to
retrieve them.
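Cursor-based paging might be sketched as follows. The helper `iterate_in_pages` is a hypothetical name, and it assumes `query` is an already ordered query object that exposes `limit`, `start_after`, and `stream`, in the style of the Python client used in the samples on this page.

```python
def iterate_in_pages(query, page_size=100):
    """Yield every document of an ordered query, one page at a time.

    `query` is assumed to be an ordered query (for example the result of
    `collection.order_by(...)`) exposing `limit`, `start_after`, `stream`.
    """
    last_doc = None
    while True:
        page_query = query
        if last_doc is not None:
            # Resume right after the last document already seen, instead
            # of asking the server to skip documents with an offset.
            page_query = page_query.start_after(last_doc)
        page = list(page_query.limit(page_size).stream())
        if not page:
            return
        yield from page
        last_doc = page[-1]
        if len(page) < page_size:
            # A short page means the result set is exhausted.
            return
```

Each page starts from the last document seen, so no page pays for documents that earlier pages already returned.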
Transaction retries
The Cloud Firestore SDKs and client libraries automatically retry failed
transactions to deal with transient errors. If your application accesses
Cloud Firestore through the REST or RPC APIs directly instead of through an
SDK, your application should implement transaction retries to increase
reliability.
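For applications that call the REST or RPC APIs directly, a retry wrapper could look roughly like the sketch below. The exponential backoff with jitter is an assumption added for illustration, not a policy stated on this page. `TransientError` and `run_with_retries` are hypothetical names; map `TransientError` to whatever retryable errors your transport reports.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors that are safe to retry."""


def run_with_retries(attempt_transaction, max_attempts=5,
                     base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call `attempt_transaction` until it succeeds or attempts run out.

    `attempt_transaction` must perform the whole transaction (begin, reads,
    writes, commit) so that every retry starts from fresh reads.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return attempt_transaction()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Exponential backoff with full jitter (an assumed policy).
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

The `sleep` parameter is injectable so that the wrapper can be exercised in tests without real delays.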
Real-time updates
For best practices related to real-time updates, see
Understand real-time queries at scale.
Designing for scale
The following best practices describe how to avoid situations that
create contention issues.
Updates to a single document
As you design your app, consider how quickly your app updates single documents.
The best way to characterize your workload's performance is to perform load
testing. The exact maximum rate at which an app can update a single document
depends highly on the workload. Factors include the write rate, contention among requests, and the number of affected indexes.
A document write operation updates the document and any associated indexes,
and Cloud Firestore synchronously applies the write operation across
a quorum of replicas. At high enough write rates, the database will start to
encounter contention, higher latency, or other errors.
High read, write, and delete rates to a narrow document range
Avoid high read or write rates to lexicographically close documents, or your
application will experience contention errors. This issue is known as
hotspotting, and your application can experience hotspotting if it does any of
the following:
- Creates new documents at a very high rate and allocates its own
  monotonically increasing IDs.
  Cloud Firestore allocates document IDs using a scatter algorithm. You
  should not encounter hotspotting on writes if you create new documents using
  automatic document IDs.
- Creates new documents at a high rate in a collection with few documents.
- Creates new documents with a monotonically increasing field, like a
  timestamp, at a very high rate.
- Deletes documents in a collection at a high rate.
- Writes to the database at a very high rate without gradually increasing
  traffic.
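If your application has to assign its own document IDs, one way to avoid monotonically increasing IDs is to generate them randomly. The sketch below is only an illustration: the helper name, the 20-character length, and the alphabet are assumptions, and it does not describe how Cloud Firestore generates automatic IDs.

```python
import secrets
import string

# Assumed alphabet for illustration: ASCII letters and digits.
_ALPHABET = string.ascii_letters + string.digits


def random_document_id(length=20):
    """Return a random alphanumeric ID with no ordering between calls."""
    return "".join(secrets.choice(_ALPHABET) for _ in range(length))
```

Because consecutive IDs are unrelated, new documents are spread across the key range instead of landing next to each other.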
Avoid skipping over deleted data
Avoid queries that skip over recently deleted data. A query may have to skip
over a large number of index entries if the early query results have recently
been deleted.
An example of a workload that might have to skip over a lot of deleted data is
one that tries to find the oldest queued work items. The query might look like:
docs = db.collection('WorkItems').order_by('created').limit(100)
delete_batch = db.batch()
for doc in docs.stream():
    finish_work(doc)
    delete_batch.delete(doc.reference)
delete_batch.commit()
Each time this query runs it scans over the index entries for the created
field on any recently deleted documents. This slows down queries.
To improve the performance, use the start_at method to find the best
place to start. For example:
# Read the 'created' value of the last work item that was completed.
completed_items = db.collection('CompletionStats').document('all stats').get()
# Start the scan at that value so the query skips the deleted entries.
docs = db.collection('WorkItems').start_at(
    {'created': completed_items.get('last_completed')}).order_by(
        'created').limit(100)
delete_batch = db.batch()
last_completed = None
for doc in docs.stream():
    finish_work(doc)
    delete_batch.delete(doc.reference)
    last_completed = doc.get('created')

if last_completed:
    # Record the new starting point in the same batch as the deletes.
    delete_batch.update(completed_items.reference,
                        {'last_completed': last_completed})
    delete_batch.commit()
NOTE: The example above uses a monotonically increasing field, which is an anti-pattern for high write rates.
Ramping up traffic
You should gradually ramp up traffic to new collections or lexicographically
close documents to give Cloud Firestore sufficient time to prepare
documents for increased traffic. We recommend starting with a maximum of 500
operations per second to a new collection and then increasing traffic by 50%
every 5 minutes. This is called the "500/50/5" rule. You can similarly ramp up
your write traffic, but keep in mind the Cloud Firestore Standard Limits. Be
sure that operations are distributed relatively evenly throughout the key
range.
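To see how long a ramp-up takes, the arithmetic of starting at 500 operations per second and growing by 50% every 5 minutes can be worked through in a few lines. The function name and parameters below are hypothetical; only the default values come from the recommendation above.

```python
def ramp_schedule(target_ops, start_ops=500, growth=1.5, step_minutes=5):
    """Return (minute, max ops per second) steps up to `target_ops`."""
    if growth <= 1:
        raise ValueError("growth must be greater than 1")
    schedule = []
    minute, ops = 0, float(start_ops)
    while ops < target_ops:
        schedule.append((minute, ops))
        ops *= growth            # increase traffic by 50%
        minute += step_minutes   # every 5 minutes
    # The final step is capped at the target rate.
    schedule.append((minute, float(target_ops)))
    return schedule


for minute, ops in ramp_schedule(2000):
    print(f"t+{minute:>2} min: up to {ops:.0f} ops/s")
```

For a target of 2,000 operations per second, the steps are 500, 750, 1,125, and 1,687.5 operations per second, and the target is reached 20 minutes after the start.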
Migrating traffic to a new collection
Gradual ramp up is particularly important if you migrate app traffic from one
collection to another. A simple way to handle this migration is to read from the
old collection, and if the document does not exist, then read from the new
collection. However, this could cause a sudden increase of traffic to
lexicographically close documents in the new collection. Cloud Firestore
may be unable to efficiently prepare the new collection for increased traffic,
especially when it contains few documents.
A similar problem can occur if you change the document IDs of many documents
within the same collection.
The best strategy for migrating traffic to a new collection depends on your data
model. Below is an example strategy known as parallel reads. You will need to
determine whether or not this strategy is effective for your data, and an
important consideration will be the cost impact of parallel operations during
the migration.
Parallel reads
To implement parallel reads as you migrate traffic to a new collection, read
from the old collection first. If the document is missing, then read from the
new collection. A high rate of reads of non-existent documents can lead to
hotspotting, so be sure to gradually increase load to the new
collection. A better strategy is to copy the old document to the new collection
and then delete the old document. Ramp up parallel reads gradually to ensure that
Cloud Firestore can handle traffic to the new collection.
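The read path described above (old collection first, then the new one) might be sketched like this. `read_during_migration` is a hypothetical helper. It assumes both arguments are collection references whose `document(doc_id).get()` returns a snapshot with an `exists` attribute, as in the Python client used in the samples on this page.

```python
def read_during_migration(old_collection, new_collection, doc_id):
    """Return the snapshot for `doc_id`, or None if neither collection has it.

    Both arguments are assumed to be collection references that expose
    `document(doc_id).get()` returning a snapshot with an `exists` flag.
    """
    snapshot = old_collection.document(doc_id).get()
    if snapshot.exists:
        return snapshot
    # Not in the old collection: it may already have been migrated.
    snapshot = new_collection.document(doc_id).get()
    return snapshot if snapshot.exists else None
```

Note that every miss in the old collection costs a second read, which is part of the cost impact to weigh for this strategy.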
A possible strategy for gradually ramping up reads or writes to a new collection
is to use a deterministic hash of the user ID to select a random percentage of
users attempting to write new documents. Be sure that the result of the user
ID hash is not skewed either by your function or by user behavior.
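One possible way to select a stable percentage of users is to hash the user ID into 100 buckets and compare the bucket with the current rollout percentage. The function name, the choice of SHA-256, and the bucket count are assumptions for illustration.

```python
import hashlib


def in_rollout(user_id, percent):
    """Return True if `user_id` falls in the first `percent` of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

The same user always maps to the same bucket, so raising `percent` only adds users to the rollout and never removes users who were already included.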
Meanwhile, run a batch job that copies all your data from the old documents to
the new collection. Your batch job should avoid writes to sequential document
IDs in order to prevent hotspots. When the batch job finishes, you can read only
from the new collection.
A refinement of this strategy is to migrate small batches of users at a time.
Add a field to the user document that tracks the migration status of that user.
Select a batch of users to migrate based on a hash of the user ID. Use
a batch job to migrate documents for that batch of users, and use
parallel reads for users in the middle of migration.
Note that you cannot easily roll back unless you do dual writes of both the old
and new entities during the migration phase. Dual writes would increase the
Cloud Firestore costs incurred.
Privacy
- Avoid storing sensitive information in a Cloud Project ID. A Cloud Project ID
might be retained beyond the life of your project.
- As a data compliance best practice, we recommend not storing sensitive
information in document names and document field names.
Prevent unauthorized access
Prevent unauthorized operations on your database with
Cloud Firestore Security Rules. For example, using rules can prevent a scenario where a
malicious user repeatedly downloads your entire database.
Learn more about using Cloud Firestore Security Rules.