FirestoreV1
provides an API which provides lifecycle managed
PTransform
s for
Cloud Firestore
v1 API
.
This class is part of the Firestore Connector DSL and should be accessed via
FirestoreIO.v1()
.
All
PTransform
s provided by this API use
GcpOptions
on
PipelineOptions
for credentials access and projectId
resolution. As such, the lifecycle of gRPC clients and project information is scoped to the
bundle level, not the worker level.
Operations
Read
The currently supported read operations and their execution behavior are as follows:
RPC
|
Execution Behavior
|
Example Usage
|
FirestoreV1.PartitionQuery
|
Parallel Streaming
|
PCollection
PartitionQueryRequest
> partitionQueryRequests = ...;
PCollection
RunQueryRequest
> runQueryRequests = partitionQueryRequests
.apply(FirestoreIO.v1().read().
partitionQuery()
.build());
PCollection
RunQueryResponse
> runQueryResponses = runQueryRequests
.apply(FirestoreIO.v1().read().
runQuery()
.build());
|
FirestoreV1.RunQuery
|
Sequential Streaming
|
PCollection
RunQueryRequest
> runQueryRequests = ...;
PCollection
RunQueryResponse
> runQueryResponses = runQueryRequests
.apply(FirestoreIO.v1().read().
runQuery()
.build());
|
FirestoreV1.BatchGetDocuments
|
Sequential Streaming
|
PCollection
BatchGetDocumentsRequest
> batchGetDocumentsRequests = ...;
PCollection
BatchGetDocumentsResponse
> batchGetDocumentsResponses = batchGetDocumentsRequests
.apply(FirestoreIO.v1().read().
batchGetDocuments()
.build());
|
FirestoreV1.ListCollectionIds
|
Sequential Paginated
|
PCollection
ListCollectionIdsRequest
> listCollectionIdsRequests = ...;
PCollection
ListCollectionIdsResponse
> listCollectionIdsResponses = listCollectionIdsRequests
.apply(FirestoreIO.v1().read().
listCollectionIds()
.build());
|
FirestoreV1.ListDocuments
|
Sequential Paginated
|
PCollection
ListDocumentsRequest
> listDocumentsRequests = ...;
PCollection
ListDocumentsResponse
> listDocumentsResponses = listDocumentsRequests
.apply(FirestoreIO.v1().read().
listDocuments()
.build());
|
PartitionQuery should be preferred over other options if at all possible, becuase it has the
ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
When choosing the value to set for
PartitionQueryRequest.Builder#setPartitionCount(long)
,
ensure you are picking a value this makes sense for your data set and your max number of workers.
If you find that a partition query is taking a unexpectedly long time, try increasing the
number of partitions.
Depending on how large your dataset is increasing as much as 10x can
significantly reduce total partition query wall time.
You should only ever use ListDocuments if the use of
show_missing
is needed to access a document. RunQuery and PartitionQuery will always be
faster if the use of
show_missing
is not needed.
Write
To write a
PCollection
to Cloud Firestore use
write()
, picking the
behavior of the writer.
Writes use Cloud Firestore's BatchWrite api which provides fine grained write semantics.
The default behavior is to fail a bundle if any single write fails with a non-retryable error.
PCollection
Write
> writes = ...;
PCollection
FirestoreV1.WriteSuccessSummary
> sink = writes
.apply(FirestoreIO.v1().write().
batchWrite()
.build());
Alternatively, if you'd rather output write failures to a Dead Letter Queue add
withDeadLetterQueue
when building your
writer.
PCollection
Write
> writes = ...;
PCollection
FirestoreV1.WriteFailure
> writeFailures = writes
.apply(FirestoreIO.v1().write().
batchWrite()
.withDeadLetterQueue().build());
Permissions
Permission requirements depend on the
PipelineRunner
that is used to execute the
pipeline. Please refer to the documentation of corresponding
PipelineRunner
s for more
details.
Please see
Security for server client
libraries > Roles
for security and permission related information specific to Cloud
Firestore.
Optionally, Cloud Firestore V1 Emulator, running locally, could be used for testing purposes
by providing the host port information via
FirestoreOptions.setEmulatorHost(String)
. In
such a case, all the Cloud Firestore API calls are directed to the Emulator.