Pub/Sub is an asynchronous and scalable messaging service that decouples
services producing messages from services processing those messages.
Pub/Sub allows services to communicate asynchronously, with
latencies on the order of 100 milliseconds.
Pub/Sub is used for streaming analytics and data integration
pipelines to load and distribute data. It's equally effective as a
messaging-oriented middleware for service integration or as a queue to parallelize tasks.
Pub/Sub lets you create systems of event producers and consumers, called publishers and subscribers. Publishers communicate with subscribers asynchronously by broadcasting events, rather than by synchronous remote procedure calls (RPCs).
Publishers send events to the Pub/Sub service without regard to how or when these events will be processed. Pub/Sub then delivers events to all the services that react to them. In systems communicating through RPCs, publishers must wait for subscribers to receive the data. In contrast, the asynchronous integration in Pub/Sub increases the flexibility and robustness of the overall system.
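The decoupling described above can be sketched in a few lines of plain Python. This is an illustrative in-process model, not the Pub/Sub API: the `Broker` class and its method names are invented for the sketch, and in the real service publishers and subscribers talk to Pub/Sub over the network.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Routes each published event to every subscriber of its topic."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[bytes], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[bytes], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, data: bytes) -> None:
        # The publisher returns immediately; it neither knows nor waits for
        # whoever consumes the event, unlike a synchronous RPC.
        for callback in self._subscribers[topic]:
            callback(data)

broker = Broker()
received = []
broker.subscribe("orders", received.append)   # one subscriber stores events
broker.subscribe("orders", lambda d: None)    # another reacts independently
broker.publish("orders", b"order-created:42")
```

Note that adding or removing a subscriber requires no change to the publisher, which is the flexibility the paragraph above refers to.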
To get started with Pub/Sub, check out the Quickstart using Google Cloud console. For a more comprehensive introduction, see Building a Pub/Sub messaging system.
Common use cases
- Ingesting user interaction and server events. To ingest user interaction events from end-user apps or server events from your system, forward them to Pub/Sub. You can then use a stream processing tool, such as Dataflow, to deliver the events to databases such as BigQuery, Bigtable, and Cloud Storage. Pub/Sub lets you gather events from many clients simultaneously.
- Real-time event distribution. Events, raw or processed, may be made available to multiple applications across your team and organization for real-time processing. Pub/Sub supports an "enterprise event bus" and event-driven application design patterns, and lets you integrate with many systems that export events to Pub/Sub.
- Replicating data among databases. Pub/Sub is commonly used to distribute change events from databases. These events can be used to construct a view of the database state and state history in BigQuery and other data storage systems.
- Parallel processing and workflows. You can efficiently distribute many tasks among multiple workers by using Pub/Sub messages to communicate with the workers. Examples of such tasks are compressing text files, sending email notifications, evaluating AI models, and reformatting images.
- Enterprise event bus. You can create an enterprise-wide real-time data sharing bus, distributing business events, database updates, and analytics events across your organization.
- Data streaming from applications, services, or IoT devices. For example, a SaaS application can publish a real-time feed of events, or a residential sensor can stream data to Pub/Sub for use in other Google Cloud products through a data-processing pipeline.
- Refreshing distributed caches. For example, an application can publish invalidation events to update the IDs of objects that have changed.
- Load balancing for reliability. For example, instances of a service may be deployed on Compute Engine in multiple zones but subscribe to a common topic. When the service fails in any zone, the others can pick up the load automatically.
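The parallel-processing use case above can be sketched with competing workers pulling from a shared queue. This is a simplified stand-in, assuming `queue.Queue` in place of a Pub/Sub subscription that multiple subscriber clients pull from; the worker logic is a placeholder.

```python
import queue
import threading

tasks: "queue.Queue[str | None]" = queue.Queue()
results: "queue.Queue[str]" = queue.Queue()

def worker(worker_id: int) -> None:
    while True:
        task = tasks.get()
        if task is None:          # sentinel: no more work for this worker
            tasks.task_done()
            return
        # A real worker would compress a file, send an email, reformat an
        # image, and so on; here we just record who handled the task.
        results.put(f"worker-{worker_id} processed {task}")
        tasks.task_done()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for name in ["a.txt", "b.txt", "c.txt", "d.txt"]:
    tasks.put(name)               # "publish" one message per task
for _ in threads:
    tasks.put(None)               # one sentinel per worker
for t in threads:
    t.join()
```

Each task is delivered to exactly one worker, so adding workers increases throughput without changing the publisher side.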
Types of Pub/Sub services
Pub/Sub consists of two services:
- Pub/Sub service. This messaging service is the default choice for most users and applications. It offers the highest reliability and the largest set of integrations, along with automatic capacity management. Pub/Sub supports synchronous replication of all data to at least two zones and best-effort replication to a third zone.
- Pub/Sub Lite service. A separate but similar messaging service built for lower cost. It offers lower reliability than Pub/Sub and either zonal or regional topic storage: zonal Lite topics are stored in only one zone, while regional Lite topics replicate data asynchronously to a second zone. Pub/Sub Lite also requires you to pre-provision and manage storage and throughput capacity. Consider Pub/Sub Lite only for applications where achieving a low cost justifies some additional operational work and lower reliability.
For more details about the differences between Pub/Sub and Pub/Sub Lite, see Choosing Pub/Sub or Pub/Sub Lite.
Comparing Pub/Sub to other messaging technologies
Pub/Sub combines the horizontal scalability of Apache Kafka and Pulsar with features found in messaging middleware such as Apache ActiveMQ and RabbitMQ, for example dead-letter queues and filtering. Another feature that Pub/Sub adopts from messaging middleware is per-message parallelism, rather than partition-based messaging.
Pub/Sub "leases" individual messages to subscriber clients, then tracks whether a given message is successfully processed. By contrast, other horizontally scalable messaging systems use partitions for scaling. This forces subscribers to process messages in each partition in order and limits the number of concurrent clients to the number of partitions. Per-message processing maximizes the parallelism of subscriber applications and helps ensure publisher and subscriber independence.
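The leasing behavior described above can be sketched as follows. This is a deliberate simplification, not the Pub/Sub implementation: the `LeasedQueue` class and its short ack deadline are invented for illustration. The service hands each message to one client, tracks an acknowledgment deadline, and redelivers the message if it is not acknowledged in time.

```python
import time

class LeasedQueue:
    def __init__(self, ack_deadline: float = 0.05) -> None:
        self._pending: list[bytes] = []
        self._leased: dict[int, tuple[bytes, float]] = {}
        self._next_id = 0
        self._deadline = ack_deadline

    def publish(self, data: bytes) -> None:
        self._pending.append(data)

    def pull(self) -> "tuple[int, bytes] | None":
        """Lease one message to the caller; any client may pull any message."""
        self._expire_leases()
        if not self._pending:
            return None
        data = self._pending.pop(0)
        msg_id = self._next_id
        self._next_id += 1
        self._leased[msg_id] = (data, time.monotonic() + self._deadline)
        return msg_id, data

    def ack(self, msg_id: int) -> None:
        self._leased.pop(msg_id, None)   # message successfully processed

    def _expire_leases(self) -> None:
        now = time.monotonic()
        for msg_id, (data, deadline) in list(self._leased.items()):
            if now > deadline:           # lease expired: make redeliverable
                del self._leased[msg_id]
                self._pending.append(data)

q = LeasedQueue()
q.publish(b"m1")
first = q.pull()                 # leased but never acked...
time.sleep(0.06)
redelivered = q.pull()           # ...so after the deadline it is delivered again
```

Because delivery is tracked per message rather than per partition, any number of clients can pull concurrently, which is the contrast with partition-based systems drawn above.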
Comparing service-to-service and service-to-client communication
Pub/Sub is intended for service-to-service communication rather than communication with end-user or IoT clients; those patterns are better supported by other products. You can use a combination of these services to build client -> services -> database patterns. For example, see the tutorial Streaming Pub/Sub messages over WebSockets.
Integrations
Pub/Sub has many integrations with other Google Cloud products to create a fully
featured messaging system:
- Stream processing and data integration. Supported by Dataflow, including Dataflow templates and SQL, which allow processing and data integration into BigQuery and data lakes on Cloud Storage. Dataflow templates for moving data from Pub/Sub to Cloud Storage, BigQuery, and other products are available in the Pub/Sub and Dataflow UIs in the Google Cloud console. Integration with Apache Spark, particularly when managed with Dataproc, is also available. Visual composition of integration and processing pipelines running on Spark + Dataproc can be accomplished with Data Fusion.
- Monitoring, alerting, and logging. Supported by the Monitoring and Logging products.
- Authentication and IAM. Pub/Sub relies on the standard OAuth authentication used by other Google Cloud products and supports granular IAM, enabling access control for individual resources.
- APIs. Pub/Sub uses standard gRPC and REST service API technologies along with client libraries for several languages.
- Triggers, notifications, and webhooks. Pub/Sub offers push-based delivery of messages as HTTP POST requests to webhooks. You can implement workflow automation using Cloud Functions or other serverless products.
- Orchestration. Pub/Sub can be integrated into multistep serverless Workflows declaratively. Big data and analytics orchestration is often done with Cloud Composer, which supports Pub/Sub triggers. You can also integrate Pub/Sub with Application Integration (Preview), an Integration-Platform-as-a-Service (iPaaS) solution that provides a Pub/Sub trigger to start integrations.
- Integration Connectors (Preview). These connectors let you connect to various data sources. With connectors, both Google Cloud services and third-party business applications are exposed to your integrations through a transparent, standard interface. For Pub/Sub, you can create a Pub/Sub connection for use in your integrations.
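The push-based webhook delivery mentioned in the integrations above arrives at your endpoint as an HTTP POST whose JSON body wraps the message, with the payload base64-encoded in a `data` field. The following sketch shows how a handler might unwrap such a body; the field names follow Pub/Sub's documented push envelope, while the handler function, attribute names, and sample project and subscription names are hypothetical.

```python
import base64
import json

def handle_push(body: bytes) -> str:
    """Decode one Pub/Sub push envelope and return a summary string."""
    envelope = json.loads(body)
    message = envelope["message"]
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    attributes = message.get("attributes", {})
    # A real handler would do its work here, then return an HTTP 2xx
    # response so that Pub/Sub considers the message acknowledged.
    return f"{attributes.get('origin', 'unknown')}: {data}"

# A sample push body in the shape Pub/Sub POSTs to a webhook.
sample = json.dumps({
    "message": {
        "data": base64.b64encode(b"hello").decode("ascii"),
        "attributes": {"origin": "sensor-7"},
        "messageId": "1234567890",
    },
    "subscription": "projects/my-project/subscriptions/my-sub",
}).encode("utf-8")
print(handle_push(sample))  # -> sensor-7: hello
```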
Next steps