Authorized datasets
This document describes how to use
authorized datasets
in
BigQuery.
An authorized dataset lets you authorize all of the views in a specified
dataset to access the data in a second dataset.
With an authorized
dataset, you don't need to configure individual
authorized views
.
You can create an authorized dataset in BigQuery by using the
following:
Overview
A
view
in BigQuery is a virtual table
defined by a SQL query. For example, a view's query
might return only a subset of the columns of a table, excluding columns
that contain personal identifiable information (PII). To query a view, a user
needs to have
access to the resources that are accessed by the view's query.
If you want to let users query a view, without giving them direct access to the
resources referenced by the view, you can use an
authorized view.
An
authorized view
lets you, for example, share
more limited data in a view with specified groups or users (principals), without
giving the principals
access to all of the underlying data. You do this by giving the principals
access to the view,
and by giving the view access to the dataset that contains the underlying
data.
If you want to give a
collection of views
access to a dataset, without having
to authorize each individual view, you can group the views together into a
dataset, and then give the dataset that contains the views access to the dataset
that contains the data.
You can then give principals access to the dataset with the group of views, or
to individual views in the dataset, as needed.
A dataset that has access to another dataset
is called an
authorized dataset
.
The dataset that authorizes another dataset to access its data is called
the
shared dataset.
Required permissions and roles
To authorize a dataset, or to revoke a dataset's authorization, you must have
the following
Identity and Access Management (IAM) permissions
,
which let you update the access control list of the dataset you are sharing.
After a dataset is authorized, you need these same permissions if you want to
create or update views in the authorized dataset. For more information, see
Create or update a view in an authorized dataset
.
Permission
|
Resource
|
bigquery.datasets.get
|
The dataset you are sharing.
|
bigquery.datasets.update
|
The dataset you are sharing.
|
The following predefined
IAM roles
provide the required permissions.
Role
|
Resource
|
bigquery.dataOwner
|
The dataset you are sharing.
|
bigquery.admin
|
The dataset you are sharing.
|
Quotas and limits
Authorized datasets are subject to dataset limits. For more information,
see
Dataset limits
.
Authorize a dataset
You can authorize a dataset's current and future views to access another dataset
by adding the dataset you want to authorize to the access list of the dataset
you want to share, as follows:
Console
In the Google Cloud console, go to the
BigQuery
page.
Go to BigQuery
In the
Explorer
panel, expand your project and select the dataset
that contains the data you want to share.
Click the
more_vert
Actions
icon and select
Open
.
In the details pane that appears, click
Sharing
and select the
Authorize
Datasets
option.
In the
Authorized dataset
pane that appears, enter the
Dataset ID
of the dataset that you want to authorize, in the following format:
PROJECT
.
AUTHORIZED_DATASET
For example:
myProject.myDataset
Click
Add Authorization
and then click
Close
.
bq
Open the Cloud Shell:
Go to Cloud Shell
Write the existing metadata (including the access control list) for the
dataset you want to share into a
JSON file by using the
bq show
command.
bq show --format=prettyjson
PROJECT
:
SHARED_DATASET
>
FILE_PATH
Use a text editor to add the dataset that you want to authorize
into the existing
access
section of the JSON file that was created at
FILE_PATH
.
For example:
"access": [
...
{
"dataset": {
"dataset": {
"project_id": "
PROJECT
",
"dataset_id": "
AUTHORIZED_DATASET
"
},
"target_types": "VIEWS"
}
}
]
Update the shared dataset by using the
bq update
command. For example:
bq update --source
FILE_PATH
PROJECT
:
SHARED_DATASET
To verify that the authorized dataset has been added, enter the
bq show
command again.
For example:
bq show --format=prettyjson
PROJECT
:
SHARED_DATASET
API
Get the current metadata for the dataset you want to share by calling the
datasets.get
method, as follows:
GET https://bigquery.googleapis.com/bigquery/v2/projects/
PROJECT
/datasets/
SHARED_DATASET
The response body returns a
Dataset
resource that contains JSON metadata for the dataset.
Add the dataset that you want authorize into the
access
section of the JSON metadata that was returned in the
Dataset
resource as
follows:
"access": [
...
{
"dataset": {
"dataset": {
"project_id": "
PROJECT
",
"dataset_id": "
AUTHORIZED_DATASET
"
},
"target_types": "VIEWS"
}
}
]
Use the
datasets.update
method to update the dataset with the added authorization:
PUT https://bigquery.googleapis.com/bigquery/v2/projects/
PROJECT
/datasets/
SHARED_DATASET
Include the updated
Dataset
resource in the request body.
You can verify that the authorized dataset has been added by calling the
datasets.get
method again.
Revoke a dataset's authorization
When you delete a dataset authorized to access another source dataset, it can
take up to 24 hours for the change to fully reflect in the source dataset's
access control lists (ACLs)
. During this
time:
- You won't be able to access the source data through the deleted dataset.
- The deleted dataset might still appear in the source dataset's ACL and count
towards any authorized dataset limits. This could prevent you from creating
new authorized datasets until the ACL is updated.
To revoke the access granted to the views in an authorized dataset, remove the
authorized dataset from the shared dataset's access list, as
follows:
Console
In the Google Cloud console, go to the
BigQuery
page.
Go to BigQuery
In the
Explorer
panel, expand your project and select the shared
dataset.
Click the
more_vert
Actions
icon and select
Open
.
In the details pane that appears, click
Sharing
and select the
Authorize Datasets
option.
In the
Authorized dataset
pane that appears, find the entry for the
authorized dataset in the
Currently authorized datasets
section.
Click the delete icon next to the authorized dataset you want to remove,
and then click
Close
.
bq
Open the Cloud Shell:
Go to Cloud Shell
Write the existing metadata (including the access control list) for the
shared dataset into a
JSON file by using the
bq show
command.
bq show --format=prettyjson
PROJECT
:
SHARED_DATASET
>
FILE_PATH
Use a text editor to remove the authorized dataset from the
access
section of the JSON file that was created at
FILE_PATH
, as
follows:
{
"dataset": {
"dataset": {
"project_id": "
PROJECT
",
"dataset_id": "
AUTHORIZED_DATASET
"
},
"target_types": "VIEWS"
}
}
Update the shared dataset by using the
bq update
command. For example:
bq update --source
FILE_PATH
PROJECT
:
SHARED_DATASET
To verify that the authorized dataset has been removed, enter the
bq show
command again.
For example:
bq show --format=prettyjson
PROJECT
:
SHARED_DATASET
API
Get the current metadata for the shared dataset by calling the
datasets.get
method, as follows:
GET https://bigquery.googleapis.com/bigquery/v2/projects/
PROJECT
/datasets/
SHARED_DATASET
The response body returns a
Dataset
resource that contains JSON metadata for the dataset.
Remove the authorized dataset from the
access
section of the JSON that was returned in the
Dataset
resource, for
example:
{
"dataset": {
"dataset": {
"project_id": "
PROJECT
",
"dataset_id": "
AUTHORIZED_DATASET
"
},
"target_types": "VIEWS"
}
}
Use the
datasets.update
method to update the dataset with the removed authorization:
PUT https://bigquery.googleapis.com/bigquery/v2/projects/
PROJECT
/datasets/
SHARED_DATASET
Include the updated
Dataset
resource in the request body.
You can verify that the authorized dataset has been removed by calling the
datasets.get
method again.
Create or update a view in an authorized dataset
To create or update a view that is in an authorized dataset,
you must have the permissions for the shared dataset that are listed in
Required permissions and roles
,
in addition to the permissions that are required to
create
or
update
a view in a
standard dataset.
The following table summarizes the necessary
Identity and Access Management (IAM) permissions
to create or update a view that is in an authorized dataset:
Permission
|
Resource
|
bigquery.datasets.get
|
The dataset you are sharing.
|
bigquery.datasets.update
|
The dataset you are sharing.
|
bigquery.tables.getData
|
Any tables or views from the shared dataset that are referenced
in the new view you are creating or updating.
|
bigquery.tables.create
|
The authorized dataset in which you are creating a view.
|
bigquery.tables.update
|
The authorized dataset in which you are updating a view.
|
You don't need any additional permissions to
delete a view
from an
authorized dataset.
Query a view in an authorized dataset
To query a view in an authorized dataset, a user needs to have access to the
view, but access to the shared dataset is not required.
For more information, see
Authorized views
.
Authorized dataset example
The following example describes how to create and use an authorized dataset.
Assume you have two datasets, named
private_dataset
and
public_dataset
.
The
private_dataset
dataset contains a table named
private_table
. The
public_dataset
dataset contains a view named
private_table_filtered
. The
private_table_filtered
view is based on a query that returns some, but not
all, of the fields in the
private_table
table.
You can give a user access to
the data returned by the
private_table_filtered
view, but not all of the data in
the
private_table
table, as follows:
Grant the
bigquery.dataViewer
role to the user for the
public_dataset
dataset. This role includes the
bigquery.tables.getData
permission, which
lets the user query the views in the
public_dataset
dataset.
For information about how to grant a role to a user for a dataset, see
Controlling access to datasets
.
The user now has permission to query views in the
public_dataset
,
but they still cannot access the
private_table
table in
private_dataset
. If the user tries to query the
private_table
table
directly, or if they try to access the
private_table
table indirectly by
querying the
private_table_filtered
view,
they get an error message similar to the following:
Access Denied: Table
PROJECT
:private_dataset.private_table:
User does not have permission to query table
PROJECT
:private_dataset.private_table.
In the
BigQuery
page of the Google Cloud console, open the
private_dataset
dataset, click
Sharing
, and then select
Authorize Datasets
.
In the
Authorized dataset
pane that appears, enter
PROJECT
.public_dataset
in the
Dataset ID
field,
and then click
Add Authorization
.
The
public_dataset
dataset is added to the access control list of the
private_dataset
dataset, authorizing the views in the
public_dataset
dataset to query the data in the
private_dataset
dataset.
The user can now query the
private_table_filtered
view in the
public_dataset
dataset, which indirectly accesses the
private_dataset
dataset, without having
any permissions to directly access data in the
private_dataset
dataset.
Limitations
- You can create authorized datasets in different regions,
but BigQuery doesn't support cross-region queries. Therefore,
we recommend that you create datasets in the same region.
What's next
For information about authorizing an individual view to access data in a
dataset, see
Authorized views
.
For information about authorizing a table function or a user-defined function
to access data in a dataset, see
Authorized functions
.