Mask column data
This document shows you how to implement data masking in order to
selectively obscure sensitive data. By implementing data masking, you can
provide different levels of visibility to different groups of users.
For general information, see Introduction to data masking.
You implement data masking by adding a data policy to a column. To add a data
masking policy to a column, you must complete the following steps:
- Create a taxonomy with at least one policy tag.
- Optional: Grant the Data Catalog Fine-Grained Reader role
to one or more principals on one or more of the policy tags you created.
- Create up to three data policies for the policy tag, to map masking rules
and principals (which represent users or groups) to that tag.
- Set the policy tag on a column. That maps the data policies associated with
the policy tag to the selected column.
- Assign users who should have access to masked data to the
BigQuery Masked Reader role. As a best practice, assign the
BigQuery Masked Reader role at the data policy level.
Assigning the role at the project level or higher grants users permissions to
all data policies under the project, which can lead to issues caused by
excess permissions.
You can use the Google Cloud console or the BigQuery Data Policy API to work with
data policies.
When you have completed these steps, users running queries against the column
get unmasked data, masked data, or an access denied error, depending on the
groups that they belong to and the roles that they have been granted. For more
information, see How Masked Reader and Fine-Grained Reader roles interact.
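The three possible query outcomes can be sketched as a small decision function. This is only an illustration: the function and enum names are hypothetical, and the precedence rule (a principal holding both roles sees unmasked data) is an assumption that you should confirm against the role interaction documentation.

```python
from enum import Enum


class QueryResult(Enum):
    UNMASKED = "unmasked data"
    MASKED = "masked data"
    ACCESS_DENIED = "access denied error"


def column_visibility(has_fine_grained_reader: bool,
                      has_masked_reader: bool) -> QueryResult:
    """Return what a query against a tagged column is expected to yield.

    Assumption: the Fine-Grained Reader role takes precedence when a
    principal holds both roles.
    """
    if has_fine_grained_reader:
        return QueryResult.UNMASKED
    if has_masked_reader:
        return QueryResult.MASKED
    return QueryResult.ACCESS_DENIED


if __name__ == "__main__":
    for fine_grained, masked in [(True, False), (False, True), (False, False)]:
        result = column_visibility(fine_grained, masked)
        print(fine_grained, masked, "->", result.value)
```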
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Go to project selector
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Data Catalog and BigQuery Data Policy APIs.
Enable the APIs
- BigQuery is automatically enabled in new projects, but you
might need to activate it in a pre-existing project.
Enable the BigQuery API.
Enable the API
- If you are creating a data policy that references a custom masking routine, create the associated masking UDF so that it is available in the following steps.
Create taxonomies
The user or service account that creates a taxonomy must be granted the
Data Catalog Policy Tag Admin role.
Console
- Open the Policy tag taxonomies page in the Google Cloud console.
Open the Policy tag taxonomies page
- Click Create taxonomy.
On the New taxonomy page:
- For Taxonomy name, enter the name of the taxonomy that you want to create.
- For Description, enter a description.
- If needed, change the project listed under Project.
- If needed, change the location listed under Location.
- Under Policy Tags, enter a policy tag name and description.
- To add a child policy tag for a policy tag, click Add subtag.
- To add a new policy tag at the same level as another policy tag, click + Add policy tag.
- Continue adding policy tags and child policy tags as needed for your taxonomy.
- When you are done creating policy tags for your hierarchy, click Create.
For more information about how to work with policy tags, such as how to view or
update them, see
Work with policy tags
.
For best practices, see
Best practices for using policy tags in BigQuery
.
Create data policies
The user or service account that creates a data policy must have the
bigquery.dataPolicies.create
,
bigquery.dataPolicies.setIamPolicy
, and
datacatalog.taxonomies.get
permissions.
If you are creating a data policy that references a custom masking routine, you also need routine permissions.
These permissions are included in the
BigQuery Admin and BigQuery Data Owner roles.
You can create up to nine data policies for a policy tag. One of these policies
is reserved for
column-level access control settings
.
Console
- Open the Policy tag taxonomies page in the Google Cloud console.
Open the Policy tag taxonomies page
- Click the name of the taxonomy to open.
- Select a policy tag.
- Click Manage Data Policies.
- For Data Policy Name, type a name for the data policy. The data policy name must be unique within the project that the data policy resides in.
- For Masking Rule, choose a predefined masking rule or a custom masking routine. If you are selecting a custom masking routine, ensure that you have both the bigquery.routines.get and the bigquery.routines.list permissions at the project level.
- For Principal, type the name of one or more users or groups to whom you want to grant masked access to the column. All users and groups that you enter here are granted the BigQuery Masked Reader role.
- Click Submit.
API
Call the create method. Pass in a DataPolicy resource that meets the following requirements:
- The
dataPolicyType
field is set to
DATA_MASKING_POLICY
.
- The
dataMaskingPolicy
field identifies the data masking rule or
routine to use.
- The dataPolicyId field provides a name for the data policy that is unique within the project that the data policy resides in.
Call the setIamPolicy method and pass in a Policy. The Policy must identify the principals who are granted access to masked data, and specify roles/bigquerydatapolicy.maskedReader for the role field.
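As a rough illustration of the two API calls above, the following sketch builds the JSON bodies for create and setIamPolicy with the Python standard library only. The helper names, the example IDs, and the group address are hypothetical. The policyTag and predefinedExpression field names and the SHA256 value are taken from the Data Policy API reference as best understood here, not from this page, so verify them before use. The sketch does not send any request.

```python
import json

MASKED_READER_ROLE = "roles/bigquerydatapolicy.maskedReader"


def build_data_policy(data_policy_id, policy_tag, predefined_expression="SHA256"):
    """Build a DataPolicy body for the create method.

    Assumption: a predefined masking rule is used. A custom masking routine
    would be referenced in dataMaskingPolicy instead.
    """
    return {
        "dataPolicyId": data_policy_id,
        "dataPolicyType": "DATA_MASKING_POLICY",
        "policyTag": policy_tag,
        "dataMaskingPolicy": {"predefinedExpression": predefined_expression},
    }


def build_set_iam_policy_request(members):
    """Build a setIamPolicy body that grants masked access to the members."""
    return {
        "policy": {
            "bindings": [
                {"role": MASKED_READER_ROLE, "members": list(members)}
            ]
        }
    }


if __name__ == "__main__":
    # All identifiers below are placeholders.
    tag = "projects/my-project/locations/us/taxonomies/123/policyTags/456"
    print(json.dumps(build_data_policy("ssn_hash", tag), indent=2))
    print(json.dumps(
        build_set_iam_policy_request(["group:analysts@example.com"]), indent=2))
```

Keeping the role name in one constant avoids typos in the binding, which would otherwise grant nothing useful.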
Set policy tags on columns
Set a data policy on a column by attaching the policy tag associated with
the data policy to the column.
The user or service account that sets a policy tag needs the
datacatalog.taxonomies.get
and
bigquery.tables.setCategory
permissions.
datacatalog.taxonomies.get is included in the Data Catalog Policy Tag Admin and Project Viewer roles.
bigquery.tables.setCategory is included in the BigQuery Admin (roles/bigquery.admin) and BigQuery Data Owner (roles/bigquery.dataOwner) roles.
Console
Set the policy tag by modifying a schema using the Google Cloud console.
- Open the BigQuery page in the Google Cloud console.
Go to the BigQuery page
- In the BigQuery Explorer, locate and select the table that you want to update. The table schema for that table opens.
- Click Edit Schema.
- In the Current schema screen, select the target column and click Add policy tag.
- In the Add a policy tag screen, locate and select the policy tag that you want to apply to the column.
- Click Select.
- Click Save.
bq
Write the schema to a local file.
bq show --schema --format=prettyjson \
  project-id:dataset.table > schema.json
where:
- project-id is your project ID.
- dataset is the name of the dataset that contains the table you're updating.
- table is the name of the table you're updating.
Modify schema.json to set a policy tag on a column. For the value of the names field of policyTags, use the policy tag resource name.
[
  ...
  {
    "name": "ssn",
    "type": "STRING",
    "mode": "REQUIRED",
    "policyTags": {
      "names": ["projects/project-id/locations/location/taxonomies/taxonomy-id/policyTags/policytag-id"]
    }
  },
  ...
]
Update the schema.
bq update \
  project-id:dataset.table schema.json
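If you prefer not to edit schema.json by hand, the modification step can be scripted. The following is a minimal sketch, assuming the schema is the JSON array produced by bq show --schema and that the target is a top-level column. The helper names and sample IDs are hypothetical, and nested RECORD fields are not handled.

```python
import copy
import json


def policy_tag_name(project_id, location, taxonomy_id, policytag_id):
    """Assemble the policy tag resource name used in the policyTags field."""
    return (f"projects/{project_id}/locations/{location}"
            f"/taxonomies/{taxonomy_id}/policyTags/{policytag_id}")


def set_policy_tag(schema, column, tag_name):
    """Return a copy of the schema with the policy tag set on one column.

    Only top-level columns are searched (an assumption of this sketch).
    """
    updated = copy.deepcopy(schema)
    for field in updated:
        if field["name"] == column:
            field["policyTags"] = {"names": [tag_name]}
            return updated
    raise KeyError(f"column {column!r} not found in schema")


if __name__ == "__main__":
    # In practice, load the list with json.load() from schema.json and write
    # the result back with json.dump() before running bq update.
    schema = [
        {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
        {"name": "ssn", "type": "STRING", "mode": "REQUIRED"},
    ]
    tag = policy_tag_name("my-project", "us", "123", "456")
    print(json.dumps(set_policy_tag(schema, "ssn", tag), indent=2))
```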
API
For existing tables, call tables.patch, or for new tables call tables.insert. Use the schema property of the Table object that you pass in to set a policy tag in your schema definition. See the command-line example schema to see how to set a policy tag.
When working with an existing table, the tables.patch method is preferred, because the tables.update method replaces the entire table resource.
Enforce access control
When you create a data policy for a policy tag, access control is automatically
enforced. All columns that have that policy tag applied return masked data in
response to queries from users who have the Masked Reader role.
To stop enforcement of access control, you must
first delete all data policies associated with the policy tags in the
taxonomy. For more information, see Enforce access control.
Check IAM permissions on a data policy
To see what permissions you have on a data policy, call the testIamPermissions method.
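A testIamPermissions call takes a list of permissions and returns the subset that the caller holds. The sketch below builds the request body and compares it with a response. The helper names are hypothetical, and the response shape (a permissions list that may be omitted when empty) follows the general IAM convention, so treat it as an assumption. No request is sent.

```python
def build_test_permissions_request(permissions):
    """Build the JSON body for a testIamPermissions call."""
    return {"permissions": list(permissions)}


def missing_permissions(requested, response):
    """Return the requested permissions that the response does not grant.

    Assumption: the response lists granted permissions under "permissions"
    and omits the key when none are granted.
    """
    granted = set(response.get("permissions", []))
    return [p for p in requested if p not in granted]


if __name__ == "__main__":
    wanted = [
        "bigquery.dataPolicies.update",
        "bigquery.dataPolicies.setIamPolicy",
    ]
    print(build_test_permissions_request(wanted))
    # Hypothetical response in which only one permission is granted.
    sample_response = {"permissions": ["bigquery.dataPolicies.update"]}
    print(missing_permissions(wanted, sample_response))
```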
Update data policies
The user or service account that updates a data policy must have the bigquery.dataPolicies.update permission.
If you are updating the policy tag the data policy is associated with, you also require the datacatalog.taxonomies.get permission.
If you are updating the principals associated with the data policy, you require the bigquery.dataPolicies.setIamPolicy permission.
The bigquery.dataPolicies.update and bigquery.dataPolicies.setIamPolicy permissions are included in the BigQuery Admin and BigQuery Data Owner roles.
The datacatalog.taxonomies.get permission is included in the Data Catalog Admin and Data Catalog Viewer roles.
Console
- Open the Policy tag taxonomies page in the Google Cloud console.
Open the Policy tag taxonomies page
- Click the name of the taxonomy to open.
- Select a policy tag.
- Click Manage Data Policies.
- Optional: Change the masking rule.
- Optional: Add or remove principals.
- Click Submit.
API
To change the data masking rule, call the patch method and pass in a DataPolicy resource with an updated dataMaskingPolicy field.
To change the principals associated with a data policy, call the setIamPolicy method and pass in a Policy that updates the principals that are granted access to masked data.
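The patch call described above can be sketched as a URL plus a JSON body. The endpoint root, the updateMask query parameter, and the predefinedExpression field are assumptions drawn from the Data Policy API reference rather than from this page, and the function name and sample IDs are hypothetical. The sketch only constructs the request and does not send it.

```python
from urllib.parse import urlencode

# Assumed REST root for the BigQuery Data Policy API.
API_ROOT = "https://bigquerydatapolicy.googleapis.com/v1"


def build_patch_request(project_id, location, data_policy_id,
                        predefined_expression):
    """Return (url, body) for changing a data policy's masking rule."""
    name = (f"projects/{project_id}/locations/{location}"
            f"/dataPolicies/{data_policy_id}")
    # Restrict the update to the masking rule so other fields are untouched.
    query = urlencode({"updateMask": "dataMaskingPolicy"})
    url = f"{API_ROOT}/{name}?{query}"
    body = {
        "name": name,
        "dataMaskingPolicy": {"predefinedExpression": predefined_expression},
    }
    return url, body


if __name__ == "__main__":
    url, body = build_patch_request("my-project", "us", "ssn_hash", "ALWAYS_NULL")
    print("PATCH", url)
    print(body)
```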
Delete data policies
The user or service account that deletes a data policy must have the bigquery.dataPolicies.delete permission. This permission is included in the BigQuery Admin and BigQuery Data Owner roles.
Console
- Open the Policy tag taxonomies page in the Google Cloud console.
Open the Policy tag taxonomies page
- Click the name of the taxonomy to open.
- Select a policy tag.
- Click Manage Data Policies.
- Click delete next to the data policy that you want to delete.
- Click Submit.
- Click Confirm.
API
To delete a data policy, call the delete method.