When you work with your sales or support contact to set up access to Data Transfer v2.0,
you will be provided with a bucket name. You will need to provide your sales contact with a
Google Group, which enables you to control access to your data files in
Google Cloud Storage.
You can choose to access your data using a utility, or you can write your own
code.
Access data using gsutil
The gsutil tool is a command-line application, written in Python, that
lets you access your data without having to do any coding. You
could, for example, use gsutil as part of a script or batch file instead of
creating custom applications.
To get started with gsutil, read the gsutil documentation. The tool will prompt
you for your credentials the first time you use it and then store them for
later use.
gsutil examples
You can list all of your files using gsutil as follows:
gsutil ls gs://[bucket_name]/[object name/file name]
gsutil uses much of the same syntax as UNIX, including the wildcard
asterisk (*), so you can list all NetworkImpression files:
gsutil ls gs://[bucket_name]/dcm_account6837_impression_*
It's also easy to download a file to the current directory:
gsutil cp gs://[bucket_name]/dcm_account6837_impression_2015120100.log.gz .
You can copy your files from the dispersed DT Google buckets to your own Google API GCS bucket
using a Unix shell script. There are two options:
In gsutil, if you are using a Unix system, run the following for all your buckets daily:
$ day=$(date --date="1 days ago" +"%Y%m%d")
$ gsutil -m cp gs://{<dcmhashid_A>,<dcmhashid_B>,etc.}/*$day*.log.gz gs://<client_bucket>/
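Alternatively, a slightly more involved option is to use a bash script:
#!/bin/bash
set -x
# List every source bucket in this array.
buckets=(dfa_-hashid_A dfa_-hashid_B)  # ...include all hash ids
# Date stamp format chosen to match file names such as ..._2015120100.log.gz.
day=$(date --date="1 days ago" +"%Y%m%d")
for b in "${buckets[@]}"; do
  # The wildcard is quoted so that gsutil, not the shell, expands it.
  gsutil -m cp "gs://$b/*$day*.log.gz" "gs://<client_bucket>/"
done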
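Both options above rely on GNU `date --date`, which is not available on every Unix system. As a rough alternative, the same daily copy could be driven from a small Python script that builds the gsutil commands. This is only a sketch: the function names, the example bucket names, and the default `%Y%m%d` stamp (inferred from the sample file name `..._2015120100.log.gz`) are assumptions, not part of Data Transfer.

```python
import datetime
import subprocess


def yesterday_stamp(today=None, fmt="%Y%m%d"):
    """Return yesterday's date formatted with fmt (the default is an assumption)."""
    today = today or datetime.date.today()
    return (today - datetime.timedelta(days=1)).strftime(fmt)


def build_copy_commands(buckets, dest_bucket, day):
    """Build one `gsutil -m cp` argument list per source bucket."""
    return [
        ["gsutil", "-m", "cp", f"gs://{b}/*{day}*.log.gz", f"gs://{dest_bucket}/"]
        for b in buckets
    ]


def copy_daily_files(buckets, dest_bucket, day=None, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    day = day or yesterday_stamp()
    for cmd in build_copy_commands(buckets, dest_bucket, day):
        print(" ".join(cmd))
        if not dry_run:
            # No shell is involved, so gsutil expands the wildcard itself.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical bucket names; replace them with your own.
    copy_daily_files(["dfa_-hashid_A", "dfa_-hashid_B"], "client_bucket")
```

The default `dry_run=True` only prints the commands, which makes it easy to check the bucket names and date stamp before copying anything.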
Access data programmatically
Google Cloud Storage has APIs and samples for many programming languages that
allow you to access your data in a programmatic way. Below are the steps
specific to Data Transfer v2.0 that you must take to build a working
integration.
Get a service account
To get started using Data Transfer v2.0, you first need to use the setup tool,
which guides you through creating a project in the Google API Console,
enabling the API, and creating credentials.
To set up a new service account, do the following:
- Click Create credentials > Service account key.
- Choose whether to download the service account's public/private key as a
standard P12 file, or as a JSON file that can be loaded by a Google API client
library.
Your new public/private key pair is generated and downloaded to your machine;
it serves as the only copy of this key. You are responsible for storing it
securely.
Be sure to keep this window open; you will need the service account email
in the next step.
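If you chose the JSON key format, the service account email is also stored in the downloaded key file, under the `client_email` field. A minimal sketch for reading it back, assuming a JSON key rather than P12; the function name and any file path you pass in are illustrative:

```python
import json


def service_account_email(key_path):
    """Return the client_email field from a service account JSON key file."""
    with open(key_path, encoding="utf-8") as fh:
        key = json.load(fh)
    # Service account JSON keys carry "type": "service_account".
    if key.get("type") != "service_account":
        raise ValueError("not a service account key file")
    return key["client_email"]
```

This can help if the console window was closed before the email address was copied.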
Add a service account to your group
- Go to Google Groups
- Click on My Groups and select the group you use for managing access
to your DT v2.0 Cloud Storage Bucket
- Click Manage
- Do not click Invite Members!
- Click Direct add members
- Copy the service account email from the previous step into the
members box
- Select No email
- Click the Add button
I accidentally clicked Invite Members
- Don't panic! You can fix it.
- Head back to the Manage screen as before
- Click on Outstanding Invitations
- Find the service account and select it
- Click Revoke invitation at the top of the screen
- Click Direct add members and resume the steps above
Scope
Any scopes passed to Cloud Storage must be read-only.
For example, when using the Java client library, the correct scope to
use is:
StorageScopes.DEVSTORAGE_READ_ONLY
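In other client libraries, the same scope is expressed as the URL `https://www.googleapis.com/auth/devstorage.read_only`. As a hedged sketch, assuming the `google-auth` and `google-cloud-storage` Python packages are installed and a JSON key is used, listing a bucket with read-only credentials might look like this; the function name and its arguments are placeholders:

```python
READ_ONLY_SCOPE = "https://www.googleapis.com/auth/devstorage.read_only"


def list_data_transfer_files(key_path, bucket_name, prefix=None):
    """Return the object names in bucket_name using read-only credentials."""
    # Imported here so the module loads even without the Google packages.
    from google.cloud import storage
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=[READ_ONLY_SCOPE]
    )
    client = storage.Client(project=credentials.project_id, credentials=credentials)
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
```

The call only succeeds once the service account has been added to the Google Group that controls access to the bucket.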