To set up GCS as the data lake, follow these steps:
1. Navigate to Settings.
2. Click on Destination.
3. Click on Setup Data Lake.
4. Insert all the data lake specific credentials, along with a name and description for the connection.
5. Click on Validate and then Create to save the data lake connection.
| Field | Description |
| --- | --- |
| Authentication | HMAC Key authentication |
| Access ID | Access ID linked to the service account |
| Secret | Secret linked to the corresponding Access ID |
| GCS Bucket Name | Google Cloud Storage bucket name |
| GCS Bucket Path | Subdirectory under the bucket to sync the data into |
| GCS Bucket Region | Region of the GCS bucket |
| Output Format | Output data format: one of Avro, Parquet, CSV, or JSON |
| Compression | Whether the output files should be compressed |
| Normalization | Whether the input JSON should be normalized |
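The settings above can be sketched as a small configuration model. This is a hypothetical illustration, not part of the product's API: the class and field names are assumptions that mirror the table, and the validation simply enforces the allowed output formats.

```python
from dataclasses import dataclass

# Output formats listed in the table above.
ALLOWED_FORMATS = {"Avro", "Parquet", "CSV", "JSON"}

@dataclass
class GCSDataLakeConnection:
    """Hypothetical model of the GCS data lake connection form."""
    access_id: str        # HMAC Access ID linked to the service account
    secret: str           # Secret linked to the corresponding Access ID
    bucket_name: str      # Google Cloud Storage bucket name
    bucket_path: str      # Subdirectory under the bucket to sync into
    bucket_region: str    # Region of the GCS bucket
    output_format: str    # One of Avro, Parquet, CSV, JSON
    compression: bool = False     # Compress output files?
    normalization: bool = False   # Normalize input JSON?

    def validate(self) -> None:
        # Reject anything outside the supported output formats.
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(
                f"output_format must be one of {sorted(ALLOWED_FORMATS)}"
            )
        if not self.bucket_name:
            raise ValueError("bucket_name is required")

conn = GCSDataLakeConnection(
    access_id="GOOG1EXAMPLE",
    secret="example-secret",
    bucket_name="my-lake-bucket",
    bucket_path="raw/events",
    bucket_region="us-central1",
    output_format="Parquet",
)
conn.validate()  # raises ValueError for an unsupported format
```

This corresponds to step 5 above: Validate performs a check like `validate()` before the connection is created.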
To set up GCP as the data warehouse, follow these steps:
1. Navigate to Settings.
2. Click on Destination.
3. Click on Setup Data Warehouse.
4. Insert all the data warehouse specific credentials, along with a name and description for the connection.
5. Click on Validate and then Create to save the data warehouse connection.
| Field | Description |
| --- | --- |
| Project ID | GCP project ID |
| Dataset Location | Location of the dataset |
| Default Dataset ID | Default BigQuery dataset ID |
| Loading Method | GCS Staging / Standard inserts |
| HMAC Access Key ID | HMAC Access ID |
| HMAC Key Secret | HMAC Access key |
| GCS Bucket Name | Name of the GCS bucket |
| GCS Bucket Path | Path within the GCS bucket |
| GCS Tmp Files After Processing | Delete / Keep all |
| Service Account Key JSON | JSON service account key |
| Transformation Query Run Type | Interactive / Batch |
| Google BigQuery Chunk Size | BigQuery client's chunk size |
| Raw Table Dataset Name | Dataset to write raw tables into |
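As with the data lake settings, the warehouse settings can be sketched as a configuration model. The class, field names, and the 15 MB default chunk size are assumptions for illustration, not the product's actual API; the one piece of real coupling shown is that the GCS Staging loading method needs the HMAC keys and bucket fields, while Standard inserts does not.

```python
from dataclasses import dataclass

# Loading methods listed in the table above.
LOADING_METHODS = {"GCS Staging", "Standard inserts"}

@dataclass
class BigQueryWarehouseConnection:
    """Hypothetical model of the data warehouse connection form."""
    project_id: str            # GCP project ID
    dataset_location: str      # Location of the dataset
    default_dataset_id: str    # Default BigQuery dataset ID
    loading_method: str        # "GCS Staging" or "Standard inserts"
    hmac_access_key_id: str = ""   # Required for GCS Staging
    hmac_key_secret: str = ""      # Required for GCS Staging
    gcs_bucket_name: str = ""      # Required for GCS Staging
    gcs_bucket_path: str = ""
    keep_tmp_files: bool = False   # Keep or delete staging files afterwards
    service_account_key_json: str = ""
    transformation_run_type: str = "Interactive"  # or "Batch"
    chunk_size: int = 15 * 1024 * 1024  # assumed default, in bytes
    raw_dataset_name: str = ""     # Dataset to write raw tables into

    def validate(self) -> None:
        if self.loading_method not in LOADING_METHODS:
            raise ValueError(
                f"loading_method must be one of {sorted(LOADING_METHODS)}"
            )
        # GCS Staging writes through a bucket, so it needs HMAC
        # credentials and a bucket name; Standard inserts does not.
        if self.loading_method == "GCS Staging":
            if not (self.hmac_access_key_id
                    and self.hmac_key_secret
                    and self.gcs_bucket_name):
                raise ValueError(
                    "GCS Staging requires HMAC keys and a GCS bucket"
                )

wh = BigQueryWarehouseConnection(
    project_id="my-project",
    dataset_location="US",
    default_dataset_id="analytics",
    loading_method="Standard inserts",
)
wh.validate()  # passes: Standard inserts needs no staging bucket
```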