Snowflake Destination Plugin
Latest: v3.2.10
The snowflake plugin helps you sync data to your Snowflake data warehouse.
There are two ways to sync data to Snowflake:
- Direct (easy, but not recommended for production or large data sets): This is the default mode of operation, where the CQ plugin streams the results directly to the Snowflake database. No additional setup is needed apart from authentication to Snowflake.
- Loading via CSV/JSON from remote storage: This is the standard way of loading data into Snowflake and is recommended for production and large data sets. This mode requires remote storage (e.g. S3, GCS, Azure Blob Storage) and a Snowflake stage to be created. The CQ plugin streams the results to the remote storage, and you can then load those files via a cron job or via Snowpipe. This method is still in the works and will be updated soon with a guide.
Example Config
This example sets the connection string to a value read from the SNOWFLAKE_CONNECTION_STRING
environment variable:
kind: destination
spec:
  name: snowflake
  path: cloudquery/snowflake
  version: "v3.2.10"
  write_mode: "append"
  spec:
    connection_string: "${SNOWFLAKE_CONNECTION_STRING}"
    # Optional parameters
    # migrate_concurrency: 1
    # batch_size: 1000 # 1K entries
    # batch_size_bytes: 4194304 # 4 MiB
The Snowflake destination utilizes batching, and supports batch_size and batch_size_bytes.
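For example, assuming both a source spec and this destination spec are saved in a local config.yml (the file name is arbitrary and not part of the plugin itself), the environment variable can be exported before running a sync:

  # config.yml is assumed to contain a source spec plus the snowflake destination spec above
  export SNOWFLAKE_CONNECTION_STRING="user:pass@account/db/schema?warehouse=wh"
  cloudquery sync ./config.yml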
Authentication
Authentication of the connection to Snowflake can be specified using:
- A username and password in the DSN:

  kind: destination
  spec:
    name: snowflake
    ...
    spec:
      connection_string: "user:pass@account/db/schema?warehouse=wh"

- A private key inline:

  kind: destination
  spec:
    name: snowflake
    ...
    spec:
      connection_string: "user@account/database/schema?warehouse=wh"
      private_key: |
        -----BEGIN PRIVATE KEY-----
        MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC2ajPRIbPtbxZ1
        3DONLA02eZJuCzsgIkBWov/Me5TL6cKN0gnY+mbA8OnNCH+9HSzgiU9P8XhTUrIN
        85diD+rj6uK+E0sSyxGk6HG17TyR5oBq8nz2hbZlbaNi/HO9qYoHQgAgMq908YBz
        ...
        DUmOIrBYEMf2nDTlTu/QVcKb
        -----END PRIVATE KEY-----

- A private key included from a file:

  kind: destination
  spec:
    name: snowflake
    ...
    spec:
      connection_string: "user@account/database/schema?warehouse=wh"
      private_key: "${file:./private.key}"

  where ./private.key is a PEM-encoded private key file with contents of the form:

  -----BEGIN PRIVATE KEY-----
  MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC2ajPRIbPtbxZ1
  3DONLA02eZJuCzsgIkBWov/Me5TL6cKN0gnY+mbA8OnNCH+9HSzgiU9P8XhTUrIN
  85diD+rj6uK+E0sSyxGk6HG17TyR5oBq8nz2hbZlbaNi/HO9qYoHQgAgMq908YBz
  ...
  DUmOIrBYEMf2nDTlTu/QVcKb
  -----END PRIVATE KEY-----
Private Key Authentication Setup
The Snowflake guide for Key Pair Authentication demonstrates how to create an RSA private key with the ability to authenticate as a Snowflake user.
Note that encrypted private keys are not supported by the Snowflake Go SQL driver, and hence not supported by the CloudQuery Snowflake plugin. You can decrypt a private key in file enc.key and store it in a file dec.key using the following command:
openssl pkcs8 -topk8 -nocrypt -in enc.key -out dec.key
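If you are creating a key pair from scratch, an unencrypted private key and its matching public key can be generated with OpenSSL along the lines of the Snowflake guide (the file names below are arbitrary):

  # Generate an unencrypted 2048-bit RSA private key in PKCS#8 PEM format
  openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -nocrypt -out rsa_key.p8
  # Derive the public key, which is then assigned to your Snowflake user per the guide
  openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

The private key file (rsa_key.p8 here) can then be referenced from the plugin spec via private_key, as shown above.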
Snowflake Spec
This is the top level spec used by the Snowflake destination plugin.
- connection_string (string) (required)
  Snowflake connection_string. Example:

  # user[:password]@account/database/schema?warehouse=user_warehouse[&param1=value1&paramN=valueN]
  # or
  # user[:password]@account/database?warehouse=user_warehouse[&param1=value1&paramN=valueN]
  # or
  # user[:password]@host:port/database/schema?account=user_account&warehouse=user_warehouse[&param1=value1&paramN=valueN]
  # or
  # host:port/database/schema?account=user_account&warehouse=user_warehouse[&param1=value1&paramN=valueN]
  From the Snowflake documentation:

  account - Name assigned to your Snowflake account. If you are not on the us-west-2 region or an AWS deployment, append the region and platform to the end, e.g., <account>.<region> or <account>.<region>.<platform>.
- private_key (string) (optional)
  A PEM-encoded private key for connecting to Snowflake. Equivalent to adding authenticator=snowflake_jwt&privateKey=... to the connection_string, but parses, validates, and correctly encodes the key for use with Snowflake.
- migrate_concurrency (integer) (optional) (default: 1)
  By default, tables are migrated one at a time. This option allows you to migrate multiple tables concurrently, which can be useful if you have a lot of tables to migrate and want to speed up the process.
  Setting this to a negative number means no limit.
- batch_size (integer) (optional) (default: 1000)
  Number of records to batch together before sending to the database.
- batch_size_bytes (integer) (optional) (default: 4194304, i.e. 4 MiB)
  Number of bytes (as Arrow buffer size) to batch together before sending to the database.
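As an illustration only, the following spec combines private key authentication with tuned batching; the account, warehouse, file name, and tuning values are placeholders, not recommendations:

  kind: destination
  spec:
    name: snowflake
    path: cloudquery/snowflake
    version: "v3.2.10"
    write_mode: "append"
    spec:
      # Placeholder account, database, schema and warehouse
      connection_string: "user@account/database/schema?warehouse=wh"
      private_key: "${file:./private.key}"
      # Illustrative tuning values; adjust for your workload
      migrate_concurrency: 4
      batch_size: 5000
      batch_size_bytes: 8388608 # 8 MiB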
Underlying library
We use the official github.com/snowflakedb/gosnowflake package for database connection.