Documentation
Plugins
Sources
PostgreSQL
Overview

PostgreSQL Source Plugin

Latest: v3.0.6

The CloudQuery PostgreSQL plugin syncs your PostgreSQL database to any of the supported CloudQuery destinations (e.g. PostgreSQL, BigQuery, Snowflake, and more). This plugin also supports CDC via PostgreSQL logical replication, which enables keeping your PostgreSQL up to date with any destination by subscribing to changes.

Supported database versions:

  • PostgreSQL >= v11 (Needed for CDC, logical replication).
  • PostgreSQL >= v10 (If no need for CDC).

Example

This example configures a PostgreSQL source, located at localhost:5432. The (top level) spec section is described in the Source Spec Reference.

kind: source
spec:
  name: "postgresql"
  registry: "github"
  path: "cloudquery/postgresql"
  version: "v3.0.6"
  tables: ["*"]

  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"

Make sure you use environment variable expansion in production instead of committing the credentials to the configuration file directly.

PostgreSQL Spec

This is the (nested) spec used by the PostgreSQL source Plugin.

  • connection_string (string, required)

    Connection string to connect to the database. This can be a URL or a DSN, as per pgxpool (opens in a new tab)

    • "postgres://jack:secret@localhost:5432/mydb?sslmode=prefer" connect with tcp and prefer TLS
    • "postgres://jack:secret@localhost:5432/mydb?sslmode=disable&application_name=pgxtest&search_path=myschema&connect_timeout=5" be explicit with all options
    • "postgres://localhost:5432/mydb?sslmode=disable" connect with os username cloudquery is being run as
    • "postgres:///mydb?host=/tmp" connect over unix socket
    • "dbname=mydb" unix domain socket, just specifying the db name - useful if you want to use peer authentication
    • "user=jack password=jack\\'ssooper\\\\secret host=localhost port=5432 dbname=mydb sslmode=disable" DSN with escaped backslash and single quote
  • pgx_log_level (string) (optional) (default: error)

    Available: "error", "warn", "info", "debug", "trace" define if and in which level to log pgx (opens in a new tab) call.

  • cdc_id (string) (optional)

    If set to a non-empty string the source plugin will start syncing CDC via PostgreSQL logical replication in real-time. The value should be unique across all sources.

  • rows_per_record (integer) (optional) (default: 1)

    Amount of rows to be packed into a single Apache Arrow record to be sent over the wire during sync (or initial sync in the CDC mode). We suggest using significantly more than the default (e.g. 5000) to sync from large databases/tables.

Verbose logging for debug

The PostgreSQL source can be run in debug mode.

Note: This will use pgx (opens in a new tab) built-in logging and might output data/sensitive information to logs so make sure to not use it in production, only for debugging.

kind: source
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "v3.0.6"
  tables: ["*"]
  spec:
    connection_string: ${PG_CONNECTION_STRING}
    pgx_log_level: debug # Available: error, warn, info, debug, trace. Default: "error"