Elasticsearch Destination Plugin

Latest: v3.1.5

The Elasticsearch plugin syncs data from any CloudQuery source plugin to an Elasticsearch cluster.

Example config

The following config will sync data to an Elasticsearch cluster running on localhost:9200:

kind: destination
spec:
  name: elasticsearch
  path: cloudquery/elasticsearch
  version: "v3.1.5"
  write_mode: "overwrite-delete-stale"
  spec:
    # Optional parameters
    # addresses: ["http://localhost:9200"]
    # username: ""
    # password: ""
    # cloud_id: ""
    # api_key: ""
    # service_token: ""
    # certificate_fingerprint: ""
    # ca_cert: ""
    # concurrency: 5 # default: number of CPUs
    # batch_size: 1000
    # batch_size_bytes: 5242880 # 5 MiB

The Elasticsearch destination writes data in batches. Batch limits are controlled by the batch_size and batch_size_bytes options.

It supports append, overwrite and overwrite-delete-stale write modes. The default write mode is overwrite-delete-stale.
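As a minimal sketch, switching write modes only requires changing write_mode in the outer spec. The example below reuses the name, path, and version from the config above and assumes a cluster at the default localhost address:

```yaml
kind: destination
spec:
  name: elasticsearch
  path: cloudquery/elasticsearch
  version: "v3.1.5"
  # append never overwrites existing entries (see Index Naming for the resulting index names)
  write_mode: "append"
  spec:
    # Same as the default; shown explicitly for clarity
    addresses: ["http://localhost:9200"]
```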

Elasticsearch Spec

This is the spec used by the Elasticsearch destination plugin.

  • addresses ([]string) (optional) (default: ["http://localhost:9200"])

    A list of Elasticsearch nodes to use.

  • username (string) (optional)

    Username for HTTP Basic Authentication.

  • password (string) (optional)

    Password for HTTP Basic Authentication.

  • cloud_id (string) (optional)

    Endpoint for the Elastic Service (https://elastic.co/cloud).

  • api_key (string) (optional)

    Base64-encoded token for authorization; if set, overrides username/password and service token.

  • service_token (string) (optional)

    Service token for authorization; if set, overrides username/password.

  • certificate_fingerprint (string) (optional)

    SHA256 hex fingerprint given by Elasticsearch on first launch.

  • ca_cert (string) (optional)

    PEM-encoded certificate authorities. When set, an empty certificate pool will be created, and the certificates will be appended to it. See file variable substitution for how to read this value from a file.

  • concurrency (integer) (optional) (default: number of CPUs)

    Number of concurrent worker goroutines to use for indexing.

  • batch_size (integer) (optional) (default: 1000)

    This parameter controls the maximum number of items that may be grouped together to be written as a single write.

  • batch_size_bytes (integer) (optional) (default: 5242880 (5 MiB))

    This parameter controls the maximum total size, in bytes, of items that may be grouped together to be written as a single write.
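
To illustrate how these options combine, here is a hedged sketch of a config for a TLS-enabled, multi-node cluster using HTTP Basic Authentication. The host names, user name, environment variable name, and certificate path are all hypothetical, and the `${...}` substitution syntax is an assumption about CloudQuery's environment and file variable substitution; check the CloudQuery configuration docs for the exact form.

```yaml
kind: destination
spec:
  name: elasticsearch
  path: cloudquery/elasticsearch
  version: "v3.1.5"
  write_mode: "overwrite-delete-stale"
  spec:
    # Hypothetical node addresses; replace with your own
    addresses:
      - "https://es-node-1.example.internal:9200"
      - "https://es-node-2.example.internal:9200"
    # HTTP Basic Authentication (overridden if api_key or service_token is set)
    username: "cloudquery"
    # Assumed environment variable substitution; the variable name is illustrative
    password: "${ELASTICSEARCH_PASSWORD}"
    # Assumed file variable substitution; the path is illustrative
    ca_cert: "${file:./certs/http_ca.pem}"
    # Tuning values chosen only for illustration
    concurrency: 4
    batch_size: 500
    batch_size_bytes: 2097152 # 2 MiB
```

Keeping the password and certificate out of the config file itself lets the file be committed to version control without exposing secrets.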

Index Template Creation

The Elasticsearch destination will create an index template for every table during the migration step. It is recommended that you use the generated index templates, as they ensure that indexes are automatically created with the correct mappings for each table. However, to skip index template creation (or use your own templates), you may use the --no-migrate option when running cloudquery sync.

Index Naming

Index names will be formatted according to the selected write mode:

  • append: indexes will be named using the format <table_name>-<YYYY-MM-DD>. In other words, a new index will be created every day the table is synced. Entries will never be overwritten.
  • overwrite: indexes will be named using the format <table_name>. Objects with duplicate primary keys will be overwritten.
  • overwrite-delete-stale: indexes will be named using the format <table_name>. Objects with duplicate primary keys will be overwritten, and any objects that are not present in the current sync will be deleted.

Index templates will also be created such that they match the index names generated by the selected write mode.

Querying From Kibana

To query data from Kibana, you will need to create data views (previously known as "index patterns"). To query a specific table, the data view's index pattern should be in the format <table_name>-*. For example, if you have a table named aws_ec2_instances, you should create a data view with the index pattern aws_ec2_instances-*. One useful feature of Elasticsearch and Kibana is the ability to query across all data. To do this for the aws source plugin, for example, you may use the index pattern aws_*, which allows queries across all tables synced by the aws source plugin.

Underlying library

We use the official go-elasticsearch package. It is tested against Elasticsearch 8.6.0. Please open an issue if you encounter any problems with this (or another) version.