Settings

Meltano supports a number of settings that allow you to fine-tune its behavior, which are documented here. To quickly find the setting you're looking for, use the Table of Contents in the sidebar.

As described in the Configuration guide, Meltano will determine the values of these settings by first looking in the environment, then in your project's .env file, and finally in your meltano.yml project file, falling back to a default value if nothing was found.
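
The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the precedence, not Meltano's implementation: the function name resolve_setting is hypothetical, and treating meltano.yml as a flat mapping is a simplifying assumption.

```python
from typing import Any, Mapping, Tuple


def resolve_setting(
    env_var: str,
    name: str,
    environ: Mapping[str, str],
    dotenv: Mapping[str, str],
    meltano_yml: Mapping[str, Any],
    default: Any = None,
) -> Tuple[Any, str]:
    """Return (value, source) following the order: environment, .env, meltano.yml, default."""
    if env_var in environ:
        return environ[env_var], "env"
    if env_var in dotenv:
        return dotenv[env_var], "dotenv"
    if name in meltano_yml:
        return meltano_yml[name], "meltano_yml"
    return default, "default"


# Hypothetical inputs: .env sets the value, so it wins over meltano.yml.
value, source = resolve_setting(
    "MELTANO_CLI_LOG_LEVEL",
    "cli.log_level",
    environ={},
    dotenv={"MELTANO_CLI_LOG_LEVEL": "debug"},
    meltano_yml={"cli.log_level": "warning"},
    default="info",
)
print(value, source)  # debug dotenv
```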

You can use meltano config meltano list to list all available settings with their names, environment variables, and current values.

Configuration that is not environment-specific or sensitive should be stored in your meltano.yml project file and checked into version control. Sensitive values like passwords and tokens are most appropriately stored in the environment or your project's .env file.

meltano config meltano set <setting> <value>, which is used in the examples below, will automatically store configuration in meltano.yml or .env as appropriate.

If supported by the plugin type, its configuration can be tested using meltano config <plugin> test.

Plugin settings

For plugin settings, refer to the specific plugin's documentation (extractors, loaders), or use meltano config <plugin> list to list all available settings with their names, environment variables, and current values.

python

The Python version to use for this plugin, specified as a path, or as the name of an executable to find within a directory in $PATH.

If not specified, the top-level python setting will be used, or if it is not set, the python executable that was used to run Meltano will be used (within a separate virtual environment).

This setting only applies when creating new virtual environments. If you've already created a virtual environment and you'd like to use a new Python version for it, you'll need to delete it from within .meltano/, then run meltano install for that plugin again.

This setting only applies to base plugins, which have their own virtual environment. Inherited plugins necessarily use the same virtual environment (and thus, the Python version) as their base plugin.

Your Meltano project

These are settings specific to your Meltano project.

python

The Python version to use for plugins, specified as a path, or as the name of an executable to find within a directory in $PATH.

If not specified, the python executable that was used to run Meltano will be used (within a separate virtual environment).

This can be overridden on a per-plugin basis by setting the python property for the plugin.

This setting only applies when creating new virtual environments. If you've already created a virtual environment and you'd like to use a new Python version for it, you'll need to delete it from within .meltano/, then run meltano install for that plugin again.

This setting only applies to base plugins, which have their own virtual environment. Inherited plugins necessarily use the same virtual environment (and thus, the Python version) as their base plugin.

send_anonymous_usage_stats

Meltano is open source software that's free for anyone to use. Aside from contributing code or reporting issues, the best thing a user can do to give back to the community is to contribute anonymous usage stats. These allow the maintainers to understand how features are being used, ultimately helping the community build a better product.

By default, Meltano shares anonymous usage data with the Meltano team using Snowplow. We use this data to learn about the size of our user base and the specific Meltano features they are using, which helps us determine the highest impact changes we can make in each release to make Meltano even more useful for you and others like you.

We also provide some of this data back to the community via MeltanoHub to help users understand the overall usage of plugins within Meltano.

If enabled, Meltano will use the value of the project_id setting to uniquely identify your project. If the project ID is a UUID, then it will be sent unchanged. Otherwise, it will be hashed, and its hash will be used to derive a UUID which will be used to uniquely identify your project.
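
As a rough illustration of the rule above, the sketch below passes a UUID through unchanged and derives a UUID from a hash otherwise. The choice of SHA-256 and the way the digest is truncated to 32 hex characters are assumptions made for this example; the function name project_uuid is hypothetical and Meltano's exact derivation may differ.

```python
import hashlib
import uuid


def project_uuid(project_id: str) -> uuid.UUID:
    """Return the project ID as a UUID, hashing it first if it is not already one."""
    try:
        # A value that already parses as a UUID is used unchanged.
        return uuid.UUID(project_id)
    except ValueError:
        # Assumption: SHA-256, keeping the first 32 hex characters of the digest.
        digest = hashlib.sha256(project_id.encode("utf-8")).hexdigest()
        return uuid.UUID(digest[:32])


print(project_uuid("123e4567-e89b-12d3-a456-426614174000"))
print(project_uuid("my-meltano-project"))
```

Because the hash is deterministic, the same non-UUID project ID always maps to the same derived UUID.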

This project ID is also sent along when Meltano requests available plugins from the URLs identified by the hub_url.

If you'd like to send the tracking data to a different Snowplow account than the one run by the Meltano team, the collector endpoints can be configured using the snowplow.collector_endpoints setting.

See the anonymization standards and the anonymous usage stats Q&A below for more details. Also refer to the Meltano data team handbook page for our "Philosophy of Telemetry".

With all that said, if you'd still prefer to use Meltano without sending the maintainers this kind of data, you're able to disable tracking entirely using one of these methods:

  • When creating a new project, pass --no-usage-stats to meltano init
  • In an existing project, set the send_anonymous_usage_stats setting to false
  • To disable tracking in all projects in one go, set the MELTANO_SEND_ANONYMOUS_USAGE_STATS environment variable to false

How to use

meltano config meltano set send_anonymous_usage_stats false

Anonymization Standards

Unless otherwise approved, any user-entered data is anonymized client-side before being submitted to Meltano. This section describes which data is sent in clear text and which data is obfuscated via one-way hashing.

We capture these in clear text:

  • plugin names
  • plugin variant names
  • command names
  • execution context, such as:
    • OS version
    • Python version
    • project ID

We anonymize these with one-way hashing before reporting:

  • CLI args
  • plugin config

These items will never be collected or reported back to Meltano:

  • your settings values
  • your secrets or credentials
  • the contents of your meltano.yml file

Anonymous Usage Stats Q&A

Q: What is a one-way hash and how is it helpful?

A:

One-way hashing is a way of obfuscating sensitive data such that:

  1. The same input value always produces the same output value (aka "hash").
  2. The results are mathematically and statistically extremely difficult (read: near impossible) to reverse engineer back to the source value.
  3. Hash results are extremely helpful in safely and anonymously detecting changes to a file or configuration. Without passing the entire configuration, and without providing a hacker any means of decoding/decrypting the data back to its source, we can see that a file (such as meltano.yml) has not changed since its last hash was generated.
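
The properties listed above can be demonstrated with Python's standard library. SHA-256 is used here as an assumption for illustration only, and the sample configuration strings are made up.

```python
import hashlib


def one_way_hash(value: str) -> str:
    """Return a hex digest of the value; the input cannot practically be recovered from it."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical file contents, before and after an edit.
config_v1 = "plugins:\n  extractors:\n  - name: tap-gitlab\n"
config_v2 = config_v1 + "  - name: tap-github\n"

# Property 1: the same input always produces the same hash.
same = one_way_hash(config_v1) == one_way_hash(config_v1)

# Property 3: a change to the content is detectable without sending the content itself.
changed = one_way_hash(config_v1) != one_way_hash(config_v2)

print(same, changed)  # True True
```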

Q: Why does Meltano use hashing?

A:

Meltano hashes any field that could be used by a hacker to compromise a project or user. We will never know what freeform text arguments you passed in via the command line, and we won't have any data that could be used to compromise your environment. Whatever data we do collect, we will never sell, share, or trade with any third parties.

Q: Should I enable or disable anonymous reporting?

A:

We hope you will choose to enable reporting, because this really does help us, and it helps the Meltano community in a very real way.

If you still have any concerns about keeping anonymous reporting enabled, we hope you'll share those concerns with us. You can do so by emailing hello@meltano.com or by logging an issue in our Meltano Issue Tracker.

project_id

Used by Meltano to uniquely identify your project if the send_anonymous_usage_stats setting is enabled.

How to use

meltano config meltano set project_id '<unique identifier>'

database_uri

  • Environment variable: MELTANO_DATABASE_URI
  • meltano * CLI option: --database-uri
  • Default: sqlite:///$MELTANO_SYS_DIR_ROOT/meltano.db

Meltano stores various types of metadata in a project-specific system database, which takes the shape of a SQLite database stored inside the .meltano directory at .meltano/meltano.db by default.

You can choose to use a different system database backend or configuration using the --database-uri option of meltano subcommands, or the MELTANO_DATABASE_URI environment variable.

Because internal database migrations make use of the ALTER TABLE table RENAME COLUMN oldname TO newname syntax starting with Meltano v2.2.0, the minimum required SQLite version is now 3.25.1.

Some systems may come with an older version by default. You can run sqlite3 --version to check your version.
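
The sqlite3 command-line tool and the SQLite library that Python links against can be different versions, so it may also help to check from Python. The sketch below compares the library version against the 3.25.1 minimum stated above; the helper name sqlite_is_supported is hypothetical.

```python
import sqlite3

# Minimum SQLite version stated in this section.
MIN_SQLITE_VERSION = (3, 25, 1)


def sqlite_is_supported(version_info=sqlite3.sqlite_version_info) -> bool:
    """Return True if the given SQLite version tuple meets the minimum."""
    return tuple(version_info) >= MIN_SQLITE_VERSION


print("SQLite library version:", sqlite3.sqlite_version)
print("Meets minimum:", sqlite_is_supported())
```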

How to use

PostgreSQL
meltano config meltano set database_uri postgresql+psycopg://<username>:<password>@<host>:<port>/<database>
SQL Server (MSSQL)
meltano config meltano set database_uri mssql+pymssql://<username>:<password>@<host>:<port>/<database>

Using databases other than SQLite requires installing Meltano with extra components.

Targeting a PostgreSQL Schema

When using PostgreSQL as your system database, you can choose the target schema within that database by adding ?options=-csearch_path%3D<schema> directly to the end of your database_uri or MELTANO_DATABASE_URI.

You are also able to add multiple schemas, which PostgreSQL will work through from left to right until it finds a valid schema to target, by using ?options=-csearch_path%3D<schema>,<schema_two>.

If you don't target a schema, then by default PostgreSQL will try to use the public schema.

postgresql+psycopg://<username>:<password>@<host>:<port>/<database>?options=-csearch_path%3D<schema>
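
The %3D in the URI above is the percent-encoded form of the = sign. If you build the URI programmatically, a small helper like the following can do the encoding for you. The helper name with_search_path and the credentials in the usage example are hypothetical.

```python
from urllib.parse import quote


def with_search_path(database_uri: str, *schemas: str) -> str:
    """Append an options=-csearch_path=... query parameter, percent-encoding the '='."""
    options = "-csearch_path=" + ",".join(schemas)
    # Use '&' if the URI already carries a query string, otherwise start one with '?'.
    separator = "&" if "?" in database_uri else "?"
    # Keep commas literal so multiple schemas match the form shown above.
    return f"{database_uri}{separator}options={quote(options, safe=',')}"


uri = with_search_path(
    "postgresql+psycopg://meltano:secret@localhost:5432/warehouse",
    "meltano",
    "public",
)
print(uri)
# postgresql+psycopg://meltano:secret@localhost:5432/warehouse?options=-csearch_path%3Dmeltano,public
```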

database_max_retries

This sets the maximum number of reconnection attempts in case the initial connection to the database fails because it isn't available when Meltano starts up.

Note: This affects the initial connection attempt only, after which the connection is cached. Subsequent disconnections are handled by SQLAlchemy.

How to use

meltano config meltano set database_max_retries 3

database_retry_timeout

This controls the retry interval (in seconds) in case the initial connection to the database fails because it isn't available when Meltano starts up.

Note: This affects the initial connection attempt only, after which the connection is cached. Subsequent disconnections are handled by SQLAlchemy.

How to use

meltano config meltano set database_retry_timeout 5
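
To show how database_max_retries and database_retry_timeout interact, here is a minimal retry loop. It is a sketch under stated assumptions rather than Meltano's code: the function name connect_with_retries is hypothetical, ConnectionError stands in for whatever error the database driver raises, and it assumes the first attempt is not counted as a retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    max_retries: int = 3,
    retry_timeout: float = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect(), retrying up to max_retries times and waiting retry_timeout seconds between attempts."""
    retries = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            if retries >= max_retries:
                # Out of retries: surface the last error to the caller.
                raise
            retries += 1
            sleep(retry_timeout)
```

With max_retries set to 3 and retry_timeout set to 5, a database that never becomes available is tried four times in total, with roughly 15 seconds of waiting before the error is raised.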

project_readonly

Enable this setting to indicate that your Meltano project is deployed as read-only, and to block all modifications to project files through the CLI in this environment.

Specifically, this prevents adding plugins or pipeline schedules to your meltano.yml project file, as well as modifying plugin configuration stored in meltano.yml or .env.

Note that meltano config <plugin> set can still be used to store configuration in the system database, but that settings that are already set in the environment or meltano.yml take precedence and cannot be overridden.

How to use

meltano config meltano set project_readonly true

hub_api_root

This sets the value of the root URL for the Hub API.

If provided, this setting overrides the hub_url.

How to use

meltano config meltano set hub_api_root "https://mysite.com/my-plugins"
meltano config meltano set hub_api_root false

hub_url

Where Meltano can find the Hub that lists all discoverable plugins.

The Hub is primarily used by meltano add and meltano lock. It is also used in cases where the full plugin definition is needed but no lock artifact is found.

How to use

meltano config meltano set hub_url http://localhost:4000

hub_url_auth

The value of the Authorization header sent when making a request to hub_url.

No Authorization header is applied under the following conditions:

  • hub_url_auth is not set
  • hub_url_auth is set to false, null or an empty string

How to use

meltano config meltano set hub_url_auth "Bearer $ACCESS_TOKEN"
meltano config meltano set hub_url_auth false
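
The conditions under which no Authorization header is sent can be summarized as follows. This is an illustrative sketch only; the function name hub_request_headers is hypothetical and is not part of Meltano's API.

```python
from typing import Dict, Optional, Union


def hub_request_headers(hub_url_auth: Optional[Union[str, bool]]) -> Dict[str, str]:
    """Return the headers for a Hub request, omitting Authorization when the setting is unset or falsy."""
    if hub_url_auth is None or hub_url_auth is False or hub_url_auth == "":
        return {}
    return {"Authorization": str(hub_url_auth)}


print(hub_request_headers("Bearer example-token"))  # {'Authorization': 'Bearer example-token'}
print(hub_request_headers(False))  # {}
```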

meltano CLI

These settings can be used to modify the behavior of the meltano CLI.

cli.log_level

  • Environment variable: MELTANO_CLI_LOG_LEVEL
  • meltano CLI option: --log-level
  • Options: debug, info, warning, error, critical
  • Default: info

The granularity of CLI logging. Ignored if a local logging config is found.

How to use

meltano config meltano set cli log_level debug

cli.log_config

The path to a valid YAML-formatted Python logging dict config file. If the file is present, it is used to configure logging.

How to use

meltano config meltano set cli log_config /path/to/logging.yaml

A sample logging config:

version: 1
disable_existing_loggers: false

formatters:
  default:
    format: "[%(asctime)s] [%(process)d|%(threadName)10s|%(name)s] [%(levelname)s] %(message)s"
  structured_plain:
    (): meltano.core.logging.console_log_formatter
    colors: False
  structured_colored:
    (): meltano.core.logging.console_log_formatter
    colors: True
  key_value:
    (): meltano.core.logging.key_value_formatter
    sort_keys: False
  json:
    (): meltano.core.logging.json_formatter

handlers:
  console:
    class: logging.StreamHandler
    level: DEBUG
    formatter: structured_colored
    stream: "ext://sys.stderr"
  file:
    class: logging.FileHandler
    level: INFO
    filename: /var/log/meltano.log
    formatter: json

root:
  level: DEBUG
  propagate: yes
  handlers: [console, file]
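
The file above follows the schema of Python's logging.config.dictConfig. To get a feel for how the sections fit together, the sketch below applies a cut-down dict with the same structure using only the standard library. The Meltano-specific formatters and the file handler are left out so that it runs anywhere; the logger name "example" is arbitrary.

```python
import logging
import logging.config

# A reduced version of the sample config, expressed as a Python dict.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {
            "format": "[%(asctime)s] [%(name)s] [%(levelname)s] %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "default",
            "stream": "ext://sys.stderr",
        },
    },
    "root": {
        "level": "DEBUG",
        "handlers": ["console"],
    },
}

logging.config.dictConfig(LOGGING)
logging.getLogger("example").debug("logging configured")
```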

cli.cwd

  • meltano CLI option: --cwd
  • Default: Current directory

With the --cwd option, you can run Meltano as if it had been started in a specified directory.

How to use

Specify the path to a directory which contains a valid meltano.yml project file.

meltano --cwd '/path/containing/meltano_yml'

meltano elt

These settings can be used to modify the behavior of meltano el and meltano elt.

elt.buffer_size

Size (in bytes) of the buffer between extractor and loader (Singer tap and target) that stores messages output by the extractor while they are waiting to be processed by the loader.

When an extractor generates messages (records) faster than the loader can process them, the buffer may fill up completely, at which point the extractor will be blocked until the loader has worked through enough messages to make half of the buffer size available again for new extractor output.

The length of a single line of extractor output is limited to half the buffer size. With a default buffer size of 10MiB, the maximum message size would therefore be 5MiB.

How to use

meltano config meltano set elt.buffer_size 52428800 # 50MiB in bytes
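
The byte values used in this section follow from simple arithmetic, which the snippet below spells out. The helper name max_message_bytes is hypothetical; the half-buffer limit is the rule stated above.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def max_message_bytes(buffer_size_bytes: int) -> int:
    """A single line of extractor output is limited to half the buffer size."""
    return buffer_size_bytes // 2


default_buffer = 10 * MIB
larger_buffer = 50 * MIB

print(larger_buffer)                      # 52428800
print(max_message_bytes(default_buffer))  # 5242880 (5MiB)
print(max_message_bytes(larger_buffer))   # 26214400 (25MiB)
```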

State Backends

state_backend.uri

URI for the state backend where you'd like Meltano to store state.

How to use

meltano config meltano set state_backend.uri "s3://your_bucket/meltano/state"

state_backend.lock_timeout_seconds

Number of seconds that a lock for a state ID should be considered valid in a state backend.

How to use

meltano config meltano set state_backend.lock_timeout_seconds 720

state_backend.lock_retry_seconds

Number of seconds that Meltano should wait when trying to access or modify state for a state ID that is locked.

How to use

meltano config meltano set state_backend.lock_retry_seconds 720

Azure-Specific Settings


state_backend.azure.storage_account_url

You can sign in to Azure using the Azure CLI.

The storage_account_url will be used to fetch the default Azure credentials in the host system.

At least one of state_backend.azure.storage_account_url and state_backend.azure.connection_string has to be set in order to use Azure Blob Storage as a state backend. If state_backend.azure.storage_account_url is not set, Meltano will try to use state_backend.azure.connection_string.

How to use

meltano config meltano set state_backend.azure.storage_account_url "https://<myStorageAccountName>.blob.core.windows.net"

state_backend.azure.connection_string

The Azure connection string to use when authenticating to Azure.

How to use

meltano config meltano set state_backend.azure.connection_string "DefaultEndpointsProtocol=https;AccountName=myAccountName;AccountKey=myAccountKey"

S3-Specific Settings


state_backend.s3.aws_access_key_id

The AWS access key ID to use when authenticating to S3.

How to use

meltano config meltano set state_backend.s3.aws_access_key_id "someaccesskeyid"

state_backend.s3.aws_secret_access_key

The AWS secret access key to use when authenticating to S3.

How to use

meltano config meltano set state_backend.s3.aws_secret_access_key "somesecretaccesskey"

state_backend.s3.endpoint_url

The endpoint URL to use when connecting to S3. Only necessary if using S3-compatible storage not hosted by AWS (e.g. MinIO).

How to use

meltano config meltano set state_backend.s3.endpoint_url "https://play.min.io:9000"

GCS-Specific Settings


state_backend.gcs.application_credentials

Path to the credential file to use when authenticating to Google Cloud Storage.

How to use

meltano config meltano set state_backend.gcs.application_credentials "path/to/creds.json"

Virtual environments

venv.backend

Snowplow Tracking

snowplow.collector_endpoints

Snowplow collector endpoints to be used if the send_anonymous_usage_stats setting is enabled. Events will be sent to all of these collectors.

Feature Flags

ff.strict_env_var_mode

Causes an exception to be raised if an environment variable is used within the project's Meltano configuration but that environment variable is not set.
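
The difference between strict and non-strict handling can be sketched as follows. This is an illustration, not Meltano's implementation: the names expand_env_vars and UndefinedEnvVarError are hypothetical, and replacing an unset variable with an empty string in non-strict mode is an assumption.

```python
import re
from typing import Mapping

# Matches ${NAME} or $NAME.
_ENV_VAR = re.compile(r"\$\{([A-Za-z_]\w*)\}|\$([A-Za-z_]\w*)")


class UndefinedEnvVarError(Exception):
    """Raised in strict mode when a referenced environment variable is not set."""


def expand_env_vars(raw: str, env: Mapping[str, str], strict: bool = False) -> str:
    """Substitute environment variable references in raw using env."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1) or match.group(2)
        if name in env:
            return env[name]
        if strict:
            raise UndefinedEnvVarError(f"Environment variable '{name}' is not set")
        # Assumption: non-strict mode silently substitutes an empty string.
        return ""

    return _ENV_VAR.sub(replace, raw)


print(expand_env_vars("s3://$BUCKET/meltano/state", {"BUCKET": "your_bucket"}))
# s3://your_bucket/meltano/state
```

In strict mode, a typo in a variable name fails immediately instead of producing a value such as s3:///meltano/state.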

ff.plugin_locks_required

When this flag is enabled, plugins will only use lock files to determine the settings, installation source, etc., with the exception of meltano add operations. This means that calling meltano run will fail if a lock file is not present for one of the plugins.
