Plugins

Meltano takes a modular approach to data engineering in general and EL(T) in particular: your project and pipelines are composed of plugins of different types, most notably extractors (Singer taps), loaders (Singer targets), and utilities (dbt for transformation, Airflow/Dagster/etc. for orchestration, and much more on MeltanoHub).

Meltano provides the glue to make these components work together smoothly and enables consistent configuration and deployment.

To learn how to manage your project's plugins, refer to the Plugin Management guide.

Project Plugins

In order to use a given package as a plugin in a project, assuming it meets the requirements of the plugin type in question, Meltano needs to know:

1. where to find the package, typically a pip package identified by its name on PyPI, a public or private Git repository URL, or a local directory path,
2. what settings it supports,
3. what capabilities it supports, and finally
4. what its configuration should be when invoked.

Together, a package's location (1) and the metadata (2-3) describing it in terms Meltano can understand make up the base plugin description. In your project, plugins extend this description with a specific configuration (4) and a unique name.

This means that different configurations of the same package (base plugin) are represented in your project as separate plugins with their own unique names, which can be thought of as differently initialized instances of the same class. For example: extractors tap-postgres--billing and tap-postgres--events derived from base extractor tap-postgres, or tap-google-analytics--client-foo and tap-google-analytics--client-bar derived from base extractor tap-google-analytics.

Each plugin in a project can either:

inherit its base plugin description from a discoverable plugin that's supported out of the box,
define its base plugin description explicitly, making it a custom plugin, or
inherit both base plugin description and configuration from another plugin in the project.

To learn how to add a plugin to your project, refer to the Plugin Management guide.

Discoverable plugins

Base plugin descriptions for many popular extractors (Singer taps), loaders (Singer targets), and other plugins have already been collected by users and contributed to Meltano Hub, making them supported out of the box.

To find discoverable plugins, refer to the lists of Extractors, Loaders, etc., on Meltano Hub.

To learn how to add a discoverable plugin to your project using a shadowing plugin definition or inheriting plugin definition, refer to the Plugin Management guide.

Variants

In the case of various popular data sources and destinations, multiple alternative implementations of Singer taps (extractors) and targets (loaders) exist, some of which are forks of an original (canonical) version that evolved in their own direction, while others were developed independently from the start.

These different implementations and their repositories typically use the same name (tap-<source> or target-<destination>) and may on the surface appear interchangeable, but often vary significantly in terms of exact behavior, quality, and supported settings.

In its index of discoverable plugins, Meltano considers these different implementations different variants of the same plugin, that share a plugin name and other source/destination-specific details (like a logo and description), but have their own implementation-specific variant name and metadata (like capabilities and settings).

Every discoverable plugin has a default variant that is known to work well and recommended for new users, which will be added to your project unless you explicitly select a different one. Users who already have experience with a different variant (or have specific reasons to prefer it) can explicitly choose to add it to their project instead of the default, so that they get the same behavior and can use the same settings as before. If the variant in question is not discoverable yet, it can be added as a custom plugin.

To learn how to add a non-default variant of a discoverable plugin to your project, refer to the Plugin Management guide.

Custom plugins

If you'd like to use a package in your project whose base plugin description isn't discoverable yet, you'll need to collect and provide this metadata yourself.

To learn how to add a custom plugin to your project using a custom plugin definition, refer to the Plugin Management guide.

Once you've got the plugin working in your project, please consider contributing its description to Meltano Hub to make it discoverable and supported out of the box for new users!

Plugin Inheritance

If you'd like to use the same package (base plugin) in your project multiple times with different configurations, you can add a new plugin that inherits from an existing one.

The new plugin will inherit its parent's base plugin description and configuration as if they were defaults, which can then be overridden as appropriate.
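The "parent values act as defaults" behavior can be pictured with plain dictionaries. This is only an illustration of the idea, not Meltano's implementation; the setting names and values, and the `inherit` helper itself, are hypothetical.

```python
from copy import deepcopy

# Hypothetical parent plugin; the setting names and values are made up for illustration.
PARENT = {
    "name": "tap-postgres",
    "pip_url": "tap-postgres",
    "config": {"host": "db.internal", "port": 5432, "dbname": "postgres"},
}


def inherit(parent: dict, name: str, **overrides) -> dict:
    """Return a child plugin whose settings default to the parent's values."""
    child = deepcopy(parent)
    child["name"] = name
    child["inherit_from"] = parent["name"]
    child["config"].update(overrides)  # explicit child values win over inherited ones
    return child


billing = inherit(PARENT, "tap-postgres--billing", dbname="billing")
events = inherit(PARENT, "tap-postgres--events", dbname="events", port=5433)
```

Here `billing` keeps the parent's `host` and `port` and only replaces `dbname`, while the parent's own configuration is left untouched.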

For performance reasons, inherited plugins with a pip_url identical to their parent's share the parent's underlying Python virtualenv. If you would prefer to create a separate virtualenv for an inherited plugin, change its pip_url so that it differs from its parent's.

To learn how to add an inheriting plugin to your project using an inheriting plugin definition, refer to the Plugin Management guide.

Lock artifacts

When you add a plugin to your project using meltano add, the discoverable plugin definition of the plugin will be downloaded and added to your project under plugins/<plugin_type>/<plugin_name>--<variant_name>.lock. This will ensure that the plugin's definition will be stable and version-controlled.

Later invocations of the plugin will use this file to determine the settings, installation source, etc.

Note that custom and inherited plugins do not get a lock file.
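As a small sketch of the lock file naming pattern described above, the helper below assembles the path from its three parts. The function name is hypothetical, and the example values ("extractors", "meltanolabs") are assumptions chosen for illustration rather than values taken from this page.

```python
from pathlib import PurePosixPath


def lock_file_path(plugin_type: str, plugin_name: str, variant_name: str) -> PurePosixPath:
    """Build plugins/<plugin_type>/<plugin_name>--<variant_name>.lock."""
    return PurePosixPath("plugins") / plugin_type / f"{plugin_name}--{variant_name}.lock"


# "extractors" and "meltanolabs" are example values, not taken from the text above.
path = lock_file_path("extractors", "tap-gitlab", "meltanolabs")
print(path)  # plugins/extractors/tap-gitlab--meltanolabs.lock
```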

Types

Meltano supports the following types of plugins:

Extractors pull data out of arbitrary data sources.
Mappers perform stream map transforms on data between extractors and loaders.
Loaders load extracted data into arbitrary data destinations.
Utilities perform arbitrary tasks provided by pip packages with executables. All plugins previously referred to as transformers and orchestrators are being transitioned to utilities.
File bundles bundle files you may want in your project.

These plugin types are still supported but are transitioning to being referred to as Utilities:

Orchestrators orchestrate a project's scheduled pipelines.
Transformers run transforms.

These plugin types are deprecated:

Transforms transform data that has been loaded into a database (data warehouse).

Extractors

Extractors are pip packages used by meltano run or meltano invoke as part of data integration. They are responsible for pulling data out of arbitrary data sources: databases, SaaS APIs, or file formats.

Meltano supports Singer taps: executables that implement the Singer specification.

To learn which extractors are discoverable and supported out of the box, refer to the Extractors page.

Extras

Extractors support the following extras:

catalog
load_schema
metadata
schema
select
select_filter
state
use_cached_catalog

`catalog` extra

Setting: _catalog
Environment variable: <EXTRACTOR>__CATALOG, e.g. TAP_GITLAB__CATALOG
meltano elt CLI option: --catalog
Default: None

An extractor's catalog extra holds a path to a catalog file (relative to the project directory) to be provided to the extractor when it is run in sync mode using meltano elt or meltano invoke.

If a catalog path is not set, the catalog will be generated on the fly by running the extractor in discovery mode and applying the schema, selection, and metadata rules to the discovered file.

Selection filter rules are always applied to manually provided catalogs as well as discovered ones.

While this extra can be managed using meltano config or environment variables like any other setting, a catalog file is typically provided using meltano elt's --catalog option.

If the catalog does not seem to take effect, you may need to validate the capabilities of the tap.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  catalog: extract/tap-gitlab.catalog.json

meltano config <extractor> set _catalog <path>

meltano elt <extractor> <loader> --catalog <path>

# For example:
meltano config tap-gitlab set _catalog extract/tap-gitlab.catalog.json

meltano elt tap-gitlab target-jsonl --catalog extract/tap-gitlab.catalog.json

export <EXTRACTOR>__CATALOG=<path>

# For example:
export TAP_GITLAB__CATALOG=extract/tap-gitlab.catalog.json
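The environment variable names used for extras on this page follow a visible pattern: the plugin name in upper case with dashes turned into underscores, a double underscore, then the extra name in upper case. The helper below is a sketch of that pattern inferred from the examples here; it is not a Meltano API, and the function name is hypothetical.

```python
def extra_env_var(plugin_name: str, extra: str) -> str:
    """Derive the <PLUGIN>__<EXTRA> variable name for a plugin extra."""
    prefix = plugin_name.upper().replace("-", "_")
    return f"{prefix}__{extra.upper()}"


print(extra_env_var("tap-gitlab", "catalog"))  # TAP_GITLAB__CATALOG
```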

`load_schema` extra

Setting: _load_schema
Environment variable: <EXTRACTOR>__LOAD_SCHEMA, e.g. TAP_GITLAB__LOAD_SCHEMA
Default: $MELTANO_EXTRACTOR_NAMESPACE, which will expand to the extractor's namespace, e.g. tap_gitlab for tap-gitlab

An extractor's load_schema extra holds the name of the database schema extracted data should be loaded into, when this extractor is used in a pipeline with a loader for a database that supports schemas, like PostgreSQL or Snowflake.

The value of this extra can be referenced from a loader's configuration using the MELTANO_EXTRACT__LOAD_SCHEMA pipeline environment variable. It is used as the default value for the target-postgres and target-snowflake schema settings.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  load_schema: gitlab_data

meltano config <extractor> set _load_schema <schema>

# For example:
meltano config tap-gitlab set _load_schema gitlab_data

export <EXTRACTOR>__LOAD_SCHEMA=<schema>

# For example:
export TAP_GITLAB__LOAD_SCHEMA=gitlab_data

`metadata` extra

Setting: _metadata, alias: metadata
Environment variable: <EXTRACTOR>__METADATA, e.g. TAP_GITLAB__METADATA
Default: {} (an empty object)

An extractor's metadata extra holds an object describing Singer stream and property metadata rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano run, meltano invoke, or meltano elt. These rules are not applied when a catalog is provided manually.

Stream (entity) metadata <key>: <value> pairs (e.g. {"replication-method": "INCREMENTAL"}) are nested under top-level entity identifiers that correspond to Singer stream tap_stream_id values. These nested properties can also be thought of and interacted with as settings named _metadata.<entity>.<key>.

Property (attribute) metadata <key>: <value> pairs (e.g. {"is-replication-key": true}) are nested under top-level entity identifiers and second-level attribute identifiers that correspond to Singer stream property names. These nested properties can also be thought of and interacted with as settings named _metadata.<entity>.<attribute>.<key>.

Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.

Entity and attribute names can be discovered using meltano select --list --all <plugin>.
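To make the wildcard matching concrete, here is a rough sketch that resolves stream-level metadata for a given stream ID using Python's `fnmatch` module, which implements Unix shell-style wildcards. The rule set and the `stream_metadata` helper are hypothetical, and the way overlapping rules combine (later rules overwrite earlier ones) is an assumption, not documented Meltano behavior.

```python
from fnmatch import fnmatchcase

# Hypothetical rules in the shape of the `metadata` extra.
RULES = {
    "some_stream_id": {"replication-method": "INCREMENTAL", "replication-key": "created_at"},
    "audit_*": {"replication-method": "FULL_TABLE"},
}


def stream_metadata(rules: dict, stream_id: str) -> dict:
    """Collect stream-level metadata from every rule whose pattern matches stream_id."""
    resolved = {}
    for pattern, values in rules.items():
        if fnmatchcase(stream_id, pattern):
            # Nested dicts describe attribute-level metadata and are skipped here.
            resolved.update({k: v for k, v in values.items() if not isinstance(v, dict)})
    return resolved


audit = stream_metadata(RULES, "audit_log")
```

With these rules, `audit_log` and any other stream whose ID starts with `audit_` picks up `FULL_TABLE`, while a stream that matches no pattern gets an empty result.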

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-postgres
  metadata:
    some_stream_id:
      replication-method: INCREMENTAL
      replication-key: created_at
      created_at:
        is-replication-key: true

meltano config <extractor> set _metadata <entity> <key> <value>
meltano config <extractor> set _metadata <entity> <attribute> <key> <value>

# For example:
meltano config tap-postgres set _metadata some_stream_id replication-method INCREMENTAL
meltano config tap-postgres set _metadata some_stream_id replication-key created_at
meltano config tap-postgres set _metadata some_stream_id created_at is-replication-key true

export <EXTRACTOR>__METADATA='{"<entity>": {"<key>": "<value>", "<attribute>": {"<key>": "<value>"}}}'

# Once metadata has been set in `meltano.yml`, environment variables can be used
# to override specific nested properties:
export <EXTRACTOR>__METADATA_<ENTITY>_<ATTRIBUTE>_<KEY>=<value>

# For example:
export TAP_POSTGRES__METADATA_SOME_STREAM_ID_REPLICATION_METHOD=INCREMENTAL
export TAP_POSTGRES__METADATA_SOME_STREAM_ID_REPLICATION_KEY=created_at

Common metadata keys that can be set include:

key-properties: A list of properties that together uniquely identify a record in the stream. For example, ["id"].
replication-key: The name of a property in the source to use as a bookmark. For example, this will often be an "updated_at" field or an auto-incrementing primary key (requires replication-method).
replication-method: The replication method to use for a stream. Either FULL_TABLE or INCREMENTAL. Some extractors also support LOG_BASED, so check the extractor's documentation for details.

`schema` extra

Setting: _schema
Environment variable: <EXTRACTOR>__SCHEMA, e.g. TAP_GITLAB__SCHEMA
Default: {} (an empty object)

An extractor's schema extra holds an object describing Singer stream schema override rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano elt or meltano invoke. These rules are not applied when a catalog is provided manually.

JSON Schema descriptions for specific properties (attributes) (e.g. {"type": ["string", "null"], "format": "date-time"}) are nested under top-level entity identifiers that correspond to Singer stream tap_stream_id values, and second-level attribute identifiers that correspond to Singer stream property names. These nested properties can also be thought of and interacted with as settings named _schema.<entity>.<attribute> and _schema.<entity>.<attribute>.<key>.

Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.

Entity and attribute names can be discovered using meltano select --list --all <plugin>.

If a schema is specified for a property that does not yet exist in the discovered stream's schema, the property (and its schema) will be added to the catalog. This allows you to define a full schema for taps such as tap-dynamodb that do not themselves have the ability to discover the schema of their streams.
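The effect of a schema override on a discovered stream schema can be sketched as a merge into the schema's `properties` object. This is an illustration only: the helper name is hypothetical, and merging key by key into an existing property is an assumption based on the `_schema.<entity>.<attribute>.<key>` setting names, not a description of Meltano's internals.

```python
from copy import deepcopy


def apply_schema_overrides(stream_schema: dict, overrides: dict) -> dict:
    """Return a copy of stream_schema with per-property overrides merged in."""
    result = deepcopy(stream_schema)
    properties = result.setdefault("properties", {})
    for attribute, description in overrides.items():
        # Existing properties are updated key by key; missing ones are created.
        properties.setdefault(attribute, {}).update(description)
    return result


discovered = {"type": "object", "properties": {"id": {"type": "integer"}}}
overrides = {"created_at": {"type": ["string", "null"], "format": "date-time"}}
patched = apply_schema_overrides(discovered, overrides)
```

In this example `created_at` did not exist in the discovered schema, so it is added with the overriding description, and the existing `id` property is left as it was.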

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-postgres
  schema:
    some_stream_id:
      created_at:
        type: ["string", "null"]
        format: date-time

meltano config <extractor> set _schema <entity> <attribute> <schema description>
meltano config <extractor> set _schema <entity> <attribute> <key> <value>

# For example:
meltano config tap-postgres set _schema some_stream_id created_at type '["string", "null"]'
meltano config tap-postgres set _schema some_stream_id created_at format date-time

export <EXTRACTOR>__SCHEMA='{"<entity>": {"<attribute>": {"<key>": "<value>"}}}'

# Once schema descriptions have been set in `meltano.yml`, environment variables can be used
# to override specific nested properties:
export <EXTRACTOR>__SCHEMA_<ENTITY>_<ATTRIBUTE>_<KEY>=<value>

# For example:
export TAP_POSTGRES__SCHEMA_SOME_STREAM_ID_CREATED_AT_FORMAT=date

`select` extra

Setting: _select
Environment variable: <EXTRACTOR>__SELECT, e.g. TAP_GITLAB__SELECT
Default: ["*.*"]

An extractor's select extra holds an array of entity selection rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano run, meltano invoke, or meltano elt. These rules are not applied when a catalog is provided manually.

A selection rule consists of an entity identifier that corresponds to a Singer stream's tap_stream_id value and an attribute identifier that corresponds to a Singer stream property name, separated by a period (.). Rules indicating that an entity or attribute should be excluded are prefixed with an exclamation mark (!). Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.

Entity and attribute names can be discovered using meltano select --list --all <plugin>.

While this extra can be managed using meltano config or environment variables like any other setting, selection rules are typically specified using meltano select.
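The rule format described above (entity, period, attribute, optional leading `!`, shell-style wildcards) can be sketched as a small evaluator. Treat it as a reading aid rather than Meltano's actual logic: the helper names are hypothetical, the example exclusion rule is made up, and the assumption that an exclusion always overrides an inclusion is mine.

```python
import re
from fnmatch import fnmatchcase


def parse_rule(rule: str) -> tuple:
    """Split a rule into (negated, entity pattern, attribute pattern)."""
    negated = rule.startswith("!")
    body = rule[1:] if negated else rule
    # Split on the first period that is not escaped with a backslash.
    parts = re.split(r"(?<!\\)\.", body, maxsplit=1)
    if len(parts) != 2:
        raise ValueError(f"expected <entity>.<attribute>, got {rule!r}")
    entity, attribute = parts
    return negated, entity.replace("\\.", "."), attribute


def is_selected(rules: list, entity: str, attribute: str) -> bool:
    """Assumed semantics: selected if some rule includes it and no rule excludes it."""
    included = excluded = False
    for negated, entity_pattern, attribute_pattern in map(parse_rule, rules):
        if fnmatchcase(entity, entity_pattern) and fnmatchcase(attribute, attribute_pattern):
            if negated:
                excluded = True
            else:
                included = True
    return included and not excluded


# The exclusion rule here is a made-up example.
RULES = ["project_members.*", "commits.*", "!commits.author_email"]
```

Under these assumptions, `is_selected(RULES, "commits", "id")` is true, while `commits.author_email` and anything in an unlisted stream are not selected.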

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  select:
  - project_members.*
  - commits.*

meltano config <extractor> set _select '["<entity>.<attribute>", ...]'

meltano select <extractor> <entity> <attribute>

# For example:
meltano config tap-gitlab set _select '["project_members.*", "commits.*"]'

meltano select tap-gitlab project_members "*"
meltano select tap-gitlab commits "*"

export <EXTRACTOR>__SELECT='["<entity>.<attribute>", ...]'

# For example:
export TAP_GITLAB__SELECT='["project_members.*", "commits.*"]'

If the extractor uses a dot character within stream names, you can escape it with a backslash (\). For example, if the extractor has a stream named animals.data, you can select fields using the following configuration:

The same selection in meltano.yml, from the terminal, and with an environment variable (in that order):

extractors:
- name: tap-smoke-test
  select:
  - "animals\\.data.id"     # Use double backslash to escape the dot in quoted strings
  - animals\.data.verified  # Use single backslash to escape the dot in unquoted strings

meltano config tap-smoke-test set _select '["animals\\.data.id","animals\\.data.verified"]'

meltano select tap-smoke-test 'animals\.data' id
meltano select tap-smoke-test 'animals\.data' verified

export TAP_SMOKE_TEST__SELECT='["animals\\.data.id","animals\\.data.verified"]'

`select_filter` extra

Setting: _select_filter
Environment variable: <EXTRACTOR>__SELECT_FILTER, e.g. TAP_GITLAB__SELECT_FILTER
meltano elt CLI options: --select and --exclude
Default: []

An extractor's select_filter extra holds an array of entity selection filter rules that are applied to the extractor's discovered or provided catalog file when the extractor is run using meltano run, meltano invoke, or meltano elt, after schema, selection, and metadata rules are applied.

It can be used to only extract records for specific matching entities, or to extract records for all entities except for those specified, by letting you apply filters on top of configured entity selection rules.

Selection filter rules use entity identifiers that correspond to Singer stream tap_stream_id values. Rules indicating that an entity should be excluded are prefixed with an exclamation mark (!). Unix shell-style wildcards can be used in entity identifiers to match multiple entities at once.

Entity names can be discovered using meltano select --list --all <plugin>.

While this extra can be managed using meltano config or environment variables like any other setting, selection filters are typically specified using meltano elt's --select and --exclude options.
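A selection filter narrows the set of already selected entities. The sketch below shows one plausible reading of the rules: if any inclusion patterns are present, only matching entities are kept, and entities matching an exclusion pattern are always dropped. The helper name and this exact combination logic are assumptions, not Meltano's implementation.

```python
from fnmatch import fnmatchcase


def filter_entities(entities: list, select_filter: list) -> list:
    """Keep entities allowed by the filter; an empty filter keeps everything."""
    include = [rule for rule in select_filter if not rule.startswith("!")]
    exclude = [rule[1:] for rule in select_filter if rule.startswith("!")]
    kept = []
    for entity in entities:
        if include and not any(fnmatchcase(entity, pattern) for pattern in include):
            continue  # inclusion patterns exist and none of them match
        if any(fnmatchcase(entity, pattern) for pattern in exclude):
            continue  # explicitly excluded
        kept.append(entity)
    return kept


ENTITIES = ["commits", "project_members", "issues"]  # hypothetical stream IDs
only_commits = filter_entities(ENTITIES, ["commits"])
without_members = filter_entities(ENTITIES, ["!project_members"])
```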

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  select:
  - project_members.*
  - commits.*
  select_filter:
  - commits

meltano config <extractor> set _select_filter '["<entity>", ...]'
meltano config <extractor> set _select_filter '["!<entity>", ...]'

meltano elt <extractor> <loader> --select <entity>
meltano elt <extractor> <loader> --exclude <entity>

# For example:
meltano config tap-gitlab set _select_filter '["commits"]'
meltano config tap-gitlab set _select_filter '["!project_members"]'

meltano elt tap-gitlab target-jsonl --select commits
meltano elt tap-gitlab target-jsonl --exclude project_members

export <EXTRACTOR>__SELECT_FILTER='["<entity>", ...]'
export <EXTRACTOR>__SELECT_FILTER='["!<entity>", ...]'

# For example:
export TAP_GITLAB__SELECT_FILTER='["commits","!project_members"]'

`state` extra

Setting: _state
Environment variable: <EXTRACTOR>__STATE, e.g. TAP_GITLAB__STATE
meltano elt CLI option: --state
Default: None

An extractor's state extra holds a path to a state file (relative to the project directory) to be provided to the extractor when it is run as part of a pipeline using meltano elt.

If a state path is not set, the state will be looked up automatically based on the ELT run's State ID.

While this extra can be managed using meltano config or environment variables like any other setting, a state file is typically provided using meltano elt's --state option.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  state: extract/tap-gitlab.state.json

meltano config <extractor> set _state <path>

meltano elt <extractor> <loader> --state <path>

# For example:
meltano config tap-gitlab set _state extract/tap-gitlab.state.json

meltano elt tap-gitlab target-jsonl --state extract/tap-gitlab.state.json

export <EXTRACTOR>__STATE=<path>

# For example:
export TAP_GITLAB__STATE=extract/tap-gitlab.state.json

`use_cached_catalog` extra

Setting: _use_cached_catalog
Environment variable: <EXTRACTOR>__USE_CACHED_CATALOG, e.g. TAP_GITLAB__USE_CACHED_CATALOG
Default: True

An extractor's use_cached_catalog extra is a boolean flag that, when set to False, disables the use of a cached catalog file during the extractor's discovery process. By default, Meltano will cache the catalog file generated by an extractor to speed up subsequent runs. However, if the extractor's schema has changed in a way that would affect discovery output, you may want to bypass the cache to ensure the latest catalog is used.

Setting this extra to False forces the extractor to perform discovery and generate a new catalog file every time it runs, which can be useful during development or when an extractor supports dynamic catalog discovery, such as in tap-salesforce.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

extractors:
- name: tap-gitlab
  use_cached_catalog: false

meltano config <extractor> set _use_cached_catalog false

# For example:
meltano config tap-gitlab set _use_cached_catalog false

export <EXTRACTOR>__USE_CACHED_CATALOG=false

# For example:
export TAP_GITLAB__USE_CACHED_CATALOG=false

Loaders

Loaders are pip packages used by meltano elt as part of data integration. They are responsible for loading extracted data into arbitrary data destinations: databases, SaaS APIs, or file formats.

Meltano supports Singer targets: executables that implement the Singer specification.

To learn which loaders are discoverable and supported out of the box, refer to the Loaders page.

Extras

Loaders support the following extras:

dialect

`dialect` extra

Setting: _dialect
Environment variable: <LOADER>__DIALECT, e.g. TARGET_POSTGRES__DIALECT
Default: $MELTANO_LOADER_NAMESPACE, which will expand to the loader's namespace. Note that this default has been overridden on discoverable loaders, e.g. postgres for target-postgres and snowflake for target-snowflake.

A loader's dialect extra holds the name of the dialect of the target database, so that transformers in the same pipeline can determine the type of database to connect to.

The value of this extra can be referenced from a transformer's configuration using the MELTANO_LOAD__DIALECT pipeline environment variable. It is used as the default value for dbt's target setting, and should therefore correspond to a target name in transform/profile/profiles.yml.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

loaders:
- name: target-example-db
  dialect: example-db

meltano config <loader> set _dialect <dialect>

# For example:
meltano config target-example-db set _dialect example-db

export <LOADER>__DIALECT=<dialect>

# For example:
export TARGET_EXAMPLE_DB__DIALECT=example-db

Transforms

Transform plugins are being deprecated in favor of calling dbt packages directly.

The transform plugin type is still supported for now but will eventually be phased out.

Transforms are dbt packages containing dbt models, that are used by meltano elt as part of data transformation.

Together with the dbt transformer, they are responsible for transforming data that has been loaded into a database (data warehouse) into a different format, usually one more appropriate for analysis.

When a transform is added to your project using meltano add, the dbt package Git repository referenced by its pip_url will be added to your project's transform/packages.yml and the package will be enabled in transform/dbt_project.yml.

Extras

Transforms support the following extras:

package_name
vars

`package_name` extra

Setting: _package_name
Environment variable: <TRANSFORM>__PACKAGE_NAME, e.g. TAP_GITLAB__PACKAGE_NAME
Default: $MELTANO_TRANSFORM_NAMESPACE, which will expand to the transform's namespace, e.g. tap_gitlab for tap-gitlab

A transform's package_name extra holds the name of the dbt package's internal dbt project: the value of name in dbt_project.yml.

When a transform is added to your project using meltano add, this name will be added to the models dictionary in transform/dbt_project.yml.

The value of this extra can be referenced from a transformer's configuration using the MELTANO_TRANSFORM__PACKAGE_NAME pipeline environment variable. It is included in the default value for dbt's models setting: $MELTANO_TRANSFORM__PACKAGE_NAME $MELTANO_EXTRACTOR_NAMESPACE my_meltano_model.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

transforms:
- name: dbt-facebook-ads
  namespace: tap_facebook
  package_name: facebook_ads

meltano config <transform> set _package_name <name>

# For example:
meltano config dbt-facebook-ads set _package_name facebook_ads

export <TRANSFORM>__PACKAGE_NAME=<name>

# For example:
export DBT_FACEBOOK_ADS__PACKAGE_NAME=facebook_ads

`vars` extra

Setting: _vars
Environment variable: <TRANSFORM>__VARS, e.g. TAP_GITLAB__VARS
Default: {} (an empty object)

A transform's vars extra holds an object representing dbt model variables that can be referenced from a model using the var function.

When a transform is added to your project using meltano add, this object will be used as the dbt model's vars object in transform/dbt_project.yml.

Because these variables are handled by dbt rather than Meltano, environment variables can be referenced using the env_var function instead of $VAR or ${VAR}.

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

transforms:
- name: tap-gitlab
  vars:
    schema: '{{ env_var(''DBT_SOURCE_SCHEMA'') }}'

meltano config <transform> set _vars <key> <value>

# For example:
meltano config --plugin-type=transform tap-gitlab set _vars schema "{{ env_var('DBT_SOURCE_SCHEMA') }}"

export <TRANSFORM>__VARS='{"<key>": "<value>"}'

# For example:
export TAP_GITLAB__VARS='{"schema": "{{ env_var('\''DBT_SOURCE_SCHEMA'\'') }}"}'

Orchestrators

Orchestrator plugins are transitioning over to being called Utilities. The new approach is to group all non-EL plugins under the utility plugin type.

The orchestrator plugin type is still supported for now but will eventually be phased out as utilities take over.

Orchestrators are pip packages responsible for orchestrating a project's scheduled pipelines.

Meltano supports Apache Airflow out of the box, but can be used with any tool capable of reading the output of meltano schedule list --format=json and executing each pipeline's meltano run command on a schedule.

When the airflow utility is added to your project using meltano add, its related file bundle will automatically be added as well.

Transformers

Transformer plugins are transitioning over to being called Utilities. The new approach is to group all non-EL plugins under the utility plugin type.

The transformer plugin type is still supported for now but will eventually be phased out as utilities take over.

Transformers are pip packages used by meltano run as part of data transformation. They are responsible for running transforms.

Meltano supports dbt and its dbt models out of the box.

When the dbt transformer is added to your project using meltano add, its related file bundle will automatically be added as well.

File bundles

File bundles are pip packages bundling files you may want in your project.

When a file bundle is added to your project using meltano add, the bundled files will automatically be added as well. The file bundle itself will not be added to your meltano.yml project file unless it contains files that are managed by the file bundle and to be updated automatically when meltano upgrade is run.

`update` extra

Setting: _update
Environment variable: <BUNDLE>__UPDATE, e.g. DBT__UPDATE
Default: {} (an empty object)

A file bundle's update extra holds an object mapping file paths (of files inside the bundle, relative to the project root) to booleans. Glob patterns are supported to allow matching of multiple files with a single path.

When a file path's value is True, the matching files are considered to be managed by the file bundle and updated automatically when meltano upgrade is run.
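A rough way to picture how the update mapping decides which files are managed is shown below. It uses Python's `fnmatch` for the pattern matching, whose `*` also matches path separators, so it may differ in detail from the glob matching Meltano uses. The helper name is hypothetical, and letting the first matching pattern decide is an assumption; how overlapping patterns are resolved is not stated on this page.

```python
from fnmatch import fnmatchcase

# Same mapping as used in the examples for this extra.
UPDATE = {"transform/dbt_project.yml": False, "profiles/*.yml": True}


def is_managed(update: dict, path: str) -> bool:
    """Return the value of the first pattern matching path, or False if none match."""
    for pattern, managed in update.items():
        if fnmatchcase(path, pattern):
            return managed
    return False


managed = is_managed(UPDATE, "profiles/dev.yml")
```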

How to use

Manage this extra in meltano.yml, from the terminal, or with environment variables (examples in that order):

files:
- name: dbt
  update:
    transform/dbt_project.yml: false
    profiles/*.yml: true

If a file path starts with a *, it must be wrapped in quotes to be considered valid YAML. For example, using *.yml to match all .yml files:

files:
- name: dbt
  update:
    '*.yml': true

meltano config <bundle> set _update <path> <true/false>

# For example:
meltano config --plugin-type=files dbt set _update transform/dbt_project.yml false
meltano config --plugin-type=files dbt set _update 'profiles/*.yml' true

export <BUNDLE>__UPDATE='{"<path>": <true/false>}'

# For example:
export DBT__UPDATE='{"transform/dbt_project.yml": false, "profiles/*.yml": true}'

Utilities

The utility plugin type represents all non-EL plugins. Plugins that were under the transformer (e.g. dbt) and orchestrator (e.g. Airflow, Dagster) plugin types are now included as utilities.

In addition, any pip package that exposes an executable can be added to your project as a utility. Meltano has a selection of available utilities listed on MeltanoHub, or you can add your own custom utility.

Meltano also has an Extension Developer Kit (EDK) that can be used to integrate existing data tools with Meltano.

Custom Utilities

Any pip package that exposes an executable can be added to your project as a custom utility.

meltano add --custom utility <plugin>

# For example:
meltano add --custom utility yoyo
(namespace): yoyo
(pip_url): yoyo-migrations
(executable): yoyo

You can then invoke the executable using meltano invoke:

meltano invoke <plugin> [<executable arguments>...]

# For example:
meltano invoke yoyo new ./migrations -m "Add column to foo"

The benefit of doing this as opposed to adding the package to requirements.txt or running pip install <package> directly is that any packages installed this way benefit from Meltano's virtual environment isolation. This avoids dependency conflicts between packages.

Mappers

Mappers allow you to transform or manipulate data after extraction and before loading. Common applications include:

Streams/properties can be aliased to provide custom naming downstream.
Stream records can be filtered based on any user-defined logic.
Properties can be transformed inline (e.g. converting types, sanitizing PII).
Properties can be removed from the stream.
New properties can be added to the stream.

Note that mappers are currently only available when using meltano run.

How to use

You can install mappers like any other plugin using meltano add:

$ meltano add mapper transform-field
2024-01-01T00:25:40.604941Z [info     ] Installing mapper 'transform-field'
2024-01-01T00:25:53.152127Z [info     ] Installed mapper 'transform-field'

To learn more about mapper 'transform-field', visit https://github.com/transferwise/pipelinewise-transform-field

Mappers are unique in that you don't invoke them directly after installing them. Instead, you define mappings by name and add a config object for each mapping. This config object is passed to the mapper when the mapping name is called as part of a meltano run invocation. This differs from other plugins, because you reference the mapping name rather than the plugin name. Additionally, the requirements for the config object itself will vary by plugin.

So given a mapper with mappings configured like so:

mappers:
  - name: transform-field
    variant: transferwise
    pip_url: pipelinewise-transform-field
    executable: transform-field
    mappings:
    - name: hide-gitlab-secrets
      config:
        transformations:
          - field_id: "author_email"
            tap_stream_name: "commits"
            type: "MASK-HIDDEN"
          - field_id: "committer_email"
            tap_stream_name: "commits"
            type: "MASK-HIDDEN"
    - name: null-created-at
      config:
        transformations:
          - field_id: "created_at"
            tap_stream_name: "accounts"
            type: "SET-NULL"

You can then invoke the mappings by name:

# hide-gitlab-secrets will resolve to mapping with the same name. In this case, that mapping will perform two actions.
# Transform the "author_email" field in the "commits" stream and hide the email address.
# Transform the "committer_email" field in the "commits" stream and hide the email address.
$ meltano run tap-gitlab hide-gitlab-secrets target-jsonl

# null-created-at will resolve to mapping with the same name. In this case, that mapping will perform one action.
# Transform the "created_at" field in the "accounts" stream and set it to null.
$ meltano run tap-someapi null-created-at target-jsonl

You can also invoke multiple mappings at once in series:

$ meltano run tap-someapi fix-null-id fix-country-code target-jsonl

Each mapping will execute in a unique process instance of the mapper plugin. That means that you can also call mappings that leverage the same plugin at multiple locations numerous times within the run invocation:

# Fix any null country codes using transform-field mapper.
# Set the customers region based on their country code using your own mapper.
# Mask the id if the customer is in the EU region using transform-field mapper.
$ meltano run tap-someapi fix-null-country set-region-from-country mask-id-if-eu target-jsonl
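The way a mapping name resolves to a mapper plugin and a config object can be sketched as a lookup over the mappers configuration. The data mirrors the transform-field example from this section, trimmed to one transformation per mapping; the `resolve_mapping` helper is hypothetical and only illustrates the idea, not how Meltano performs the lookup.

```python
# Mirrors the mappers configuration from this section, trimmed for brevity.
MAPPERS = [
    {
        "name": "transform-field",
        "mappings": [
            {
                "name": "hide-gitlab-secrets",
                "config": {"transformations": [
                    {"field_id": "author_email", "tap_stream_name": "commits", "type": "MASK-HIDDEN"},
                ]},
            },
            {
                "name": "null-created-at",
                "config": {"transformations": [
                    {"field_id": "created_at", "tap_stream_name": "accounts", "type": "SET-NULL"},
                ]},
            },
        ],
    }
]


def resolve_mapping(mappers: list, mapping_name: str) -> tuple:
    """Return (mapper plugin name, mapping config) for a mapping name."""
    for mapper in mappers:
        for mapping in mapper.get("mappings", []):
            if mapping["name"] == mapping_name:
                return mapper["name"], mapping["config"]
    raise KeyError(f"no mapping named {mapping_name!r}")


plugin, config = resolve_mapping(MAPPERS, "null-created-at")
```

Both mapping names resolve to the same `transform-field` plugin but carry different config objects, which is why the same mapper can appear several times in one run.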
