Part 2 - Store data

Let’s learn by example.

In part 1, we extracted data from GitHub and are now ready to load the data into a PostgreSQL database.

If you're having trouble throughout this tutorial, you can always head over to the Slack channel to get help.

Getting your target ready

We're going to load our data into a dockerized PostgreSQL database running on your laptop. View the docker docs if you don't yet have docker installed. To launch a local PostgreSQL container, you just need to run:

docker run --name meltano_postgres -p 5432:5432 -e POSTGRES_USER=meltano -e POSTGRES_PASSWORD=password -d postgres

$ docker run -p 5432:5432 -e POSTGRES_USER=meltano -e POSTGRES_PASSWORD=password --name meltano_postgres -d postgres

504e2b416874dd6a5db3fe6dd3ff63f1d42095bbc4e87314f1f708f69c8188de
$ docker container ls
    CONTAINER ID   IMAGE      COMMAND                  CREATED         STATUS         PORTS                    NAMES
    504e2b416874   postgres   "docker-entrypoint.s…"   3 seconds ago   Up 3 seconds   0.0.0.0:5432->5432/tcp   kind_rosalind

The container will need a few seconds to initialize. You can test the connection with your favorite SQL tool using connection data:

host: localhost
port: 5432
database: postgres
user: meltano
password: password

Add the postgres loader

Add the postgres loader using the meltano add target-postgres --variant=meltanolabs command (plugin type is automatically detected).

$ meltano add target-postgres --variant=meltanolabs
Added loader 'target-postgres' to your Meltano project
Variant:        meltanolabs (default)
Repository:     https://github.com/MeltanoLabs/target-postgres
Documentation:  https://hub.meltano.com/loaders/target-postgres--meltanolabs

2024-01-01T00:25:40.604941Z [info     ] Installing loader 'target-postgres'
---> 100%

2024-01-01T00:25:53.152127Z [info     ] Installed loader 'target-postgres'

To learn more about loader 'target-postgres', visit https://hub.meltano.com/loaders/target-postgres--meltanolabs

Use the meltano invoke target-postgres --help command to test that the installation worked.

$ meltano invoke target-postgres --help
Usage: target-postgres [OPTIONS]

  Execute the Singer target.

Options:
  --input FILENAME          A path to read messages from instead of from
                            standard in.
  --config TEXT             Configuration file location or 'ENV' to use
                            environment variables.
  --format [json|markdown]  Specify output style for --about
  --about                   Display package metadata and settings.
  --version                 Display the package version.
  --help                    Show this message and exit.

Configure the target-postgres loader

To configure the plugin, look at the options by running meltano config target-postgres list:

$ meltano config target-postgres list

[...]
host [env: TARGET_POSTGRES_HOST] current value: 'postgres' (from `meltano.yml`)
&ensp;&ensp;Host: Hostname for postgres instance. Note if sqlalchemy_url is set this will be ignored.
port [env: TARGET_POSTGRES_PORT] current value: None (default)
&ensp;&ensp;Port: The port on which postgres is awaiting connection. Note if sqlalchemy_url is set this will be ignored. Defaults to 5432
user [env: TARGET_POSTGRES_USER] current value: 'meltano' (from `meltano.yml`)
 &ensp;&ensp;User: User name used to authenticate. Note if sqlalchemy_url is set this will be ignored.
password [env: TARGET_POSTGRES_PASSWORD] current value: None (default)
&ensp;&ensp;Password: Password used to authenticate. Note if sqlalchemy_url is set this will be ignored.
database [env: TARGET_POSTGRES_DATABASE] current value: 'postgres' (from `meltano.yml`)
&ensp;&ensp;Database: Database name. Note if sqlalchemy_url is set this will be ignored.
[...]
add_record_metadata [env: TARGET_POSTGRES_ADD_RECORD_METADATA] current value: None (default)
&ensp;&ensp;Add Record Metadata: Note that this must be enabled for activate_version to work!This adds _sdc_extracted_at, _sdc_batched_at, and more to every table. See https://sdk.meltano.com/en/latest/implementation/record_metadata.html for more information.
[...]

To learn more about loader 'target-postgres' and its settings, visit https://hub.meltano.com/loaders/target-postgres--meltanolabs

Fill in the details for these four attributes, and set add_record_metadata to True by using the meltano config target-postgres set ATTRIBUTE VALUE command:

$ meltano config target-postgres set user meltano
&ensp;&ensp;Loader 'target-postgres' setting 'user' was set in `meltano.yml`: 'meltano'
$ meltano config target-postgres set password password
&ensp;&ensp;Loader 'target-postgres' setting 'password' was set in `.env`: (redacted)
$ meltano config target-postgres set database postgres
&ensp;&ensp;Loader 'target-postgres' setting 'database' was set in `meltano.yml`: 'postgres'
$ meltano config target-postgres set add_record_metadata True
&ensp;&ensp;Loader 'target-postgres' setting 'add_record_metadata' was set in `meltano.yml`: True
$ meltano config target-postgres set host localhost
&ensp;&ensp;Loader 'target-postgres' setting 'host' was set in `meltano.yml`: localhost

This will add the non-sensitive configuration to your meltano.yml project file:

plugins:
  loaders:
    - name: target-postgres
      variant: meltanolabs
      pip_url: git+https://github.com/MeltanoLabs/target-postgres.git
      config:
        user: meltano
        database: postgres
        add_record_metadata: true
        host: localhost

Sensitive configuration information (such as password) will instead be stored in your project's .env file so that it will not be checked into version control.

You can use meltano config target-postgres to check the configuration, including the default settings not visible in the project file.

$ meltano config target-postgres
{
&ensp;&ensp;&ensp;&ensp;"add_record_metadata": true,
&ensp;&ensp;&ensp;&ensp;"database": "postgres",
&ensp;&ensp;&ensp;&ensp;"dialect+driver": "postgresql+psycopg2",
&ensp;&ensp;&ensp;&ensp;"host": "localhost",
&ensp;&ensp;&ensp;&ensp;"password": "password",
&ensp;&ensp;&ensp;&ensp;"port": 5432,
&ensp;&ensp;&ensp;&ensp;"user": "meltano"
}

By default, the database schema is named using the tap : tap_github. You change the target schema by setting default_target_schema in the target_postgres configuration.

Run your data integration (EL) pipeline

Now that your Meltano project, extractor, and loader are all set up, it's time to run your first data integration (EL) pipeline!

Run your newly added extractor and loader in a pipeline using meltano run:

$ meltano run tap-github target-postgres
2024-09-20T13:16:13.885045Z [warning  ] No state was found, complete import.
2024-09-20T13:16:15.441183Z [info     ] INFO Starting sync of repository: [...]
2024-09-20T13:16:15.901789Z [info     ] INFO METRIC: {"type": "timer", "metric": "http_request_duration",[...]
---> 100%
2024-09-20T13:16:15.933874Z [info     ] INFO METRIC: {"type": "counter", "metric": "record_count", "value": 21,[...]
2024-09-20T13:16:16.435885Z [info     ] [...] message=Schema 'tap_github' does not exist. Creating... ...
2024-09-20T13:16:16.632945Z [info     ] ... message=Table '"commits"' does not exist. Creating...
2024-09-20T13:16:16.729076Z [info     ] ...message=Loading 21 rows into 'tap_github."commits"' ...
---> 100%
2024-09-20T13:16:16.864812Z [info     ] ...Loading into tap_github."commits": {"inserts": 21, "updates": 0, "size_bytes": 4641} ...
2024-09-20T13:16:16.885846Z [info     ] Incremental state has been updated at 2024-09-20 13:16:16.885259.
2024-09-20T13:16:16.960093Z [info     ] Block run completed            ....

If everything was configured correctly, you should now see your data flow from your source into your destination!

The postgres database should now have a schema tap_github with the table commits containing your data.

Next Steps

Next, head over to Part 3 to add data processing to your pipeline.

Getting your target ready​

Add the postgres loader​

Configure the target-postgres loader​

Run your data integration (EL) pipeline​