For every star on GitHub, we'll donate $2 to clean up our waterways. Star us now!

Part 2 - Loading Data, EL

Let’s learn by example.

Throughout this tutorial, we’ll walk you through the creation of a end-to-end modern ELT stack.

In part 1, we extracted data from GitHub and are now ready to load the data into a PostgreSQL database.

If you're having trouble throughout this tutorial, you can always head over to the Slack channel to get help.

Getting your target ready #

We’re going to load our data into a dockerized PostgreSQL database running on your laptop. View the docker docs if you don’t yet have docker installed. To launch a local PostgreSQL container, you just need to run:

$ docker run -p 5432:5432 -e POSTGRES_USER=meltano -e POSTGRES_PASSWORD=password -d postgres

```bash $ docker run -p 5432:5432 -e POSTGRES_USER=meltano -e POSTGRES_PASSWORD=password --name meltano_postgres -d postgres 504e2b416874dd6a5db3fe6dd3ff63f1d42095bbc4e87314f1f708f69c8188de $ docker container ls CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 504e2b416874 postgres "docker-entrypoint.s…" 3 seconds ago Up 3 seconds>5432/tcp kind_rosalind ```

The container will need a few seconds to initialize. You can test the connection with your favorite SQL tool using connection data:

  • host: localhost
  • port: 5432
  • database: postgres
  • user: meltano
  • password: password

Add the postgres loader #

Add the postgres loader using the meltano add loader target-postgres --variant=meltanolabs command.

```console $ meltano add loader target-postgres --variant=meltanolabs Added loader 'target-postgres' to your Meltano project Variant: meltanolabs (default) Repository: Documentation: Installing loader 'target-postgres'... ---> 100% Installed loader 'target-postgres' To learn more about loader 'target-postgres', visit ```

Use the meltano invoke target-postgres --help command to test that the installation worked.

```console $ meltano invoke target-postgres --help Usage: target-postgres [OPTIONS] Execute the Singer target. Options: --input FILENAME A path to read messages from instead of from standard in. --config TEXT Configuration file location or 'ENV' to use environment variables. --format [json|markdown] Specify output style for --about --about Display package metadata and settings. --version Display the package version. --help Show this message and exit. ```

Configure the target-postgres loader #

To configure the plugin, look at the options by running meltano config target-postgres list:

```console $ meltano config target-postgres list [...] host [env: TARGET_POSTGRES_HOST] current value: 'postgres' (from `meltano.yml`)   Host: Hostname for postgres instance. Note if sqlalchemy_url is set this will be ignored. port [env: TARGET_POSTGRES_PORT] current value: None (default)   Port: The port on which postgres is awaiting connection. Note if sqlalchemy_url is set this will be ignored. Defaults to 5432 user [env: TARGET_POSTGRES_USER] current value: 'meltano' (from `meltano.yml`)   User: User name used to authenticate. Note if sqlalchemy_url is set this will be ignored. password [env: TARGET_POSTGRES_PASSWORD] current value: None (default)   Password: Password used to authenticate. Note if sqlalchemy_url is set this will be ignored. database [env: TARGET_POSTGRES_DATABASE] current value: 'postgres' (from `meltano.yml`)   Database: Database name. Note if sqlalchemy_url is set this will be ignored. [...] add_record_metadata [env: TARGET_POSTGRES_ADD_RECORD_METADATA] current value: None (default)   Add Record Metadata: Note that this must be enabled for activate_version to work!This adds _sdc_extracted_at, _sdc_batched_at, and more to every table. See for more information. [...] To learn more about loader 'target-postgres' and its settings, visit ```

Fill in the details for these four attributes, and set add_record_metadata to True by using the meltano config target-postgres set ATTRIBUTE VALUE command:

```console $ meltano config target-postgres set user meltano   Loader 'target-postgres' setting 'user' was set in `meltano.yml`: 'meltano' $ meltano config target-postgres set password password   Loader 'target-postgres' setting 'password' was set in `.env`: '(redacted)' $ meltano config target-postgres set database postgres   Loader 'target-postgres' setting 'database' was set in `meltano.yml`: 'postgres' $ meltano config target-postgres set add_record_metadata True   Loader 'target-postgres' setting 'add_record_metadata' was set in `meltano.yml`: True $ meltano config target-postgres set host localhost   Loader 'target-postgres' setting 'host' was set in `meltano.yml`: localhost ```

This will add the non-sensitive configuration to your meltano.yml project file:

       - name: target-postgres
         variant: meltanolabs
         pip_url: git+
           user: meltano
           database: postgres
           add_record_metadata: true
           host: localhost

Sensitive configuration information (such as password) will instead be stored in your project’s .env file so that it will not be checked into version control.

You can use meltano config target-postgres to check the configuration, including the default settings not visible in the project file.

```console $ meltano config target-postgres {     "host": "localhost",     "user": "meltano",     "password": "password",     "database": "postgres",     "add_record_metadata": true } ```

Run your data integration (EL) pipeline #

Now that your Meltano project, extractor, and loader are all set up, it’s time to run your first data integration (EL) pipeline!

Run your newly added extractor and loader in a pipeline using meltano run:

```console $ meltano run tap-github target-postgres 2022-09-20T13:16:13.885045Z [warning ] No state was found, complete import. 2022-09-20T13:16:15.441183Z [info ] INFO Starting sync of repository: [...] 2022-09-20T13:16:15.901789Z [info ] INFO METRIC: {"type": "timer", "metric": "http_request_duration",[...] ---> 100% 2022-09-20T13:16:15.933874Z [info ] INFO METRIC: {"type": "counter", "metric": "record_count", "value": 21,[...] 2022-09-20T13:16:16.435885Z [info ] [...] message=Schema 'tap_github' does not exist. Creating... ... 2022-09-20T13:16:16.632945Z [info ] ... message=Table '"commits"' does not exist. Creating... 2022-09-20T13:16:16.729076Z [info ] ...message=Loading 21 rows into 'tap_github."commits"' ... ---> 100% 2022-09-20T13:16:16.864812Z [info ] ...Loading into tap_github."commits": {"inserts": 21, "updates": 0, "size_bytes": 4641} ... 2022-09-20T13:16:16.885846Z [info ] Incremental state has been updated at 2022-09-20 13:16:16.885259. 2022-09-20T13:16:16.960093Z [info ] Block run completed. .... ```

If everything was configured correctly, you should now see your data flow from your source into your destination!

Next Steps #

Next, head over to Part 3, to add inline transformations to your ingestion process.