Troubleshooting
If you have a question about Meltano, are having trouble getting it to work, or have any kind of feedback, you can:
Check out the Meltano issue tracker to see if someone else has already reported the same issue or made the same request. Feel free to weigh in with extra information, or just to make sure the issue gets the attention it deserves.
Join the Meltano Slack workspace, which is frequented by the core team and thousands of community members and data experts. You can ask any questions you may have in here or just chat with fellow users.
Common Issues
Problem: "Why do incremental runs produce duplicate data?"
Singer takes an "at least once" approach to replication, so if you're encountering this, it might be intended behavior. This issue is a good summary of the current state and a proposal to change this behavior.
Problem: "My runs take too long."
This issue provides a good overview on a strategy to figure out performance issues.
How to Debug Problems
Log Level Debug
When you are trying to troubleshoot an issue the Meltano logs should be your first port of call.
If you have a failure using Meltano's execution commands (invoke
, elt
, run
, or test
) or you're experienced general unexpected behavior, you can learn more about what’s happening behind the scenes by setting Meltano’s cli.log_level
setting to debug, using the MELTANO_CLI_LOG_LEVEL
environment variable or the --log-level
CLI option:
- meltano config
- env
- command
meltano config meltano set cli log_level debug
export MELTANO_CLI_LOG_LEVEL=debug
meltano --log-level=debug <command> ...
In debug mode, Meltano will log additional information about the environment and arguments used to invoke your components (Singer taps and targets, dbt, Airflow, etc.), including the paths to the generated config, catalog, state files, etc. for you to review.
Here is an example with meltano elt
:
$ meltano --log-level=debug run tap-gitlab target-jsonl
2023-02-01T17:17:43.308389Z [info ] Environment 'dev' is active
2023-02-01T17:17:43.375158Z [debug ] Creating engine '<meltano.core.project.Project object at 0x10d9ff5e0>@sqlite:////demo-project/.meltano/meltano.db'
2023-02-01T17:17:43.646112Z [debug ] Found plugin parent parent=tap-gitlab plugin=tap-gitlab source=lockfile
2023-02-01T17:17:43.650014Z [debug ] found plugin in cli invocation plugin_name=tap-gitlab
2023-02-01T17:17:43.652873Z [debug ] Found plugin parent parent=target-jsonl plugin=target-jsonl source=lockfile
2023-02-01T17:17:43.656906Z [debug ] found plugin in cli invocation plugin_name=target-jsonl
2023-02-01T17:17:43.657112Z [debug ] head of set is extractor as expected block=<meltano.core.plugin.project_plugin.ProjectPlugin object at 0x1115be850>
2023-02-01T17:17:45.337292Z [debug ] found block block_type=loaders index=1
2023-02-01T17:17:45.337455Z [debug ] blocks idx=1 offset=0
2023-02-01T17:18:54.233065Z [debug ] found ExtractLoadBlocks set offset=0
2023-02-01T17:18:54.233220Z [debug ] All ExtractLoadBlocks validated, starting execution.
2023-02-01T17:18:56.271112Z [debug ] Created configuration at /home/.meltano/run/tap-gitlab/tap.54d0e4e3-eb71-4000-9138-47a25c8b3743.config.json
2023-02-01T17:18:56.271662Z [debug ] Could not find tap.properties.json in /home/.meltano/extractors/tap-gitlab/tap.properties.json, skipping.
2023-02-01T17:18:56.272003Z [debug ] Could not find tap.properties.cache_key in /home/.meltano/extractors/tap-gitlab/tap.properties.cache_key, skipping.
2023-02-01T17:18:56.272321Z [debug ] Could not find state.json in /home/.meltano/extractors/tap-gitlab/state.json, skipping.
2023-02-01T17:18:56.355385Z [warning ] No state was found, complete import.
...
Isolate the Connector
If it's unclear which part of the pipeline is generating the problem, test the tap and target individually by using meltano invoke
.
The invoke
command will run the executable with any specified arguments.
meltano invoke <plugin> [PLUGIN_ARGS...]
It's usually easiest to pipe the raw output of the tap to a file to confirm the tap works then pipe that file's contents to the target so the tap doesn't have to re-replicate the data. For example:
meltano invoke tap-csv > output.json
cat output.json | meltano invoke target-postgres
Validate Tap Capabilities
In cases where the tap is not loading any streams or it does not appear to be respecting the configured select
rules, you may need to validate the capabilities of the tap.
In prior versions of the Singer spec, the --properties
option was used instead of --catalog
for the catalog files.
If this is the case for a tap, ensure properties
is set as a capability for the tap instead of catalog
.
Then meltano elt
will accept the catalog file and will pass it to the tap using the appropriate flag.
For more information, please refer to the plugin capabilities reference.
Testing Specific Failing Streams
When extracting several streams with a single tap using the elt
command, it may be challenging to debug a single failing stream.
In this case, it can be useful to run the tap with just the single stream selected.
Instead of duplicating the extractor in meltano.yml
, try running meltano elt
with the --select
flag.
This will run the pipeline with just that stream selected.
You can also have meltano invoke
select an individual stream by setting the select_filter
extra as an environment variable:
export <TAP_NAME>__SELECT_FILTER='["<your_stream>"]'
Incremental Replication Not Running as Expected
If using a custom tap, ensure that the tap declares the state
capability as described in the plugin capabilities reference.
If you're using meltano run
be aware that the state ID is generated using the extractor name + loader name + environment name. If you switched multiple environments you might not have the state you were expecting.
If you're trying to run a pipeline with incremental replication using meltano elt
but it's running a full sync, ensure that you're passing a State ID via the --state-id
flag.
Dump Files Generated by Running Meltano Commands to STDOUT
The --dump
flag can be passed to the meltano invoke
and meltano elt
commands to dump the content of a pipeline-specific generated file to STDOUT
instead of actually running the pipeline. Note that adding support for meltano run
is tracked in this GitHub issue.
This can aid in debugging extractor catalog generation, incremental replication state lookup, and pipeline environment variables.
Supported values are:
- catalog: Dump the extractor catalog file that would be passed to the tap’s executable using the
--catalog
option. - state: Dump the extractor state file that would be passed to the tap’s executable using the
--state
option. - extractor-config: Dump the extractor config file that would be passed to the tap’s executable using the
--config
option. - loader-config: Dump the loader config file that would be passed to the target’s executable using the
--config
option.
Like any standard output, the dumped content can be redirected to a file using >, e.g. meltano elt ... --dump=state > state.json
.
Examples #
meltano elt tap-gitlab target-postgres --transform=run --state-id=gitlab-to-postgres
meltano elt tap-gitlab target-postgres --state-id=gitlab-to-postgres --full-refresh
meltano elt tap-gitlab target-postgres --catalog extract/tap-gitlab.catalog.json
meltano elt tap-gitlab target-postgres --state extract/tap-gitlab.state.json
meltano elt tap-gitlab target-postgres --select commits
meltano elt tap-gitlab target-postgres --exclude project_members
meltano elt tap-gitlab target-postgres --state-id=gitlab-to-postgres --dump=state > extract/tap-gitlab.state.json
Meltano UI
Early versions of Meltano promoted a simple UI feature that was used for setting up basic pipelines and viewing basic logs. Due to a refocusing of the product on the command line interface, the UI was deprioritized for continued feature enhancements. For interactive plugin configuration, we now recommend our --interactive
config option in the CLI.
Meltano will eventually have a UI again as the company and community grows. Please let us know your thoughts on what you would like to see in a future Meltano UI in this GitHub discussion.
To view the previous documentation on the UI, review this pull request where they were removed.
Limitations & Capabilities of the Deprecated UI
If you are still using the UI, please note that it is not compatible with many newer Meltano features.
The UI does work with:
- Schedules based on the
elt
command (meltano schedule add <schedule_name> --extractor <tap> --loader <target> --transform ...
)
The UI does not work with:
- Schedules based on jobs (
meltano schedule add <schedule_name> --job <job>
) - Environments
Replacing UI functionality
If you're using Airflow or Dagster as your orchestrator, the Airflow or Dagster webserver UI should be able to serve as a replacement for many Meltano UI use cases.
If using Airflow as orchestrator, see the "Airflow orchestrator" section of the "Deployment in Production" guide for more details on how to get the webserver running. From the webserver, you can view all DAGs. You can also access the Meltano logs for a specific task instance by going to the task instance context menu and clicking "Log" or "View logs".
No Plugin Settings Defined
To configure a plugin that does not already have settings defined, we recommend first adding a settings:
entry under the plugin definition. You may need to consult the plugin's documentation to determine what settings are available.
Related reference information and guides:
After adding definitions for the plugin's settings, you may then configure those settings as usual with the --interactive
option in meltano config
.
You may also contribute back to the community by publishing the settings to MeltanoHub in the form of a new pull request.