Standardising Logging in dbt Macros
As data teams scale their dbt (data build tool) projects, the complexity of models, tests, and macros grows. One often-overlooked aspect of maintainability and debugging is logging—specifically, how to make logs from custom macros clear, consistent, and easy to trace in the command-line interface (CLI).
If you’ve ever run a dbt job and been left wondering, “Which macro generated this log message?” or struggled to debug post-hook logic, you’re not alone. In this article, we’ll explore how to standardize logging across dbt macros so that every log message is clearly attributed, making your CLI output more readable and your debugging process more efficient.
We’ll cover:
- Why standardised logging matters in dbt
- How dbt’s built-in logging works
- Creating a reusable logging macro
- Using the logging macro in your custom macros and post-hooks
- Advanced tips for log levels and automation
- Best practices for maintainable dbt projects
Let’s dive in!
Why Standardised Logging Matters
dbt is a powerful tool for transforming data in your warehouse, but as your project grows, so does the number of custom macros and hooks. These macros often contain business logic, data quality checks, or dynamic SQL generation. When something goes wrong—or even when you just want to track progress—logging is your best friend.
Without standardisation | With standardisation |
---|---|
Log messages can be ambiguous (“Where did this come from?”) | Every log message is clearly attributed to its source macro |
Debugging post-hooks and complex macros is harder | CLI output is easier to scan and understand |
Team members may use inconsistent log formats | Debugging and maintenance are faster and less error-prone |
How dbt’s Built-in Logging Works
dbt provides a built-in log function for use in Jinja macros:
{% do log('This is a log message', info=True) %}
- The first argument is the message.
- The info parameter (default True) determines the log level (INFO or WARNING).
But: dbt’s log function doesn’t automatically include the macro name or context. This is where standardisation comes in.
Creating a Reusable Logging Macro
To standardise logging, we’ll create a macro that wraps dbt’s log function and prefixes each message with the macro name.
Step 1: Create the Logging Macro
Create a new file in your macros directory, e.g., macros/log_with_macro_name.sql
:
-- macros/log_with_macro_name.sql
{% macro log_with_macro_name(macro_name, message, info=True) %}
{% do log('[' ~ macro_name ~ '] : ' ~ message, info) %}
{% endmacro %}
How it works:
- macro_name: The name of the macro calling the logger.
- message: The log message.
- info: Log level (True for info, False for warning).
Step 2: Use the Logging Macro in Other Macros
Whenever you want to log something in a macro, use your new logging macro:
{% macro my_custom_macro() %}
{% do log_with_macro_name('my_custom_macro', 'Starting macro logic') %}
-- macro logic here
{% do log_with_macro_name('my_custom_macro', 'Finished macro logic') %}
{% endmacro %}
Step 3: See the Results in the CLI
When you run dbt, your CLI output will look like:
[my_custom_macro] : Starting macro logic
[my_custom_macro] : Finished macro logic
Now, every log message is clearly attributed!
Using the Logging Macro in Post-hooks
dbt allows you to run macros as post-hooks after models or seeds. Logging in post-hooks is especially useful for tracking data quality checks, auditing, or custom business logic.
Example:
models:
- name: my_model
post-hook:
- "{{ my_custom_macro() }}"
If my_custom_macro
uses the standardised logging macro, all its log messages will be clearly labelled in the CLI output.
Advanced: Supporting Log Levels
You might want to support different log levels (info, warning, error). dbt’s log
function only supports info
and warning
, but you can extend your macro for clarity.
-- macros/log_with_macro_name.sql
{% macro log_with_macro_name(macro_name, message, level='info') %}
{% if level == 'info' %}
{% do log('[' ~ macro_name ~ '] : ' ~ message, info=True) %}
{% elif level == 'warn' %}
{% do log('[' ~ macro_name ~ '] : ' ~ message, info=False) %}
{% else %}
{% do log('[' ~ macro_name ~ '] : ' ~ message, info=True) %}
{% endif %}
{% endmacro %}
Usage:
{% do log_with_macro_name('my_macro', 'This is an info message', level='info') %}
{% do log_with_macro_name('my_macro', 'This is a warning', level='warn') %}
Best Practices for Maintainable dbt Projects
- Always Use the Logging Macro:
Make it a team standard to use log_with_macro_name
in all custom macros.
- Document Your Logging Convention:
Add a section to your dbt project’s README explaining how to use the logging macro.
- Use Consistent Macro Names:
Pass the actual macro name as the first argument to avoid confusion.
- Leverage Log Levels:
Use info
for normal operation, warn
for potential issues.
- Review CLI Output Regularly:
Make log review part of your dbt run process, especially after changes to macros or hooks.
Real-World Example
Suppose you have a macro that checks for nulls in a critical column and logs the result:
{% macro check_nulls(model, column) %}
{% set sql %}
select count(*) as null_count
from {{ model }}
where {{ column }} is null
{% endset %}
{% set result = run_query(sql) %}
{% set null_count = result.columns[0].values()[0] %}
{% if null_count > 0 %}
{% do log_with_macro_name('check_nulls', column ~ ' has ' ~ null_count ~ ' nulls', level='warn') %}
{% else %}
{% do log_with_macro_name('check_nulls', column ~ ' has no nulls', level='info') %}
{% endif %}
{% endmacro %}
CLI Output:
[check_nulls] : user_id has 0 nulls
[check_nulls] : order_id has 5 nulls
Final Thoughts
Standardising logging in dbt macros is a simple but powerful practice that pays dividends in clarity, maintainability, and team productivity. By wrapping dbt’s log function in a custom macro, you ensure that every log message is clearly attributed, making your CLI output a valuable tool for debugging and monitoring.
Key takeaways:
- Use a logging macro to prefix messages with the macro name.
- Apply this macro consistently across all custom macros and post-hooks.
- Consider supporting log levels for more granular control.
- Document and enforce logging standards in your dbt project.
With these practices, your dbt project will be easier to debug, maintain, and scale—no more mystery log messages!
Further Reading
If you have questions or want to see more advanced logging patterns, let me know in the comments!
Discussion