Jinja loops in dbt are a game-changer for automating SQL generation and reducing repetitive code. By leveraging Python-like syntax, you can dynamically build queries, aggregate data, and streamline your dbt models for scalability and maintainability.
Whether you’re mapping values, generating unions, or building dynamic aggregations, these patterns will help you write cleaner, more efficient dbt code.
✅
Who is this for? - Hands-on developers.
TL;DR
Problem: Writing repetitive code is boring, ugly and inefficient.
Solution: Use Jinja patterns inside dbt transformations.
In Sum: We discuss 10 Jinja loop patterns to help you write cleaner, more efficient and beautiful dbt transformations.
Just like well-written python is pythonic, well-written dbt code is dbtonic. - dbt
Below, we’ll explore 10 practical loop patterns, each with multiple code examples, to help you write cleaner, more efficient dbt models. They include:
Auto-Generate CTEs with similar names
Function Parameter Generation
Looping through CASE Statements
Generating UNION Statements
Flexible Aggregation
Dynamic Column Selection
Generating Filters for WHERE Clauses
BONUS: Building Dynamic Window Functions
Loop Properties Cheat Sheet
👨💻
Link to download and instructions to run the code examples locally can be found at the end of the article.
1. Auto-Generate CTEs with similar names
Create multiple CTEs for different sources that are similar in names; you can use Jinja loops for this use case, too.
{% set email_companies_with_filters = [
{ "name": "google", "filter": "name like '%John%'" },
{ "name": "yahoo", "filter": "name like '%Jill%'" },
{ "name": "microsoft", "filter": "name like '%Doe%'" }
] %}
with
{% for company in email_companies_with_filters %}
{{ company.name }}_cte as (
select * from {{ ref('dynamic_case') }}
where {{ company.filter }}
){% if not loop.last %},{% endif %}
{% endfor %}
select * from microsoft_cte
The compiled code looks like this:
with
google_cte as (
select * from "public"."public"."dynamic_case"
where name like '%John%'
),
yahoo_cte as (
select * from "public"."public"."dynamic_case"
where name like '%Jill%'
),
microsoft_cte as (
select * from "public"."public"."dynamic_case"
where name like '%Doe%'
)
select * from microsoft_cte
2. Function Parameter Generation
Dynamically generate function parameters, such as percentiles using the range function, and use the values generated as function parameters.
{% set percentiles = range(10, 100, 10) %}
select
{% for percent in percentiles %}
percentile_cont({{ percent }} / 100) within group (order by spam_threshold) as spam_threshold_p{{ p }}
{% if not loop.last %}, {% endif %}
{% endfor %}
from {{ ref('stg_user_email_spam_rating') }}
3. Looping through CASE Statements
Suppose I have a raw table with names and emails, like below:
Sample Users table
If I need to feature engineer additional fields like the email provider and the email provider country, I don't have to write out every condition now. I can use a mapping table like shown below and use Jinja templating to loop through every case statement.
{% set email_provider_mapping = {
"gmail.com": {"company": "Google", "country": "US"},
"yahoo.com": {"company": "Yahoo", "country": "US"},
"hotmail.com": {"company": "Microsoft", "country": "US"},
"outlook.com": {"company": "Microsoft", "country": "US"},
"aol.com": {"company": "AOL", "country": "US"},
"protonmail.com": {"company": "Proton", "country": "CH"},
"posteo.de": {"company": "Posteo", "country": "DE"},
"gmx.net": {"company": "GMX", "country": "DE"},
"zoho.com": {"company": "Zoho", "country": "IN"}
} %}
select
name,
email,
case
{% for key, value in email_provider_mapping.items() %}
when email like '%{{ key }}' then '{{ value.company }}'
{% endfor %}
else 'other'
end as email_provider,
case
{% for key, value in email_provider_mapping.items() %}
when email like '%{{ key }}' then '{{ value.country }}'
{% endfor %}
else 'other'
end as country
from {{ ref('raw_emails') }}
Output for Dynamic CASE Statement
4. Generating UNION Statements
Combine data from multiple tables or sources without writing each SELECT manually.
{% set email_companies = ["google", "yahoo", "microsoft"] %}
with google_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Google'
),
yahoo_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Yahoo'
),
microsoft_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Microsoft'
)
{% for email_company in email_companies %}
select
name,
email,
email_provider,
country
from {{ email_company }}_users
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
The compiled code looks something like this:
with google_users as (
select *
from "public"."public"."dynamic_case"
where email_provider = 'Google'
),
yahoo_users as (
select *
from "public"."public"."dynamic_case"
where email_provider = 'Yahoo'
),
microsoft_users as (
select *
from "public"."public"."dynamic_case"
where email_provider = 'Microsoft'
)
select
name,
email,
email_provider,
country
from google_users union all
select
name,
email,
email_provider,
country
from yahoo_users union all
select
name,
email,
email_provider,
country
from microsoft_users
5. Flexible Aggregation
Aggregate across multiple columns or dimensions without repeating code.
{% set companies = [{
"name": "google",
"filter": "email_provider = 'Google'"
}, {
"name": "yahoo",
"filter": "email_provider = 'Yahoo'"
}, {
"name": "microsoft",
"filter": "email_provider = 'Microsoft'"
}] %}
{% set metrics = [{
"name": "count",
"sql": "count(*)"
}, {
"name": "count_distinct",
"sql": "count(distinct email)"
}] %}
select
{% for company in companies %}
{% for metric in metrics %}
{{ metric.sql }} as {{ company.name }}_{{ metric.name }}
{% if not loop.last %}, {% endif %}
{% endfor %}
{% if not loop.last %}, {% endif %}
{% endfor %}
from {{ ref('dynamic_case') }}
The compiled code looks like this:
select
count(*) as google_count,
count(distinct email) as google_count_distinct,
count(*) as yahoo_count,
count(distinct email) as yahoo_count_distinct,
count(*) as microsoft_count,
count(distinct email) as microsoft_count_distinct
from "public"."public"."dynamic_case"
The above example is a great example to show that over-DRYing your code can affect the readability of the code. Look at the compiled code, which is more readable than the Jinja syntax. I just wanted to demonstrate that using Jinja too much can make your code difficult to understand.
🤠
My Thoughts: Don't over-DRY your code! Remember that using Jinja can make your models harder for other users to interpret. I highly recommend favoring readability when mixing Jinja with SQL, even if that means repeating some lines of SQL in a few places. If all your models are macros, it might be worth re-assessing.
6. Dynamic Column Selection
Select a dynamic set of columns, e.g., for wide tables or when columns change frequently.
{% set company_columns = [
{ "name": "google", "columns": ["name", "email"] },
{ "name": "yahoo", "columns": ["name", "country"] },
{ "name": "microsoft", "columns": ["email", "email_provider"] }
] %}
with google_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Google'
),
yahoo_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Yahoo'
),
microsoft_users as (
select *
from {{ ref('dynamic_case') }}
where email_provider = 'Microsoft'
)
{% for company in company_columns %}
select
{% for column in company.columns %}
{{ column }}
{% if not loop.last %}, {% endif %}
{% endfor %}
from {{ company.name }}_users
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
7. Generating Filters for WHERE Clauses
Build dynamic filters based on a list of values. In the example below, I am filtering user emails based on a banned countries list.
{% set banned_countries = ["US", "DE", "IN"] %}
select *
from {{ ref('dynamic_case') }}
where country not in (
{% for country in banned_countries %}
'{{ country }}'{% if not loop.last %}, {% endif %}
{% endfor %}
)
Compiled code looks like this:
select *
from "public"."public"."dynamic_case"
where country not in (
'US',
'DE',
'IN'
)
Bonus: Building Dynamic Window Functions
When you want to apply window functions across multiple columns or partitions.
{% set metrics = ["emails_received", "emails_opened", "clicked_links"] %}
select
date,
{% for metric in metrics %}
sum({{ metric }}) over (order by date rows between 6 preceding and current row) as rolling_7d_{{ metric }}
{% if not loop.last %}, {% endif %}
{% endfor %}
from {{ ref('daily_email_stats') }}
Loop Properties
Use the table below to understand Jinja loop properties.
Property
Usage
loop.last
This boolean is False unless the current iteration is the last iteration.
{% for item in list %}
{% if loop.last %}
--This is the last item
{{ item }}
{% endif %}
{% endfor %}
loop.first
A boolean that is True if the current iteration is the first iteration, otherwise False.
{% for item in list %}
{% if loop.first %}
--first item
{{ item }}
{% endif %}
{% endfor %}
loop.index
An integer representing the current iteration of the loop (1-indexed). So, the first iteration would have `loop.index` of 1, the second would be 2, and so on.
{% for item in list %}
--This is item number
{{ loop.index }}
{% endfor %}
As you continue to develop your data models, keep these techniques in mind to build robust, scalable, and DRY (Don’t Repeat Yourself) analytics pipelines. Happy modeling!
FAQs
How do I debug my Jinja?
You should get familiar with checking the compiled SQL in target/compiled/<your_project>/ and the logs in logs/dbt.log to see what dbt is running behind the scenes. You can also read about standardising logging in dbt macros here.
How do I document macros?
To document macros, use a schema file and nest the configurations under a macros: key.
Subscribe to our Newsletter
Join 2000+ data engineers and developers discovering the latest in dbt and analytics engineering.
dbt Jinja functions - In addition to the standard Jinja library, dbt has added additional functions and variables to the Jinja context that are useful when working with a dbt project.
EOF
(Where I tend to share unrelated things).
If you've come across or used other Jinja loop patterns, I would love to hear about them in the comments 😃
Download Code Examples
Congrats on reaching this far. Becoming good at using Jinja templates in your dbt project is all about practice. That's why I've decided to give away all the code examples used here for free to my reader community. Just subscribe and download for free.
Link to download the code and instructions to run it 👇
Learn how to seamlessly run your dbt Core project with Apache Airflow. We cover the different methods to setup, code repo, video tutorial for local dev, and best practices to help you orchestrate dbt runs using Airflow: the easiest way to modernize your data workflows.
Struggling with "maximum recursion depth exceeded" in dbt macros? This article explains why it happens—usually due to missing or misused variables—and shows simple, beginner-friendly steps to fix it, ensuring your dbt projects run smoothly and error-free.
Learn how to standardise logging in dbt macros for clear, consistent CLI output. This guide shows you how to create a reusable logging macro, making debugging and tracking macro activity in your dbt projects easier and more efficient.
Automate PII tagging in your dbt project using OpenAI! Streamline data governance, ensure compliance, and save time by leveraging AI to detect and flag sensitive information in your data models. Integrate this workflow into CI/CD for robust, scalable data protection.
Subscribe to my Newsletter
Join 2000+ data engineers and developers discovering the latest in dbt and analytics engineering.