Jinja loops in dbt are a game-changer for automating SQL generation and reducing repetitive code. By leveraging Python-like syntax, you can dynamically build queries, aggregate data, and streamline your dbt models for scalability and maintainability.

Whether you’re mapping values, generating unions, or building dynamic aggregations, these patterns will help you write cleaner, more efficient dbt code.

Who is this for?
- Hands-on developers.

TL;DR

  • Problem: Writing repetitive code is boring, ugly and inefficient.
  • Solution: Use Jinja patterns inside dbt transformations.
  • In Sum: We discuss 10 Jinja loop patterns to help you write cleaner, more efficient and beautiful dbt transformations.
Just like well-written python is pythonic, well-written dbt code is dbtonic.
- dbt

Below, we’ll explore 10 practical loop patterns, each with multiple code examples, to help you write cleaner, more efficient dbt models. They include:

  1. Auto-Generate CTEs with similar names
  2. Function Parameter Generation
  3. Looping through CASE Statements
  4. Generating UNION Statements
  5. Flexible Aggregation
  6. Dynamic Column Selection
  7. Generating Filters for WHERE Clauses
  8. BONUS: Building Dynamic Window Functions
  9. Loop Properties Cheat Sheet
👨‍💻
Link to download and instructions to run the code examples locally can be found at the end of the article.

1. Auto-Generate CTEs with similar names

Create multiple CTEs for different sources that are similar in names; you can use Jinja loops for this use case, too.

{% set email_companies_with_filters = [
  { "name": "google", "filter": "name like '%John%'" },
  { "name": "yahoo", "filter": "name like '%Jill%'" },
  { "name": "microsoft", "filter": "name like '%Doe%'" }
] %}

with
{% for company in email_companies_with_filters %}
  {{ company.name }}_cte as (
    select * from {{ ref('dynamic_case') }}
    where {{ company.filter }}
  ){% if not loop.last %},{% endif %}
{% endfor %}

select * from microsoft_cte

The compiled code looks like this:

with
  google_cte as (
    select * from "public"."public"."dynamic_case"
    where name like '%John%'
  ),
  yahoo_cte as (
    select * from "public"."public"."dynamic_case"
    where name like '%Jill%'
  ),
  microsoft_cte as (
    select * from "public"."public"."dynamic_case"
    where name like '%Doe%'
  )
select * from microsoft_cte

2. Function Parameter Generation

Dynamically generate function parameters, such as percentiles using the range function, and use the values generated as function parameters.

{% set percentiles = range(10, 100, 10) %}

select
  {% for percent in percentiles %}
    percentile_cont({{ percent }} / 100) within group (order by spam_threshold) as spam_threshold_p{{ p }}
    {% if not loop.last %}, {% endif %}
  {% endfor %}
from {{ ref('stg_user_email_spam_rating') }}

3. Looping through CASE Statements

Suppose I have a raw table with names and emails, like below:

Sample Users table

If I need to feature engineer additional fields like the email provider and the email provider country, I don't have to write out every condition now. I can use a mapping table like shown below and use Jinja templating to loop through every case statement.

{% set email_provider_mapping = {
  "gmail.com": {"company": "Google", "country": "US"},
  "yahoo.com": {"company": "Yahoo", "country": "US"},
  "hotmail.com": {"company": "Microsoft", "country": "US"},
  "outlook.com": {"company": "Microsoft", "country": "US"},
  "aol.com": {"company": "AOL", "country": "US"},
  "protonmail.com": {"company": "Proton", "country": "CH"},
  "posteo.de": {"company": "Posteo", "country": "DE"},
  "gmx.net": {"company": "GMX", "country": "DE"},
  "zoho.com": {"company": "Zoho", "country": "IN"}
} %}

select
  name,
  email,
  case
    {% for key, value in email_provider_mapping.items() %}
      when email like '%{{ key }}' then '{{ value.company }}'
    {% endfor %}
    else 'other'
  end as email_provider,
  case
    {% for key, value in email_provider_mapping.items() %}
      when email like '%{{ key }}' then '{{ value.country }}'
    {% endfor %}
    else 'other'
  end as country
from {{ ref('raw_emails') }}
Output for Dynamic CASE Statement

4. Generating UNION Statements

Combine data from multiple tables or sources without writing each SELECT manually.

{% set email_companies = ["google", "yahoo", "microsoft"] %}

with google_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Google'
),
yahoo_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Yahoo'
),
microsoft_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Microsoft'
)

{% for email_company in email_companies %}
select
  name,
  email,
  email_provider,
  country
from {{ email_company }}_users
{% if not loop.last %}
union all
{% endif %}
{% endfor %}

The compiled code looks something like this:

with google_users as (
  select *
  from "public"."public"."dynamic_case"
  where email_provider = 'Google'
),
yahoo_users as (
  select *
  from "public"."public"."dynamic_case"
  where email_provider = 'Yahoo'
),
microsoft_users as (
  select *
  from "public"."public"."dynamic_case"
  where email_provider = 'Microsoft'
)
select
  name,
  email,
  email_provider,
  country
from google_users union all
select
  name,
  email,
  email_provider,
  country
from yahoo_users union all
select
  name,
  email,
  email_provider,
  country
from microsoft_users

5. Flexible Aggregation

Aggregate across multiple columns or dimensions without repeating code.

{% set companies = [{
  "name": "google",
  "filter": "email_provider = 'Google'"
}, {
  "name": "yahoo",
  "filter": "email_provider = 'Yahoo'"
}, {
  "name": "microsoft",
  "filter": "email_provider = 'Microsoft'"
}] %}
{% set metrics = [{
  "name": "count",
  "sql": "count(*)"
}, {
  "name": "count_distinct",
  "sql": "count(distinct email)"
}] %}

select
  {% for company in companies %}
    {% for metric in metrics %}
      {{ metric.sql }} as {{ company.name }}_{{ metric.name }}
      {% if not loop.last %}, {% endif %}
    {% endfor %}
    {% if not loop.last %}, {% endif %}
  {% endfor %}
from {{ ref('dynamic_case') }}

The compiled code looks like this:

select
      count(*) as google_count,
      count(distinct email) as google_count_distinct,
      count(*) as yahoo_count,
      count(distinct email) as yahoo_count_distinct, 
      count(*) as microsoft_count, 
      count(distinct email) as microsoft_count_distinct
from "public"."public"."dynamic_case"

The above example is a great example to show that over-DRYing your code can affect the readability of the code. Look at the compiled code, which is more readable than the Jinja syntax. I just wanted to demonstrate that using Jinja too much can make your code difficult to understand.

🤠
My Thoughts: Don't over-DRY your code!
Remember that using Jinja can make your models harder for other users to interpret. I highly recommend favoring readability when mixing Jinja with SQL, even if that means repeating some lines of SQL in a few places. If all your models are macros, it might be worth re-assessing.

6. Dynamic Column Selection

Select a dynamic set of columns, e.g., for wide tables or when columns change frequently.

{% set company_columns = [
  { "name": "google", "columns": ["name", "email"] },
  { "name": "yahoo", "columns": ["name", "country"] },
  { "name": "microsoft", "columns": ["email", "email_provider"] }
] %}

with google_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Google'
),
yahoo_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Yahoo'
),
microsoft_users as (
  select *
  from {{ ref('dynamic_case') }}
  where email_provider = 'Microsoft'
)

{% for company in company_columns %}
select
  {% for column in company.columns %}
    {{ column }}
    {% if not loop.last %}, {% endif %}
  {% endfor %}
from {{ company.name }}_users
{% if not loop.last %}
union all
{% endif %}
{% endfor %}

7. Generating Filters for WHERE Clauses

Build dynamic filters based on a list of values. In the example below, I am filtering user emails based on a banned countries list.

{% set banned_countries = ["US", "DE", "IN"] %}

select *
from {{ ref('dynamic_case') }}
where country not in (
  {% for country in banned_countries %}
    '{{ country }}'{% if not loop.last %}, {% endif %}
  {% endfor %}
)

Compiled code looks like this:

select *
from "public"."public"."dynamic_case"
where country not in (
    'US', 
    'DE',
    'IN'
)

Bonus: Building Dynamic Window Functions

When you want to apply window functions across multiple columns or partitions.

{% set metrics = ["emails_received", "emails_opened", "clicked_links"] %}

select
  date,
  {% for metric in metrics %}
    sum({{ metric }}) over (order by date rows between 6 preceding and current row) as rolling_7d_{{ metric }}
    {% if not loop.last %}, {% endif %}
  {% endfor %}
from {{ ref('daily_email_stats') }}

Loop Properties

Use the table below to understand Jinja loop properties.

Property Usage
loop.last This boolean is False unless the current iteration is the last iteration.
    {% for item in list %}
      {% if loop.last %}   
        --This is the last item
        {{ item }}
      {% endif %}
    {% endfor %}
  
loop.first A boolean that is True if the current iteration is the first iteration, otherwise False.
    {% for item in list %}
      {% if loop.first %}
        --first item
        {{ item }}
      {% endif %}
    {% endfor %}
  
loop.index An integer representing the current iteration of the loop (1-indexed). So, the first iteration would have `loop.index` of 1, the second would be 2, and so on.
    {% for item in list %}
       --This is item number
       {{ loop.index }}
    {% endfor %}
  

As you continue to develop your data models, keep these techniques in mind to build robust, scalable, and DRY (Don’t Repeat Yourself) analytics pipelines. Happy modeling!

FAQs

How do I debug my Jinja?

You should get familiar with checking the compiled SQL in target/compiled/<your_project>/ and the logs in logs/dbt.log to see what dbt is running behind the scenes. You can also read about standardising logging in dbt macros here.

How do I document macros?

To document macros, use a schema file and nest the configurations under a macros: key.

Further Reading

EOF

(Where I tend to share unrelated things).

If you've come across or used other Jinja loop patterns, I would love to hear about them in the comments 😃

Download Code Examples

Congrats on reaching this far. Becoming good at using Jinja templates in your dbt project is all about practice. That's why I've decided to give away all the code examples used here for free to my reader community. Just subscribe and download for free.

Link to download the code and instructions to run it 👇