A monorepo is a single code repository that houses all dbt models, macros, tests, and configurations across domains or functional areas. This layout is common in early-stage teams and centralized data functions.

A multi-repo architecture divides dbt work into multiple repositories, usually mapped to domains, business units, or teams. Each repo is self-contained with its own dbt_project.yml.

Industry Context

Monorepo Context

  • Popular in startups or centralized data teams where agility and shared logic are key.
  • Tools like dbt Cloud, GitHub Actions, or GitLab CI can easily run a single pipeline.
  • Encourages tight cross-functional collaboration but may lead to coupling and permission complexity.

Multi-repo Context

  • Common in data mesh, large-scale enterprises, or regulated industries requiring modular governance.
  • Often paired with enterprise CI/CD platforms (Azure DevOps, Jenkins, etc.) to manage isolated pipelines.
  • Promotes decentralized ownership (data product teams) in line with Data Mesh or Domain-Driven Design (DDD).

Structural Comparison

Monorepo:

/dbt
  ├── models/
  │   ├── staging/
  │   ├── marts/
  │   ├── finance/
  │   └── marketing/
  ├── macros/
  ├── seeds/
  ├── tests/
  └── dbt_project.yml

Multi-repo:

  • Repo: dbt-finance
  • Repo: dbt-marketing
  • Repo: dbt-core-utils (for shared macros, tests)

Each repo:

/dbt-project
  ├── models/
  ├── macros/
  ├── seeds/
  └── dbt_project.yml

Versioning, Releases, and Dependency Management

Monorepo

  • Single versioning cycle: one Git tag/version for all domains.
  • Challenge: hard to do independent releases; e.g., a marketing model change may trigger full regression testing.
  • Best Practice: Use dbt node selection (e.g., dbt build --select tag:finance) or YAML selectors to narrow build scope.
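
As a minimal sketch of the scoping practice above, a YAML selector can bundle the finance tag and the finance folder into one reusable definition. The selector name finance_only is hypothetical, and the tag method only matches models that have already been tagged finance (for example through +tags in dbt_project.yml).

```yaml
# selectors.yml (placed next to dbt_project.yml)
selectors:
  - name: finance_only            # hypothetical selector name
    description: "Finance models, matched by tag or by folder"
    definition:
      union:
        - method: tag
          value: finance          # assumes finance models carry this tag
        - method: path
          value: models/finance   # folder from the monorepo layout above
```

It would then be invoked as dbt build --selector finance_only, which keeps the selection logic in version control rather than in each CI command.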

Multi-repo

  • Independent lifecycle: each repo can be versioned, released, and deployed independently.
  • Best Practice: Publish reusable logic (macros/tests) as dbt packages in a dbt-core-utils repo.
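
A consuming repo such as dbt-finance could pull in the shared logic through packages.yml, roughly as sketched below. The organization URL and the release tag are placeholders, not values from any real repository.

```yaml
# packages.yml in dbt-finance (a consuming repo)
packages:
  # Community package from the dbt package hub
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]

  # Shared in-house macros and tests, pinned to a release tag.
  # Assumption: the URL and tag below are placeholders.
  - git: "https://github.com/example-org/dbt-core-utils.git"
    revision: v1.2.0
```

Running dbt deps installs both packages. Pinning revision to a tag rather than a branch lets each domain repo upgrade the shared package on its own schedule.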

Security & Governance

  • Access Control
    • Monorepo: Requires branch protections and folder-level conventions. Git doesn’t natively support fine-grained access per folder.
    • Multi-repo: Git-level permissions per repo. E.g., the Finance team can't see Marketing's code unless explicitly granted.
  • Audit Trail
    • Monorepo: All changes are logged in one repo, which may get noisy.
    • Multi-repo: Easier to audit changes per domain or team.

Best Practices

  • Use CODEOWNERS in monorepos for folder-level review enforcement.
  • In both setups, use SonarQube and pre-commit hooks for code quality, and dbt-expectations for data quality tests.
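
One possible shape for the pre-commit setup is sketched below. The choice of hooks is an assumption: SQLFluff is not mentioned above and is used here only as an example SQL linter, and the pinned rev values should be replaced with releases you have vetted.

```yaml
# .pre-commit-config.yaml (same file works in a monorepo or in each domain repo)
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0                  # assumption: pin to a release you have vetted
    hooks:
      - id: check-yaml           # catches malformed dbt YAML files
      - id: end-of-file-fixer
      - id: trailing-whitespace

  - repo: https://github.com/sqlfluff/sqlfluff
    rev: 3.0.7                   # assumption: example pin only
    hooks:
      - id: sqlfluff-lint        # needs a .sqlfluff file that sets the SQL dialect
```

In a multi-repo setup, keeping this file identical across repos (for example by templating it from dbt-core-utils) helps avoid drift in standards.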

CI/CD & Automation

Monorepo

  • Single CI/CD pipeline with path-based conditionals, e.g., run the finance job only if changes include 'models/finance/**'.
  • Use tools like:
    • dbt build --select path:models/finance/
    • GitHub Actions matrix strategies
  • Best Practice: Cache installed packages (dbt deps) and use state comparison against stored production artifacts (manifest.json) so that only modified models are rebuilt.
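
The monorepo points above could be combined in a GitHub Actions workflow along these lines. This is a sketch under several assumptions: the dbt project sits at the repository root, the warehouse adapter is Postgres, credentials and profiles.yml are supplied elsewhere, and production artifacts have already been downloaded to ./prod-artifacts by a step not shown.

```yaml
# .github/workflows/dbt-finance-ci.yml (hypothetical file name)
name: dbt-finance-ci

on:
  pull_request:
    paths:
      - "models/finance/**"      # run only when finance models change

jobs:
  build-finance:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Assumption: Postgres adapter; swap in the adapter you use.
      - run: pip install dbt-core dbt-postgres
      # Reuse installed packages until packages.yml changes.
      - uses: actions/cache@v4
        with:
          path: dbt_packages
          key: dbt-packages-${{ hashFiles('packages.yml') }}
      - run: dbt deps
      # Build only finance models that changed relative to production,
      # plus their downstream dependents.
      - run: >
          dbt build
          --select "path:models/finance,state:modified+"
          --state ./prod-artifacts
```

The comma in the --select value is dbt's intersection operator, so a model must be both under models/finance and modified (or downstream of a modification) to be built.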

Multi-repo

  • Each repo has its own CI/CD pipeline.
  • E.g., dbt-finance triggers on PR merge and deploys only Finance models.
  • Best Practice: Use Git submodules or a private package registry (like GitHub Packages or Artifactory) for macro/test reuse.
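
In the multi-repo case, each domain repo can carry a much simpler pipeline, since there is nothing to filter. A rough sketch for dbt-finance follows; the adapter, the branch name, and the prod target are assumptions.

```yaml
# .github/workflows/deploy.yml in the dbt-finance repo (hypothetical file name)
name: deploy-finance

on:
  push:
    branches: [main]             # fires when a PR is merged into main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Assumption: Postgres adapter; swap in the adapter you use.
      - run: pip install dbt-core dbt-postgres
      # Pulls shared packages such as dbt-core-utils.
      - run: dbt deps
      # Assumption: a "prod" target is defined in profiles.yml.
      - run: dbt build --target prod
```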

Testing, Quality & Reuse

  • Unit Tests
    • Monorepo: Cross-folder tests are easier to write and run.
    • Multi-repo: Requires mocks or data contracts across repos.
  • Macros
    • Monorepo: Native sharing across domains.
    • Multi-repo: Must be published and versioned via packages.
  • Seeds & Fixtures
    • Monorepo: Centrally defined and accessed.
    • Multi-repo: Duplicated or abstracted via a shared repo.

Best Practices

  • Use dbt-utils and a custom in-house utilities package in both setups.
  • Document macros and model contracts using dbt docs, descriptions, and meta fields.
  • Use dbt artifacts and manifest.json for downstream system integrations (like lineage tools or alerting platforms).

Team Collaboration & Ownership

Monorepo

  • Encourages shared ownership, but needs discipline in naming, folder structure, and tagging.
  • Difficult for teams to deploy autonomously.
  • Good for centralized governance and fast prototyping.

Multi-repo

  • Allows true federated data ownership (e.g., Data Mesh).
  • Each domain can deploy independently.
  • Ideal for regulated industries (banking, healthcare) needing domain isolation.

Best Practices

  • Monorepo: Establish a project governance committee for standards and naming.
  • Multi-repo: Define and publish interface contracts for shared models (e.g., customer_dim schema, field expectations).
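
An interface contract for a shared model could be expressed with dbt's model contracts (available in dbt 1.5 and later), as in the sketch below. The file path, column names, data types, and owner label are illustrative assumptions; data type names depend on the warehouse.

```yaml
# models/marts/customer_dim.yml in the publishing repo (hypothetical path)
version: 2

models:
  - name: customer_dim
    description: "One row per customer. Published interface for other domains."
    config:
      contract:
        enforced: true             # build fails if the model's columns drift from this spec
      meta:
        owner: finance-data-team   # hypothetical owner label
    columns:
      - name: customer_id          # hypothetical column
        data_type: int
        description: "Surrogate key for the customer."
        constraints:
          - type: not_null
      - name: customer_name        # hypothetical column
        data_type: string
        description: "Display name of the customer."
```

With the contract enforced, a change that renames or retypes a column fails in the publishing repo's own CI, before downstream repos are affected.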

Interdependencies & Data Mesh

  • Cross-domain dependencies
    • Monorepo: Simple. Models reference each other directly with ref() and form a single DAG.
    • Multi-repo: Requires referencing published tables via source() across repos.
  • Data Contracts
    • Monorepo: Implicit within the same repo.
    • Multi-repo: Must be explicit, with schemas and documentation.
  • Lineage
    • Monorepo: Easily visualized in dbt docs.
    • Multi-repo: May need manifest stitching or third-party lineage tools (e.g., Alation, Collibra, Atlan).

Best Practices

  • In multi-repo: Declare key upstream tables (e.g., customer_dim) as sources in downstream repos, reference them with source(), and version their schemas.
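
In a downstream repo such as dbt-marketing, the upstream table might be declared roughly as follows. The file path, database, schema, and column names are placeholders rather than values taken from a real project.

```yaml
# models/staging/finance/_finance__sources.yml in dbt-marketing (hypothetical path)
version: 2

sources:
  - name: finance                  # logical name used in source('finance', ...)
    database: analytics            # assumption: placeholder database name
    schema: finance_marts          # assumption: schema where dbt-finance publishes
    tables:
      - name: customer_dim
        description: "Published by the dbt-finance repo; treat as a read-only interface."
        columns:
          - name: customer_id      # hypothetical column
            tests:                 # newer dbt versions also accept data_tests
              - not_null
              - unique
```

Marketing models would then select from {{ source('finance', 'customer_dim') }}, and the tests on the source act as a lightweight check that the upstream interface still holds.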

Decision Framework: Key Considerations

Before deciding on a repository structure, evaluate the following aspects:

Workflow Dynamics

  • Review Processes: Who is responsible for approving pull requests?
  • Deployment Pipelines: How are code changes promoted across environments (e.g., dev → QA → prod)?
  • Access Controls: Who has access to different environments and data objects?

Team Structure and Collaboration

  • Team Interactions: Do teams collaborate closely or operate independently?
  • Coding Standards: Are there unified or varied coding styles and review processes?
  • Data Source Usage: Do teams share data sources, or are they domain-specific?
  • Data Ownership: Are there clear boundaries regarding data ownership and consumption?

When to Choose What?

Choose Monorepo If:

  • Small or medium team size
  • High collaboration and fast prototyping
  • Centralized data modeling and governance
  • No strict domain boundaries or compliance barriers
  • Limited CI/CD complexity

Choose Multi-repo If:

  • Large teams with domain-specific ownership
  • Independent deployment cycles needed
  • Regulatory/permission isolation is critical
  • You're adopting Data Mesh or DDD principles
  • Need to scale modular governance and CI/CD

Final Recommendation

  • Start Simple: For small teams or organizations new to dbt, a monorepo offers simplicity and ease of management.
  • Assess Growth: As the organization scales, monitor for signs that a monorepo is becoming a bottleneck, such as long build times or complex merge conflicts.
  • Plan for Transition: If transitioning to a multi-repo setup, plan meticulously to manage dependencies and maintain consistency.

If you're scaling or pursuing data mesh, or have stringent governance needs, go multi-repo and invest in:

  • Shared macro/test packaging
  • Interface documentation
  • Cross-repo dependency tooling
