Monorepo Strategies and Management

Comprehensive architecture patterns, tooling, and best practices for managing large-scale unified codebases

Introduction

A monorepo (monolithic repository) is a software development strategy where code for multiple projects is stored in a single repository. This approach has gained significant traction among large tech companies and is increasingly adopted by teams of all sizes.

What is a Monorepo?

A monorepo contains multiple distinct projects with well-defined relationships and dependencies, all within a single repository. Unlike a monolithic application, projects in a monorepo can be deployed independently.

When to Use a Monorepo

Consider a monorepo when:

  • You have multiple projects that share code
  • Teams frequently collaborate across projects
  • You need atomic commits across multiple projects
  • Consistent tooling and standards are important
  • You want simplified dependency management

Avoid a monorepo when:

  • Projects have completely different tech stacks
  • Teams require strict access control separation
  • Projects have vastly different release cycles
  • Your VCS struggles with large repositories

Monorepo vs Polyrepo Comparison

Monorepo Advantages

  • Atomic Changes: Refactor across multiple projects in one commit
  • Shared Code: Easy code reuse without package publishing
  • Consistent Tooling: Single set of build tools and configurations
  • Simplified Dependencies: No version conflicts between internal packages
  • Better Refactoring: IDEs can find and update all usages

Polyrepo Advantages

  • Independent Versioning: Each project has its own release cycle
  • Smaller Repositories: Faster cloning and operations
  • Clear Boundaries: Enforced separation between projects
  • Flexible Tech Stacks: Each repo can use different tools
  • Granular Access Control: Per-repository permissions

Comparison Table

Aspect Monorepo Polyrepo
Code Sharing Direct imports Published packages
Atomic Changes ✅ Native ❌ Requires coordination
Build Complexity Higher Lower per repo
Repository Size Large Small
Team Autonomy Lower Higher
Tooling Investment High upfront Lower initial

Nx

Nx is a smart, extensible build framework designed for monorepos.

// nx.json
{
  "npmScope": "myorg",
  "affected": {
    "defaultBase": "main"
  },
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nrwl/nx-cloud",
      "options": {
        "cacheableOperations": ["build", "test", "lint"]
      }
    }
  }
}

Key Features:

  • Intelligent build system with computation caching
  • Affected commands run only on changed projects
  • Rich plugin ecosystem
  • Distributed task execution

Lerna

Lerna optimizes workflow around managing multi-package repositories.

// lerna.json
{
  "version": "independent",
  "npmClient": "yarn",
  "useWorkspaces": true,
  "command": {
    "publish": {
      "conventionalCommits": true,
      "message": "chore(release): publish"
    },
    "bootstrap": {
      "hoist": true
    }
  }
}

Key Features:

  • Independent or fixed versioning modes
  • Automated publishing to npm
  • Bootstrapping local dependencies
  • Conventional commits support

Rush

Rush is a scalable monorepo manager for the web.

// rush.json
{
  "rushVersion": "5.100.0",
  "pnpmVersion": "8.6.0",
  "projects": [
    {
      "packageName": "@myorg/core",
      "projectFolder": "libraries/core"
    },
    {
      "packageName": "@myorg/app",
      "projectFolder": "apps/main-app"
    }
  ]
}

Key Features:

  • Phantom dependencies detection
  • Incremental builds
  • Rush plugins for extensibility
  • Strict dependency validation

Bazel

Bazel is Google’s build tool, designed for large-scale monorepos.

# WORKSPACE
workspace(name = "my_monorepo")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# BUILD file
js_library(
    name = "core",
    srcs = glob(["src/**/*.js"]),
    deps = [
        "//packages/utils",
        "@npm//lodash",
    ],
)

Key Features:

  • Language-agnostic build system
  • Hermetic builds
  • Remote caching and execution
  • Proven at massive scale

Turborepo

Turborepo is a high-performance build system for JavaScript and TypeScript monorepos.

// turbo.json
{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "test": {
      "dependsOn": ["build"],
      "outputs": []
    },
    "dev": {
      "cache": false
    }
  }
}

Key Features:

  • Incremental builds
  • Remote caching
  • Parallel execution
  • Zero-config setup

Repository Structure Best Practices

monorepo/
├── apps/                    # Application projects
│   ├── web-app/
│   ├── mobile-app/
│   └── admin-dashboard/
├── packages/               # Shared packages
│   ├── ui-components/
│   ├── utils/
│   └── api-client/
├── libs/                   # Internal libraries
│   ├── auth/
│   └── data-access/
├── tools/                  # Build tools and scripts
│   ├── eslint-config/
│   └── webpack-config/
├── docs/                   # Documentation
├── .github/               # GitHub specific files
├── package.json           # Root package.json
└── turbo.json            # Monorepo tool config

Naming Conventions

// Package naming
@myorg/ui-components
@myorg/utils
@myorg/api-client

// Internal imports
import { Button } from '@myorg/ui-components';
import { formatDate } from '@myorg/utils';

Workspace Configuration

// package.json (root)
{
  "name": "my-monorepo",
  "private": true,
  "workspaces": [
    "apps/*",
    "packages/*",
    "libs/*",
    "tools/*"
  ],
  "scripts": {
    "build": "turbo run build",
    "test": "turbo run test",
    "lint": "turbo run lint"
  }
}

Dependency Management Strategies

Internal Dependencies

// packages/app/package.json
{
  "dependencies": {
    "@myorg/ui-components": "workspace:*",
    "@myorg/utils": "workspace:*"
  }
}

Version Management

// Version strategies
enum VersionStrategy {
  FIXED = "fixed",           // All packages same version
  INDEPENDENT = "independent", // Each package own version
  GROUPED = "grouped"        // Groups of packages share versions
}

Dependency Hoisting

# .npmrc or .yarnrc.yml
nodeLinker: node-modules
nmHoistingLimits: workspaces

Lock File Management

# Single lock file at root
yarn.lock
pnpm-lock.yaml
package-lock.json

# Ensure deterministic installs
npm ci
yarn install --frozen-lockfile
pnpm install --frozen-lockfile

Build Optimization and Caching

Computation Caching

// nx.json
{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nrwl/workspace/tasks-runners/default",
      "options": {
        "cacheableOperations": [
          "build",
          "test",
          "lint",
          "e2e"
        ],
        "cacheDirectory": ".cache/nx"
      }
    }
  }
}

Remote Caching

# Turborepo remote caching
turbo login
turbo link

# Nx Cloud
nx g @nrwl/nx-cloud:init

Incremental Builds

// tsconfig.json
{
  "compilerOptions": {
    "incremental": true,
    "tsBuildInfoFile": ".tsbuildinfo"
  }
}

Build Pipeline Optimization

// turbo.json
{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"]
    },
    "test": {
      "dependsOn": ["build"],
      "inputs": ["src/**", "tests/**"]
    }
  }
}

Testing Strategies for Monorepos

Test Organization

monorepo/
├── apps/web/__tests__/        # App-specific tests
├── packages/ui/src/__tests__/ # Package tests
├── e2e/                       # End-to-end tests
└── integration/               # Integration tests

Affected Testing

# Run tests only for changed code
nx affected:test --base=main
turbo run test --filter=...[origin/main]
lerna run test --since origin/main

Test Configuration

// jest.config.base.js
module.exports = {
  preset: '../../jest.preset.js',
  coverageDirectory: '../../coverage/packages/ui',
  transform: {
    '^.+\\.(ts|tsx)$': 'ts-jest'
  }
};

Parallel Testing

# Run tests in parallel
jest --maxWorkers=50%
nx run-many --target=test --parallel=3
turbo run test --concurrency=4

Code Ownership and CODEOWNERS

CODEOWNERS File

# .github/CODEOWNERS
# Global owners
* @monorepo-admins

# App owners
/apps/web/ @web-team
/apps/mobile/ @mobile-team

# Package owners
/packages/ui-components/ @design-system-team
/packages/api-client/ @api-team

# Specific file patterns
*.sql @database-team
*.yml @devops-team

Ownership Strategies

// ownership.config.ts
export const ownership = {
  rules: [
    {
      pattern: 'packages/core/**',
      owners: ['@core-team'],
      minApprovals: 2
    },
    {
      pattern: 'apps/**',
      owners: ['@app-team'],
      minApprovals: 1
    }
  ]
};

Git Strategies for Large Repos

Sparse Checkout

# Enable sparse checkout
git sparse-checkout init --cone

# Add specific paths
git sparse-checkout set apps/web packages/ui

Shallow Clone

# Clone with limited history
git clone --depth=1 https://github.com/org/monorepo.git

# Fetch more history as needed
git fetch --deepen=100

Git LFS for Large Files

# Track large files
git lfs track "*.psd"
git lfs track "*.zip"

# .gitattributes
*.psd filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text

Branch Strategies

# Feature branches
feature/APP-123-new-feature

# Release branches
release/2023.10.0

# Hotfix branches
hotfix/critical-bug-fix

CI/CD for Monorepos

GitHub Actions Example

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        package: [web, api, mobile]
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      
      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: 18
          cache: 'yarn'
      
      - name: Install dependencies
        run: yarn install --frozen-lockfile
      
      - name: Build affected
        run: yarn nx affected:build --base=origin/main
      
      - name: Test affected
        run: yarn nx affected:test --base=origin/main

Deployment Strategies

# Selective deployment
deploy:
  runs-on: ubuntu-latest
  steps:
    - name: Deploy changed apps
      run: |
        AFFECTED=$(yarn nx affected:apps --base=origin/main --plain)
        for app in $AFFECTED; do
          yarn deploy:$app
        done

Build Matrix Optimization

// scripts/generate-matrix.js
const { execSync } = require('child_process');

const affected = execSync('nx affected:apps --plain')
  .toString()
  .trim()
  .split(' ');

console.log(JSON.stringify({ app: affected }));

Migration Strategies

Polyrepo to Monorepo

# Step 1: Create monorepo structure
mkdir my-monorepo && cd my-monorepo
git init

# Step 2: Add repos as subdirectories
git subtree add --prefix=apps/web https://github.com/org/web-app.git main
git subtree add --prefix=apps/api https://github.com/org/api.git main

# Step 3: Update import paths
find . -type f -name "*.js" -exec sed -i 's/from "shared"/from "@myorg\/shared"/g' {} +

# Step 4: Setup workspace
yarn init -y
yarn add -D lerna nx turbo

Monorepo to Polyrepo

# Extract package with history
git subtree push --prefix=packages/ui origin ui-package

# Create new repo
git clone https://github.com/org/monorepo.git ui-components
cd ui-components
git filter-branch --subdirectory-filter packages/ui -- --all

# Update dependencies
npm init -y
npm install

Gradual Migration

// migration.config.ts
export const migrationPhases = [
  {
    phase: 1,
    packages: ['shared-utils', 'ui-components'],
    deadline: '2024-Q1'
  },
  {
    phase: 2,
    packages: ['api-client', 'auth'],
    deadline: '2024-Q2'
  }
];

Real-World Case Studies

Google

  • Size: 2+ billion lines of code
  • Tool: Bazel (originally Blaze)
  • Strategy: Single massive repository
  • Benefits: Unified tooling, atomic changes
  • Challenges: Custom VCS (Piper), specialized tools

Facebook (Meta)

  • Size: Hundreds of millions of lines
  • Tool: Buck (now open source)
  • Strategy: Mercurial-based monorepo
  • Benefits: Rapid iteration, code sharing
  • Challenges: Performance at scale

Microsoft

  • Project: Windows codebase
  • Tool: Git with VFS (Virtual File System)
  • Strategy: Git-based monorepo
  • Benefits: Unified Windows development
  • Challenges: 300GB+ repo size

Uber

  • Migration: Polyrepo to monorepo (2018)
  • Tool: Bazel
  • Languages: Go, Java, JavaScript
  • Benefits: 50% reduction in build times
  • Results: Improved developer productivity

Common Challenges and Solutions

Challenge: Long Clone Times

# Solution 1: Shallow clone
git clone --depth=1 --single-branch

# Solution 2: Partial clone
git clone --filter=blob:none

# Solution 3: Sparse checkout
git sparse-checkout init --cone

Challenge: IDE Performance

// Solution: Scope IDE to specific packages
// .vscode/settings.json
{
  "files.exclude": {
    "**/node_modules": true,
    "apps/!(web)/**": true,
    "packages/!(ui)/**": true
  },
  "search.exclude": {
    "**/dist": true,
    "**/coverage": true
  }
}

Challenge: Merge Conflicts

# Solution: Automated conflict resolution
# .gitattributes
*.generated.ts merge=ours
package-lock.json merge=ours

Challenge: CI Build Times

# Solution: Distributed builds
- name: Distribute builds
  uses: nrwl/nx-set-shas@v3
  
- run: |
    npx nx affected --target=build \
      --parallel=3 \
      --configuration=production \
      --nx-cloud

Bun Workspaces

New Addition (2023-2024): Bun’s built-in workspace support

// package.json
{
  "workspaces": ["packages/*", "apps/*"]
}

Key Features:

  • Native workspace support
  • Fast dependency installation
  • Built-in TypeScript support
  • Compatible with existing tools

PNPM Catalogs

Centralized Dependency Management:

# pnpm-workspace.yaml
catalog:
  react: ^18.2.0
  typescript: ^5.3.0
  vite: ^5.0.0

packages:
  - 'packages/*'
  - 'apps/*'

Performance Optimization Techniques

File System Optimization

# Use watchman for file watching
brew install watchman

# Configure git
git config core.preloadindex true
git config core.fscache true
git config gc.auto 256

Build Performance

// webpack.config.js
module.exports = {
  cache: {
    type: 'filesystem',
    buildDependencies: {
      config: [__filename]
    }
  },
  optimization: {
    runtimeChunk: 'single',
    moduleIds: 'deterministic'
  }
};

Memory Management

// .npmrc
max-old-space-size=4096

// package.json
{
  "scripts": {
    "build": "NODE_OPTIONS='--max-old-space-size=4096' nx build"
  }
}

Parallel Processing

// parallel-tasks.ts
import { cpus } from 'os';
import pLimit from 'p-limit';

const limit = pLimit(cpus().length);
const tasks = packages.map(pkg => 
  limit(() => buildPackage(pkg))
);

await Promise.all(tasks);

Best Practices Summary

Do’s

  • ✅ Start with clear package boundaries
  • ✅ Invest in tooling early
  • ✅ Use computation caching
  • ✅ Implement clear ownership rules
  • ✅ Automate dependency updates
  • ✅ Monitor repository metrics

Don’ts

  • ❌ Mix unrelated projects
  • ❌ Ignore tooling investment
  • ❌ Allow circular dependencies
  • ❌ Skip documentation
  • ❌ Neglect CI/CD optimization
  • ❌ Force monorepo on all projects

Conclusion

Monorepos can significantly improve development workflow for teams working on related projects. Success requires careful planning, appropriate tooling, and ongoing optimization. Start small, measure constantly, and scale gradually.

Additional Resources