Pull requests
We use pull requests (PRs) to propose changes to the codebase. They are also the best way to get feedback on your work.
Every PR in the ETL repository has an associated staging server created for it. To smooth this process, we have automated it with the command etl pr.
Learn more about how to use the etl pr command
PR work summary
Once you've created a PR, the automation user @owidbot will add a comment to the PR summarizing your work in the PR and providing links to relevant resources. This comment will include the following information:
- Quick links: Links to the site and tooling using changes introduced by the PR. This includes admin site, public site, Wizard, documentation.
- Login: Instructions on how to ssh into the staging server.
- chart-diff: Wizard app showing chart changes (if any) compared to PRODUCTION. Owidbot will complain if there are chart changes pending review.
- data-diff: Changes introduced in the data compared to PRODUCTION.
Scheduling a PR merge
You can schedule a PR merge by adding the command /schedule at the end of your PR description. This is useful whenever you want to merge your PR at a specific time, e.g. at night if the merge could trigger a long deployment process on the main branch.
You have multiple options to schedule a PR merge:
- /schedule: The PR will be merged at the next sharp hour (e.g., 13:00, 14:00), based on the current UTC time.
- /schedule 2024-11-19: The PR will be merged at midnight (00:00 UTC) on the specified date.
- /schedule 2024-11-19T12:50:00.000Z: The PR will be merged at the next sharp hour immediately following the specified timestamp (e.g., if scheduled for 12:50 UTC, it will merge at 13:00 UTC).
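The "next sharp hour" rounding can be illustrated with a little integer arithmetic on Unix timestamps. This is only a sketch of the rule as described above, not the scheduler's actual implementation. The function name next_sharp_hour is hypothetical, and the handling of a timestamp that falls exactly on the hour is an assumption.

```shell
# Round a Unix timestamp (seconds, UTC) up to the next sharp hour.
# Assumption: a timestamp exactly on the hour moves to the following hour.
next_sharp_hour() {
    ts="$1"
    echo $(( (ts / 3600 + 1) * 3600 ))
}

# 2024-11-19T12:50:00Z is 1732020600 seconds since the epoch.
next_sharp_hour 1732020600   # prints 1732021200, i.e. 2024-11-19T13:00:00Z
```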
You can find an example here.
Working on multiple branches in parallel
If you need several PRs in flight at once (e.g. several agent sessions, one per branch), etl pr --worktree creates the new branch in a separate git worktree so your current working tree is untouched.
The command creates the worktree at ../etl-<branch> and runs uv sync inside it, so the worktree's .venv/ is ready to use by the time the command finishes. To start working there, cd into ../etl-<branch>.
If the venv does not activate automatically, also run source .venv/bin/activate. Or, even better, set up auto-activation once (see below); then cd alone is enough.
When you're done with the worktree (typically after the PR is merged), clean up with git worktree remove ../etl-<branch>.
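To make the worktree lifecycle concrete, here is a rough end-to-end sketch using plain git commands in a scratch repository. It stands in for what etl pr --worktree sets up and does not show what the command runs internally. The scratch paths and the branch name my-feature are hypothetical, and the uv sync step is left out.

```shell
# Scratch repository standing in for the ETL checkout (hypothetical paths).
scratch="$(mktemp -d)"
git init -q "$scratch/etl"
cd "$scratch/etl"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "init"

branch="my-feature"   # hypothetical branch name

# Create the new branch in a sibling directory; the current tree is untouched.
git worktree add -q -b "$branch" "../etl-$branch"

# Start working in the worktree.
cd "../etl-$branch"
git rev-parse --abbrev-ref HEAD   # shows the branch checked out in this worktree

# Clean up once the PR is merged (run from the original checkout).
cd "$scratch/etl"
git worktree remove "../etl-$branch"
```

Removing the worktree deletes its directory but keeps the branch itself, so the branch can still be deleted or checked out separately.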
Optional: auto-activate the venv when you cd
Add this snippet to your ~/.zshrc so the right .venv/ activates automatically every time you cd into a worktree (or any project folder with a .venv/):
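autoload -U add-zsh-hook

# Activate the virtual environment found in the current directory, if any.
load-py-venv() {
    if [ -f .venv/bin/activate ]; then
        source .venv/bin/activate
    elif [ -f env/bin/activate ]; then
        source env/bin/activate
    elif [ -f venv/bin/activate ]; then
        source venv/bin/activate
    elif [ -n "$VIRTUAL_ENV" ] && { [ -f poetry.toml ] || [ -f requirements.txt ]; }; then
        # A Python project without a local venv: leave the previously active one.
        deactivate
    fi
}

# Run on every directory change, and once for the directory the shell starts in.
add-zsh-hook chpwd load-py-venv
load-py-venv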
After reloading your shell (source ~/.zshrc or open a new terminal), you can cd between worktrees and the matching venv will activate on its own.
Tip: open each worktree in its own VS Code window (File > New Window). The Claude Code extension is scoped per workspace, so each window gets its own chat.
Sharing the data folder (optional)
--share-data symlinks the new worktree's data/ to the original's, so upstream ETL steps don't get recomputed.
Warning
Never run rm -rf data/ in a shared worktree — the trailing slash makes rm follow the symlink and wipe the original data/. Use git worktree remove ../etl-<branch> to clean up instead.
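If you ever need to detach a shared data folder by hand, it helps to check first whether data is a symlink. The sketch below reproduces the shared layout in a scratch directory and removes only the link. The paths are hypothetical, and using ln -s to model what --share-data does is an assumption.

```shell
# Scratch layout standing in for the original checkout and a worktree (hypothetical paths).
scratch="$(mktemp -d)"
mkdir -p "$scratch/etl/data" "$scratch/etl-my-feature"
echo "expensive result" > "$scratch/etl/data/step.txt"

# Share the data folder: the worktree's data is a symlink to the original's.
ln -s "$scratch/etl/data" "$scratch/etl-my-feature/data"

# Check before deleting anything: -L is true only for a symlink.
if [ -L "$scratch/etl-my-feature/data" ]; then
    # No -r and no trailing slash: this removes the link, not the target.
    rm "$scratch/etl-my-feature/data"
fi

ls "$scratch/etl/data"   # step.txt is still there
```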