Project content¶

There are various folders and directories in the etl project. In this entry we summarize the most important ones.

You don't need to understand all of them to get started, but as you work more with the ETL you will likely find yourself exploring some of these folders.

Folders¶

Name	Description
`api/`	API to ETL for internal uses like updating metadata, search, etc.
`apps/`	Apps built around and for ETL management. Some include `wizard`, `backport`, `fasttrack`, etc.
`dag/`	Contains the dataset dependencies. That is, if `dataset A` needs `dataset B` to be up to date, this should be listed here.
`data/`	When you run the recipe code for a dataset, the dataset will be created under this directory. Note that not all the content from this directory is added to git.
`docs/`, `.readthedocs.yaml`	Project documentation config files and directory.
`etl/`	This is home to our ETL library. This is where all the recipes to generate our datasets live.
`export/`	Similar to `data/` but for `export` steps.
`lib/`	Other OWID sub-packages.
`owid_mcp/`	OWID's MCP server code.
`schemas/`	Metadata schemas for ETL datasets.
`scripts/`	Various scripts.
`snapshots/`	This is the entry point to ETL. This folder contains metadata and code to get external data and import it to our pipeline.
`tests/`	ETL library tests.
`vscode_extensions/`	In-house VS Code extensions.
`.claude/`, `.github`, `.streamlit`, `.vscode`	Config for various tools.
`Makefile`, `default.mk`	`make`-related files.

Files¶

Name	Description
`.env.example`	Example environment variables file.
`AGENTS.md`, `CLAUDE.md`	Documentation for AI agents used in ETL.
`Makefile`, `default.mk`	`make`-related files.
`README.md`	Main readme file.
`pyproject.toml`, `uv.lock`
`zensical.toml`	Config for Zensical documentation tool.