Maintenance

We have created a Python library that provides easy access to our large data catalog. It also supports our ETL work, as it contains various methods and objects essential to our data wrangling processes.

Currently, this library lives in the etl repository, under lib/catalog.

Installation

Simply install it from PyPI:

pip install owid-catalog
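To confirm that the install worked, you can ask Python which version of the distribution it sees. The snippet below is a minimal sketch that uses only the standard library. The helper name `installed_version` is made up for this example and is not part of the library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "owid-catalog") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for this example only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "owid-catalog is not installed in this environment")
```

Run it in the same environment where you ran pip. A `None` result usually means pip installed the package into a different interpreter.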

Update release

Once your changes to the library are ready, publishing to PyPI is automated:

  1. Bump the version in lib/catalog/pyproject.toml
  2. Update the changelog in lib/catalog/README.md
  3. Commit and push to master. The package is then published to PyPI automatically via GitHub Actions

The workflow triggers automatically when lib/catalog/pyproject.toml changes on the master branch. It includes a safety check to ensure the version was actually bumped before publishing.
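The idea behind such a safety check can be sketched as follows. This is an illustration, not the workflow's actual implementation. It assumes plain dotted numeric versions such as `1.2.3` and reads the first `version = "..."` line with a regular expression. A real TOML parser (for example `tomllib` in Python 3.11+) would be more robust. The function names and the version numbers in the usage lines are hypothetical.

```python
import re
from typing import Tuple

# Simplification: match the first `version = "..."` line instead of parsing TOML.
_VERSION_LINE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)


def read_version(pyproject_text: str) -> str:
    """Extract the version string from the text of a pyproject.toml file."""
    match = _VERSION_LINE.search(pyproject_text)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_bumped(old_pyproject: str, new_pyproject: str) -> bool:
    """Return True only if the new file declares a strictly higher version."""
    old = parse_version(read_version(old_pyproject))
    new = parse_version(read_version(new_pyproject))
    return new > old


# Usage with made-up file contents and version numbers
old = '[project]\nname = "owid-catalog"\nversion = "1.0.0"\n'
new = old.replace("1.0.0", "1.0.1")
print(is_bumped(old, new))  # True
print(is_bumped(old, old))  # False
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.10.0"` sorts before `"1.9.0"`, which would wrongly reject a valid bump.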

Manual trigger: If needed, you can still run the workflow by hand by clicking Run Workflow in GitHub Actions.

Generate llms.txt

The library ships an llms.txt file (at docs/libraries/catalog/llms.txt) that is auto-generated from module docstrings and documentation markdown files. To regenerate it after changing docstrings or docs:

make docs.llms

This runs docs/ignore/others/bake_llms_txt.py, which inspects the public API surface and doc files so the output stays in sync with the codebase.
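To give a feel for what "inspecting the public API surface" can look like, here is a minimal sketch built on the standard `inspect` module. It is not the contents of bake_llms_txt.py. The function `public_api_summary` and its output format are assumptions made for illustration, and the standard `json` module stands in for the library's own modules.

```python
import inspect
import json  # stand-in module for the demo; the real script targets the library's modules


def public_api_summary(module) -> str:
    """List a module's public functions and classes with their first docstring line."""
    # Prefer the explicit public surface if the module declares one.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [name for name in dir(module) if not name.startswith("_")]

    lines = [f"# {module.__name__}", ""]
    for name in sorted(names):
        obj = getattr(module, name, None)
        # Skip constants, submodules, and anything else that is not a function or class.
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        doc = inspect.getdoc(obj) or ""
        first_line = doc.splitlines()[0] if doc else "(no docstring)"
        lines.append(f"- {name}: {first_line}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(public_api_summary(json))
```

Because the summary is derived from live docstrings, regenerating the file after each docstring change keeps the two from drifting apart.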