GitHub fetcher
A specialized fetcher in %puck.sources that recognizes GitHub URLs and handles the GitHub-specific shape (web URL → raw URL translation). Most publishers will host classes on GitHub; this fetcher makes that work without ceremony.
vibecode
{"vibecode": { "doc": "github_fetcher", "role": "spec sketch for a specialized fetcher that handles GitHub-hosted classes — recognizes github.com and raw.githubusercontent.com URLs, translates web URLs to raw URLs, fetches the bytes", "status": "sketch", "key_concepts": ["specialized_fetcher_for_github", "web_url_to_raw_url_translation", "participates_in_puck_sources_chain"], "related": ["requirements/caspian/downloads/ (fetcher framework)", "requirements/caspian/downloads/caching/ (where verified bytes land)", "requirements/caspian/downloads/blockchain/ (signature source)"] }}
What it does
The GitHub fetcher recognizes two URL shapes:
- Web URL (what humans share): https://github.com/USER/REPO/blob/REF/PATH
- Raw URL (where the bytes live): https://raw.githubusercontent.com/USER/REPO/REF/PATH
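The two shapes above could be recognized with a pair of patterns. This is only a minimal sketch: the names WEB_RE, RAW_RE, and classify are hypothetical and not part of any existing %puck.sources interface. REF and PATH are matched together as one opaque tail, on the assumption that the fetcher never needs to split them (a ref may itself contain slashes).

```python
import re
from typing import Optional

# Hypothetical patterns for illustration. USER and REPO are single path
# segments; everything after them (REF/PATH) is kept as one opaque tail.
WEB_RE = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/blob/(.+)$")
RAW_RE = re.compile(r"^https://raw\.githubusercontent\.com/([^/]+)/([^/]+)/(.+)$")


def classify(url: str) -> Optional[str]:
    """Return "web", "raw", or None when the URL is neither recognized form."""
    if WEB_RE.match(url):
        return "web"
    if RAW_RE.match(url):
        return "raw"
    return None
```

A URL that returns None here would presumably fall through to whichever fetcher comes next in the chain.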
When the fetcher matches a %puck[url] lookup against either form:
- If the URL is the web form, translate it to the raw form (drop the /blob/ segment, swap the host).
- Fetch the bytes from the raw URL via HTTPS.
- Return the bytes to the engine for verification (per the cache verification flow).
The signature check is the engine's concern, not the fetcher's. The fetcher just hands over bytes.
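The translate-then-fetch flow above might look roughly like the following. It is a sketch under stated assumptions, not the fetcher's actual interface: to_raw_url, http_get, and fetch are hypothetical names, and the HTTP call is injectable so the flow can be exercised without network access. No verification happens here, matching the division of labor described above.

```python
from typing import Callable
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

RAW_HOST = "raw.githubusercontent.com"


def to_raw_url(url: str) -> str:
    """Translate a GitHub web URL to its raw form; pass raw URLs through."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"not an HTTPS URL: {url}")
    if parts.netloc == RAW_HOST:
        return url  # already the raw form
    if parts.netloc != "github.com":
        raise ValueError(f"not a GitHub URL: {url}")
    # Path segments look like ['', USER, REPO, 'blob', REF..., PATH...].
    segments = parts.path.split("/")
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a /blob/ web URL: {url}")
    del segments[3]  # drop the /blob/ segment
    return urlunsplit(("https", RAW_HOST, "/".join(segments), "", ""))


def http_get(url: str) -> bytes:
    """Plain HTTPS GET using the standard library."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch(url: str, get: Callable[[str], bytes] = http_get) -> bytes:
    """Return unverified bytes; signature checking is left to the engine."""
    return get(to_raw_url(url))
```

Because the translation only removes one segment and swaps the host, it never has to decide where REF ends and PATH begins.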
Why a specialized fetcher
A generic HTTPS fetcher could already handle raw.githubusercontent.com URLs — the bytes are just served over HTTPS. The reason to specialize:
- Web URL transparency. Developers bookmark github.com/... URLs and share them in docs, READMEs, examples. A specialized fetcher makes those URLs loadable directly via %puck, without forcing publishers to communicate the raw URL form separately.
- GitHub-specific affordances later on (not in this initial sketch): authentication for private repos, GitHub API for "latest commit on this branch" resolution, rate-limit-aware retries, LFS handling.
Open
Sketch-level for now. Things to settle as the design matures:
- Authentication for private repos (API tokens? OAuth? configured via what surface?).
- Whether the fetcher uses the GitHub API at all, or just hits raw.githubusercontent.com directly.
- Rate-limit handling — GitHub throttles unauthenticated requests; what does the fetcher do when it hits a 429?
- Handling of LFS-stored artifacts (the raw URL returns an LFS pointer, not the bytes).
- GitHub Pages URLs (USER.github.io/REPO/...): a separate URL space, out of scope for this fetcher; it probably needs its own treatment or a more general "static site" fetcher.
- Whether the fetcher recognizes other Git-hosting services (GitLab, Codeberg, Gitea) under their own variants, or whether each gets its own specialized fetcher.
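On the LFS item above: whatever the eventual policy, the fetcher would first need to notice that the bytes it received are a pointer rather than the artifact. The sketch below is one possible check, based on the Git LFS pointer file format (a small text file that starts with a version line and carries oid and size keys). The function name parse_lfs_pointer is hypothetical, and what to do once a pointer is detected remains an open question.

```python
from typing import Optional

# First line of a Git LFS pointer file, per the Git LFS pointer spec.
LFS_VERSION_LINE = b"version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(data: bytes) -> Optional[dict]:
    """Return {"oid": ..., "size": ...} if data is an LFS pointer, else None."""
    # Pointer files are small text files (under 1024 bytes per the spec).
    if len(data) >= 1024 or not data.startswith(LFS_VERSION_LINE + b"\n"):
        return None
    fields = {}
    for line in data.decode("utf-8", errors="replace").splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return {"oid": fields["oid"], "size": int(fields["size"])}
```

A None result means the bytes are treated as the artifact itself and handed to the engine as usual.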
See also
- Downloads (parent) — the fetcher framework this plugs into.
- Caching — where bytes land after the fetcher returns them, and where signature verification happens.
- Blockchain — the signature source the engine uses to verify whatever this fetcher returns.