billytrend/github-nginx-cache

Github nginx cache

This repo contains an nginx configuration tuned to sit in front of GitHub endpoints and cache their responses. GitHub does not count a conditional request against its rate limit when the response is 304 Not Modified. The proxy_cache_* nginx directives force nginx to revalidate cached content with the upstream server (in this case, GitHub). nginx performs that revalidation as a conditional request, so revalidating unchanged content does not consume the API rate limit. This works for both authenticated and unauthenticated requests.
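To make the revalidation idea concrete, here is a toy model in TypeScript. It is only a sketch: `FakeUpstream` and `RevalidatingCache` are hypothetical names, the quota of 60 is arbitrary, and nothing here talks to GitHub or nginx. The fake upstream charges quota for a full response but not for a matching conditional request, which is the behaviour described above.

```typescript
// Toy model only: neither GitHub nor nginx is involved. All names are hypothetical.

interface UpstreamResponse {
  status: 200 | 304;
  etag: string;
  body?: string;
}

// Stand-in for a rate-limited upstream that honours If-None-Match.
class FakeUpstream {
  remaining: number;
  private readonly etag = '"v1"';
  private readonly body = '{"name":"github-nginx-cache"}';

  constructor(limit: number) {
    this.remaining = limit;
  }

  get(ifNoneMatch?: string): UpstreamResponse {
    if (ifNoneMatch === this.etag) {
      // Matching conditional request: 304, no quota consumed.
      return { status: 304, etag: this.etag };
    }
    this.remaining -= 1; // Full response: costs one unit of quota.
    return { status: 200, etag: this.etag, body: this.body };
  }
}

// Stand-in for the proxy: stores one entry and revalidates it on every request.
class RevalidatingCache {
  private readonly upstream: FakeUpstream;
  private entry?: { etag: string; body: string };

  constructor(upstream: FakeUpstream) {
    this.upstream = upstream;
  }

  get(): string {
    const res = this.upstream.get(this.entry?.etag);
    if (res.status === 200 && res.body !== undefined) {
      this.entry = { etag: res.etag, body: res.body };
    }
    if (this.entry === undefined) {
      throw new Error("304 received with an empty cache");
    }
    // On 304 the stored body is still current and is served as-is.
    return this.entry.body;
  }
}

const upstream = new FakeUpstream(60); // arbitrary quota for the model
const cache = new RevalidatingCache(upstream);
const bodies = [cache.get(), cache.get(), cache.get()];
```

In this model, three requests through the cache cost one unit of quota: the first fetches the full response, and the next two are answered with 304.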

The following example shows how rate limiting is mitigated for unauthenticated requests, comparing https://api.github.com with the cache running on http://localhost:8000.

[Figure: Rate limiting example]

Quick Start

docker run -d -p 8000:80 ghcr.io/billytrend/github-nginx-cache
curl localhost:8000/api/repos/billytrend/github-nginx-cache

The github domains are mapped as follows:

| GitHub URL                  | Cache URL                 |
| --------------------------- | ------------------------- |
| api.github.com/*            | localhost:8000/api/*      |
| raw.githubusercontent.com/* | localhost:8000/raw/*      |
| codeload.github.com/*       | localhost:8000/codeload/* |
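
For clients that need to rewrite existing GitHub URLs, the mapping above can be sketched as a small TypeScript helper. `toCacheUrl` and `PREFIX_BY_HOST` are hypothetical names and are not part of this repo. The default base URL assumes the cache is running as in the Quick Start.

```typescript
// Host-to-prefix pairs copied from the mapping table in this README.
const PREFIX_BY_HOST = new Map<string, string>([
  ["api.github.com", "/api"],
  ["raw.githubusercontent.com", "/raw"],
  ["codeload.github.com", "/codeload"],
]);

// Hypothetical helper: rewrites a GitHub URL to its equivalent on the cache.
function toCacheUrl(githubUrl: string, cacheBase = "http://localhost:8000"): string {
  const url = new URL(githubUrl);
  const prefix = PREFIX_BY_HOST.get(url.hostname);
  if (prefix === undefined) {
    throw new Error(`No cache mapping for host: ${url.hostname}`);
  }
  // Keep the path and query string; only the origin and prefix change.
  return `${cacheBase}${prefix}${url.pathname}${url.search}`;
}
```

For example, `toCacheUrl("https://api.github.com/repos/billytrend/github-nginx-cache")` yields the URL used in the Quick Start curl command, with the `http://` scheme made explicit.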

Develop

Build

docker build .

Debug

Fish

docker build -t custom-nginx . && docker run -it -p 8000:80 -v (pwd)/nginx-logs:/var/log/nginx custom-nginx
curl localhost:8000/health/alive

Bash

docker build -t custom-nginx . && docker run -it -p 8000:80 -v "$(pwd)/nginx-logs:/var/log/nginx" custom-nginx
curl localhost:8000/health/alive

Test

# Run image on localhost:8000
cd test
npm ci
npm run test

Implementation details

Github consistency

The cache is designed for the highest possible consistency with GitHub: it ignores any Cache-Control headers that GitHub sends and forces nginx to REVALIDATE on every request. A limitation in nginx means that the lowest value the proxy_cache_valid directive accepts is one second. As a result, the second of two identical requests made within one second will HIT (return the cached response without revalidating) rather than REVALIDATE.
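
The one-second window can be sketched as a small TypeScript function. This is a simplified model of the behaviour described above, not code from this repo: `expectedStatus` is a hypothetical name, the status strings follow the values nginx reports in `$upstream_cache_status`, and the handling of an age of exactly one second is an assumption.

```typescript
type CacheStatus = "MISS" | "HIT" | "REVALIDATED";

// One second: the smallest validity window described above.
const VALID_FOR_MS = 1000;

// Hypothetical model: which status to expect for a request arriving at nowMs,
// given when the entry was last stored or revalidated (undefined = never).
function expectedStatus(lastStoredAtMs: number | undefined, nowMs: number): CacheStatus {
  if (lastStoredAtMs === undefined) {
    return "MISS"; // nothing cached yet, so a full upstream request is made
  }
  // Inside the window the entry is served without contacting the upstream;
  // after it, the entry is revalidated with a conditional request.
  return nowMs - lastStoredAtMs < VALID_FOR_MS ? "HIT" : "REVALIDATED";
}
```

Under this model, a request 400 ms after the entry was stored is a HIT, while one arriving 1500 ms later is revalidated.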

Cache partitioning

The cache may be used in complex applications where multiple app and OAuth tokens are used to access GitHub. The default behaviour in this case is to partition the cache by token: a request with token A will not use any content cached by requests with token B.

This default exists for the following reasons:

  1. Security - there are edge cases in which using two tokens within one second of each other could leak a response to the second request, even if the second token is not allowed to access the resource.
  2. Preventing cache churn - if the cache were not partitioned, multiple requests to one API route with different tokens could cause cached entries to be evicted unnecessarily when those tokens have different access permissions.

There may be cases, however, where the client needs to override this behaviour. For example, a GitHub app token may expire every hour or so, in which case the default behaviour would effectively reset the cache every hour, which is not desirable.

When the X-Cache-Key header is set, the cache is partitioned on this arbitrary string rather than on the token.
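
The key selection can be modelled in a few lines of TypeScript. This is a sketch of the rule as described, not the nginx configuration itself: `partitionKey` is a hypothetical name, and it assumes the token travels in the Authorization header and that header names have already been lowercased.

```typescript
// Hypothetical model of how a request is assigned to a cache partition.
// Assumes lowercased header names and a token sent via the Authorization header.
function partitionKey(headers: Record<string, string | undefined>): string {
  const override = headers["x-cache-key"];
  if (override !== undefined && override !== "") {
    return override; // client-chosen partition takes precedence
  }
  // Default: partition by token; unauthenticated requests share the empty key.
  return headers["authorization"] ?? "";
}

// Two short-lived app tokens that share one partition via X-Cache-Key.
const firstHour = partitionKey({ authorization: "token A", "x-cache-key": "my-app" });
const secondHour = partitionKey({ authorization: "token B", "x-cache-key": "my-app" });
```

Here `firstHour` and `secondHour` are both `"my-app"`, so a rotated token keeps using the same cached content. Because the override bypasses the per-token isolation described above, it makes sense only for tokens with equivalent access.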
