
## prereqs

You need a supported version of Python (>= 3.10.12, < 3.12.0):

```bash
pyenv install 3.11.8
```

and make sure poetry is using it, usually:

```bash
poetry env use ~/.pyenv/versions/3.11.8/bin/python
```

## setup dependencies

```bash
poetry install
```

## run

```bash
poetry run python run.py
```

## test

We maintain three kinds of tests:

- HTTP-level tests (with both sync and async clients)
- OpenAI SDK tests (for components that don't require db authentication)
- astra-assistants client library tests for end-to-end functionality

```bash
poetry run pytest --disable-warnings
```
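As a rough illustration, an HTTP-level test can exercise the server through FastAPI's `TestClient`; the app import path and header name below are assumptions, not the repo's actual contract:

```python
# Minimal sketch of an HTTP-level test; import path and header are assumed.
from fastapi.testclient import TestClient

from impl.main import app  # assumed location of the FastAPI app object

client = TestClient(app)


def test_list_models():
    # Real tests pass genuine credentials; this header name is illustrative.
    response = client.get("/v1/models", headers={"api-key": "test-key"})
    assert response.status_code == 200
```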

The client library itself also has its own test suite.

## internals

The assistant-api-server repo is a Python server app that depends heavily on FastAPI, Pydantic, and the DataStax Python driver. It relies on LiteLLM for third-party LLM support, and we've been very happy with their responsiveness on GitHub as well as their ability to quickly add new models as the AI landscape evolves.
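To give a feel for what LiteLLM buys us (a sketch, not the repo's actual call sites), a single `completion` call can target any supported provider via the model string; the model name and key below are placeholders:

```python
import litellm

# One call signature for many providers; LiteLLM picks the backend from
# the model string. The model name and API key here are placeholders.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    api_key="sk-placeholder",
)
print(response.choices[0].message.content)
```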

The app is mostly stateless (with the exception of a db connection cache), and all authentication tokens and LLM provider configuration are passed as HTTP headers. The astra-assistants Python library makes it easy for users to just store these configurations as environment variables, and it takes care of the rest. We serve the app in production using uvicorn and scale it in Kubernetes using HPA.
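In FastAPI terms, that header-driven configuration looks roughly like the sketch below; the header names are hypothetical stand-ins, not the server's real contract:

```python
from fastapi import APIRouter, Header, HTTPException

router = APIRouter()


@router.get("/v1/models")
async def list_models(
    # Hypothetical header names; the real ones live in the impl/routes* code.
    astra_token: str | None = Header(default=None, alias="astra-api-token"),
    llm_api_key: str | None = Header(default=None, alias="api-key"),
):
    if astra_token is None:
        raise HTTPException(status_code=401, detail="missing astra-api-token")
    # ... look up (or create) the cached db connection for this token ...
    return {"object": "list", "data": []}
```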

The app consists of both generated and hand-written code. The generated code is based on OpenAI's openapi spec, generated with openapi-generator-cli from openapi-generator.tech, and mostly lives in the openapi_server directory. Leveraging the openapi spec was one of the first design decisions we made, and it was a no-brainer: OpenAI's spec is of very high quality (they use it to generate their SDKs), and using it ensures that the types for all the endpoints are built correctly and enforced by Pydantic.
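The generated models are ordinary Pydantic classes, so request bodies are validated before any handler code runs. The shape below is illustrative only; the real models come straight from the spec:

```python
from pydantic import BaseModel


# Illustrative shape only; the actual request models are generated
# from OpenAI's openapi spec and are considerably larger.
class CreateAssistantRequest(BaseModel):
    model: str
    name: str | None = None
    instructions: str | None = None
```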

We keep track of what version of the OpenAI openapi spec we're working with in `OPEN_API_SPEC_HASH`.

The hand-written code takes the method stubs from `openapi_server/apis` and implements them, using the types from `openapi_server/models` and `openapi_server_v1/models`, inside `impl/routes` and `impl/routes_v2`. The third-party LLM support is abstracted in `impl/services/inference_utils.py`, and the database interactions occur in `impl/astra_vector.py`. We collect throughput, duration, and payload size metrics and export them with a Prometheus exporter, exposed at a `/metrics` endpoint. The exporter is configured to use Prometheus's multi-process collector to support our multi-process uvicorn production deployment.
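A minimal sketch of such a multi-process-aware `/metrics` endpoint with `prometheus_client` (the route wiring here is illustrative, not the repo's exact code):

```python
from fastapi import FastAPI, Response
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    generate_latest,
    multiprocess,
)

app = FastAPI()


@app.get("/metrics")
def metrics() -> Response:
    # Each uvicorn worker writes its metrics to files under
    # PROMETHEUS_MULTIPROC_DIR; the collector aggregates them per scrape.
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return Response(generate_latest(registry), media_type=CONTENT_TYPE_LATEST)
```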

Finally, in the `tests` directory we have implemented tests and CI using both an HTTP client directly (originally generated by openapi-generator.tech and tweaked manually) and custom tests that use the OpenAI SDK and our astra-assistants library directly.

## v2 implementation

In `impl/main.py` we disambiguate between v1 and v2 OpenAI headers and route accordingly.
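The check keys off OpenAI's versioning header (v2 clients send `OpenAI-Beta: assistants=v2`); the middleware below is a hedged sketch of that dispatch, with assumed names:

```python
from fastapi import FastAPI, Request

app = FastAPI()


@app.middleware("http")
async def detect_api_version(request: Request, call_next):
    # Clients on the v2 Assistants API send "OpenAI-Beta: assistants=v2";
    # downstream handlers can branch on this flag. Names here are assumed.
    beta_header = request.headers.get("OpenAI-Beta", "")
    request.state.is_v2 = "assistants=v2" in beta_header
    return await call_next(request)
```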

## client library

The client OpenAI SDK wrapper lives in `client` and is implemented as a single-file Python script.
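Typical usage follows the library's patch-the-client pattern; the sketch below assumes provider credentials are already set as environment variables, and the model name is a placeholder:

```python
from openai import OpenAI

from astra_assistants import patch

# patch() points the standard OpenAI client at the assistant-api-server,
# picking up tokens and provider keys from environment variables.
client = patch(OpenAI())

assistant = client.beta.assistants.create(
    model="gpt-4o-mini",  # placeholder; any supported model string works
    instructions="You are a helpful assistant.",
)
print(assistant.id)
```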