One .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosted APIs.
Updated Jul 19, 2024 - C#
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Official implementation for the paper *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
FlashInfer: Kernel Library for LLM Serving
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
ChatAlice is a robust, cross-platform desktop application for macOS, Windows, and Linux. It supports API integration with major large language models (LLMs), including ChatGPT, Claude, and others.
The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
Sparsity-aware deep learning inference runtime for CPUs
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
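Multi-LoRA servers of this kind typically expose an HTTP generate endpoint where the fine-tuned adapter is selected per request. Below is a minimal sketch of building such a request body; the field names (`inputs`, `parameters.adapter_id`) follow LoRAX's documented REST API, but treat the exact endpoint and names as assumptions to verify against the server's docs:

```python
import json

def build_generate_request(prompt: str, adapter_id: str, max_new_tokens: int = 64) -> str:
    """Build a JSON body that selects a LoRA adapter per request.
    Field names are assumed from LoRAX's REST API docs."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "adapter_id": adapter_id,        # which fine-tuned LoRA to apply
            "max_new_tokens": max_new_tokens,
        },
    }
    return json.dumps(payload)

body = build_generate_request("Summarize: ...", "acme/support-lora")
print(body)
```

Because the adapter is chosen in the request rather than at server start-up, one base model deployment can serve thousands of fine-tunes.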
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
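The RAG piece of a stack like this reduces to nearest-neighbor search over embedding vectors. A toy stdlib-only sketch of cosine-similarity retrieval (the document names and embedding values here are made up; a real deployment would compute embeddings with a model and store them in a vector DB):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical 3-dimensional document embeddings, purely for illustration.
docs = {
    "intro.md": [0.9, 0.1, 0.0],
    "api.md":   [0.1, 0.9, 0.2],
    "faq.md":   [0.2, 0.2, 0.9],
}

def retrieve(query_vec, k=1):
    """Return the k document names most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:k]

top = retrieve([0.15, 0.85, 0.1])
print(top)  # the query vector is closest to api.md's embedding
```

The retrieved documents are then pasted into the LLM prompt as context, which is the "retrieval-augmented" part of RAG.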
Multi-modal Chatbot based on OpenAI
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
🌱 EcoLogits tracks the energy consumption and environmental footprint of using generative AI models through APIs.
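Footprint trackers of this kind generally multiply token counts by per-token energy factors and a grid carbon intensity. A back-of-the-envelope sketch with made-up constants (the real library derives its factors from model size and hardware; these numbers are purely illustrative, not EcoLogits' actual methodology):

```python
# Purely illustrative constants -- NOT EcoLogits' actual factors.
ENERGY_PER_OUTPUT_TOKEN_KWH = 3e-6   # hypothetical kWh per generated token
CARBON_INTENSITY_KG_PER_KWH = 0.4    # hypothetical grid carbon intensity (kg CO2e/kWh)

def estimate_footprint(output_tokens: int):
    """Return (energy in kWh, emissions in kg CO2e) for a completion."""
    energy_kwh = output_tokens * ENERGY_PER_OUTPUT_TOKEN_KWH
    co2_kg = energy_kwh * CARBON_INTENSITY_KG_PER_KWH
    return energy_kwh, co2_kg

# A 1000-token completion under these assumed constants:
energy, co2 = estimate_footprint(1000)
```

Tracking happens per API call, so totals can be aggregated across an application's whole generative-AI usage.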