CivAgent is an LLM-based Human-like Agent acting as a Digital Player within the Strategy Game Unciv.
A framework for automatically manipulating and evaluating the political ideology of LLMs with two ideology tests: Wahl-O-Mat and Political Compass Test.
A prompt collection for testing and evaluation of LLMs.
The prompt engineering, prompt management, and prompt evaluation tool for Java.
An evaluation dataset comprising 274 grid-based puzzles of varying complexity.
Use LLMs for web scraping (data collection).
Trained Without My Consent (TraWiC): Detecting Code Inclusion In Language Models Trained on Code
This repository contains the lab work for Coursera course on "Generative AI with Large Language Models".
A compilation of referenced benchmark metrics to evaluate different aspects of knowledge for Large Language Models.
The prompt engineering, prompt management, and prompt evaluation tool for Go.
Dive into the world of LLM guardrails using tools like NVIDIA's NeMo Guardrails. Discover the mechanisms that ensure applications produce reliable, robust, safe, and ethical outputs, and understand their crucial role in LLM-based systems.
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and NodeJS.
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
The prompt engineering, prompt management, and prompt evaluation tool for Kotlin.
Upload, score, and visually compare multiple LLM-graded summaries simultaneously!
Benchmark LLMs' abilities to plan, strategize, and reason by making them play chess against each other.
CS120.AI: an Angular- and Django-based chatbot designed for courses at Old Dominion University. Uses the Hugging Face Transformers library to fine-tune Llama 2 and a RAG-based method to query course data stored in a Pinecone vector database.