Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

Overview

In this paper, we propose HaluAgent, an autonomous LLM-based agent framework that enables relatively small LLMs (e.g., Baichuan2-Chat 7B) to actively select suitable tools for detecting hallucinations in multiple kinds of content, such as text, code, and mathematical expressions. HaluAgent integrates the LLM with a multi-functional toolbox and uses a fine-grained three-stage detection framework together with a memory mechanism. To train HaluAgent, we leverage existing Chinese and English datasets to synthesize detection trajectories for fine-tuning, which gives HaluAgent the capability for bilingual hallucination detection.

Trajectory Generation

We employ GPT-4 to generate hallucination detection trajectories following the HaluAgent framework. HaluAgent first segments the input text into a set of semantically complete sentences, then selects tools to check each sentence individually, and finally reflects on the detection results to correct mistakes. To support this process, we use a memory mechanism to store useful information such as historical detection trajectories and current detection results. In total, 2017 trajectories are generated. The code for generating trajectories can be found in haluagent/generation.

cd haluagent/generation
python traj_generate.py
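
The three-stage process described above (segment, check each sentence with a selected tool, reflect) can be illustrated with a small sketch. This is not the code in traj_generate.py. It is a toy stand-in under stated assumptions: in HaluAgent the LLM chooses the tool, whereas here a simple rule does; the names Memory, TOOLBOX, select_tool, and detect are hypothetical; the "knowledge" tool is a hard-coded lookup instead of a real search engine; and reflection is reduced to re-reading the stored results and aggregating them.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Memory:
    """Hypothetical memory: one record per checked sentence."""
    results: List[dict] = field(default_factory=list)

    def add(self, sentence: str, tool: str, hallucinated: Optional[bool]) -> None:
        self.results.append(
            {"sentence": sentence, "tool": tool, "hallucinated": hallucinated}
        )


def split_sentences(text: str) -> List[str]:
    """Stage 1: naive segmentation on sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def check_arithmetic(sentence: str) -> Optional[bool]:
    """Toy calculator tool: verifies integer claims such as '2 + 2 = 5'."""
    m = re.search(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)", sentence)
    if m is None:
        return None  # nothing this tool can verify
    a, op, b, claimed = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
    actual = {"+": a + b, "-": a - b, "*": a * b}[op]
    return actual != claimed  # True means the claim is wrong


# Assumption: a tiny hard-coded fact table stands in for a search tool.
KNOWN_FACTS: Dict[str, bool] = {"paris is the capital of france": True}


def check_fact(sentence: str) -> Optional[bool]:
    """Toy knowledge tool: returns None when the claim is not in the table."""
    key = sentence.strip().rstrip(".!?").lower()
    if key not in KNOWN_FACTS:
        return None
    return not KNOWN_FACTS[key]


TOOLBOX: Dict[str, Callable[[str], Optional[bool]]] = {
    "calculator": check_arithmetic,
    "knowledge_lookup": check_fact,
}


def select_tool(sentence: str) -> str:
    """Stage 2 (selection): a rule replaces the LLM's tool choice."""
    if "=" in sentence and any(ch.isdigit() for ch in sentence):
        return "calculator"
    return "knowledge_lookup"


def detect(text: str) -> dict:
    memory = Memory()
    for sentence in split_sentences(text):
        tool_name = select_tool(sentence)
        memory.add(sentence, tool_name, TOOLBOX[tool_name](sentence))
    # Stage 3 (simplified reflection): review stored results and aggregate.
    flagged = [r["sentence"] for r in memory.results if r["hallucinated"]]
    return {"hallucinated": bool(flagged), "flagged": flagged, "trace": memory.results}
```

Calling detect("Paris is the capital of France. We know 2 + 2 = 5.") flags only the second sentence. Sentences that no tool can verify are recorded with a None verdict rather than being counted as hallucinated, which keeps "unverified" distinct from "wrong" in the stored trace.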

Fine-tune

We fine-tune Baichuan2-Chat to obtain HaluAgent. The fine-tuning code can be found in haluagent/fine-tune.

cd haluagent/fine-tune
bash run.sh

Evaluation

HaluAgent can perform hallucination detection on various types of tasks and datasets. We conduct experiments on both in-domain and out-of-domain datasets. Below are the experimental results.

The evaluation code and datasets are included in this repository. To run detection:

cd haluagent/evaluation
python traj_detection.py --model_path [model_path] --input [input_file] --output [output_file]
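
Once detection has produced a hallucinated / not-hallucinated verdict per sample, the verdicts can be scored against gold labels. The text above does not state which metrics are reported or how the output file is structured, so the following is only a sketch under assumptions: it uses accuracy and F1, which are common for binary detection, and it takes plain Python lists of booleans rather than reading the actual output file. The function name detection_metrics is hypothetical.

```python
from typing import Dict, Sequence


def detection_metrics(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1, with 'hallucinated' as the positive class."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)

    total = len(gold)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For example, detection_metrics([True, True, False, False], [True, False, False, True]) gives one sample in each confusion-matrix cell, so all four values are 0.5.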

Citation

If you find our work helpful for your research, please consider citing it.

@article{cheng2024small,
  title={Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector},
  author={Cheng, Xiaoxue and Li, Junyi and Zhao, Wayne Xin and Zhang, Hongzhi and Zhang, Fuzheng and Zhang, Di and Gai, Kun and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2406.11277},
  year={2024}
}
