OpenAI's widely used ChatGPT has become notorious for generating fake references. Experts have discouraged users from relying on the popular chatbot to write academic papers or for any task that depends on factual accuracy.

Why ChatGPT Can't Reveal Its Sources and Sometimes Generates Fake References

Experts from Duke University explained that ChatGPT, built upon a large language model (LLM), cannot match relevant sources to a given topic.

In other words, ChatGPT has only limited, built-in knowledge of the sources available on a particular topic, which makes it prone to "hallucination," or generating false information.

However, experts noted that ChatGPT is not entirely untrustworthy and can provide useful information on some topics, although it may still cite sources that do not exist.

Figurines next to the ChatGPT logo, photographed in Mulhouse, eastern France, on October 19, 2023. (Photo: SEBASTIEN BOZON/AFP via Getty Images)

RAGE: New Tool to Unveil ChatGPT's Sources

A team of researchers from the University of Waterloo wanted to shed light on ChatGPT's answers by revealing its sources through a new tool called "RAGE."

RAGE aims to unveil the origins of the data used by large language models such as ChatGPT and to help users validate the accuracy and reliability of the information they generate.

The research team noted that LLMs are built upon the foundation of "unsupervised deep learning," which combines data gleaned from diverse online sources.

This presents a conundrum for developers and users alike when it comes to verifying the trustworthiness of their output. Moreover, LLMs are vulnerable to hallucination, fabricating content grounded in non-existent concepts and references.

Joel Rorseth, a doctoral candidate in computer science at the University of Waterloo and the primary investigator of the research, underscored the skepticism surrounding LLMs' capacity to furnish accurate explanations, noting that they often resort to fabricated justifications or citations.

The new tool devised by Rorseth's team leverages the "retrieval-augmented generation" (RAG) technique to contextualize the responses provided by LLMs, enabling users to introduce supplementary sources for comparative analysis. 
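To illustrate the general RAG pattern the tool builds on, here is a minimal sketch in Python. The toy corpus, the bag-of-words retrieval, and the prompt format are illustrative assumptions, not the Waterloo team's implementation; real systems typically retrieve with learned embeddings over large document stores before passing the sources to the model.

```python
# A minimal, illustrative sketch of retrieval-augmented generation (RAG).
# The corpus, scoring method, and prompt format are assumptions for
# demonstration only; they do not reflect the RAGE tool's internals.
from collections import Counter
import math

CORPUS = [
    "ChatGPT is a chatbot built on a large language model.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
    "LLMs can hallucinate citations that do not exist.",
]

def bag_of_words(text: str) -> Counter:
    """Tokenize crudely into lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(CORPUS, key=lambda doc: cosine(q, bag_of_words(doc)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved, numbered sources so both the model and the
    user can see exactly which documents ground the answer."""
    sources = retrieve(query)
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

print(build_prompt("Why do LLMs fabricate citations?"))
```

Because the retrieved sources appear explicitly in the prompt, a user can check each cited passage against the model's answer, which is the kind of comparative analysis the Waterloo tool enables.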


RAGE Against the Machine

The tool is dubbed "RAGE against the machine," both in defiance of LLMs' opaque nature and because it focuses on retrieval-augmented generation explainability.

It aims to play a pivotal role in evaluating the dependability of information disseminated by LLMs, particularly in sectors like healthcare and law, where reliance on such technologies is growing.

Rorseth emphasized the need for stringent regulatory measures amid the swift technological progressions, underscoring the necessity of ensuring the safety, credibility, and dependability of these AI technologies. 

"We're in a place right now where innovation has outpaced regulation. People are using these technologies without understanding their potential risks, so we need to make sure these products are safe, trustworthy, and reliable," Rorseth said in a statement

The research team's findings were published on the preprint server arXiv.

