Microsoft and MIT Researchers Hope to Reduce AI Hallucinations with DoLa

ODSC - Open Data Science
Sep 29, 2023

In the realm of NLP, large language models have played a pivotal role in how we interact with text data. Despite significant advancements, the problem of “hallucinations” persists: models generating information that is inconsistent with real-world facts.

According to a new paper from a group of researchers at MIT and Microsoft, a new approach may help reduce instances of AI hallucinations. Part of what makes the problem so pressing is the real-world harm a confident but false answer can cause.

This is especially concerning for applications aimed at the clinical and legal industries, where strict regulatory and compliance standards mean that hallucinations can create serious legal and liability issues.

This is where DoLa comes into play. Short for Decoding by Contrasting Layers, DoLa prioritizes information from deeper layers of the model while downplaying that from intermediate or shallower ones. This contrastive decoding approach enhances the LLM’s factual knowledge without the need for external knowledge retrieval or further fine-tuning.
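To make the idea concrete, here is a minimal sketch of what contrasting layers can look like in code. This is not the paper’s implementation: it assumes a Hugging Face LLaMA-style model, hard-codes a single premature layer (DoLa selects it dynamically per token), and the model name and `alpha` threshold are illustrative placeholders.

```python
# Minimal sketch of layer-contrastive decoding in the spirit of DoLa.
# Assumptions (not from the article): a LLaMA-style causal LM where
# `model.model.norm` is the final RMSNorm, a fixed premature layer,
# and an illustrative plausibility threshold `alpha`.
import math

import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-7b"  # placeholder checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def dola_next_token(input_ids, premature_layer=16, alpha=0.1):
    out = model(input_ids, output_hidden_states=True)
    # Mature distribution: the model's ordinary final-layer logits.
    log_mature = F.log_softmax(out.logits[:, -1, :], dim=-1)
    # Premature distribution: "early exit" from an intermediate layer,
    # passed through the final norm and the shared LM head.
    h = out.hidden_states[premature_layer][:, -1, :]
    log_premature = F.log_softmax(model.lm_head(model.model.norm(h)), dim=-1)
    # Keep only tokens the mature layer already finds plausible
    # (p >= alpha * max p), then score by the contrast of the two layers.
    plausible = log_mature >= log_mature.max(dim=-1, keepdim=True).values + math.log(alpha)
    scores = torch.where(plausible, log_mature - log_premature,
                         torch.full_like(log_mature, float("-inf")))
    return scores.argmax(dim=-1)  # greedy pick of the contrasted score

# Example: greedily decode a few tokens with the contrasted scores.
ids = tok("The capital of Washington state is", return_tensors="pt").input_ids
for _ in range(5):
    ids = torch.cat([ids, dola_next_token(ids).unsqueeze(0)], dim=-1)
print(tok.decode(ids[0]))
```

The intuition behind the contrast is that factual knowledge tends to surface in the later layers, so subtracting an earlier layer’s distribution amplifies tokens the deeper layers are adding, while the plausibility filter keeps the subtraction from promoting nonsense tokens.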

So far, the experimental results look promising. DoLa has been shown to improve the truthfulness of LLMs such as LLaMA on the TruthfulQA and FACTOR benchmarks. Moreover, experiments on chain-of-thought reasoning with StrategyQA and GSM8K hint at its potential to substantially improve factual reasoning.

But it gets more interesting. When evaluated with GPT-4 as a judge on open-ended text generation, DoLa produces informative and significantly more factual responses, earning higher ratings than traditional decoding methods. What’s more, it adds only a minimal amount of time to the decoding process, making it a practical and efficient solution.

Though the research looks promising, the paper notes some limitations: the team did not extend their tests to other domains, such as instruction following or responding to human feedback. Their approach also relies solely on the model’s preexisting architecture and parameters, limiting the scope of possible enhancements.

Unlike retrieval-augmented LLMs, DoLa leans entirely on the model’s existing knowledge, refraining from adding new information through external retrieval modules. The researchers express hope that future work will integrate such components, potentially overcoming these limitations.

If you’re interested, you can find the GitHub for DoLa here.

Originally posted on OpenDataScience.com

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform. Interested in attending an ODSC event? Learn more about our upcoming events here.
