Web Science and Digital Libraries Research Group

Showing posts with the label ChatBot

2025-05-06: Part 3 - Large Language Models (LLMs) are hallucinating in Arabic about the Quran (DeepSeek)

By Hussam Hallak - May 06, 2025

Introduction

By the time I finished trying to reproduce the results in the paper I reviewed in Part 1 and Part 2 of this blog post, DeepSeek had released its first free chatbot app, DeepSeek-V3, on January 20th, 2025. I could not fight the urge to see how it compares to OpenAI's ChatGPT and Google Gemini. The purpose of this experiment is to provide a disproof by counterexample of the popular belief that LLMs can produce error-free answers to questions. I am using prompts to find Arabic verses in the Quran on misinformation. I repeated the same experiment I did with Google Gemini and ChatGPT-4o in January 2025; the results were not better. In addition to DeepSeek being slower, I kept getting the annoying message "The server is busy. Please try again later.", which I did not try to find a solution for because the service was restored when I waited and tried again later. For prompts tested in the paper, DeepSeek's answers to the first prompt (...

2025-05-06: Part 2 - Large Language Models (LLMs) are hallucinating in Arabic about the Quran (OpenAI ChatGPT-4o)

By Hussam Hallak - May 06, 2025

Introduction

I discussed LLMs' hallucinations and demonstrated their inability to correctly cite Quranic verses, using Google Gemini as an example, in Part 1 of this blog post. Google Gemini's answers to different prompts included incorrect Quranic citations, as well as phrases that do not exist in the Quran but were attributed to specific verses. I also explained the importance of correctly citing and presenting Quranic verses in Arabic. Here, in Part 2, I present a comparison between Google Gemini and OpenAI's ChatGPT. Using the same prompts, ChatGPT-4o produced hallucinations similar to those of Google Gemini. Again, misinformation remains one of the biggest challenges compromising LLMs' credibility when quoting verses from the Quran in Arabic. Experts in Classical Arabic and those who memorize the Quran word for word can identify errors in ChatGPT's output; however, less educated people will have difficulty distinguishing between correct and incorrect answers ...

2025-05-06: Part 1 - Large Language Models (LLMs) are hallucinating in Arabic about the Quran (Google Gemini)

By Hussam Hallak - May 06, 2025

Introduction

Large Language Models (LLMs) have been omnipresent in the past few years and everyone, in one way or another, is using them. The most popular chatbots, like OpenAI’s ChatGPT or Google’s Gemini, use LLMs to generate text. Although the technology doesn't seem harmful to the casual eye, some of its applications have already been used to plagiarize, solve homework assignments, and even write research papers. These are valid reasons to be careful about possible misuses of LLMs, but the technology is here to stay. What we should worry about is the truthfulness of LLM-generated content. I believe that misinformation is one of the biggest challenges for LLMs because if users who consume the output of LLMs believe it to be fact, the consequences could be disastrous. The nature of LLMs (being trained on enormous amounts of data from the internet and other sources) makes them susceptible to being loade...