The use of artificial intelligence (AI) tools, especially large language models (LLMs), presents a growing concern in the legal world. General-purpose models such as ChatGPT can fabricate court decisions or misrepresent existing ones, resulting in false citations, confusion in court and unjust trials. There have been numerous cases of lawyers citing nonexistent decisions, including in states like New York and California. In fact, a database organized by a researcher at HEC Paris business school suggests there have been over 360 cases in United States courts involving AI hallucinations, in which a model presents plausible-sounding information that is factually incorrect.
AI may give inaccurate information more often than we think. According to a study conducted by Stanford’s RegLab and the Institute for Human-Centered AI, research institutes dedicated to regulation and AI alignment, general-purpose models hallucinate anywhere from 69% to 88% of the time when prompted to answer specific legal questions. So while legal technologies utilizing AI can undoubtedly help improve efficiency in law, these tools can also introduce inaccuracies and deskilling if used incorrectly.
Hallucination is close to inseparable from LLMs. Because of the way models are trained, there is almost always a possibility an AI model could misrepresent facts or create false ones altogether. Gaps in training data, as well as reward systems used in reinforcement learning, are among the numerous complicated factors that lead to hallucination. Some techniques reduce rates of AI hallucinations, such as retrieval-augmented generation (RAG), in which an LLM retrieves relevant material from an external dataset and grounds its answers to prompts in that material. Nonetheless, it’s close to impossible to completely eliminate hallucinations.
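To make the idea of grounding concrete, here is a minimal sketch of the retrieval step in Python. Everything in it is an assumption made for illustration: the case names and summaries are invented, the function names are hypothetical, and simple word overlap stands in for the far more sophisticated search a real legal product would use. The call to an actual LLM is left out entirely.

```python
import re

# Hypothetical mini-corpus standing in for a trusted legal database.
# The case names and summaries are invented for illustration only.
CORPUS = {
    "Doe v. Acme (hypothetical)": "A landlord must return a security deposit within thirty days.",
    "Roe v. Widget Co. (hypothetical)": "An employer is liable for unpaid overtime wages.",
    "Poe v. Example LLC (hypothetical)": "A contract signed under duress is voidable.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, corpus, top_k=1):
    """Rank documents by word overlap with the question; drop zero-overlap ones."""
    q = tokenize(question)
    scored = [(len(q & tokenize(body)), title) for title, body in corpus.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


def build_prompt(question, corpus, top_k=1):
    """Assemble a prompt that restricts the model to the retrieved passages."""
    titles = retrieve(question, corpus, top_k)
    if not titles:
        # Nothing to ground on: declining is safer than letting the model guess.
        return None
    context = "\n".join(f"[{title}] {corpus[title]}" for title in titles)
    return (
        "Answer using only the sources below and cite them by name.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("When must a landlord return a security deposit?", CORPUS))
```

The point of the design is that the model is handed real text to quote rather than being asked to recall a case from memory, and that the system declines when nothing relevant is found. Even so, a model can still misread or misstate the source it was given, which is one reason grounding lowers hallucination rates without removing them.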
Despite their risks, legal technologies can undeniably enhance attorneys’ workflow. New software developments allow attorneys to access a wide range of past decisions in a short amount of time. The ability of these models to process, compare and differentiate past precedents promises new possibilities in legal practice and academic research.
However, the prevalence of hallucinations, especially in general-purpose models, makes AI unsuitable as a primary source of legal information. Even specialized models by LexisNexis and Thomson Reuters, which are backed by the industry-trusted legal databases Lexis and Westlaw, hallucinate, albeit at lower rates: 17% and 33% of the time, respectively. While the models tested were older and less developed, and these emerging technologies continue to improve, the risks of hallucination are still too high for these tools to be the only means of gathering and verifying legal information.
Just as the internet redefined legal research, AI tools have the potential to do the same, if not more. It’s still crucial, however, to remain vigilant when using these tools. Technology will continue to improve and it’s almost inevitable AI will become integrated into the legal field. Even after its use becomes the norm, lawyers and scholars must carefully consider the veracity of AI outputs, especially given the serious consequences for victims and the deep policy implications.
U.S. Supreme Court Chief Justice John G. Roberts, Jr. warned that “any use of AI requires caution and humility” in his 2023 year-end report on the federal judiciary. Though Roberts acknowledged the positive effects of such technology — for example, how wider access to legal information may allow the justice system to be more accessible to those with fewer resources — he was clear about the harms AI may bring.
Since then, the use of AI in law has only increased. According to Clio, a legal software company, 79% of law firms had adopted AI in some way in 2024. Hallucination persists, as there continue to be new incidents of false citations due to AI. It’s partially up to AI developers to identify ways to mitigate hallucination, but more importantly, individual lawyers must remain alert to how they’re interacting with these tools. Ultimately, AI models are only tools, and mindful human judgment should prevail over blind trust.