{"id":148509,"date":"2026-04-10T08:44:08","date_gmt":"2026-04-10T16:44:08","guid":{"rendered":"https:\/\/xira.com\/p\/2026\/04\/10\/understanding-ai-hallucinations-making-sure-you-dont-end-up-at-the-wrong-stop\/"},"modified":"2026-04-10T08:44:08","modified_gmt":"2026-04-10T16:44:08","slug":"understanding-ai-hallucinations-making-sure-you-dont-end-up-at-the-wrong-stop","status":"publish","type":"post","link":"https:\/\/xira.com\/p\/2026\/04\/10\/understanding-ai-hallucinations-making-sure-you-dont-end-up-at-the-wrong-stop\/","title":{"rendered":"Understanding AI Hallucinations: Making Sure You Don\u2019t End Up At The Wrong Stop"},"content":{"rendered":"<p>We talk a lot about the ethical duty of lawyers and legal professionals to understand the risks and benefits of relevant technology. But when it comes to using GenAI, that might not be enough. If we want to prevent the increasing number of hallucinations and inaccurate citations that are bedeviling lawyers and even judges, we need to understand how and why GenAI systems fail.<\/p>\n<p>That was the point of a <a href=\"https:\/\/arxiv.org\/pdf\/2603.23857\" rel=\"nofollow noopener\" target=\"_blank\">recent paper<\/a> by a group of scientists and engineers: <a href=\"https:\/\/www.linkedin.com\/in\/dylan-johnson-restrepo-43b217298\/\" rel=\"nofollow noopener\" target=\"_blank\">Dylan Restrepo<\/a>, <a href=\"https:\/\/www.linkedin.com\/in\/njohnsonrestrepo\/\" rel=\"nofollow noopener\" target=\"_blank\">Nicholas Restrepo<\/a>, <a href=\"https:\/\/www.google.com\/url?sa=t&amp;source=web&amp;rct=j&amp;opi=89978449&amp;url=https:\/\/donlab.columbian.gwu.edu\/research\/&amp;ved=2ahUKEwi1h7S34uCTAxX94ckDHTBuKgsQFnoECB8QAQ&amp;usg=AOvVaw0I3XbNfmYMpH3bpjw2MwHW\" rel=\"nofollow noopener\" target=\"_blank\">Frank Huo<\/a>, and <a href=\"https:\/\/physics.columbian.gwu.edu\/neil-johnson\" rel=\"nofollow noopener\" target=\"_blank\">Neil Johnson<\/a>. The paper carried the lengthy title, <em>When AI Output Trips to Bad but Nobody Notices: Legal Implications of AI\u2019s Mistakes<\/em>. In addition to their own calculations and analysis, the group also consulted a couple of lawyers: <a href=\"https:\/\/www.linkedin.com\/in\/daniela-johnson-restrepo-864a47126\/\" rel=\"nofollow noopener\" target=\"_blank\">Daniela Restrepo<\/a> and <a href=\"https:\/\/www.linkedin.com\/in\/jeanpaul-roekaert\/\" rel=\"nofollow noopener\" target=\"_blank\">Jean Paul Roekaert<\/a>. I can\u2019t vouch for the mathematical calculations but what they conclude squares with my own experience.<\/p>\n<p><strong>The Basic Premise<\/strong><\/p>\n<p>The group concludes at the outset that rather than a random, unpredictable glitch, a physics-based analysis demonstrates that hallucination is a \u201cforeseeable engineering risk.\u201d Meaning, of course, the circumstances generating its occurrence can be at least a little predictable.<\/p>\n<p>According to the paper, GenAI systems have \u201ca deterministic mechanism at its core that can cause output to flip from reliable to fabricated at a calculable step.\u201d\u00a0 And that step unfortunately comes when the lawyer\u2019s need is the greatest.<\/p>\n<p>The group\u2019s analysis starts from the proposition that we should know by now: GenAI is \u201ca probabilistic text generator engineered to predict the next most plausible token in a sequence, without any internal concept of legal truth.\u201d It is not, argues the group, a database of verified legal authorities. 
**What This Means**

Because it’s predicting, not analyzing, GenAI does well when faced with inquiries about valid legal principles, logical-sounding arguments, undisputed case facts, procedural history, and the like. But when faced with something novel and complex, the tool is pushed “into a region where training data is sparse.” In an effort to please and respond, it is then prone to, well, make stuff up.

The paper puts it this way:

> The tool is therefore most prone to failure exactly when the lawyer’s need is greatest: on a difficult point of law with sparse precedent. The act of researching an unsettled legal issue via an LLM becomes the principal trigger for the tipping instability.

These are important points, since lawyers live in a world where a hallucination, an error, can have devastating consequences. So, as we [have discussed](https://abovethelaw.com/2025/12/like-lawyers-in-pompeii-is-legal-ignoring-the-coming-ai-trust-crisis-part-iii/), given that risk, GenAI outputs must be checked over and over, often eroding the cost savings of using the tools in the first place. But if we understand why the errors occur, and more importantly when, we can use the tools better and more safely.

**A Blessing…And a Curse**

If the paper is right, the group’s findings are a blessing, since they suggest a sliding scale of verification: less where the output covers well-known information, much more when it strays into the novel. That saves time and energy.
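What might that sliding scale look like in practice? Here is a minimal sketch; the novelty score, the thresholds, and the tier names are my own placeholders for illustration, not anything prescribed by the paper.

```python
from dataclasses import dataclass


@dataclass
class VerificationPlan:
    tier: str           # how much human checking to budget
    instructions: str


def plan_verification(novelty: float) -> VerificationPlan:
    """Map a rough novelty estimate (0 = well-settled, 1 = unsettled/novel)
    to a verification budget. The thresholds are arbitrary placeholders.
    """
    if novelty < 0.3:
        return VerificationPlan(
            "spot-check",
            "Skim the output and confirm a sample of the citations.")
    if novelty < 0.7:
        return VerificationPlan(
            "cite-check",
            "Pull and read every cited authority before relying on it.")
    return VerificationPlan(
        "full-review",
        "Treat the output as a list of leads, not answers; research independently.")


# A general limitations-period question vs. a novel tolling argument.
print(plan_verification(0.2).tier)   # spot-check
print(plan_verification(0.9).tier)   # full-review
```

The numbers are beside the point; the idea is simply that checking effort should rise as a question moves from settled to novel, which is exactly where the paper locates the tipping risk.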
But for those unaware of this predictability, the fact that failure tends to occur at a certain point can be a curse. Why? A lawyer with a legal project often starts with undisputed facts, then seeks information on what the law generally is with respect to the issues at hand, and then moves on to more complex, ambiguous areas, assuming the tool is still reliable.

The example given in the paper is a statute of limitations question. A lawyer starts their use of ChatGPT by plugging in undisputed facts. They then ask for the general law on the limitation period. All well and good: the lawyer gets correct responses and, in the words of the paper, “gains confidence in the tool.” So the lawyer then begins asking for more ambiguous information about how that law can be used to leverage the facts or to develop arguments.

If the lawyer takes all the outputs and prepares a brief based on the information obtained, they (or their supervisor) might be tempted to spot-check the first few paragraphs, find nothing amiss and, when pressed for time, conclude the rest of the outputs are also fine when they are not.

So the blessing becomes a curse: “AI’s period of correct output increases rather than decreases the risk of harm, because it builds the user’s trust just before the fabrication appears.”

**What To Do**

So, what do we make of all this? Again, I’m no scientist, but I do know from experience that the more general the information I seek from GenAI, the more likely it is to be correct. The more I stray into ambiguous areas where less is known about a subject, the more errors I tend to get.

For example, I once asked for information about a well-known painter. I got great information. But when I asked about another, relatively obscure painter in the same school, the tool simply made up a name. Or: when I asked which subway stop to take to catch the Q70 bus to LaGuardia Airport, it got it right. When I asked for the best route from my hotel (which involves more ambiguity), it sent me to the wrong stop. It did say sorry when I pointed out the error (after some argument).

The point for lawyers and legal professionals is to understand that “AI possesses no independent legal agency: it is a computational tool.” Granted, it is a computational tool with which you can converse like a human. It reacts in human ways. It’s tempting to anthropomorphize it.

But that’s where we go wrong. Instead, we need to start thinking of it not as a person but as a product with a foreseeable engineering risk, like a sharp knife or an ATV. And that risk appears to materialize when the tool is faced with novelty and ambiguity; according to the paper, it is precisely that novelty and ambiguity that creates the greatest risk of hallucination.

For lawyers, that means if you are going to use this sharp knife, you had better know how and in what circumstances to use it, and how to do so safely.

The paper says it best: “The duty of technological competence, as expressed in ABA Model Rule 1.1 and its state-level counterparts, must evolve. It is no longer sufficient for a lawyer to know how to operate a piece of software. Competence now requires a practical understanding of how that software can fail.” On that point, the paper is clearly right.

Want to use GenAI? Use it to access known information that would be time-consuming or difficult to get otherwise. Ask it to do a lot of things where accuracy isn’t that important. But don’t ask novel or unsettled legal questions without checking and double-checking what you get back. Else you might get off at the wrong subway stop.

Or much worse.

---

*Stephen Embry is a lawyer, speaker, blogger, and writer. He publishes [TechLaw Crossroads](https://www.techlawcrossroads.com/), a blog devoted to the examination of the tension between technology, the law, and the practice of law.*