We build legal AI for a living. We tell lawyers every day that general models cannot be trusted with specific legal questions. Yet recently, when a legal question about employee benefit regulations came up in our own business, we fell into the very trap we warn clients about: we asked an ungrounded frontier model. The AI returned a beautifully written, authoritative response complete with citations, assuring us the regulations did not apply. It looked perfect. It was also completely wrong.
When we ran the question through the system grounded in our primary law database, the answer flipped 180 degrees. This experience was a wake-up call. In law, no matter how "smart" the model is, grounding it in primary sources is essential for reliable legal AI.
A Straightforward Question, Two Opposite Answers
Our question was straightforward. We needed to know whether certain employee benefit regulations applied to our company. Nothing unusual about it. Businesses ask lawyers this type of question regularly.
The frontier AI model analyzed our question, cited relevant regulations, walked through the legal reasoning, and concluded definitively that the regulatory requirements did not apply to us. Almost anyone reading the response would have found it professional and authoritative.
It was convincing. But because we spend our days explaining to lawyers why they should be skeptical of ungrounded AI, we fed the same question to our grounded system, which has direct access to primary law databases, including statutes and regulatory guidance.
Our grounded AI took a different approach entirely. Instead of drawing from general knowledge, it went straight to the source material. It pulled up the actual regulations, dove into agency manuals, and even found an IRS regulation that turned out to be relevant to our question. After working through all of this primary source material, it reached the opposite conclusion from the frontier model: the regulations absolutely applied to our company.
Something important distinguished the second answer from the first. Every conclusion was linked to specific legal authorities that we could actually read. We could click through to the statutes, review the regulatory guidance, and see the examples that directly addressed our situation. Rather than simply telling us what it thought, the AI showed us where to find the law that supported its analysis.
This has serious consequences. If we had relied on that first answer and ignored the regulatory requirements, our company could have faced substantial penalties down the road. In a client relationship, giving advice based on that first response would constitute malpractice.
Even the People Building Legal AI Needed a Reminder
This experience forced us to confront an uncomfortable reality. Even as a legal AI company that constantly warns about these exact risks, we almost fell for it. We asked the AI a quick question because it was convenient, and we nearly accepted the confident answer it gave us.
It is an easy mistake to make. Frontier models are exceptionally effective at sounding authoritative. They provide detailed reasoning that makes logical sense. Yet this reasoning is not connected to actual law. It is like asking a lawyer to give you advice off the top of their head without looking at any legal authorities. Most lawyers would recognize that as inadequate, but when the advice comes from an AI system, the same skepticism often disappears.
Ungrounded AI does not simply get things wrong. It gets things wrong while sounding completely right. An attorney using an ungrounded model for legal research might receive analysis that appears thorough and well-reasoned, then rely on that analysis to advise clients. This creates a direct path to malpractice.
What Grounding Actually Changes
Avoiding AI is not the solution. Using it properly is. When we grounded the AI in primary legal sources, it became an effective research tool. Instead of asking it what it knew, we asked it to examine specific legal authorities and provide analysis based on those sources.
This changes the task the AI performs. Rather than generating answers from general knowledge, it reads the relevant statutes and regulations, applies the legal principles it finds there, and provides reasoning tied to actual legal authorities. Interpretation errors can still occur, but now the attorney has links to the citations and reasoning needed to validate the analysis quickly.
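To make the distinction concrete, here is a minimal sketch of what "changing the task" can look like in code. It is illustrative only, not our production system: the Authority class, the toy keyword-overlap retrieval, the placeholder citations, and the prompt wording are all simplified assumptions.

```python
from dataclasses import dataclass

@dataclass
class Authority:
    """A primary-law excerpt paired with a citation the reviewer can check."""
    citation: str
    text: str

# Hypothetical mini-corpus standing in for a primary law database.
CORPUS = [
    Authority("Statute § 101 (placeholder)", "An employer maintaining a covered benefit plan must ..."),
    Authority("Reg. § 1.01 (placeholder)", "For purposes of the coverage rules, an employer includes ..."),
    Authority("Agency guidance (placeholder)", "The requirements apply to plans that ..."),
]

def retrieve(question: str, corpus: list[Authority], k: int = 2) -> list[Authority]:
    """Toy retrieval: rank excerpts by keyword overlap with the question.
    A real system would query a search index or embeddings over primary law."""
    terms = set(question.lower().split())
    ranked = sorted(corpus, key=lambda a: -len(terms & set(a.text.lower().split())))
    return ranked[:k]

def build_grounded_prompt(question: str, authorities: list[Authority]) -> str:
    """Constrain the model to reason only from the retrieved excerpts and to
    cite them, so an attorney can click through and verify every conclusion."""
    sources = "\n\n".join(f"[{a.citation}]\n{a.text}" for a in authorities)
    return (
        "Answer the question using ONLY the excerpts below. "
        "Cite the bracketed citation for every conclusion. "
        "If the excerpts do not answer the question, say so.\n\n"
        f"EXCERPTS:\n{sources}\n\nQUESTION: {question}"
    )

if __name__ == "__main__":
    question = "Do the benefit plan coverage requirements apply to our company?"
    prompt = build_grounded_prompt(question, retrieve(question, CORPUS))
    print(prompt)  # In practice this prompt would be sent to the model of your choice.
```

The point of the sketch is the shape of the task: the model is handed the authorities and told to cite them, rather than asked what it happens to remember.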
A collaborative workflow emerges from this approach. AI identifies relevant authorities and provides initial analysis. Attorneys review those authorities and make the final legal determination. This captures the efficiency benefits of AI while maintaining professional standards.
What We Learned About Using What We Build
We spend our days building this technology and explaining its limitations to legal professionals. We understand how these systems work and where they fail. Yet we still nearly relied on an answer that was completely wrong. That should be concerning to everyone, including us.
Knowing intellectually that ungrounded models can produce unreliable output is different from experiencing it in your own work. We caught the error because we were skeptical enough to verify, but the ease with which we almost accepted that first answer was unsettling. The model sounded so certain. The reasoning looked solid. It would have been simple to move forward without a second thought.
We came away from this experience with more respect for the challenges of using AI responsibly in legal work. Grounding helps significantly, but it does not eliminate the need for professional judgment and verification. The lawyer still has to be the lawyer. AI delivers measurable efficiency gains in finding and analyzing legal authorities, but it cannot take on the responsibility that comes with giving legal advice.