General-Purpose AI vs. Source-Grounded Research
A side-by-side comparison on real regulatory questions
Why this matters
General-purpose AI tools like Microsoft Copilot search the public internet. Most public utility commission filings, testimony, and orders are buried in state document management systems that search engines don't fully crawl. A general-purpose chatbot can only find what Google or Bing has indexed: law firm blogs, news articles, and public-facing utility websites.
Source-grounded research works differently. The system crawls each state's official PUC website, pulls every document filed in every docket, and stores them in its own database. When an analyst asks a question, it searches that database as the only source of truth, citing the document name, docket number, and page number for every claim.
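As a rough sketch, the ingest step might look something like the following. The schema, table name, and helper functions here are illustrative assumptions, not the actual system's implementation; the point is that every page goes into a local store with the metadata needed to cite it later, and that store, not the open web, is what gets queried.

```python
# Sketch of a source-grounded ingest step (illustrative only: the
# schema, names, and values are assumptions, not the real system).
import sqlite3

def build_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the local document store that later queries treat as the
    only source of truth."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            docket   TEXT,     -- e.g. 'INT-G-21-01'
            doc_name TEXT,     -- e.g. 'Compliance Filing'
            company  TEXT,
            filed    TEXT,     -- ISO filing date
            page     INTEGER,  -- 1-based page number
            text     TEXT
        )""")
    return db

def ingest(db: sqlite3.Connection, docket: str, doc_name: str,
           company: str, filed: str, pages: list[str]) -> None:
    """Store every page of a filing under its docket so any later
    answer can cite document, docket, and page number."""
    db.executemany(
        "INSERT INTO documents VALUES (?, ?, ?, ?, ?, ?)",
        [(docket, doc_name, company, filed, i + 1, text)
         for i, text in enumerate(pages)])
```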
We ran the same regulatory questions through both tools. Here's what happened.
Test 1: Wrong Jurisdiction
Question
Summarize the stipulation and the Commission's decision in INT-G-21-01.
Copilot's answer
Identified INT-G-21-01 as a Washington UTC case. Cited Washington statutes (RCW 80.28) and described it as a general rate case involving revenue requirement, cost of capital, and rate design under Washington law.
The correct answer
INT-G-21-01 is an Idaho PUC case: a depreciation and amortization proceeding for Intermountain Gas Company. It resulted in a 2.41% combined rate effective January 1, 2021, with a July 1, 2023 accrual deferral mechanism. Not a rate case. Not in Washington.
What went wrong
Copilot guessed the jurisdiction and case type because it couldn't access the actual filing. An analyst submitting Copilot's version would cite the wrong state's statutes to the wrong commission.
Test 2: Wrong Case Number
Question
What is the case number of Idaho Power's most recent rate case?
Copilot's answer
IPC-E-24-07, filed May 31, 2024.
The correct answer
IPC-E-25-16, filed May 30, 2025, requesting a $199.1 million (13.09%) increase based on a 10.40% ROE.
What went wrong
Copilot returned an outdated case number because the search engines it relies on had not yet surfaced the newer filing. The actual most recent case was a full year newer.
Test 3: Said Information Didn't Exist — When It Did
Question
What are the estimated costs of the security issuance in Case INT-G-25-03?
Copilot's answer
"The estimated costs were not disclosed in the public order."
The correct answer
Total estimated costs were $284,031.70, comprising agent fees of $200,194.70 and legal fees of $83,837.00. The original application had estimated up to $375,000. These figures appear in the compliance filing with specific page references.
What went wrong
The information was in the compliance filing, not in the final order. Copilot only found the order (via web search) and concluded the data didn't exist. The source-grounded tool searched the actual case docket and found the compliance filing.
Test 4: Fabricated Rule Language
Question
How is image advertising treated by the Idaho Public Utilities Commission?
Copilot's answer
Quoted IDAPA 31.12.01 as stating: "This account shall include the cost of advertising designed to promote the image or goodwill of the company... Such costs are not allowable for ratemaking purposes."
The correct answer
That quoted language does not exist in IDAPA 31.12.01, which covers the Commission's Systems of Accounts and contains no such text. The source-grounded tool confirmed that no document in its database contained the literal phrase, and it identified the actual precedent: Staff testimony in INT-G-16-02 citing the disallowance of advertising that does not directly benefit customers.
What went wrong
Copilot generated plausible-sounding regulatory language and presented it as a direct quote from an administrative rule. This is fabrication. If cited in testimony or a commission filing, it would be immediately identifiable as false and would damage the analyst's credibility.
The Citation Problem
Beyond factual errors, there's a structural difference in how results are sourced.
Copilot cites website domains like "[utc.wa.gov]" or "[puc.idaho.gov]" without specifying which document, which docket, or which page. Verification requires manually searching the entire website to find whatever Copilot might have been referencing. In practice, this means the citation is unverifiable.
Source-grounded research cites the document name, docket number, filing date, company name, and specific page numbers. Every claim links to the actual filing. Verification is one click.
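To make the contrast concrete, here is a minimal sketch of what a verifiable citation carries compared with a bare domain. The field names are assumed for illustration, not the actual system's schema, and the demo values are placeholders.

```python
from dataclasses import dataclass

# What a domain-only citation gives the analyst to work with:
web_citation = "[puc.idaho.gov]"

# What a source-grounded citation carries (field names are assumed
# for illustration, not the actual system's schema).
@dataclass(frozen=True)
class Citation:
    doc_name: str           # e.g. "Compliance Filing"
    docket: str             # e.g. "INT-G-25-03"
    company: str
    filed: str              # ISO filing date
    pages: tuple[int, ...]  # specific pages supporting the claim

    def __str__(self) -> str:
        pp = ", ".join(str(p) for p in self.pages)
        return (f"{self.doc_name}, Docket {self.docket} "
                f"({self.company}, filed {self.filed}), p. {pp}")

# All values below are placeholders, not a real filing.
print(Citation("Example Filing", "ABC-E-00-01",
               "Example Utility Co.", "2000-01-01", (1, 2)))
```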
For work that may be submitted to a commission, filed in testimony, or used to support a rate case, unverifiable citations are not usable.
The Speed Tradeoff
Copilot returns answers in 15–30 seconds. Source-grounded research takes 3+ minutes for a comprehensive answer.
The reason is straightforward: Copilot skims the internet and summarizes what it finds. Source-grounded research reads through case documents, cross-references filings across multiple dockets, and builds citations. That takes longer, but it's also why the output is citable and accurate.
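A sketch of the retrieval side, under the same assumptions: the query runs only against the local store, and every hit comes back with its docket and page attached. The keyword matching here is a stand-in for the real search, and the sample rows paraphrase Test 3.

```python
import sqlite3

# Sketch of source-grounded retrieval: queries run ONLY against the
# local store, never the open web, and every hit carries a verifiable
# citation. Sample rows paraphrase Test 3; the text is illustrative.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents "
           "(docket TEXT, doc_name TEXT, page INTEGER, text TEXT)")
db.executemany("INSERT INTO documents VALUES (?, ?, ?, ?)", [
    ("INT-G-25-03", "Final Order", 2,
     "The application for the security issuance is approved."),
    ("INT-G-25-03", "Compliance Filing", 3,
     "Total estimated costs: $284,031.70."),
])

def search(term: str) -> list[str]:
    """Return every page mentioning the term, tagged with docket,
    document, and page so the claim is one click from its source."""
    rows = db.execute(
        "SELECT docket, doc_name, page, text FROM documents "
        "WHERE text LIKE ?", (f"%{term}%",))
    return [f"{text} [{docket}, {doc}, p. {page}]"
            for docket, doc, page, text in rows]

# Finds the figure in the compliance filing that a web search limited
# to the published order would miss (cf. Test 3).
print(search("costs"))
```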
For perspective: the research completed in 3 minutes would take an analyst days or weeks to do manually. Copilot is faster because it's doing less.
Bottom Line
Copilot produces answers that look credible but cannot be verified and are sometimes completely wrong. Source-grounded research cites the actual document, docket, and page number because it's searching the source material, not the internet.
For regulatory work where accuracy determines outcomes, the distinction is not a preference. It's a requirement.