Designing AI factual claims for “easy verification”

“Sometimes the AI just makes stuff up” is a problem I don’t really expect to go away. In the near term, AI is going to keep occasionally hallucinating or misinterpreting information. Eventually, AI will be powerful enough that we need to worry about whether it’s presenting misleading information on purpose. There might be a nice window where the AI is powerful enough to not make things up but non-agentic enough that we don’t have to worry about deliberate manipulation. But even then, interpreting data is tricky.

I’m worried about this for my own use, but I’m more worried about it on the global scale. I’m worried about people trusting things AI made up, and I’m worried about the internet filling up with slop that makes it harder to even find original statements that are a human’s real testimony.

An approach that might help is to make AI reports more “verification-centric,” i.e. designed from the ground up to be as easy and frictionless to verify as possible.

Right now, some AI chatbots provide little citation links. That’s better than not having them. But those are a pain to open and investigate. Probably you very rarely do so.

So, imagine a world where, when you ask a question, the AI doesn’t guess. Instead:

- It finds a primary source.
- It answers with an exact quote from that source (with enough context for you to evaluate for yourself what the quote means).
- The UI lets you quickly expand the quote into the full document, with the quote highlighted, if you want more context.
- It names the author of the quote.
- Some fast/dumb AIs do a sanity check that the quote is accurately transcribed.

“Quote” here can mean either “a paragraph of prose” or “a table of data.” The AI is allowed to truncate text to make it easier to read, but not to make up text.

In this world, “hunt down primary sources” would be one of the main skills AIs are trained to do. (Maybe pretraining data would be labeled such that primary sources are more directly “available” to its fast-response intuitions.)
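The transcription check described above (truncation allowed, invented text not allowed) needs no judgment, so it can be sketched without any AI at all. The following is a minimal Python sketch, not a definitive implementation. The names `normalize` and `quote_is_verbatim` are hypothetical, and it assumes that truncation is marked with `[…]` or `[...]`, and that differences in whitespace and in curly versus straight quote marks should not count as mismatches.

```python
import re
import unicodedata

# Matches an elision marker, "[…]" or "[...]", plus any surrounding whitespace.
ELISION = re.compile(r"\s*\[(?:…|\.\.\.)\]\s*")
# Assumption: curly and straight quote marks are treated as the same character.
QUOTE_MARKS = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})


def normalize(text: str) -> str:
    """Fold typographic variants and whitespace so only wording differences remain."""
    text = unicodedata.normalize("NFKC", text).translate(QUOTE_MARKS)
    return re.sub(r"\s+", " ", text).strip()


def quote_is_verbatim(quote: str, source: str) -> bool:
    """True if every fragment of `quote` occurs in `source`, in the same order.

    Fragments are the pieces of the quote between elision markers, so the
    quote may skip text but may not add or reorder it.
    """
    haystack = normalize(source)
    position = 0
    matched_any = False
    for fragment in ELISION.split(quote):
        fragment = normalize(fragment)
        if not fragment:
            continue
        found = haystack.find(fragment, position)
        if found == -1:
            return False  # this fragment is not in the source after the previous one
        position = found + len(fragment)
        matched_any = True
    return matched_any  # an empty quote verifies nothing


source = "The cat sat on the mat. It was warm. Then it slept."
print(quote_is_verbatim("The cat sat on the mat. […] Then it slept.", source))
print(quote_is_verbatim("The cat sat on the rug.", source))
```

This sketch would reject a quote that adds a clarifying word in [brackets], so a fuller version would need to strip such insertions before comparing.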
I’ve built myself a Claude skill aimed at this, but part of the point here is, as we deploy AIs at scale, to end up in a world where AI claims always come bundled with a human who’s responsible for them. (In worlds where the source is one human quoting another human, it’d say “By Alice Foo, according to Bob Bar.”)

The AI scaffolding and UI would be designed to make it hard to accidentally end up using confabulated info. The automatic fast/dumb checker AI can verify the links in the chain that require no judgment. The UI makes it as frictionless as possible for you to sanity-check that the results make sense in context.

“Okay Ray, but often I want it to summarize or explain things in a way that’s never been explained before.”

Often the original source for a fact is not written in a way that’s easy for most people to interpret. For this sort of case, I’m imagining:

- (Optionally) The AI probably gives you a fast response from memory.
- But it immediately also begins the search for the original sources, and then another AI instance writes an explanation based on those sources and checks that each claim it ultimately made has a source.
- If you want the AI to make inferences, inferences are labeled as different from original sources.
- Regardless, the original claims are presented in a UI (maybe as a side-note inline with the claims) that still makes them easy to cross-check.

Doing this at scale efficiently is a legit engineering problem. I think pieces of this could be done by dumb Python scripts without. But this feels like a better world to be aiming for.

Bonus points for cross-checking data

When stakes are higher (anyone trying to use AI to do novel intellectual work), you’d also want to combine the “fast dumb AI who checks the quote is literally accurate” with “smart AI then tries looking for corroborating or disproving evidence” (which also comes in verified exact quotes).
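As one illustration of the “dumb Python scripts” idea, here is a minimal sketch of the bookkeeping behind “each claim has a source” and “inferences are labeled.” Everything in it is hypothetical: the `SourcedQuote` and `Claim` classes, the `audit` function, and the plain substring match used to confirm a quote, which a real system would likely replace with something more forgiving.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourcedQuote:
    text: str                          # exact words taken from the source
    author: str                        # the human responsible for the statement
    reported_by: Optional[str] = None  # set when one human is quoting another

    def byline(self) -> str:
        if self.reported_by:
            return f"By {self.author}, according to {self.reported_by}"
        return f"By {self.author}"


@dataclass
class Claim:
    text: str
    quote: Optional[SourcedQuote] = None
    is_inference: bool = False         # inferences are allowed, but stay labeled


def audit(claims: list[Claim], source_text: str) -> list[str]:
    """Return one problem message per claim that lacks a verifiable source."""
    problems = []
    for claim in claims:
        if claim.is_inference:
            continue  # shown to the reader as an inference, not as a sourced fact
        if claim.quote is None:
            problems.append(f"no source: {claim.text}")
        elif claim.quote.text not in source_text:
            problems.append(f"quote not found in source: {claim.text}")
    return problems


source = "Water boils at 100 degrees Celsius at sea level."
quote = SourcedQuote("Water boils at 100 degrees Celsius", "Alice Foo", "Bob Bar")
claims = [
    Claim("The boiling point is 100 C at sea level", quote=quote),
    Claim("So cooking times change at altitude", is_inference=True),
    Claim("A statement with nothing behind it"),
]
print(quote.byline())
print(audit(claims, source))
```

Keeping the inference flag on the claim itself, instead of in the explanation text, is what would let a UI render sourced and inferred statements differently.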
For now…

Meanwhile, if you’re interested in using the Direct Quotes skill, here’s the current version I’m using:

“Direct Quotes” complete skill file

Direct Quotes

This skill changes how you present information when the user wants direct evidence from original sources. Instead of summarizing or paraphrasing, you lead with the source material itself so the user can read the evidence firsthand.

Why this matters

Users asking for quotes or source verification want to see the actual words, not your interpretation. They’re trying to evaluate evidence themselves; your job is to find the best source and present it cleanly, not to editorialize. Think of yourself as a research librarian pulling the exact passage someone needs.

Finding the right source

Search for the most original source available. This means:

- A person’s own writing, speech transcript, or official statement over a journalist’s summary of it
- A primary document (court filing, research paper, official report) over news coverage about it
- The earliest credible version of a quote over later repetitions
- If the true original isn’t findable, use the closest thing you can get and note that it’s secondhand (e.g., “as reported by…” or “as quoted in…”)

Use web search to find sources. Don’t rely on memory for quote text; search and verify. If you find the original source URL, use web_fetch to pull the full text so you can quote accurately.

Output format

Every response using this skill follows this exact structure:

1. Summary line

A single plain-text line, under 100 characters, that captures the headline takeaway. No markdown formatting on this line. Think of it as a tweet-length thesis.

2. Blockquote

Immediately after the summary line, a markdown blockquote (>) containing the relevant passage. Guidelines for the quote:

- Length: Include enough surrounding context that the user can understand the quote’s meaning without needing your explanation. Usually 2-6 sentences. Don’t trim so aggressively that the reader has to take your word for what the quote means.
- Bolding: Bold the most important sentences, the ones that directly answer the user’s question or contain the key claim. Usually 1-3 bolded sections. Don’t bold everything; the contrast is what makes it useful.
- Accuracy: Quote the source verbatim. Don’t silently fix grammar or modernize language. If you need to add a word for clarity, use [brackets].
- Continuity: If you need to skip irrelevant sentences in the middle, use […] to indicate the gap. But prefer finding a continuous passage over stitching fragments.
- Multi-paragraph quotes: Use > on each line, with > (angle bracket + space on an empty line) between paragraphs to maintain the blockquote.

3. Finding sources

Do not use paywalled sources. If you can only find an abstract, look for a public source, potentially on sci-hub or Anna’s Archive, or find a pdf of the source material.

Try to search and find original sources.

4. Source attribution

Below the blockquote, a single line with:

[DD Mon YYYY — Source Name](URL)

- Date: The publication or statement date in DD Mon YYYY format (e.g., 23 Aug 2021). If only a year is known, use just the year. If no date is findable, omit it but keep the dash.
- Source Name: The publication, speaker, or document name. Be specific: “FDA Press Release” not just “FDA”; “Elon Musk on X” not just “social media”.
- URL: Link to the original source page. If the exact passage is locatable, use a text fragment URL (#:~:text=…) so the link jumps to the relevant section. Encode spaces as %20 in the fragment.

When you can’t find a reliable source

If you search and can’t find a trustworthy original source for the claim:

- Say so plainly: “I wasn’t able to find an original source for this.”
- If the quote is commonly attributed but unverified, note that.
- Don’t fabricate a quote or guess at wording.
  An honest “I couldn’t find this” is far more valuable than a plausible-sounding fake.

Multiple claims in one request

If the user asks about several claims or quotes at once, repeat the format (summary → blockquote → source) for each one, separated by a horizontal rule (---).

What NOT to do

- Don’t add lengthy analysis or commentary before or after the quote. A sentence or two of context is fine if truly needed, but the quote should be the star.
- Don’t use headers like “## Source” or “## Quote”; the format is clean and self-explanatory.
- Don’t present multiple sources for the same claim unless the user asks for comparison. Pick the single best/most original one.
- Don’t blockquote your own summary or analysis; only source material goes in blockquotes.

Example

User prompt: “Read this and summarize the key desiderata for what McCulloch was looking for” (with a link to McCulloch’s 1960 paper “What Is a Number, that a Man May Know It, and a Man, that He May Know a Number?”)

Correct output:

McCulloch sought a minimal mental event (“psychon”) with four properties

> My object, as a psychologist, was to invent a kind of least psychic event, or “psychon,” that would have the following properties: First, it was to be so simple an event that it either happened or else it did not happen. Second, it was to happen only if its bound cause had happened – shades of Duns Scotus! – that is, it was to imply its temporal antecedent. Third, it was to propose this to subsequent psychons. Fourth, these were to be compounded to produce the equivalents of more complicated propositions concerning their antecedents.
>
> In 1929 it dawned on me that these events might be regarded as the all-or-none impulses of neurons, combined by convergence upon the next neuron to yield complexes of propositional events.

1960 — Warren S. McCulloch, “What Is a Number, that a Man May Know It, and a Man, that He May Know a Number?”, General Semantics Bulletin No. 26/27

Why this is correct: The summary line is under 100 characters and captures the key idea. The blockquote is verbatim from the source, with the four desiderata bolded as the most important content. Context sentences are included so the reader understands what a “psychon” is and where it led (neurons). The source attribution has the year, full title, and a direct link to the PDF.
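The attribution line in the skill file is mechanical enough to build with a script. Below is a minimal Python sketch of the `[DD Mon YYYY — Source Name](URL)` format with an optional text fragment. The function names, the example.org URLs, and the sample passages are placeholders, not part of the skill. It also goes one step beyond the skill’s “encode spaces as %20” rule by percent-encoding `-`, `&`, and `,`, since those characters act as delimiters inside a text fragment.

```python
from datetime import date
from typing import Optional, Union
from urllib.parse import quote

# Spelled out so the output does not depend on the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def text_fragment_url(url: str, passage: str) -> str:
    """Append a #:~:text= fragment so the link jumps to `passage`."""
    # quote() leaves "-" alone, so it is encoded by hand afterwards.
    encoded = quote(passage, safe="").replace("-", "%2D")
    return f"{url}#:~:text={encoded}"


def format_date(when: Union[date, int, None]) -> str:
    """Full date as DD Mon YYYY, a bare year as-is, nothing if unknown."""
    if when is None:
        return ""
    if isinstance(when, int):
        return str(when)
    return f"{when.day:02d} {MONTHS[when.month - 1]} {when.year}"


def attribution(source_name: str, url: str,
                when: Union[date, int, None] = None,
                passage: Optional[str] = None) -> str:
    """Build the attribution line; the dash is kept even when the date is missing."""
    if passage:
        url = text_fragment_url(url, passage)
    prefix = f"{format_date(when)} — " if when is not None else "— "
    return f"[{prefix}{source_name}]({url})"


print(attribution("Example Report", "https://example.org/report",
                  date(2021, 8, 23), "a sample passage"))
print(attribution("Example Report", "https://example.org/report", 1960))
```

A URL that already carries its own `#` fragment is not handled here and would need extra care.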
