Scout — Agentic Research Assistant

The goal — what I'm exploring

Scout is my exploration of what an AI agent really is once you strip away the frameworks: a while-loop with a budget and a stop condition. The research question is how to make a tool-using agent legible and trustworthy: grounding every claim in a real source, citing by contract, knowing when to stop, and treating each tool as an attack surface. The engineering lives in the scaffolding around the model, which is where I think the interesting work in agents is.

How it uses AI

Scout is a from-scratch ReAct agent on Gemini with no agent framework; the whole loop is deliberately kept to a few hundred lines. On each iteration the model decides its next move via function calling (search the web, read a page, or finish); Scout runs that tool, summarizes the result so it doesn't blow the context window, and feeds the summary back. A separate synthesis call writes the final report and may cite only sources that Scout actually fetched, so the model can't invent references. The point is to keep the reasoning loop legible rather than hide it behind a black box.
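To make the shape of that loop concrete, here is a minimal sketch in TypeScript. It is not Scout's actual code: the type names, `runAgent`, and the injected `decide`, `runTool`, and `summarize` functions are all hypothetical stand-ins for the Gemini function-calling request, the search/read tools, and the summarizer. The default budget of 8 steps is borrowed from the 8-step run mentioned elsewhere in this write-up.

```typescript
// Hypothetical shapes for illustration; none of these names come from Scout's source.
type Action =
  | { tool: "search"; query: string }
  | { tool: "read"; url: string }
  | { tool: "finish"; reason: string };

interface Step {
  action: Action;
  observation: string; // the summarized result, never the raw page text
}

interface Deps {
  // Stand-in for the model call that picks the next move via function calling.
  decide: (question: string, history: Step[]) => Promise<Action>;
  // Stand-in for running the chosen tool (web search or page fetch).
  runTool: (action: Action) => Promise<string>;
  // Stand-in for compressing a raw result before it enters working memory.
  summarize: (raw: string) => Promise<string>;
}

async function runAgent(
  question: string,
  deps: Deps,
  maxSteps = 8, // the step budget
): Promise<{ history: Step[]; stoppedBy: "finish" | "budget" }> {
  const history: Step[] = [];
  let stoppedBy: "finish" | "budget" = "budget";

  // The whole agent: a while-loop with a budget and a stop condition.
  while (history.length < maxSteps) {
    const action = await deps.decide(question, history);
    if (action.tool === "finish") {
      stoppedBy = "finish"; // the model chose to stop
      break;
    }
    const raw = await deps.runTool(action);
    const observation = await deps.summarize(raw);
    history.push({ action, observation });
  }

  // A separate synthesis call would take `history` from here.
  return { history, stoppedBy };
}
```

Returning `stoppedBy` alongside the history is one way to keep the stop decision visible: a caller can tell a run that the model ended on purpose from one that simply ran out of budget.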

How it works

An agent is a while-loop with a budget and a stop condition — and “decide when you’re done” is the hard part. Scout makes that loop legible: every plan, thought, search, and read streams to the screen.
Cited by contract: the synthesis step may only assert facts from a fetched source, every claim carries a [n] marker, and a fabricated citation with no matching source renders as plain, visibly unsupported text.
A step budget plus an explicit finish tool tame the two classic agent failure modes — pacing the room without ever stopping, and answering from a single snippet it never opened.
Each observation is summarized before it enters working memory, so an 8-step run never blows the context window — full page text lives only in the Sources panel.
fetch_url is treated as the SSRF footgun it is: http/https only, per-hop host re-validation through redirects, blocked private/link-local IPs, 8s/2MB caps, and DOMPurify on the content.
Zero-setup on a shared demo key with a live capacity meter; bring-your-own Gemini key (stored only in your browser) for unlimited runs, with heavier models server-gated to BYOK.
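The fetch_url guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Scout's implementation: `isBlockedIPv4`, `checkUrl`, `fetchWithRevalidation`, the `Hop` shape, and the injected `fetchOnce` function are hypothetical names, and the redirect limit of 5 is an arbitrary choice. The sketch covers only the scheme check, the IP-literal blocklist, and per-hop re-validation. It does not resolve DNS (a hostname that resolves to a private address would need a separate check), and it leaves out the 8s/2MB caps and the DOMPurify pass.

```typescript
// Hypothetical helper names; Scout's real guard is not shown in this write-up.

// True for IPv4 literals in unspecified, private, loopback, or link-local ranges.
function isBlockedIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||                            // 0.0.0.0/8
    a === 10 ||                           // 10.0.0.0/8
    a === 127 ||                          // loopback
    (a === 169 && b === 254) ||           // link-local
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168)              // 192.168.0.0/16
  );
}

type Check = { ok: true; url: URL } | { ok: false; reason: string };

function checkUrl(raw: string): Check {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "unparseable URL" };
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: "scheme not allowed" };
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) {
    return { ok: false, reason: "localhost" };
  }
  // Assumption: this sketch conservatively rejects every IPv6 literal.
  if (host.startsWith("[")) return { ok: false, reason: "IPv6 literal" };
  if (isBlockedIPv4(host)) return { ok: false, reason: "private or link-local IP" };
  return { ok: true, url };
}

// One network hop, injected so that redirects are never followed automatically.
type Hop = { status: number; location?: string; body?: string };
type FetchOnce = (url: URL) => Promise<Hop>;

async function fetchWithRevalidation(
  raw: string,
  fetchOnce: FetchOnce,
  maxHops = 5, // assumed limit, not taken from the source
): Promise<string> {
  let next = raw;
  for (let hop = 0; hop <= maxHops; hop++) {
    // Re-validate the target on every hop, not just the first URL.
    const checked = checkUrl(next);
    if (!checked.ok) throw new Error(`blocked at hop ${hop}: ${checked.reason}`);
    const res = await fetchOnce(checked.url);
    if (res.status >= 300 && res.status < 400 && res.location) {
      next = new URL(res.location, checked.url).toString();
      continue;
    }
    return res.body ?? "";
  }
  throw new Error("too many redirects");
}
```

The per-hop check matters because a public URL can redirect to an internal address. Validating only the first URL would let that second hop through.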

The stack