Intelligence: a typed query plane over the moat.

Six plan ops, one Zod-strict grammar. Send a programmatic plan object or a natural-language question. Either way the same deterministic executor runs the plan against the database and returns typed rows. The response always echoes the executed plan so any natural-language answer can be replayed as a programmatic call. Not a chatbot. Not a narrative. AI never sets the numbers.

Sample response

The AI picks the query. The database produces the answer.

Pick a natural-language question. Watch the planner emit a Zod-strict plan. See the typed rows that came out of the executor. Below the result: the equivalent programmatic call you can paste into your terminal.

POST /v1/query · plan_source: nl
question: England LSOAs under £250k AND rising YoY AND in bottom quartile crime, sort by YoY desc, limit 5

Plan emitted (Zod-strict)

{
  "op": "rank_areas",
  "params": {
    "signals": [
      {
        "key": "property.median_price",
        "filter": {
          "lte": 250000
        }
      },
      {
        "key": "property.price_change_pct_yoy",
        "filter": {
          "gt": 0
        }
      },
      {
        "key": "crime.total_12m",
        "filter": {
          "percentile_lte": 25
        }
      }
    ],
    "sort_by": {
      "signal": "property.price_change_pct_yoy",
      "mode": "value",
      "direction": "desc"
    },
    "country": "England",
    "limit": 5
  }
}

Results (rank_areas)

scope: England · limit: 5 · matched: 5
E01005207   Manchester · M1 catchment   +18.4%
E01033620   Birmingham · B1 catchment   +14.6%
E01011358   Leeds · LS1 catchment       +12.9%
E01008397   Newcastle · NE1 catchment   +11.2%
E01017641   Sheffield · S1 catchment    +10.4%

Replay as a programmatic call (no LLM, deterministic)

curl https://api.onegoodarea.com/v1/query \
  -H "Authorization: Bearer oga_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "plan": { "op": "rank_areas", "params": { ... } } }'
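The same replay can be sketched in Python with only the standard library. The endpoint URL and headers come from the curl call above; the helper name build_query_request is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.onegoodarea.com/v1/query"  # from the curl call above


def build_query_request(plan: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a programmatic POST carrying {"plan": ...}."""
    body = json.dumps({"plan": plan}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# The plan emitted in the sample above, pasted back as a client plan.
plan = {
    "op": "rank_areas",
    "params": {
        "signals": [
            {"key": "property.median_price", "filter": {"lte": 250000}},
            {"key": "property.price_change_pct_yoy", "filter": {"gt": 0}},
            {"key": "crime.total_12m", "filter": {"percentile_lte": 25}},
        ],
        "sort_by": {
            "signal": "property.price_change_pct_yoy",
            "mode": "value",
            "direction": "desc",
        },
        "country": "England",
        "limit": 5,
    },
}

request = build_query_request(plan, "oga_your_api_key")
# To send it: urllib.request.urlopen(request), then JSON-decode the body.
```

Because the body carries a plan rather than a question, the planner is skipped and no LLM call is involved.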

Reading this view

plan_source
nl means the planner translated a natural-language question into the plan above (one LLM call against the AiProvider seam). client means a caller sent the plan directly. Either path runs through the SAME executor; the response always echoes the plan back.
Zod-strict
The plan is validated against a discriminated union with .strict() on every nested object. Unknown ops, unknown params, and extra keys are REJECTED before the executor touches the database. Invalid LLM output returns 422 with the raw planner output for inspection.
Deterministic SQL
The executor dispatches each op to a deterministic Postgres query. AI never sets the numbers. AI never touches the database. Two identical plans always return the same rows against the same data state.

Sample shape and realistic values. Real responses depend on the data state at the time of the call.

Six plan ops. One typed grammar.

Every operation under /v1/query is one of these six. The shape is a Zod-strict discriminated union; the planner can only emit valid plans; unknown ops are rejected before execution.
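The reject-before-execution rule can be illustrated without Zod. The sketch below is a plain-Python stand-in, not the real QueryPlanSchema: the six op names come from this page, the per-op key sets for get_area and score_area are read off their Shape lines and are assumptions, and PlanRejected and validate_plan are hypothetical names.

```python
KNOWN_OPS = {
    "rank_areas", "get_area", "score_area",
    "find_peers", "find_insights", "find_forecast",
}

# (allowed keys, required keys) for two ops only. Assumed from the Shape
# lines on this page; the real schema may differ.
PARAM_KEYS = {
    "get_area": ({"area"}, {"area"}),
    "score_area": ({"area", "preset", "weights"}, {"area", "preset"}),
}


class PlanRejected(ValueError):
    """Raised when a plan fails strict validation."""


def validate_plan(plan: dict) -> dict:
    if not isinstance(plan, dict):
        raise PlanRejected("plan must be an object")
    extra = set(plan) - {"op", "params"}
    if extra:
        raise PlanRejected(f"extra keys: {sorted(extra)}")
    op = plan.get("op")
    if op not in KNOWN_OPS:
        raise PlanRejected(f"unknown op: {op!r}")
    params = plan.get("params")
    if not isinstance(params, dict):
        raise PlanRejected("params must be an object")
    if op in PARAM_KEYS:
        allowed, required = PARAM_KEYS[op]
        unknown = set(params) - allowed
        if unknown:
            raise PlanRejected(f"unknown params: {sorted(unknown)}")
        missing = required - set(params)
        if missing:
            raise PlanRejected(f"missing params: {sorted(missing)}")
    return plan


# "E01005207" is used as a hypothetical area identifier.
ok = validate_plan({"op": "get_area", "params": {"area": "E01005207"}})
```

Strictness here means an unexpected key is an error rather than something silently dropped, which is the behaviour described for .strict() above.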

op: rank_areas

Rank areas

Filter and sort LSOAs across one or more signals. AND semantics across up to 8 filter signals; 11 comparison operators including percentile bands.

Shape: signals[].filter: lte / gt / percentile_between · sort_by
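The AND semantics can be pictured with an in-memory sketch. This is an illustration of the filtering rule only, not the executor's SQL: it models just the lte and gt operators from the sample plan, treats a missing value as a failed filter (an assumption), and runs on made-up rows.

```python
# Only the two operators that appear in the sample plan are modelled here;
# the full grammar lists 11, which this sketch does not try to reproduce.
OPERATORS = {
    "lte": lambda value, bound: value <= bound,
    "gt": lambda value, bound: value > bound,
}


def matches_all(row: dict, signals: list) -> bool:
    """True only if the row passes every filter on every signal (AND)."""
    for signal in signals:
        value = row.get(signal["key"])
        if value is None:
            return False  # assumption: a missing value fails the filter
        for op, bound in signal["filter"].items():
            if not OPERATORS[op](value, bound):
                return False
    return True


def rank_rows(rows: list, signals: list, sort_signal: str, limit: int) -> list:
    kept = [row for row in rows if matches_all(row, signals)]
    kept.sort(key=lambda row: row[sort_signal], reverse=True)  # direction: desc
    return kept[:limit]


PRICE = "property.median_price"
YOY = "property.price_change_pct_yoy"

# Made-up rows for illustration only; not real data.
rows = [
    {"code": "A", PRICE: 200000, YOY: 5.0},
    {"code": "B", PRICE: 300000, YOY: 9.0},   # fails the price filter
    {"code": "C", PRICE: 240000, YOY: -1.0},  # fails the YoY filter
    {"code": "D", PRICE: 100000, YOY: 12.0},
]
signals = [
    {"key": PRICE, "filter": {"lte": 250000}},
    {"key": YOY, "filter": {"gt": 0}},
]
top = rank_rows(rows, signals, YOY, limit=5)
```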
op: get_area

Get area

Return the full AreaProfile for one resolved area. Same shape as GET /v1/area on the Signals product; exposed here for plan chaining.

Shape: params: { area }
op: score_area

Score area

Run the deterministic scoring engine for one area with a chosen profile. Same engine as POST /v1/score; same engine_version stamp.

Shape: params: { area, preset, weights? }
op: find_peers

Find peers

k-NN over normalised signal values. Euclidean dimension-mean-squared, bounded [0,1], robust to missing dimensions. Default k=20, min 3 overlapping dims.

Shape: target: postcode | geo_code | area
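One plausible reading of the find_peers metric, sketched under stated assumptions: signal values are already normalised to [0,1], the distance is the square root of the mean squared difference over the dimensions both areas have, and pairs with fewer than 3 shared dimensions get no distance. Whether the production metric takes the square root is not stated here, so treat that step as a guess. The function name peer_distance is hypothetical.

```python
import math

MIN_OVERLAP = 3  # "min 3 overlapping dims" from the description above


def peer_distance(a: dict, b: dict):
    """Return (distance, n_dims_used), or None if too few shared dimensions.

    Both inputs map signal keys to values normalised into [0, 1];
    a value of None marks a missing dimension.
    """
    shared = [
        key for key in a
        if a[key] is not None and b.get(key) is not None
    ]
    if len(shared) < MIN_OVERLAP:
        return None
    mean_sq = sum((a[key] - b[key]) ** 2 for key in shared) / len(shared)
    # Averaging over shared dimensions keeps the result in [0, 1]
    # no matter how many dimensions are missing.
    return math.sqrt(mean_sq), len(shared)


# Made-up normalised vectors for illustration only.
area_a = {"x": 0.0, "y": 0.0, "z": 0.0, "w": None}
area_b = {"x": 1.0, "y": 1.0, "z": 1.0, "w": 0.5}
result = peer_distance(area_a, area_b)
```

Dividing by the number of shared dimensions, rather than summing, is what keeps two areas comparable when one of them is missing a signal.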
op: find_insights

Find insights

Anomaly screening. Rank LSOAs by |peer_relative_z| on a pre-materialised z-score signal. signal_key must end in _peer_relative_z.

Shape: signal_key: crime.total_12m_peer_relative_z
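The ranking rule reduces to a sort on absolute z-score. A minimal sketch, assuming the z-scores are already materialised and supplied as a mapping from area code to value; the function name rank_anomalies and the sample codes and values are made up for illustration.

```python
SUFFIX = "_peer_relative_z"


def rank_anomalies(z_by_area: dict, signal_key: str, limit: int) -> list:
    """Rank areas by |z|, keeping the signed value alongside it."""
    if not signal_key.endswith(SUFFIX):
        raise ValueError(f"signal_key must end in {SUFFIX}")
    scored = [
        {"geo_code": code, "peer_relative_z": z, "abs_z": abs(z)}
        for code, z in z_by_area.items()
    ]
    scored.sort(key=lambda row: row["abs_z"], reverse=True)
    return scored[:limit]


# Made-up z-scores for illustration only.
z_scores = {"A": 0.4, "B": -3.8, "C": 2.1, "D": -0.2}
flagged = rank_anomalies(z_scores, "crime.total_12m_peer_relative_z", limit=2)
```

Sorting on the absolute value surfaces areas that are unusually low as well as unusually high; the signed value tells the caller which.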
op: find_forecast

Find forecast

Linear regression projection for one signal at one LSOA. Postgres regr_slope / regr_intercept over the trailing window. Constant ±2·σ band. NOT a learned model.

Shape: window_months: 24 · horizon_months: 12
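The projection is ordinary least squares, which can be sketched outside Postgres. Assumptions in this sketch: observations are evenly spaced monthly values indexed 0..n-1, σ is the residual standard error with n-2 degrees of freedom, and the output field names other than n_observations, r2, slope_per_month and residual_stderr are hypothetical.

```python
import math


def linear_forecast(values: list, horizon_months: int) -> dict:
    """Fit y = intercept + slope * month and project with a constant band."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx                      # same quantity as regr_slope
    intercept = mean_y - slope * mean_x    # same quantity as regr_intercept
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, values))
    sst = sum((y - mean_y) ** 2 for y in values)
    sigma = math.sqrt(sse / (n - 2))       # assumption: n-2 degrees of freedom
    points = []
    for step in range(1, horizon_months + 1):
        value = intercept + slope * (n - 1 + step)
        points.append({
            "month_offset": step,
            "value": value,
            "lower": value - 2 * sigma,    # band width does not grow with step
            "upper": value + 2 * sigma,
        })
    return {
        "n_observations": n,
        "slope_per_month": slope,
        "residual_stderr": sigma,
        "r2": 1 - sse / sst if sst > 0 else None,
        "points": points,
    }


# Made-up series for illustration only.
forecast = linear_forecast([1.0, 3.0, 5.0, 7.0], horizon_months=2)
```

The band is constant because it uses a single σ for every projected month, which is simpler than a true prediction interval and should be read that way.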

One executor. Two input modes. Zero AI in the answer path.

A natural-language question routes through the planner; a programmatic plan skips the planner entirely. Both arrive at the same Zod validator and the same executor. Plans run against the database; AI never reads or returns a row.

Flow: INPUT · NL { question } → PLANNER (NL → plan) → ZOD STRICT (validate plan) → EXECUTOR (deterministic SQL) → DATABASE (signal_values, signal_timeseries) → RESPONSE (plan · rows). INPUT · PROGRAMMATIC { plan } skips the planner and enters at ZOD STRICT.
  • Planner seam

    AiProvider, not a model lock-in.

    The planner calls a provider seam, not Anthropic directly. Swapping models is a config change. The 92.9% baseline is measured through the seam with one provider behind it, so it is re-run on any model swap.

  • Zod strict

    Invalid plans get rejected before SQL.

    Every nested object is .strict(). Unknown ops, unknown params, extra keys: 422 with the raw planner output. The executor only ever sees valid plans.

  • Same executor

    NL and programmatic share the same code path.

    The executor does not know whether the plan came from a question or a client. plan_source is metadata for audit, not a branch in the runtime.

  • Replayable

    Every NL answer can be re-run as a programmatic call.

    The response echoes the executed plan. Paste it into a plan body and the LLM is never touched again. That is the audit-safety contract.
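The replay contract amounts to copying one field. A minimal sketch, assuming a response shaped like { plan, plan_source, results, meta } as described on this page; replay_body is a hypothetical helper and the response values are placeholders.

```python
import json


def replay_body(response: dict) -> str:
    """Turn any /v1/query response into a programmatic request body."""
    return json.dumps({"plan": response["plan"]})


# Placeholder response for illustration; "E01005207" is a hypothetical
# area identifier and the results are omitted.
nl_response = {
    "plan": {"op": "get_area", "params": {"area": "E01005207"}},
    "plan_source": "nl",
    "results": [],
    "meta": {"generated_at": "<timestamp>"},
}

body = replay_body(nl_response)
```

The replay body deliberately carries no question field, so a second run cannot reach the planner.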

Measured accuracy. Published corpus. Falsifiable.

We run a CLI eval harness against a curated corpus of natural-language questions and compare the emitted plan to the expected plan via a deep-diff. Every result is in-repo and reproducible.

92.9% planner accuracy on a 14-case curated corpus, measured against claude-sonnet-4-20250514

get_area: 2 / 2
score_area: 2 / 2
find_peers: 2 / 2
find_insights: 2 / 2
find_forecast: 2 / 2
rank_areas: 3 / 4

92.9% on n=14 carries a wide Wilson 95% confidence interval of roughly 70 to 99 percent. The corpus is small by design and version-controlled. The headline number is provider-specific and is re-run on any model swap. End-to-end NL → result accuracy is a separate eval; this number covers the NL → plan seam only.
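The interval can be checked by hand. The sketch below applies the standard Wilson score formula to 13 correct plans out of 14, the total of the per-op counts above; the function name wilson_interval is just a label for this example.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


low, high = wilson_interval(13, 14)
# low is about 0.685 and high about 0.987
```

The lower bound sits well below the headline figure, which is why the number is best read alongside its sample size.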

One typed query plane. Three convenience endpoints.

/v1/query is the typed plane (6 ops). /v1/peers, /v1/insights, /v1/forecast are convenience endpoints over the same plan ops. Same executor, two surfaces.

The typed query plane. Send EXACTLY ONE of {question} or {plan}. NL routes through the planner; programmatic skips it. Both arrive at the same executor. Response always echoes the executed plan plus plan_source.

Parameters

  • question (string)

    Natural-language question. Routed through the planner. Mutually exclusive with plan.

  • plan (QueryPlan)

    Pre-built plan matching QueryPlanSchema (Zod-strict). The LLM is never touched. Mutually exclusive with question.

  • bundle (string, Lever)

    Custom signal-bundle id. Plan signals are gated against the bundle whitelist after planning; 422 bundle_signal_not_allowed otherwise.

Response

Per-op union response. Every variant carries { plan, plan_source: 'client' | 'nl', results, meta:{generated_at} }. X-Engine-Version header carries the effective methodology pin.

Status codes

200
Plan validated and executed.
400
Request shape invalid (both question and plan, or neither).
401
Missing or invalid API key.
404
Signals API flag off.
422
Planner returned no JSON, invalid plan, LLM error, or plan referenced signals outside the active bundle. Raw planner output carried in the body.
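The exactly-one-of rule behind the 400 status can be pre-checked on the client. A minimal sketch, assuming the request body is a plain dict; plan_source_for is a hypothetical helper, and the server remains the authority on what is valid.

```python
def plan_source_for(body: dict) -> str:
    """Return the plan_source a request should produce: 'nl' or 'client'.

    Raises ValueError for the cases the API answers with 400:
    both question and plan present, or neither.
    """
    has_question = "question" in body
    has_plan = "plan" in body
    if has_question == has_plan:
        raise ValueError("send exactly one of question or plan")
    return "nl" if has_question else "client"


source = plan_source_for({"question": "cheap places where prices are rising"})
```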

Same query plane. Five different buyer workflows.

One executor. Five very different questions buyers are trying to answer. The grammar is the same; the angle of attack is not.

CRE and site selection

The problem

A retail or site-selection team has to screen hundreds of candidate catchments against compound criteria: affordable footprint, rising demographic momentum, low crime, deprivation profile that matches the brand. Today they stitch ONS extracts, Land Registry and a crime spreadsheet, then re-rank by hand. Each refresh means rebuilding the join.

Why Intelligence

One typed compound rank_areas plan answers all of that. signals[] of up to 8 AND-joined filters, sort_by any of them, country or LAD scope, limit. find_peers narrows the shortlist to areas like the best-performing store. find_insights flags catchments where one signal is unusually high or low vs the peer group. The plan grammar IS the API.

Their value

Hundreds of catchment screens compress into one round trip. The criteria are version-controlled JSON, not a spreadsheet. The shortlist is reproducible: every result echoes the executed plan so a colleague can paste it back and get the same answer.

Screen the whole UK against your compound site criteria in one typed call, then ask for the peer set of your best-performing catchment.

Public sector

The problem

Research and policy teams have to defend every number they publish. A black-box AI score is unusable; they need to point at the methodology, the inputs, and the SQL. The analysis also needs to be reproducible next quarter when the data refreshes.

Why Intelligence

Every response echoes the executed plan plus plan_source ('client' or 'nl'), so any answer is replayable as a programmatic call. Forecast meta exposes n_observations, r2, slope_per_month, residual_stderr; insights expose signed peer_relative_z and abs_z; peers expose distance and n_dims_used. No inference inside the executor.

Their value

Methodology defensibility. The team can point at a published methodology, a Zod schema, and the SQL that produced every row. The AI is constrained to picking the query, never to setting the numbers.

An AI query plane where the AI is the interface, not the answer. Every row traces to deterministic SQL and a published methodology.

Lenders

The problem

A regulated lender needs an answer they can defend to a model risk committee. 'Our planner uses AI' is a non-starter unless it is measurable, version-pinned, and auditable. The methodology cannot silently change between two quarterly portfolio runs.

Why Intelligence

Three guarantees together: the 92.9% planner accuracy on the 14-case curated corpus is published with its methodology, the executed plan plus plan_source ride in every response so model risk can replay any NL answer as a deterministic {plan} POST, and the X-Engine-Version header honours methodology pinning per org so two runs at the same pin return the same numbers.

Their value

Auditable AI-assisted screening. The compliance story is 'here is the corpus, here is the accuracy number, here is the version we ran under, here is the plan that produced this row'. Not 'the LLM told us'.

AI-assisted area queries with a published accuracy number, a typed plan you can replay, and a methodology-pin header so quarterly runs are deterministic.

Insurance and InsureTech

The problem

An underwriter needs to comp a risk postcode against its true peer group, not 'national average' and not 'same town'. Areas with similar demographic and built-environment signatures. They also need to spot catchments drifting away from that peer norm before the loss ratio tells them.

Why Intelligence

find_peers gives a stable, symmetric similarity metric (Euclidean dimension-mean-squared over normalised signals in [0,1]). find_insights then ranks LSOAs by |peer_relative_z| on crime.total_12m_peer_relative_z or property.median_price_peer_relative_z, so the underwriter scans for unusually high crime vs the peer group, not in absolute terms. cohort_id lets the org pin a custom peer set when the global graph is not tight enough.

Their value

A relative-risk lens, not just an absolute-risk lens. The underwriter can defend a flag as 'this catchment is 3.8 stddev from its peer group on crime', with the peer set itself materialised and inspectable.

Peer-relative anomaly screening over a materialised similarity graph. Comp risk against a real peer group, not a postcode-district average.

PropTech

The problem

Listing platforms and property-search products want to surface 'areas like this one' tiles and answer ad-hoc natural-language queries from users (e.g. 'cheap places to buy where prices are rising and crime is low'). They do not want to build a query planner, an AI ops stack, or a peer-graph cache.

Why Intelligence

POST /v1/query takes a free-text question; the planner translates through the SAME deterministic executor used in programmatic mode; the response carries the executed plan back. PropTech can render the rows directly OR pre-stage common queries as {plan} bodies for high-traffic pages (no LLM cost per pageview). find_peers is the 'similar areas' tile in one call.

Their value

Two surfaces over one executor. NL for ad-hoc, {plan} for high-traffic. No internal planner to maintain, no LLM-cost-per-pageview unless they want it.

One typed API behind both your 'similar areas' tile and your natural-language area search.

Query UK areas in JSON or English. Get the same deterministic answer.

Six plan ops. One typed grammar. AI emits the plan, the database produces the rows, the response always echoes the plan so any natural-language answer can be replayed as a programmatic call. Engine version 2.0.2 is stamped on every response.