Research

Pushing reasoning
to the edge.

Our research lab studies how agents deliberate, use tools reliably, and stay governable as they take on real authority. Here's what we're working on and what we've published.

peer-reviewed papers

avg. reasoning-error reduction

open benchmarks released

median edge inference

Focus areas

Four lines of inquiry

Reasoning · 2026

Deliberate-then-Act: self-critique loops for high-stakes decisions

A framework where agents draft, critique, and revise a plan before committing — cutting downstream errors by 38% on operational benchmarks.

Read paper
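
As a rough illustration of the draft, critique, and revise loop described above, here is a minimal sketch. It is not the paper's implementation: the function names, the `max_rounds` budget, and the toy stand-ins are all assumptions made for this example.

```python
from typing import Callable, List, Tuple


def deliberate_then_act(
    draft: Callable[[], str],
    critique: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,  # hypothetical budget, not a value from the paper
) -> Tuple[str, int]:
    """Draft a plan, then critique and revise it until the critique finds
    no issues or the round budget runs out. Returns (plan, revisions_made)."""
    plan = draft()
    for revisions_made in range(max_rounds):
        issues = critique(plan)
        if not issues:
            # Nothing left to fix: commit to the current plan.
            return plan, revisions_made
        plan = revise(plan, issues)
    # Budget exhausted: return the latest revision without a final critique.
    return plan, max_rounds


# Toy stand-ins (hypothetical) so the loop can be run end to end.
def toy_draft() -> str:
    return "restart service"


def toy_critique(plan: str) -> List[str]:
    return [] if "backup" in plan else ["no backup step"]


def toy_revise(plan: str, issues: List[str]) -> str:
    return "take backup, then " + plan


final_plan, revisions = deliberate_then_act(toy_draft, toy_critique, toy_revise)
print(final_plan, revisions)
```

In a real agent the three callables would presumably be model calls; keeping them as plain functions makes the control flow easy to test in isolation.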

Tool use · 2026

Reliable function-calling under partial failure

How agents recover when an API times out, returns malformed data, or partially succeeds — with verification and graceful rollback.

Read paper

Edge inference · 2025

Sub-second reasoning on live operational streams

Distillation and caching techniques that bring multi-step reasoning to 120ms median latency at the point of decision.

Read paper

Governance · 2025

Confidence-gated autonomy

Calibrating when an agent should act, ask, or escalate — so authority scales with demonstrated reliability.

Read paper
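
One simple way to picture the act, ask, or escalate decision is a pair of confidence thresholds. The sketch below is an assumption for illustration only: the `gate` function, the `Decision` names, and the threshold values 0.9 and 0.6 are hypothetical and do not come from the paper.

```python
from enum import Enum


class Decision(Enum):
    ACT = "act"            # proceed autonomously
    ASK = "ask"            # request confirmation from a human
    ESCALATE = "escalate"  # hand the decision off entirely


def gate(
    confidence: float,
    act_threshold: float = 0.9,  # hypothetical value
    ask_threshold: float = 0.6,  # hypothetical value
) -> Decision:
    """Map a calibrated confidence score in [0, 1] to a decision."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= act_threshold:
        return Decision.ACT
    if confidence >= ask_threshold:
        return Decision.ASK
    return Decision.ESCALATE


for score in (0.95, 0.70, 0.20):
    print(score, gate(score).value)
```

Under this framing, letting authority scale with demonstrated reliability could amount to lowering `act_threshold` for task types where the agent's track record supports it.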

Reasoning · 2025

Explanations as a first-class output

Training agents to produce human-auditable rationales that hold up under scrutiny — without degrading task accuracy.

Read paper

Tool use · 2025

SparkBench: an open benchmark for workflow agents

200 realistic enterprise tasks spanning CRM, ticketing, and analytics — open-sourced for the community to build on.

View benchmark

Read with us

Get new papers, benchmarks, and lab notes in your inbox. No noise — just the research.

Subscribe to the lab
