AI infrastructure, tools, and open research.
Sparkco is an open-source research project on the post-AGI stack — the runtime containers agents live in, the harnessing (glue code) inside them, and the messaging between them. It's built by the team behind SimpleFunctions, where we're exploring how live prediction-market probabilities can serve as a real-time world state for AI agents. The site is our public log of that work: a live feed of AI and prediction-market signals, plus the setups and tools we recommend for agent builders.
We ship tools as CLIs first, not MCP — 0 tokens to expose, ~100% reliable, pipe-composable.
Parametric memory: replacing the context window with weights.
Today's chat models remember by re-reading the entire conversation on every turn. Compaction loses information, retrieval crowds the window, and a new session starts blank. We're testing whether the facts, preferences, and behavior in a dialogue can be encoded directly into model weights — leaving the context free for what's actually being said now.
Want to collaborate? patrick@simplefunctions.dev
The context window is a finite token sequence, fully recomputed on every turn. Every existing workaround — summarization memory, vector retrieval, KV caching — moves the cost without solving it: long context drifts, compaction discards information, retrieval crowds the same window it pulls from. If conversational state could live in weight deltas instead of tokens, the window would only need to hold the current turn.
- Test-time training. ByteDance In-Place TTT (ICLR 2026 oral) and Stanford/NVIDIA TTT-E2E update MLP projection weights online during inference, compressing long context into fast weights. All published work targets long-document throughput; nobody has tested whether the fast weights survive once the document is dropped from context.
- Hypernetwork → adapter. Sakana's Doc-to-LoRA (Feb 2026) and P2P (Oct 2025) train a hypernet that emits a LoRA from raw text or a user profile in under a second. Validates "text → weights" as a tractable mapping — but neither was designed for accumulating dialogue history.
- Dialogue-direct fine-tuning. PLUM (Nov 2024) fine-tunes a LoRA on dialogue Q/A pairs and matches RAG at 100 turns. MemLoRA trains memory management itself as a LoRA. IBM's Activated LoRA (Dec 2025) solves multi-LoRA hot-swap without KV recompute — making per-conversation memory modules feasible.
- Knowledge editing. ROME and MEMIT do surgical single-fact edits on weights, but catastrophic forgetting appears past ~1000 edits. Not a candidate at dialogue scale.
These live in disjoint communities — efficient inference, recsys, personalization NLP, on-device, model editing — and have never been compared on the same benchmark. None has been evaluated end-to-end on a real user's multi-hundred-turn history across technical, strategic, philosophical, and personal domains, with the conversation removed from context. Existing benchmarks (RULER, needle-in-haystack, LaMP) are synthetic or shallow.
- TTT fast weights as memory. Ingest a fact-bearing dialogue with In-Place TTT, drop the context, probe. Iterations 1–2 ran on a single A100 with a self-trained checkpoint (full write-up linked on the site). The result so far is negative: trained fast weights produced perturbation noise, not a retrievable encoding, even at small inference-time scales. Joint base+TTT training is the next line of attack.
- Doc-to-LoRA over real dialogues. Same probes, hypernet-generated LoRA instead of TTT. Compare raw-dialogue input against structured-profile input for information retention.
- Modular memory adapters. Decompose dialogue history into facts, preferences, and project context. Train one LoRA per axis; hot-swap with Activated LoRA. Measure single-load vs combined-load interference.
- Capacity and forgetting curves. Stream new facts turn-by-turn; locate the point at which turn N overwrites turn 1. Trace the capacity–fidelity tradeoff.
- A "conversation memory retention" benchmark — three difficulty tiers, six fact dimensions. None currently exists for this scenario.
- First head-to-head comparison of TTT fast weights, Doc-to-LoRA, PLUM-style dialogue-LoRA, and classical summarization memory on the same eval.
- An empirical answer to whether modular per-domain memory adapters can be composed without cross-interference.
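The capacity and forgetting question above can be previewed with a toy model. The sketch below is not TTT and not a LoRA: it swaps in a classical Hebbian outer-product memory of fixed size, writes one random key/value "fact" per turn, and probes the first fact after every write. The names `forgetting_curve`, `dim`, and `n_facts` are hypothetical, and the numbers it produces say nothing about real models; it only illustrates what a capacity-fidelity curve looks like when later writes interfere with turn 1.

```python
import random


def forgetting_curve(dim=32, n_facts=400, seed=0):
    """Return recall of fact 1 (fraction of correct bits) after each write.

    Toy assumption: facts are random +/-1 vectors stored in a dim x dim
    matrix by summing outer products (a Hebbian associative memory).
    """
    rng = random.Random(seed)

    def rand_vec():
        return [rng.choice((-1, 1)) for _ in range(dim)]

    memory = [[0.0] * dim for _ in range(dim)]  # fixed-size "fast weights"
    first_key, first_value = None, None
    curve = []
    for turn in range(n_facts):
        key, value = rand_vec(), rand_vec()
        if turn == 0:
            first_key, first_value = key, value
        # Write: memory += outer(value, key) / dim
        for i in range(dim):
            row, v_i = memory[i], value[i]
            for j in range(dim):
                row[j] += v_i * key[j] / dim
        # Probe fact 1: compare sign(memory @ first_key) with first_value
        correct = 0
        for i in range(dim):
            activation = sum(memory[i][j] * first_key[j] for j in range(dim))
            bit = 1 if activation >= 0 else -1
            correct += bit == first_value[i]
        curve.append(correct / dim)
    return curve


curve = forgetting_curve()
# First turn at which recall of fact 1 falls below 90%, if any.
drop_turn = next((t + 1 for t, r in enumerate(curve) if r < 0.9), None)
print("recall after turn 1:", curve[0], "| first turn below 0.9:", drop_turn)
```

Sweeping `dim` while holding `n_facts` fixed gives a rough picture of how capacity scales with the size of the memory in this toy setting.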
Three layers, and what's already out there.
Containers
Sandboxes, microVMs, durable runtimes — where the agent lives.
- e2b: Code-interpreter sandboxes; the default for general-purpose runs.
- Modal: gVisor + GPU-native; sub-1s starts, scales to 50k+ concurrent.
- Daytona: Open source; ~90–200ms cold start, fastest in class.
- Fly.io Sprites: Stateful microVMs with checkpoint/restore and persistent NVMe.
- Vercel Sandbox: Firecracker + idle-billed; the JS-stack default.
SimpleFunctions sits on top: autonomous daemons, scheduler, and risk gates for prediction-market agents.
Harnessing
Glue code inside the container. Context curation, tool routing, the runtime loop.
- Claude Agent SDK: Anthropic's harness; powers Claude Code itself.
- Inspect AI: Eval-grade harness used by METR, Apollo, and government AISIs.
- LangGraph: LangChain's runtime layer (durable execution, threads, HITL).
- Claude Code / Cursor / Aider: Opinionated harnesses-in-product; not sold separately.
SimpleFunctions ships /api/agent/world as ~800-token markdown context, plus a CLI with --json for deterministic harness mode.
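A minimal sketch of what "deterministic harness mode" can look like from the harness side: run a CLI, require a zero exit code, and parse stdout as JSON instead of scraping text. The helper name `run_cli_json` is hypothetical, and the command below is a stand-in built from the Python interpreter with made-up output, because the real CLI's subcommands and output schema are not specified here. Only the `--json` flag itself comes from the text above.

```python
import json
import subprocess
import sys


def run_cli_json(argv, timeout=30):
    """Run a command, fail loudly on a non-zero exit, parse stdout as JSON."""
    completed = subprocess.run(
        argv,
        capture_output=True,  # collect stdout/stderr instead of inheriting
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on non-zero exit
        timeout=timeout,
    )
    return json.loads(completed.stdout)


# Stand-in for a real tool invoked with --json; the payload is invented.
STAND_IN = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'ok': True, 'markets': 3}))",
]

state = run_cli_json(STAND_IN)
print(state["markets"])
```

Because the tool is just a process, the harness pays no prompt tokens to describe it, and a failure surfaces as an exception rather than as ambiguous text.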
Messaging
Between containers. Discovery, identity, stateful tasks — not tool-calling.
- A2A: Google's Agent2Agent (Linux Foundation, 2025), the emerging consensus.
- ANP: Peer-to-peer agent network over HTTPS + DIDs for identity.
- Letta: Shared memory blocks + thread-based message passing.
- AutoGen GroupChat: In-process orchestration; supervisor / round-robin patterns.
SimpleFunctions Chatbus: agents DM and broadcast in real time — the messaging substrate for trading agents.
What we ship publicly.
Harness & agents
- harness: Dual pi-agent runtime. Two agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol.
- Memento: Context-integrity stress testing for Claude. Adversarial harness tampers with memory between sessions and watches whether the agent notices.
- claude-arena: AI vs AI vs AI. Autonomous Claude agents battle in a live CTF arena with trading.
- claude-trading: Autonomous Claude agents trade against each other on a live exchange, maker vs takers.
SimpleFunctions
Curated lists
- awesome-cli-agentic-tools: CLI tools for AI agents (prediction markets, agent frameworks, coding agents, browser agents, developer CLIs).
- awesome-prediction-markets: APIs, datasets, and resources for developers and AI agents.
- prediction-markets-reading: 256 articles on Kalshi, Polymarket, market microstructure, calibration, and trading strategies.
Terminal tools
- kalshi-orderbook-viewer: Depth charts for prediction markets, in your terminal.
- kalshi-price-monitor: Alerts on significant Kalshi/Polymarket price changes.
- polymarket-sports-mm: Sports market maker; pre-game and live quoting tuned to the quadratic reward function.
- polymarket-ticker-resolver: Resolve any Polymarket ID format (numeric, conditionId, CLOB token, slug). Zero deps.
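To make the four ID formats concrete, here is a rough sketch of how an input string might be classified before it is resolved. This is not the resolver's actual logic. The patterns are assumptions: a conditionId is taken to be a 0x-prefixed 32-byte hex string, a CLOB token ID a very long decimal integer, a numeric market ID a short decimal, and a slug lowercase words joined by hyphens. The function name and the 20-digit cutoff are hypothetical.

```python
import re


def classify_polymarket_id(raw):
    """Guess which ID format a string is in (heuristic, assumed patterns)."""
    text = raw.strip()
    if re.fullmatch(r"0x[0-9a-fA-F]{64}", text):
        return "conditionId"
    if re.fullmatch(r"\d+", text):
        # Assumed cutoff: token IDs are far longer than numeric market IDs.
        return "clob_token" if len(text) > 20 else "numeric"
    if re.fullmatch(r"[a-z0-9]+(?:-[a-z0-9]+)*", text):
        return "slug"
    return "unknown"


for example in ("253591", "0x" + "ab" * 32, "7" * 76, "will-it-rain-tomorrow"):
    print(classify_polymarket_id(example))
```

The digit-only check runs before the slug check because a purely numeric string would otherwise also match the slug pattern.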
Signals & probability
- prediction-market-edge-detector: Detect mispricings across 30,000+ markets.
- prediction-market-regime: Real-time crisis / risk-off / risk-on / complacent classifier.
- prediction-market-uncertainty: Uncertainty index from 30,000+ markets; one number, 0–100.
- causal-tree-decomposition: Standalone causal-tree probability engine; thesis → weighted confidence. Zero deps.
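One plausible way to collapse many market probabilities into a single 0–100 number is mean binary entropy, sketched below. This is an illustration of the idea only, under the assumption that every market is a binary contract priced as a probability; it is not the formula the uncertainty tool listed above uses, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a yes/no outcome with probability p of yes."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def uncertainty_index(probabilities):
    """Mean binary entropy across markets, scaled to the range 0 to 100."""
    if not probabilities:
        raise ValueError("need at least one market probability")
    total = sum(binary_entropy(p) for p in probabilities)
    return 100.0 * total / len(probabilities)


print(uncertainty_index([0.5, 0.5]))  # every market is a coin flip
print(uncertainty_index([0.0, 1.0]))  # every market is settled
```

An unweighted mean treats a thin market the same as a liquid one; weighting by volume or open interest would be a natural refinement.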
World-state plumbing
SDK adapters
- crewai-prediction-markets: CrewAI tools.
- langchain-prediction-markets: LangChain tools.
- openai-agents-prediction-markets: OpenAI Agents SDK tools.
- vercel-ai-prediction-markets: Vercel AI SDK tools.
- create-prediction-market-agent: Scaffold a project. Works with LangChain, CrewAI, OpenAI Agents SDK, or vanilla TypeScript.
- prediction-market-mcp-example: Minimal MCP server example.
Live feed
Mixed stream from prediction markets, theses, new listings, and the blog.
BNB Up or Down - May 14, 1:50PM-1:55PM ET
Ethereum Up or Down - May 14, 1:50PM-1:55PM ET
Dogecoin Up or Down - May 14, 1:50PM-1:55PM ET
Hyperliquid Up or Down - May 14, 1:50PM-1:55PM ET
Bitcoin Up or Down - May 14, 1:50PM-1:55PM ET
Solana Up or Down - May 14, 1:50PM-1:55PM ET
Hormuz blockade disrupts fertilizer supply chains. Fertilizer prices spike, US farm costs surge, foo
The thesis confidence increases slightly due to intensified market focus on Strait of Hormuz transit volatility and persistent fertilizer price pressure, though structural political outcomes remain near zero probability.
US freezes Russian assets, sanctions Iran, bombs Iran — each action tells the world the dollar syste
The thesis remains under pressure as Bitcoin-related market indicators for 2026 have collapsed, while Gold maintains significant thesis-implied edge despite moderate price increases. Confidence is adjusted slightly downward to 0.32 as the '
California 2026 Governor: Mahan Underpriced at 15¢. The mailman's son from Watsonville has the stron
Recent market signals show a modest increase in the probability of a Newsom-Becerra endorsement, slightly pressuring the path of least resistance for independent outsiders like Mahan. Thesis confidence remains low and stable as markets awai
Sell Hormuz normalization: 31¢ is still too rich
R4 prices end-of-June Hormuz normalization at 31¢, but end-of-May is already at 12¢ — the pace of de-escalation implied by that 19¢ jump in one month is inconsistent with Iran deal odds collapsing 12¢ to just 2¢ on high volume. With blockad
Buy $120 WTI crude: 48¢ with momentum and supply catalyst
WTI $120 by end-of-June surged 6¢ in a single session to 48¢, while $140 rose 2¢ to 20¢ — a convex payoff ladder with strong directional momentum. Hormuz blockade is the structural driver; USO +4.04% and nat gas +5.9% confirm physical marke
Contrarian: Hezbollah disarmament at 18¢ is mispriced low
The 'Hezbollah disarms by December 31' contract surged 6¢ to 18¢ — a 50% single-day increase signaling a genuine regime shift in Lebanon probability. This is a contrarian long against the oil-shock narrative: a Hezbollah disarmament would d
Buy Chris Coons YES vote at 5¢: 2,972 IY screams mispricing
M17 prices at just 5¢ with an implied yield of 2,972 on a 233-day horizon — one of the highest IY figures in the dataset at a tight spread. The market implies near-certainty that Coons votes NO on the next Fed Chair nominee, but with bipart
Retail sales miss is live: buy NO at 7¢ with 100k IY
Y4 prices US retail sales MoM for April 2026 above the threshold at just 7¢ — implying a 93% probability of a miss — with an IY of 100,000 on a 1-day horizon and a CRI of 13.3. With CPI at 4.0% squeezing real consumer purchasing power and r
Government shutdown plus Republican House at 12¢: political optionality buy
R1 prices the combined 'shutdown AND Republican House 2026' outcome at 12¢, with a regime shift from taker to neutral (score 0.45) indicating flow is stabilizing after a selloff. L1 prices Congress overriding Trump's veto before 2027 at 10¢
Buy China handshake duration lag: 52¢ contagion gap wide open
C5 and C6 show trigger contracts moving -43¢ and -51¢ respectively while the lagging 'handshake duration' contract sits at 21¢ — a 50-52¢ contagion gap that has not closed. C2 confirms with a 58¢ gap on a -24¢ trigger move. The lagging cont
Sell Trump China announcement contract: 55¢ gap signals over-pricing
C3 shows a -55¢ contagion gap where the lagging 'official China announcement' contract sits at 82¢ while the trigger moved only +24¢ — the lagging contract is OVER-priced relative to the trigger's signal. At 82¢, the announcement contract p
The United States will launch a ground invasion of Iran. After 5 weeks of airstrikes, the US faces t
Thesis confidence drops as multiple mediation channels (Oman, Pakistan) report breakthroughs, directly contradicting the 'no diplomatic off-ramp' core assumption. Market prices for oil and shipping transit have aggressively corrected, sugge
Putin profits from Iran war oil prices. Russian military budget fully funded. Ukraine peace talks st
The thesis confidence faced a minor downward revision as oil futures markets showed a trend toward stabilizing or retreating from high-end upside bets, contradicting the expectation of an extreme price spike supporting Russia's war budget.
Oil above $100 drives electricity costs up. Data center operating costs surge. AI companies delay or
Recent market signals show a strong retreat in energy price expectations, specifically regarding WTI oil and natural gas benchmarks, which weakens the thesis that electricity costs will surge to the point of impacting data center expansion.
What we'd install on a fresh machine
Three of ours, five from the community we trust.
npm i -g @spfunctions/cli
- @spfunctions/harness (Sparkco): Dual-agent runtime. Two pi-agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol. $1/day to run.
npm i -g @spfunctions/harness
Browse 69+ CLI tools
Taste-curated. Filter by category, sorted by Sparkco-first then stars.
npm i -g @spfunctions/cli
git clone https://github.com/spfunctions/polymarket-sports-mm
- @spfunctions/prediction-market-mcp (Sparkco): MCP server with 4 tools. Works with Claude, Cursor, VS Code.
npx @spfunctions/prediction-market-mcp
pip install simplefunctions-ai
git clone https://github.com/spfunctions/prediction-market-mcp-example
git clone https://github.com/spfunctions/kalshi-price-monitor
git clone https://github.com/spfunctions/prediction-market-context
git clone https://github.com/spfunctions/causal-tree-decomposition
- create-prediction-market-agent (Sparkco): Scaffold agent projects: LangChain, CrewAI, OpenAI, vanilla TS.
npx create-prediction-market-agent
uses: spfunctions/world-state-action@v1
npm i langchain-prediction-markets
npm i openai-agents-prediction-markets
npm i vercel-ai-prediction-markets
pip install crewai-prediction-markets
npm i agent-world-awareness
git clone https://github.com/spfunctions/prediction-market-edge-detector
- @spfunctions/harness (Sparkco): Dual-agent runtime. Two pi-agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol. $1/day to run.
npm i -g @spfunctions/harness
- @spfunctions/bi (Sparkco): Agent-friendly BI CLI. Query CSV/JSON/Parquet with SQL via DuckDB. 4 commands: head, schema, query, convert.
npm i -g @spfunctions/bi
code --install-extension saoudrizwan.claude-dev
pip install openai-agents
go install github.com/xo/usql@latest
brew install stripe/stripe-cli/stripe
go install github.com/cube2222/octosql/cmd/octosql@latest
npx @anthropic/playwright-mcp
git clone https://github.com/nweii/prediction-market-analysis
pip install sqlite-utils
brew install supabase/tap/supabase
git clone https://github.com/Polymarket/agents
git clone https://github.com/elizaOS/kalshi-ai-trading-bot
git clone https://github.com/berlinbra/polymarket-mcp-server
git clone https://github.com/polybot-nexus/polybot
git clone https://github.com/PredictOS/predictos
pip install dr-manhattan
git clone https://github.com/CloddsBot/cloddsbot
git clone https://github.com/polymarket-pipeline/pipeline
git clone https://github.com/gnosis/prediction-market-agent
git clone https://github.com/kalshi-trading/bot-cli
pip install kalshi-python
pip install prediction-market-agent-tooling
Latest from the blog
Insights on AI agents, prediction markets, and developer tools.
Automated Prediction Market Trading: CLI Agents on Kalshi
A practical guide for developers and traders on using CLI-based agents to automate order placement on Kalshi prediction markets. Covers thesis-driven trading logic, real tickers, and the agentic runtime behind production-grade automation.
Prediction Market Terminal Dashboard: Bloomberg-Style Monitoring for Kalshi Traders
A practical guide to building a professional-grade terminal dashboard for monitoring Kalshi prediction markets in real time. Covers CLI tooling, agentic scanning, position tracking, and thesis-driven trade execution.
Prediction Market Edge Detection: How to Find Mispriced Contracts on Kalshi
A systematic approach to finding mispriced prediction market contracts using causal models, orderbook analysis, and executable edge calculations.
Thesis-Driven Prediction Market Trading: Why Causal Models Beat Signal Chasing
Signal-based bots react to noise. Thesis-driven agents understand why prices should move. Here's how causal models change prediction market trading.
AI Agents for Prediction Markets: How SimpleFunctions Connects Claude to Kalshi
How to connect your AI agent to prediction market data using SimpleFunctions MCP server — get context, inject signals, and trade on Kalshi.
How to Build a Prediction Market Trading Bot with SimpleFunctions CLI
Build a prediction market bot that scans for edges, monitors thesis confidence, and executes trades on Kalshi — all from the terminal.