Static reference docs — architecture, strategy specs, ops runbook, metric definitions, glossary
The factory is a framework for rapidly spinning up, paper-trading, and evaluating Polymarket strategies. It now has explicit runtime environments for research, paper, and live execution.
| Path | Purpose |
|---|---|
| factory/runner.py | Main run loop — orchestrates all strategies per cycle |
| factory/environment.py | Runtime environment policy: research / paper / live gating |
| factory/strategies/*.py | Individual strategy implementations |
| factory/broker.py | Trade opening/closing, portfolio state |
| factory/live_broker.py | Real-money execution path scoped to live trades only |
| factory/db.py | SQLite schema, queries, migrations |
| factory/feed.py | Gamma API market fetching + formatting |
| factory/claude.py | Claude API wrapper for strategy reasoning |
| factory/notify.py | WhatsApp/alert dispatch |
| factory/models.py | Signal, Trade, Run dataclasses |
| eval/report.py | Weekly evaluation report generator |
| scripts/export_dashboard_data.py | Exports JSON snapshot for this dashboard |
| scripts/build_replay_benchmark.py | Builds strategy-level replay benchmark summaries from logged signals, execution checks, and resolved outcomes |
| scripts/update_wiki.py | Generates wiki/*.md from DB via Claude (Karpathy pattern) |
| data/factory.sqlite3 | Live database (gitignored) |
| Environment | Behavior |
|---|---|
| research | Scans and logs signals only. Never opens or resolves positions. |
| paper | Paper-only trading path. Opens and resolves only paper trades. |
| live | Real-money path. Only explicit mode="live" plus live_ready=True strategies can execute. |
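The gating rules in the table above can be sketched as a single predicate. This is a minimal illustration, not the contents of factory/environment.py: the function name `can_trade` and its signature are hypothetical, and only the `mode` and `live_ready` flags come from this document.

```python
def can_trade(environment: str, mode: str, live_ready: bool) -> bool:
    """Return True if a strategy may open or resolve positions.

    Hypothetical helper: `mode` and `live_ready` mirror the strategy flags
    named in the environment table; the function is not from the codebase.
    """
    if environment == "research":
        return False                      # research only scans and logs signals
    if environment == "paper":
        return mode != "live"             # live-mode strategies stay out of paper
    if environment == "live":
        return mode == "live" and live_ready
    raise ValueError(f"unknown environment: {environment!r}")
```

Raising on an unrecognized environment keeps the sketch fail-closed: a typo in the environment name cannot silently enable trading.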
Thesis: Recent news contains information not yet priced into prediction markets.
Method: Claude scans top markets + news headlines, picks topics with likely EV, then estimates p̂ per market from news snippets.
| Parameter | Value |
|---|---|
| max_position_usdc | $15 |
| min_ev_pp | 10 pp |
| hold window | 7–30 days |
| n_topics per run | 3 |
| min_volume | $10,000 |
| days_to_close | 7–60 days |
| max_trades_per_run | 3 |
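As an illustration of the min_ev_pp filter, the sketch below computes edge in percentage points for the YES side as the gap between p̂ and the market price. The function names are hypothetical, and the formula is an assumption based on the glossary definition of EV; it ignores fees and slippage.

```python
def ev_pp(p_hat: float, market_price: float) -> float:
    """Edge in percentage points for buying YES at `market_price`.

    Assumed definition: (estimated probability - price) * 100.
    """
    return (p_hat - market_price) * 100.0


def passes_ev_filter(p_hat: float, market_price: float,
                     min_ev_pp: float = 10.0) -> bool:
    """Apply the min_ev_pp threshold from the parameter table."""
    return ev_pp(p_hat, market_price) >= min_ev_pp


if __name__ == "__main__":
    # p_hat of 0.62 against a 0.50 price is roughly a 12 pp edge.
    print(round(ev_pp(0.62, 0.50), 2), passes_ev_filter(0.62, 0.50))
```

Values that sit exactly on the threshold can land on either side because of floating-point rounding, so a real filter may want a small tolerance.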
Thesis: In multi-outcome markets the sum of all YES prices should be ~1.0. When the sum is significantly below 1.0, buying all outcomes locks in basket EV.
Method: Scan multi-outcome events, compute basket sum, filter for clean legs, score by gap and volume.
| Parameter | Value |
|---|---|
| max_position_usdc | $8 per leg |
| arb_threshold | ≤ 0.90 |
| min_outcomes | 3 |
| min_volume | $15,000 |
| days_to_close | 7–30 days |
| max_new_baskets_per_run | 3 |
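The basket arithmetic can be sketched as follows. The sketch assumes the outcomes are mutually exclusive and exhaustive, that one share of every leg is bought, and that fees and fill quality are ignored; the function names are hypothetical.

```python
from typing import Sequence


def basket_edge_pp(yes_prices: Sequence[float]) -> float:
    """Edge in pp from buying one YES share of every outcome.

    Exactly one leg pays out 1.0, so the edge is 1.0 minus the basket cost.
    """
    return (1.0 - sum(yes_prices)) * 100.0


def basket_qualifies(yes_prices: Sequence[float],
                     arb_threshold: float = 0.90,
                     min_outcomes: int = 3) -> bool:
    """Apply the arb_threshold and min_outcomes filters from the table."""
    return len(yes_prices) >= min_outcomes and sum(yes_prices) <= arb_threshold


if __name__ == "__main__":
    prices = [0.30, 0.25, 0.20, 0.10]     # basket sum is about 0.85
    print(basket_qualifies(prices), round(basket_edge_pp(prices), 2))
```

The locked-in return only holds for equal share counts per leg; sizing each leg by a fixed dollar amount changes the payoff depending on which outcome resolves YES.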
Thesis: Liquid markets sometimes lag relevant news and do not reprice quickly enough.
Method: Filters liquid near-term markets, fetches recent news, uses Claude to judge whether the market appears stale. Dedupes by topic cluster.
| Parameter | Value |
|---|---|
| max_position_usdc | $10 |
| days_to_close | 3–45 days |
| price range | 0.10–0.85 |
| min_volume | $8,000 |
| max_trades_per_run | 2 |
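The deterministic pre-filter that runs before the news and Claude steps might look like the sketch below. The `Market` record and its field names are hypothetical, and treating every bound as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Market:
    """Hypothetical market record; field names are illustrative only."""
    question: str
    yes_price: float
    volume_usdc: float
    days_to_close: float


def is_stale_candidate(m: Market,
                       min_days: float = 3, max_days: float = 45,
                       min_price: float = 0.10, max_price: float = 0.85,
                       min_volume: float = 8_000) -> bool:
    """Keep liquid, near-term markets inside the configured price range."""
    return (min_days <= m.days_to_close <= max_days
            and min_price <= m.yes_price <= max_price
            and m.volume_usdc >= min_volume)


if __name__ == "__main__":
    m = Market("Example question?", yes_price=0.42,
               volume_usdc=25_000, days_to_close=12)
    print(is_stale_candidate(m))
```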
Thesis: Some market pairs violate basic logical consistency (prerequisite vs downstream, broader vs narrower).
Method: Heuristic pair discovery by keyword clustering, then a Claude pass to classify the relationship and identify the cheaper implication.
| Parameter | Value |
|---|---|
| max_position_usdc | $10 |
| min_ev_pp | 10 pp |
| relationship_gap_pp | 10 pp |
| days_to_close | ≤ 120 days |
| max_trades_per_run | 2 |
| hold window | 3–30 days |
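The consistency rule behind this strategy is that if a narrower event implies a broader one, the narrower market should not trade above the broader market. A minimal sketch of that check, assuming the Claude pass has already identified which market is the narrower one (function names are hypothetical):

```python
def implication_gap_pp(narrow_price: float, broad_price: float) -> float:
    """Gap in pp by which the narrower market exceeds the broader one.

    If narrow implies broad, P(narrow) <= P(broad), so a positive gap is a
    violation and the broader market is the cheaper implication.
    """
    return (narrow_price - broad_price) * 100.0


def is_violation(narrow_price: float, broad_price: float,
                 relationship_gap_pp: float = 10.0) -> bool:
    """Flag pairs whose violation meets the relationship_gap_pp threshold."""
    return implication_gap_pp(narrow_price, broad_price) >= relationship_gap_pp


if __name__ == "__main__":
    # Narrow market at 0.55 vs broad market at 0.40: roughly a 15 pp violation.
    print(is_violation(0.55, 0.40))
```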
Thesis: Gossip/tabloid coverage directionally corroborates celebrity event markets before the crowd reprices.
Method: Screener for celebrity event markets (pregnancy, romance, scandal). Fails closed unless tabloid coverage corroborates the market side.
Thesis: Binary Polymarket markets can offer holding-yield carry by buying a market-neutral full set (YES + NO) and collecting rewards.
Method: Scan binary markets with enough duration and liquidity, rank by carry yield, and only execute in the live environment.
mode="live" · live_ready=True · blocked from paper by environment policy
Thesis: Liquid leader / laggard divergences across obviously related markets create short-lived arbitrage windows.
Method: Finds markets correlated by keyword, compares prices of leader vs laggard, alerts when divergence exceeds threshold.
trading_enabled=False · promotable=True · live_ready=False
Thesis: Esport markets expiring within 48 hours with strong liquidity/price signals can be identified with deterministic filters.
Method: Screener using deterministic liquidity/price filters + subtype tagging. No LLM pass — pure heuristic.
trading_enabled=False · promotable=True · live_ready=False
Thesis: Markets sometimes stay open after real-world resolution, creating free EV.
Kill reason: -92.3% ROI on 12 closed trades. Conclusive failure — paused=True, trading_enabled=False, exposure cap set to 0.
Thesis: Markets at extreme prices (>93% or <7%) are systematically overconfident and can be faded.
Kill reason: 0% win rate, -100% ROI on 6 closed trades. Too blunt — no category filtering, no news validation, static fade amounts.
Thesis: Open-Meteo ensemble probabilities can beat Polymarket crowd pricing on daily temperature bucket markets.
Pause reason: 45% win rate, -19.5% ROI on 82 closed trades. Too many correlated bets per city/day, and the EV threshold was too low for noisy bucket outcomes.
Time windows drive operational scheduling — faster buckets run every cycle, slower ones skip midday churn.
| Label | Duration | Runner cadence | Current strategies |
|---|---|---|---|
| super_short | < 1 hour | Every cycle | esport48 |
| intraday | 1h – 24h | Every cycle | — |
| short | 1–7 days | Every cycle | stale_market, celebrity_tabloid |
| medium | 8–30 days | Can skip midday | ev_news, spread_arb, correlated_pairs, correlated_laggard |
| long | 31+ days | Once/day | — |
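The table above can be read as a simple bucketing function over time to close. The sketch below is illustrative: the function name is hypothetical, and because the table leaves the span between 7 and 8 days unassigned, treating each label's upper bound as inclusive is an assumption.

```python
# Runner cadence per label, copied from the table above.
CADENCE = {
    "super_short": "every cycle",
    "intraday": "every cycle",
    "short": "every cycle",
    "medium": "can skip midday",
    "long": "once/day",
}


def time_window_label(hours_to_close: float) -> str:
    """Map time to close (in hours) onto a time-window label."""
    if hours_to_close < 1:
        return "super_short"
    if hours_to_close <= 24:
        return "intraday"
    if hours_to_close <= 7 * 24:
        return "short"
    if hours_to_close <= 30 * 24:
        return "medium"
    return "long"


if __name__ == "__main__":
    label = time_window_label(10 * 24)    # a market closing in 10 days
    print(label, "->", CADENCE[label])
```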
Open exposure is capped by both strategy-level limits and time-window-level portfolio limits.
| Metric | Kill threshold | Keep threshold |
|---|---|---|
| Win rate | < 30% | > 50% |
| ROI | < -10% | > 0% |
| Min trades to evaluate | 5 closed trades minimum | |
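One way to apply these thresholds is sketched below. The table does not say how the two metrics combine, so this sketch assumes that breaching either kill threshold is enough to kill and that both keep thresholds must be met to keep. The function name and the `insufficient_data` and `watch` labels are hypothetical.

```python
def evaluate_strategy(win_rate: float, roi: float, closed_trades: int,
                      min_trades: int = 5) -> str:
    """Classify a strategy from its closed-trade statistics.

    `win_rate` and `roi` are fractions (0.45 means 45%, -0.10 means -10%).
    """
    if closed_trades < min_trades:
        return "insufficient_data"
    if win_rate < 0.30 or roi < -0.10:    # assumed: either breach kills
        return "kill"
    if win_rate > 0.50 and roi > 0.0:     # assumed: both needed to keep
        return "keep"
    return "watch"


if __name__ == "__main__":
    # 0% win rate and -100% ROI on 6 closed trades.
    print(evaluate_strategy(win_rate=0.0, roi=-1.0, closed_trades=6))
```

Under these assumptions the call in the usage example returns `kill`; the final decision on any real strategy still rests on the qualitative review recorded alongside its statistics.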
| Edge type | Description |
|---|---|
| information | Faster / better news processing than the crowd (ev_news, stale_market, celebrity_tabloid) |
| structural | Mathematical inconsistency baked into market structure (spread_arb) |
| resolution_lag | Markets staying open after real-world outcome (resolution_hunter — killed) |
| logical_inconsistency | Cross-market logical violations (correlated_pairs, correlated_laggard) |
| quantitative | External data model beats crowd calibration (weather_edge — paused) |
| mean_reversion | Extreme-price markets revert (fade_certainty — killed) |
The replay benchmark is a strategy-level score built from persisted signals, signal_execution_checks, and resolved-trade labels where available. It is intended as a keep/discard gate for alert-only and generated strategies, not as a replacement for realized P&L review.
Promote an alert-only strategy to paper trading only after all of the following are true:
- trading_enabled = False while paper-eval checklist is open.
- trading_enabled = True only after checklist complete.
- live_ready = False until a separate live-broker checklist exists.

| Strategy | Status | Promotable | Blocker |
|---|---|---|---|
| correlated_laggard | alert-only | Yes | Paper-eval checklist open — see EX-20260401-006 |
| esport48 | alert-only | Yes | Paper-eval checklist open — see EX-20260401-007 |
| celebrity_tabloid | paper trading | — | Feed coverage — top-100 Gamma rarely surfaces celebrity markets |
At the moment a signal fires, the runner records:
Run status:
| Value | Meaning |
|---|---|
| ok | Run completed successfully without fatal errors |
| warning | Run completed but warnings/errors exceeded threshold |
| error | Run failed or ended in a clearly broken state |
| unknown | Status cannot be determined from stored data |
Strategy status:
| Value | Meaning |
|---|---|
| active | Currently part of the active strategy stack |
| paused | Intentionally disabled but still in current-era reporting context |
| legacy | Historical strategy, no longer part of current active stack |
| unknown | Cannot classify with confidence |
Experiment status:
| Value | Meaning |
|---|---|
| active | Currently in progress |
| planned | Defined but not yet active |
| review_due | Has reached or passed a stated review point |
| completed | Reached a documented conclusion |
| archived | Retained for history, not current focus |
| Field | Definition |
|---|---|
| open_exposure_active | Total open exposure from active strategies (absolute, not signed) |
| open_exposure_legacy | Total open exposure from legacy/paused strategies |
| open_position_count_active | Count of open positions from active strategies |
| open_position_count_legacy | Count of open positions from legacy strategies |
| Field | Definition |
|---|---|
| realized_pnl_30d | Realized P&L from closed positions in the last 30 days |
| realized_pnl_all_time | Realized P&L across all available history |
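A query for these two fields might look like the sketch below. The schema is hypothetical: this document confirms only that trades has a mode column, so `status`, `closed_at` (ISO-8601 text), and `pnl_usdc` are assumed names, as is the choice to compute the metric per mode.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema; only the `mode` column is confirmed by this document.
DEMO_SCHEMA = """
CREATE TABLE trades (
    id INTEGER PRIMARY KEY,
    mode TEXT NOT NULL,
    status TEXT NOT NULL,
    closed_at TEXT,
    pnl_usdc REAL
)
"""


def realized_pnl(conn, now, days=None, mode="paper"):
    """Sum P&L over closed trades; `days=None` means all available history.

    Returns None when no closed trade matches, since SQL SUM over zero
    rows is NULL.
    """
    sql = "SELECT SUM(pnl_usdc) FROM trades WHERE status = 'closed' AND mode = ?"
    params = [mode]
    if days is not None:
        cutoff = (now - timedelta(days=days)).isoformat(timespec="seconds")
        sql += " AND closed_at >= ?"      # ISO-8601 strings sort chronologically
        params.append(cutoff)
    return conn.execute(sql, params).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(DEMO_SCHEMA)
    conn.executemany(
        "INSERT INTO trades (mode, status, closed_at, pnl_usdc) VALUES (?, ?, ?, ?)",
        [("paper", "closed", "2026-03-20T09:00:00", 4.5),
         ("paper", "closed", "2026-01-05T09:00:00", -2.0)],
    )
    now = datetime(2026, 4, 1, 12, 0, 0)
    print(realized_pnl(conn, now, days=30), realized_pnl(conn, now))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the 30-day window reproducible in tests.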
| Field | Definition |
|---|---|
| execution_checks_30d | Count of signal execution checks in the last 30 days |
| strategies_with_execution_checks_30d | Distinct strategies with at least 1 check in the last 30 days |
| avg_ev_after_slippage_50_pp_30d | Average EV in percentage points after slippage at a $50 order size, across checks (30d) |
| avg_max_size_positive_ev_30d | Average max +EV size (USD) across checks (30d) |
| benchmark_top_strategy_alert_only | Best current alert-only strategy by replay benchmark score |
| benchmark_top_score_alert_only | Replay benchmark score of the top alert-only strategy |
| benchmark_signal_count_alert_only | Total signals included in the current alert-only replay benchmark snapshot |
- null for absent scalar values
- unknown for enum-like status fields
- [] for genuinely empty collections
- Never 0 for unknown, or an empty string for unknown status

| Job | Schedule |
|---|---|
| com.polymarket.factory | Every 2 hours at :00 (paper environment) |
| com.polymarket.factory.live | 19:30 daily (live environment) |
| com.polymarket.factory.aggressive | 10:30 / 22:30 daily |
| com.polymarket.factory.backup | 03:45 daily |
Live at data/factory.sqlite3 (gitignored). Tables:
- runs: one row per runner execution
- signals: strategy signals generated each run
- decisions: open/close/skip decisions per signal
- signal_execution_checks: Phase A fill proxies at signal time
- run_logs: log entries per run
- trades includes a mode column so paper and live positions are tracked separately
- data/trades.csv is exported during migration period

- https://gamma-api.polymarket.com/markets
- https://clob.polymarket.com

| Term | Definition |
|---|---|
| p̂ | Estimated true probability for a market outcome, derived by a strategy's reasoning pass |
| EV | Expected value — the edge a trade offers relative to the current market price, in percentage points |
| EV pp | EV expressed in percentage points (e.g. 12 pp EV = 12% expected edge) |
| Phase A | Signal-time execution check — a fill-proxy snapshot of market microstructure when a signal fires |
| fill proxy | An estimate of what fill price would have been, based on CLOB bid/ask data. Not an actual fill. |
| basket EV | In spread_arb: the guaranteed return from buying all legs in a multi-outcome market when sum < 1.0 |
| alert-only | Strategy mode where signals are logged and reported but no positions are opened |
| paper trading | Strategy mode where positions are opened in the simulator but no real money is deployed |
| research | Runner environment that scans and logs signals only, with no position opening or resolution |
| live_ready | Strategy prerequisite for the live environment; not sufficient on its own without environment-policy approval |
| promotable | Flag indicating a strategy is a valid candidate for promotion once evidence is sufficient |
| source_confidence | Label indicating whether Phase A data came from a direct CLOB quote or a heuristic fallback |
| replay benchmark | Strategy-level composite score built from directional labels, execution realism, capacity, uniqueness, and coverage |
| Gamma API | Polymarket's market data API — provides the top-100 active markets feed used by all strategies |
| CLOB | Central Limit Order Book (Polymarket's order book, with orders matched off-chain and settled on-chain); used for execution checks |
| DDGS | DuckDuckGo Search — used by news-based strategies to fetch recent headlines |
| OpenClaw | Internal tool for WhatsApp messaging via the runner summary dispatch |
| launchd | macOS daemon scheduler — runs the paper, live, aggressive, and backup jobs on their configured calendars |
| Karpathy pattern | Auto-generating living documentation from DB data via LLM — used by update_wiki.py |