This release bundles substantial work since v0.2.3:
- Structured-output Research Manager, Trader, and Portfolio Manager
(canonical with_structured_output pattern, single LLM call per agent,
rendered markdown preserves the existing report shape).
- LangGraph checkpoint resume for crash recovery (--checkpoint flag).
- Persistent decision log replacing the per-agent BM25 memory, with
deferred reflection driven by yfinance returns + alpha vs SPY.
- DeepSeek, Qwen, GLM, and Azure OpenAI provider support; dynamic
OpenRouter model selection.
- Docker support; cache and logs moved to ~/.tradingagents/ to fix
Docker permission issues.
- Windows UTF-8 encoding fix applied at every file I/O call site.
- 5-tier rating consistency (Buy / Overweight / Hold / Underweight / Sell)
across Research Manager, Portfolio Manager, signal processor, and memory log.
Plus the small quality items in this commit:
1. Suppress noisy Pydantic serializer warnings from the OpenAI Responses-API
parse path by defaulting structured output to method="function_calling".
This is a root-cause fix, not a warnings filter: same typed result, no
warnings.
2. Ship scripts/smoke_structured_output.py so contributors can verify
their provider's structured-output path with one command.
3. Add opt-in memory_log_max_entries config. When set, the oldest resolved
memory log entries are pruned once the cap is exceeded; pending
(unresolved) entries are never pruned.
4. backend_url default changed from the OpenAI URL to None so the
per-provider client falls back to its native endpoint instead of
leaking OpenAI's URL into Gemini / other clients.
CHANGELOG.md added with the full v0.2.4 entry. 92 tests pass without API keys.
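
The pruning rule in item 3 can be sketched as follows. This is a minimal
illustration, not the shipped code: LogEntry and prune_entries are
hypothetical names, and it assumes the cap counts resolved entries only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    """Hypothetical in-memory view of one decision-log entry."""
    date: str       # ISO date, so string order matches chronological order
    ticker: str
    pending: bool   # True until the outcome has been resolved


def prune_entries(entries: List[LogEntry], max_entries: Optional[int]) -> List[LogEntry]:
    """Keep every pending entry and at most `max_entries` newest resolved ones."""
    if not max_entries:
        # None (assumed to be the default) disables pruning entirely.
        return list(entries)
    resolved = sorted((e for e in entries if not e.pending), key=lambda e: e.date)
    keep = {id(e) for e in resolved[-max_entries:]}
    # Preserve the original order of whatever survives.
    return [e for e in entries if e.pending or id(e) in keep]
```

With a cap of 2, three resolved entries and one pending entry reduce to the
two newest resolved entries plus the pending one.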
Three related changes take the rating pipeline from heuristic-only to
type-safe at the source.
1) Research Manager prompt now uses the same 5-tier scale (Buy /
Overweight / Hold / Underweight / Sell) as the Portfolio Manager,
signal_processing, and the memory log. The prior 3-tier wording
(Buy / Sell / Hold) was the only remaining inconsistency in the
pipeline.
2) Centralise the 5-tier vocabulary and the heuristic prose-rating
parser into tradingagents/agents/utils/rating.py. Both the memory
log and the signal processor now share the same parser instead of
duplicating regex and word-walker logic.
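
A rough sketch of what a shared heuristic parser for the 5-tier vocabulary
might look like. The real logic lives in tradingagents/agents/utils/rating.py
and may differ; the regexes, the "Hold" default, and the signature below are
assumptions made for illustration.

```python
import re
from typing import Optional

# The 5-tier vocabulary named in the commit message.
RATINGS = ("Buy", "Overweight", "Hold", "Underweight", "Sell")

_WORD = "|".join(RATINGS)
# Prefer an explicit "Rating: X" label, tolerating markdown bold around either part.
_LABELLED = re.compile(rf"\brating\W{{0,6}}({_WORD})\b", re.IGNORECASE)
# Otherwise fall back to the first rating word anywhere in the prose.
_ANY = re.compile(rf"\b({_WORD})\b", re.IGNORECASE)


def parse_rating(text: str, default: Optional[str] = "Hold") -> Optional[str]:
    """Return the canonical rating found in free-form prose, else `default`."""
    match = _LABELLED.search(text) or _ANY.search(text)
    if match is None:
        return default
    # Normalise case so "HOLD" and "hold" both map to "Hold".
    return match.group(1).capitalize()
```

Trying the labelled pattern first means a passing mention of "sell" earlier
in the prose does not override an explicit "Final rating: Hold".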
3) Make structured output a first-class part of the Portfolio Manager's
primary call. The PM uses llm.with_structured_output(PortfolioDecision)
so each provider's native structured-output mode (json_schema for
OpenAI/xAI, response_schema for Gemini, tool-use for Anthropic,
function_calling for OpenAI-compatible providers) yields a typed
Pydantic instance directly. A render helper turns that instance back
into the same markdown shape downstream consumers (memory log, CLI
display, saved reports) already expect, so no other code has to know
the PM now produces structured output. Providers without structured
support fall back gracefully to free-text + the deterministic
heuristic.
The previous SignalProcessor had been making a second LLM call to
re-extract the rating from the PM's prose; that round-trip is now
eliminated. SignalProcessor is a thin adapter over parse_rating(),
makes zero LLM calls, and stays for backwards compatibility with
process_signal() callers.
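
The adapter shape described above could look roughly like this. The
constructor signature, the stand-in parser, and the ExplodingLLM test double
are all hypothetical; only the process_signal() name comes from the commit.

```python
from typing import Any, Callable


def _stand_in_parser(text: str) -> str:
    """Placeholder for the shared parse_rating(); only recognises two tiers."""
    return "Buy" if "buy" in text.lower() else "Hold"


class SignalProcessor:
    """Backwards-compatible wrapper: same entry point, no model round-trip."""

    def __init__(self, llm: Any = None, parser: Callable[[str], str] = _stand_in_parser):
        # `llm` is accepted only so existing constructor calls keep working.
        self._llm = llm
        self._parser = parser

    def process_signal(self, full_signal: str) -> str:
        # Deterministic extraction; self._llm is intentionally never used.
        return self._parser(full_signal)


class ExplodingLLM:
    """Test double that fails loudly if anything tries to call the model."""

    def invoke(self, *args: Any, **kwargs: Any) -> None:
        raise AssertionError("SignalProcessor must not call the LLM")
```

Passing ExplodingLLM() into the constructor is one way a test can assert
that no LLM call happens on the process_signal() path.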
Schema (PortfolioDecision) captures rating + executive_summary +
investment_thesis + optional price_target + time_horizon, with field
descriptions doubling as output instructions. Agent prose remains the
primary artifact; structured output is layered onto the PM only because
it is the one agent whose output has machine-readable downstream
consumers.
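
To make the schema and render helper concrete, here is a dependency-free
sketch that uses a stdlib dataclass in place of the Pydantic model. Field
names come from the description above; render_decision, the markdown labels,
and treating time_horizon as optional are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortfolioDecision:
    """Stdlib stand-in for the Pydantic schema; field names come from the commit."""
    rating: str
    executive_summary: str
    investment_thesis: str
    price_target: Optional[float] = None   # optional per the commit message
    time_horizon: Optional[str] = None     # assumed optional as well


def render_decision(decision: PortfolioDecision) -> str:
    """Render the typed decision as markdown (section labels are hypothetical)."""
    lines = [
        f"**Rating**: {decision.rating}",
        "",
        f"**Executive Summary**: {decision.executive_summary}",
        "",
        f"**Investment Thesis**: {decision.investment_thesis}",
    ]
    # Optional fields are emitted only when the model supplied them.
    if decision.price_target is not None:
        lines += ["", f"**Price Target**: {decision.price_target}"]
    if decision.time_horizon:
        lines += ["", f"**Time Horizon**: {decision.time_horizon}"]
    return "\n".join(lines)
```

Because the rendered text keeps a "Rating" line, a prose-level parser can
still read the result, which is how downstream consumers stay unaware of the
structured path.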
15 new tests cover the heuristic parser (markdown-bold edge cases that
had no coverage before), the structured PM happy path, the free-text
fallback path, and that SignalProcessor never invokes the LLM. Full
suite: 77 tests pass in ~2s without API keys.
Long analyses can take many minutes. Previously, a crash or interruption
forced users to re-run from scratch and re-pay for every LLM call. This adds
an opt-in checkpoint layer backed by per-ticker SQLite databases so the
graph resumes from the last successful node.
How to use:
- CLI: tradingagents analyze --checkpoint
- CLI: tradingagents analyze --clear-checkpoints
- Python: config["checkpoint_enabled"] = True
Lifecycle:
- propagate() recompiles the graph with a SqliteSaver when enabled and
injects a deterministic thread_id derived from ticker+date so the
same ticker+date resumes while a different date starts fresh.
- On successful completion the per-thread checkpoint rows are cleared.
- The context manager is closed in a try/finally so a crash never
leaks the SQLite connection or leaves the graph in checkpoint mode.
Storage: ~/.tradingagents/cache/checkpoints/<TICKER>.db
(override via TRADINGAGENTS_CACHE_DIR).
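
A sketch of the two deterministic pieces above: the thread_id and the
per-ticker database path. Function names and the hashing scheme are
hypothetical, and it assumes TRADINGAGENTS_CACHE_DIR replaces the
~/.tradingagents/cache directory as a whole.

```python
import hashlib
import os
from pathlib import Path


def thread_id_for(ticker: str, trade_date: str) -> str:
    """Derive a stable id: same ticker+date resumes, any other pair starts fresh."""
    key = f"{ticker.upper()}:{trade_date}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def checkpoint_db_path(ticker: str) -> Path:
    """Locate the per-ticker SQLite file, honouring TRADINGAGENTS_CACHE_DIR."""
    default_cache = Path.home() / ".tradingagents" / "cache"
    cache_dir = Path(os.environ.get("TRADINGAGENTS_CACHE_DIR", default_cache))
    return cache_dir / "checkpoints" / f"{ticker.upper()}.db"
```

Deriving the id from the inputs rather than generating a random one is what
lets a restarted process find the rows its crashed predecessor wrote.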
The checkpointer module is new (tradingagents/graph/checkpointer.py),
and GraphSetup now returns the uncompiled workflow so it can be
recompiled with a saver when needed.
Adds langgraph-checkpoint-sqlite>=2.0.0 dependency. 3 new tests verify
the crash/resume cycle and that a different date starts fresh.
The previous per-agent BM25 memory was effectively dead code: its only
caller was a commented-out line in main.py. Replace it with a single
append-only markdown decision log driven by the propagate() lifecycle.
Lifecycle:
- store_decision() appends a pending entry at the end of every run
- _resolve_pending_entries() runs at the start of the next same-ticker
run, fetches yfinance returns + alpha vs SPY, and writes one LLM
reflection per resolved entry through an atomic temp-file rename
- Portfolio Manager consumes state["past_context"] (5 most recent
same-ticker entries plus 3 cross-ticker reflection-only excerpts)
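
The atomic temp-file rename mentioned in the lifecycle can be sketched with
the standard library alone. atomic_write_text is a hypothetical helper name;
the actual log writer may be organised differently.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Replace `path` with `text` so readers never observe a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the new file into place, overwriting the old one.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies mid-write, the existing log is untouched and only a
stray .tmp file is left behind.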
Storage at ~/.tradingagents/memory/trading_memory.md
(override: TRADINGAGENTS_MEMORY_LOG_PATH).
Tag schema:
- Pending: [YYYY-MM-DD | TICKER | Rating | pending]
- Resolved: [YYYY-MM-DD | TICKER | Rating | +X.X% | +Y.Y% | Nd]
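
For illustration, both tag forms can be parsed with one regular expression.
parse_tag and the output keys are hypothetical, and reading the resolved
fields as raw return, alpha vs SPY, and holding days is an assumption based
on the lifecycle description.

```python
import re
from typing import Dict, Optional

_TAG = re.compile(
    r"^\[(?P<date>\d{4}-\d{2}-\d{2}) \| (?P<ticker>[^|\]]+?) \| (?P<rating>[^|\]]+?) \| "
    r"(?:(?P<pending>pending)"
    r"|(?P<ret>[+-]\d+\.\d+)% \| (?P<alpha>[+-]\d+\.\d+)% \| (?P<days>\d+)d)\]$"
)


def parse_tag(line: str) -> Optional[Dict[str, object]]:
    """Parse a pending or resolved tag line; return None if it is not a tag."""
    m = _TAG.match(line.strip())
    if m is None:
        return None
    entry: Dict[str, object] = {
        "date": m["date"],
        "ticker": m["ticker"],
        "rating": m["rating"],
        "pending": m["pending"] is not None,
    }
    if not entry["pending"]:
        # Resolved tags carry three extra numeric fields.
        entry.update(
            raw_return=float(m["ret"]),
            alpha=float(m["alpha"]),
            holding_days=int(m["days"]),
        )
    return entry
```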
Removes rank-bm25 dependency and the legacy reflect_and_remember()
plumbing across reflection.py, trading_graph.py, and the agent factories.
49 new tests in tests/test_memory_log.py cover the storage, deferred
reflection, prompt injection, and legacy-removal paths. Full suite
(58 tests) passes in under 2 seconds without API keys.