- CI gate, unified verified data-access contract, provider and data-vendor registry
- env-over-CLI config precedence, current-generation model catalog
- programmatic report output, plus sweep fixes for data and structured output
The README reproducibility example named gpt-4.1 and the structured-output smoke
script listed gemini-2.5-flash / deepseek-chat / qwen-plus / grok-4 — all retired
from the catalog. Generalize the note and refresh the smoke defaults.
The date hint sat at the end of each analyst's system prompt, after a long
indicator block, so weaker models anchored to their training cutoff when
generating tool-call date ranges. Lead each prompt with it instead.
The per-section markdown report tree was written only by the CLI, so programmatic
(TradingAgentsGraph) runs produced no saved reports.
- Extract the writer into tradingagents/reporting.write_report_tree.
- The CLI's save_report_to_disk delegates to it (no behavior change).
- Add TradingAgentsGraph.save_reports(final_state, ticker) so headless/API callers
get the same report tree, defaulting under results_dir.
The committed lockfile is not consumed by the pip-based install or CI; it only
drifts. A deliberate dependency upgrade, if wanted, is its own scoped PR.
The knob was accepted but inert — analysts run strictly sequentially and the
value was never used. Remove it rather than ship a misleading config key.
Parallel analyst execution is tracked for v0.3 (#634/#671/#487).
A weak model can write a placeholder ('None', 'N/A') into an optional price
field, tripping schema validation. Coerce null-ish strings to None on the
trader/PM float fields; real numeric strings still parse.
Nodes after the trader do not append to messages, so the debug stream reprinted
the same trailing message once per node. Print it only when it changes; the
returned state is unchanged.
- Local servers (LM Studio, vLLM) reject the object-form tool_choice langchain
sends for function calling. The generic openai_compatible provider now binds
the schema as a tool without forcing tool_choice.
- A structured call can return no parsed result (a thinking model answering in
plain text); fall back to free text with a clear reason instead of an opaque
render error.
The yfinance news fetch queried the raw ticker while every other path uses the
canonical symbol, so broker/forex/crypto aliases silently returned no news.
Normalize it (XAUUSD -> GC=F) and keep the user's ticker in the report header.
Optional enrichment vendors (FRED macro, Polymarket events) raised on a bad LLM
indicator, a missing key, or a network blip, which aborted the whole run.
- Router: mark macro_data and prediction_markets optional; a sole-vendor failure
returns a sentinel instead of re-raising. Core categories still raise.
- FRED: reject a descriptive phrase up front and return guidance instead of
400ing the API; an unknown series returns a not-found message, not a crash.
Trim each provider to current-generation models and drop the special-casing
they required:
- OpenAI: remove gpt-4.1 (deprecated; the only non-reasoning model).
- Anthropic: remove Claude Sonnet 4.5 (legacy; the only Sonnet that 400s on effort).
- Google: remove the Gemini 2.5 line (superseded by 3.x).
- Gemini client: drop the integer thinking_budget mapping; 3.x takes the string
thinking_level directly.
Effort/reasoning gates stay as defense in depth for custom model IDs. All kept
IDs verified against live APIs.
Interactive selections and flag defaults overrode TRADINGAGENTS_* env vars.
Rule: an explicit env value or CLI flag wins; otherwise the env-applied
default is kept.
- Research depth: skip the prompt when both round-count env vars are set, and
stop overwriting them (#977).
- Checkpoint: --checkpoint/--no-checkpoint is tri-state; omitting it keeps
TRADINGAGENTS_CHECKPOINT_ENABLED (#976).
- Docker ollama: use TRADINGAGENTS_LLM_PROVIDER + OLLAMA_BASE_URL, not a bare
LLM_PROVIDER the overlay never reads (#975).
- Reasoning/thinking knobs: settable via env; the prompt is skipped when set.
- Effort gating: forward effort only to models that accept it (Anthropic
Opus 4.5+/Sonnet 4.6+, OpenAI reasoning models); drop it elsewhere.
- Boolean env values: raise a named error on invalid input instead of
silently becoming False.
Label each OpenRouter model prompt by mode (quick/deep) like the other
providers, so the two consecutive selections are distinguishable. Populate the
dropdown with the newest models from mainstream chat providers rather than the
universal-newest (which surfaced niche/experimental releases); Custom ID still
reaches anything. Cancelled required prompts now exit cleanly instead of
crashing, and the output-language prompt falls back to English.
Add quality guidance to the narrative field so the sentiment report stays
informative and substantive, with each point adding new signal for the trader.
Verified each provider's hard-coded list against current official docs:
- MiniMax: add MiniMax-M3 (1M ctx, multimodal) as the default; keep M2.7 line.
- Qwen: use the live qwen{3.7,3.6}-{plus,max} IDs.
- GLM: add glm-5.2 as the latest flagship.
- xAI: drop deprecated grok-4-fast-* / grok-4-0709 builds.
- DeepSeek: migrate to deepseek-v4-pro / deepseek-v4-flash (the chat/reasoner
aliases are deprecated 2026-07-24 and now map to V4 Flash).
OpenAI, Anthropic, and Gemini were already current and are unchanged.
With the tree clean, the lint job runs ruff check . on every push and PR rather
than only the files a PR changes, so a lint regression is caught anywhere.
Clear the deferred full-repo lint backlog so the whole tree passes the strict
ruff select (E,W,F,I,B,UP,C4,SIM). Mechanical fixes dominate: import sorting,
pep585/604 annotations, dropped dead imports, and whitespace. The few semantic
changes are behavior-preserving: declare __all__ on the agent_utils and
alpha_vantage re-export hubs; expand 'from x import *' to explicit names; use
immutable tuple defaults instead of mutable list defaults; contextlib.suppress
for try/except/pass; and narrow an over-broad assertRaises.
The output-language instruction is applied across all report-producing agents
(analysts, researchers, risk debators, research manager, trader, portfolio
manager), but nothing enforced it, so agents had silently dropped it before. Add
a parametrized guard asserting each report agent calls get_language_instruction()
so a non-English run stays fully localized and the regression can't recur.
The Responses API exists only on native OpenAI. When the openai provider is
pointed at a custom base_url (a proxy, gateway, or local server that speaks only
Chat Completions), keep the Responses API off so the call does not fail.
A truncated/incomplete chunked response raises http.client exceptions
(IncompleteRead/BadStatusLine) that are not OSErrors, so they bypassed the
existing handler and crashed the analysis. Broaden the catch so the fetch
degrades to its placeholder string like every other transport failure.
The JSON search endpoint is reliably WAF-blocked (403) for public clients, so
probing it on every call doubled request volume against Reddit's per-IP rate
limit and tripped 429 on the RSS fallback, blanking the sentiment feed. Fetch
the Atom/RSS feed directly (JSON kept as an opt-in path that still degrades to
RSS on 403), back off once on a 429 honouring Retry-After, and pace requests a
little wider. Also broaden the error handling to catch http.client chunked
transfer errors (IncompleteRead/BadStatusLine) alongside OSError, which on their
own slipped through and crashed the pipeline.
yfinance intermittently returns a year-old partial frame (e.g. June 2025 rows
for a June 2026 request) that still has rows and a Close, so it passed the
empty-check and silently fed a wrong close price and indicators into the report
(#1021). Add a freshness guard that rejects a frame whose latest row is far
older than the requested date, on both the raw OHLCV path and the indicator
path. It raises the existing NoMarketDataError with a stale-specific detail, so
the vendor router's try-next-vendor and single unavailable-signal handling apply
unchanged; the sentinel now surfaces that detail so the agent reports the
specific reason rather than fabricating a value.
Every condition where a vendor cannot return usable data now derives from a
single VendorError base (errors.py): NoMarketDataError, VendorRateLimitError,
and VendorNotConfiguredError (still a ValueError for back-compat). Vendor-named
errors subclass the generic bases, and the router catches the base types, so a
new vendor needs no new except clause. Not-configured now has explicit
try-next-vendor handling instead of falling through the generic catch-all. The
number of error types tracks the number of distinct router reactions, not the
number of causes.
Surface live, market-implied probabilities for forward-looking events (Fed
decisions, recession, elections, geopolitics, crypto) to the news analyst via a
new get_prediction_markets tool and a prediction_markets vendor category. Backed
by Polymarket's public Gamma API (no key). Results are filtered to open,
forward-looking markets (closed and past-dated events excluded), ranked by
traded volume, and rendered with implied probability, volume, resolution date,
and the recent move. External errors degrade to a clear unavailable message
rather than interrupting the analyst.
Surface Federal Reserve Economic Data (rates, inflation, labor, growth) to the
news analyst via a new get_macro_indicators tool and a macro_data vendor
category. Friendly aliases (cpi, unemployment, fed_funds_rate, 10y_treasury,
yield_curve, ...) map to FRED series IDs; raw series IDs are accepted too. The
report gives the latest value, change over the window, and a recent observation
table. Windowing is lookahead-safe (observation_end = curr_date), missing values
are skipped, and a missing FRED_API_KEY surfaces as a clear not-configured
condition through the vendor router rather than a crash.
Bedrock uses the Converse API (langchain-aws) and the AWS credential chain, so
it has its own client like Anthropic/Google rather than the OpenAI-compatible
registry. langchain-aws is an optional dependency (pip install ".[bedrock]"),
lazy-imported with a clear install hint; importing the package never requires
it. The model name is a Bedrock model ID / inference profile ID.
Each is a one-row entry in the OpenAI-compatible provider registry (base_url,
key env, CLI option); the model is user-specified since they serve many models.
The OpenAI-compatible family (openai, xAI, DeepSeek, Qwen, GLM, MiniMax,
OpenRouter, Ollama) all speak the same Chat Completions API and differ only by
base_url, key, and two narrow wire-format quirks already isolated in subclasses.
Replace the scattered base-URL dict, key handling, and client-class branches with
one ProviderSpec registry that get_llm and the factory drive off; provider quirks
stay in their subclasses. Add a generic "openai_compatible" provider for any
OpenAI-compatible server (vLLM, LM Studio, llama.cpp, relays) via backend_url +
optional key — adding a provider is now one registry row. Native Anthropic/Google
keep their own clients (genuinely different APIs). Also fixes the env backend URL
being ignored when the provider was chosen interactively (#978).
The market analyst is bound to call get_verified_market_snapshot and its prompt
requires it as the source of truth, but the tool was missing from the market
ToolNode executor — so the call failed and the model reported it "unavailable"
and skipped verification. Register it (with a regression guard) so the snapshot
actually runs and grounds the report.
The yfinance news date filter only ran when an article had a parsed date, so
flat-format and undated articles bypassed it and leaked future news into
historical/backtest runs. Parse the flat providerPublishTime, apply one
look-ahead-safe window rule across ticker and global news (undated kept only
when the window reaches the present), and return an informative message when
everything is filtered out.
Alpha Vantage requests had no timeout (a stall could hang the run) and any
notice mentioning "API key" was raised as a rate limit — so an invalid/missing
key was mislabeled and silently treated as transient. Add a 30s request timeout
and classify rate-limit phrasing before key errors (rate-limit notices also
mention "API key"), surfacing a bad key as a real configuration error.
yfinance treats end as exclusive, so get_YFin_data_online dropped the requested
end_date row and load_ohlcv dropped the current day. Request one day past the
end so the range is inclusive (look-ahead is still prevented by the curr_date
filter; the header still shows the requested range). Also correct the load_ohlcv
docstring to the 5-year window it actually downloads.
The router silently extended every request to all available vendors regardless
of config, so an explicit single-vendor choice still fell back to others and
returned data from an unexpected source (#988, #289), and serious primary-vendor
errors were swallowed without a trace (#989). The configured vendor list is now
the exact chain (list several for ordered fallback; "default" uses all), unknown
vendors raise, and swallowed vendor errors are logged. Adds an autouse config
isolation fixture so vendor config can't leak between tests.
The CLI validated, normalized, and classified tickers with its own logic that
diverged from the data layer: it rejected '=' symbols like GC=F (#980),
classified BTCUSD as a stock (#981), and accepted unpriceable BTC-USDT (#982).
Route the CLI through normalize_symbol (now mapping USDT/USDC crypto quotes to
Yahoo's -USD pair), so validation, classification, and pricing agree.
resolve_instrument_identity and the reflection return lookup queried Yahoo with
the raw ticker, so broker/forex/commodity symbols (XAUUSD, BTCUSD, EURUSD)
failed identity or could mismatch the priced instrument even though the price
path already normalized them. Route both through normalize_symbol (#983, #984).
GitHub Actions: pytest across Python 3.10-3.13, a clean-install import smoke
that catches undeclared runtime deps, and a strict Ruff gate (standard rule set)
scoped to the files each PR changes. Declares python-dotenv (imported by the CLI
but previously undeclared) and adds a [dev] extra. Recommends Python 3.12 for
setup, verified from a clean isolated install.
Setting the LLM env vars now skips the matching CLI selection step and uses
the value, so OpenAI-compatible endpoints (opencode, LM Studio, etc.) and
unattended runs work without prompting. Unset vars are chosen interactively
as before.
TRADINGAGENTS_LLM_PROVIDER -> skips provider step (still verifies API key)
TRADINGAGENTS_LLM_BACKEND_URL -> custom endpoint (else provider default)
TRADINGAGENTS_DEEP_THINK_LLM / _QUICK_THINK_LLM -> skips model step
TRADINGAGENTS_OUTPUT_LANGUAGE -> skips language step
Builds on the existing TRADINGAGENTS_* config overrides (which already feed
DEFAULT_CONFIG); this wires the CLI to honor them instead of re-prompting.
Analyzing a symbol Yahoo Finance does not recognize (e.g. XAUUSD+) could
produce an invented price instead of an error. The agent now either prices
the correct instrument or clearly reports that data is unavailable.
Ticker support:
- Commodities/forex/crypto resolve to the symbol Yahoo actually serves, so
you can enter the common form and it just works:
XAUUSD / XAUUSD+ / GOLD -> GC=F (gold)
USOIL -> CL=F (WTI crude)
EURUSD -> EURUSD=X
BTCUSD -> BTC-USD
SPX500 / NAS100 -> ^GSPC / ^NDX
Native Yahoo symbols (AAPL, GC=F, ^GSPC) keep working unchanged. New
instruments are added by extending the alias table.
Reliability:
- Unknown or delisted symbols now return a clear "data unavailable" result
the agent reports verbatim, instead of a value the model fills in.
- A failed fetch no longer leaves a broken symbol cached until the cache is
cleared by hand.
A-shares already resolve through the Yahoo Finance vendor (Shanghai .SS,
Shenzhen .SZ) with correct identity and indicators; add the SSE/SZSE
composite benchmarks so their alpha isn't measured against SPY, and
document the exchange-suffix tickers we support (incl. A-shares, crypto).
Adds a cross-provider temperature config (and TRADINGAGENTS_TEMPERATURE),
forwarded to every LLM client when set, so runs can be made less variable
on models that honor it. Adds a README "Reproducibility" section that
separates the sources of run-to-run variation, what users can control
(temperature, non-reasoning model, pinned date), and what is inherent to
LLM-driven analysis, and notes that the identity and verified-data fixes
already removed the "different companies / fabricated prices" variance.
#178#168
The market analyst could confabulate exact figures — citing a Bollinger
band or a "historically validated bounce" the data doesn't support (#830).
Add a deterministic get_verified_market_snapshot tool (latest OHLCV row,
common indicators, recent closes) the analyst must consult and treat as
the source of truth for any exact price/indicator claim, and instruct it
not to assert historical validation or support bounces without tool-backed
dates and prices.
#830
The analyst emitted free-form prose, so its sentiment header varied by
provider and run and downstream consumers needed drifting regex. Extend
the structured-output pattern the trio already uses: a SentimentReport
schema (band + 0-10 score + confidence + narrative) rendered to a
deterministic header, with a free-text fallback for providers that lack
native structured output.
#796