chore: release v0.2.4 — structured agents, checkpoint, memory log, providers

This release bundles substantial work since v0.2.3:

- Structured-output Research Manager, Trader, and Portfolio Manager
  (canonical with_structured_output pattern, single LLM call per agent,
  rendered markdown preserves the existing report shape).
- LangGraph checkpoint resume for crash recovery (--checkpoint flag).
- Persistent decision log replacing the per-agent BM25 memory, with
  deferred reflection driven by yfinance returns + alpha vs SPY.
- DeepSeek, Qwen, GLM, and Azure OpenAI provider support; dynamic
  OpenRouter model selection.
- Docker support; cache and logs moved to ~/.tradingagents/ to fix
  Docker permission issues.
- Windows UTF-8 encoding fix on every file I/O site.
- 5-tier rating consistency (Buy / Overweight / Hold / Underweight / Sell)
  across Research Manager, Portfolio Manager, signal processor, memory log.

Plus the small quality items in this commit:

1. Suppress noisy Pydantic serializer warnings from OpenAI Responses-API
   parse path by defaulting structured-output to method="function_calling"
   (root-cause fix, not a warnings filter — same typed result, no warnings).
2. Ship scripts/smoke_structured_output.py so contributors can verify
   their provider's structured-output path with one command.
3. Add opt-in memory_log_max_entries config — when set, oldest resolved
   memory log entries are pruned once the cap is exceeded; pending
   entries (unresolved) are never pruned.
4. backend_url default changed from the OpenAI URL to None so the
   per-provider client falls back to its native endpoint instead of
   leaking OpenAI's URL into Gemini / other clients.
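The method-defaulting fix in item 1 can be sketched as a plain override pattern. This is a minimal stand-alone illustration, not the shipped code: a stub base class stands in for langchain-openai's `ChatOpenAI`, and the tuple return value exists only so the behaviour is observable.

```python
class Base:
    """Stub standing in for ChatOpenAI; real with_structured_output returns a Runnable."""
    def with_structured_output(self, schema, *, method="json_schema", **kwargs):
        return (schema, method)

class Normalized(Base):
    def with_structured_output(self, schema, *, method=None, **kwargs):
        # Default to function_calling to avoid the Responses-API parse
        # warnings; a caller's explicit method choice still wins.
        if method is None:
            method = "function_calling"
        return super().with_structured_output(schema, method=method, **kwargs)

_, method = Normalized().with_structured_output(dict)
print(method)  # function_calling
```

An explicit `method="json_schema"` passed by a caller is forwarded untouched, which is why this is a default rather than a hard override.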

CHANGELOG.md added with the full v0.2.4 entry. 92 tests pass without API keys.
This commit is contained in:
Yijia-Xiao
2026-04-25 21:54:30 +00:00
parent 4016fd4efa
commit 7c37249f80
8 changed files with 562 additions and 2 deletions

CHANGELOG.md Normal file

@@ -0,0 +1,266 @@
# Changelog
All notable changes to TradingAgents are documented here.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
Breaking changes within the 0.x line are called out explicitly.
## [0.2.4] — 2026-04-25
### Added
- **Structured-output decision agents.** Research Manager, Trader, and Portfolio
Manager now use `llm.with_structured_output(Schema)` on their primary call
and return typed Pydantic instances. Each provider's native structured-output
mode is used (`json_schema` for OpenAI / xAI, `response_schema` for Gemini,
tool-use for Anthropic, function-calling for OpenAI-compatible providers).
Render helpers preserve the existing markdown shape so memory log, CLI
display, and saved reports keep working unchanged. (#434)
- **LangGraph checkpoint resume** — opt-in via `--checkpoint`. State is saved
after each node so crashed or interrupted runs resume from the last
successful step. Per-ticker SQLite databases under
`~/.tradingagents/cache/checkpoints/`. `--clear-checkpoints` resets them. (#594)
- **Persistent decision log** replacing the per-agent BM25 memory. Decisions
are stored automatically at the end of `propagate()`; the next same-ticker
run resolves prior pending entries with realised return, alpha vs SPY, and
a one-paragraph reflection. Override path with `TRADINGAGENTS_MEMORY_LOG_PATH`.
Optional `memory_log_max_entries` config caps resolved entries; pending
entries are never pruned. (#578, #563, #564, #579)
- **DeepSeek, Qwen (Alibaba DashScope), GLM (Zhipu), and Azure OpenAI**
providers, plus dynamic OpenRouter model selection.
- **Docker support** — multi-stage build with separate dev and runtime images.
- **`scripts/smoke_structured_output.py`** — diagnostic that exercises the
three structured-output agents against any provider so contributors can
verify their setup with one command.
- **5-tier rating scale** (Buy / Overweight / Hold / Underweight / Sell) used
consistently by Research Manager, Portfolio Manager, signal processor, and
the memory log; Trader keeps 3-tier (Buy / Hold / Sell) since transaction
direction is naturally ternary.
- **Pytest fixtures** — lazy LLM client imports plus placeholder API keys so
the test suite runs cleanly without credentials. (#588)
### Changed
- **`backend_url` default is now `None`** rather than the OpenAI URL. Each
provider client falls back to its native default. The previous default
leaked the OpenAI URL into non-OpenAI clients (e.g. Gemini), producing
malformed request URLs for Python users who switched providers without
overriding `backend_url`. The CLI flow is unaffected.
- All file I/O passes explicit `encoding="utf-8"` so Windows users no longer
hit `UnicodeEncodeError` with the cp1252 default. (#543, #550, #576)
- Cache and log directories moved to `~/.tradingagents/` to resolve Docker
permission issues. (#519)
- `SignalProcessor` reads the rating from the Portfolio Manager's rendered
markdown via a deterministic heuristic — no extra LLM call.
- OpenAI structured-output calls default to `method="function_calling"` to
avoid noisy `PydanticSerializationUnexpectedValue` warnings emitted by
langchain-openai's Responses-API parse path. Same typed result, no warnings.
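The `backend_url` change in this section amounts to "forward a base URL only when the user set one". A hedged sketch of that rule follows; this helper is illustrative and not the project's actual client factory:

```python
def client_kwargs(backend_url=None):
    """Build provider-client kwargs. With the new default of None, no
    base_url is forwarded and each provider SDK falls back to its own
    native endpoint; only an explicit override is passed through."""
    kwargs = {}
    if backend_url is not None:
        kwargs["base_url"] = backend_url
    return kwargs

print(client_kwargs())                         # {}
print(client_kwargs("http://proxy.local/v1"))  # {'base_url': 'http://proxy.local/v1'}
```

Under the old default, the first call would have returned the OpenAI URL even when building a Gemini client, which is exactly the leak described above.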
### Fixed
- Empty memory no longer triggers fabricated past-lessons in agent prompts;
the memory-log redesign makes this structurally impossible since only the
Portfolio Manager consults memory and only when entries exist. (#572)
- Tool-call logging processes every chunk message, not just the last one, and
memory score normalization handles empty score arrays. (#534, #531)
### Removed
- `FinancialSituationMemory` (the per-agent BM25 system) and the dead
`reflect_and_remember()` plumbing; subsumed by the persistent decision log.
- Hardcoded Google endpoint that caused 404 when `langchain-google-genai`
changed its API path. (#493, #496)
### Contributors
Thanks to everyone who shaped this release through code, design, and reports:
- [@claytonbrown](https://github.com/claytonbrown) — checkpoint resume (#594), test fixtures (#588), design feedback on cost tracking (#582) and structured validation (#583)
- [@Bcardo](https://github.com/Bcardo) — memory-log redesign (#579), empty-memory hallucination report (#572), encoding fix proposal (#570)
- [@voidborne-d](https://github.com/voidborne-d) — memory persistence design (#564), portfolio manager state fix (#503)
- [@mannubaveja007](https://github.com/mannubaveja007) — structured-output feature request (#434)
- [@kelder66](https://github.com/kelder66) — RAM-only memory issue (#563)
- [@Gujiassh](https://github.com/Gujiassh) — tool-call logging fix (#534), test stub PR (#533)
- [@iuyup](https://github.com/iuyup) — memory score normalization fix (#531)
- [@kaihg](https://github.com/kaihg) — Google base_url fix (#496)
- [@32ryh98yfe](https://github.com/32ryh98yfe) — Gemini 404 report (#493)
- [@uppb](https://github.com/uppb) — OpenRouter dynamic model selection (#482)
- [@guoz14](https://github.com/guoz14) — OpenRouter limited-model report (#337)
- [@samchenku](https://github.com/samchenku) — indicator name normalization (#490)
- [@JasonOA888](https://github.com/JasonOA888) — y_finance pandas import fix (#488)
- [@tiffanychum](https://github.com/tiffanychum) — stale import cleanup (#499)
- [@zaizou](https://github.com/zaizou) — Docker permission issue (#519)
- [@Stosman123](https://github.com/Stosman123), [@mauropuga](https://github.com/mauropuga), [@hotwind2015](https://github.com/hotwind2015) — Windows encoding bug reports (#543, #550, #576)
- [@nnishad](https://github.com/nnishad), [@atharvajoshi01](https://github.com/atharvajoshi01) — encoding fix proposals (#568, #549)
## [0.2.3] — 2026-03-29
### Added
- **Multi-language output** for analyst reports and final decisions, with a
CLI selector. Internal agent debate stays in English for reasoning quality. (#472)
- **GPT-5.4 family models** in the default catalog, with deep/quick model split.
- **Unified model catalog** as a single source of truth for CLI options and
provider validation.
### Changed
- `base_url` is forwarded to Google and Anthropic clients so corporate proxies
work consistently across providers. (#427)
- Standardised the Google `api_key` parameter to the unified `api_key` form.
### Fixed
- Backtesting fetchers no longer leak look-ahead data when `curr_date` is in
the middle of a fetched window. (#475)
- Invalid indicator names from the LLM are caught at the tool boundary instead
of crashing the run. (#429)
- yfinance news fetchers respect the same exponential-backoff retry as price
fetchers. (#445)
### Contributors
- [@ahmedk20](https://github.com/ahmedk20) — multi-language output (#472)
- [@CadeYu](https://github.com/CadeYu) — model catalog typing (#464)
- [@javierdejesusda](https://github.com/javierdejesusda) — unified Google API key parameter (#453)
- [@voidborne-d](https://github.com/voidborne-d) — yfinance news retry (#445)
- [@kostakost2](https://github.com/kostakost2) — look-ahead bias report (#475)
- [@lu-zhengda](https://github.com/lu-zhengda) — proxy/base_url support request (#427)
- [@VamsiKrishna2021](https://github.com/VamsiKrishna2021) — invalid indicator crash report (#429)
## [0.2.2] — 2026-03-22
### Added
- **Five-tier rating scale** (Buy / Overweight / Hold / Underweight / Sell)
introduced for the Portfolio Manager.
- **Anthropic effort level** support for Claude models.
- **OpenAI Responses API** path for native OpenAI models.
### Changed
- `risk_manager` renamed to `portfolio_manager` to match the role description
shown in the CLI display.
- Exchange-qualified tickers (e.g. `7203.T`, `BRK.B`) preserved across all
agent prompts and tool calls.
- Process-level UTF-8 default attempted for cross-platform consistency
(note: this approach did not actually take effect; replaced in v0.2.4 with
explicit per-call `encoding="utf-8"` arguments).
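The replacement approach is mechanical: pass `encoding="utf-8"` at every call site instead of relying on a process-wide default. A small self-contained illustration (the filename is arbitrary):

```python
import tempfile
from pathlib import Path

# Explicit encoding on every call, as in the v0.2.4 fix; without it,
# Windows defaults to cp1252 and raises UnicodeEncodeError on
# characters outside that codepage.
with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "report.md"
    p.write_text("résumé ✓", encoding="utf-8")
    text = p.read_text(encoding="utf-8")
print(text)  # résumé ✓
```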
### Fixed
- yfinance rate-limit errors are retried with exponential backoff. (#426)
- HTTP client SSL customisation is supported for environments that need
custom certificate bundles. (#379)
- Report-section writes handle list-of-string content gracefully.
### Contributors
- [@CadeYu](https://github.com/CadeYu) — exchange-qualified ticker preservation (#413)
- [@yang1002378395-cmyk](https://github.com/yang1002378395-cmyk) — HTTP client SSL customisation (#379)
## [0.2.1] — 2026-03-15
### Security
- Patched `langchain-core` vulnerability (LangGrinch). (#335)
- Removed `chainlit` dependency affected by CVE-2026-22218.
### Added
- `pyproject.toml` build-system configuration; the project now installs via
modern packaging tooling.
### Removed
- `setup.py` — dependencies consolidated to `pyproject.toml`.
### Fixed
- Risk manager reads the correct fundamental report source. (#341)
- All `open()` calls receive an explicit UTF-8 encoding (initial pass).
- `get_indicators` tool handles comma-separated indicator names from the LLM. (#368)
- `Propagation` initialises every debate-state field so risk debaters never
see missing keys.
- Stock data parsing tolerates malformed CSVs and NaN values.
- Conditional debate logic respects the configured round count. (#361)
### Contributors
- [@RinZ27](https://github.com/RinZ27) — `langchain-core` security patch (#335)
- [@Ljx-007](https://github.com/Ljx-007) — risk manager fundamental-report fix (#341)
- [@makk9](https://github.com/makk9) — debate-rounds config issue (#361)
## [0.2.0] — 2026-02-04
This is the largest release since the initial public version. The framework
moved from single-provider to a multi-provider architecture and grew several
production-ready surfaces.
### Added
- **Multi-provider LLM support** (OpenAI, Google, Anthropic, xAI, OpenRouter,
Ollama) via a factory pattern, with provider-specific thinking configurations.
- **Alpha Vantage** integration as a configurable primary data provider, with
yfinance as a community-stability fallback.
- **Footer statistics** in the CLI: real-time tracking of LLM calls, tool
calls, and token usage via LangChain callbacks.
- **Post-analysis report saving** — the framework writes per-section markdown
files (analyst reports, debate transcripts, final decision) when a run
completes.
- **Announcements panel** — fetches updates from `api.tauric.ai/v1/announcements`
for the CLI welcome screen.
- **Tool fallbacks** so a single vendor outage does not stop the pipeline.
### Changed
- Risky / Safe risk debaters renamed to **Aggressive / Conservative** for
consistency with the displayed agent labels.
- Default data vendor switched to balance reliability and quota across
community deployments.
- Ollama and OpenRouter model lists updated; default endpoints clarified.
### Fixed
- Analyst status tracking and message deduplication in the live display.
- Infinite-loop guard in the agent loop; reflection and logging hardened.
- Various data-vendor implementation bugs and tool-signature mismatches.
### Contributors
This release is the first with substantial outside contributions; many community
PRs from late 2025 also landed here.
- [@luohy15](https://github.com/luohy15) — Alpha Vantage data-vendor integration (#235)
- [@EdwardoSunny](https://github.com/EdwardoSunny) — yfinance fetching optimisations (#245)
- [@Mirza-Samad-Ahmed-Baig](https://github.com/Mirza-Samad-Ahmed-Baig) — infinite-loop guard, reflection, and logging fixes (#89)
- [@ZeroAct](https://github.com/ZeroAct) — saved results path support (#29)
- [@Zhongyi-Lu](https://github.com/Zhongyi-Lu) — `.env` gitignore (#49)
- [@csoboy](https://github.com/csoboy) — local Ollama setup (#53)
- [@chauhang](https://github.com/chauhang) — initial Docker support attempt (#47, later reverted; the merged Docker support shipped in v0.2.4)
## [0.1.1] — 2025-06-07
### Removed
- Static site assets that had been bundled with v0.1.0; the public site now
lives separately.
## [0.1.0] — 2025-06-05
### Added
- **Initial public release** of the TradingAgents multi-agent trading
framework: market / sentiment / news / fundamentals analysts; bull and bear
researchers; trader; aggressive, conservative, and neutral risk debaters;
portfolio manager. LangGraph orchestration, yfinance data, per-agent
BM25 memory, single-provider OpenAI integration, interactive CLI.
[0.2.4]: https://github.com/TauricResearch/TradingAgents/compare/v0.2.3...v0.2.4
[0.2.3]: https://github.com/TauricResearch/TradingAgents/compare/v0.2.2...v0.2.3
[0.2.2]: https://github.com/TauricResearch/TradingAgents/compare/v0.2.1...v0.2.2
[0.2.1]: https://github.com/TauricResearch/TradingAgents/compare/v0.2.0...v0.2.1
[0.2.0]: https://github.com/TauricResearch/TradingAgents/compare/v0.1.1...v0.2.0
[0.1.1]: https://github.com/TauricResearch/TradingAgents/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/TauricResearch/TradingAgents/releases/tag/v0.1.0


@@ -28,6 +28,7 @@
# TradingAgents: Multi-Agents LLM Financial Trading Framework
## News
- [2026-04] **TradingAgents v0.2.4** released with structured-output agents (Research Manager, Trader, Portfolio Manager), LangGraph checkpoint resume, persistent decision log, DeepSeek/Qwen/GLM/Azure provider support, Docker, and a Windows UTF-8 encoding fix. See [CHANGELOG.md](CHANGELOG.md) for the full list.
- [2026-03] **TradingAgents v0.2.3** released with multi-language support, GPT-5.4 family models, unified model catalog, backtesting date fidelity, and proxy support.
- [2026-03] **TradingAgents v0.2.2** released with GPT-5.4/Gemini 3.1/Claude 4.6 model coverage, five-tier rating scale, OpenAI Responses API, Anthropic effort control, and cross-platform stability.
- [2026-02] **TradingAgents v0.2.0** released with multi-provider LLM support (GPT-5.x, Gemini 3.x, Claude 4.x, Grok 4.x) and improved system architecture.
@@ -251,6 +252,8 @@ _, decision = ta.propagate("NVDA", "2026-01-15")
We welcome contributions from the community! Whether it's fixing a bug, improving documentation, or suggesting a new feature, your input helps make this project better. If you are interested in this line of research, please consider joining our open-source financial AI research community [Tauric Research](https://tauric.ai/).
Past contributions, including code, design feedback, and bug reports, are credited per release in [`CHANGELOG.md`](CHANGELOG.md).
## Citation
Please cite our work if you find *TradingAgents* helpful :)


@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "tradingagents"
version = "0.2.3"
version = "0.2.4"
description = "TradingAgents: Multi-Agents LLM Financial Trading Framework"
readme = "README.md"
requires-python = ">=3.10"


@@ -0,0 +1,176 @@
"""End-to-end smoke for structured-output agents against a real LLM provider.
Runs the three decision-making agents (Research Manager, Trader, Portfolio
Manager) directly with their structured-output bindings and prints the
typed Pydantic instance + the rendered markdown for each. Use this to
verify a provider's native structured-output mode (json_schema for
OpenAI / xAI / DeepSeek / Qwen / GLM, response_schema for Gemini, tool-use
for Anthropic) returns clean instances on the schemas we ship.
Usage:
OPENAI_API_KEY=... python scripts/smoke_structured_output.py openai
GOOGLE_API_KEY=... python scripts/smoke_structured_output.py google
ANTHROPIC_API_KEY=... python scripts/smoke_structured_output.py anthropic
DEEPSEEK_API_KEY=... python scripts/smoke_structured_output.py deepseek
The script does NOT call propagate(), to keep the surface tight and the
cost low — it exercises only the three structured-output calls we just
added, plus the heuristic SignalProcessor.
"""
from __future__ import annotations
import argparse
import os
import sys
from tradingagents.agents.managers.portfolio_manager import create_portfolio_manager
from tradingagents.agents.managers.research_manager import create_research_manager
from tradingagents.agents.trader.trader import create_trader
from tradingagents.graph.signal_processing import SignalProcessor
from tradingagents.llm_clients import create_llm_client
PROVIDER_DEFAULTS = {
"openai": ("gpt-5.4-mini", None),
"google": ("gemini-2.5-flash", None),
"anthropic": ("claude-sonnet-4-6", None),
"deepseek": ("deepseek-chat", None),
"qwen": ("qwen-plus", None),
"glm": ("glm-5", None),
"xai": ("grok-4", None),
}
# Minimal but realistic state for the three agents.
DEBATE_HISTORY = """
Bull Analyst: NVDA's data-center revenue grew 60% YoY last quarter, driven by
Blackwell ramp; sovereign AI deals with multiple governments add a $40B+
multi-year tailwind. Margins remain above peer average.
Bear Analyst: Concentration risk is real — top three customers are >40% of
revenue. Any pause in hyperscaler capex would compress the multiple. China
export restrictions still cap a meaningful portion of demand.
"""
def _make_rm_state():
return {
"company_of_interest": "NVDA",
"investment_debate_state": {
"history": DEBATE_HISTORY,
"bull_history": "Bull Analyst: NVDA's data-center revenue grew 60% YoY...",
"bear_history": "Bear Analyst: Concentration risk is real...",
"current_response": "",
"judge_decision": "",
"count": 1,
},
}
def _make_trader_state(investment_plan: str):
return {
"company_of_interest": "NVDA",
"investment_plan": investment_plan,
}
def _make_pm_state(investment_plan: str, trader_plan: str):
return {
"company_of_interest": "NVDA",
"past_context": "",
"risk_debate_state": {
"history": "Aggressive: lean in. Conservative: trim. Neutral: balanced sizing.",
"aggressive_history": "Aggressive: ...",
"conservative_history": "Conservative: ...",
"neutral_history": "Neutral: ...",
"judge_decision": "",
"current_aggressive_response": "",
"current_conservative_response": "",
"current_neutral_response": "",
"count": 1,
},
"market_report": "Market report.",
"sentiment_report": "Sentiment report.",
"news_report": "News report.",
"fundamentals_report": "Fundamentals report.",
"investment_plan": investment_plan,
"trader_investment_plan": trader_plan,
}
def _print_section(title: str, content: str) -> None:
bar = "=" * 70
print(f"\n{bar}\n{title}\n{bar}\n{content}")
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("provider", choices=list(PROVIDER_DEFAULTS.keys()))
parser.add_argument("--deep-model", default=None, help="Override deep_think_llm")
parser.add_argument("--quick-model", default=None, help="Override quick_think_llm")
args = parser.parse_args()
default_model, _ = PROVIDER_DEFAULTS[args.provider]
deep_model = args.deep_model or default_model
quick_model = args.quick_model or default_model
print(f"Provider: {args.provider}")
print(f"Deep model: {deep_model}")
print(f"Quick model: {quick_model}")
# Build the LLM clients via the framework's factory.
deep_client = create_llm_client(provider=args.provider, model=deep_model)
quick_client = create_llm_client(provider=args.provider, model=quick_model)
deep_llm = deep_client.get_llm()
quick_llm = quick_client.get_llm()
# 1) Research Manager
rm = create_research_manager(deep_llm)
rm_result = rm(_make_rm_state())
investment_plan = rm_result["investment_plan"]
_print_section("[1] Research Manager — investment_plan", investment_plan)
# 2) Trader (consumes RM's plan)
trader = create_trader(quick_llm)
trader_result = trader(_make_trader_state(investment_plan))
trader_plan = trader_result["trader_investment_plan"]
_print_section("[2] Trader — trader_investment_plan", trader_plan)
# 3) Portfolio Manager (consumes both)
pm = create_portfolio_manager(deep_llm)
pm_result = pm(_make_pm_state(investment_plan, trader_plan))
final_decision = pm_result["final_trade_decision"]
_print_section("[3] Portfolio Manager — final_trade_decision", final_decision)
# 4) SignalProcessor extracts the rating with zero LLM calls.
sp = SignalProcessor()
rating = sp.process_signal(final_decision)
_print_section("[4] SignalProcessor → rating", rating)
# 5) Lightweight checks: each rendered output should carry the expected
# section headers so downstream consumers (memory log, CLI display,
# saved reports) keep working.
checks = [
("Research Manager", investment_plan, ["**Recommendation**:"]),
("Trader", trader_plan, ["**Action**:", "FINAL TRANSACTION PROPOSAL:"]),
("Portfolio Manager", final_decision, ["**Rating**:", "**Executive Summary**:", "**Investment Thesis**:"]),
]
print("\n" + "=" * 70 + "\nStructure checks\n" + "=" * 70)
failures = 0
for name, text, required in checks:
for marker in required:
ok = marker in text
print(f" {'PASS' if ok else 'FAIL'} {name}: contains {marker!r}")
failures += int(not ok)
print()
if failures:
print(f"Smoke FAILED: {failures} structure check(s) missing.")
return 1
print("Smoke PASSED: structured output → rendered markdown chain works for", args.provider)
return 0
if __name__ == "__main__":
sys.exit(main())
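The SignalProcessor step above needs no LLM call because the rating is recoverable from the rendered markdown deterministically. A hedged sketch of such a heuristic, keyed off the same `**Rating**:` marker the structure checks assert on (the shipped implementation may differ in details, including its fallback behaviour):

```python
import re

RATINGS = ("Buy", "Overweight", "Hold", "Underweight", "Sell")
_RATING_RE = re.compile(r"\*\*Rating\*\*:\s*(" + "|".join(RATINGS) + r")")

def extract_rating(final_decision_md: str) -> str:
    """Deterministic parse of the 5-tier rating from rendered markdown."""
    m = _RATING_RE.search(final_decision_md)
    return m.group(1) if m else "Hold"  # conservative fallback (assumption)

print(extract_rating("**Rating**: Overweight\n**Executive Summary**: ..."))
# Overweight
```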


@@ -291,6 +291,59 @@ class TestTradingMemoryLogCore:
assert log.load_entries() == []
assert log.get_past_context("NVDA") == ""
# Rotation: opt-in cap on resolved entries
def test_rotation_disabled_by_default(self, tmp_path):
"""Without max_entries, all resolved entries are kept."""
log = make_log(tmp_path)
for i in range(7):
_resolve_entry(log, "NVDA", f"2026-01-{i+1:02d}", DECISION_BUY, f"Lesson {i}.")
assert len(log.load_entries()) == 7
def test_rotation_prunes_oldest_resolved(self, tmp_path):
"""When max_entries is set and exceeded, oldest resolved entries are pruned."""
log = TradingMemoryLog({
"memory_log_path": str(tmp_path / "trading_memory.md"),
"memory_log_max_entries": 3,
})
# Resolve 5 entries; rotation should keep only the 3 most recent.
for i in range(5):
_resolve_entry(log, "NVDA", f"2026-01-{i+1:02d}", DECISION_BUY, f"Lesson {i}.")
entries = log.load_entries()
assert len(entries) == 3
# Confirm the OLDEST were dropped, not the newest.
dates = [e["date"] for e in entries]
assert dates == ["2026-01-03", "2026-01-04", "2026-01-05"]
def test_rotation_never_prunes_pending(self, tmp_path):
"""Pending entries (unresolved) are kept regardless of the cap."""
log = TradingMemoryLog({
"memory_log_path": str(tmp_path / "trading_memory.md"),
"memory_log_max_entries": 2,
})
# 3 resolved + 2 pending. With cap=2, only 2 resolved survive; both pending stay.
for i in range(3):
_resolve_entry(log, "NVDA", f"2026-01-{i+1:02d}", DECISION_BUY, f"Resolved {i}.")
log.store_decision("NVDA", "2026-02-01", DECISION_BUY)
log.store_decision("NVDA", "2026-02-02", DECISION_OVERWEIGHT)
# Trigger rotation by resolving one more entry — pending entries must stay.
_resolve_entry(log, "NVDA", "2026-01-04", DECISION_BUY, "Resolved 3.")
entries = log.load_entries()
pending = [e for e in entries if e["pending"]]
resolved = [e for e in entries if not e["pending"]]
assert len(pending) == 2, "pending entries must never be pruned"
assert len(resolved) == 2, f"expected 2 resolved after rotation, got {len(resolved)}"
def test_rotation_under_cap_is_noop(self, tmp_path):
"""No rotation when resolved count <= max_entries."""
log = TradingMemoryLog({
"memory_log_path": str(tmp_path / "trading_memory.md"),
"memory_log_max_entries": 10,
})
for i in range(3):
_resolve_entry(log, "NVDA", f"2026-01-{i+1:02d}", DECISION_BUY, f"Lesson {i}.")
assert len(log.load_entries()) == 3
# Rating parsing: markdown bold and numbered list formats
def test_rating_parsed_from_bold_markdown(self, tmp_path):


@@ -17,11 +17,14 @@ class TradingMemoryLog:
_REFLECTION_RE = re.compile(r"REFLECTION:\n(.*?)$", re.DOTALL)
def __init__(self, config: dict = None):
cfg = config or {}
self._log_path = None
path = (config or {}).get("memory_log_path")
path = cfg.get("memory_log_path")
if path:
self._log_path = Path(path).expanduser()
self._log_path.parent.mkdir(parents=True, exist_ok=True)
# Optional cap on resolved entries. None disables rotation.
self._max_entries = cfg.get("memory_log_max_entries")
# --- Write path (Phase A) ---
@@ -153,6 +156,7 @@ class TradingMemoryLog:
if not updated:
return
new_blocks = self._apply_rotation(new_blocks)
new_text = self._SEPARATOR.join(new_blocks)
tmp_path = self._log_path.with_suffix(".tmp")
tmp_path.write_text(new_text, encoding="utf-8")
@@ -206,6 +210,7 @@ class TradingMemoryLog:
if not matched:
new_blocks.append(block)
new_blocks = self._apply_rotation(new_blocks)
new_text = self._SEPARATOR.join(new_blocks)
tmp_path = self._log_path.with_suffix(".tmp")
tmp_path.write_text(new_text, encoding="utf-8")
@@ -213,6 +218,43 @@ class TradingMemoryLog:
# --- Helpers ---
def _apply_rotation(self, blocks: List[str]) -> List[str]:
"""Drop oldest resolved blocks when their count exceeds max_entries.
Pending blocks are always kept (they represent unprocessed work).
Returns ``blocks`` unchanged when rotation is disabled or under cap.
"""
if not self._max_entries or self._max_entries <= 0:
return blocks
# Tag each block as (block, is_resolved) by parsing its tag-line marker.
decisions = []
for block in blocks:
stripped = block.strip()
if not stripped:
decisions.append((block, False))
continue
tag_line = stripped.splitlines()[0].strip()
is_resolved = (
tag_line.startswith("[")
and tag_line.endswith("]")
and not tag_line.endswith("| pending]")
)
decisions.append((block, is_resolved))
resolved_count = sum(1 for _, r in decisions if r)
if resolved_count <= self._max_entries:
return blocks
to_drop = resolved_count - self._max_entries
kept: List[str] = []
for block, is_resolved in decisions:
if is_resolved and to_drop > 0:
to_drop -= 1
continue
kept.append(block)
return kept
def _parse_entry(self, raw: str) -> Optional[dict]:
lines = raw.strip().splitlines()
if not lines:


@@ -7,6 +7,10 @@ DEFAULT_CONFIG = {
"results_dir": os.getenv("TRADINGAGENTS_RESULTS_DIR", os.path.join(_TRADINGAGENTS_HOME, "logs")),
"data_cache_dir": os.getenv("TRADINGAGENTS_CACHE_DIR", os.path.join(_TRADINGAGENTS_HOME, "cache")),
"memory_log_path": os.getenv("TRADINGAGENTS_MEMORY_LOG_PATH", os.path.join(_TRADINGAGENTS_HOME, "memory", "trading_memory.md")),
# Optional cap on the number of resolved memory log entries. When set,
# the oldest resolved entries are pruned once this limit is exceeded.
# Pending entries are never pruned. None disables rotation entirely.
"memory_log_max_entries": None,
# LLM settings
"llm_provider": "openai",
"deep_think_llm": "gpt-5.4",


@@ -18,6 +18,22 @@ class NormalizedChatOpenAI(ChatOpenAI):
def invoke(self, input, config=None, **kwargs):
return normalize_content(super().invoke(input, config, **kwargs))
def with_structured_output(self, schema, *, method=None, **kwargs):
"""Wrap with structured output, defaulting to function_calling for OpenAI.
langchain-openai's Responses-API-parse path (the default for json_schema
when use_responses_api=True) calls response.model_dump(...) on the OpenAI
SDK's union-typed parsed response, which makes Pydantic emit ~20
PydanticSerializationUnexpectedValue warnings per call. The function-calling
path returns a plain tool-call shape that does not trigger that
serialization, so it is the cleaner choice for our combination of
use_responses_api=True + with_structured_output. Both paths use OpenAI's
strict mode and produce the same typed Pydantic instance.
"""
if method is None:
method = "function_calling"
return super().with_structured_output(schema, method=method, **kwargs)
# Kwargs forwarded from user config to ChatOpenAI
_PASSTHROUGH_KWARGS = (
"timeout", "max_retries", "reasoning_effort",