tradingagents/tradingagents/agents/schemas.py
Yijia-Xiao 0fda24515f feat: structured-output Portfolio Manager + 5-tier rating consistency (#434)
Three related changes that take the rating pipeline from heuristic-only
to type-safe at the source.

1) Research Manager prompt now uses the same 5-tier scale (Buy /
   Overweight / Hold / Underweight / Sell) as the Portfolio Manager,
   signal_processing, and the memory log.  The prior 3-tier wording
   (Buy / Sell / Hold) was the only remaining inconsistency in the
   pipeline.

2) Centralise the 5-tier vocabulary and the heuristic prose-rating
   parser into tradingagents/agents/utils/rating.py.  Both the memory
   log and the signal processor now share the same parser instead of
   duplicating regex and word-walker logic.

3) Make structured output a first-class part of the Portfolio Manager's
   primary call.  The PM uses llm.with_structured_output(PortfolioDecision)
   so each provider's native structured-output mode (json_schema for
   OpenAI/xAI, response_schema for Gemini, tool-use for Anthropic,
   function_calling for OpenAI-compatible providers) yields a typed
   Pydantic instance directly.  A render helper turns that instance back
   into the same markdown shape downstream consumers (memory log, CLI
   display, saved reports) already expect, so no other code has to know
   the PM now produces structured output.  Providers without
   structured-output support fall back to free-text output, which the
   deterministic heuristic parser then reads.

   The previous SignalProcessor had been making a second LLM call to
   re-extract the rating from the PM's prose; that round-trip is now
   eliminated.  SignalProcessor is a thin adapter over parse_rating(),
   makes zero LLM calls, and stays for backwards compatibility with
   process_signal() callers.
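
The shared heuristic parser from point 2 is not shown on this page. As a rough sketch of the idea only: the name parse_rating_sketch, the regexes, and the "Hold" default below are assumptions for illustration, not the contents of tradingagents/agents/utils/rating.py.

```python
import re

# The 5-tier vocabulary named in the commit message.
RATINGS = ("Buy", "Overweight", "Hold", "Underweight", "Sell")
_WORDS = "|".join(r.lower() for r in RATINGS)

# Prefer an explicit label: "Rating: Hold", "**Rating**: Hold", "Rating: **Hold**".
_LABELLED = re.compile(rf"\brating\W{{0,6}}({_WORDS})\b", re.IGNORECASE)
# Otherwise take the first rating word that appears anywhere in the prose.
_ANY = re.compile(rf"\b({_WORDS})\b", re.IGNORECASE)


def parse_rating_sketch(text: str, default: str = "Hold") -> str:
    """Return one of RATINGS extracted from free text (hypothetical helper).

    Assumption: fall back to `default` when no rating word is found.
    """
    match = _LABELLED.search(text) or _ANY.search(text)
    if match is None:
        return default
    return match.group(1).capitalize()
```

Because the function is deterministic and makes no model call, both the memory log and a SignalProcessor-style adapter could share it.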

Schema (PortfolioDecision) captures the required rating,
executive_summary, and investment_thesis fields plus the optional
price_target and time_horizon fields, with field descriptions doubling
as output instructions.  Agent prose remains the
primary artifact; structured output is layered onto the PM only because
it is the one agent whose output has machine-readable downstream
consumers.

15 new tests cover the heuristic parser (markdown-bold edge cases that
had no coverage before), the structured PM happy path, the free-text
fallback path, and that SignalProcessor never invokes the LLM.  Full
suite: 77 tests pass in ~2s without API keys.
2026-04-25 19:57:26 +00:00

94 lines
3.5 KiB
Python

"""Pydantic schemas used by agents that produce structured output.

The framework's primary artifact is still prose: each agent's natural-language
reasoning is what users read, what gets stored in the memory log, and what
gets saved as markdown reports. Structured output is layered onto agents
whose results have downstream machine-readable consumers (currently only
the Portfolio Manager) so that:

- The rating is type-safe and never has to be regex-extracted
- Schema field descriptions become the model's output instructions
- Each provider's native structured-output mode is used (json_schema for
  OpenAI/xAI, response_schema for Gemini, tool-use for Anthropic)
- A render helper turns the parsed Pydantic instance back into the same
  markdown shape the rest of the system already consumes, so display,
  memory log, and saved reports keep working unchanged
"""

from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field


class PortfolioRating(str, Enum):
    """5-tier portfolio rating used by the Research Manager and Portfolio Manager."""

    BUY = "Buy"
    OVERWEIGHT = "Overweight"
    HOLD = "Hold"
    UNDERWEIGHT = "Underweight"
    SELL = "Sell"


class PortfolioDecision(BaseModel):
    """Structured output produced by the Portfolio Manager.

    The model fills every field as part of its primary LLM call; no separate
    extraction pass is required. Field descriptions double as the model's
    output instructions, so the prompt body only needs to convey context and
    the rating-scale guidance.
    """

    rating: PortfolioRating = Field(
        description=(
            "The final position rating. Exactly one of Buy / Overweight / Hold / "
            "Underweight / Sell, picked based on the analysts' debate."
        ),
    )
    executive_summary: str = Field(
        description=(
            "A concise action plan covering entry strategy, position sizing, "
            "key risk levels, and time horizon. Two to four sentences."
        ),
    )
    investment_thesis: str = Field(
        description=(
            "Detailed reasoning anchored in specific evidence from the analysts' "
            "debate. If prior lessons are referenced in the prompt context, "
            "incorporate them; otherwise rely solely on the current analysis."
        ),
    )
    price_target: Optional[float] = Field(
        default=None,
        description="Optional target price in the instrument's quote currency.",
    )
    time_horizon: Optional[str] = Field(
        default=None,
        description="Optional recommended holding period, e.g. '3-6 months'.",
    )


def render_pm_decision(decision: PortfolioDecision) -> str:
    """Render a PortfolioDecision back to the markdown shape the rest of the system expects.

    Memory log, CLI display, and saved report files all read this markdown,
    so the rendered output preserves the exact section headers (``**Rating**``,
    ``**Executive Summary**``, ``**Investment Thesis**``) that downstream
    parsers and the report writers already handle.
    """
    parts = [
        f"**Rating**: {decision.rating.value}",
        "",
        f"**Executive Summary**: {decision.executive_summary}",
        "",
        f"**Investment Thesis**: {decision.investment_thesis}",
    ]
    if decision.price_target is not None:
        parts.extend(["", f"**Price Target**: {decision.price_target}"])
    if decision.time_horizon:
        parts.extend(["", f"**Time Horizon**: {decision.time_horizon}"])
    return "\n".join(parts)
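

The commit message says providers without structured-output support fall back to free text. One way a caller might combine a render helper such as render_pm_decision with that fallback is sketched below. The function name run_portfolio_manager, its parameters, and the broad exception handling are assumptions for illustration; the Portfolio Manager's actual wiring is not shown on this page.

```python
from typing import Any, Callable, Optional


def run_portfolio_manager(
    structured_llm: Optional[Any],
    plain_llm: Any,
    prompt: str,
    render: Callable[[Any], str],
) -> str:
    """Return markdown for the PM decision, preferring structured output.

    Hypothetical helper. `structured_llm` stands for something like the result
    of llm.with_structured_output(PortfolioDecision), or None when the provider
    has no structured mode; `plain_llm` is the ordinary chat model.
    """
    if structured_llm is not None:
        try:
            # Structured path: a typed decision, rendered back to markdown.
            return render(structured_llm.invoke(prompt))
        except Exception:
            # Assumption: any structured-mode failure drops to free text.
            pass
    response = plain_llm.invoke(prompt)
    # Chat models usually return a message with `.content`; accept plain str too.
    return getattr(response, "content", response)
```

Either path returns the same markdown shape, so downstream consumers do not need to know which one ran.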