Reading Between the Lines: Using LLMs to Forecast Market Moves After Federal Reserve Signals
Central bank speeches move markets. Always have. But now, you’ve got a new edge—Large Language Models (LLMs) can decode the Fed faster and more precisely than humans ever could.
You already know the stakes. A single shift in tone—from “moderately elevated” to “persistently high” inflation—can send equities plunging, push bond yields lower, or trigger a flight to the dollar. Traders have spent decades trying to read between the lines of policy statements. Now, LLMs are doing it better.
Why Central Bank Language Moves Markets
Markets don’t trade on today—they trade on expectations of what’s coming next. That makes central bank communication a battlefield for market sentiment.
But central bankers don’t speak plainly. They hedge. They caveat. They use dense, sometimes contradictory language by design. Traditional NLP methods—like word counts and sentiment dictionaries—flatten nuance and often miss what truly matters: the shift in context.
What LLMs See That Old Models Miss
LLMs like GPT-4, Claude, FinBERT, and domain-specific models can understand tone, context, and shifts in forward guidance. Not just extract keywords—actually understand what’s changing.
Here’s what matters:
Semantic Shift Detection
Fine-tune an LLM on historical FOMC statements. Use vector embeddings to measure how each speech deviates from the last. It’s a quantitative way to catch subtle narrative shifts—before they become market consensus.
Context-Aware Market Labeling
Combine the text of central bank speeches with macro context (e.g., CPI prints, jobs data). Train the LLM to classify the communication: hawkish, dovish, or neutral. Add another layer—forecast asset reactions. It doesn’t just tell you what was said. It tells you what markets might do next.
Chain-of-Thought Reasoning
LLMs can explain why a certain phrase might matter. That’s powerful. It turns opaque signals into interpretable logic—without black-box handwaving.
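As a minimal sketch of the semantic-shift idea, assuming you already have an `embed()` step (from a fine-tuned model or any sentence-embedding API) that maps a speech to a fixed-length vector, the shift score is just the cosine distance between consecutive speeches:

```python
import numpy as np

def semantic_shift(prev_vec: np.ndarray, curr_vec: np.ndarray) -> float:
    """Cosine distance between consecutive speech embeddings.

    0.0 means the new speech reads the same as the last one;
    values near 1.0 flag a large narrative shift worth inspecting.
    """
    cos = np.dot(prev_vec, curr_vec) / (
        np.linalg.norm(prev_vec) * np.linalg.norm(curr_vec)
    )
    return 1.0 - float(cos)

# Toy illustration with hypothetical 3-d "embeddings":
steady = np.array([0.9, 0.1, 0.0])
dovish = np.array([0.1, 0.9, 0.0])
print(semantic_shift(steady, steady * 2))  # same direction, so near 0
print(semantic_shift(steady, dovish))      # large shift, closer to 1
```

In practice you would track this score over a rolling window of FOMC statements and alert when it spikes above its historical range.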
Fine-Tuning GPT-2 on Federal Reserve Speeches
A Step-by-Step Pipeline
To capture the unique language and tone of Federal Reserve communications, you need a tailored model trained specifically on their speeches. In this pipeline, I’ll walk you through collecting, cleaning, and preparing speech transcripts, then fine-tuning GPT-2 to understand and represent central bank language effectively.
Each step builds toward a specialized GPT model that can extract nuanced meaning from policy speeches—helping you analyze market signals with deeper insight.
Step 1: Scrape speech transcripts from the Federal Reserve website
Start by collecting the raw data. Using Python’s requests and BeautifulSoup libraries, we fetch the list of speech URLs from the Fed’s speeches page. Then, for each speech URL, we extract the main text content by targeting the relevant HTML elements. This step ensures we gather all relevant speeches to build a comprehensive training dataset.
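A sketch of this step, with the caveat that the listing URL and the paragraph-level selector below are assumptions to verify against the Fed’s current page layout:

```python
import requests
from bs4 import BeautifulSoup

# Assumed listing page for Fed speeches; check it still resolves.
FED_SPEECHES_URL = "https://www.federalreserve.gov/newsevents/speeches.htm"

def extract_speech_text(html: str) -> str:
    """Pull the main body paragraphs out of a speech page.

    The Fed's exact HTML layout changes over time, so treat this
    paragraph-level extraction as a starting point, not a guarantee.
    """
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return "\n".join(p for p in paragraphs if p)

def fetch_speech(url: str) -> str:
    """Download one speech page and return its extracted text."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return extract_speech_text(resp.text)
```

Keeping the parsing logic (`extract_speech_text`) separate from the network call makes it easy to unit-test against saved HTML fixtures.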
Step 2: Clean and consolidate the text
Raw HTML content often contains extra whitespace, formatting tags, or irrelevant sections. We can clean the text by stripping out HTML tags, joining paragraphs into a single string, and normalizing whitespace. This cleaning step prepares consistent, noise-free data for the model to learn from.
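A minimal cleaning helper along these lines (the exact regexes are illustrative; real transcripts may need extra rules for footnotes or speaker attributions):

```python
import re

def clean_speech(raw_text: str) -> str:
    """Normalize a transcript: strip leftover HTML tags and
    collapse runs of whitespace into single spaces."""
    text = re.sub(r"<[^>]+>", " ", raw_text)  # drop any residual tags
    text = re.sub(r"\s+", " ", text)          # collapse whitespace runs
    return text.strip()
```

Applying this uniformly to every scraped speech gives the model consistent input and avoids learning formatting noise.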
Step 3: Build a dataset object
Once the speeches are cleaned, we assemble them into a structured dataset using Hugging Face’s Dataset class. This allows for efficient batching, mapping, and processing during training. Let’s filter out very short texts to focus on substantial speeches.
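A sketch of this step; the 200-word cutoff is an illustrative assumption, and the Hugging Face part requires the `datasets` package:

```python
try:
    from datasets import Dataset  # Hugging Face datasets
except ImportError:
    Dataset = None

def keep_substantial(texts, min_words=200):
    """Drop very short documents before training.

    The 200-word threshold is a placeholder; tune it to your corpus.
    """
    return [t for t in texts if len(t.split()) >= min_words]

def build_dataset(texts, min_words=200):
    """Wrap the filtered speeches in a Hugging Face Dataset."""
    if Dataset is None:
        raise ImportError("pip install datasets")
    return Dataset.from_dict({"text": keep_substantial(texts, min_words)})
```

Equivalently, you could build the Dataset first and use its own `filter` method, e.g. `ds.filter(lambda ex: len(ex["text"].split()) >= 200)`.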
Step 4: Tokenize the speeches using GPT-2’s tokenizer
Then we convert the raw text into tokens — numeric representations the model understands — using the GPT-2 tokenizer. Tokenization splits text into subword units and handles padding or truncation to ensure fixed-length inputs. This step adapts the speeches into a format suitable for GPT-2’s autoregressive architecture.
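A tokenizer-agnostic wrapper for this step might look like the following; the 512-token length is an illustrative choice, not a requirement:

```python
def tokenize_batch(batch, tokenizer, max_length=512):
    """Map a batch of raw texts to fixed-length token ids.

    Note: GPT-2's tokenizer has no pad token by default, so callers
    typically set tokenizer.pad_token = tokenizer.eos_token first.
    """
    return tokenizer(
        batch["text"],
        truncation=True,
        padding="max_length",
        max_length=max_length,
    )
```

With Hugging Face, this would be applied over the dataset as `ds.map(lambda b: tokenize_batch(b, tokenizer), batched=True)`.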
Step 5: Fine-tune GPT-2 as a causal language model
We set up training using Hugging Face’s Trainer with relevant arguments (number of epochs, batch size, save intervals). We use a data collator to manage input batching without masking tokens, since GPT-2 uses causal (left-to-right) language modeling. Fine-tuning lets GPT-2 learn the unique tone, style, and structure of central bank speeches, enabling it to capture subtle shifts in language over time.
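The training setup can be sketched as follows; the hyperparameters are illustrative rather than tuned, and the imports live inside the function so the rest of the pipeline runs without `transformers` installed:

```python
def fine_tune_gpt2(tokenized_ds, output_dir="./fed-gpt2"):
    """Fine-tune GPT-2 as a causal LM on tokenized Fed speeches (sketch)."""
    from transformers import (
        DataCollatorForLanguageModeling,
        GPT2LMHeadModel,
        GPT2TokenizerFast,
        Trainer,
        TrainingArguments,
    )

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # mlm=False: causal (left-to-right) modeling, no masked tokens.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,            # illustrative values,
        per_device_train_batch_size=2, # not tuned for this corpus
        save_steps=500,
    )
    trainer = Trainer(
        model=model,
        args=args,
        data_collator=collator,
        train_dataset=tokenized_ds,
    )
    trainer.train()
    trainer.save_model(output_dir)
    return output_dir
```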
After training, save the fine-tuned model locally. We can then extract embeddings from GPT-2’s hidden layers for tasks like semantic search, sentiment classification, or detecting shifts in policy tone.
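For the embedding extraction, one common approach is mean pooling over GPT-2’s final hidden layer; here is a sketch of the pooling step, assuming you have already run the model with `output_hidden_states=True` and converted the last layer to a NumPy array:

```python
import numpy as np

def mean_pool(last_hidden: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors into one embedding per speech,
    ignoring padded positions.

    last_hidden: (batch, seq_len, hidden_dim), e.g. from
                 model(**inputs, output_hidden_states=True).hidden_states[-1]
    attention_mask: (batch, seq_len) of 0/1 values
    """
    mask = attention_mask[..., None].astype(last_hidden.dtype)  # (b, s, 1)
    return (last_hidden * mask).sum(axis=1) / mask.sum(axis=1)
```

The resulting fixed-length vectors feed directly into the semantic-shift scoring described earlier, or into a downstream hawkish/dovish classifier.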
This pipeline utilizes GPT-2’s autoregressive strengths to model long, complex policy speeches, making it a practical choice for analyzing Federal Reserve communications. Fine-tuning transforms raw text into a market-aware tool that understands not just words but their implied meaning and intent in a financial context.
A Real Case: Waller’s December 2024 Speech
Let’s break it down.
On December 2, 2024, Fed Governor Christopher Waller gave a speech; the transcript is available on the Federal Reserve’s website.
Inflation was stalling. The labor market was softening. Productivity gains remained solid.
Waller leaned dovish. He flagged the case for easing but noted more data would guide the final decision. This signaled a shift—but not a firm commitment.
Markets had priced in a 25 bp cut for the December FOMC meeting. When the Fed delivered the cut but projected higher inflation for 2025 and slower cuts ahead, equities dropped over 3% in a day. The 2-year Treasury yield initially fell—then climbed again into January.
This was a textbook case where tone, timing, and context mattered more than the headline.
What an LLM Would Have Predicted
With an LLM pipeline, you could have:
Embedded Waller’s speech alongside previous Fed statements.
Scored the semantic shift—the model detects a move from wait-and-see to leaning dovish.
Classified the tone—the model labels it dovish but cautious.
Forecast market reaction—anticipates a drop in 2-year yields, with equity risk tied to forward guidance.
Prompt Example:
“Compare the tone and forward guidance in Governor Waller’s December 2, 2024 speech to prior statements and the November FOMC statement. Identify notable shifts in language on inflation, labor market, and policy outlook. What signals should traders watch for in bonds and equities?”
Expected LLM Output:
Waller acknowledges inflation setbacks but highlights strong productivity, cooling wage growth, and labor market softening.
Shift detected from “wait-and-see” to “leaning dovish,” with openness to rate cuts if disinflation continues.
Caution introduced: “further data could sway the Fed,” signaling cuts aren’t guaranteed.
Predicts short-term drop in 2-year yields and a possible equity rally, but warns Fed’s cautious tone may limit risk appetite.
Advises traders to watch Treasury and S&P 500 futures volatility, especially if Fed communications reinforce caution.
Signal Extraction and Trading Implications:
Primary Signal: The Fed is likely to cut rates but proceed cautiously in 2025.
Secondary Signal: Short-term volatility in bonds and equities, with bond yields peaking as markets absorb the slower easing pace.
Trading Application: Position for a peak in 2-year yields late December/early January followed by decline; approach equities with caution post-FOMC due to risk of selloff if Fed messaging disappoints.
What Actually Happened
On December 18, Jerome Powell announced that the Fed was cutting rates by 25 basis points to bolster the labor market, triggering market shifts. The Fed’s statement introduced a new qualifier on the “extent and timing” of future rate cuts, suggesting a slower pace in 2025 than previously anticipated.
2-Year Yield on Dec 31, 2024: ~4.24%
Peak on Jan 13, 2025: ~4.80%
June 17, 2025: 3.94%
Markets initially rallied on the rate cut, then sold off as they absorbed the Fed’s slower path forward. Yields climbed—then fell again as disinflation took hold.
The LLM’s forecast captured the turn.
What to Watch Out For
Prompt Engineering Matters
You need tight, context-rich prompts. Generic inputs give you noise.
Data Inconsistencies
Some speeches have poor formatting or delayed transcripts. Clean, timestamped inputs are key.
False Narratives
Even with chain-of-thought, LLMs can hallucinate or misattribute causal relationships. Validate outputs before trading them.
Bottom Line
If you want to trade macro with an edge, stop relying on lagging indicators. Start decoding the signals in real time.
Central banks don’t just move rates—they move expectations. And expectations are set by words. LLMs let you quantify tone, detect surprise, and model market reaction more precisely than any human can.
Waller’s December 2024 speech hinted at the shift. A fine-tuned LLM can catch that signal—it picks up the directional tone change before the market fully reacts. You see the reflection in bond yields and equity markets, where the impact becomes tangible.
If you’re serious about building adaptive, forward-looking strategies, don’t just track the Fed. Read between the lines.