AI Business Consultant

Deep Reinforcement Learning for Trading

29 June, 2023

Trading in financial markets is a complex and dynamic endeavor that requires sophisticated strategies to navigate successfully. In recent years, deep reinforcement learning (RL) has emerged as a promising approach to develop intelligent trading agents. In this blog post, I will take you through the process of setting up and conducting a deep RL trading project, focusing on the entire workflow from building and testing the algorithm to deploying it in a simulated trading account at Interactive Brokers.

Understanding the Purpose

Before diving into the technical details, it's essential to clarify the project's purpose. While many tutorials on deep RL for trading exist, few provide insights into integrating brokers into the process. The goal of this project is to address this gap and create a framework for building, testing, and deploying trading algorithms using broker integration. The emphasis will be on the overall workflow, with the intention of rapid implementation and iterative improvements, rather than aiming for an immediately profitable trading algorithm.

The Challenges of Backtesting

Backtesting, the process of evaluating a trading strategy on historical data, has its own set of challenges, and it's important to keep them in mind when assessing trading performance. Common pitfalls include in-sample backtesting (evaluating on the same data used to fit the model), survivorship-biased data (datasets that leave out delisted or failed companies), overfitting, and the data mining fallacy (testing many strategies and reporting only the one that happened to work). Acknowledging these challenges lets us focus on continuous improvement through hyperparameter tuning, trading logic refinement, additional feature engineering, and model development.
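To make the in-sample pitfall concrete, here is a minimal sketch of a chronological train/test split. The function name `chronological_split`, the 80/20 ratio, and the toy closing prices are all illustrative assumptions, not part of the original project code. The point is only that the test rows must come strictly after the training rows in time, with no shuffling.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[List, List]:
    """Split time-ordered rows into an earlier training set and a later test set."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    # No shuffling: every test row comes after every training row in time.
    return list(rows[:cut]), list(rows[cut:])


# Toy closing prices, oldest first (made-up numbers for illustration only).
closes = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 102.7, 105.2, 106.0, 107.4]
train, test = chronological_split(closes, train_fraction=0.8)
```

Any preprocessing that learns from the data, such as fitting a scaler, would then be fitted on `train` only and applied unchanged to `test`.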

The Project Steps

Now, let's explore the main steps involved in setting up and conducting the deep RL trading project:

1. Imports, Inputs, & Helper Functions

2. Building an LSTM Model for Price Prediction

3. Building a CNN Model for Price Movement Prediction

4. Building a Reinforcement Learning Trading Agent

5. Combining the Models & Deploying at Interactive Brokers

Let's look more closely at each of these steps.

Step 1. Imports, Inputs, & Helper Functions

To start, we need to import the necessary packages, define our model and trading inputs, and create helper functions for the algorithm. We'll use scikit-learn for data preprocessing and evaluation, Keras for the LSTM and CNN deep learning models, and IB-insync for interacting with the Interactive Brokers TWS API. The inputs include the stock symbols, DataFrame timeframe, LSTM architecture inputs, CNN logic, IB timeframe loop, daily portfolio stop loss threshold, percent allocation, and stop loss and take profit thresholds. Finally, we'll create helper functions for data manipulation, preprocessing, and retrieval.
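One way to keep these inputs together is a single configuration object. The sketch below is only an illustration: the class name `TradingConfig`, every field name, every default value, and the helper `daily_stop_hit` are hypothetical placeholders rather than the project's actual settings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TradingConfig:
    # All names and defaults below are illustrative assumptions.
    symbols: List[str] = field(default_factory=lambda: ["AAPL", "MSFT"])
    bar_size: str = "1 day"             # DataFrame timeframe
    lstm_lookback: int = 60             # LSTM architecture input
    lstm_units: int = 50                # LSTM architecture input
    cnn_long_threshold: float = 0.6     # CNN logic: go long at or above this
    cnn_short_threshold: float = 0.4    # CNN logic: go short at or below this
    loop_seconds: int = 60              # how often the IB loop runs
    daily_stop_loss_pct: float = 0.02   # daily portfolio stop loss threshold
    allocation_pct: float = 0.10        # share of balance per trade
    stop_loss_pct: float = 0.01         # per-trade stop loss
    take_profit_pct: float = 0.02       # per-trade take profit

    def __post_init__(self) -> None:
        if self.cnn_short_threshold >= self.cnn_long_threshold:
            raise ValueError("short threshold must be below long threshold")
        if not 0.0 < self.allocation_pct <= 1.0:
            raise ValueError("allocation_pct must be in (0, 1]")


def daily_stop_hit(start_equity: float, current_equity: float, config: TradingConfig) -> bool:
    """Return True once the day's loss reaches the portfolio stop loss threshold."""
    loss_fraction = (start_equity - current_equity) / start_equity
    return loss_fraction >= config.daily_stop_loss_pct


config = TradingConfig()
```

Validating the inputs once, at construction time, means a mistyped threshold fails immediately instead of in the middle of a trading loop.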

Step 2. Building an LSTM Model for Price Prediction

Next, we'll build a Long Short-Term Memory (LSTM) model, which is a type of recurrent neural network (RNN), for price prediction. LSTMs are well-suited for modeling time-based or sequence-based data. In the context of trading, our LSTM model will predict the price one day into the future. Its inputs are the open, high, low, close, and volume, plus the 20-day, 50-day, and 200-day simple moving averages (SMA). Further development will involve additional feature engineering to enhance performance.
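Before any LSTM can be trained, the price history has to be turned into moving-average features and fixed-length input windows. The following is a rough sketch of that preparation in plain Python. The helper names `sma` and `make_windows`, the 3-day window, and the toy prices are assumptions chosen to keep the example small; the real project would use the 20, 50, and 200-day windows and the full set of inputs.

```python
from typing import List, Optional, Tuple


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window of data is available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def make_windows(
    features: List[List[float]], targets: List[float], lookback: int
) -> Tuple[List[List[List[float]]], List[float]]:
    """Pair each block of `lookback` feature rows with the following day's target."""
    X, y = [], []
    for end in range(lookback, len(features)):
        X.append(features[end - lookback:end])  # days end-lookback .. end-1
        y.append(targets[end])                  # the day right after the window
    return X, y


# Toy closes, oldest first (made-up numbers for illustration only).
closes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sma3 = sma(closes, window=3)

# Keep only the days where the moving average is defined.
rows = [[c, m] for c, m in zip(closes, sma3) if m is not None]
next_closes = [c for c, m in zip(closes, sma3) if m is not None]
X, y = make_windows(rows, next_closes, lookback=2)
```

The nested list `X` is laid out as samples, then time steps, then features, which is the three-dimensional input layout a Keras LSTM layer expects once it is converted to an array.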

Step 3. Building a CNN Model for Price Movement Prediction

In this step, we'll build a Convolutional Neural Network (CNN) model to predict the probability of a price increase or decrease. CNNs are commonly used for analyzing visual images, but in trading, they can provide insights into price movements. The output of the CNN model will be a probability, indicating the likelihood of an up or down move in the next candlestick. During broker integration, we'll set up thresholds for trading based on this probability.
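As a rough illustration of how such thresholds might work, the sketch below maps a probability to a trade signal. It assumes the CNN output is the probability of an up move; the function name `signal_from_probability` and the 0.6 and 0.4 cutoffs are hypothetical choices, not values from the project.

```python
def signal_from_probability(
    p_up: float, long_threshold: float = 0.6, short_threshold: float = 0.4
) -> int:
    """Map an up-move probability to 1 (long), -1 (short), or 0 (no trade)."""
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be a probability between 0 and 1")
    if short_threshold >= long_threshold:
        raise ValueError("short_threshold must be below long_threshold")
    if p_up >= long_threshold:
        return 1
    if p_up <= short_threshold:
        return -1
    # Probabilities between the two thresholds are treated as too uncertain.
    return 0


signals = [signal_from_probability(p) for p in (0.75, 0.50, 0.20)]
```

Leaving a neutral band between the two thresholds keeps the agent out of the market when the model is close to a coin flip.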

Step 4. Building a Reinforcement Learning Trading Agent

The next phase involves building a reinforcement learning trading agent that aims to maximize the Sharpe ratio, a measure of risk-adjusted return. We'll leverage an open-source RL model developed by Teddy Koker and use gradient ascent to optimize the Sharpe ratio over a set of training data. The RL agent will generate a value between -1 and 1, representing the fraction of our portfolio to allocate to an asset. Positive values indicate a long trade, while negative values indicate a short trade. The goal is to create a strategy with a high Sharpe ratio when tested on out-of-sample data.
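To give a feel for the idea, here is a small self-contained sketch of Sharpe ratio maximization by gradient ascent. It is not Teddy Koker's code. It assumes a tanh policy over the last few returns and the previous position, estimates the gradient with finite differences instead of an analytic derivative, and uses a hypothetical transaction cost and synthetic returns. All function names and parameter values are illustrative.

```python
import math
from typing import List


def positions(theta: List[float], returns: List[float], lookback: int) -> List[float]:
    """Position at t is tanh(theta . [1, last `lookback` returns, previous position])."""
    pos = [0.0] * len(returns)
    for t in range(lookback, len(returns)):
        x = [1.0] + returns[t - lookback:t] + [pos[t - 1]]
        pos[t] = math.tanh(sum(w * xi for w, xi in zip(theta, x)))
    return pos


def sharpe_ratio(rets: List[float]) -> float:
    """Mean return divided by its (population) standard deviation."""
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / len(rets)
    return mean / math.sqrt(var) if var > 0 else 0.0


def strategy_sharpe(theta: List[float], returns: List[float], lookback: int,
                    cost: float = 0.0025) -> float:
    """Sharpe ratio of the strategy: held position times return, minus trading cost."""
    pos = positions(theta, returns, lookback)
    strat = [pos[t - 1] * returns[t] - cost * abs(pos[t] - pos[t - 1])
             for t in range(lookback, len(returns))]
    return sharpe_ratio(strat)


def train(returns: List[float], theta: List[float], lookback: int,
          epochs: int = 100, lr: float = 0.1, eps: float = 1e-5) -> List[float]:
    """Gradient ascent on the Sharpe ratio; returns the best parameters seen."""
    theta = list(theta)
    best_theta, best = list(theta), strategy_sharpe(theta, returns, lookback)
    for _ in range(epochs):
        grad = []
        for i in range(len(theta)):
            up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
            down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
            # Central finite-difference estimate of d(Sharpe)/d(theta_i).
            grad.append((strategy_sharpe(up, returns, lookback)
                         - strategy_sharpe(down, returns, lookback)) / (2 * eps))
        theta = [w + lr * g for w, g in zip(theta, grad)]  # ascent step
        current = strategy_sharpe(theta, returns, lookback)
        if current > best:
            best_theta, best = list(theta), current
    return best_theta


# Synthetic returns for illustration only, not market data.
toy_returns = [0.01 * math.sin(0.5 * t) + 0.001 for t in range(60)]
lookback = 3
initial_theta = [0.1] * (lookback + 2)  # bias + lookback returns + previous position
trained_theta = train(toy_returns, initial_theta, lookback)
```

Because the policy ends in a tanh, every position automatically lands between -1 and 1. A fair evaluation would compute `strategy_sharpe` with the trained parameters on returns that were not used for training.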

Step 5. Combining the Models & Deploying at Interactive Brokers

In the final step, we'll integrate our LSTM, CNN, and reinforcement learning models into a cohesive deep RL trading agent. This agent will incorporate the trading logic, including API connections, available balance retrieval from Interactive Brokers, position sizing calculations, and the definition of trading rules for both long and short trades. Once the strategy is initialized and executed, we'll deploy the algorithm in a simulated trading account at Interactive Brokers to evaluate its performance.
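The position sizing part can be sketched without any broker connection. In the example below, the function name `size_order`, the dictionary layout, and the default percentages are hypothetical. The balance is passed in as a plain number, whereas in the live loop it would be retrieved from Interactive Brokers, and `signal` stands for the agent's output between -1 and 1.

```python
import math
from typing import Dict, Union


def size_order(
    balance: float,
    price: float,
    signal: float,
    allocation_pct: float = 0.10,
    stop_loss_pct: float = 0.01,
    take_profit_pct: float = 0.02,
) -> Dict[str, Union[str, int, float]]:
    """Turn a signal in [-1, 1] into an action, a share count, and exit prices."""
    if price <= 0 or balance < 0:
        raise ValueError("price must be positive and balance non-negative")
    if not -1.0 <= signal <= 1.0:
        raise ValueError("signal must be between -1 and 1")

    # Scale the per-trade allocation by the strength of the signal.
    budget = balance * allocation_pct * abs(signal)
    quantity = math.floor(budget / price)  # whole shares only
    if quantity == 0:
        return {"action": "HOLD", "quantity": 0}

    if signal > 0:
        # Long trade: stop below the entry price, target above it.
        return {
            "action": "BUY",
            "quantity": quantity,
            "stop_loss": price * (1 - stop_loss_pct),
            "take_profit": price * (1 + take_profit_pct),
        }
    # Short trade: stop above the entry price, target below it.
    return {
        "action": "SELL",
        "quantity": quantity,
        "stop_loss": price * (1 + stop_loss_pct),
        "take_profit": price * (1 - take_profit_pct),
    }


long_order = size_order(balance=100000.0, price=50.0, signal=0.5)
short_order = size_order(balance=100000.0, price=50.0, signal=-1.0)
flat_order = size_order(balance=100000.0, price=50.0, signal=0.0)
```

Keeping this calculation separate from the API calls makes it easy to check the trading rules on their own before any order reaches the simulated account.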


Embarking on a deep RL trading project provides valuable insights into the challenging world of financial markets and AI-driven trading algorithms. While the initial implementation may not result in an immediately profitable trading strategy, it lays the foundation for continuous improvement through model refinement, trading logic optimization, and additional feature engineering. By following this workflow, you can gain hands-on experience and build a solid foundation for exploring deep RL in the trading domain.
