I have created a quantitative trading strategy that uses a simple machine learning model to trade SPY, as part of my ongoing research in quantitative trading. The focus here was not on creating a strategy with alpha but on developing a framework, both in my mind and in code, for building more advanced models in the future.
1. Does SPY Exhibit Short-Term Mean Reversion or Momentum?
Examining whether SPY exhibits short-term mean reversion or momentum is the central idea in this strategy. If negative returns tend to precede positive returns (or positive returns precede negative returns), that suggests that mean reversion exists. If positive returns tend to precede positive returns (or negative returns precede negative returns), that suggests that momentum exists.
In other words, short-term mean reversion or momentum exists if there is some relationship between the daily return of SPY and the lagged daily return of SPY over various time periods.
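One rough way to put a number on that relationship is the correlation between the daily return and each lagged return. The sketch below is only an illustration: it runs on synthetic prices rather than real SPY data, and `lag_vec` is a hypothetical helper, not something from the original code. A negative correlation at a given lag would hint at mean reversion and a positive one at momentum, although a linear correlation can miss patterns that only show up in the tails.

```r
# Hypothetical helper: shift a vector back by k periods, padding with NA.
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

set.seed(1)  # synthetic prices, NOT real SPY data
close <- 100 * cumprod(1 + rnorm(500, mean = 0.0003, sd = 0.01))

# Simple daily return; the first day has no previous close.
daily_return <- c(NA, diff(close) / head(close, -1))

# Correlation between today's return and the return k days earlier.
lag_cor <- sapply(1:9, function(k) {
  cor(daily_return, lag_vec(daily_return, k), use = "complete.obs")
})
names(lag_cor) <- paste0("lag", 1:9)
print(round(lag_cor, 3))
```

On random data like this the correlations should hover near zero; the interesting question is whether real SPY returns look any different.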
Below I plot the daily returns of SPY versus the lagged daily return of SPY over periods of one day to nine days. The top left plot shows the relationship between the daily return of SPY versus the one-day lagged daily return of SPY. The blue line in the plots is a smoother to aid in detecting patterns in the data.
So is there any evidence of mean reversion or momentum? The interesting plots are the first two, which show the one-day and two-day lagged returns. They suggest weak evidence of some short-term mean reversion, especially after a large negative return. There are relatively more extreme observations in the top left quadrant of the plots than in the other quadrants. These observations represent instances where SPY fell sharply and then corrected upwards the following day. The smoother has a slight negative slope, which is consistent with weak short-term mean reversion.
Why might this happen? The narrative is that occasionally broad-based and panic-induced selling occurs which exhausts all the short-term selling pressure, so the market corrects the following day.
2. Training a Logistic Regression Model
We have established that there might be some relationship between SPY daily returns and the one-day and two-day lagged daily returns, but is this enough to build a predictive model? I decided to treat this problem as a classification problem to keep things simple — I am only interested in predicting the direction of future returns (positive or negative) and not the magnitude.
The machine learning method I chose is logistic regression, a simple method that is often tried before more flexible ones.
The training set consists of SPY observations from 2000 to 2013. The following 10 models were trained on it, each one adding one more lagged return as a predictor to the model before it. Models 3 through 9 are not shown for brevity.
m01 <- glm(daily_return_sign ~ daily_return_lag1,
           data = SPY_training, family = binomial)
m02 <- glm(daily_return_sign ~ daily_return_lag1 + daily_return_lag2,
           data = SPY_training, family = binomial)
...
m10 <- glm(daily_return_sign ~ daily_return_lag1 + daily_return_lag2 +
             daily_return_lag3 + daily_return_lag4 + daily_return_lag5 +
             daily_return_lag6 + daily_return_lag7 + daily_return_lag8 +
             daily_return_lag9 + daily_return_lag10,
           data = SPY_training, family = binomial)
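Since the ten models differ only in how many lags they include, they could also be built in a loop. This is a sketch of one way to do that, not how the original code is written: `SPY_training` here is a synthetic stand-in with the same column names, and the `models` list is my own naming. It relies on base R's `reformulate()` to assemble each formula from the predictor names.

```r
set.seed(42)  # synthetic stand-in for SPY_training, NOT real data
n   <- 300
ret <- rnorm(n + 10, sd = 0.01)

# Response: 1 if the day's return is positive, 0 otherwise.
SPY_training <- data.frame(daily_return_sign = as.integer(ret[11:(n + 10)] > 0))

# Predictor k is the return from k days earlier.
for (k in 1:10) {
  SPY_training[[paste0("daily_return_lag", k)]] <- ret[(11 - k):(n + 10 - k)]
}

# Model k uses lags 1..k as predictors.
models <- lapply(1:10, function(k) {
  f <- reformulate(paste0("daily_return_lag", 1:k),
                   response = "daily_return_sign")
  glm(f, data = SPY_training, family = binomial)
})
names(models) <- sprintf("m%02d", 1:10)

print(formula(models$m03))
```

Keeping the models in a named list makes it easy to evaluate all of them with a single `sapply()` later on.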
3. Assessing Model Accuracy
Logistic regression models output the probability that an observation belongs to each class (in our case, whether the daily return will be positive or negative), but it is up to the practitioner to choose a probability threshold for assigning the observation to a class. The choice of threshold is a trade-off between the number of true positives you want and the number of false positives you are willing to accept. This post on StackExchange helped me understand this concept more clearly.
One logical cutoff is 50%: if the predicted probability that an observation belongs to a class is greater than 50%, assign the observation to that class. Unfortunately, I am unable to use this decision rule here because, for almost all observations, the model predicts that the probability of a positive daily return is greater than 50%. In fact, the predicted probabilities are clustered around 54%, which is also the share of positive daily returns in the data.
Simply put, the models were not able to predict with much confidence given the predictors available to them, so the predicted probabilities of a positive daily return had a very narrow range centered on the observed share of positive returns in the data. As such, I set a probability threshold of around 54%: observations above it are assigned to a predicted positive return and the rest to a predicted negative return.
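To make the thresholding concrete, here is a small sketch with made-up probabilities that are clustered the way I described. The values in `prob_positive` are hypothetical; with a fitted model they would come from something like `predict(model, newdata = ..., type = "response")`.

```r
# Hypothetical predicted probabilities of a positive return (NOT model output).
prob_positive <- c(0.531, 0.548, 0.552, 0.539, 0.544, 0.536)

# Share of positive days in the data, as quoted in the text.
base_rate <- 0.54

# With a 0.5 cutoff every observation is classified as positive.
print(table(prob_positive > 0.5))

# With the base rate as the cutoff the predictions split into two classes.
predicted_sign <- ifelse(prob_positive > base_rate, "positive", "negative")
print(table(predicted_sign))
```

The 0.5 cutoff gives a single class, which is useless as a trading signal, while the base-rate cutoff at least separates the days the model likes slightly more from the days it likes slightly less.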
The models were evaluated on a test set that consists of SPY observations from 2014 to present. I used the accuracy rate to assess them, which is simply the share of observations that the model predicted correctly. Here is how the models performed.
   model accuracy
1    m01   0.5204
2    m02   0.5302
3    m03   0.5432
4    m04   0.5383
5    m05   0.5106
6    m06   0.5155
7    m07   0.5334
8    m08   0.5318
9    m09   0.5188
10   m10   0.5253
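For reference, the mechanics of a chronological split and the accuracy rate look roughly like this. It is a sketch on synthetic data: the `SPY` data frame, its `date` column, and the random `predicted_sign` are all assumptions made for illustration, and the dates are calendar days rather than trading days.

```r
set.seed(3)  # synthetic data, NOT real SPY observations
dates <- seq(as.Date("2013-01-01"), as.Date("2014-12-31"), by = "day")
SPY <- data.frame(date = dates,
                  daily_return_sign = rbinom(length(dates), 1, 0.54))

# Chronological split: everything before 2014 trains, the rest tests.
# A random split would leak future information into the training set.
SPY_training <- SPY[SPY$date <  as.Date("2014-01-01"), ]
SPY_test     <- SPY[SPY$date >= as.Date("2014-01-01"), ]

# Hypothetical predictions for the test set (random, just to show the arithmetic).
predicted_sign <- rbinom(nrow(SPY_test), 1, 0.5)

# Accuracy rate: share of test observations predicted correctly.
accuracy <- mean(predicted_sign == SPY_test$daily_return_sign)
print(round(accuracy, 4))
```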
4. Choosing the Model and Constructing Trading Signal
These might seem like good results, since every model has an accuracy over 50%, until you remember that a classifier that simply predicted a positive daily return for every observation would have an accuracy of about 54% (since the daily return of SPY is positive 54% of the time).
Unsurprisingly, predicting future returns is hard and these models are bad. But let’s take a closer look at model m03 which has an accuracy of 54.3%. Here is a table of actual daily returns versus predicted daily returns.
                negative_predicted positive_predicted
negative_actual                160                120
positive_actual                160                173
When model m03 predicted negative returns, it was right 160 times and wrong 160 times with an accuracy of 50%. When model m03 predicted positive returns, however, it was right 173 times and wrong 120 times with an accuracy of 59%. This is pretty promising, so let’s use model m03 and only trade when it predicts a positive return.
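The arithmetic behind those numbers can be reproduced directly from the table. The sketch below only re-enters the four counts; the matrix layout and variable names are mine.

```r
# Counts from the table: rows are actual classes, columns are predicted classes.
# matrix() fills column by column, so the first two values are the
# "predicted negative" column.
cm <- matrix(c(160, 160, 120, 173), nrow = 2,
             dimnames = list(actual    = c("negative", "positive"),
                             predicted = c("negative", "positive")))

# Overall accuracy: correct predictions over all predictions.
accuracy <- sum(diag(cm)) / sum(cm)

# How often each kind of prediction was right.
hit_rate_negative <- cm["negative", "negative"] / sum(cm[, "negative"])
hit_rate_positive <- cm["positive", "positive"] / sum(cm[, "positive"])

# Accuracy of always predicting a positive return on this test set.
always_positive <- sum(cm["positive", ]) / sum(cm)

print(round(c(accuracy = accuracy,
              hit_rate_negative = hit_rate_negative,
              hit_rate_positive = hit_rate_positive,
              always_positive = always_positive), 4))
```

On these counts the overall accuracy of m03 works out to the same 54.3% as the always-positive baseline, so the only edge visible in this table is the roughly 59% hit rate on the positive predictions.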
The logic for constructing the trading signal is to use the 54% probability threshold to assign observations to a predicted positive return or negative return, go fully long when the predicted class is a positive return, and be flat when the predicted class is a negative return.
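In code, that logic might look something like the following. This is a sketch on synthetic data, not the original implementation: `prob_positive` is random noise standing in for the model's output, and I am assuming the prediction for each day uses only information available before that day.

```r
set.seed(7)  # synthetic data, NOT real SPY returns or model output
n <- 250
daily_return  <- rnorm(n, mean = 0.0004, sd = 0.01)
prob_positive <- runif(n, min = 0.52, max = 0.56)
threshold     <- 0.54

# Position for each day: 1 = fully long, 0 = flat.
# Assumes the prediction for day t was made with data through day t - 1.
position <- as.integer(prob_positive > threshold)

# The strategy earns the day's return only when it is long.
strategy_return <- position * daily_return

# Equity curves, starting from 1.
equity_strategy <- cumprod(1 + strategy_return)
equity_buy_hold <- cumprod(1 + daily_return)

print(round(c(strategy = tail(equity_strategy, 1),
              buy_hold = tail(equity_buy_hold, 1)), 4))
```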
5. Assessing Strategy Performance
Here is the equity curve. It turns out that this simple strategy isn’t complete garbage and has actually outperformed the buy-and-hold return of SPY, at least over this test set. The strategy performed extremely well from late 2015 to present. Adding in transaction costs, however, would cause this strategy to underperform the SPY because there are a large number of trades.
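To get a feel for how much trading frequency matters, here is a toy calculation. Everything in it is hypothetical: the eight-day signal, the returns, and the cost of 5 basis points per position change are made-up numbers, not estimates for SPY.

```r
# Hypothetical long/flat signal and daily returns (NOT real data).
position     <- c(0, 1, 1, 0, 1, 0, 0, 1)
daily_return <- c(0.002, 0.010, -0.004, 0.003, 0.006, -0.008, 0.001, 0.005)

# Assumed cost: 5 basis points each time the position changes.
cost_per_trade <- 0.0005

# A trade happens whenever the position differs from the previous day's
# position (the strategy is assumed to start flat).
trades <- abs(diff(c(0, position)))

gross <- prod(1 + position * daily_return) - 1
net   <- prod(1 + position * daily_return - trades * cost_per_trade) - 1

print(c(n_trades = sum(trades), gross = gross, net = net))
```

A signal that flips this often pays the cost on most days, which is why a thin edge can disappear once costs are included.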
Here is the SPY closing price with a mapped trading signal color gradient. Blue indicates times when the model was fully long and black indicates times when the model was flat. The plot shows that the strategy tends to go long as the market dips, betting on a correction, which is how it exploits the short-term mean reversion explored earlier in this post.
The code for this post comprises 218 lines of R and can be found on my GitHub.