A Simple Machine Learning Model to Trade SPY

1.006 Trading Signal

I moved this post to my new blog located here: A Simple Machine Learning Model to Trade SPY.


  1. Pingback: Quantocracy's Daily Wrap for 06/09/2016 | Quantocracy

  2. Hey, can you explain more about what the models m01, m02, …, m10 are? Are you trying to predict the direction for future days, e.g. m01 for tomorrow’s direction, m02 for the day after tomorrow, and m10 for 10 days out?

    • Sure, I was struggling a little bit with the language, so I’m not surprised that it’s not clear. Suppose that it’s the end of the trading day. You know the daily return for the current day, the day before, and the day before that. Using these three daily returns, the model attempts to predict the daily return for the next day. That’s what model m03 is.

      More formally, the models attempt to predict the daily return of day t = 0 using the daily returns of the preceding days. Model m03 uses t = -1, t = -2, and t = -3, and so on up to model m10, which uses t = -1, t = -2, …, t = -10.
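      As a rough illustration of the setup described in this reply, the lagged inputs for a model like m03 could be assembled as below. This is only a sketch: the function name `make_lagged_dataset`, the sample returns, and the reading of the model number as the number of lags are assumptions for illustration, not code from the original post.

```python
from typing import List, Tuple


def make_lagged_dataset(
    returns: List[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each daily return with the n_lags returns that preceded it.

    Hypothetical helper: each feature row holds the returns of days
    t-1, t-2, ..., t-n_lags (most recent first) and the target is the
    return of day t.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(returns)):
        features.append([returns[t - k] for k in range(1, n_lags + 1)])
        targets.append(returns[t])
    return features, targets


# Made-up daily returns, oldest first, purely for illustration.
daily_returns = [0.010, -0.005, 0.002, 0.007, -0.012, 0.004]

# Assumed reading of "m03": three lagged returns as inputs.
X, y = make_lagged_dataset(daily_returns, n_lags=3)
```

      Each row of `X` can then be fed to whatever classifier or regressor is being tested, with the matching entry of `y` (or its sign) as the target.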

  3. mark leeds

    Hi: I tried to do something similar a long time ago (I also had another class called “don’t trade”) and I found something that may
    also be going on with your strategy: if you use the 50 percent threshold approach then, even if you get more right than wrong, it’s
    the values of the returns that drive the results. At least that’s what I found. A very interesting and well done post.

    Your approach may have more possibilities because mine was trying to do classification intraday. My guess (or should I say “inference”) is that those returns are way too noisy to ever work with that approach. So it never worked out and I moved
    on. You may have more success. I would also check including “don’t trade” (which then means you need something like nnet, because
    the problem becomes multinomial). I think I used 40 and 60, with “don’t trade” in between. It may not have been exactly 40 and 60, but you get the idea. I never saw this blog before so I’m gonna subscribe. Very nicely done, and it brought back some (good and bad) memories, so thanks.

    • Thanks, that’s a good idea for the classification. It should be possible to optimize the probability thresholds using cross-validation or out-of-sample testing.
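    The “don’t trade” band discussed in this thread can be sketched as a simple mapping from a predicted probability to a signal. The function name `signal_from_probability`, the 0.40 and 0.60 defaults, and the choice to treat the thresholds as inclusive are assumptions for illustration; the thread only mentions 40 and 60 as approximate values.

```python
def signal_from_probability(
    p_up: float, lower: float = 0.40, upper: float = 0.60
) -> int:
    """Map a predicted probability of an up day to a trading signal.

    Returns +1 (long) at or above `upper`, -1 (short) at or below
    `lower`, and 0 ("don't trade") in between. The thresholds are
    illustrative assumptions and would need to be tuned, for example
    by cross-validation or out-of-sample testing.
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    if p_up >= upper:
        return 1
    if p_up <= lower:
        return -1
    return 0


# Made-up probabilities, purely for illustration.
signals = [signal_from_probability(p) for p in (0.39, 0.50, 0.61)]
```

    Setting `lower` and `upper` both to 0.5 recovers the plain 50 percent threshold approach, apart from the tie at exactly 0.5.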

  4. mark leeds

    One other suggestion: since the returns clearly mattered in my case, I tried to invent some kind of objective function in order to decide what to get into and what not to get into. (I was trading intraday, so there could be many choices because there were a lot of stocks.) It’s somewhat vague in my memory, but I think I used the probabilities as some kind of proxy for returns and then created a linear program that tried to maximize the sum of the p_i’s subject to the portfolio being market neutral. That didn’t help much either, because the proxy isn’t all that great. But my point is that I think there needs to be a way to quantify the probabilities and use them to decide either A) whether to enter or B) how to size the position when the probability is more extreme than just 39 or 61 (using the 40-60 approach). Good luck.

    • Right, the probability should be a measure of confidence: the further the probability is from 50%, the more confidence the model has in its prediction. So that should mean a stronger signal (a bigger position). For the purposes of this blog post, I just decided to let the signal take on values of only fully long or fully short, but I like the idea of varying the position size / signal strength depending on the confidence.
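    One way to picture confidence-based sizing is a position that grows with the distance of the probability from 50%. The linear mapping below and the name `position_size` are assumptions chosen for simplicity; neither the post nor the comments specify a particular sizing rule.

```python
def position_size(p_up: float) -> float:
    """Scale the position by the model's confidence.

    Assumed linear rule: 0.5 maps to flat (0.0), 1.0 to fully long
    (+1.0), and 0.0 to fully short (-1.0). Other mappings are
    equally plausible; this is only a sketch.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be a probability in [0, 1]")
    return 2.0 * p_up - 1.0


# Made-up probabilities, purely for illustration.
sizes = [position_size(p) for p in (0.25, 0.50, 0.75)]
```

    The fully long or fully short signal used in the post corresponds to taking only the sign of this value.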

  5. Pingback: Best Links of the Last Two Weeks | Quantocracy
