On Backtesting

How does one create a mechanical trading system? However one comes up with ideas for entry and exit signals, testing those ideas is essential. In trading, one generally tests ideas on historic data, and thus the term “Backtest” is often used to refer to this sort of test, a test that is performed looking back at what would have happened to ascertain what might happen in the future.

Here are some thoughts on backtesting, in no particular order.

Backtesting is the single most important skill to develop as a system trader. Most of your ideas will stink, and being able to elegantly prove so in short order is critical to success. You only have so much time, so it is important to avoid spending undue time on bad ideas.

Backtesting is not merely an analytic, scientific process (i.e., develop a hypothesis, then test it). Aside from outright errors in your test, the number one thing to avoid is backfitting (or overfitting) the data: optimizing more and more variables until the results look good but have very little predictive value, because they have been fit exclusively to one optimal historical path through the data.

Backtesting does not involve looking at charts to find examples that conform to your hypothesis. Humans are biased toward information that proves them correct and are amazingly blind toward contrary data. I find this is particularly relevant to visually presented data such as charts. Buy a backtesting software package and use it to judge the worth of your ideas. I suggest AmiBroker.

You do not want the best performance possible when backtesting. You want the best performance that is based on the fewest variables, with each variable providing as broad a bell-shaped optimization curve as possible. You need to get to know your variables intimately. If a variable under optimization causes the results to thrash up and down, lose it. If a variable leads to poor results except at one magical value, lose it. If it ramps off nicely and then drops off a cliff… well, you might be able to use it, but pick a value a ways off the cliff even if it provides lesser performance.
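One way to put the "broad curve, not magical value" idea into practice is to score each parameter value by the average result of its neighbors rather than by its own result. The sketch below is only an illustration: the function name `pick_robust_value`, the `window` parameter, and the score table are all hypothetical, and the scores stand in for whatever metric your backtester reports.

```python
from statistics import mean

def pick_robust_value(results, window=1):
    """Return (value, neighborhood_average) for the parameter value whose
    neighbors, `window` grid steps on each side, have the best average score.

    results: dict mapping parameter value -> backtest score.
    Assumes the parameter grid is evenly spaced.
    """
    values = sorted(results)
    best_value, best_avg = None, float("-inf")
    for i in range(len(values)):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        avg = mean(results[v] for v in values[lo:hi])
        if avg > best_avg:
            best_value, best_avg = values[i], avg
    return best_value, best_avg

# Hypothetical optimization results: a lone spike at 14, a broad hump near 26.
scores = {10: 0.2, 12: 0.3, 14: 2.5, 16: 0.1, 18: 0.4, 20: 0.9,
          22: 1.3, 24: 1.5, 26: 1.6, 28: 1.5, 30: 1.2}

peak_value = max(scores, key=scores.get)            # the "magical" value
robust_value, robust_avg = pick_robust_value(scores)
print(peak_value, robust_value)
```

With these made-up numbers the raw optimum is 14, but the neighborhood average prefers 26, the middle of the hump. Values at the edge of the grid are averaged over fewer neighbors, so treat a winner at the edge with suspicion and widen the grid.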

That said, make sure you are using the right scale to determine the value of a variable to your system. It may have a beautiful, broad, bell-shaped curve at the right level of granularity that stays hidden if you optimize too coarsely.

Avoid using optimization steps that are too fine-grained. You might find a beautiful tree and never notice it is in a very ugly forest.

It is remarkably easy to accidentally create a system that cannot be traded in the real world. This can happen in many different ways, including: 1) using forward data, perhaps hidden away in an indicator; 2) creating entries or exits that depend on real-time orders that you won’t be able to match; 3) ignoring the bid-ask spread and assuming stops always get filled at the optimal price.
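For the first item, forward data hidden in an indicator, a simple check is to recompute the indicator on data truncated at each bar and confirm the value at that bar does not change. The sketch below assumes an indicator is a function from a list of prices to a list of the same length; `has_lookahead`, `trailing_mean`, and `centered_mean` are hypothetical names used only for illustration.

```python
def has_lookahead(indicator, series):
    """Return True if indicator(series)[i] changes once bars after i are removed."""
    full = indicator(series)
    for i in range(len(series)):
        truncated = indicator(series[:i + 1])
        if truncated[i] != full[i]:
            return True
    return False

def trailing_mean(series, n=3):
    """Mean of up to n bars ending at bar i: uses past and current data only."""
    out = []
    for i in range(len(series)):
        window = series[max(0, i - n + 1):i + 1]
        out.append(sum(window) / len(window))
    return out

def centered_mean(series, n=3):
    """Mean of a window centered on bar i: silently reads bar i + 1."""
    half = n // 2
    out = []
    for i in range(len(series)):
        window = series[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

prices = [10, 11, 12, 11, 13]  # made-up closes
print(has_lookahead(trailing_mean, prices))  # False
print(has_lookahead(centered_mean, prices))  # True
```

The check only catches indicators that read future bars. It says nothing about the other two problems, unrealistic order assumptions and optimistic fills, which have to be handled in the trade simulation itself.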

Thus, when you set up to backtest, you should start with some key ideas and make sure they are coded into all your systems. These sorts of questions should come to mind. What time of day can I place orders? What order types does my broker support? How much time will it take to keep up with the orders generated by my system? How much liquidity do I need in the stocks I trade?

Keep a chunk of historic data set aside for validation once you have a system. And don’t repeatedly test your optimized system on the validation data and then go back and optimize some more. Do that enough times and you have more or less optimized your system to the data that was supposedly going to be used to validate your system.
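A minimal sketch of this discipline, assuming your history is a chronologically ordered list: split off the most recent portion, and wrap it in an object that refuses to be scored more than a set number of times. The names `chronological_split` and `HoldOut`, and the 25% fraction, are illustrative choices, not a standard.

```python
def chronological_split(bars, holdout_fraction=0.25):
    """Split ordered data into (development, validation); validation is the latest part."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    cut = int(len(bars) * (1 - holdout_fraction))
    return bars[:cut], bars[cut:]

class HoldOut:
    """Validation data that can only be scored a limited number of times."""

    def __init__(self, data, max_looks=1):
        self._data = list(data)
        self.max_looks = max_looks
        self.looks = 0

    def evaluate(self, score_fn):
        if self.looks >= self.max_looks:
            raise RuntimeError("validation data already used; more tuning would fit it")
        self.looks += 1
        return score_fn(self._data)

# Stand-in data: eight numbered bars instead of real prices.
dev, val = chronological_split(list(range(8)), holdout_fraction=0.25)
holdout = HoldOut(val, max_looks=1)
first_score = holdout.evaluate(sum)  # sum is a placeholder for a real backtest
```

A second call to `holdout.evaluate` raises `RuntimeError`, which is the point: once you have looked, going back to optimize turns the validation set into more training data.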

Use lots of data, and make sure the data spans different market environments. But don’t forget that the market shifts over time. For example, volume today is different than it was 10 years ago.

My own approach is to bias my system under test toward the negative so I don’t get burned by flaws in my data. So, for instance, I look ahead and rule out any trade that would return more than a particular percent. I do not, however, rule out flawed trades that would lead to a large loss. Your data vendor will almost always include flawed data. You’ll need to account for it in some manner, even if not in the manner I chose.
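As a rough sketch of that asymmetric filter: drop any trade whose return exceeds a cap, but keep every loss no matter how large. The 25% cap and the trade list are made-up values for illustration, not the threshold used above.

```python
def pessimistic_filter(trade_returns, max_gain=0.25):
    """Drop suspiciously large wins (possible bad data); keep all losses."""
    return [r for r in trade_returns if r <= max_gain]

# Hypothetical per-trade returns; the +400% and -60% may both be data errors.
trades = [0.02, -0.01, 4.00, -0.60, 0.03]
filtered = pessimistic_filter(trades)

print(sum(trades) / len(trades))      # average with the suspect win included
print(sum(filtered) / len(filtered))  # average after the pessimistic filter
```

The suspect win is removed while the suspect loss stays in, so any error in the data pushes the backtest toward looking worse, never better.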

Don’t expect symmetry between shorts and longs. Market participants tend to bring different factors to bear on stocks going up versus stocks going down. Euphoria is different than panic. But you should generally check for symmetry as part of a practice of being thorough.

You do not simply care about the CAR (Compound Annual Return) and the DD (Drawdown). You want to know much, much more about the results. Picture your equity curve as land and the drawdown as water. What’s the ratio of land to water? In other words, the depth of a drawdown isn’t the only factor; its breadth matters too. I suggest using a logarithmic scale for your returns. You want to see as linear a ramp up as possible, with drawdowns that may happen all the time but all look pretty similar. If you only look at the CAR and DD, you can easily end up with a system that performed fantastically at one particular moment in history and rather stinks the rest of the time.
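The land-and-water picture can be turned into numbers. The sketch below walks an equity curve and reports the deepest drawdown, the fraction of bars spent below a prior high (the "water"), and the longest underwater stretch. The function name, the returned keys, and the sample equity values are all hypothetical.

```python
def drawdown_profile(equity):
    """Summarize drawdown depth and breadth for a list of equity values."""
    peak = equity[0]
    max_depth = 0.0   # deepest drawdown as a fraction of the prior peak
    underwater = 0    # bars spent below a prior peak
    longest = 0       # longest consecutive underwater run, in bars
    current = 0
    for value in equity:
        if value >= peak:
            peak = value
            current = 0
        else:
            underwater += 1
            current += 1
            longest = max(longest, current)
            max_depth = max(max_depth, (peak - value) / peak)
    return {
        "max_depth": max_depth,
        "water_fraction": underwater / len(equity),
        "longest_underwater": longest,
    }

equity = [100, 110, 105, 99, 108, 112, 111, 115]  # made-up account values
profile = drawdown_profile(equity)
print(profile)
```

Two systems with the same maximum depth can have very different `water_fraction` and `longest_underwater` values, which is exactly the difference that CAR and DD alone hide.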

Get to know yourself as a trader and build into your backtests whatever assumptions are needed to maintain your sanity and even your enjoyment of the trading. As I’ve mentioned elsewhere, though it took me a few years, I eventually discovered that I do much better if I’m in cash at night, with all my positions closed out. Over time, I think I’ve sort of figured out how to make this a strength, but initially I abided by it even though it hurt my backtest results. Let me put this another way. When backtesting, it is easy to ignore yourself. Don’t.

A reason to be biased toward shorter-term trading systems: your data will provide a more valid test of your hypothesis. If you are using 6 or 7 years of data (and I wouldn’t recommend less even if it slows your computer down significantly when running tests), many trades a week will result in far more trials than a system dependent on several trades a year. I’m simply not comfortable drawing conclusions without thousands of trials across different market conditions. Thus, I have always and only tested relatively short-duration trades (1 to 10 days or so).
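To make the sample-size argument concrete, here is a small worked comparison. The trade frequencies (10 a week versus 4 a year) and the 2% per-trade standard deviation are invented for illustration, and the standard-error formula assumes trades are independent, which real trades often are not.

```python
import math

def trial_count(trades_per_year, years):
    """Number of trades a backtest produces over the given span."""
    return trades_per_year * years

def standard_error(per_trade_stdev, trials):
    """Standard error of the mean trade return, assuming independent trades."""
    return per_trade_stdev / math.sqrt(trials)

YEARS = 6
short_term = trial_count(10 * 52, YEARS)  # 10 trades a week
long_term = trial_count(4, YEARS)         # 4 trades a year

print(short_term, long_term)
print(standard_error(0.02, short_term), standard_error(0.02, long_term))
```

Under these assumptions the short-term system yields 3,120 trials against 24, so its estimate of the average trade is roughly eleven times tighter over the same six years of data.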

A reason to be biased against shorter-term trading systems: your frictional costs are higher. With System (my cleverly named system), I have to make about 3% a month to break even.

In general, picking up nickels in front of a steamroller is a bad idea. A system that only rarely takes a huge loss, but whose rare loss wipes out the results from many, many wins, is very problematic. First and foremost, from a backtesting validity point of view, do you have enough loss samples to even consider your results valid? Better to have lots of wins and losses if you want more statistically valid results.

7 Replies to “On Backtesting”

  1. 3% a month for you is break-even? 3% a month for me would be retirement!! 🙂
    I’ve been thinking of some good (I hope) questions.
    What’s your typical short/long ratio? Do you always invest fully each day, or each day when you’re trading? Do you think the S&P 500 is the best benchmark for the stocks you’re trading?
    I was thinking, maybe there’s some benchmark/index that you’re more closely tracking than the S&P 500 (although I can’t imagine what it would be with these returns.) One interesting test I’d like to see would be to take all of the stocks you’ve held long since you’ve been trading System, and do a backtest where you just buy and hold them all equally from when you started trading. Capturing dividends, and having no trading costs, how does that compare to what you’re getting by trading? In other words, could it be that you’re picking good long-term stocks?
    In case you (or any of the other trading commenters) are interested, here’s my current investment distribution:
    Vanguard 500 (VIIIX): 55%
    American Funds EuroPacific Growth (RERFX): 19%
    Vanguard Extended Mkt (VEXMX): 16%
    Vanguard Small Cap (NAESX): 5%
    Vanguard REIT (VGSIX): 5%
    Top two are holdings through work; I’d prefer the Vanguard International over RERFX, but it’s not bad and the best of the other investment choices offered.
    Rusty

  2. Okay, I may take more than one comment to answer, but let me knock a couple of them out.

    If you read my history of trading posts (my personal history, that is), you’ll see I’ve been evolving over time. This year, I’ve been through 2 distinct phases. Earlier in the year, I had a set ratio of longs and shorts, with more shorts than longs but with more put into each long. Sometimes I would have empty slots if, for instance, there weren’t enough shorts to trade that particular day.

    At the beginning of August, with my major revision, I switched to a much more dynamic system. The past couple weeks, I’ve been almost entirely short (probably over 90%). In mid-August, I was entirely long for days at a time. It is biased toward choosing shorts if they are available, and probably ends up being around 60% short to 40% long.

    I don’t really have a benchmark per se. I’m basically trying to see if I can generate returns that would make a conventional benchmark look like a flat line… not saying I’ll succeed in the long run, but that’s what I’m attempting to do. And I’m trying to be as uncorrelated to the market as possible. In other words, I don’t want the market’s move to have much to do with how I do on a given day. However, such a thing is challenging to measure… or it is for me. Such a thing might be right up your alley, come to think of it.

    I haven’t tested my picks in the long run other than to confirm I do a lot worse if they are held for more than 1 day, which isn’t surprising to me, as I do not believe I have any edge at all except in the very short term.

    Here’s a bit of my thought process. Which sounds more realistic: picking a stock that will go up 200% in a year, or picking a stock that will go up 10% in a month, or picking a stock that will go up 0.45% (after trading fees) in a day? To me, the last option was sort of intuitive, and so that’s where I ended up putting most of my time. If you haven’t guessed, I’m a big proponent of compounding…

  3. Jay,
    Interesting thought process. My natural instincts go the opposite direction, towards picking a stock that will go up 200% in 2-3 years, aka Warren Buffett. The premise being that emotions and mob mentality occasionally cause the market to price a stock well below fair value. I’ve found a few of these and the returns are nice. However, the money goes nowhere for a year or two before the market psychology changes and the stock takes off.

    Have you come across anyone teaching a combination approach of using fair value to find potential stocks and then using technical analysis to determine which are the most timely?

    John

  4. John, I’ve not read his book, but from some of his blog posts I get the feeling that Phil Town is trying that approach. He cites Buffett a lot, sets out 5 fundamental-looking criteria, but uses technical indicators to determine the entry.

    http://www.philtown.typepad.com/

    His basic approach seems sound. I think he’s advocating that you 1) find great businesses as though you want to own the whole thing, 2) do it when the price movement is showing you that the big boys agree with your sentiment, but 3) while it is still a good value based on forward P/E. Or something like that.

  5. BTW, I should have made my question above more obviously non-rhetorical. I’m not claiming there is one right answer, which is why I said one particular approach seemed intuitive to me rather than claiming one particular approach was right.

    But I do think the question is critical to ask oneself, perhaps even at some regular interval as you educate yourself as a trader.

  6. Seems to me that there are at least three different investing styles: technical (price action only, no information about company), fundamental (earnings and price information, numerical data about company but no “analysis”), and analysis-based (full analysis of company’s business and future potential).

    I have never considered the third investing style, because I do not have the time, resources, or inclination to do in-depth study and analysis of companies I might want to invest in. When I first got interested in investing, I went for style #2. I studied contrarian and P/E ratio strategies, and I read (and still really like) Joel Greenblatt’s little book. One nice thing about a style like this is that you don’t have to trade as much, so if you use a service like Zecco, you can get your trading costs to essentially zero.

    Since then, I’ve drifted to #1. My skills (computer programming, math, etc.) are more suited to this, and backtest data is easier to get (I never found any free sources of historical earnings data, p/e ratios, etc.). And since trading on #1 subjectively feels like I’m gaming the market, that’s a plus as well. But trading costs are much higher, so there’s got to be a significant edge.

  7. Steve, I thought Joel’s book was great. He’s a solid writer and didn’t waste words just to improve the word count. The basic premise of the book (and associated trading approach) seemed sound to me, though I would want to evaluate the price with some sort of technical analysis before actually buying based on his criteria.

    And it does always come back to having an edge, along with plentiful opportunities to exploit the edge.
