How does one create a mechanical trading system? However one comes up with ideas for entry and exit signals, testing those ideas is essential. In trading, one generally tests ideas on historic data, and thus the term “Backtest” is often used to refer to this sort of test, a test that is performed looking back at what would have happened to ascertain what might happen in the future.
Here are some thoughts on backtesting, in no particular order.
Backtesting is the single most important skill to develop as a system trader. Most of your ideas will stink, and being able to elegantly prove so in short order is critical to success. You only have so much time, so it is important to avoid spending undue time on bad ideas.
Backtesting is not merely an analytic, scientific process (i.e., develop a hypothesis, then test the hypothesis). Aside from outright errors in your test, the number one thing to avoid is backfitting (or overfitting) the data: optimizing more and more variables until the results look good. Such results have very little predictive value because they have been fit exclusively to some optimal historic path through the data.
Backtesting does not involve looking at charts to find examples that conform to your hypothesis. Humans are biased toward information that proves them correct and are amazingly blind toward contrary data. I find this is particularly relevant for visually presented data such as charts. Buy a backtesting software package and use it to judge the worth of your ideas. I suggest Amibroker.
You do not want the best performance possible when backtesting. You want the best performance that is based on the fewest variables, with each variable providing as broad a bell-shaped optimization curve as possible. You need to get to know your variables intimately. If a variable under optimization causes the results to thrash up and down, lose it. If a variable leads to poor results except at one magical value, lose it. If it ramps off nicely and then drops off a cliff… well, you might be able to use it, but pick a value a ways off the cliff even if it provides lesser performance.
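One way to act on this is to score each candidate value by its worst neighbour rather than by its own result. The following is a minimal sketch, not a feature of any backtesting package: the function name, the sweep values, and the scores are all made up for illustration, and the scores could be any performance measure you choose.

```python
def pick_robust_value(values, scores, radius=1):
    """Return the parameter value whose worst neighbour scores best.

    values: parameter settings tested, in sweep order.
    scores: backtest result for each setting (higher is better).
    radius: how many steps either side count as the neighbourhood.
    Settings without a full neighbourhood (the ends of the sweep) are
    skipped, because nothing is known about what lies beyond them.
    Requires len(values) > 2 * radius.
    """
    def worst_neighbour(i):
        return min(scores[i - radius:i + radius + 1])

    candidates = range(radius, len(values) - radius)
    best = max(candidates, key=worst_neighbour)
    return values[best]


# Hypothetical sweep: one magical value at 15, a broad hump around 35.
values = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
scores = [2, 3, 25, 2, 8, 11, 12, 11, 9, 4]

raw_best = values[scores.index(max(scores))]
robust_best = pick_robust_value(values, scores)
print("highest single score at:", raw_best)
print("best neighbourhood at:  ", robust_best)
```

The raw optimum lands on the isolated spike, while the neighbourhood rule lands in the middle of the hump, giving up some headline performance for a value that still works if the market shifts it a step or two.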
That said, make sure you are using the right scale to determine the value of a variable to your system. It may have a beautiful, broad, bell-shaped curve at the right level of granularity that stays hidden if you optimize too coarsely.
Avoid using optimization steps that are too fine-grained. You might find a beautiful tree and never notice it is in a very ugly forest.
It is remarkably easy to accidentally create a system that cannot be traded in the real world. This can happen in many different ways, including: 1) using forward data, perhaps hidden away in an indicator; 2) creating entries or exits that depend on real-time orders that you won’t be able to match; 3) ignoring the bid-ask spread and assuming stops always get optimally filled.
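Forward data hidden in an indicator can be caught mechanically: compute the indicator on the full series, then again with later bars withheld, and see whether any past value changes. This is a minimal sketch of that check. The two indicators and the price list are hypothetical examples, not taken from any particular package.

```python
def trailing_mean(prices, n):
    """Mean of the n bars ending at each bar; None until n bars exist."""
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - n:i + 1]) / n)
    return out


def centered_mean(prices, half):
    """Mean of the bars from i-half to i+half. Needs bars AFTER bar i."""
    out = []
    for i in range(len(prices)):
        if i - half < 0 or i + half >= len(prices):
            out.append(None)
        else:
            window = prices[i - half:i + half + 1]
            out.append(sum(window) / len(window))
    return out


def peeks_ahead(indicator, prices):
    """True if any value changes when later bars are withheld."""
    full = indicator(prices)
    for i in range(len(prices)):
        partial = indicator(prices[:i + 1])  # only what was known at bar i
        if partial[i] != full[i]:
            return True
    return False


prices = [10, 11, 12, 11, 13, 14]  # made-up closes
print(peeks_ahead(lambda p: trailing_mean(p, 3), prices))
print(peeks_ahead(lambda p: centered_mean(p, 1), prices))
```

The trailing average passes because each value depends only on bars up to its own date. The centered average fails, which makes it fine for drawing charts but invalid as a trading signal.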
Thus, when you set up to backtest, you should start with some key ideas and make sure they are coded into all your systems. These sorts of questions should come to mind. What time of day can I place orders? What order types does my broker support? How much time will it take to keep up with the orders generated by my system? How much liquidity do I need in the stocks I trade?
Keep a chunk of historic data set aside for validation once you have a system. And don’t repeatedly test your optimized system on the validation data and then go back and optimize some more. Do that enough times and you have more or less optimized your system to the data that was supposedly going to be used to validate your system.
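A small amount of code can enforce this discipline. The sketch below splits the data chronologically and wraps the held-back portion in a guard that refuses to hand it out more than a set number of times. The names `chronological_split` and `HoldOut`, the 25% fraction, and the one-look limit are all assumptions chosen for illustration.

```python
def chronological_split(bars, holdout_fraction=0.25):
    """Split bars (oldest first) into a development set and a later hold-out set."""
    holdout_size = int(len(bars) * holdout_fraction)
    cut = len(bars) - holdout_size
    return bars[:cut], bars[cut:]


class HoldOut:
    """Hands out validation data a limited number of times."""

    def __init__(self, bars, max_looks=1):
        self._bars = list(bars)
        self._looks_left = max_looks

    def take(self):
        if self._looks_left <= 0:
            raise RuntimeError(
                "validation data already used; tuning against it again "
                "would fit the system to it"
            )
        self._looks_left -= 1
        return list(self._bars)


bars = list(range(8))  # stand-in for eight dated price bars, oldest first
development, held_back = chronological_split(bars)
validation = HoldOut(held_back)

final_check = validation.take()  # the one permitted look
print(len(development), "bars to develop on,", len(final_check), "to validate on")
```

A second call to `validation.take()` raises an error, which is the point: once you have looked, any further optimization is no longer being validated by that data.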
Use lots of data, and make sure the data spans different market environments. But don’t forget that the market shifts over time. For example, volume today is different than it was 10 years ago.
My own approach is to bias my system under test toward the negative so I don’t get burned by flaws in my data. So, for instance, I look ahead and rule out any trade that would return more than a particular percent. I do not, however, rule out flawed trades that would lead to a large loss. Your data vendor will almost always include flawed data. You’ll need to account for it in some manner, even if not in the manner I chose.
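The asymmetry described above fits in a few lines. This is a minimal sketch: the 25% cap and the sample returns are hypothetical, since the text only says "a particular percent".

```python
def pessimistic_filter(trade_returns, max_gain=0.25):
    """Drop trades whose gain exceeds max_gain; keep every loss, however large.

    trade_returns: per-trade returns as fractions (0.02 means +2%).
    max_gain: hypothetical cap; gains above it are treated as bad data.
    """
    return [r for r in trade_returns if r <= max_gain]


# Made-up results: +350% and -60% both look like data errors.
raw = [0.02, -0.01, 3.50, -0.60, 0.04]
kept = pessimistic_filter(raw)
print(kept)
```

Only the suspicious winner is removed. The suspicious loser stays in, so any error in the data can only make the backtest look worse than reality, never better.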
Don’t expect symmetry between shorts and longs. Market participants tend to bring different factors to bear on stocks going up versus stocks going down. Euphoria is different than panic. But you should generally check for symmetry as part of a practice of being thorough.
You do not simply care about the CAR (Compound Annual Return) and the DD (Drawdown). You want to know much, much more about the results. Picture your equity curve as land and the drawdown as water. What’s the ratio of land to water? In other words, the depth of a drawdown isn’t the only factor; its breadth matters too. I suggest using a logarithmic scale for your returns. You want to see as linear a ramp up as possible, with drawdowns that may happen all the time but all look pretty similar. If you only look at the CAR and DD, you can easily end up with a system that performed fantastically at one particular moment in history and rather stinks the rest of the time.
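One way to put numbers on the land-and-water picture is to measure how much of the time the equity curve sits below its prior peak, and for how long at a stretch, alongside the usual maximum depth. This is a sketch under that interpretation of the metaphor. The function name and the equity values are made up.

```python
def drawdown_profile(equity):
    """Summarize the depth and breadth of drawdowns in an equity curve.

    equity: account values in time order, one per bar.
    Returns the deepest drawdown (as a fraction of the prior peak), the
    fraction of bars spent below a prior peak, and the longest run of
    consecutive bars below a prior peak.
    """
    peak = equity[0]
    max_depth = 0.0
    underwater_bars = 0
    longest_run = 0
    current_run = 0
    for value in equity:
        if value >= peak:
            peak = value       # new high: back on land
            current_run = 0
        else:
            underwater_bars += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
            max_depth = max(max_depth, 1 - value / peak)
    return {
        "max_depth": max_depth,
        "underwater_fraction": underwater_bars / len(equity),
        "longest_underwater_run": longest_run,
    }


equity = [100, 110, 105, 99, 108, 112, 111, 115]  # hypothetical curve
profile = drawdown_profile(equity)
print(profile)
```

Two systems with the same maximum depth can have very different underwater fractions, and the one that spends most of its life below its old highs is much harder to keep trading.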
Get to know yourself as a trader and build into your backtests whatever assumptions are needed to maintain your sanity and even your enjoyment of the trading. As I’ve mentioned elsewhere, though it took me a few years, I eventually discovered that I do much better if I’m in cash at night, with all my positions closed out. Over time, I think I’ve sort of figured out how to make this a strength, but initially I abided by it even though it hurt my backtest results. Let me put this another way. When backtesting, it is easy to ignore yourself. Don’t.
A reason to be biased toward shorter-term trading systems: your data will provide a more valid test of your hypothesis. If you are using 6 or 7 years of data (and I wouldn’t recommend less even if it slows your computer down significantly when running tests), many trades a week will result in far more trials than a system dependent on several trades a year. I’m simply not comfortable drawing conclusions without thousands of trials across different market conditions. Thus, I have always and only tested relatively short-duration trades (1 to 10 days or so).
A reason to be biased against shorter-term trading systems: your frictional costs are higher. With System (my cleverly named system), I have to make about 3% a month to break even.
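To see how friction scales with trading frequency, it helps to do the arithmetic explicitly. The sketch below is a rough model with entirely hypothetical inputs: 60 round trips a month, each using 10% of the account, each costing 0.5% of the position in commissions and slippage. These are not the actual figures behind the break-even number above.

```python
def monthly_cost_drag(round_trips_per_month, position_fraction, cost_per_round_trip):
    """Fraction of account equity lost to friction each month.

    position_fraction: share of equity committed to each trade.
    cost_per_round_trip: commissions plus slippage, as a fraction of
        the position size.
    """
    return round_trips_per_month * position_fraction * cost_per_round_trip


# Hypothetical inputs; substitute your own broker costs and trade counts.
short_term = monthly_cost_drag(60, 0.10, 0.005)
long_term = monthly_cost_drag(1, 0.10, 0.005)
print(f"short-term drag: {short_term:.2%} per month")
print(f"long-term drag:  {long_term:.2%} per month")
```

The per-trade cost is identical in both cases. Only the number of trades differs, and that alone separates a system that must clear a few percent a month from one where friction is nearly invisible.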
In general, picking up nickels in front of a steamroller is a bad idea. A system that only rarely has a huge loss, but whose rare loss wipes out the results from many, many wins, is very problematic. First and foremost, from a backtesting validity point of view, do you have enough loss samples to even consider your results valid? Better to have lots of wins and losses if you want more statistically valid results.