Monday, April 28, 2008
Friday, April 25, 2008
On a side note, I think it's fairly obvious it's harder to beat the market during a bull period like we've seen since 2003. That fact alone would tend to reduce the margin of outperformance. The real question is how does the strategy do once the bull ends, as it may be doing now. A connected question is, how do you know what kind of market it is in the first place? I think the best longer-term strategy will perform well in a bull, bear or congested market. So even if we could prove the COTs data has lost its power since 2003, which as I show above it clearly hasn't, I think you'd want to see how it does in a bear market before jumping to any conclusions.
As for the out-of-sample testing on my NDX setup, it was also interesting. This kind of testing is done to help see how the setup would have done in real-life trading. In six tests (I plan to add a few more as I refine my testing further), the setup achieved an out-of-sample CAGR equal to 91 percent of the in-sample CAGR. In other words, if CAGR during the test window was 10 percent, then the out-of-sample CAGR would have been 9.1 percent. Out-of-sample "efficiency" above 60 to 70 percent is considered robust. Out-of-sample efficiency for regressed annual return was 76 percent. The out-of-sample efficiency for the Sharpe score was 1.5 (that is, 150 percent), with an out-of-sample Robust Sharpe efficiency of 1.1 and an out-of-sample largest drawdown only 38 percent as large as the in-sample drawdown. On average, the out-of-sample efficiency of all these scores was 1.4 - meaning that it was 40 percent better than the backtesting window.
I also like to get a more robust look at all this out-of-sample data by dividing it by its standard deviation in the six out-of-sample tests. I find this evens out any unusual data spikes resulting from a small number of exceptionally strong or weak trades in a single test window. By that measure, the average out-of-sample score was even better - 2.0 - which suggests that the out-of-sample data was quite consistent across the six tests, another good sign. I intend to start publishing more data like this for all my setups on my Latest Signals page table. I've also been improving my testing procedures to optimize my search for the top setups and hope to have an improved S&P 500 setup to announce soon. See you in a little while with this afternoon's COTs data update.
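For readers who want to reproduce this kind of arithmetic, here is a minimal sketch of the two calculations described above. It assumes that "efficiency" means the out-of-sample value of a metric divided by its in-sample value, and that the consistency score is the average efficiency divided by the sample standard deviation across test windows. The function names and the six per-window numbers are made up for illustration; they are not my actual test results.

```python
from statistics import mean, stdev


def oos_efficiency(in_sample: float, out_of_sample: float) -> float:
    """Out-of-sample value of a metric as a fraction of its in-sample value."""
    if in_sample == 0:
        raise ValueError("in-sample value must be non-zero")
    return out_of_sample / in_sample


def consistency_score(efficiencies: list[float]) -> float:
    """Average efficiency divided by its standard deviation across windows.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    spread = stdev(efficiencies)
    if spread == 0:
        raise ValueError("efficiencies are identical; the score is undefined")
    return mean(efficiencies) / spread


# The CAGR example from the text: 10 percent in-sample, 9.1 percent out-of-sample.
cagr_efficiency = oos_efficiency(10.0, 9.1)
print(f"CAGR efficiency: {cagr_efficiency:.2f}")  # 0.91

# Hypothetical efficiencies from six test windows (illustrative only).
windows = [0.8, 1.0, 1.2, 0.9, 1.1, 1.0]
print(f"Consistency score: {consistency_score(windows):.2f}")
```

The higher the second number, the less the result depends on one unusually strong or weak test window.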
NDX setup parameter values: My trading setup for the NASDAQ 100 combines the signals from two different setups. The first setup goes long when the commercial trader net percentage-of-open-interest position is -1.05 standard deviations or greater (higher) than its three-week moving average. It goes short when the position is -1.05 standard deviations or lower than the moving average. The setup uses a five-week trade delay. The second setup goes long when the small trader net percentage-of-open-interest position is 0.45 standard deviations or lower than its 14-week moving average and goes short when the position is 1.4 standard deviations or higher than the average. This setup has no trade delay. I'll post a spreadsheet for this setup some time early next week.
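Until the spreadsheet is up, the rules above can be sketched roughly as follows. This is only an illustration under several assumptions that the description leaves open: the standard deviation is taken over a lookback window passed in as `sd_weeks`, a reading exactly at -1.05 counts as long for the first setup, and the second setup keeps its previous position when the reading falls between its two thresholds. The function names are hypothetical, and the sketch covers neither the five-week trade delay nor the way the two signals are combined.

```python
from statistics import mean, stdev


def deviation_in_stdevs(positions: list[float], ma_weeks: int, sd_weeks: int) -> float:
    """Distance of the latest reading from its moving average, in standard deviations.

    positions: weekly net percentage-of-open-interest readings, oldest first.
    sd_weeks:  lookback for the standard deviation (an assumption, not stated in the text).
    """
    if len(positions) < max(ma_weeks, sd_weeks) or sd_weeks < 2:
        raise ValueError("not enough data for the requested windows")
    moving_avg = mean(positions[-ma_weeks:])
    spread = stdev(positions[-sd_weeks:])
    if spread == 0:
        return 0.0
    return (positions[-1] - moving_avg) / spread


def commercial_signal(deviation: float, threshold: float = -1.05) -> str:
    """First setup: long at or above the threshold, short below it."""
    return "long" if deviation >= threshold else "short"


def small_trader_signal(deviation: float, previous: str,
                        long_at: float = 0.45, short_at: float = 1.4) -> str:
    """Second setup: long at or below 0.45, short at or above 1.4.

    Assumption: between the two thresholds the previous position is kept.
    """
    if deviation <= long_at:
        return "long"
    if deviation >= short_at:
        return "short"
    return previous


# Toy readings, illustrative only.
dev = deviation_in_stdevs([1.0, 2.0, 3.0], ma_weeks=3, sd_weeks=3)
print(commercial_signal(dev), small_trader_signal(dev, previous="long"))
```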
Tuesday, April 22, 2008
Thankfully, Steve LeCompte over at CXOAdvisory.com has taken down his erroneous post about my S&P 500 trading setup and reevaluated the data in a new post today. He raises a good point about how "trading friction" (i.e. transaction costs) would have eaten into the past profitability of the setup, which I'm going to include in my ongoing revision process of my setups based on the Commitments of Traders reports. I think this independent look at my trading strategy is great and can only help me build a better system. I didn't take into account trade friction earlier because my original group of setups traded fairly infrequently. As my revision process comes up with setups that trade more often, this is definitely an important factor to consider. But I think some of Steve's other conclusions are again flawed. (See my post here about his earlier erroneous study of my strategy.) Here is the response I just sent him about his latest work:
Thanks for revising your erroneous post and reevaluating the data. You raise some good issues about this particular trading setup for the S&P 500, for which I thank you. But I also would like to draw your attention to other conclusions you reach that I believe are flawed and that raise questions about some of your evaluation methods.
You are correct to say that trade friction is an important variable to take into account. As I've mentioned on my blog and to you, I'm going through a re-evaluation process of my setups right now to find those that are the most statistically robust. Trade friction would be a good element to include.
However, when you delve into the area of statistical robustness, your conclusions are weaker:
1) You say the performance of the SPX setup with trade friction mostly lags buying and holding the index in the last five years. You evaluate this by studying only the profit. That's probably the weakest measure you can use. It's easy to find incredibly profitable trading setups that aren't very statistically robust by other measures. One more robust measure to use, for example, is the Sharpe score. This tells you whether the return was achieved at the expense of great volatility. The 2003-07 Sharpe score for this setup is 2.8, while for buying and holding it is 2.0 - a large difference suggesting the setup achieved the same return with fewer tough-to-stomach ups and downs. Over the entire 1995-2007 period, the Sharpe for the setup was 3.4, while for SPX it was 1.2. That's not to say your point about trade friction eating up those profits isn't a good one. It's just that your evaluation method isn't based on a very robust measure.
2) You also say the COTs dataset for the SP500 has a small sample size that reduces confidence in the setup. Your conclusion doesn't seem to be based on anything very solid. I invite you to read up on how to settle this question by studying Robert Pardo's new book on trading strategy development. One way of checking the adequacy of the sample size is the number of trades. This setup has 150. That's well over the minimum of 30 trades Pardo recommends for a reliable setup.
In his book, you will also see a simple method described to evaluate if you've got enough data for your strategy. I've blogged about this here. Using this method, this setup's nine-week moving average period uses only 3 percent of the available degrees of freedom of the dataset. (That is based on the 12 trading rules in the strategy.) That's well below the 10 percent maximum suggested by Pardo.
Pardo outlines other methods of reducing the risk of data-mining, which I've implemented or am in the process of implementing during my revision process. One is out-of-sample testing. This particular setup achieves an out-of-sample efficiency of 1.3 in 10 tests - meaning on average the out-of-sample performance was 30 percent higher than for the in-sample data for Sharpe, Robust Sharpe, compound annual growth rate, drawdown and regressed annual return.
3) Finally, I think you're incorrect to conclude that your study confirms that "the predictive power of COT report data may have diminished in recent years." I invite you to take another look at your own chart of annualized return by calendar year for buying and holding the SP500 and the COTs Timer Strategy with your trade friction. The best performances came at the end of that five-year period, in 2005 to 2007. As well, you draw this conclusion based on evaluating one possible setup. Again, I would suggest that's not a very robust conclusion. All that said, I thank you again for including me in your research and for raising some good issues to evaluate further.
Monday, April 21, 2008
Update (1:20 p.m.): Steve just emailed saying he's removed that erroneous post and will retest the data. He'll post the correct results tomorrow morning.
Friday, April 18, 2008
So what do the precious little Commitments of Traders reports have to say about all this? As you can see from the table on my Latest Signals page, they're also lining up smartly on the bullish side. My trading setup for the S&P 500 flipped to bullish a few weeks ago, with the execution date for the trade coming up on the open of Monday, April 21. That setup is based on trading on the same side as the "smart money" commercial traders in SPX futures and options. Right now, they're resolutely upbeat. This makes five of my six equity setups bullish (all except for the holdout Dow Jones industrials). Also taking effect on Monday's open is my bullish signal for crude oil.
From today's new COTs data release, my setup for the 30-year Treasury bond has flipped to bearish. (This means it calls for the yield to rise.) This new signal takes effect Monday, too. I should note that I'm presently long Treasuries through my setup for the 10-year note. That setup is far more robust statistically speaking than the setup for the 30-year bond, and it remains in bullish mode. So I don't plan on taking any action on the 30-year signal. I'll be updating the 30-year setup soon to improve its robustness.
Housekeeping: I know, I know - I keep promising more updates for my setups based on the COTs reports. Rest assured, I'm busy testing and refining various things, but it's not yet at the point where it warrants any new announcements or changed trading setups. The research I'm doing now includes out-of-sample analysis of the setups. This is just one step among many in verifying how reliable the setups would be in real-time trading. So far, I'm happy to say the results are very positive and haven't caused me to drop any of my setups. But I'm hopeful this and other testing might lead to still-better ones in some markets and to refinements of setups in markets I haven't had a chance to revisit yet, like the agricultural commodities and currencies. Another twist on all this is testing more combined setups, of which I already use a couple - setups based on the best signals from two groups of traders (for example, the commercials and large speculators). Thanks for your patience with this process. Have a great weekend, and see you next week with a portfolio update and hopefully some further news on revisions.
Friday, April 11, 2008
- This afternoon's report gave me two new bearish signals for copper and platinum. These setups both work with eight-week trade delays, so the signals don't take effect until June 9. Wow, that's something to think about. We're still digging ourselves out from the snow up here in Quebec's Eastern Townships, and it's nice of the COTs reports to bring to mind those coming balmy days. Given the long delays, there's ample room for several pending signals to stack up in a row for these setups. See my Latest Signals table for some notes on my copper setup in particular.
- My six equities setups (including the BKX Bank Index setup based on the three-month Eurodollars COTs data) seem to be strengthening their bullish alignment. All except the Dow Jones industrials are or will shortly be in bullish mode, with the S&P 500 setup going long as of the open on Monday, April 21. After a week like this one, perhaps we're finally lining up for the long-awaited rally that's been playing hide-and-seek with traders for so long.
- Trades for Monday, April 14, include going long again for gold. This setup operates with a two-week delay and has been flopping around like Gill in the dentist's office in Finding Nemo, but it now seems to have settled happily into bullish mode as the large speculators have steadily increased their net long percentage-of-open-interest position in gold futures and options.
Have a great weekend, and see you back here next week with a portfolio update and those promised revisions to my setups. Sorry not to have posted any of those this week, but rest assured that fascinating refinements are on their way!