by Tom Starke
Can monkeys beat the market? Much has been written about the famous “Monkey Portfolio”, in which a monkey throws darts at a board of stock tickers and the resulting portfolio outperforms the underlying benchmark. It is a very bold proposition, and I couldn’t resist putting it to the test myself. Quite frankly, I have no doubt that it is true. I was more interested in HOW true it is, and in whether we can derive a credible trading strategy from it.
It reminds me of the infamous “spooky action at a distance”, aka “entanglement”, in quantum physics: there is an experimentally verified connection between elementary particles across large distances (even across the universe), but unfortunately it cannot be exploited to transmit information faster than the speed of light.
Another question that has yet to be answered conclusively is the reason for this inefficiency. I will draw my own, humble conclusions further down.
Constructing the dataset
Before looking at the simulations, let’s have a quick look at how the dataset was constructed. Wikipedia provides a page with the current components of the S&P 500 and also a list of the changes all the way back to 2000. With that information we can reverse engineer the historical compositions of the index. I did this by copying and pasting the tables into a text file and processing them in Python. I found that from the start of the dataset to now, 308 companies remained in the index, which is a bit more than 50%. I was a little surprised by that number; I had expected slightly fewer.
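The reverse-engineering step can be sketched roughly as follows. This is not the original processing script, only a minimal illustration of the idea: start from today's membership and undo each recorded change, newest first. The function name, the tuple layout of the change log, and the tickers and dates are all made up for the example.

```python
from datetime import date


def reconstruct_compositions(current, changes):
    """Walk the change log backwards from today's membership.

    current: iterable of tickers in the index today.
    changes: list of (date, added, removed) tuples, where added and
             removed are ticker strings or None (hypothetical layout).
    Returns a dict mapping each change date to the membership just
    BEFORE the changes on that date took effect.
    """
    members = set(current)
    history = {}
    # Newest change first, so each step undoes one event.
    for when, added, removed in sorted(changes, key=lambda c: c[0], reverse=True):
        if added is not None:
            members.discard(added)   # it was not in the index before this date
        if removed is not None:
            members.add(removed)     # it was still in the index before this date
        history[when] = frozenset(members)
    return history


# Toy data, purely illustrative.
current = {"AAA", "BBB", "CCC"}
changes = [
    (date(2005, 6, 1), "BBB", "YYY"),   # BBB replaced YYY
    (date(2010, 1, 4), "CCC", "XXX"),   # CCC replaced XXX
]

history = reconstruct_compositions(current, changes)
earliest = history[min(history)]

# Companies that were in the index at the start and are still in it today.
survivors = earliest & current
print(sorted(earliest), sorted(survivors))
```

Counting the survivors this way is the same kind of calculation that yields the number of companies that stayed in the index throughout. Note that a sketch like this inherits every gap in the change log: a ticker rename that is not listed as a change will look like one company leaving and another appearing from nowhere.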
Disclaimer: I have no way of checking whether the reverse-engineered portfolios are actually correct, as I do not currently have access to the right data sources. There have probably been some events, such as name and ticker changes, that the tables do not account for. However, since we are dealing with fairly substantial random processes, let’s assume that the error is reasonably small. There is another source of inaccuracy here: the data source I am using, Yahoo Finance, does not provide price data for all of the 708 companies that have been part of the index…