What is crowdsourcing? If we ask Wikipedia, it tells us that it “is a sourcing model in which individuals or organizations obtain goods and services, including ideas and finances, from a large, relatively open and often rapidly-evolving group of internet users; it divides work between participants to achieve a cumulative result.” Within a trading context, how can crowdsourcing be used to generate signals? If we are using price data in a trading model, as is likely in many trading models, we are already using a “crowdsourced” data source. After all, the price is essentially generated by the “crowd” of market participants trading with one another. Market positioning is another example. Understanding how participants are positioned (and hence whether those positions are above water or not) can provide useful trading signals.
However, these are by no means the only examples of crowdsourcing in markets. Another is the use of forecasts for economic data releases and company earnings releases, which have traditionally been compiled by data companies from researchers working at large financial firms. These can be used as a proxy for the market consensus going into these releases. Newer companies such as Estimize crowdsource their earnings estimates from a wide variety of analysts, including retail, buy side and sell side. There is a large body of research describing the idea of pre-earnings drift. Potentially, different estimates can be combined to create a variety of trading signals. Whilst I haven’t looked at this for earnings numbers, I have found this approach useful for macro data releases.
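To make the idea of combining estimates a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how Estimize or any data vendor computes a consensus: the function names, the equal weighting between the two groups and the sample numbers are all hypothetical. It blends the median of a traditional survey with the median of a crowdsourced panel, then expresses the actual release as a surprise scaled by how much the forecasters disagreed.

```python
from statistics import median, pstdev


def blended_consensus(sell_side, crowd, crowd_weight=0.5):
    """Blend two sets of forecasts into a single consensus number.

    The median is used within each group because it is less sensitive to
    outlier forecasts. crowd_weight is a hypothetical tuning parameter.
    """
    return (1 - crowd_weight) * median(sell_side) + crowd_weight * median(crowd)


def standardized_surprise(actual, sell_side, crowd, crowd_weight=0.5):
    """Surprise versus the blended consensus, scaled by forecast dispersion.

    Returns None when every forecaster agrees (zero dispersion), because
    the scaled surprise is undefined in that case.
    """
    consensus = blended_consensus(sell_side, crowd, crowd_weight)
    dispersion = pstdev(list(sell_side) + list(crowd))
    if dispersion == 0:
        return None
    return (actual - consensus) / dispersion


# Hypothetical forecasts for a macro release (e.g. thousands of jobs added)
sell_side = [180, 190, 200, 185, 195]
crowd = [200, 210, 205, 195, 220]
actual = 215

consensus = blended_consensus(sell_side, crowd)
surprise = standardized_surprise(actual, sell_side, crowd)
print(f"consensus={consensus}, surprise={surprise:.2f} standard deviations")
```

The gap between the two medians before the release could itself be treated as a signal, and the scaled surprise as another after the release. Whether either has any predictive value is an empirical question that would need proper backtesting.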
There are also “high level” forms of crowdsourcing, where the trading signals are generated directly. Traditionally, within a quant fund, trading signals (or “alphas”) have been generated through research by quants. Capital is then allocated between the various alphas to create a portfolio. Quantopian takes a slightly different approach. It essentially allows anyone to code up a trading strategy on its online platform. Quantopian can then select the alphas which show the most promise and allocate capital to them. In a sense, the community provides the building blocks, and the allocator (Quantopian) builds the house. Obviously, it requires a lot of research work to identify the best alphas, and that is the challenge whether you crowdsource the alphas or not. In particular, a key challenge is to avoid signals which have been heavily data mined (and hence could perform poorly out of sample), which can be even more difficult if you do not have a clear understanding of the process which developed the alphas. Perhaps even more important than looking at the returns chart of a strategy is an understanding of how it was put together and the various assumptions behind it, which can give you an idea of how stable it is likely to be in the future.
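As a rough illustration of the data mining problem, the sketch below splits each alpha's return history in two, and only allocates capital to alphas which have a positive Sharpe ratio in both halves, weighting them by the Sharpe ratio of the second, held-out half. To be clear, this is not Quantopian's selection process, which I don't know in detail. The function names, the 50/50 split and the return series are all made up for the example, and a real allocator would use far more data and more careful validation.

```python
from statistics import mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of periodic returns (risk-free rate ignored).

    Needs at least two observations; returns 0.0 if volatility is zero.
    """
    vol = stdev(returns)
    if vol == 0:
        return 0.0
    return mean(returns) / vol * periods_per_year ** 0.5


def allocate(alphas, split=0.5):
    """Weight alphas by their Sharpe ratio on the held-out half of the data.

    alphas maps a strategy name to a list of periodic returns. An alpha
    gets zero weight unless its Sharpe ratio is positive in both the first
    (in-sample) and second (out-of-sample) part of its history.
    """
    scores = {}
    for name, returns in alphas.items():
        cut = int(len(returns) * split)
        in_sample = sharpe(returns[:cut])
        out_of_sample = sharpe(returns[cut:])
        scores[name] = max(out_of_sample, 0.0) if in_sample > 0 else 0.0

    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in alphas}
    return {name: score / total for name, score in scores.items()}


# Hypothetical return series: one holds up, one decays after the first half
alphas = {
    "robust": [0.010, -0.005, 0.012, 0.003, 0.008, -0.002, 0.011, 0.004],
    "overfit": [0.020, 0.015, 0.018, 0.022, -0.010, -0.015, 0.005, -0.020],
}

weights = allocate(alphas)
print(weights)
```

Here the “overfit” series looks excellent in the first half and loses money in the second, so it receives no capital. A check like this is no substitute for understanding how a strategy was built, but it is a cheap first filter.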
Open source software is extensively used within finance, and is a good example of crowdsourcing. We have also seen quant hedge funds begin to open source some of their existing code, such as Arctic from Man-AHL, which is a pretty cool wrapper for storing time series data on MongoDB. You could argue that a financial firm is losing a commercial advantage by open sourcing some of its code. I think in general probably not, provided the code opened up is relatively generic (and of course I’m not advocating that every fund suddenly open sources all its trading signals, the details of which can be quite proprietary). It is with that in mind that I’ve open sourced some of the code I’ve written for backtesting trading strategies (finmarketpy). I’ve also been working on an FX TCA (transaction cost analysis) library, written entirely in Python, over the past year. If I can get sufficient sponsorship to fund it, I’m hoping to make the basic framework behind the library open source, to crowdsource ways of improving it from the community. At the same time, there will be proprietary parts of the library available specifically for sponsors. I hope that hybrid approach will provide the best of both worlds: transparency for TCA and also the funding to ensure the project continues.
Crowdsourcing gives us the “market”, and it can also provide us with interesting data sources and open source code for use in finance (and much more!). I suspect, though, that in the future the concept of crowdsourcing will become even more prevalent in trading, and in ways we probably haven’t even thought of!