Stock Market Prediction With Natural Language Machine Learning

Machines – is there anything they can’t learn? 20 years ago, the answer to that question would be very different. However, with modern processing power and deep learning tools, it seems that computers are getting quite nifty in the brainpower department. In that vein, a research group attempted to use machine learning tools to predict stock market performance, based on publicly available earnings documents. 

The team used the Azure Machine Learning Workbench to build their model, one of many tools now out in the marketplace for such work. To train their model, earnings releases were combined with stock price data before and after the announcements were made. Natural language processing was used to interpret the earnings releases, with steps taken to purify the input by removing stop words, punctuation, and other ephemera. The model then attempted to find a relationship between the language content of the releases and the following impact on the stock price.

Particularly interesting were the vocabulary issues the team faced throughout the development process. In many industries, there is a significant amount of jargon – that is, vocabulary that is highly specific to the topic in question. The team decided to work around this, by comparing stocks on an industry-by-industry basis. There’s little reason to be looking at phrases like “blood pressure medication” and “kidney stones” when you’re comparing stocks in the defence electronics industry, after all.

With a model built, the team put it to the test. Stocks were sorted into 3 bins —  low performing, middle performing, and high performing. Their most successful result was a 62% chance of predicting a low performing stock, well above the threshold for chance. This suggests that there’s plenty of scope for further improvement in this area. As with anything in the stock market space, expect development in this area to continue at a furious pace.

We’ve seen machine learning do great things before, too – even creative tasks, like naming tomatoes. 

Hedgefund Startup Powered By Crowdsourced Code

In the financial sector, everyone is looking for a new way to get ahead. Since the invention of the personal computer, and perhaps even before, large financial institutions have been using software to guide all manner of investment decisions. The turn of the century saw the rise of High Frequency Trading, or HFT, in which highly optimized bots make millions of split-second  transactions a day.

Recently, [Wired] reported on Numerai — a hedge fund founded on big data and crowdsourcing principles. The basic premise is thus — Numerai takes its transaction data, encrypts it in a manner that hides its true nature from competitors but remains computable, and shares it with anyone who cares to look. Data scientists then crunch the numbers and suggest potential trading algorithms, and those whose algorithms succeed are rewarded with cold, hard Bitcoin.

Continue reading “Hedgefund Startup Powered By Crowdsourced Code”