Stock Market Prediction With Natural Language Machine Learning

Machines – is there anything they can’t learn? 20 years ago, the answer to that question would be very different. However, with modern processing power and deep learning tools, it seems that computers are getting quite nifty in the brainpower department. In that vein, a research group attempted to use machine learning tools to predict stock market performance, based on publicly available earnings documents. 

The team used the Azure Machine Learning Workbench to build their model, one of many tools now out in the marketplace for such work. To train their model, earnings releases were combined with stock price data before and after the announcements were made. Natural language processing was used to interpret the earnings releases, with steps taken to purify the input by removing stop words, punctuation, and other ephemera. The model then attempted to find a relationship between the language content of the releases and the following impact on the stock price.
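As a rough illustration of that cleaning step, here is a minimal Python sketch. The stop-word list and the `clean_release` helper are assumptions made up for this example, not the team's actual pipeline.

```python
import string

# Tiny illustrative stop-word list (an assumption; real pipelines use far larger lists).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "was", "is", "by"}

def clean_release(text):
    """Lowercase the text, strip ASCII punctuation, and drop stop words."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]

# Hypothetical sentence from an earnings release.
tokens = clean_release("Revenue for the quarter was $4.2 billion, an increase of 12%.")
print(tokens)
```

Note that naive punctuation stripping turns "4.2" into "42", which is one reason a real pipeline would likely handle numbers separately.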

Particularly interesting were the vocabulary issues the team faced throughout the development process. In many industries, there is a significant amount of jargon – that is, vocabulary that is highly specific to the topic in question. The team decided to work around this, by comparing stocks on an industry-by-industry basis. There’s little reason to be looking at phrases like “blood pressure medication” and “kidney stones” when you’re comparing stocks in the defence electronics industry, after all.

With a model built, the team put it to the test. Stocks were sorted into three bins: low performing, middle performing, and high performing. Their most successful result was a 62% chance of predicting a low performing stock, well above the roughly 33% that random guessing would give if the three bins are equally sized. Even so, there's plenty of scope for further improvement in this area. As with anything in the stock market space, expect development to continue at a furious pace.
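To make the three-bin setup concrete, here is a minimal Python sketch that sorts stocks into near-equal bins by post-announcement return and works out what blind guessing would score. The equal-sized bins, the `bin_by_terciles` helper, and the sample returns are all assumptions for illustration; the write-up does not say how the team drew its bin boundaries.

```python
from collections import Counter

def bin_by_terciles(returns):
    """Label each return 'low', 'middle', or 'high' by rank, in three near-equal bins."""
    names = ("low", "middle", "high")
    # Indices of the stocks, ordered from worst return to best.
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    labels = [None] * len(returns)
    for rank, idx in enumerate(order):
        labels[idx] = names[rank * 3 // len(returns)]
    return labels

# Hypothetical post-announcement returns for six stocks.
returns = [-0.08, 0.01, 0.12, -0.03, 0.05, 0.00]
labels = bin_by_terciles(returns)

# Share of each label: the hit rate you would get by always guessing that label.
counts = Counter(labels)
baseline = {name: counts[name] / len(labels) for name in counts}
print(labels, baseline)
```

With equal bins, each baseline comes out at one third, which is the number a 62% result should be compared against.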

We’ve seen machine learning do great things before, too – even creative tasks, like naming tomatoes. 

38 thoughts on “Stock Market Prediction With Natural Language Machine Learning”

    1. Heh, perfect! You can hide it behind whatever tools and buzzwords you like but anything that involves a speculative market boils down to gambling on a rigged game and you the little guy may get lucky but you will never be part of the group who is “in the know and can do more than just guess what the market is going to do”.

      1. “rigged game” ? “little guy” ? Spoken like a true forever pauper with no aspirations to financial wealth. Perhaps you should read the book “Market Wizards”, or the “Big Short”. There are plenty of present day market titans who were “little guys” at one point. They became ‘whales’ (financial jargon) in the industry because they saw something others didn’t and used their smarts & ingenuity (and most of all *desire* and *ambition*) to acquire an edge on the market and leverage it to their benefit.

        I’m a derivatives trader (on the side) and have done quite well (my net worth is in the 8 digit range). There’s an abundance of information out there. To quote a famous speech by “Gordon Gekko” in the classic movie “Wall Street” –
        “Greed – in all its forms, greed for life, for money, for love, for knowledge – has marked the upward surge of mankind”. Either you desire wealth, or you remain one of the proletariat.

        The markets don’t care who you are. There are plenty of ways to get your share.

        1. And even more ways to lose everything. FTFY.

          In order for you to become rich, many have lost money which is now in your pockets. And most people will fit into the losing category, whatever their opinion of their trading prowess might be.

          1. That’s not exactly true. Strike that; that’s not true at all! Yes, there is inherent risk involved in markets, but an increase in the value of my investment does not come at the expense of someone else. How would that even work? If the value of one company goes up, would the value of another company have to go down? Do some research; learn how market economics works. Claiming that money I may have made on the markets came out of your pockets just makes you sound bitter and full of envy.

          2. No, because market economics is not a zero-sum game. The whole point of it is to create wealth. If a nation’s GDP increases, is it at the expense of other nations? No–wealth is created in all kinds of ways. If that weren’t the case, then it would be impossible for our standard of living to be where it is in comparison to what we had in the 1700’s.

          3. No wealth is created by the stock market. It is more or less a gamble.

            I have to deposit funds at my bank if I want to trade stocks. If I made a bad trade, I lose that money, and if you are at the other end of that trade – you gain money.

            And the whole world economy is rigged, if you want to talk about that subject:

            As American economist Barry Eichengreen summarized: “It costs only a few cents for the Bureau of Engraving and Printing to produce a $100 bill, but other countries had to pony up $100 of actual goods in order to obtain one.”


            Further, my bank lends me the money it does not have so I can buy a house. I have to repay with real money. However if some event happened, like in 1929, when there was a bank run, the entire rotten banking system would collapse in a day.

            The entire thing is one huge bloody mess. Nobody understands it fully. God help us all when it goes down.

          4. Yeah, that’s not how it works at all. When you buy shares, the money goes to the company. The company uses that pool of money to operate and *create goods and services*. Those goods and services add value to the company, and part of the profits come back to you as the investor.

            I understand how it seems a bit complicated since we now essentially operate on a debt standard, but if you think of it from the gold standard point of view it becomes more obvious:

            The amount of gold held by the government determines the amount of cash available in the market. Some industrious soul goes out digging, and comes back with more gold. More. Adding to the wealth already present. When they sell that gold, exactly nobody is diminished in wealth.

            A healthy market operates in a world where value is created in both goods and services. If I invest in those actions and make money, that money does not come out of someone else’s pocket–it’s created by the success of the company.

            None of that means that you can’t lose money, obviously. But if you lose money on a bad investment, nobody is enriched by your bad choice unless the loss is a result of someone engaging in illegal activities.

    2. Predictive analytics, machine learning, and natural language processing are still difficult to execute. However, there are platforms that are doing much better than any supercomputer could have done just a couple of years ago. It is here now, and it is an exciting time. I think some of the success story here got lost on the topic of the stock market, but we can also relate it to deep learning projects in research into cures for diseases, aggregating massive amounts of information such as in a patent search, etc. Advanced search has come a long, long way.

  1. ” In that vein, a research group attempted to use machine learning tools to predict stock market performance, based on publicly available earnings documents. ”

    I seem to remember an article years ago where detailed knowledge of the companies, and their respective fields were used.

  2. I have to say, I’d be more impressed if they could predict the stock *before* the earnings are announced. You’d probably be more successful checking the direction of the stock the milliseconds after the results are published. This exercise seems to be more about parsing the earnings report to see if it’s positive or negative news than about “predicting” anything.

    1. Agreed. One can’t really “predict” the market. There are analytical tools that can be applied to get probability numbers, but as the usual disclaimer in the industry goes “past performance does not guarantee future results”. So that’s when you deploy derivative strategies involving futures or option positions to bracket your risk exposure (weighing the probability of a win vs loss vs risk vs reward), etc. etc.

      For the cynics “insider trading” on equities (or for that matter, futures) is something the SEC will prosecute. As “Bud Fox” found out in the ending of the movie “Wall St”.

      1. You’re right. Because it’s illegal means that no one will do it or that at least anyone with a lot to lose won’t do it. Of course people can still win in a rigged system without cheating. The issue is that many people tend to get luck and skill confused. That’s not to say there is no skill involved. It’s just that luck and available assets have more to do with the win than skill.

        1. Please explain how the system is “rigged”? (I suggest you look up Richard Dennis and his “Turtles” – he gave some new ab initio traders a ‘stake’, showed them a trading method, and they proceeded to run up huge profits.) Plenty of folks (the dedicated, smart ones) make a decent living as traders – sure, an equally large sample will go bust (mostly from avarice, fear, inadequate capitalization, or just plain stupidity).

          Admittedly the large whales in the industry (Goldman Sachs, etc) have huge divisions devoted to proprietary trading ops and can move markets, but little guys can intelligently ride their coattails (or even outsmart them, as the 2 kids who were small-fry to Wall St did, by getting custom derivative contracts made to take the opposite side of the ‘bet’ on MBSs, mortgage-backed securities).

          Money management and risk management are the keys to successful trades.

  3. have to point out one thing:
    so after feeding the reports to a modified word2vec they were able to get 62% accuracy on judging something is rubbish or not? that’s just 12% more than flipping a coin w/o reading/analysing anything.

    1. A meteorologist needs to beat “persistence”, that is, “there is a 72% chance tomorrow’s weather will be like today’s”
      In other words, if you predict that tomorrow’s weather will be the same as today, you will be right about 72% of the time.

  4. also removing punctuation:

    let’s eat, grandma!


    let’s eat grandma!

    but more or less the deal is to classify the stuff based on the attitude of the report, rather than the evaluation of actual data [if there’s anything like this]. however it is hard to ignore the general manipulative biasing – even if this usually just tries to paint a better picture.

    1. So, is your point that this type of AI doesn’t need to be very knowledgeable about grammar, because the humans who wrote the reports are awful at grammar and regardless of if the comma was there or not, the financial filings always meant the normal phrase closest to the exact phrase used, even when omitting a comma that teachers would insist is necessary?

  5. This is a great evolution of our history and the future is even brighter than what we see as human vs the AI. I believe that machine learning will definitely change the course of our foundation of investment. More to be seen with specific examples and the results.

  6. Earnings reports are very well known to have predictive value for short-term share price movements.

    The study breaks shares up into three groups (up, no-change, down) and guesses them based on its parsing the earnings report. It gets the “lows” correct 60% of the time (and the others presumably less).

    I would bet good money that any of you would outperform the model on this task: reading an earnings report and figuring out if it’s good news or bad news. Heck, you can figure that out most of the time just by seeing if they include a graph on page one — firms bury performance graphs when they’re bad.

    Headline should read: Neural network correctly classifies sentiment in earnings reports. Which still ain’t nothing, but it’s not forecasting.

  7. “20 years ago, the answer to that question would be very different.” — 20 years ago, I watched a movie where a guy started to predict the stock market with a home-built computer and then some freaky dude tried to get him to predict the coming of the messiah, and some scary corporate chick supplied him parts in return to manipulate the stock market and then he drilled a hole in his head. So, the weirdest part of that movie is that it predicted the future?!?! (LOL, Pi, 1998)

  8. only proof either do insured AAA bonds or nothing at all. It’s all driven by social trends otherwise. This also isn’t AI or learning; it’s regression modeling.
