Research suggests that emotions can significantly affect individual behaviour. This also holds for investors making decisions in the highly volatile stock market. AI companies are increasingly using sentiment analysis to predict share price movements. Blogs, among other sources, are a good place to gauge how traders and investors feel about publicly traded companies.
Stock sentiment analysis employs Natural Language Processing (NLP) to classify phrases, sentences, or entire passages of text into three classes: Positive, Neutral and Negative. Coupled with Machine Learning (ML), NLP makes it possible to extract sentiment from a large number of blogs in real time.
Stock market and sentiment analysis
Predicting stock market price movements has always been a great challenge. A number of factors influence prices, and investors seek alternative data sources to gain an edge over other traders.
The paper discussed here tries to predict the closing price of companies based on the wisdom of crowds extracted from various blog posts. Blogs with a big enough following tend to shape the opinion of the crowd.
The model in this work predicts price movements with 84% accuracy, which is surprisingly good. Let's dive into the details of this paper.
The rise of the Internet and Big Data
The rise of the Internet brought an influx of social media and blogs. For the first time in history, every single investor could share their opinions with millions of other investors.
A similar study reports that a model based on sentiment analysis reached 69.09% accuracy in predicting the Shanghai Composite Index price, based on a popular Chinese financial microblog.
Another study, conducted in 2007 by The Wall Street Journal, found a clear link between long-term stock market returns and media sentiment. The author concluded that negative media sentiment correlates with a downward trend in the broader stock market.
How does it work?
This publication analyzes how sentiment influences the stock prices of 39 banks listed on the National Stock Exchange of India. Data was collected from the blog http://www.mmb.moneycontrol.com/ with these key attributes: blog posts, stock price, number of messages.
The pipeline of the paper consists of these steps:
- Collect blog posts.
- Clean and preprocess data.
- Perform tokenization, parsing, lemmatization where necessary.
- Find communities that relate to a specific stock (company).
- Perform sentiment analysis in every single post.
- Assign +1 or +2 for different levels of positive sentiment.
- Conversely, assign -1 or -2 for various levels of negative sentiment.
- Track words that have strong sentiment like “buy”, “sell”, “plunges”, etc.
- Sum all scores to obtain the overall sentiment score.
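The scoring steps above can be illustrated with a minimal sketch. It assumes a tiny hand-made lexicon: the words, their weights, and the function names (`tokenize`, `score_post`, `overall_sentiment`) are hypothetical and are not taken from the paper. Lemmatization and community detection are left out for brevity.

```python
import re

# Hypothetical lexicon: these words and weights are illustrative assumptions,
# not the word list used in the paper.
SENTIMENT_LEXICON = {
    "buy": 2,       # strong positive
    "surges": 2,
    "gain": 1,      # mild positive
    "hold": 1,
    "weak": -1,     # mild negative
    "loss": -1,
    "sell": -2,     # strong negative
    "plunges": -2,
}


def tokenize(post: str) -> list[str]:
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", post.lower())


def score_post(post: str) -> int:
    """Sum the lexicon weights of every token in a single post."""
    return sum(SENTIMENT_LEXICON.get(token, 0) for token in tokenize(post))


def overall_sentiment(posts: list[str]) -> int:
    """Overall sentiment score: the sum of the scores of all posts."""
    return sum(score_post(post) for post in posts)


posts = [
    "Strong results, time to buy",        # buy -> +2
    "Stock plunges after weak guidance",  # plunges, weak -> -3
    "Sell now before the loss grows",     # sell, loss -> -3
]
print(overall_sentiment(posts))  # -4
```

A real system would need a much larger lexicon and some handling of negation ("do not buy"), which this sketch ignores.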
The proposed model is rather simple: if the overall sentiment score is positive, the stock price is expected to rise; conversely, if the price falls during the day, the overall sentiment score is expected to be negative.
When tested over 62 days on which the National Stock Exchange was open for trading, the model was correct 84% of the time.
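To make the evaluation concrete, here is a rough sketch of how such a directional accuracy could be computed. It assumes that the price direction is measured from open to close on the same day, which the article does not state explicitly. The function names and the sample figures are made up for illustration and are not the paper's data.

```python
def predicted_direction(sentiment_score: int) -> int:
    """Return +1 for a predicted rise, -1 for a predicted fall, 0 for no call."""
    return (sentiment_score > 0) - (sentiment_score < 0)


def price_direction(open_price: float, close_price: float) -> int:
    """Return +1 if the price rose during the day, -1 if it fell, 0 if flat."""
    return (close_price > open_price) - (close_price < open_price)


def directional_accuracy(days: list[tuple[int, float, float]]) -> float:
    """Share of days on which the sentiment sign matches the price direction.

    Each day is a tuple of (overall sentiment score, open price, close price).
    """
    hits = sum(
        predicted_direction(score) == price_direction(open_p, close_p)
        for score, open_p, close_p in days
    )
    return hits / len(days)


# Made-up sample data for four trading days (not from the paper).
days = [
    (5, 100.0, 102.0),   # positive sentiment, price rose: hit
    (-3, 102.0, 101.0),  # negative sentiment, price fell: hit
    (2, 101.0, 100.5),   # positive sentiment, price fell: miss
    (-4, 100.5, 99.0),   # negative sentiment, price fell: hit
]
print(directional_accuracy(days))  # 0.75
```

Under these assumptions, an accuracy of 84% over 62 trading days would correspond to roughly 52 correct days.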
In conclusion, sentiment analysis of blogs seems to work well for stock market predictions. An average investor would likely benefit from a system that accurately represents what social media and articles say about certain companies.
One limitation of this paper is that the experiments were conducted only on blogs. It would be useful to find out how the results would change if social media and articles were added to the mix.
If you would like to read the original paper, it is “Sentiment Analysis of Stock Blog Network Communities for Prediction of Stock Price Trends” by Sandeep Ranjan, Indian Journal of Finance, 2018.