Topic modelling
Wordcloud
Text mining is the process of extracting high-quality information from text. Topic modelling is a form of text mining that uses statistical machine learning techniques to identify patterns in large amounts of text. Given a large collection of documents, it groups similar words into clusters and identifies topics from those clusters.
In our project we use text mining and topic modelling to identify the most frequently recurring words in our dataset (all the tweets we extracted between 1st January 2021 and 14th December 2021), excluding common stop words such as ‘a’, ‘the’ and ‘if’. Based on this, we identify positive, negative and neutral words and mindsets associated with the vaccine. We then visualise these words in three different colours, one for each category.
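As a rough illustration of the counting and colouring steps described above, the sketch below counts word frequencies after removing stop words and assigns each word one of three colours. It is not the project's code: the stop-word list, the positive and negative word sets, the colour names and the sample tweets are all hypothetical placeholders, and how the project actually decides whether a word is positive, negative or neutral is not stated here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the project's actual list is not given in the text.
STOP_WORDS = {"a", "the", "if", "and", "is", "to", "of", "in", "it", "i", "about"}

# Hypothetical sentiment lexicons, used only to illustrate the three-colour idea.
POSITIVE_WORDS = {"safe", "effective", "grateful"}
NEGATIVE_WORDS = {"worried", "scared", "effects"}


def word_frequencies(tweets, stop_words=STOP_WORDS):
    """Count how often each non-stop-word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase the tweet and keep runs of letters (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", tweet.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


def word_colour(word):
    """Map a word to one of three colours (assumed colour choices)."""
    if word in POSITIVE_WORDS:
        return "green"
    if word in NEGATIVE_WORDS:
        return "red"
    return "grey"  # neutral


# Made-up sample tweets, not taken from the dataset.
sample_tweets = [
    "The vaccine is safe and effective",
    "I got the vaccine today",
    "Worried about vaccine side effects",
]

frequencies = word_frequencies(sample_tweets)
for word, count in frequencies.most_common(5):
    print(word, count, word_colour(word))
```

The resulting frequencies and colours could then be handed to whatever word cloud renderer is in use; that step is left out here because the rendering library is not named in the text.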
Top frequent words
To better understand which words are used most in each month, we visualised them as a bar graph with a widget for selecting the month. Changing the month redraws the bar graph with the 15 most used words of that month.
- See the Topic modelling for the code that produced these plots.
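The monthly count behind such a bar graph can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `top_words_by_month`, the input shape (a date paired with an already tokenised, stop-word-free tweet) and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def top_words_by_month(dated_tokens, n=15):
    """Return {(year, month): [(word, count), ...]} with the n most frequent words."""
    monthly = defaultdict(Counter)
    for day, tokens in dated_tokens:
        # Group counts by calendar month of the tweet.
        monthly[(day.year, day.month)].update(tokens)
    return {month: counts.most_common(n) for month, counts in monthly.items()}


# Made-up sample data: (tweet date, tokens with stop words already removed).
sample = [
    (date(2021, 1, 5), ["vaccine", "safe", "vaccine"]),
    (date(2021, 1, 20), ["dose", "vaccine"]),
    (date(2021, 2, 3), ["booster", "dose", "dose"]),
]

top_words = top_words_by_month(sample)
print(top_words[(2021, 1)])
```

A month-selection widget would then only need to look up the chosen `(year, month)` key and redraw the bars from the returned list.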