This paper describes some of the most common machine learning techniques for analyzing and extracting information from textual data, and applies them to a large sample of posts (tweets) related to the Covid-19 pandemic that the Bank of Italy collected from Twitter. We identify the most actively discussed topics related to the pandemic and build a real-time indicator of users' average sentiment (sentiment analysis).
At the outbreak of the pandemic, the number of tweets shared on the platform spikes and their sentiment drops markedly. Symptoms of the disease are the most relevant topic at the beginning of the pandemic waves, while the economic situation and the consequences of restrictive measures become a major concern later.
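
To make the idea of an average-sentiment indicator concrete, the following is a minimal sketch and not the method used in the paper, which this summary does not specify. It assumes a tiny hypothetical word lexicon (`LEXICON`) and hypothetical helper names (`tweet_score`, `daily_sentiment`): each tweet is scored as the mean polarity of its matched words, and the indicator is the mean tweet score per day.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicon (an assumption for illustration only); a real
# analysis would rely on a validated lexicon or a trained classifier.
LEXICON = {
    "good": 1.0, "hope": 1.0, "recovered": 1.0,
    "fear": -1.0, "sick": -1.0, "crisis": -1.0,
}

def tweet_score(text):
    """Mean lexicon polarity of the words in a tweet; 0.0 if none match."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def daily_sentiment(tweets):
    """Average tweet score per day, given (date, text) pairs."""
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(tweet_score(text))
    # Sort by date so the result reads as a time series.
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}

# Made-up example tweets, not data from the paper.
sample = [
    (date(2020, 3, 1), "Hope things get better"),
    (date(2020, 3, 1), "Fear of a crisis"),
    (date(2020, 3, 2), "So much fear, everyone is sick"),
]
indicator = daily_sentiment(sample)
```

Averaging per day is one simple way to turn tweet-level scores into a series that can be updated as new posts arrive; the aggregation window and the scoring model are both design choices left open here.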