Do Topic-Dependent Models Improve Microblog Sentiment Estimation?


Domain adaptation has been shown to improve sentiment estimation for movie and product reviews. For microblogs, however, topic-independent sentiment models are still commonly used.

We examined whether topic-dependent models improve performance when a large number of training tweets are available. We collected tweets with emoticons for six months and then created two types of topic-dependent polarity estimation models: models trained on tweets containing a target keyword, and models trained on an enlarged set of tweets containing terms related to a topic. We also created a topic-independent model trained on a general sample of tweets. When we compared the models, topic-dependent models performed better for some topics, but for the majority of topics there was no significant difference in performance between a topic-dependent and a topic-independent model.
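The three training sets described above can be sketched roughly as follows. This is a minimal illustration, not the setup used in the study: the emoticon lists, the toy tweets, the related-term set, and the small Naive Bayes classifier are all assumptions made for the example, and the helper names (`label_by_emoticon`, `select`, `train`, `predict`) are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed emoticon lists; the study's actual lists are not given here.
POSITIVE = (":)", ":-)", ":D")
NEGATIVE = (":(", ":-(")


def label_by_emoticon(tweet):
    """Return (text without emoticons, label), or None if unlabeled or mixed."""
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:  # no emoticon at all, or both polarities present
        return None
    text = tweet
    for e in POSITIVE + NEGATIVE:
        text = text.replace(e, " ")
    return text, ("pos" if has_pos else "neg")


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def select(tweets, terms=None):
    """Build labeled examples; if terms is given, keep only tweets mentioning one."""
    examples = []
    for tweet in tweets:
        labeled = label_by_emoticon(tweet)
        if labeled is None:
            continue
        tokens = tokenize(labeled[0])
        if terms is None or set(tokens) & set(terms):
            examples.append((tokens, labeled[1]))
    return examples


def train(examples):
    """Collect per-class token counts for a multinomial Naive Bayes model."""
    counts = {"pos": Counter(), "neg": Counter()}
    docs = Counter()
    for tokens, label in examples:
        counts[label].update(tokens)
        docs[label] += 1
    vocab = set(counts["pos"]) | set(counts["neg"])
    return counts, docs, vocab


def predict(model, text):
    """Return the more probable label under add-one smoothing."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in ("pos", "neg"):
        score = math.log((docs[label] + 1) / (total_docs + 2))
        denom = sum(counts[label].values()) + len(vocab) + 1
        for token in tokenize(text):
            score += math.log((counts[label][token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up tweets for illustration only.
tweets = [
    "free delivery with no extra charge :)",
    "no charge for the upgrade :D",
    "full charge in ten minutes :)",
    "my phone will not charge :(",
    "phone died, battery will not hold a charge :-(",
    "battery is dead again :(",
    "great day :) terrible night :(",  # mixed emoticons, so it is dropped
]

general_model = train(select(tweets))                          # topic-independent
keyword_model = train(select(tweets, {"phone"}))               # target keyword only
enlarged_model = train(select(tweets, {"phone", "battery"}))   # keyword plus related terms

print(predict(general_model, "charge"), predict(keyword_model, "charge"))
```

On this toy data the topic-independent model labels "charge" as positive while both topic-dependent models label it negative. With real data, whether the topic-dependent variants help would have to be checked per topic on held-out tweets.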

We then proposed a method for predicting which topics are likely to have better sentiment estimation performance when a topic-dependent sentiment model is used. This method also identifies terms and contexts for which the term polarity often differs from the expected polarity. For example, ‘charge’ is generally positive, but in the context of ‘phone’, it is often negative. Details can be found in our ICWSM 2014 paper.