What Does a Sentiment Analysis Score Mean?
By Laurel Earhart
The most frequently asked question pertaining to Sentiment Analysis is: What does a score mean?
Sentiment Analysis engines measure opinions expressed online in social media or in textual sources such as news and journal articles.
A single score, in and of itself, means nothing. To understand the value of the score, one must understand the system it is built upon. Here are five issues to consider:
1. A Sentiment Analysis system has a “scale”: the absolute highest and lowest values that can be assigned to a descriptive term pertaining to the object in question. Simply knowing that something scored a “.25” means nothing if you don’t know whether the scale of scores runs from -1 to +1 or from 1 to 10.
2. A Sentiment Analysis system requires the ability to understand “velocity”, or how sharply sentiment is changing over time. This allows a researcher to identify tipping points and home in on the events that triggered changes in sentiment. Again, if something has scored .25 for the last week but had been rising sharply before that, something has happened to level off rising public opinion.
3. A Sentiment Analysis system requires the ability to demonstrate “relativity”: how sentiment has changed over time, or how it differs between sources or between objects. It allows the researcher to determine, for example, whether blogs or news outlets are more favorable to a particular political candidate, and whether that candidate should spend more time on interviews or on her social media campaign. By contrast, some systems may have “normalized” scores, so 0.5 means something like “half the mentions were positive”. Understanding what a score is relative TO is extremely important.
4. A Sentiment Analysis system should provide “granularity”: how subtle a shift can be detected? Does the system provide “positive – neutral – negative” rankings or a more sensitive analysis? If something declines from .8 to .25 it is still positive, yet it represents a sharp decline.
5. A Sentiment Analysis system has “accuracy” – who decided these were the scores, anyway? Many systems were designed to replicate human scoring as closely as possible, and tested against human subjects in a variety of applications. SentiMetrix was tested at the University of Maryland and was determined to be comparable to human scoring. It was also similarly tested in government applications. If a system is not designed to replicate human scoring, what exactly is it designed to do?
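The first, second and fourth points can be made concrete with a little arithmetic. What follows is only a minimal sketch in Python, not how any particular engine works: the function names, the 0-to-1 scale, the width of the neutral band and the daily readings are all assumptions made up for illustration.

```python
def rescale(score, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Map a score from the scale [lo, hi] onto [new_lo, new_hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return new_lo + (score - lo) * (new_hi - new_lo) / (hi - lo)


def velocity(scores):
    """Change in score between consecutive readings (e.g. day to day)."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


def label(score, neutral_band=0.1):
    """Coarse three-way ranking on a -1..+1 scale (band width is hypothetical)."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Scale: the same-looking number means different things on different scales.
quarter_on_unit_scale = rescale(0.25, 0.0, 1.0)   # .25 on an assumed 0..1 scale
midpoint_on_ten_scale = rescale(5.5, 1.0, 10.0)   # midpoint of a 1..10 scale

# Velocity: hypothetical daily readings that rise sharply, then level off at .25.
daily = [-0.35, -0.05, 0.25, 0.25, 0.25]
changes = velocity(daily)

# Granularity: a three-way ranking hides the drop from .8 to .25.
same_label = label(0.8) == label(0.25)
drop = 0.25 - 0.8
```

Put on a common -1 to +1 scale, a .25 from the assumed 0-to-1 system lands well below neutral, while the midpoint of a 1-to-10 system lands exactly at neutral. The list of changes shows the rise flattening to zero, and the coarse labels stay “positive” even though the score fell by more than half a point.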
Trying to figure out a score without understanding the system is like trying to figure out which way is North without a compass or any earthly cues.
Once the system is understood, then the scores make sense.



