o How do we know that we are measuring sentiments accurately?
We have carefully verified the accuracy of our measurements using a series of control experiments in which the same documents were graded both by our system and by human subjects. Different people may assign somewhat different grades to the same document, so we use the Pearson product-moment correlation coefficient to measure how closely the grades assigned by our system track the grades assigned by people. We have established that the variance between the sentiment grades assigned by our system and the average grade assigned by people differs only slightly from the variance among the grades assigned by different people.
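As a rough illustration of the comparison described above, the following Python sketch computes the Pearson coefficient between system grades and averaged human grades, and between two human graders. The function names (`pearson`, `mean_grades`) and all grade values are hypothetical and are not taken from our production system.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length grade lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant grades")
    return cov / sqrt(var_x * var_y)


def mean_grades(graders):
    """Average grade per document across several graders.

    `graders` is a list of grade lists, one list per grader,
    all covering the same documents in the same order.
    """
    return [sum(doc_grades) / len(doc_grades) for doc_grades in zip(*graders)]


# Hypothetical grades for five documents (made-up numbers).
system = [0.8, -0.3, 0.1, -0.9, 0.5]
human_a = [0.7, -0.2, 0.0, -0.8, 0.6]
human_b = [0.9, -0.5, 0.2, -0.7, 0.3]

# How closely the system tracks the average human grade ...
system_vs_people = pearson(system, mean_grades([human_a, human_b]))
# ... compared with how closely two people track each other.
person_vs_person = pearson(human_a, human_b)
```

If the first coefficient is close to the second, the system agrees with people about as well as people agree with one another.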
o How do we maintain the opinion-expressing word bank?
We understand the importance of keeping the opinion-expressing word bank up to date. As our systems process data from the sources we track, they note new words as they appear, and once a new word starts appearing more frequently, it is automatically routed into the word scoring pipeline. We periodically do the same with the words already in the word bank, to detect changes in word usage over time. We use pre-qualified human subjects to grade texts containing the words that interest us, and we process this grading data with our proprietary algorithms to extract word scores.
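The routing step described above can be pictured with a minimal Python sketch. It assumes a simple count threshold; the class name `NewWordTracker`, the `threshold` parameter, and the tokenization rule are all hypothetical stand-ins, and the proprietary scoring algorithms themselves are not shown.

```python
import re
from collections import Counter


class NewWordTracker:
    """Counts words missing from the word bank and queues frequent ones."""

    def __init__(self, known_words, threshold=3):
        self.known = {w.lower() for w in known_words}
        self.threshold = threshold      # hypothetical cut-off, not a real setting
        self.counts = Counter()         # occurrences of each unknown word
        self.queued = []                # words routed for scoring, in order

    def observe(self, text):
        """Process one document and queue words that reach the threshold."""
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in self.known:
                continue
            self.counts[word] += 1
            # Queue the word exactly once, when it first reaches the threshold.
            if self.counts[word] == self.threshold:
                self.queued.append(word)


tracker = NewWordTracker({"good", "bad"}, threshold=2)
tracker.observe("good meh")
tracker.observe("meh bad meh")
# tracker.queued is now ["meh"]
```

The same counter could be run over words already in the bank to flag shifts in usage, with the flagged words sent on for human grading.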
o How are the data sources selected?
We aim to be at least on par with the premier news search engines on the Web: we track roughly the same sources they do and add a set of the most popular blogs. When we have to choose among sources, we rely on popularity data available on the Web. When we engage with a client, we access either the sources they provide to us or freely available sources.
o How do we maintain the list of data sources?
To keep our coverage as comprehensive as possible, we periodically re-scan the news sources to detect newly popular sites and sources for tracking. We do the same with blogs.
o Can I limit the analysis to the sources that interest me?
Sure! You can limit the sources by location (including country) or by type, select individual sources from the list, or work with us to create a new collection for you.
o Can I use SentiGrade to analyze my proprietary data?
Yes, you can. Our API allows you to send documents for scoring and later track the sentiments expressed in those documents. Please contact us for more information.
o What’s so special about SentiMetrix technology?