Technology FAQ


o How do we know that we are measuring sentiments accurately?


We have carefully verified the accuracy of our measurements using a series of control experiments in which the same documents were graded both by our system and by human subjects. Different people may assign somewhat different grades to the same document, so we used the Pearson product-moment correlation coefficient to measure how closely the grades assigned by our system track the grades assigned by people. We have established that the variance between the sentiment grades assigned by our system and the average grade assigned by people differs only slightly from the variance among multiple people.
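For readers who want to see what such a comparison looks like in practice, here is a minimal sketch in Python. It is not our production code: the function name `pearson`, the variable names, and all of the grades are hypothetical and only illustrate how a system-versus-human correlation can be set next to a human-versus-human correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant grades")
    return cov / sqrt(var_x * var_y)


# Hypothetical grades for five documents (illustrative only, not real data).
system_grades = [0.8, -0.4, 0.1, 0.6, -0.9]
human_grades = [
    [0.9, -0.5, 0.0, 0.5, -0.8],  # grader A
    [0.7, -0.2, 0.3, 0.7, -1.0],  # grader B
]

# Average the human grades document by document.
human_average = [sum(col) / len(col) for col in zip(*human_grades)]

system_vs_humans = pearson(system_grades, human_average)
human_vs_human = pearson(human_grades[0], human_grades[1])

print(f"system vs. human average: {system_vs_humans:.3f}")
print(f"human vs. human:          {human_vs_human:.3f}")
```

If the first number is close to the second, the system agrees with people about as well as people agree with each other.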


o How do we maintain the opinion-expressing word bank?


We understand the importance of keeping the opinion-expressing word bank up to date. As our systems process data from the sources we track, they note new words as they appear, and words that start appearing more frequently are automatically routed into the word scoring pipeline. We periodically do the same with words already in the word bank to detect changes in usage over time. We use pre-qualified human subjects to grade texts containing the words of interest, and we process this grading data with our proprietary algorithms to extract word scores.
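The first step described above, noticing new words and routing them once they become frequent, can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `NewWordTracker`, the fixed count threshold, and the simple whitespace tokenization are all hypothetical, and the proprietary scoring algorithms themselves are not shown.

```python
from collections import Counter


class NewWordTracker:
    """Counts words missing from the word bank and routes frequent ones."""

    def __init__(self, known_words, threshold=3):
        self.known_words = set(known_words)
        self.threshold = threshold  # hypothetical cutoff, chosen for illustration
        self.counts = Counter()
        self.routed = []  # stands in for the word scoring pipeline queue

    def observe(self, text):
        for word in text.lower().split():
            word = word.strip(".,!?;:\"'()")
            if not word or word in self.known_words:
                continue
            self.counts[word] += 1
            # Route a word exactly once, when it first reaches the threshold.
            if self.counts[word] == self.threshold:
                self.routed.append(word)


tracker = NewWordTracker(
    known_words={"good", "bad", "the", "is", "this", "phone"},
    threshold=2,
)
for doc in ["This phone is meh.", "The phone is good.", "Meh, this is bad."]:
    tracker.observe(doc)

print(tracker.routed)
```

A real system would likely normalize counts by the volume of text processed and by time window, so that a word is routed because its usage is rising rather than because more data was read.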


o How are the data sources selected?


We want to be at least on par with the premier news search engines on the Web, so we track roughly the same sources they do and add a set of the most popular blogs. When we have to make a selection, we base it on the popularity data available on the Web. When we engage with a client, we access either the sources they provide to us or freely available sources.


o How do we maintain the list of data sources?


To always provide the most comprehensive information, we periodically re-scan the news sources to detect newly popular sites and sources for tracking. We do the same with blogs.


o Can I limit the analysis to the sources that interest me?


Sure! You can limit the sources by location (including country) or by type, select individual sources from the list, or work with us to create a new collection for you.


o Can I use SentiGrade to analyze my proprietary data?


Yes, you can. Our API allows you to send documents over for scoring and later track the sentiments expressed in those documents. Please contact us for more information.


o What’s so special about SentiMetrix technology?


We are able to track the full spectrum of Internet media, not just one part of it, such as news sites, newspapers, or blogs. We believe that this is the only way our customers can get a balanced, objective view of what’s going on online and understand the trends that will shape tomorrow’s events.

We measure sentiments on a continuous scale, not just “good” vs. “bad”, which provides a level of granularity on par with traditional marketing research methods. We have taken great care to ensure that these measurements are indeed accurate.
 
We provide access to the full range of our features and functionality via a Web site and an API, and we do not force our customers into long-term consulting agreements.

Finally, each member of the SentiMetrix team has many years of experience dealing with large amounts of Internet data using natural language processing and machine learning. Our expertise in developing text mining methods is second to none, and we fully understand the needs of our customers for real-time insights into the sentiments of the online community.


 

 

Copyright SentiMetrix, Inc 2008