FAQs

  • How can we be sure that we are measuring sentiment accurately?

We have verified the accuracy of our measurements in a series of control experiments in which the same documents were graded both by our system and by human subjects. Because different people may assign somewhat different grades to the same documents, we used the Pearson product-moment correlation coefficient to measure how closely the grades assigned by our system track those assigned by individuals. We found that the variance between the grades assigned by our system and the average grade assigned by people differs only slightly from the variance among the individuals themselves. Our conclusions were validated by an independent third party for the US intelligence community, as well as by an academic institution.
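To make the comparison concrete, here is a minimal sketch of how agreement between system grades and human grades could be measured with the Pearson coefficient. It is an illustration only, not our production code: the grade values, the -1 to 1 scale, and the names `pearson`, `human_grades`, and `system_grades` are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical grades for five documents (rows: human graders, columns: documents).
human_grades = [
    [-0.8, -0.2, 0.1, 0.5, 0.9],
    [-0.6, -0.4, 0.0, 0.7, 0.8],
    [-0.9, -0.1, 0.3, 0.4, 1.0],
]
# Hypothetical grades assigned by the automated system to the same documents.
system_grades = [-0.7, -0.3, 0.2, 0.5, 0.9]

# Average human grade per document.
mean_human = [mean(column) for column in zip(*human_grades)]

# Agreement between the system and the average human grader.
system_vs_human = pearson(system_grades, mean_human)

# Baseline: average agreement between every pair of human graders.
human_vs_human = mean(pearson(a, b) for a, b in combinations(human_grades, 2))

print(f"system vs. average human: {system_vs_human:.3f}")
print(f"human vs. human (mean of pairs): {human_vs_human:.3f}")
```

If the first figure is close to the second, the system agrees with people about as well as people agree with one another.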

  • How do we maintain our sentiment vocabulary?

We understand the importance of keeping the sentiment vocabulary database up to date. As our systems process data from the sources we track, they also note terms not previously encountered. As new terms occur with greater frequency, they are automatically added to the vocabulary scoring pipeline. We also re-process the words already in the database to detect subtle nuances and changes in word usage over time. We employ qualified analysts to provide grading data for texts containing words that are of interest to us, and we process this grading data with our proprietary algorithms to extract word scores.
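The new-term tracking step can be pictured with a short sketch. This is a simplified illustration under stated assumptions, not a description of our proprietary pipeline: the tokenizer, the fixed `PROMOTION_THRESHOLD`, and the function name `update_vocabulary` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical cut-off: how many sightings promote an unknown term.
PROMOTION_THRESHOLD = 3


def update_vocabulary(documents, vocabulary, unseen_counts,
                      threshold=PROMOTION_THRESHOLD):
    """Count terms missing from the vocabulary and return those that have
    just reached the threshold, in the order they reached it."""
    promoted = []
    for text in documents:
        # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
        for term in re.findall(r"[a-z']+", text.lower()):
            if term in vocabulary:
                continue
            unseen_counts[term] += 1
            # Promote exactly once, at the moment the threshold is reached.
            if unseen_counts[term] == threshold:
                promoted.append(term)
    return promoted


vocabulary = {"good", "bad"}
unseen_counts = Counter()
documents = ["Good but meh", "meh meh bad", "totally meh"]

# Terms returned here would be queued for scoring by analysts and algorithms.
to_score = update_vocabulary(documents, vocabulary, unseen_counts)
print(to_score)
```

Keeping `unseen_counts` outside the function lets the counts accumulate across successive batches of documents.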

  • How are the data sources selected and maintained?

To represent the broadest and most popular market segments, we track over 9,000 news content sources and augment these sources with one million blogs. When we add sources to or remove sources from our library of content, we do so with the intent to provide a representative sample of the most widely accessed content. In cases where we build an application for a client, we either access sources they provide to us or crawl additional industry-specific sources from the web.

  • Can I limit the analysis to the sources that interest me?

You can limit the sources by location (including country) or by type, select individual sources from the list, or work with us to create a new collection or category grouping.

  • Can I use SentiGrade™ to analyze my proprietary data?

Yes, you can. Our service allows clients to send documents to us for scoring and later track the sentiments expressed in those documents. Please contact us for more information.

  • What’s so special about SentiMetrix technology?

We are able to track the full spectrum of Internet media, not just one part of it, such as news sites, newspapers, or blogs. We believe that this is the only way our customers can get a balanced, objective view of issues that affect their brands, and understand the trends that will shape tomorrow’s events.

We measure sentiments on a continuous scale, not just “good” vs. “bad”. Our system provides a level of granularity on par with traditional marketing research methods. We have taken great care to ensure that these measurements are indeed accurate.

We provide access to the full range of our features and functionality via a Web site and an API, and do not require our customers to sign long-term consulting agreements.

Finally, each member of the SentiMetrix team has many years, if not decades, of experience dealing with large amounts of Internet content using natural language processing and machine learning. Our expertise in developing text mining methods is second to none, and we fully understand the urgency of our clients’ need for information and insight into the sentiments of the online community.

© 2010 SentiMetrix