New algorithm spots sarcasm in customer testimonials

Social networking may generate positive verbiage, but, believe it or not, that verbiage may actually be laced with sarcasm. Researchers say they have developed a methodology that catches sarcastic statements in four out of five instances.
Written by Joe McKendrick, Contributing Writer

There's a lot being said about companies and their products in social media forums, and the emerging practice of sentiment analysis gives companies a chance to mine useful information from these online discussions.

For example, monitoring mentions in social networking discussions looks at how often a topic — such as a particular product or brand — is mentioned, how many comments a topic receives, and what people are saying about suppliers, partners, and competitors.

However, as good as they're getting, existing sentiment analysis tools can't tell if a customer statement, which may be loaded with positive phrases, is actually laced with sarcasm. As in: "I await the next release with bated breath," or "All the features you want. Too bad they don't work!" Or how about: "The CEO is a legend in his own mind." Or, a product may even be damned with faint praise, such as "The new release is adequate for what it was intended to do."

These are "indirect negative sentiments." Researchers at The Hebrew University have devised a system they call SASI, or Semi-supervised Algorithm for Sarcasm Identification, which they claim can recognize sarcastic sentences in product reviews online with at least 77% precision. In a new paper, the researchers, Oren Tsur, Dmitry Davidov, and Ari Rappoport, report they scanned 66,000 Amazon.com product reviews, with 15 human annotators from different cultural backgrounds tagging sentences for sarcasm.
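For readers unfamiliar with the term, precision is the share of sentences flagged as sarcastic that really are sarcastic. The short Python sketch below shows that arithmetic only; the function name and the sample labels are made up for illustration and are not taken from the researchers' data.

```python
def precision(predicted, actual):
    """Fraction of sentences flagged as sarcastic that annotators also marked sarcastic."""
    true_positives = sum(1 for p, a in zip(predicted, actual) if p and a)
    flagged = sum(1 for p in predicted if p)
    # With nothing flagged, precision is undefined; return 0.0 as a convention.
    return true_positives / flagged if flagged else 0.0


# Hypothetical labels for five sentences (True = sarcastic).
predicted = [True, True, True, True, False]
actual = [True, True, True, False, True]
print(precision(predicted, actual))  # 3 of the 4 flagged sentences are correct: 0.75
```

Note that precision says nothing about sarcastic sentences the system missed, such as the fifth one in this made-up sample.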

They eloquently understated the challenge:

"Evaluation of sarcasm is a hard task due to the elusive nature of sarcasm... The subtleties of sarcasm are context sensitive, culturally dependent and generally fuzzy."

The researchers developed a "sarcasm classification method" that began with a "1" to "5" rating system that annotators applied to identify the level of sarcasm present in a customer statement. This classification established syntactic and pattern-based features that became part of the algorithm model. Punctuation features (such as the number of "!" or "?" characters in a sentence) also play into the analysis.
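The punctuation features are the easiest part to picture. Here is a minimal Python sketch of counting them for a single sentence; the function name and the dictionary keys are hypothetical, and this is not the researchers' code, which combines such counts with the pattern-based features described above.

```python
def punctuation_features(sentence):
    """Count the punctuation cues mentioned in the article for one sentence."""
    return {
        "exclamation_marks": sentence.count("!"),
        "question_marks": sentence.count("?"),
    }


features = punctuation_features("All the features you want. Too bad they don't work!")
print(features)  # {'exclamation_marks': 1, 'question_marks': 0}
```

On their own, counts like these say little; a classifier would weigh them alongside other signals before labeling a sentence sarcastic.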

There are still instances of indirect negative sentiment that even SASI cannot yet catch, the authors state. For example, "SASI fails to distinguish between the following two sentences: 'This book was really good until page 2!' and 'This book was really good until page 430!'"

Still, sentiment analysis is becoming an important part of market research as companies try to better capture information flowing in through social media channels.

Of course, maybe everyone could get into the habit of placing the newly designed "SarcMark" (available at a steal for $1.99 a download) at the end of their statements, and that will help alleviate marketers' confusion.

This post was originally published on Smartplanet.com
