How Sentiment Analysis Helps Find Opinion Spam

July 26, 2022


Customers increasingly rely on online customer reviews at the point of purchase. Reading reviews is considered one of the final stages in the purchase path, with a significant impact on converting a potential customer into an actual customer.

Unfortunately, with the ease of posting anonymous content on digital media, the prevalence of fictitious reviews, referred to as opinion spam, has grown significantly over time. Accordingly, there is a growing concern among both businesses and customers about the credibility of online reviews in general. Compounding matters further, research has shown that artificial intelligence (AI) can be successfully used to create sophisticated fake reviews that are difficult to distinguish from real reviews and are considered highly reliable by unsuspecting readers. Therefore, distinguishing this online spam from genuine reviews has become a pressing issue.

Sentiment analysis, also called opinion mining, is the practice of analyzing people’s opinions, evaluations, sentiments, and attitudes as expressed in written language. Spam detection has steadily emerged as one of its major sub-areas.

Here are some of the common techniques used to identify opinion spam and the challenges they present:

  1. Anomaly patterns in review ratings: A rating is treated as a representation of the reviewer’s sentiment orientation, so anomalies in review ratings are analyzed to detect fake reviews. However, a rating may not fully represent the sentiment: two reviews with the same rating may have different content and therefore a different impact on the reader.

  2. Lack of domain-specific vocabulary: This technique rests on the belief that genuine reviews contain more product- or service-specific details, while fake reviews contain more generic thoughts. However, a business could solicit fake reviews from its employees or incentivize customers to write them, so this method does not work for fake reviews written by domain experts.

  3. Reviewer behavior: Frequent itemset mining may be applied to a combination of:
    1. Publicly accessible data: reviewer name, time and frequency of posting reviews, first reviewers of products, and
    2. Private data held by the review website: IP address, time taken to post a review, etc.

     The aim is to identify behavioral abnormalities such as multiple user IDs created by a single reviewer or several similar reviews for one product/service. These characteristics are then modeled to detect individuals or groups engaged in opinion spamming. However, spammers who modify their writing style to appear as genuine reviewers may go undetected.

  4. Text analysis: Various text analytics approaches build detection models from signals such as:
    1. Lexical features such as word and part-of-speech n-grams, lexical density and other lexical attributes.
    2. Content and writing style similarities between different reviewers.
    3. Inconsistency in the semantics. For example, a reviewer wrote “My husband and I visited this hotel …” in one review, and then in another review the same reviewer wrote “My wife really loves …”.

     However, text analytics falls short of evaluating the sentiment of a review, which is determined by the context in which certain terms are used and by the tone of the review.
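
As a rough illustration of the similarity signals in the list above (several similar reviews for one product/service, and content similarities between different reviewers), the sketch below compares reviews by the overlap of their word bigrams using Jaccard similarity. It is only one simple way to approximate such a signal, not a description of any particular production system. The review IDs, the sample texts and the 0.5 threshold are all assumptions made for the example.

```python
from itertools import combinations


def word_ngrams(text, n=2):
    """Return the set of lowercase word n-grams in a review."""
    words = [w.strip(".,!?;:\"'()") for w in text.lower().split()]
    words = [w for w in words if w]
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_review_pairs(reviews, threshold=0.5, n=2):
    """Return (id_a, id_b, score) for review pairs at or above the threshold.

    `reviews` maps a hypothetical review ID to its text. The threshold is an
    illustrative value, not a recommended setting.
    """
    grams = {rid: word_ngrams(text, n) for rid, text in reviews.items()}
    pairs = []
    for a, b in combinations(sorted(grams), 2):
        score = jaccard(grams[a], grams[b])
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


# Made-up sample data: r1 and r2 differ only in their last word.
reviews = {
    "r1": "Great hotel, friendly staff and a very clean room.",
    "r2": "Great hotel, friendly staff and a very clean lobby.",
    "r3": "The battery died after two days and support never replied.",
}
flagged = similar_review_pairs(reviews)
print(flagged)
```

Pairs flagged this way are candidates for closer inspection rather than confirmed spam, since genuine reviewers can also describe the same experience in similar words.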

Why must companies invest in opinion spam detection and tracking?

Opinion spam is easy to produce and easy to repeat. Spammers find creative ways to avoid detection and continue to post opinion spam even after earlier spam has been detected and deleted. Treating spam detection as a one-off rather than an ongoing exercise may prove costly for businesses; companies must invest in mechanisms that continuously track reviews of their products, services and brand and eliminate opinion spam, so that users get a genuine picture of the business. Depending on factors such as the industry domain and the nature of the business, organizations can evaluate their optimal tracking frequency. For example, for a specific client in the information technology domain, we found that opinion spam usually spiked when the client published a new product release, and so we designed the client’s tracking frequency to match its product release schedule.

What is the right technique to detect opinion spam?

While each of the techniques above has its own challenges, infoAnalytica combines two or more of them, drawing on supervised, unsupervised and semi-supervised machine learning, for more accurate detection of opinion spam on an ongoing basis. One example is combining the rating and the sentiment of reviews. Some reviews have a low rating but positive sentiment, and some have a high rating but negative sentiment. This contradiction between the rating and the sentiment polarity indicates the possibility of spam. Further, the strength of the sentiment, as indicated by the length of the review and its use of contextual details related to product/service features, experience, sales process, etc., can be incorporated for more accurate detection of opinion spam.
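
The rating-versus-sentiment check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not infoAnalytica’s actual implementation: the tiny word lists stand in for a real sentiment model, ratings are assumed to be on a 1 to 5 scale, and the `low`, `high` and `margin` cut-offs are hypothetical values chosen for the example.

```python
# Toy sentiment lexicon (assumption): a real system would use a trained
# sentiment model or a much larger lexicon.
POSITIVE = {"great", "excellent", "love", "friendly", "clean", "fast", "recommend"}
NEGATIVE = {"terrible", "broken", "rude", "dirty", "slow", "refund", "worst"}


def sentiment_polarity(text):
    """Return a score in [-1, 1] from counts of lexicon words (toy scorer)."""
    words = [w.strip(".,!?;:\"'()") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def contradicts(rating, text, low=2, high=4, margin=0.5):
    """True when a low rating comes with clearly positive text, or a high
    rating comes with clearly negative text (assumes a 1-5 rating scale)."""
    polarity = sentiment_polarity(text)
    if rating <= low and polarity >= margin:
        return True
    if rating >= high and polarity <= -margin:
        return True
    return False


# Made-up sample reviews as (rating, text) pairs.
reviews = [
    (5, "Terrible service, rude staff and a dirty room."),
    (1, "Excellent product, fast delivery, would recommend."),
    (5, "Great location and friendly staff."),
    (3, "It was fine."),
]
suspects = [(rating, text) for rating, text in reviews if contradicts(rating, text)]
for rating, text in suspects:
    print(rating, text)
```

With this sample data, the first two reviews are flagged because their ratings and text point in opposite directions. A contradiction only indicates the possibility of spam, so flagged reviews would still need the further signals mentioned above, such as sentiment strength and contextual detail.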
