Is there anything left unsaid about social media research and marketing?: Part 3

Part 3: Sentiment And Semantic Analysis

It took a bit longer than anticipated to write Part 3 of this series of posts about the content proliferation around social media research and social media marketing. In the previous two parts we talked about Enterprise Feedback Management (December 2013) and Short (Event-Driven) Intercept Surveys (February 2014). This post is about sentiment and semantic analysis: two interrelated types of analysis at the heart of the “race” to reach the highest sentiment accuracy a social media monitoring tool can achieve. From where we sit, this seems to be a race that DigitalMR is running on its own, competing against its own best score.

The leading academic institution in this field, Stanford University, announced a few months ago that it had reached 80% sentiment accuracy; it has since raised that figure to 85%, but only in the English language and only for comments in one vertical, namely movies - a rather straightforward case of “I liked the movie” or “I did not like it and here is why…”. That is not to say there will be no people sitting on the fence about a movie, but even neutral comments in this case carry less ambiguity than in other product categories or subjects. The DigitalMR team of data scientists has been consistently achieving over 85% sentiment accuracy in multiple languages and multiple product categories since September 2013; this is when a few brilliant scientists (mainly engineers and psychologists) cracked the code of multilingual sentiment accuracy!

Let’s dive into sentiment and semantics in order to take a closer look at why these two types of analysis are important and useful for next-generation market research.

Sentiment Analysis

The sentiment accuracy of most automated social media monitoring tools (we know of about 300 of them) is lower than 60%. This means that if you take 100 posts that are supposed to be positive about a brand, only 60 of them will actually be positive; the rest will be neutral, negative or irrelevant. This is almost like the flip of a coin, so why do companies subscribe to SaaS tools with such unacceptable data quality? Does anyone know? The caveat around sentiment accuracy is that the maximum achievable accuracy using an automated method is not 100% but rather 90%, or even less. This is because when humans are asked to annotate the sentiment of a set of comments, they will disagree at least 1 time in 10. DigitalMR has achieved 91% in the German language, but that accuracy was established by 3 specific DigitalMR curators; if we were to have 3 different people curate the comments, we might arrive at a different figure. Sarcasm - and, more generally, ambiguity - is the main reason for this disagreement. Studies of large numbers of tweets (such as the one described in the paper “Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews”) have shown that fewer than 5% of the tweets reviewed were sarcastic. The question is: does it make sense to solve the problem of sarcasm in machine learning-based sentiment analysis? We think it does, and we find it exciting that no one else has solved it yet.
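To make the accuracy figures above concrete, here is a minimal sketch (in Python; not DigitalMR’s actual pipeline) of how sentiment accuracy is typically measured: machine-assigned labels are compared against labels assigned by human curators, and the share that match is the accuracy. The same comparison between two human curators gives the raw agreement that sets the roughly 90% ceiling mentioned above. All labels below are invented purely for illustration.

```python
# Minimal sketch of measuring sentiment accuracy against human curation.
# The labels are hypothetical; this is not DigitalMR's actual methodology.
from collections import Counter


def sentiment_accuracy(predicted, reference):
    """Share of posts where the predicted label matches the reference label."""
    assert len(predicted) == len(reference)
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


# Machine output vs. one human curator (illustrative data only).
machine = ["pos", "pos", "neg", "neu", "pos", "neg", "neu", "pos", "pos", "neg"]
human_1 = ["pos", "neu", "neg", "neu", "pos", "neg", "pos", "pos", "pos", "neg"]
# A second human curator, to show that even humans disagree about 1 time in 10.
human_2 = ["pos", "neu", "neg", "neu", "pos", "neg", "pos", "pos", "neg", "neg"]

print(f"Machine vs. human curation: {sentiment_accuracy(machine, human_1):.0%}")
print(f"Human vs. human agreement:  {sentiment_accuracy(human_1, human_2):.0%}")
print("Label distribution (machine):", Counter(machine))
```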

Automated sentiment analysis allows us to create structure around large amounts of unstructured data without having to read each document or post one by one. We can analyse sentiment by brand, topic, sub-topic, attribute, topic within brand and so on; this is where social analytics becomes a very useful source of insights on brand performance. The WWW is the largest focus group in the world and it is always on. We just need a good way to turn qualitative information into robust, contextualised quantitative information.
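As an illustration of what “creating structure” looks like in practice, the sketch below (hypothetical data and field names, not the eListen schema) rolls posts that have already been classified up into sentiment shares by brand and topic - the kind of breakdown referred to above.

```python
# Minimal sketch: aggregate classified posts into sentiment shares
# by brand and topic. Field names and data are hypothetical.
from collections import Counter, defaultdict

posts = [
    {"brand": "BrandA", "topic": "taste", "sentiment": "positive"},
    {"brand": "BrandA", "topic": "price", "sentiment": "negative"},
    {"brand": "BrandA", "topic": "taste", "sentiment": "positive"},
    {"brand": "BrandB", "topic": "taste", "sentiment": "neutral"},
    {"brand": "BrandB", "topic": "price", "sentiment": "positive"},
]

summary = defaultdict(Counter)
for post in posts:
    summary[(post["brand"], post["topic"])][post["sentiment"]] += 1

for (brand, topic), counts in sorted(summary.items()):
    total = sum(counts.values())
    shares = ", ".join(f"{s}: {n / total:.0%}" for s, n in counts.items())
    print(f"{brand} / {topic}: {shares}")
```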

Semantic Analysis

Some describe semantic analysis as “keyword analysis”, which could also be referred to as “topic analysis”; as described in the previous paragraph, we can even drill down to report on sub-topics and attributes.

Semantics is the study of meaning and of understanding language. As researchers we need to provide the context that goes along with the sentiment, because without the right context the intended meaning can easily be misunderstood. Ambiguity makes this type of analytics difficult: for example, when we say “apple”, do we mean the brand or the fruit? When we say “mine”, do we mean the possessive pronoun, the explosive device, or the place from which we extract useful raw materials?
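A very simplified way to resolve this kind of ambiguity is to look at the words surrounding the ambiguous term. The sketch below is a toy example with made-up cue words, not eListen’s actual semantic engine; it decides whether “apple” refers to the brand or the fruit from its context.

```python
# Toy word-sense disambiguation: pick the sense whose cue words
# co-occur with the ambiguous term. Cue lists are made up for illustration.
SENSE_CUES = {
    "Apple (the brand)": {"iphone", "mac", "ios", "store", "launch"},
    "apple (the fruit)": {"eat", "juice", "pie", "orchard", "crunchy"},
}


def disambiguate(post: str) -> str:
    tokens = set(post.lower().split())
    scores = {sense: len(tokens & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "ambiguous"


print(disambiguate("Just queued at the Apple store for the new iPhone"))
print(disambiguate("This apple pie recipe is crunchy and delicious"))
print(disambiguate("I love apple"))  # no cues nearby, so it stays ambiguous
```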

Semantic analysis can help:

  • extract relevant and useful information from large bodies of unstructured data, i.e. text
  • find an answer to a question without having to ask anyone!
  • discover the meaning of colloquial speech in online posts
  • uncover the specific meanings of words used in foreign languages mixed with our own

What does highly accurate sentiment and semantic analysis of social media posts mean for market research? It means that a US$50 billion industry can finally divert some of its spending - from asking questions of a sample using long and boring questionnaires - to listening to the unsolicited opinions of the whole universe (census data) of their product category’s users.

This is big data analytics at its best, and once there is confidence that sentiment and semantics are accurate, the sky is the limit for social analytics. Think about detecting and scoring specific emotions, not just varying degrees of sentiment; think about automated relevance ranking of posts so that they are allocated to the correct vertical reports; think about rating purchase intent and thus identifying hot leads. After all, accuracy was the main reason why Google beat Yahoo and became the most used search engine in the world.

The Positive Effect of Negativity

Whenever something unexpected happens in life, some of us pause; we look up and try to figure it out. Especially if you are curious - as good researchers are supposed to be - discovering a paradox can lead to a hypothesis, and trying to prove it can be real fun. When we came across the paradox described in our latest blog post, we investigated and analysed it to death, and then we started thinking about similar cases that might support our hypothesis, i.e. that negativity, under certain circumstances, may have a positive effect.

Click here to download a free eBook on The Positive Effect of Negativity

What seems to have happened in the case of Coca-Cola is that passive social media users saw the racist comments directed at the ad and decided to come to its defence. They expressed their dissatisfaction with the people posting those negative comments, often using very strong language themselves. Indirectly, these negative comments about the negative posts “attacking” Coca-Cola can be considered positive for the brand; in simple terms, negative about the negative equals positive - just as multiplying two negatives in mathematics gives a positive.

Is it possible that we are looking at a new marketing phenomenon, applicable both in business and in politics, on how to harness ‘The Positive Effect of Negativity’? In a nutshell: a brand that would like to stir up some noise around itself makes a controversial statement with its communications - but hopefully one that is controversial only for a small and evil group of people. When that minority reacts against the brand, the passive and sleepy majority wakes up and defends the brand against evil; the positive effect of negativity may simply be turning apathy into action.

To find out more, download our eBook on ‘The Positive Effect of Negativity’.

The Amazing Paradox of Negative PR for Coca-Cola

Most people would agree that any PR is good PR; however, I don’t think I was the only one thinking that Coca-Cola did NOT see it coming on February 3rd, the day after the Superbowl. I kept thinking that the brand was damaged, that sales would be affected negatively, and that some heads were probably rolling within TCCC (The Coca-Cola Company) in Atlanta. A simple piece of pre-advertising research could have told them that some people in the US are so patriotic - or perhaps we can even use the word racist - that they would feel offended by people of other ethnicities singing ‘America the Beautiful’ in their own languages rather than in English. All hell broke loose on Twitter and other social media platforms immediately after the ad was aired, and it continued for the following days and weeks.

Click here to download a free Coca-Cola Superbowl ad case study

We had initially harvested the posts with the intention of publishing a blog post showcasing eListen’s sentiment accuracy of over 85%; however, since this wasn’t a paying project, it kept falling to the bottom of the priority list to process and analyse the data and create some content around it. Our thought was to use the “sales by fear” approach: we wanted to tell all the brands out there that they need to be constantly “listening” to online chatter about themselves and their competitors in order to handle situations as they arise. They should not allow their brands to be exposed to the risk of negative PR and loss of brand equity, something potentially catastrophic in real business terms.

According to DigitalMR’s findings, during the 8 days prior to the Superbowl there were 139,997 posts about Coca-Cola in the English language: 22% negative, 7% positive and 71% neutral. During the 8 days following the airing of the ad, the number of posts increased by 169% to 376,382. The interesting fact here is that although post volume grew by 169% after the ad aired, negative posts still accounted for 22% of the total, while the share of positive posts jumped to 51%.
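For readers who want to check the arithmetic, the short snippet below reproduces the 169% increase and the implied growth in positive posts from the figures quoted above (the counts come straight from this paragraph; the derived numbers are approximate).

```python
# Quick check of the figures quoted above.
before, after = 139_997, 376_382   # posts in the 8 days before / after the ad aired
increase = (after - before) / before
print(f"Growth in post volume: {increase:.0%}")   # ~169%

# Negative share held at 22%, while the positive share jumped from 7% to 51%.
pos_before, pos_after = 0.07 * before, 0.51 * after
print(f"Positive posts: ~{pos_before:,.0f} -> ~{pos_after:,.0f}")
```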

If you want to find out more about what really happened, and what the actual effect of the campaign was, please click here to download a slide deck with our findings, and stay tuned for the upcoming eBook on ‘The Positive Effect of Negativity’.

 

Microsoft plagiarising DigitalMR, or is this just an innocent coincidence?

I think you would all agree that if an SME sees a large and globally renowned company standing behind the same arguments and spreading the same message, its first thought would be: “We must be doing something right!”

DigitalMR has been developing its eListen solution for the past couple of years, trying to convey that every brand needs to LISTEN to its customers in order to take the correct steps towards brand management and growth. Using terms such as ‘are you listening?’ and ‘there is a lot of noise’, we have been promoting eListen as a tool that “listens for you” to provide actionable insights, while we make sure we eliminate the noise from the data and provide deliverables with high sentiment accuracy. eListen is language agnostic and can reach up to 90% sentiment accuracy, enabling brands to identify their influencers, track and measure their reach, and see how they compare to the competition.

You can imagine our surprise when, a few days ago, we discovered that Microsoft is using terms almost identical to ours for its own social listening tool, a service available in its Dynamics CRM. This new feature of the CRM only appeared after the acquisition of Netbreeze in 2013, so I can’t help but wonder if Netbreeze had been following us all along...

It seems we are on the same page as far as the importance of listening, sentiment accuracy and the elimination of noise are concerned, and their eBook ‘Your Brand Sux – Turning Social Sentiment into Opportunity’ explains in full detail what we have been saying for so long. The difference between Microsoft and DigitalMR is that they are a company with a purely technological background, while we are a unique combination of technologists and market researchers, which is what enables us to provide insights in the first place.

It is only natural that competition will intensify as market research shifts towards digital and more brands start realising how important and useful social media listening can be. With that in mind, the fact that a large company like Microsoft has “adopted” our messages at such an advanced stage of development is definitely perceived as some form of reassurance that we are on the right track.

You can find out more about eListen in the Solutions section of our website.