5 Signs that Social Listening and Social Analytics are Gaining Traction in Market Research

On October 26th I participated in the first Research Analytics: Digital Advance Research (RA:DAR) symposium in New York, organised by ESOMAR. The symposium took place at a superb venue in the AOL offices: a 21st-century conference venue with live streaming capability. I think the future of conferences will be to combine the physical and virtual presence of delegates, who can interact with the speaker and audience regardless of whether they are in the room or at their desk anywhere in the world.

Michalis Michael at ESOMAR RA:DAR

I had the privilege of chairing the social listening session, which was more of an interactive session that helped the delegates better understand how to:

  1. eliminate irrelevant social media posts from a search on social media about a specific product category
  2. create a hierarchical taxonomy
  3. train a machine learning algorithm to predict brand sentiment

Two case studies were presented by Norine McDonald from ICOS and David Rabjohns from Motivequest, with the latter winning the “best presentation” award for his presentation on how Prius increased its sales 88% above expectations.

My impression from this meeting and other recent interactions is that we are getting closer and closer to making social listening a mainstream discipline, suitable for multiple verticals and use cases in market research. The following 5 signs support this impression: 

  1. the few market research end-clients at the symposium reached out and asked lots of questions, and some even proposed a one to one meeting to further explore possibilities 
  2. market research agencies from different countries mentioned that more and more clients are asking them to integrate social listening and analytics in their research projects
  3. a pilot with a blue-chip multinational and a multinational market research agency is under way to establish the correlation between the net promoter score (NPS) from their monthly surveys and DigitalMR’s net sentiment score (NSS) from social media monitoring
  4. a project was commissioned by a private equity firm that wants to conduct commercial due diligence on a company it is considering acquiring
  5. two regional market research agencies in LATAM recently became partners of DigitalMR in order to offer listening247 to their clients

There are varying opinions about the total spend on social listening, ranging from US$ 600 million to over US$ 1 billion. Of course, we do not know how much of this is for consumer insights versus the already established spend on social media monitoring tools for PR purposes. Either way, it is still a small fraction (roughly 1% to 2%) of the US$ 60 billion spent on market research globally. The traction we observe, however, indicates that this amount will grow exponentially during the coming years.

The main difference between Ovum, Forrester, and Gartner

Ovum, Forrester, and Gartner are the three largest technology analysts in the world. Recently, two of the three published reports and articles about DigitalMR and its platforms for social listening and online communities; we thought we should return the favour and speak highly of the two, while we shame the third.

They like to call themselves research companies, which is a bit confusing for the market research sector, traditionally about surveys and focus groups. Their kind of research is different; the main customer of these three companies is the CTO/CIO of an organisation, as opposed to the consumer insights or MR department (our customer), which usually sits under marketing.

A good place to start, when looking for definitions, is Wikipedia:

“Ovum is an independent analyst and consultancy firm headquartered in London, specializing in global coverage of IT and telecommunications industries. It began operations in 1985 and by 2005 claimed to be the largest technology analyst firm that was headquartered outside of the United States.”

“Forrester Research is an independent technology and market research company that provides advice on existing and potential impact of technology, to its clients and the public. Forrester Research was founded in July 1983. Revenue: 300 MM US$”

“Gartner, Inc. is an American information technology research and advisory firm providing technology related insight headquartered in Stamford, Connecticut, United States. The company was founded in 1979. Revenue: 2.021 B US$”

Ovum is probably the smallest of the three in revenue terms (no accurate number found), but according to its Wikipedia entry: “in 2012, Ovum was jointly named Global Analyst Firm of the Year by the Institute of Industry Analyst Relations (IIAR). Ovum was ranked higher than global players Gartner, IDC, Forrester and Frost & Sullivan across 8 out of 15 criteria, including objectivity of research and advice, ease and quality of consulting engagement, ease of finding and using research and value for money.”

Whatever their Wikipedia definitions are, the bottom line is that all three are, at their core, technology analyst companies.

From the admittedly narrow perspective of DigitalMR, there is a very specific way to differentiate and rank the three companies mentioned in the title of this post; it may be a selfish perspective but it provided the excuse to return the favour to the two that wrote about DigitalMR. One of them wrote about both listening247 and communities247, one wrote only about our online communities platform, and the third, neither/nor; I am sure you can see where this is going…:

  1. Ovum is ranked first as they discovered both our platforms and thought that there was something really unique about the way we integrate social listening with communities online.

  2. Forrester is ranked second because they only wrote about communities247, albeit twice. Even though we talked to them about listening247 and the fact that it delivers sentiment accuracy over 80% in any language and topic precision over 85% at the first hierarchical level, they did not find it important enough to inform their subscribers. I personally think there is a lot more innovation in listening247 than in communities247. A lot more R&D dollars (pounds) have gone into our social listening and analytics, but maybe Forrester is taking its time because it is much bigger (read less agile) than Ovum.

  3. You guessed it: Gartner published nothing about DigitalMR as of the date of publishing this post, thus depriving their subscribers of the best kept secret in market research: DigitalMR. I admit I could be a little biased (you know, being the founder and all) but the ranking is quite objective when it comes to the “metric” we chose to rank these three companies. Funnily enough, Gartner is 7 times the size of Forrester in revenues, which might explain the findings of this highly scientific research paper.

In summary:

  • Ovum discovered a large and rare diamond and shared it with their subscribers here.
  • Forrester discovered a precious stone but they do not explain its full value yet.
  • Gartner is keeping its subscribers in the dark when it comes to social listening and online communities combined.

If you are not an Ovum or a Forrester client you can still buy the reports about DigitalMR for a few hundred dollars by clicking on the links above.

I kind of like the fact that we have turned the tables on them. Up until now, only they had the prerogative to write about us technology companies; this has changed forever as of today. Beware, technology analysts who think of yourselves as research companies: this is (hopefully) the first of many articles and reports from tech companies (your targets) about YOU.

P.S. The Gartner logo is no longer displayed in this blog post following an email from the Office of the Ombudsman at Gartner, asking for it to be removed. Frankly we believe that any publicity is good publicity and Gartner may be behaving like a sore loser :) in this case!

5 Tips To Reduce Noise In Social Listening

Warning: There is a lot of noise

When we say noise in a social listening or social analytics context, we mean posts that are irrelevant to the subject being researched. If, for example, our social media monitor is created to harvest online posts about beer, the search query will be structured around brands of interest and other beer-related keywords. It is horrifying to consider that 80%-90% of what an initial harvesting query (of online posts) returns is irrelevant, i.e. noise.

So how do we get rid of this noise?

Here are our top 5 tips on how to significantly reduce noise from your social listening reports:

1. To build the search queries, appoint a team of intelligent humans with plenty of common sense and a great vocabulary in the language being harvested for social media monitoring, including colloquialisms and slang.

2. The researchers should have an intimate knowledge of the research subject or product category. For example, if the category is cars it would help if one of the team members was a “petrol head”, or if it is watches someone should be a watch enthusiast who knows most of the makes.

3. There is no substitute for thorough research before you create your first search query. It is important to discover as many synonyms and homonyms as possible so that the FIRST search query will be informed accordingly. An example of a homonym is apple (computers) and apple (fruit) or mine (for gold) and mine (that explodes). If we are interested in harvesting posts about Apple the company and their products, then our search query should exclude posts about apple the fruit.

4. The best method for tip 3. (above) is to use Regular Expression queries, which are usually combined with Boolean logic. A simple Boolean query for the apple example would look like this: apple AND (computer OR phone) NOT (juice OR fruit).

5. After we run the first regular expression query that will harvest the first batch of online posts for us, our intelligent researchers (from tip 1.) will check a large enough random sample of our social media posts and search for irrelevant posts. Once this is done, patterns of noise will be identified so that search query version 2 can be created, this time avoiding harvesting posts that were identified as irrelevant. This is an iterative process that goes on for as long as our human researcher finds patterns that can become part of our regular expression string, which will exclude the noise.
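The apple query from tip 4 can be sketched in Python as follows. This is only a minimal illustration under our own assumptions: the function names and keyword lists are hypothetical, and commercial harvesting tools each have their own query syntax. A real query string would be far longer.

```python
import re

# Hypothetical keyword lists for the "apple" example from tip 4.
# \b marks a word boundary, so "pineapple" does not count as "apple".
MUST_HAVE = re.compile(r"\bapple\b", re.IGNORECASE)
CONTEXT = re.compile(r"\b(computer|phone)s?\b", re.IGNORECASE)
EXCLUDE = re.compile(r"\b(juice|fruit)s?\b", re.IGNORECASE)


def is_relevant(post: str) -> bool:
    """Apply: apple AND (computer OR phone) NOT (juice OR fruit)."""
    return bool(
        MUST_HAVE.search(post)
        and CONTEXT.search(post)
        and not EXCLUDE.search(post)
    )


def filter_posts(posts):
    """Keep only the posts that pass the Boolean query."""
    return [post for post in posts if is_relevant(post)]
```

Each noise pattern discovered in tip 5 would typically become one more term in the exclusion list, which is how the query grows from one line to several pages.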

By the end of these noise cleaning iterations, in most cases we end up with regular expression strings that are multiple pages long; this process usually takes a few days for a skilled team. The good news is that we only need to go through this arduous process once, when setting up the social listening for a product category for the first time. Because languages are alive and tend to evolve, the search queries used for social listening harvesting will likely have to be reviewed once a year. We should look for new words, phrases and acronyms that become popular and are relevant (to be included) or irrelevant (to be excluded).
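The sample check that drives each iteration can be sketched as below. This is a hedged illustration, not our production process: the helper names are hypothetical, and the relevance flags are assumed to come from the human researchers who read each sampled post.

```python
import random


def draw_review_sample(posts, sample_size, seed=0):
    """Draw a reproducible random sample of harvested posts for human review."""
    rng = random.Random(seed)
    # Never ask for more posts than were harvested.
    k = min(sample_size, len(posts))
    return rng.sample(posts, k)


def noise_rate(relevance_flags):
    """Percent of reviewed posts that the humans flagged as irrelevant.

    relevance_flags holds one boolean per reviewed post:
    True means relevant, False means noise.
    """
    if not relevance_flags:
        raise ValueError("no reviewed posts")
    noisy = sum(1 for flag in relevance_flags if not flag)
    return 100.0 * noisy / len(relevance_flags)
```

Tracking the noise rate from one query version to the next gives a simple signal for when the iterations can stop.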

It is the users of DIY social media monitoring tools such as Sysomos, Brandwatch, Radian6 or Meltwater Buzz that I am the most concerned about; if they are not familiar with the above issues, then they are probably analysing data on beautiful dashboards and sharing them with supervisors and other colleagues with pride, not knowing that the proverbial “garbage in, garbage out” applies in an extreme form.

Please do share your own experiences on how you deal with noise reduction from social media.

Precision and Recall in Social Listening

For the last 4 years, we have been talking about the importance of sentiment accuracy in social listening. When people asked: “What is sentiment accuracy?” we responded along these lines:

• 80% sentiment accuracy means: if you are given 100 posts from the web about your brand that are annotated with positive, negative or neutral sentiment, you will agree with 80 of them and disagree with 20.

• 80% sentiment accuracy means: if you are given 100 positive posts from the web about your brand, only 80 will be positive; the rest will be negative, neutral or irrelevant.

We then went on to explain that 100% sentiment accuracy is not attainable because even humans do not agree among themselves. In 10%-30% of cases, there may be a lack of consensus on whether a post is positive, negative or neutral. If we can accept that ambiguity will always exist, due to sarcasm and other complex forms of expression, then how do we expect a machine learning algorithm to agree with all of the humans checking the data?

Maybe at this point we should also explain that in social listening, the most popular way to check sentiment accuracy is to extract a random sample of 1000 posts and have 2-3 humans manually annotate them with sentiment. We then compare these annotations with the sentiment that the algorithm has assigned to each of the posts and determine the percent agreement between the human curators and the algorithm.
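The percent agreement check can be sketched as follows. This is a minimal illustration, assuming each annotator and the algorithm produce one label per post in the same order; the function name and the label strings are hypothetical.

```python
def percent_agreement(labels_a, labels_b):
    """Percent of posts on which two sets of sentiment labels agree."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return 100.0 * matches / len(labels_a)


# Hypothetical labels for five posts.
algorithm = ["pos", "neg", "neu", "pos", "neg"]
human_1 = ["pos", "neg", "pos", "pos", "neu"]

print(percent_agreement(algorithm, human_1))  # 3 of 5 posts match: 60.0
```

Running the same function on pairs of human annotators shows how much the humans disagree among themselves, which puts the algorithm's score in context.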

As clients of social media monitoring become more sophisticated, they start asking questions like: “When you say accuracy do you mean precision or recall?” If the vendor is one of the usual suspects that offer social media monitoring tools, then chances are that they will not understand the question. For them, we share here a simple Wikipedia definition: “In simple terms, high precision means that an algorithm returned substantially more relevant results than irrelevant, while high recall means that an algorithm returned most of the relevant results.” Another more detailed definition provided on Wikipedia is this:

“In a classification task, the precision for a class is the number of true positives (i.e. the number of items correctly labelled (by the algorithm) as belonging to the positive class) divided by the total number of elements labelled (by the algorithm) as belonging to the positive class (i.e. the sum of true positives and false positives - which are items incorrectly labelled as belonging to the class). Recall, in this context, is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e. the sum of true positives and false negatives - which are items that were not labelled as belonging to the positive class but should have been).”
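The definition above translates into a few lines of Python. This is a minimal sketch with hypothetical names and labels; returning 0.0 when a denominator is zero is our own convention for the example, not part of the definition.

```python
def precision_recall(predicted, actual, positive_class):
    """Return (precision, recall) for one class, e.g. positive sentiment."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    # True positives: labelled as the class and really belonging to it.
    tp = sum(1 for p, a in pairs if p == positive_class and a == positive_class)
    # False positives: labelled as the class but belonging elsewhere.
    fp = sum(1 for p, a in pairs if p == positive_class and a != positive_class)
    # False negatives: belonging to the class but labelled as something else.
    fn = sum(1 for p, a in pairs if p != positive_class and a == positive_class)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for five posts.
predicted = ["pos", "pos", "pos", "neg", "neu"]
actual = ["pos", "pos", "neg", "pos", "pos"]
precision, recall = precision_recall(predicted, actual, "pos")
```

In this example the algorithm labelled three posts as positive and two of them really were (precision 2/3), while four posts were actually positive and it found two of them (recall 1/2).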

Although we simply called it “accuracy” over the past 4 years for simplicity’s sake, we know that what we were really describing was “precision”. Now that consumer insights managers have started getting involved in social listening, we need to adapt the way we vendors talk and explain the new terms. Here we should add that precision and recall are not only relevant for measuring sentiment accuracy; we can also use them to measure semantic accuracy, i.e. how accurately a solution can report topics and themes of online conversations.

Also, I doubt that these definitions are on the radar of ESOMAR, MRS, MRA or CASRO. If this is true, I suggest that the market research associations start defining how the accuracy of social listening data is measured, for the sake of all the market research companies and clients looking for guidance. If they need help, we, the practitioners of social listening and analytics, are here to offer a helping hand in better defining the market research methods of the future.

Image source: By Walber (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

The Six 3 Letter Acronyms You Should Thoroughly Understand If You Are In Business


A couple of years ago we published an eBook with the title: “The five most important social media acronyms”. When I was asked to be the keynote speaker at an event for entrepreneurs in London a few weeks ago, the notion of the 5 TLAs (Three Letter Acronyms) resurfaced, and ended up being central to the talk. What the entrepreneurs wanted to know was how to come up with and execute their own social media strategy; I was very pleased to realise that the eBook content was not only still valid, it was also somewhat predictive at the time it was published. There was a 6th TLA that had to be added though… POC (Private Online Communities)… so here are the (now) 6 TLAs that every business should understand and employ “seamlessly integrated” (I know it’s a tired cliché but very true and necessary in this case):

The best way to showcase the importance of these 6 TLAs and the way they should be “seamlessly integrated” is to weave them into a short story. Here goes:

“Fiona, the marketing director of Sunbucks (a chain of coffee shops), wants to look at web listening for one of her company’s brands. She googles “web listening” and DigitalMR comes up first on the first page of Google (1. SEO/2. SEM). She clicks on the link that takes her to the social listening page on the DigitalMR website. Once there, Fiona watches the video clip and reads a few lines about the benefits and differentiators of listening247. She then clicks on the call-to-action button to request a demo. She is taken to a landing page that was created by Ellen, a DigitalMR marketer who is not a scripter/programmer, but just knows how to use the intuitive and simple CMS (3.). Once on the landing page, she enters her details in order to request a demo for social media monitoring (5. SMM). Fiona is now registered as a lead in the CRM (4.) and Ellen contacts her via email in order to arrange an online demo with one of the DigitalMR consultants. During the demo, Colin, the DigitalMR consultant, demonstrates how social listening and social analytics are done, and explains the importance of discussion themes, sub-themes, and sentiment accuracy. Fiona asks about the possibility of finding influencers and using them as brand ambassadors. This prompts Colin to explain the power of integrating social listening with an online community (6. POC) for co-creation and customer advocacy.”

A more generic way to explain the use and connection of the 6 TLAs is outlined in the list below:

  1. Be present at the Zero Moment of Truth (ZMOT) when prospective customers will search for products and services in your sector (SEO/SEM). Become part of the conversation.

  2. With a simple and intuitive CMS maintain full control of your website’s content, updating the latter as often as possible in order to continuously optimise your SEO. A blog on the website serves this purpose quite well.

  3. After prospects find your digital content through online search (e.g. Google) they should provide their contact details in one of your inbound/content marketing landing pages in order to access your valuable content. The lead contact details are stored in your CRM so that you can nurture them toward a sale.

  4. In order to be part of the conversation on social media you have to understand which segments of your clients/prospects are out there posting, responding or just reading posts. You also need to understand which are the hot topics so that you can produce content around those topics/themes. The only way to do this is by having access to a social media monitoring and analytics tool (SMM).

  5. It gets better even though this is already impressive enough: Creative customers/prospects or influencers in your sector can be discovered through social listening and invited to join online communities (POC) so that they can help with the creation of digital content that will resonate with their peers. Not only that, they can then share the digital content which is the result of co-creation with their friends and networks.

With the addition of online communities we complete a full circle, connecting back to being part of the conversation when prospective customers ask a question or when they look for valuable content using search engines. The content created on an online community is much more likely to be sought after at the ZMOT, since it was created by the same people that it is targeting.

With an approach like the one described above, an organisation has the possibility to reach millions of customers without having to use any of the traditional mass media. Amplified customer advocacy is definitely a lot cheaper than TV commercials! On top of that, the messaging is more believable simply because it is not an advertisement; it is shared by other customers of the product or service, who can be trusted more than brand advertising.

Does this sound too good to be true?