Dirk Shaw's blog: How much sentiment analysis is good enough?

  • Ken Burbary · 5 months ago
    Dirk, thanks for sharing your insight/experience on this topic. It's one that hits close to home for me, as we've been dealing with the same challenges associated with social monitoring.

    While there is no argument from me about the importance and value of reliable sentiment data, I think we're still in the infancy period of getting sentiment analysis incorporated into existing processes and operations. NLP technologies are only so reliable when it comes to automating sentiment analysis. And as you've pointed out in this post, doing things the old-fashioned way (all human effort) has its own shortfalls too. The space will need to mature before marketers and brands can effectively use sentiment analysis to its full potential.
  • dirkmshaw · 5 months ago
    Hey Ken,

    Could not agree more on the infancy stages. Another question this raises is switching cost. Imagine spending countless hours tagging and cleaning up data, and then you decide to move to a new platform. Does that mean starting over?

    Thanks. dirk
  • Ken Burbary · 5 months ago
    I'd say so. Platform lock-in seems like a real issue. I think the important point to take away from both of our blog posts on social media monitoring this week is that the space is very young and has flaws/shortcomings. As the vendors work to improve their products, we (marketers/users) need to work closely with them to iron out these issues. Social media monitoring remains a highly valuable activity, but there are some hidden "gotchas" that come along with operationalizing it. Hopefully, through collaboration, we can remove those and make the process easier to get going.
  • dirkmshaw · 5 months ago
    I agree. They (the tools) may also want to work with one another to set some standards, and possibly learn a thing or two from the CMS industry. Thanks again for your comments. As always, very on target and insightful.
  • Puja Madan · 5 months ago
    Dirk, what a timely post! I'll agree with Ken about the need for this space to mature. I've found the tools that automate sentiment don't always get it right. Even with them, some manual intervention is required. The other option, doing it all by hand, can be a mammoth task, especially for bigger brands. Tools such as MAP, Heartbeat (Sysomos) and VoxTrot (Crimson Hexagon) claim to do things differently, though I haven't used them...

    I wonder if brands are particularly worried about ALL sentiment though. Maybe for some the priority is to be able to sniff out the negative sentiment and take appropriate action. What do you think?
  • dirkmshaw · 5 months ago
    Hi Puja,

    The question about all sentiment is one I don't have a solid answer to either. I wonder if there is a percentage of total mentions that, once tagged, will give a brand a statistically accurate measure of sentiment.

    Thanks again for commenting and sharing these other tools.

    dirk
  • Rick Blythe · 4 months ago
    What sentiment are we referring to here? I track Pre-sentiment and Post-sentiment, as I think an interaction with a customer which changes a negative sentiment into a positive sentiment is where I can really show the effectiveness of my efforts. It also really looks good in reports :=)
  • Ken Burbary · 5 months ago
    Puja,

    One word: Segmentation

    I'd agree that not all sentiment is created equal. Brands should be prioritizing the sentiment of some consumer/customer types over others, depending on their goals/priorities.
  • Mark Evans · 5 months ago
    Dirk,

    You're right, sentiment is attracting a growing amount of attention because many companies want to quickly know if the conversations about their brands and products are positive, negative or neutral.

    If you'd like a demo of Sysomos' MAP and Heartbeat services - and how they automatically assess the sentiment of the conversations - please let me know.

    cheers, Mark
  • KDPaine · 5 months ago
    Yes. First of all, you need to be careful about your "30 seconds per record." I can promise you that if you are reading and analyzing comments as well, 30 seconds is not realistic. We figure our readers can code an average of 50 posts per hour.
    Secondly, random sampling 20% of mentions is statistically acceptable. You do, however, have to be clear with clients that you may well miss an important post, but you are just as likely to miss an important post for the competition as well.
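    To put a rough number on "statistically acceptable," here is a minimal sketch of the margin of error for a randomly sampled sentiment proportion. It assumes simple random sampling, the normal approximation, a finite population correction, and the worst case of a 50/50 split; the function name and the mention volumes are illustrative, not anyone's actual method.

```python
import math


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a sampled proportion.

    Uses the normal approximation with a finite population correction.
    proportion=0.5 is the worst case; z=1.96 corresponds to ~95% confidence.
    """
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size > 1:
        # Shrinks the error as the sample approaches the whole population.
        fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    else:
        fpc = 0.0
    return z * standard_error * fpc


# Hypothetical volumes: the same 20% sampling rate at different scales.
for total_mentions in (500, 5_000, 50_000):
    sample = int(total_mentions * 0.20)
    moe = margin_of_error(sample, total_mentions)
    print(f"{total_mentions:>6} mentions, {sample:>6} coded: +/- {moe:.1%}")
```

    Under these assumptions the same 20% rate gives a much tighter estimate at high volume than at low volume, so the absolute number of coded mentions matters more than the percentage.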
  • dirkmshaw · 5 months ago
    Hi KD,

    Thanks for commenting and sharing your insight. Do you also see a variance in time to tag based on the complexity of the products? Also, do you all offer outsourcing for sentiment analysis?

    Great stuff.
    Dirk
  • KDPaine · 5 months ago
    Yes, there's a huge variance between people twittering about toilet paper and dixie cups and posting about missile defense and energy issues. That's why we have an average per hour by type of client -- non-profit, education, defense, technology, consumer, etc. And yes, we prefer to call it "Northsourcing" since we're in the Great North Woods of NH :) but yes, we offer it and do a lot of it.
  • adamcohen · 5 months ago
    Great post, Dirk. Automated analysis can give you a jump start, but there is no way it can be left alone - there is always a need to include some level of manual intervention. Radian6's Amber Naslund and I had a spirited discussion on Twitter last week. An interesting angle on this is how "worth it" a given level of accuracy is. Let's say an automated tool alone provides 75% accuracy with sentiment. With some manual analysis you can beef that up to 90%. Would it be worth it if the automation could be higher? 85%? Less manual intervention would be required, but in my opinion there will always be some amount of manual intervention needed. For example, none of the automated tools "get" sarcasm. I've also seen in healthcare that most of the automated tools flag things incorrectly - when a blog post about a prescription drug uses words like "inhibit" or "destroys," it is often because the drug is working, yet those words can be flagged as "negative."
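    A back-of-the-envelope sketch of that trade-off: how much manual review it takes to lift automated accuracy to a target. It assumes, purely for illustration, that reviewers check a random share of mentions and fix every error they see; real tools and teams will differ, and the function name is hypothetical.

```python
def review_fraction_needed(auto_accuracy, target_accuracy):
    """Share of mentions a human must re-check to reach target_accuracy.

    Assumption: reviewed mentions are picked at random and end up 100% correct,
    so overall accuracy = auto_accuracy + share * (1 - auto_accuracy).
    """
    if not 0 <= auto_accuracy < 1:
        raise ValueError("auto_accuracy must be in [0, 1)")
    if not 0 <= target_accuracy <= 1:
        raise ValueError("target_accuracy must be in [0, 1]")
    if target_accuracy <= auto_accuracy:
        return 0.0  # the tool already meets the target on its own
    return (target_accuracy - auto_accuracy) / (1 - auto_accuracy)


# The 75% and 85% starting points from the comment, both aiming for 90%.
for auto in (0.75, 0.85):
    share = review_fraction_needed(auto, 0.90)
    print(f"{auto:.0%} automated -> review {share:.0%} of mentions to reach 90%")
```

    Under that assumption, going from 75% to 85% automated accuracy cuts the manual share from roughly three fifths of mentions to roughly one third, which is one way to frame whether the better tool is "worth it."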
  • dirkmshaw · 5 months ago
    Hey Adam,

    Worth is a tough question. In this model of tagging a mention with a rating regardless of manual or automated, the desired "worth" may be to optimize something in the future whether it be a campaign or a product.

    Another way to find more "worth" is to be able to extract the tagged or curated mentions and place them in various stages of the purchase path.

    This extends the definition of worth to talk about sentiment in the context of lead generation and new sales. It implies these monitoring solutions open up and allow you to re-purpose the mentions you have curated. Here is how this may come to fruition in the context of an online customer experience.

    http://dirkshaw.blogspot.com/2009/06/one-of-big...

    Thanks for commenting. Dirk
  • tonylopresti · 5 months ago
    Dirk, that is a great question! I think it depends on the goals of the analysis. If the goal is to find out at a high level the proportion of positive/negative comments for a certain product or brand, then sampling, even 5-10% of the data, is probably all you need to get a result with a high degree of confidence.

    However, if your goal is to understand WHY people are positive/negative or if you want to drill into the data a bit more in depth, sampling increasingly falls down. Consider the following questions:

    • What are the top-10 positive comment types about my brand?
    • What are the top-10 negative comment types about my brand?
    • What positive comments are emerging this month?
    • What negative comments emerged last month?
    • What is most important to my most valuable customers?
    • What is most important to males vs. females?

    Answering each of these questions requires a data size that is much greater than a 5-10% overall sample. The emerging issue question, for example, is impossible to answer without reviewing and coding EVERY record. It is for this reason that automated text analysis, sentiment and categorization technologies, like those developed by my company, Clarabridge, have caught on like wildfire over the past few years.

    If you just want a basic pulse, yes, sampling is fine. If you want to answer detailed questions like the ones above, in my opinion you need sophisticated technology. For a quick explanation of text analysis and mining see: http://tinyurl.com/howtextminingworks

    Regards,

    Tony
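    A small sketch of why a thin sample struggles with emerging or niche topics. The volumes and the topic share are hypothetical, and each sampled mention is treated as an independent draw (a binomial approximation), so this is an illustration rather than a description of any vendor's method.

```python
def expected_hits(total_mentions, sample_rate, category_share):
    """Expected number of sampled mentions that fall in one category."""
    return total_mentions * sample_rate * category_share


def chance_of_missing(total_mentions, sample_rate, category_share):
    """Approximate probability that the sample holds no mention of the category.

    Binomial approximation: every sampled mention independently belongs to
    the category with probability category_share.
    """
    sample_size = round(total_mentions * sample_rate)
    return (1 - category_share) ** sample_size


# Hypothetical month: 10,000 mentions, a 5% sample, and an emerging topic
# that makes up 0.5% of all mentions.
hits = expected_hits(10_000, 0.05, 0.005)
missed = chance_of_missing(10_000, 0.05, 0.005)
print(f"expected sampled mentions of the topic: {hits:.1f}")
print(f"chance the sample misses the topic entirely: {missed:.0%}")
```

    With only a handful of expected hits, the sample is fine for an overall positive/negative split but too thin to rank comment types or spot a new issue.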
  • connie_techrigy · 5 months ago
    Hi Dirk,
    Automated sentiment can work very well if the user takes the time to identify their dictionary needs and then tweak the dictionary. In SM2 it's quite easy to get 95%-plus accuracy.

    I had a customer that was a sponsor of the David Letterman show. I removed "jokes" and "hilarious" from the positive dictionary and added terms like "sponsorship," "boycott," "advertiser," etc. to trigger negative sentiment.

    Our customer was very pleased with the automatic sentiment while their brand was experiencing a crisis. Interestingly enough, that dictionary would have worked for all 20 sponsors. How long did it take me to identify and review the terms? Not long: about an hour. So it's very possible. You just need a robust tool.
    Connie
    Community Strategist | Techrigy
    @cbensen
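    For readers unfamiliar with dictionary tuning, here is a minimal sketch of the idea: a word-list scorer whose dictionaries are adjusted for one campaign. It is not SM2's implementation; the base word lists, function names, and sample mention are hypothetical, and only the removed and added terms come from the comment above.

```python
import re

# Hypothetical starting dictionaries; a real tool ships far larger lists.
BASE_POSITIVE = {"great", "love", "funny", "jokes", "hilarious"}
BASE_NEGATIVE = {"bad", "hate", "awful"}


def build_dictionary(base_positive, base_negative,
                     remove_positive=(), add_negative=()):
    """Return campaign-specific copies of the positive/negative word sets."""
    positive = set(base_positive) - set(remove_positive)
    negative = set(base_negative) | set(add_negative)
    return positive, negative


def score(text, positive, negative):
    """Label text by counting dictionary hits; ties are neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    balance = sum(w in positive for w in words) - sum(w in negative for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


# The tweak described in the comment, applied to the hypothetical base lists.
crisis_positive, crisis_negative = build_dictionary(
    BASE_POSITIVE,
    BASE_NEGATIVE,
    remove_positive={"jokes", "hilarious"},
    add_negative={"sponsorship", "boycott", "advertiser"},
)

mention = "Hilarious jokes last night, time to boycott every advertiser"
print(score(mention, BASE_POSITIVE, BASE_NEGATIVE))
print(score(mention, crisis_positive, crisis_negative))
```

    The same mention flips from one label to the other once the dictionary reflects the sponsor's situation, which is the effect the tweak is after.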