Everything you need to know about Text Analytics

Dec 18, 2019

12 mins read

Tanuj Diwan


Text Analytics: From sending a short message to our loved ones to writing a formal email to a colleague, we use text in various forms. It is the most authentic form of information exchange that has enabled mankind to evolve faster and for the better.

Each text that we use has a meaning and a purpose of its own. Emails. Short text messages. Tweets. Facebook statuses. Blogs. Live chat excerpts. Customer support notes. Survey results. The list of textual data is endless.

All the forms of textual data that we listed above are unstructured data. Apart from their textual semblance, from a data perspective, they are all different from each other.

For example, the data structure of a tweet is entirely different from that of a Facebook status post.

What if all these texts that a specific cohort of users send and receive can be placed under a microscope?

That would enable us to pick up the nuances of communication, unspoken messages, or even common behavioral traits. Text analytics is what enables all of this.

In simple terms, text analytics is all about converting unstructured data into meaningful data from which advanced insights can be picked up.

Text analytics is the process of picking up insights from clusters of textual data.

Demystifying Text Analytics

Imagine your business having 100 customer reviews. This collection of reviews is a cluster of textual data. They would have different words, dates, and even varying tones of customers — from good to bad or even worse.

Text analytics can sweep through these 100 reviews and figure out whether a review is praising the product/service or bashing it for any reason. But isn’t that something that a person can do manually?

In the olden days that used to be the case. But, in the past decade, as the volume of unstructured data exploded, the manual approach to text analytics was proven to be ineffective and unproductive.

The amount of labor required to carry out text analytics made the manual approach inefficient.

Walmart, an American retailer, handles about a million transactions every day (Source).

Imagine what a mammoth task it would be to analyze these transactions on a word-to-word basis.

Thanks to the exponential growth of computing power, in line with Moore’s Law, text analytics is now largely automated.

Advanced data-crunching systems use a series of tools to facilitate text analytics.

Some of the processes include lexical analysis, categorization, clustering, pattern recognition, tagging, annotation, information extraction, link and association analysis, visualization, and predictive analytics.

Apart from these ground-level processes, there are also several other approaches to text analytics.

The many approaches to text analytics

Each approach to text analytics produces different results. Even as you read this article, newer approaches are being developed and brought to the world.

Here are five top approaches that are widely used for text analytics.

Word spotting

Word spotting, also known as keyword spotting, is a technique used to spot specific words that represent the meaning of the entire sentence. The assumption is that if a specific word is present, the entire sentence is about that particular word.

Let’s see how word spotting can be used to categorize the responses to an NPS survey.

[Image: Text analytics survey responses]

As you can see from the image above, for every response that contains the word ‘price’, the system assumes that the sentence is about pricing.

Similarly, the mention of the word ‘easy’ leads to the assumption that the sentence is about user-friendliness. When combined with the NPS score of 0 to 10, it gives a fair idea of whether the customer sentiment is positive or negative.
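
To make the idea concrete, here is a minimal sketch of word spotting in Python. The keyword-to-category map and the sample responses are made up for illustration; a real setup would use keywords drawn from your own survey data. The score buckets follow the standard NPS convention (0 to 6 detractor, 7 to 8 passive, 9 to 10 promoter).

```python
import re

# Hypothetical keyword-to-category map; the keywords are illustrative assumptions.
KEYWORD_CATEGORIES = {
    "price": "pricing",
    "expensive": "pricing",
    "easy": "user-friendliness",
    "support": "customer service",
}


def spot_categories(response):
    """Return every category whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z']+", response.lower()))
    return {cat for kw, cat in KEYWORD_CATEGORIES.items() if kw in words}


def nps_segment(score):
    """Standard NPS buckets: 0-6 detractor, 7-8 passive, 9-10 promoter."""
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


# Made-up (score, comment) pairs.
responses = [
    (9, "Really easy to set up and the price is fair"),
    (3, "Too expensive for what it does"),
]
for score, text in responses:
    print(nps_segment(score), sorted(spot_categories(text)))
```

Note how crude the assumption is: a response such as "the price is not an issue" would still be filed under pricing, which is exactly the limitation the more advanced approaches try to address.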

Manual rules

Manual rules are similar to word spotting but work at a more advanced level, categorizing more words in more complex scenarios. They are useful for businesses that want to customize how various words are treated by the text analytics program.

For example, the word ‘stock’ has an entirely different meaning in retail and financial industry parlance. Manual rules make it possible for businesses to set clear rules of how each word relating to their industry should be understood and analyzed by the system.
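
As a rough sketch, such rules could be stored as a per-industry lookup table. The `INDUSTRY_RULES` table, its phrases, and its labels below are all hypothetical; they only illustrate how the same word can be mapped differently depending on the business.

```python
import re

# Hypothetical rule sets: each industry maps a phrase to the label it should carry.
INDUSTRY_RULES = {
    "retail": {"stock": "inventory", "out of stock": "availability issue"},
    "finance": {"stock": "equity", "dividend": "shareholder payout"},
}


def apply_rules(text, industry):
    """Return the labels of every rule phrase found, checking longer phrases first."""
    rules = INDUSTRY_RULES[industry]
    lowered = text.lower()
    labels = []
    for phrase in sorted(rules, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, lowered):
            labels.append(rules[phrase])
            # Blank out the match so a shorter rule does not fire on the same words.
            lowered = re.sub(pattern, " ", lowered)
    return labels


print(apply_rules("We need more stock before the sale", "retail"))
print(apply_rules("I bought more stock today", "finance"))
print(apply_rules("The item was out of stock again", "retail"))
```

Checking longer phrases first is a design choice: it lets a specific rule such as "out of stock" take priority over the generic rule for "stock".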

Text categorization

This is perhaps the most prevalent approach to text analytics today. It relies on machine learning, which learns patterns from existing datasets and applies them to new data to make predictions or suggestions.

In text categorization, the system is fed a pre-built set of text examples and their relevant categories. The machine learning algorithm learns how each text is categorized and creates rules for itself.

When new text is presented, it applies these rules to assign the new text to one of those categories. This supervised learning-based approach saves the time and effort of setting and implementing new rules by hand.

The machine learning algorithm itself creates the classifiers for categorizing new data.

[Image: Text categorization]
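
To illustrate supervised text categorization, here is a small sketch that uses a multinomial Naive Bayes classifier, one common choice for this task (the article does not prescribe a specific algorithm, so treat this as an assumption). It is written in plain Python, and the training examples and category names are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up labeled examples, purely for illustration.
train_texts = [
    "the price is too high",
    "cheap and affordable plan",
    "support team replied quickly",
    "the agent was friendly and helpful",
]
train_labels = ["pricing", "pricing", "service", "service"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
print(model.predict("is there a cheap plan"))
print(model.predict("a helpful support agent"))
```

Nobody wrote a rule saying that "cheap" means pricing; the classifier inferred it from the labeled examples, which is the point of the supervised approach.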

Topic modeling

Topic modeling is an unsupervised approach to text analytics. In this approach, the algorithm is fed raw textual data as it is.

The algorithm itself picks up clusters of topics from the dataset to arrive at predictions.

The image below shows how topic modeling works to create models for topics from raw textual data.

[Image: Text analytics topic modeling]

To cite a real-world example, an input of customer responses to an NPS survey can produce a topic model that might look like:

X% easy-to-use, good service, friendly staff
X% bad, poor, expensive, not useful
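
One widely used topic modeling algorithm is Latent Dirichlet Allocation (LDA). The article does not name a specific algorithm, so the following is only a sketch under that assumption: a compact LDA fitted with collapsed Gibbs sampling in plain Python. The sample responses, stopword list, and hyperparameters (`alpha`, `beta`, the number of topics) are illustrative choices, and with so little data the discovered topics will be rough.

```python
import random
import re
from collections import Counter

STOPWORDS = {"the", "a", "and", "is", "was", "to", "it", "for", "very", "too"}


def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA. Returns (topic_word, doc_topic) counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        topics = [rng.randrange(num_topics) for _ in doc]
        assignments.append(topics)
        for word, t in zip(doc, topics):
            topic_word[t][word] += 1
            topic_total[t] += 1
            doc_topic[d][t] += 1
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                t = assignments[d][i]
                # Remove the token, then resample its topic given all the others.
                topic_word[t][word] -= 1
                topic_total[t] -= 1
                doc_topic[d][t] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = t
                topic_word[t][word] += 1
                topic_total[t] += 1
                doc_topic[d][t] += 1
    return topic_word, doc_topic


# Made-up survey comments; no labels are supplied.
responses = [
    "easy to use and friendly staff",
    "good service, friendly staff",
    "easy setup, good service",
    "poor support and expensive",
    "bad experience, expensive and not useful",
    "poor quality, not useful",
]
docs = [tokenize(r) for r in responses]
topic_word, doc_topic = fit_lda(docs, num_topics=2)

total_tokens = sum(len(doc) for doc in docs)
for counts in topic_word:
    share = 100 * sum(counts.values()) / total_tokens
    top_words = [w for w, c in counts.most_common(4) if c > 0]
    print(f"{share:.0f}% {', '.join(top_words)}")
```

Each printed line has the same shape as the example above: a share of the text followed by the words that characterize that topic. Naming the topic is still left to a human reader.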

Thematic analysis

While other text analytics approaches focus on clustering the text or topics, thematic analysis focuses on creating themes from the dataset provided to it.

Like topic modeling, it also takes the unsupervised route. This makes it ideal for businesses that have large datasets of a specific nature.

For example, consider the customer ratings and reviews of a pizza delivery service. Thematic analysis can sift through this customer data to bring up the themes around which most customer feedback revolves.

Some examples include the friendliness of staff, recipes of pizzas, service quality, etc.

Thematic analysis can also surface several sub-themes under each of these main themes. However, to achieve that level of data extraction, the text analytics algorithm implemented needs to be advanced as well.

The importance of text analysis

Now all this dissection of text analytics raises the question: why is text analytics important? What benefits can accrue to businesses that invest the effort and resources to implement it?

It turns out there are three major gains that businesses of all kinds can reap through text analytics. They are:

  1. Understanding the tone of textual content
  2. Predict emerging topics of discussion
  3. Quick translation of multilingual customer feedback

Understanding the tone of textual content

A tweet of fewer than 140 characters can do more damage to a brand than a full-fledged press release. A one-sentence-long NPS survey response can hint at whether the customer is happy or not.

A page-long customer review on a third-party marketplace can tell stories about a brand’s customer service efficacy.

Text analytics can sift through all this structured and unstructured data to figure out whether the tone used in the text is positive or negative.

How does that magic happen? Sentiment analysis makes it happen.

Sentiment analysis is a systematic way of extracting subjective information using a variety of technologies including Natural Language Processing (NLP) and text analytics.

Using sentiment analysis, it is possible to determine whether the user’s perspective about a product or service is positive, negative, or neutral.
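
A very simple way to picture this is a lexicon-based scorer: count positive and negative words and flip the polarity of a word that directly follows a negation. This is only a toy sketch, far simpler than the NLP-driven systems described above, and the word lists are illustrative assumptions.

```python
import re

# Tiny illustrative lexicons; real systems use far larger ones or trained models.
POSITIVE = {"good", "great", "love", "easy", "friendly", "helpful"}
NEGATIVE = {"bad", "poor", "slow", "expensive", "rude", "broken"}
NEGATORS = {"not", "never", "no"}


def sentiment(text):
    """Classify text as positive, negative, or neutral from word polarity."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        # Flip the polarity when the previous word negates it ("not good").
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The staff were friendly and helpful"))
print(sentiment("Not good, and support was slow"))
print(sentiment("Delivered on Tuesday"))
```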

Predict emerging topics of discussion

Text analytics empowers businesses with ‘social listening’ capabilities. It allows businesses to tune into structured and unstructured data across emails, text messages, and customer reviews to narrow down positive and negative topics.

For instance, text analytics can help determine the overall topic of discussion in response to a specific change in business offerings like features, pricing, integrations, etc.

Such an overarching approach to identifying topics of interest would enable a business to identify positive and negative topics of high impact that can be improved immediately.

Translation of multilingual customer feedback

For any globe-spanning business, there would be a diverse customer base that uses multiple languages for communication.

Translating every byte of communication for interpretation and analysis is a tough job with manual processes.

Text analytics helps cut through these linguistic barriers. It can translate unstructured text from all channels that customers are using.

Proper training models can also enable the algorithm to decipher urban slang-heavy or acronym-heavy internet speech.

In other words, text analysis can help you understand customer praise or ranting in any medium in any language in plain language that your business can understand.

Real-world use cases of text analytics

Given the various approaches and benefits of text analytics, what are the real-world use cases to which a business can apply text analytics? Here are some of the use cases.

Customer service

There are three core traits that customer service demands: swift responses, empathy, and accurate solutions. Now, if your customer support function is used to getting a never-ending queue of customer queries, that could be a problem.

It is not possible to read through all queries and provide swift responses. Also, a majority of the queries could be basic ones that can be easily resolved with an automated responder system.

Text analytics with its analytic capabilities can cluster customer queries of similar nature into categories. This would make it easier for customer support agents to attend to them with a specific approach.

Even better, the business can figure out perennial problems that can be resolved with a permanent fix.
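
As a rough illustration of grouping similar queries, the sketch below puts a query into an existing group when its word overlap (Jaccard similarity) with the group's first query passes a threshold. This greedy grouping is a simple stand-in for a proper clustering algorithm, and the stopword list, threshold, and sample queries are assumptions made for the example.

```python
import re

# Small illustrative stopword list.
STOPWORDS = {"how", "do", "i", "my", "is", "not", "where", "a", "of", "need", "the", "to"}


def content_tokens(text):
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Share of words two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_queries(queries, threshold=0.3):
    """Greedily attach each query to the first group it resembles closely enough."""
    groups = []  # list of (tokens of the group's first query, member queries)
    for query in queries:
        tokens = content_tokens(query)
        for representative, members in groups:
            if jaccard(tokens, representative) >= threshold:
                members.append(query)
                break
        else:
            groups.append((tokens, [query]))
    return [members for _, members in groups]


queries = [
    "How do I reset my password",
    "Password reset is not working",
    "Where is my invoice",
    "I need a copy of my invoice",
]
for group in group_queries(queries):
    print(group)
```

A group that keeps growing week after week is a hint that the underlying issue deserves a permanent fix rather than one more canned reply.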

Personalized advertising

87% of shoppers now begin product searches online (Retail Dive). That is a huge heap of text being generated. From the preliminary text search to the advanced search for a specific brand or label, users enter a variety of text during the research phase.

All this text can be mined to create effective product placement and online advertising campaigns. This is the same strategy that has made Google AdWords into an advertising behemoth.

Text analytics helps businesses go beyond search queries and pick up customer behavior patterns from social media and other digital mediums.


Recruitment

Every recruiter who has an ambitious hiring target will testify how difficult it is to sift through resumes and online candidate profiles on career sites to find the right candidate. It is the most time-consuming part of recruiting.

Text analytics can make a difference here. It can help recruiters to look for specific text phrases in the candidate’s profile that can help them filter potential individuals who could be personally interviewed.

How text analytics can add value to your NPS

Since the days of the Industrial Revolution, businesses have been trying hard to find out how customers feel about their products or services. The search reached some kind of conclusion in 2003, when the Net Promoter Score® (NPS®) was introduced.

NPS® aims to collect customer feedback in a concise form with a single question: “How likely are you to recommend this product/brand/service to your friends or colleagues?” Depending on the responses, the business can judge how well it is faring in serving customers.

The unique trait of NPS is that, although it is a measure of customer satisfaction, it is not an absolute indication of business performance.

Rather, it indicates areas where the business can improve its offerings to sustain loyal customers, turn disillusioned customers into loyal ones, and convert detractors into passive customers.

NPS® surveys receive their responses in textual form as well. In addition to the standard question, most surveys are also accompanied by open-ended questions that allow the customer to write at length their feedback about the product or service.

Analyze all survey responses without sampling

Imagine running an NPS survey and getting thousands of responses. It is almost impossible and also unproductive to go through each response manually. A sampling of responses is also not recommended since responses that portray serious flaws in the product might be missed out.

In such a scenario, it is text analytics that can come to your aid. It helps analyze all survey responses without the need for sampling. The thorough analysis of the responses helps in arriving at a near-accurate measure of promoters, passives, and detractors.
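
For reference, the score itself is computed from the full set of 0 to 10 ratings: the percentage of promoters (9 to 10) minus the percentage of detractors (0 to 6), with passives (7 to 8) counting only toward the total. A short sketch, with made-up ratings:

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Made-up ratings from five respondents: 3 promoters, 1 passive, 1 detractor.
print(nps([10, 10, 9, 8, 2]))  # 40.0
```

Because every response feeds into the percentages, skipping or sampling responses shifts the score, which is one more reason to analyze the full set.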

Categorization of survey responses

The purpose of Net Promoter Score® is to categorize users based on their loyalty into promoters, passives, and detractors. From a business perspective, user responses must also be segmented further.

That calls for identifying a common pattern of responses: a common theme that customers talk about and are happy or unhappy about.

With text analytics, it is possible to quickly categorize the widely scattered text responses of users into common topics or themes.

For example, what do promoters feel about pricing and customer service? The same can be compared with the responses of passives to identify areas of improvement.
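
A comparison like this can be sketched as a small cross-tabulation of themes against NPS segments. The theme keywords and sample responses below are hypothetical, and the theme detection is plain keyword matching; any of the approaches described earlier could take its place.

```python
import re
from collections import Counter, defaultdict

# Hypothetical theme keywords, chosen only for this example.
THEME_KEYWORDS = {
    "pricing": {"price", "pricing", "expensive", "cheap"},
    "customer service": {"support", "service", "agent"},
}


def nps_segment(score):
    """Standard NPS buckets: 0-6 detractor, 7-8 passive, 9-10 promoter."""
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def themes_by_segment(responses):
    """Count how many responses in each NPS segment mention each theme."""
    table = defaultdict(Counter)
    for score, text in responses:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for theme, keywords in THEME_KEYWORDS.items():
            if words & keywords:
                table[nps_segment(score)][theme] += 1
    return table


# Made-up (score, comment) pairs.
responses = [
    (10, "Great support and a fair price"),
    (9, "The support agent solved it in minutes"),
    (8, "Works fine but the pricing is confusing"),
    (7, "A bit expensive for small teams"),
]
table = themes_by_segment(responses)
for segment in ("promoter", "passive"):
    print(segment, dict(table[segment]))
```

In this toy data, promoters mostly talk about customer service while passives mostly talk about pricing, which would point to pricing as the area to improve.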

Data to craft a marketing story

Take a look at the world’s top brands. Apple. Airbnb. Amazon. Alibaba. Facebook. Apart from their business offerings, they are also known for their story of how the business was born out of an idea. And, countless stories revolve around their business data.

For example, Apple is a company that is hailed for its user data privacy. Airbnb is known for its wonderful experiences at unique host locations.

For your business, there could be a similar USP that could be indicated with data from NPS® survey responses. Text analytics helps surface data of a similar nature to craft a marketing story that can augment your brand image.

In a nutshell

Text analytics has been around for a long time. Ever since the days of cave drawings, mankind has been trying to decipher the hidden meaning in texts and symbols.

In today’s world, text analytics has leapfrogged to a new dimension. It is used to dig out the hidden meaning in text messages and content that users create in their daily life.

That search for meaning also enables businesses to understand their customers better and serve them better. To sum it up, text analytics is a must-have tool for businesses that want to read between the lines of what their customers are writing about them.
