Automated text classification use cases for the business

Diagram representing example text classification pipeline

Text classification, also called text categorization, is a fast-growing field. The volume of data companies need to process every day is already too big to handle by hand, and it will only keep growing. We live in the information age, where data is the most valuable asset we possess.

Thanks to the drastic increase in computing power and ever cheaper hardware, the software industry has managed to bring century-old algorithms for automatic data analysis back to life. These algorithms are mathematically well founded, but, like many things in math, they were impractical to run at real-world scale as recently as 10 years ago.

Today most modern software is shipped as a service (SaaS), and even the tech giants rely on the cloud to deliver their products. This makes it very easy to integrate cutting-edge technology such as artificial intelligence (AI) and benefit from it "out of the box". Natural language processing (NLP) is a subfield of AI. Wikipedia defines it as follows:
"Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data."
In other words, NLP is "the study of making computers understand human language". This article highlights some of the most common use cases of NLP-based text classification that your business might benefit from.

Support ticket organization

Some companies receive millions of customer issues in the form of support tickets. Most organizations have a support team for each category of issue. In the payments industry, for example, these could be fraud, refunds, sales, marketing and so on. The tickets are then ordered by priority, and each team works through them as a priority queue. Customers generally don't care which department is best suited to their problem, so they just fire off queries and expect an immediate response, especially when money is involved. This creates a lot of overhead for teams that have to categorize each ticket manually and then probably prioritize it by hand as well. AI algorithms exist that can automate the categorization of tickets and even prioritize them by chosen criteria, which can have a huge impact on businesses that are not yet benefiting from such NLP solutions.
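To make the idea concrete, here is a minimal Python sketch of the categorize-then-prioritize flow described above. It is only an illustration: the keyword lists, department names and priority numbers are made up for this example, and simple keyword counting stands in for a trained NLP classifier.

```python
import heapq
import re
from itertools import count

# Hypothetical keyword lists per department. A real system would learn
# these associations from labelled tickets instead of hard-coding them.
DEPARTMENT_KEYWORDS = {
    "fraud": {"fraud", "stolen", "unauthorized", "hacked"},
    "refunds": {"refund", "chargeback", "return"},
    "sales": {"pricing", "quote", "upgrade", "plan"},
}
# Assumed priorities: a lower number is handled first.
DEPARTMENT_PRIORITY = {"fraud": 0, "refunds": 1, "sales": 2, "general": 3}


def categorize(ticket_text):
    """Return the department whose keywords appear most often in the ticket."""
    words = re.findall(r"[a-z']+", ticket_text.lower())
    scores = {
        dept: sum(word in keywords for word in words)
        for dept, keywords in DEPARTMENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a catch-all queue when no keyword matched.
    return best if scores[best] > 0 else "general"


def build_queue(tickets):
    """Categorize every ticket and push it onto a priority queue (min-heap)."""
    queue = []
    arrival = count()  # tie-breaker keeps arrival order within one priority
    for text in tickets:
        dept = categorize(text)
        heapq.heappush(queue, (DEPARTMENT_PRIORITY[dept], next(arrival), dept, text))
    return queue


tickets = [
    "What is the pricing for the premium plan?",
    "My card was stolen and I see an unauthorized payment",
    "I would like a refund for my last order",
    "How do I change my email address?",
]
queue = build_queue(tickets)
# Pop tickets in the order the support teams would see them.
routed = [heapq.heappop(queue)[2] for _ in range(len(tickets))]
print(routed)
```

The fraud ticket is served first even though it arrived second, which is exactly the kind of reordering that otherwise has to be done by hand.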

diagram representing example support tickets process in companies

Content aggregation and clustering

Our society produces far more information than we are able to comprehend. News media are a good example. If I want to stay informed on the topics I am interested in, I need to find the articles that are relevant to me. This means traversing gigabytes of data on different sites, which is impossible to do by hand. That is why we have news aggregation services. They track the news for us and provide aggregated news feeds customized to our topics of interest. Most news aggregators already use AI to categorize news articles. If you provide a news aggregation service, being able to automatically determine the topic of a URL or text can save you a lot of manual work and reading, so to be competitive in this industry today you should be using machine learning for this task.

Companies tend to keep a lot of unstructured data in-house. They could benefit greatly from software that automatically organizes, clusters and categorizes this data, making it possible to query it and extract meaningful insights from it.

image of clustered abstract elements by color

Fraud and spam detection

Spam detection is a well-known AI problem. Most mail service providers already use NLP techniques to detect spam in their systems. Some even go one step further and put labels on emails to improve the user experience and make it easier for users to organize their mailboxes.
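One classic technique for this kind of filtering is the Naive Bayes classifier. The sketch below is a small, self-contained Python version trained on six made-up messages. It is meant to show the principle only and is not how any particular mail provider implements its filter; the class name and training data are our own assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocabulary = set()

    def train(self, examples):
        for text, label in examples:
            words = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, far too small for real use.
training = [
    ("win a free prize now", "spam"),
    ("cheap loans click now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the attached report", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = NaiveBayes()
model.train(training)
print(model.predict("claim your free prize"))        # spam
print(model.predict("agenda for the team meeting"))  # ham
```

The same code works for any set of labels, so it could just as well assign the folder-style labels mentioned above, given suitable training examples.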

Fraud detection is a related topic, and one as old as the fintech industry itself. There is a constant battle between banks and organizations involved in money laundering and the funding of crime. Companies invest billions in fighting them, and most fintech leaders use AI to identify, or help identify, potentially fraudulent transactions and malicious user activity.

image representing fraudulent activities

Language detection

Automatic language detection lets us handle user queries more efficiently and route questions to the right people in the team. It can also be used to gather demographic insights about your users, and there are many more applications. Popular tools such as Google Analytics use NLP to improve their service based on demographic factors and language, which is probably one of the main reasons they are the de facto standard for website analytics.
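A very rough way to detect the language of a query is to count how many of its words are common "stop words" in each candidate language. The Python sketch below does exactly that. The stop-word lists are tiny, hand-picked samples chosen for this example, so treat it as an illustration of the idea rather than a production detector.

```python
import re

# Tiny hand-picked stop-word samples, assumed for illustration only.
STOPWORDS = {
    "english": {"the", "and", "is", "to", "of", "in", "that", "it", "my", "not"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ich", "ein", "zu", "mit"},
    "spanish": {"el", "la", "los", "y", "es", "que", "de", "en", "un", "por"},
    "french": {"le", "la", "les", "et", "est", "que", "de", "un", "je", "pas"},
}


def detect_language(text):
    """Return the language whose stop words occur most often in the text."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


print(detect_language("My payment is not in the account"))
print(detect_language("Ich habe das Geld nicht bekommen und der Support ist nicht erreichbar"))
print(detect_language("El pago no es correcto y la cuenta que tengo es de la empresa"))
```

Once the language is known, the query can be routed to a team member who speaks it, or simply counted to build the demographic picture mentioned above.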

Text analysis

The ever-growing body of text we have to deal with requires that we separate quality content from poorly written, copied or spammy articles. This is yet another very time-consuming task we can automate. Using NLP we can extract different metrics from raw text, such as readability: how hard is it to understand the text, is it well structured, is it too repetitive, and so on. Many companies still rely on manual work for this. There is a great opportunity here to optimize and cut expenses. Even if automated solutions can't handle the task 100%, they can at least reduce the amount of manual work considerably.
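As a simple illustration of such metrics, the Python sketch below computes three rough signals from raw text: average sentence length, average word length, and lexical diversity (the share of distinct words, which drops when a text is repetitive). The function name and the choice of metrics are our own assumptions, and this is not a standard readability formula.

```python
import re


def text_metrics(text):
    """Compute a few rough quality signals for a piece of raw text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "lexical_diversity": 0.0,
        }
    return {
        # Long sentences tend to be harder to read.
        "avg_sentence_length": len(words) / len(sentences),
        # Long words tend to be harder to read.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct words divided by all words: low values mean repetition.
        "lexical_diversity": len(set(words)) / len(words),
    }


repetitive = text_metrics("Buy now. Buy now. Buy now.")
varied = text_metrics("The quick brown fox jumps over a lazy dog.")
print(repetitive)
print(varied)
```

A low lexical diversity score, as in the first example, is one cheap hint that an article may be spammy and deserves a closer look.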

Lead generation

image of lead generation presented as magnet pulling potential customers

Taking timely action on social media events can help you keep your current customers happy and gain new ones. For example, if a person tweets that they are interested in a certain product or service, text analytics can discover this automatically and pass the information to a sales representative, who can then pursue the prospect and convert them into a customer. Automating this task can bring a lot of new customers to your business, or at least cut the expenses of an overgrown marketing department that does nothing but track social media.


CV screening

Going through CVs manually can be a very time-consuming task. With the right NLP tool you can extract features of a candidate from plain text. You can analyse not only the technical skills they have listed but also references, job descriptions and other unstructured text the candidate might have put into their CV. Using NLP you can even go one step further: define the profile of a successful candidate for a certain position, and then evaluate how similar a potential hire is to that ideal profile. There are quite a few opportunities here.
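One simple way to measure how close a CV is to a target profile is to turn both texts into word-count vectors and compute the cosine similarity between them. The Python sketch below illustrates this with a made-up profile and two made-up CVs. A real tool would use richer features than raw word counts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count word occurrences; '+' and '#' are kept so 'c++' and 'c#' survive."""
    return Counter(re.findall(r"[a-z+#]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical profile of the ideal candidate and two invented CVs.
profile = bag_of_words("python machine learning nlp data pipelines")
cvs = {
    "candidate_a": "Five years of Python and machine learning, built NLP data pipelines",
    "candidate_b": "Experienced pastry chef and restaurant manager",
}
scores = {name: cosine_similarity(profile, bag_of_words(text)) for name, text in cvs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
print(scores)
```

Sorting candidates by this score gives recruiters a first shortlist to read in full, instead of opening every CV in the order it arrived.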

Review/Comments ratings

Customers often give just 1 star or 5 stars, which makes their reviews too binary. Sometimes they write a positive review but give a low rating, either intentionally or by mistake. A tool that can infer the rating purely from the text of the review could be quite useful for review sites or marketing campaigns. Furthermore, most sites and blogs don't have ratings at all. With NLP you can evaluate the mood of the users and automatically assign ratings to their comments. This could save your marketing team days of work.
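The sketch below shows the idea in Python using a tiny sentiment lexicon: it estimates a star rating from the review text and flags reviews whose given rating disagrees with it. The word lists, the mapping to stars and the tolerance are all assumptions made for this example, and a real service would use a trained sentiment model instead.

```python
import re

# Tiny hand-made sentiment lexicon, assumed for illustration only.
POSITIVE = {"great", "excellent", "love", "good", "fast", "helpful", "perfect"}
NEGATIVE = {"bad", "terrible", "slow", "broken", "hate", "poor", "awful"}


def estimate_stars(review):
    """Map the balance of positive and negative words to a 1-5 star rating."""
    words = re.findall(r"[a-z']+", review.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    if positive + negative == 0:
        return 3  # no sentiment words found: treat as neutral
    polarity = (positive - negative) / (positive + negative)  # from -1 to 1
    return round(3 + 2 * polarity)


def is_mismatch(review, given_stars, tolerance=2):
    """Flag reviews whose text and star rating disagree by `tolerance` or more."""
    return abs(estimate_stars(review) - given_stars) >= tolerance


happy = "Great product, fast delivery, I love it"
angry = "Terrible support and the app is slow and broken"
print(estimate_stars(happy), estimate_stars(angry))
print(is_mismatch(happy, given_stars=1))
```

Here the happy review submitted with 1 star is flagged, so someone can check whether the low rating was a mistake.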

Unstructured data

Whenever there is unstructured data, we should consider using AI to help bring some order to it. Data without structure has little value to the business: you need to be able to run queries on it and search for patterns. Around 75%-80% of the data on the internet is in the form of text, and most of that data is unstructured and impossible to process automatically unless we use NLP services to assist in the task. You can see now how important NLP can be. Your company is potentially not benefiting from one of its most valuable assets, its data, or you are spending money on teams that organize it manually. NLP is all about cutting expenses and making things more efficient.

structuring of unstructured data represented with square blocks

DigitalOwl text classification service

Our text classification service can classify any URL or plain text into 48 predefined general categories. You can check out our interactive console to see how it works. It is a REST service and can easily be integrated into any business pipeline.