Predictive Analytics Times

5 years ago
Why text analytics is so important in search


Choosing the right keywords is the most important part of getting the search results you’re looking for. Everyone knows this, but it’s easier said than done. Even with the most well-thought-out keywords, search results don’t always deliver what you’re expecting.

Improving the accuracy of search is of utmost importance to companies like Google and Yahoo, and one of the best ways to do this is to incorporate text analytics (AKA text mining) into the back end.

Let’s take a typical enterprise search engine and break down the steps that go into an actual search. First, a database of unstructured content is fed into a pipeline, where it is converted into a structured document. That document is then fed into an index, and when a person queries the index, results appear.
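As a rough illustration of those steps, here is a minimal sketch in Python. It assumes a tiny in-memory collection and a plain inverted index; the function names (to_structured, build_index, query) and the sample texts are made up for this example and are not taken from any particular search product.

```python
import re
from collections import defaultdict

def to_structured(doc_id, raw_text):
    """Pipeline step: turn raw text into a structured document (a dict)."""
    tokens = re.findall(r"[a-z0-9]+", raw_text.lower())
    return {"id": doc_id, "text": raw_text, "tokens": tokens}

def build_index(documents):
    """Indexing step: map each token to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc in documents:
        for token in doc["tokens"]:
            index[token].add(doc["id"])
    return index

def query(index, text):
    """Query step: return the ids of documents that contain every query term."""
    terms = re.findall(r"[a-z0-9]+", text.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical unstructured content standing in for the database.
raw_content = {
    1: "Apple releases a new phone.",
    2: "Apple pie recipe with fresh fruit.",
    3: "New phone reviews are mixed.",
}
docs = [to_structured(doc_id, text) for doc_id, text in raw_content.items()]
index = build_index(docs)
print(sorted(query(index, "new phone")))
```

Note that a query for "apple" in this sketch returns both the phone story and the pie recipe, since the index only knows which strings appear in which documents.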

Text analytics occurs within the pipeline, before the content is indexed. There it analyses the content and extracts meaningful metadata, such as the entities being discussed, sentiment, and themes.
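To make the idea concrete, the following is a deliberately simplified sketch of an enrichment step that attaches entities and a sentiment label to a document before indexing. The word lists and the enrich function are hypothetical stand-ins; real text analytics engines rely on trained models and much larger resources, and theme extraction is left out here for brevity.

```python
import re

# Hypothetical toy lexicons, for illustration only.
KNOWN_ENTITIES = {"google": "Company", "yahoo": "Company", "lexalytics": "Company"}
POSITIVE = {"great", "improved", "powerful"}
NEGATIVE = {"poor", "slow", "broken"}

def enrich(raw_text):
    """Return the text plus metadata extracted from it."""
    tokens = re.findall(r"[a-z]+", raw_text.lower())
    # Entities: any token found in the known-entity lexicon.
    entities = sorted({t for t in tokens if t in KNOWN_ENTITIES})
    # Sentiment: count of positive words minus count of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        sentiment = "positive"
    elif score < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"text": raw_text, "entities": entities, "sentiment": sentiment}

doc = enrich("Google shipped a powerful, improved search feature.")
print(doc["entities"], doc["sentiment"])
```

The resulting dictionary is what would be handed to the index, so the metadata becomes searchable alongside the text itself.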

The information gained from the text mining process can then be used to create a more efficient search. A common tool for this purpose is faceted search. Any time you’ve narrowed results with an advanced search option in a search engine, you’ve been using faceted search. It is particularly useful because it enables cross-referencing through all of that metadata.

Faceted search engines come in a variety of complexities and flavours. Major retail websites use rudimentary faceted search to narrow down the categories in which you are searching, while databases of academic or legal documents may offer a more complex set of cross-referencing tools.
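A minimal sketch of faceted search, assuming each document already carries metadata fields produced by text analytics, might look like the following. The field names (entity, sentiment, theme), the sample records, and the two helper functions are invented for this illustration.

```python
from collections import Counter

# Hypothetical documents with metadata already extracted.
documents = [
    {"id": 1, "entity": "Google", "sentiment": "positive", "theme": "search"},
    {"id": 2, "entity": "Google", "sentiment": "negative", "theme": "privacy"},
    {"id": 3, "entity": "Yahoo", "sentiment": "positive", "theme": "search"},
    {"id": 4, "entity": "Yahoo", "sentiment": "negative", "theme": "search"},
]

def facet_filter(docs, **selected):
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]

def facet_counts(docs, field):
    """Count how many documents fall under each value of one facet."""
    return Counter(d[field] for d in docs)

# Cross-reference two facets at once.
hits = facet_filter(documents, theme="search", sentiment="positive")
print([d["id"] for d in hits])

# Show the counts a user would see next to each entity after one filter.
print(facet_counts(facet_filter(documents, theme="search"), "entity"))
```

The counts are what a retail or document site typically displays beside each category, so the user can see how many results remain before narrowing further.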

Text analytics is crucial for word sense disambiguation, the process of determining which meaning of a word with multiple definitions is being used in a sentence.

In a typical string-based search engine, a search for a term with multiple definitions is going to yield results for all possible uses of the word. With text mining, the context of the rest of the sentence or phrase in which the word appears is used to determine what the word refers to. When that knowledge is applied to search, it improves the relevance of the results.
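One classic way to use context like this is to compare the words around the ambiguous term with a short definition of each sense and pick the sense with the most overlap, a simplified form of the Lesk approach. The sketch below assumes a tiny hand-written sense inventory and stopword list; both are hypothetical, and this is not a description of how any commercial engine performs disambiguation.

```python
import re

# Hypothetical sense inventory: each sense of a word has a short definition.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river edge": "the sloping land beside a river or stream of water",
    }
}
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "on", "at", "in", "i", "my", "we"}

def words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def disambiguate(word, sentence):
    """Pick the sense whose definition shares the most words with the sentence."""
    context = words(sentence)
    best_sense, best_overlap = None, -1
    for sense, definition in SENSES[word].items():
        overlap = len(context & words(definition))
        if overlap > best_overlap:  # on a tie, the first listed sense is kept
            best_sense, best_overlap = sense, overlap
    return best_sense

print(disambiguate("bank", "We sat on the bank of the river and watched the water"))
print(disambiguate("bank", "I went to the bank to deposit money"))
```

Once a sense label is attached to each occurrence, it can be indexed as metadata, so a search for one meaning no longer returns documents about the other.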

More than anything, text mining’s power in search is that it allows you to ask more general questions like “who’s hot and who’s not?” and “is there any breaking news I need to know?” and get results that actually answer those questions.

All in all, the ability to add context and extract metadata from unstructured content before it is indexed makes search engines a far more powerful tool.

By: Mekkin Bjarnadottir, marketing manager, Lexalytics
Originally published at
