Published: September 2, 2013 - 02:00

What’s behind the curtain of semantic search?

To understand a user's search query, we need to distinguish between two aspects: on the one hand, the intention or aim of the user, and on the other hand, the relations between the terms in the database and their context.

Much of the information on the web is published on websites that contain structured data. Many applications embed their structured data directly in the page because the automatic processing of web content is becoming increasingly important. The meta-formats integrated into HTML provide a vocabulary for describing that content. Some semantic search engines act as so-called question-answering systems: they use the existing meta-formats to give structured answers to questions directly in natural language.
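To make the idea of embedded structured data more concrete, here is a minimal sketch in Python. It assumes the page uses HTML microdata (itemprop attributes), which is only one of several possible meta-formats. The sample markup, the class name ItempropExtractor and the helper extract_properties are hypothetical and chosen for illustration only.

```python
from html.parser import HTMLParser

# Hypothetical sample page: a product described with microdata attributes.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Espresso machine</span>
  <meta itemprop="priceCurrency" content="EUR">
  <span itemprop="price">199.00</span>
</div>
"""


class ItempropExtractor(HTMLParser):
    """Collect itemprop values from simple, non-nested microdata markup."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if "content" in attrs:
            # Value is carried in the attribute, e.g. on <meta> elements.
            self.properties[name] = attrs["content"]
        else:
            # Value is the text inside the element.
            self._current = name

    def handle_endtag(self, tag):
        # Simplification: assumes property elements contain no child tags.
        self._current = None

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.properties[self._current] = text


def extract_properties(html):
    """Return a dict mapping itemprop names to their values."""
    parser = ItempropExtractor()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    props = extract_properties(SAMPLE_HTML)
    # A toy "direct answer" built from the structured data.
    print(f"{props['name']} costs {props['price']} {props['priceCurrency']}.")
```

A real question-answering system would of course need far more than this, but the sketch shows why embedded vocabulary helps: once the properties are named in the markup, a direct answer can be assembled without guessing which text on the page is the price.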

The meaning of the search term

To find out what a search term means, the lexical connections between its words first have to be analysed. These connections describe the meaning of words, sentences, texts and dialogues. The next step is to combine the meanings of individual words into the meaning of whole sentences.

This is a difficult problem in natural languages because word combinations often carry an additional meaning that reaches far beyond the individual words (e.g. sayings or idioms).
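One simple way to handle such fixed expressions, sketched below in Python, is to look up known multi-word phrases before interpreting a query word by word. This is an illustrative assumption, not a description of how any particular search engine works. The tiny IDIOMS lexicon and the function name interpret are hypothetical.

```python
# Hypothetical mini-lexicon: multi-word expressions and their meaning.
IDIOMS = {
    ("kick", "the", "bucket"): "die",
    ("piece", "of", "cake"): "easy task",
}


def interpret(query):
    """Split a query into meaning units, matching the longest idiom first."""
    tokens = query.lower().split()
    max_len = max(len(phrase) for phrase in IDIOMS)
    units = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in IDIOMS:
                units.append(IDIOMS[phrase])
                i += n
                break
        else:
            # No idiom starts here: keep the single word as its own unit.
            units.append(tokens[i])
            i += 1
    return units


if __name__ == "__main__":
    print(interpret("the exam was a piece of cake"))
```

Matching the longest phrase first keeps "piece of cake" together as one unit instead of interpreting "piece" and "cake" literally.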

When interpreting the search term, anaphoric relations should also be taken into account. An anaphoric relation links an expression in one sentence to something mentioned in a previous one, for example a pronoun that refers back to a noun.

Although search queries tend to be very short, appropriate results still need to be found for each one. The user should reach the right website as quickly and directly as possible in order to carry out their actual intention on the web (e.g. purchasing goods or downloading data).
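The intentions mentioned above (purchase, download) can be illustrated with a deliberately naive keyword-based classifier in Python. The keyword lists and the function name classify_intent are hypothetical. Real systems would rely on much richer signals, so this is a sketch of the idea only.

```python
# Hypothetical keyword lists for two of the intentions named in the text.
INTENT_KEYWORDS = {
    "purchase": {"buy", "price", "order", "cheap"},
    "download": {"download", "installer", "pdf"},
}


def classify_intent(query):
    """Guess the user's intention from the words in a short query."""
    tokens = set(query.lower().split())
    # Score each intent by how many of its keywords appear in the query.
    scores = {
        intent: len(tokens & keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    for q in ("buy espresso machine", "download firefox installer", "weather berlin"):
        print(q, "->", classify_intent(q))
```

Even this toy version shows the difficulty: a query of two or three words often contains no explicit intent keyword at all, which is why the result has to fall back to "unknown".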

Besides classifying queries by broad objectives, the most important task is to analyse the user's problem so that the search engine can offer direct solutions instead of a list of results. Methods such as Adaptive Systems, Open Information Extraction, Entity Linking and Topic Models help to analyse the connection between the requested data and the content on the web.

In general, the potential for further developing and improving current search mechanisms is still far from exhausted, since the data available on the web and the available computing capacity continually open up new possibilities for information processing.

Are you interested? Don’t hesitate to contact us!