
Overcoming information overload: the role of organizational search engines

Updated: Jun 23


[Image: a lit document standing out among stacks of dark papers.]

A wise man once said: "Just as the preoccupation with the problem of hunger at the beginning of the twentieth century was replaced by massive engagement with proper nutrition and diets at the end of the century, so the problem of lack of information in the early eighties was replaced, towards the end of the nineties, by an equally significant problem of information flooding and overload."


We document a great deal today. Admittedly not everything, and not always the right things, but a growing amount of disk space is dedicated to storing documents and other content items.


This extensive documentation quickly leads to a problem of orientation: locating the content items that record important knowledge, which certainly exists somewhere. Sometimes the important knowledge in people's minds is no longer the knowledge itself, but where the document recording it is located.


One of the technological means that helps us locate this documented knowledge is the organizational search engine. Many systems within the organization include a search component, but each is dedicated to its own system, and more and more organizations are seeking a general search engine of their own.


Many search engines are available on the market, which is good but confusing. The reason is that any random group of search engines we examine typically contains products with minimal overlap and competition between them. In other words, one sells books and the other sells newspapers: both have pages, but the difference is significant.

To understand which tool we need most in the organization (everything is needed, but resources are never sufficient), we must first understand the types of capabilities that search engines address. These are described in the following diagram:

[Diagram: flowchart of search engine capabilities, with sections labeled Interface, Brokers, Collaboration, APIs, and various functions.]

Explanation:

Colors:

  • Components colored in orange enable expanding the quantity of results. For example, the ability to work with a thesaurus allows searching for the word "airplane" to also reach documents containing "aircraft" (synonym) and documents related to "fighter jet" (parent-child relationship).

  • Components colored in yellow enable reducing the quantity of results and/or organizing them for focus. For example, when documents related to the Smith company are requested, a Text Mining capability returns documents and content related to the company but not to Mr. Smith. It does so by analyzing relationships within the sentence where the name appears, such as whether the name carries the suffix "Ltd." or a title such as Mr./Mrs.

  • Components colored in purple are related to the search engine's interfaces with other systems. For example, APIs enable the activation of the search function from within operational IT systems.

  • Components colored in green are not independent components but infrastructure that is important to the success of any infrastructure product. For example, a good permissions mechanism can impersonate the user (at the level of the user's authorizations) when searching and retrieving information from various sources within the organization, and include or ignore each result according to the user's permissions. This sounds trivial, but in reality it is far from it.
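
To make three of the color groups above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular product works: the thesaurus, the sample documents, the group names, and the function names (expand_query, mentions_company, search) are all hypothetical. The company check is a crude pattern match standing in for real text mining, and the permission step simply trims results after matching rather than impersonating the user against each source system.

```python
import re

# Hypothetical thesaurus: each term maps to synonyms and narrower (child) terms.
THESAURUS = {
    "airplane": {"synonyms": ["aircraft"], "narrower": ["fighter jet"]},
}

# Hypothetical document store: id -> (text, groups allowed to read it).
DOCUMENTS = {
    1: ("The aircraft landed safely.", {"staff"}),
    2: ("A fighter jet was ordered from Smith Ltd.", {"procurement"}),
    3: ("Mr. Smith booked an airplane ticket.", {"staff", "procurement"}),
    4: ("Quarterly budget review.", {"staff"}),
}


def expand_query(term):
    """Orange: widen the query with synonyms and narrower terms."""
    entry = THESAURUS.get(term.lower(), {})
    return [term.lower()] + entry.get("synonyms", []) + entry.get("narrower", [])


def mentions_company(text, name):
    """Yellow: true only when `name` is followed by a company suffix.

    A simplified stand-in for text mining; the suffix list is an assumption.
    """
    pattern = rf"\b{re.escape(name)}\s+(Ltd|Inc|Corp)\b"
    return re.search(pattern, text) is not None


def search(term, user_groups):
    """Green: return ids of matching documents the user is allowed to read."""
    terms = expand_query(term)
    hits = []
    for doc_id, (text, allowed) in sorted(DOCUMENTS.items()):
        matches = any(t in text.lower() for t in terms)
        # Keep the hit only if the user shares at least one allowed group.
        if matches and allowed & user_groups:
            hits.append(doc_id)
    return hits


print(expand_query("airplane"))
print(search("airplane", {"staff"}))
print(mentions_company(DOCUMENTS[2][0], "Smith"))
print(mentions_company(DOCUMENTS[3][0], "Smith"))
```

In this sketch, a search for "airplane" also matches documents that mention "aircraft" or "fighter jet", yet a user in the "staff" group never sees the procurement-only document, even though it matches the expanded query.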


Locations:

  • Components located in the left area of the diagram are relevant when defining the words on which the search will be performed (before accessing the various contents).

  • Components located in the center are relevant when processing the results obtained from the various contents, towards bringing them back to the user.

  • Components located on the right are related to the external world of contents, which come from outside the central repository where the search is performed.

  • Components in the upper part are related to the interface with the user. The clouds are related to infrastructure.


Product Families:

  • Most products primarily represent two or three components, which form the center of gravity of their capabilities. As a result, the overlap between products is only partial. The main product families focus on:

    • Multilingual retrieval (Lingual).

    • Federated Search.

    • Smart search / Data mining (Text Mining).

    • Catalog products (Auto Categorization/Categorization).
