Taxonomy - A World of Products or Knowledge Management Work?
- Dr. Moria Levy
- May 1, 2002

Recently, Daniel Rasmus, Vice President and Principal Analyst at Giga Information Group, was quoted as claiming that "taxonomy does not require technology at all," adding that "taxonomy is simply a way to classify things."
Yet a growing list of vendors offers software and related applications, armed with many promises: organizational portals will become more efficient through the rapid creation of libraries of catalogued information, and employee experience will improve significantly through more efficient and effective information retrieval. This trend is expected to continue.
The market for taxonomy and automatic categorization products is estimated at approximately $600 million and is expected to double by the end of 2005.
So, who is right?
Apparently, as with most topics, the truth lies somewhere in between.
But let's start from the beginning and end at the end.
What is taxonomy?
A taxonomy is a glossary of terms. An organizational taxonomy is the collection of terms used daily in the organization. An outsider wouldn't recognize them, but they come up in every work-related discussion and conversation.
Examples: system names, technologies, clients, and even organizational units.
Categorization is the process of classifying content items (such as documents, database records, and conversations) by assigning them values from an organizational taxonomy. For example, a specific document might be assigned the technology ADSL, the client ABC, and the item type "discussion summary."
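To make the idea concrete, here is a minimal sketch of a content item being assigned values from a taxonomy. The facet names, the `TAXONOMY` dictionary, and the `categorize` helper are illustrative assumptions, not part of any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical taxonomy: facet name -> allowed terms (illustrative values only).
TAXONOMY = {
    "technology": {"ADSL"},
    "client": {"ABC"},
    "item_type": {"discussion summary"},
}


@dataclass
class ContentItem:
    title: str
    tags: dict = field(default_factory=dict)  # facet -> assigned term


def categorize(item, facet, term):
    """Assign a taxonomy term to an item, rejecting terms outside the taxonomy."""
    if term not in TAXONOMY.get(facet, set()):
        raise ValueError(f"{term!r} is not a known {facet} term")
    item.tags[facet] = term
    return item


doc = ContentItem("Meeting notes")
categorize(doc, "technology", "ADSL")
categorize(doc, "client", "ABC")
categorize(doc, "item_type", "discussion summary")
```

Rejecting unknown terms is one possible design choice: it keeps the assigned values aligned with the agreed glossary rather than letting each contributor invent new ones.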
Why do we need categorization?
In many of the systems we build, we try to reflect how users group things and the words they use to find them. Ultimately, we are trying to build a schema that makes working with content more efficient, both for those who need to locate information and for those who enter it into the system.
What service do the products claim to offer?
The products offer a variety of services:
a. Automatic construction of the organizational taxonomy. The method: identifying common words in content items that are not stop words of the language.
b. Building a hierarchical tree, either fixed or on the fly, based on the organizational taxonomy.
c. Automatically assigning content items to the built trees (this stage is called auto-categorization).
d. Examining tree efficiency: checking that trees are filled in a balanced manner, with no empty branches and no relatively overloaded branches. A problem could indicate a poor assignment, but usually indicates incorrect construction in the initial stages (a, b).
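The first and last of these services can be sketched in a few lines. This is a rough illustration only, assuming English text, a tiny hand-picked stop-word list, and an arbitrary "twice the average" threshold for an overloaded branch; commercial products use far more elaborate methods, and the function names here are hypothetical.

```python
import re
from collections import Counter

# Assumption: a very small English stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "on", "with"}


def candidate_terms(documents, top_n=5):
    """Suggest taxonomy terms: the most frequent words that are not stop words."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def tree_balance_report(branch_counts, overload_factor=2.0):
    """Flag empty branches and branches holding far more items than the average."""
    total = sum(branch_counts.values())
    average = total / len(branch_counts) if branch_counts else 0
    empty = [b for b, n in branch_counts.items() if n == 0]
    overloaded = [
        b for b, n in branch_counts.items()
        if average and n > overload_factor * average
    ]
    return {"empty": empty, "overloaded": overloaded}


docs = [
    "ADSL modem installation for the client",
    "ADSL line fault",
    "Billing for the client",
]
suggested = candidate_terms(docs, top_n=2)
report = tree_balance_report({"ADSL": 10, "Billing": 1, "Legacy": 0, "Other": 1})
```

In both cases the output is only a starting point: suggested terms still need human validation, and a flagged branch tells you where to look, not what to fix.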
And if everything is so good, why does the Vice President and Principal Analyst of Giga claim it's not about technology?
Truth be told, his statement could be settled by pointing to the literal definition of the term, since a taxonomy is just a glossary of terms. However, a more comprehensive answer addresses the taxonomy product family, which, as mentioned, encompasses all the stages described above.
Taxonomy products are not "out of the box" technologies, and even the most sophisticated and automatic systems need some manual assistance from people who know how to work with material classification.
Stage b (building trees) certainly requires manual assistance. But even stage a, when performed using tools, undergoes human validation. And of course, any unbalanced tree discovered in stage d is sent back for manual analysis and improvement.
Taxonomy products are not always cheap. Moreover, their effectiveness in Hebrew is even more limited than in English.
What is recommended?
For an internal organizational initiative, it is recommended to perform stages a and b manually.
If the number of content items is very large and the organization can afford a product focused on auto-categorization, such a product would certainly ease stages c and d. If not, part of the activity can be performed manually, and programs can be written in-house. These are admittedly not as good as existing products but, following the 80:20 principle, they shorten the process.
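As an illustration of what such an in-house program might look like, here is a minimal keyword-matching sketch. The `RULES` table and the `auto_categorize` function are hypothetical; the approach simply counts keyword hits per branch and leaves anything unmatched for a person to classify.

```python
import re

# Hypothetical keyword rules: tree branch -> words that suggest it.
RULES = {
    "ADSL": {"adsl", "modem"},
    "Billing": {"invoice", "billing"},
}


def auto_categorize(text, rules=RULES):
    """Return the branch with the most keyword hits, or None for manual review."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {branch: len(words & keywords) for branch, keywords in rules.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


items = ["ADSL modem fault at client site", "Invoice query", "Lunch menu"]
assigned = {text: auto_categorize(text) for text in items}
unassigned = [text for text, branch in assigned.items() if branch is None]
```

Items that end up in `unassigned` go to a person, which is where the remaining share of the effort lies.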
To summarize -
Working with taxonomy always requires human intervention. Products can help, but cannot serve as a substitute.
As with many other topics in life.