Behind the Scenes of Text Understanding Without Prior Knowledge
- Ido Shalem
- Jan 1, 2003
- 4 min read

One of the hot buzzwords in knowledge management and document management is the idea of computer systems that can "read" and analyze text without prior knowledge, dictionaries, or taxonomies.
In this article, we will try to understand how this is possible and whether computers are truly capable of "reading" and "understanding" text.
People often use set expressions to convey their thoughts. People who write or speak about similar topics will likely use the same expressions.
If we did not all use these expressions and agree on their meaning, we would have great difficulty understanding one another.
Here are three example paragraphs, each describing a publication on knowledge management. In each paragraph, expressions that are identical, or that share a common denominator across the paragraphs, have been marked:
- Hedlund, Gunnar, and Nonaka, Ikujiro. "Models of Knowledge Management in the West and Japan." In Implementing Strategic Process: Change, Learning and Cooperation. P. Lorange et al., editors. Oxford: Basil Blackwell, 1993. Pages 117-144.
Hedlund and Nonaka present a framework for discussing knowledge management that extends the work of Galbraith, Arrow, Simon, and others in the field of management and organizational theory. They point out that creating and exploiting knowledge within an organization revolves around the interaction of tacit and explicit knowledge and the "transfer and transformation of knowledge between individuals, organizational units, and the surrounding environment." They provide a conceptual framework that looks at different aspects of knowledge management and demonstrate its use in a model that contrasts U.S. and Japanese practices of managing knowledge. Hedlund and Nonaka argue that the characteristics of knowledge management have serious implications for the types of activities (including innovations and strategies) in which a firm or organization is likely to succeed. They reinforce the important idea that not only the success but also the very survival of organizations will depend, in large part, on how well they create, transfer, and exploit their knowledge resources.
- Earl, Michael J. "Knowledge as Strategy: Reflections on Skandia International and Shorko Films." In Strategic Information Systems: A European Perspective. Edited by Claudio Ciborra and Tawfik Jelassi. John Wiley & Sons, Chichester. 1994. pp. 53-69.
Earl provides an assessment of knowledge, its value, and knowledge work through case studies of Skandia International and Shorko Films, two firms that were among the first to institute knowledge management. He makes the case for "knowledge as strategy," focusing on the value of information technologies in exploiting organizational knowledge. Earl classifies information systems from the perspective of knowledge and uses the case studies to develop a model for managing knowledge as a strategic resource. He concludes that any knowledge-based strategy requires a combination of organizational and technological capabilities, as evidenced by Skandia and Shorko, which he considers prototypes of successful firms in a knowledge economy.
- Huber, George P. "Organizational Learning: The Contributing Processes and the Literatures." Organization Science. 2:1(Feb. 1991):88-115.
Huber reviews four constructs linked to organizational learning: knowledge acquisition, information distribution, information interpretation, and organizational memory. He analyzes and critiques the extant literature related to each, noting a lack of cumulative work and synthesis among different research groups. Although now seven years old, this article is still tremendously useful as a literature review and history. Huber pulls together various theoretical perspectives that have contributed to our understanding of how organizations learn and what constitutes organizational knowledge, developing a "big picture" overview of the importance of learning to knowledge management.
From analyzing the texts, it can be seen that the following expressions are central and repeat themselves in various ways:
Knowledge Management
Knowledge Economy
Knowledge Work
Organizational Learning
Organizational Knowledge
Organizational Memory
Organizational Theory
Information System
Information Distribution
Information Interpretation
Information Technology
In other words, the computer is capable of identifying the central expressions that describe the subject areas the texts deal with. Instead of the reader going through hundreds of documents, the computer can extract the central ideas from the texts and highlight them, so that the reader can quickly see which topics the texts address and pick out the relevant ones. Moreover, it can attach to each document the key expressions that describe its content.
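As a rough illustration of this idea, and not the algorithm of any particular product, the following Python sketch treats a two-word phrase as a "central expression" when it appears in at least two documents, and then attaches the matching expressions to each document. The stopword list, the function names, and the `min_docs` threshold are all assumptions made for this example; a real system would rely on much richer linguistic rules.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for this sketch.
STOPWORDS = frozenset({
    "a", "an", "and", "the", "of", "to", "in", "on", "for", "as",
    "by", "with", "from", "that", "is", "its", "their", "they", "he",
})


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_phrases(text):
    """Return the set of adjacent word pairs that contain no stopword.

    Simplification: pairs may span a sentence boundary, because
    punctuation is discarded during tokenization.
    """
    tokens = tokenize(text)
    return {
        f"{first} {second}"
        for first, second in zip(tokens, tokens[1:])
        if first not in STOPWORDS and second not in STOPWORDS
    }


def shared_expressions(documents, min_docs=2):
    """Return phrases that occur in at least `min_docs` documents, sorted."""
    doc_counts = Counter()
    for doc in documents:
        # Each document contributes a phrase at most once.
        doc_counts.update(candidate_phrases(doc))
    return sorted(p for p, n in doc_counts.items() if n >= min_docs)


def tag_documents(documents, min_docs=2):
    """Attach to each document the shared expressions it contains."""
    shared = set(shared_expressions(documents, min_docs))
    return [sorted(candidate_phrases(doc) & shared) for doc in documents]


sample_docs = [
    "Knowledge management depends on organizational learning.",
    "Organizational learning supports knowledge management in firms.",
    "Firms invest in information technology.",
]
print(shared_expressions(sample_docs))
# ['knowledge management', 'organizational learning']
print(tag_documents(sample_docs))
```

Note that no dictionary or taxonomy is supplied: the expressions come out of the documents themselves. Counting each phrase once per document, rather than once per occurrence, favors expressions that are shared across texts over those that one author happens to repeat.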
This form of text analysis is based on linguistic rules formulated by the Jewish-American linguist Noam Chomsky. Chomsky argues that every word has meaning, but the meaning is context-dependent.
For example, we all know that red is a color. But the expression "red hat" is fundamentally different from the expression "red traffic light" and from "red lines." It follows from this principle that focusing on the individual word (as is common in statistical methods) discards the context that gives the expression its meaning.
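The difference can be made concrete with a small Python sketch. It is only an illustration under simple assumptions (the sample sentences and function names are invented for this example): counting single words reports "red" three times and nothing more, while looking at the word that follows shows three different expressions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def word_counts(texts):
    """Count individual words, ignoring their context."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def right_contexts(word, texts):
    """Count the words that immediately follow `word`."""
    contexts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for current, following in zip(tokens, tokens[1:]):
            if current == word:
                contexts[following] += 1
    return contexts


sample_texts = [
    "She wore a red hat.",
    "Stop at the red traffic light.",
    "The negotiators drew red lines.",
]
print(word_counts(sample_texts)["red"])              # 3
print(sorted(right_contexts("red", sample_texts)))   # ['hat', 'lines', 'traffic']
```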
Another claim by Chomsky is that in every subject area and every community, an agreement develops on the use of concepts and expressions. This agreement is necessary to ensure that we understand each other's intentions precisely. For example, documents dealing with knowledge management will likely mention expressions such as explicit knowledge, tacit knowledge, and communities of knowledge, among others.
In other words, the claim is that the dictionaries and taxonomies required by a significant portion of the search solutions on the market already exist within the texts themselves and can be extracted for the user in real time.
In summary, it can be said that the algorithm behind VirtualSelf's Information Manager largely mimics the reading process that we ourselves perform.
If computers are capable of doing this, can we say that they are indeed capable of "understanding" and "reading" documents?
Does the computer that succeeded in locating the expression "Organizational learning" indeed "understand" the meaning of the expression? – Probably not.
Does every person reading the expression "Organizational learning" truly understand its meaning in depth? – Not certain.
Any person would likely be able to notice that certain expressions repeat themselves and are therefore probably more central, even without understanding their meaning in depth. So too the computer...
Are computers intelligent? – Not certain.
On the other hand, are all people intelligent? ...