Computer text understanding: is it really that bad?
I recently read an article whose author claims that a computer will never be able to understand text the way a human does. As proof, he cites a number of tasks that are supposedly impossible for machines, emphasizing the lack of efficient algorithms and the impossibility of modeling a complete system that would take into account all possible readings of a text. However, is it really that bad? Is it true that solving such tasks requires some special processing power? What is the actual state of natural language text processing?
What does it mean to "understand"?
The first thing that confused me was the question itself. Will a computer ever be able to understand text the way a human understands it? What exactly does it mean to "understand as a human"? And, more generally, what does it mean to "understand"? The authors of the book "Data Mining: Practical Machine Learning Tools and Techniques" asked themselves a similar question: what does it mean to "be trained"? Suppose we have applied some training technique to an "interpreter". How do we check whether it has learned anything? Consider a student: attending all the lectures on a subject does not mean that the student has learned and understood it. To test this, teachers hold examinations in which the student is asked to complete tasks on the subject. The same goes for the computer. If we want to know whether it has learned (whether it has understood the text), we have to check how it solves specific applied tasks: translating text, extracting facts, picking the right meaning of a polysemous word, and so on. From this perspective, meaning as such loses its importance. Meaning can be regarded as a certain state of the interpreter, according to which it processes text.
Further, the author of the original article gives the example of translating the sentence "The first Nicholas printed the letter from Sophia", pointing out that the word rendered here as "printed" has several meanings and therefore several possible translations (it can also mean "opened"). A person easily understands what is meant, but can a computer?
To answer this question, let us consider how a person decides which meaning of a word is intended. I think everybody would agree that for such tasks we rely first of all on context. The context can be given explicitly, in the form of the sentences surrounding the one in question, or implicitly, in the form of knowledge about the sentence (in our case, the knowledge that the sentence comes from the novel "War and Peace", and so on).
Let us consider the first option, explicit context. Suppose we have two pairs of sentences: "The first Nicholas opened the letter from Sophia. It was difficult to read by the light of the torch" and "The first Nicholas printed the letter from Sophia. The printer did not work well, so some characters were not readable". The second sentence of each pair contains a keyword that makes it possible to identify the meaning of the ambiguous word in the preceding sentence unambiguously: "torch" in the first case and "printer" in the second. So the question is: what prevents a computer from performing the same maneuver to find out the true meaning of the word in question? The answer is: nothing. In fact, attribute-value systems have long been used in practice. For example, the tf-idf measure is widely used in search engines to calculate relevance. Typically, information about the co-occurrence of words ("open" and "torch", "print" and "printer") is collected, and on this basis a relevant document or a more accurate translation of a word is selected.
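To make the idea concrete, here is a minimal sketch in Python of choosing a meaning by co-occurrence. The counts in COOCCURRENCE are invented for illustration, and the names tokenize and pick_sense are hypothetical; a real system would collect such statistics from a large corpus and would likely weight them (for example, with tf-idf) rather than sum raw counts.

```python
from collections import Counter

# Hypothetical co-occurrence counts: how often a context word was seen
# near each meaning of the ambiguous verb in some training corpus.
COOCCURRENCE = {
    "open": Counter({"torch": 12, "envelope": 30, "seal": 9}),
    "print": Counter({"printer": 41, "paper": 17, "characters": 8}),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def pick_sense(context, cooccurrence=COOCCURRENCE):
    """Return the meaning whose known neighbours best match the context.

    The score of a meaning is the sum of its co-occurrence counts over
    all tokens of the context; unseen tokens contribute zero.
    """
    tokens = tokenize(context)
    scores = {
        sense: sum(counts[token] for token in tokens)
        for sense, counts in cooccurrence.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    for context in (
        "It was difficult to read by the light of the torch",
        "The printer did not work well, so some characters were not readable",
    ):
        print(pick_sense(context)[0])
```

With these made-up counts, the first context selects "open" and the second selects "print", because only one of the meanings has ever been seen next to "torch" or "printer".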
Things are much more difficult when a text has an implicit context (knowledge of the circumstances). Here a simple collection of statistics will not help much, because knowledge is needed. What is knowledge, and how is it represented? One way to represent it is an ontology. An ontology can be viewed as a set of facts of the form <Subject, Predicate, Object>; for example, <Nicholas, is a, human>. Building an ontology for a particular subject field is an important task in itself. There are a number of initiatives, such as Linked Data, where people collect information and build a web of interrelated concepts. Moreover, there are some achievements in the automatic extraction of facts from text. For instance, from the sentence "The first Nicholas opened the letter from Sophia" we can automatically extract the facts <Nicholas, opened, letter> and <letter, from, Sophia>. An open-source example is the Stanford Parser, which has a pretty good grasp of English sentence structure. Some companies, such as Invention Machine, build their entire business on fact extraction systems.
Let us assume that we already have a complete ontology for our subject field, in which the word "print/open" occurs several times in each of its meanings. In the sense of "open" it can produce facts such as <[someone], open, package>; in the sense of "print" it can appear in facts such as <print, whereon, printer>. Finally, let us assume that the ontology also represents knowledge of the circumstances. In this case, determining the correct meaning of the word comes down to looking up the facts of the sentence in the ontology for every possible meaning of "print/open", and selecting the meaning that is surrounded by the largest number of known facts (facts about the circumstances and facts extracted directly from the sentence).
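The selection step just described can be sketched as follows. This is only an illustration under strong assumptions: the facts in SENSE_FACTS and the two evidence sets are invented, the function name choose_meaning is hypothetical, and facts are matched by exact equality, whereas a real ontology would need normalization and inference.

```python
# Hypothetical ontology fragment: facts in which each meaning of the
# ambiguous word "print/open" is known to participate.
SENSE_FACTS = {
    "open": {
        ("someone", "open", "letter"),
        ("someone", "open", "package"),
        ("letter", "read by", "torch"),
    },
    "print": {
        ("someone", "print", "letter"),
        ("print", "whereon", "printer"),
        ("printer", "produces", "characters"),
    },
}

# Evidence: facts about the circumstances plus facts extracted from the
# sentence and its neighbours (both sets are made up for this sketch).
NOVEL_EVIDENCE = {
    ("letter", "from", "Sophia"),
    ("letter", "read by", "torch"),
}
OFFICE_EVIDENCE = {
    ("letter", "from", "Sophia"),
    ("print", "whereon", "printer"),
    ("printer", "produces", "characters"),
}


def choose_meaning(sense_facts, evidence):
    """Pick the meaning supported by the largest number of known facts."""
    scores = {
        sense: len(facts & evidence)  # facts shared with the evidence
        for sense, facts in sense_facts.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(choose_meaning(SENSE_FACTS, NOVEL_EVIDENCE)[0])
    print(choose_meaning(SENSE_FACTS, OFFICE_EVIDENCE)[0])
```

Note that when no meaning is supported by any fact, this sketch simply returns the first meaning in the dictionary; a real system would have to treat that case as "unknown".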
Before moving on, let me draw some conclusions:
1. Statistics is a powerful tool for analyzing text.
2. Extracting knowledge (facts) from text is already a reality.
3. Obtaining knowledge about a subject field is a feasible task.
Further, the author of the article lists a number of specific tasks that are supposedly beyond a computer's ability. I will not deny that some of these tasks are really difficult, but they can be solved. Below I go through these tasks, in no particular order, together with possible solutions, but first a few words about natural language processing.
In NLP terms, a text is a set of features. These features can be words (roots and word forms, grammatical case, letter case, parts of speech), punctuation (especially sentence-final punctuation), emoticons, and whole sentences. On top of these, more complex features can be built: n-grams (sequences of words), appraisal groups, words from given dictionaries, and even more complex ones such as alliteration, antonymy and synonymy, homophones, etc. All of these can be used as indicators for solving various text processing tasks.
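As a rough illustration of what "a text as a set of features" may look like in code, here is a small Python sketch. The feature names, the regular expressions, and the function extract_features are all assumptions made for this example; it covers only words, word bigrams, letter case, emoticons, and final punctuation, and leaves out roots, parts of speech, and the more complex features.

```python
import re
from collections import Counter

# Deliberately simple patterns, chosen for this sketch only.
WORD_RE = re.compile(r"[A-Za-z']+")
EMOTICON_RE = re.compile(r"[:;]-?[()DP]")  # e.g. :) ;-( :D


def extract_features(text):
    """Turn a text into a bag of named features with counts."""
    features = Counter()
    words = WORD_RE.findall(text)
    lowered = [word.lower() for word in words]

    # Single words and word bigrams (the simplest n-grams).
    for word in lowered:
        features["word=" + word] += 1
    for first, second in zip(lowered, lowered[1:]):
        features["bigram=" + first + "_" + second] += 1

    # Letter case: number of words written entirely in capitals.
    features["all_caps_words"] = sum(
        1 for word in words if len(word) > 1 and word.isupper()
    )

    # Emoticons, then the final punctuation mark once they are removed.
    for emoticon in EMOTICON_RE.findall(text):
        features["emoticon=" + emoticon] += 1
    stripped = EMOTICON_RE.sub("", text).rstrip()
    if stripped and stripped[-1] in ".!?":
        features["final_punct=" + stripped[-1]] += 1

    return features


if __name__ == "__main__":
    for name, count in sorted(extract_features("What a GREAT day! :)").items()):
        print(name, count)
```

The resulting dictionary of counts is the kind of input that a standard classifier can consume directly.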
Here are the tasks
Determining mood of the text
In general, the author's proposed division into cheerful and sad texts is unclear. Three ways of classifying come to mind:
1. The optimistic / pessimistic text.
2. Positive / negative (e.g., opinion).
3. Humorous / serious.
Either way, this is a classification task, which means that standard algorithms such as Naïve Bayes and SVM can be used. The only question is which features should be extracted from the text to maximize the quality of classification.
I have never dealt with classifying texts as optimistic or pessimistic, but I would bet that it is sufficient to use the roots of all words as features. The results can be much better if dictionaries are compiled for each of the classes. For example, the "pessimistic" dictionary may include words like "sad", "loneliness", "sorrow", etc., and the "optimistic" dictionary will have "cool", "yo" and "fun".
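For illustration, here is a minimal Naïve Bayes classifier in plain Python. It is a sketch under stated assumptions, not a recommended implementation: the four training sentences are invented, the class name NaiveBayes and its methods are hypothetical, whole lowercase words are used instead of roots, and add-one (Laplace) smoothing is chosen simply because it is the most common default.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (no stemming in this sketch)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of texts
        self.vocabulary = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            # log P(label) + sum of log P(word | label)
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny invented training set, for demonstration only.
TRAINING = [
    ("sad loneliness and sorrow everywhere", "pessimistic"),
    ("nothing but sorrow and sad news", "pessimistic"),
    ("what a cool and fun day", "optimistic"),
    ("fun party with cool friends", "optimistic"),
]

if __name__ == "__main__":
    model = NaiveBayes()
    model.train(TRAINING)
    print(model.predict("such a sad day full of sorrow"))
    print(model.predict("cool fun evening"))
```

A per-class dictionary, as suggested above, could be added to this sketch as extra pseudo-counts for the listed words.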
Classification of opinions and other user-generated content that shows the attitude of the speaker to a certain object (for example, a new camera, government actions, the Microsoft company) has recently become so widespread that it has been singled out as a separate field: opinion mining, also known as sentiment analysis (Pang and Lee; Liu). There are many approaches to opinion mining. Appraisal groups (Whitelaw) showed a good score of up to 90.2% correctly classified opinions for texts consisting of at least 5-6 sentences, as well as for shorter texts such as tweets.
The task of detecting humorous text is not as popular, but some experience has been accumulated here as well (Mihalcea). As a rule, antonymy, alliteration, and "adult slang" are used as features for classifying humor.
It should be noted that a computer can also successfully recognize humor, sarcasm, and irony.
Ideology of the author
Competence, hidden complexes, and attitude to work and family can all be found in and extracted from a text, even things that an average person would not notice in it. It would be enough to select the appropriate features and tune the classifier properly. The results may not be very accurate, but, according to Wikipedia, people themselves identify such things correctly only about 70% of the time, and 70% is below the average score of such classifiers.
Metaphors, proverbs and omissions
All these tasks require additional information. If a ready-made ontology exists for the subject field, it is not difficult to find objects with similar properties. To do that, a proximity measure computed from statistical data is used, and the most "relevant" object is sought.
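One common proximity measure is cosine similarity between property vectors. The sketch below assumes it as the measure (the text above does not prescribe a specific one), and the objects, their property weights, and the function names cosine and most_similar are made up for illustration; in practice the weights would come from statistics collected over a corpus or an ontology.

```python
import math

# Hypothetical objects described by weighted properties.
OBJECTS = {
    "lion": {"brave": 0.9, "strong": 0.8, "dangerous": 0.7},
    "mouse": {"timid": 0.9, "small": 0.8, "quiet": 0.6},
    "rock": {"hard": 0.9, "strong": 0.5, "heavy": 0.8},
}


def cosine(a, b):
    """Cosine similarity between two sparse property vectors (dicts)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(properties, objects=OBJECTS):
    """Return the name of the object closest to the given properties."""
    return max(objects, key=lambda name: cosine(properties, objects[name]))


if __name__ == "__main__":
    # "He is brave and strong": which known object is the closest match?
    print(most_similar({"brave": 1.0, "strong": 1.0}))
```

For a metaphor, the properties would be those attributed to the person or thing being described, and the returned object is the candidate the metaphor most plausibly refers to.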
Now for machine translation. As I said above, the problem of determining the meaning of a particular polysemous word can be solved with statistical analysis. So the only real problem is generating grammatically correct text. There are two subtasks:
1. Correctly determining the links between the words.
2. Properly mapping the structures found onto the new language.
The task of determining the links between words is essentially the same classification problem, where the classes are all the possible links between the words. Libraries such as the Stanford Parser use probabilistic models to determine the most "correct" version of the links between the words.
However, there are problems with mapping the structures found onto the new language. These problems stem from the nature of translation itself; they are not specific to computers. Professional translators never simply list the languages they know; instead they state the language pair and direction of translation. For example, a translator may be able to translate from Italian into Russian, but not from Russian into Italian. That is, they can translate in the reverse direction, but far from perfectly. The problem lies in translating structures from one language into another that may have no corresponding word or construction. What to do in this case is unclear. That is why computational and theoretical linguistics continue to develop, introducing more and more new rules. At the same time, encoding those rules in a machine translation program is not difficult.
A big problem
Thus, computers can already extract facts from text, detect the mood of the author, recognize sarcasm, and much more. So what is the problem? Why is there still no universal "machine" that could take a text and solve all the tasks a human can? After a few years of practice in NLP, I have come to the conclusion that the difficulty lies in organizing intelligent text processing systems. Building a system out of several components causes combinatorial growth in the links between them, and all these relations, with their probabilistic parameters, have to be taken into account. For example, opinion mining can be done either with machine learning or with manually created rules. However, if these two approaches are combined, the question arises of how much each of them should affect the result. What will it depend on? What is the nature of these relations? How are the numerical parameters calculated? The field of natural language processing is still in its infancy; so far, humanity can only build systems that solve local tasks. What will happen when all the local tasks are solved? Will humans have enough capacity (memory, speed of thought) to combine all the accumulated experience? It is difficult to predict.
Links to resources:
Bo Pang, Lillian Lee. Opinion Mining and Sentiment Analysis
Bing Liu. Opinion Mining
Casey Whitelaw. Using Appraisal Groups for Sentiment Analysis
Rada Mihalcea. Making Computers Laugh: Investigations in Automatic Humor Recognition