Tuesday, May 5, 2015

Approaches To Machine Translation

The aim of machine translation (MT) is to produce high-quality translations automatically. Though this aim has not yet been achieved, MT research has made significant progress. This article explains the main design strategies for machine translation. Bilingual MT systems are designed for one particular pair of languages, for example Tamil and English. Multilingual MT systems support translation among more than two languages. There are three basic approaches, listed below.
  1. Direct Translation
  2. Interlingua Approach
  3. Transfer Approach
The direct translation approach belongs to the first generation of machine translation systems; the interlingua and transfer approaches are characteristic of the second generation.

1. Direct Translation

Direct translation is also called the binary translation approach because it is designed for one particular pair of languages. Translation proceeds directly from the source text to the target text, as sketched below. Very little syntactic or semantic analysis is done; source language analysis is oriented specifically towards producing an equivalent target language text. MT systems that follow this approach rely on a bilingual dictionary.
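
As a rough sketch of the idea, a direct system can be pictured as word-by-word lookup in a bilingual dictionary. The tiny English-French dictionary below is invented purely for illustration; real systems also apply local reordering and morphological rules.

    # Toy sketch of direct translation: word-by-word bilingual dictionary lookup.
    # The dictionary entries are illustrative, not a real MT lexicon.
    BILINGUAL_DICT = {
        "the": "le", "cat": "chat", "sleeps": "dort", "black": "noir",
    }

    def direct_translate(sentence):
        """Translate word by word; unknown words pass through unchanged."""
        return " ".join(BILINGUAL_DICT.get(w, w) for w in sentence.lower().split())

    print(direct_translate("The cat sleeps"))  # -> "le chat dort"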

2. Interlingua Approach

An interlingua is a knowledge representation formalism. It is language neutral, as it is independent of the way particular languages express meaning. Interlingual machine translation produces the interlingua in such a way that translating from the source language into more than one target language is possible, as the sketch below illustrates. The interlingua approach faces the challenge of designing an efficient and comprehensive knowledge representation. Moreover, complete resolution of all ambiguities in the source language text is required to make this approach work.
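
A minimal sketch of the idea, assuming a toy frame-style interlingua: a single language-neutral representation feeds simple generators for several target languages. The frame format and the word lists are invented for illustration.

    # Sketch: one language-neutral meaning representation, many target generators.
    # Interlingua: a predicate-argument frame, independent of any language.
    meaning = {"event": "DRINK", "agent": "JOHN", "theme": "WATER"}

    LEXICON = {
        "en": {"DRINK": "drinks", "JOHN": "John", "WATER": "water"},
        "es": {"DRINK": "bebe", "JOHN": "Juan", "WATER": "agua"},
    }

    def generate(frame, lang):
        """Realise the frame as a subject-verb-object sentence in `lang`."""
        lex = LEXICON[lang]
        return f"{lex[frame['agent']]} {lex[frame['event']]} {lex[frame['theme']]}."

    for lang in ("en", "es"):
        print(generate(meaning, lang))  # John drinks water. / Juan bebe agua.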

3. Transfer Approach

The transfer approach has three stages. The first stage converts the source language text into abstract, source-language-oriented representations. The second stage deals with lexical differences between the languages and converts the representations resulting from the first stage into corresponding target language representations. Finally, the third stage generates the target language text; the sketch below pictures the pipeline. Transfer systems require dictionaries for the source and target languages containing morphological, grammatical, and semantic information, plus a bilingual transfer dictionary that relates base source language forms to base target language forms.
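
The pipeline can be sketched as follows. The representations here are drastically simplified and invented for illustration; real transfer systems operate over full syntactic analyses.

    # Sketch of the three transfer stages over a toy representation.
    # Stage 1, analysis: source text -> abstract source-oriented representation.
    def analyse(text):
        return [{"lemma": w} for w in text.lower().split()]

    # Stage 2, transfer: map source lemmas to target lemmas (bilingual dictionary).
    TRANSFER_DICT = {"the": "le", "dog": "chien", "barks": "aboie"}  # illustrative
    def transfer(src_repr):
        return [{"lemma": TRANSFER_DICT.get(t["lemma"], t["lemma"])} for t in src_repr]

    # Stage 3, generation: target representation -> target text.
    def generate(tgt_repr):
        return " ".join(t["lemma"] for t in tgt_repr)

    print(generate(transfer(analyse("The dog barks"))))  # -> "le chien aboie"
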
Both the interlingua and transfer approaches are now known generically as rule-based systems.

Monday, May 4, 2015

Introduction To Expert Systems

Introduction

An expert is a person who is well versed in a particular domain such as robotics, linguistics, medicine, or finance. The knowledge required for solving problems in a given area is called the knowledge domain. A typical expert must be in a position to advise on matters related to his area of expertise and to answer queries from his clients. For example, a physician should be able to answer queries such as which medicine should be given to a patient, what the dosage is for a particular disease, what kind of food a person suffering from diabetes should take, and so on. He need not answer queries on linguistics, robotics, or finance, as these are outside his domain. Why can a computer system not do this job? The answer lies in expert systems.

What are expert systems?

Expert systems are artificial intelligence programs capable of performing the functions of a human expert. A medical expert system, for example, should have knowledge of various kinds of patients, disorders, symptoms, medicines, and so on. The knowledge base and the inference engine are the two major components of an expert system, as the sketch below illustrates. A knowledge engineer develops the required knowledge by interviewing a human expert, gathering as much knowledge as possible, and storing it in a knowledge base. Knowledge can also be learned automatically from other sources using machine learning techniques. A user interface accepts queries from the user; these queries are passed to the inference engine, which draws inferences from the available rules and the knowledge base and returns a response or advice. An expert system also provides an explanation facility to explain to the user how it arrived at its conclusion. Many such products are available in the market: MYCIN, for example, is a medical expert system, and ELIZA is an early program for human-computer interaction.
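
A minimal sketch of the two components, assuming a toy forward-chaining design: a knowledge base of if-then rules and an inference engine that records which rules fired, so that it can explain its conclusion. The medical rules are invented placeholders, not real knowledge.

    # Knowledge base: if-then rules of the form (set of conditions, conclusion).
    # These rules are invented placeholders for illustration only.
    RULES = [
        ({"fever", "cough"}, "flu_suspected"),
        ({"flu_suspected", "high_risk_patient"}, "refer_to_physician"),
    ]

    def infer(initial_facts):
        """Forward-chaining inference engine; returns facts plus a trace."""
        facts, trace = set(initial_facts), []
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in RULES:
                if conditions <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    trace.append(f"{sorted(conditions)} => {conclusion}")
                    changed = True
        return facts, trace

    facts, explanation = infer({"fever", "cough", "high_risk_patient"})
    print(facts)
    for step in explanation:  # the explanation facility: why each conclusion holds
        print(step)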

Advantages and disadvantages

Computers naturally have many advantages over human beings: they do not get exhausted or bored, they work faster and more consistently, and they cost less to run. An expert system inherits these advantages. On the other hand, expert systems struggle to match humans in intelligence and common sense, and building a knowledge base on par with a human expert's is a tedious job.

Introduction To Natural Language Processing

Introduction

Natural Language Processing (NLP) is the study of making machines communicate with human beings through natural language. Developing a natural language system requires knowledge of many areas, including linguistics, artificial intelligence, cognitive science, and psychology. Though natural language processing is a vast area, many researchers work on one portion of it, such as text segmentation, tagging, information extraction, or machine translation.

Analysis In Natural Language Processing

The process of natural language analysis proceeds in stages: tokenization, lexical analysis, syntactic analysis, semantic analysis, and pragmatic analysis. In the first stage, the given text is split into words, or tokens; this splitting is called tokenization. The main problem in tokenization is finding word boundaries when ambiguity exists: abbreviations, for example, contain dots that may confuse the tokenizer. Analysing the lexicon and finding the lexical features of the words is the next stage. For machine translation tasks, the lexical features should include parts of speech. The lexicon, which contains the words and the features associated with them, is obviously central to lexical analysis. To reduce its size, only the base forms of words are kept, and morphology is used to derive the other forms. For example, the base form walk is enough for the lexicon; the other forms can be derived by adding suffixes, say s for the plural, ed for the past tense, and ing for the present participle. Both steps are sketched below.
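
A small sketch of tokenization and suffix-stripping morphology; the abbreviation list, suffix rules, and one-word lexicon are invented simplifications.

    import re

    # Naive tokenizer: keeps dots inside known abbreviations, splits elsewhere.
    ABBREVIATIONS = {"dr.", "mr.", "e.g."}  # illustrative list

    def tokenize(text):
        tokens = []
        for chunk in text.split():
            if chunk.lower() in ABBREVIATIONS:
                tokens.append(chunk)  # keep the dot: not a sentence boundary
            else:
                tokens.extend(re.findall(r"\w+|[^\w\s]", chunk))
        return tokens

    # Naive morphology: strip a suffix if the remainder is a base form.
    LEXICON = {"walk"}  # base forms only; other forms are derived

    def lemma(word):
        for suffix in ("ing", "ed", "s"):
            if word.endswith(suffix) and word[: -len(suffix)] in LEXICON:
                return word[: -len(suffix)]
        return word

    print(tokenize("Dr. Smith walked."))  # ['Dr.', 'Smith', 'walked', '.']
    print(lemma("walks"), lemma("walked"), lemma("walking"))  # walk walk walk
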
Once lexical analysis is done, analysis moves to the sentence level. Parsing determines the syntactic structure of a sentence; a procedure called a parser is used for this purpose. Understanding the text is the critical stage in natural language processing. Semantic analysis finds the meaning of the given text, and that meaning may vary depending on the context in which a sentence was uttered. In a text, understanding one sentence may require understanding the previous sentence as well. For example,
Julia won the race. She was very happy.
In the above text, someone who reads only the second sentence cannot know who she is. The purpose of discourse analysis is to find these interdependencies. Finally, world knowledge is also required to understand the text; pragmatics addresses this.

Natural Language Generation

An NLP system should also be able to generate natural language sentences automatically. A natural language generator reasons about its motivation, decides what to utter and how to utter it, and then formulates the output; the sketch below illustrates the formulation step. Natural language generation depends on many kinds of knowledge: domain knowledge, linguistic knowledge, strategic and rhetorical knowledge, and knowledge of text types.
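
A tiny sketch of the formulation step, assuming a simple template-based generator; the templates and the content plan are invented for illustration.

    # Template-based formulation: the final step of generation.
    TEMPLATES = {
        "win": "{agent} won the {object}.",
        "feel": "{agent} was very {state}.",
    }

    def realise(message):
        """Fill the template chosen for this message with its arguments."""
        return TEMPLATES[message["act"]].format(**message["args"])

    plan = [  # "what to utter": a content plan produced by earlier reasoning
        {"act": "win", "args": {"agent": "Julia", "object": "race"}},
        {"act": "feel", "args": {"agent": "She", "state": "happy"}},
    ]
    print(" ".join(realise(m) for m in plan))
    # -> "Julia won the race. She was very happy."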

Approaches to Natural Language Processing

The symbolic approach, the empirical approach, and the artificial neural network approach are the main approaches. The symbolic approach applies linguistic theory, with the linguistic knowledge typically encoded as rules. Corpora and statistical methods are central to the empirical approach; a corpus is a large collection of texts, and preparing it is the tedious part of this approach. A miniature example of the empirical approach is sketched below. The artificial neural network approach is a different way of implementing NLP functions, and hybrid approaches are becoming popular nowadays.
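
A miniature illustration of the empirical approach: estimating word statistics from a corpus. The two-sentence corpus is invented; real systems use far larger corpora and richer models.

    from collections import Counter

    # Empirical approach in miniature: learn statistics from a corpus.
    corpus = ["the cat sat on the mat", "the dog sat on the log"]  # illustrative
    counts = Counter(w for sentence in corpus for w in sentence.split())
    total = sum(counts.values())

    def unigram_probability(word):
        """Relative frequency of the word in the corpus."""
        return counts[word] / total

    print(unigram_probability("the"))  # 4/12: "the" is the most frequent word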

Monday, April 6, 2015

Overview on machine learning

What is machine learning

As the name implies, machine learning means making the computer able to learn on its own. It is one of the most active research areas within the purview of artificial intelligence. Technically, it is defined as the use of computational methods and past information to improve performance with practice and to acquire knowledge automatically.
Machine learning involves designing efficient and accurate prediction algorithms, and it relies heavily on induction: an algorithm is given labelled training examples and produces its result in the form of a prediction rule.
As the complexity, variety, and size of machine learning models increase, it becomes necessary to use optimization methods that scale well and have sound theoretical properties.

Need for machine learning

Learning is a central feature of intelligent systems; it is not possible to build an intelligent system without a learning module. Learning also contributes to developing mechanisms for cognition, perception, and action.

Types of machine learning

There are two main kinds of learning: supervised and unsupervised. The supervised approach aims to deduce input-output relations from input-output samples; once this relation is learned, output values can be predicted for unseen inputs. The unsupervised approach, on the other hand, does not receive output values in its training samples; it aims to extract whatever relevant structure the given data contains. The sketch below contrasts the two.
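
The contrast can be sketched with toy one-dimensional data, all invented: a 1-nearest-neighbour classifier learns from labelled input-output pairs, while a simple gap-based grouping works on the inputs alone.

    # Supervised: learn an input-output relation from labelled samples.
    train = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (9.1, "large")]

    def predict(x):
        """1-nearest-neighbour: copy the label of the closest training input."""
        return min(train, key=lambda pair: abs(pair[0] - x))[1]

    print(predict(1.5), predict(8.5))  # -> small large

    # Unsupervised: no output labels; just group nearby inputs together.
    points = sorted([1.0, 1.2, 8.0, 9.1, 1.5])
    clusters, current = [], [points[0]]
    for p in points[1:]:
        if p - current[-1] <= 2.0:  # same cluster if the gap is small
            current.append(p)
        else:
            clusters.append(current)
            current = [p]
    clusters.append(current)
    print(clusters)  # -> [[1.0, 1.2, 1.5], [8.0, 9.1]]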

Goals

The main goal is to design general-purpose algorithms that are efficient, economical in the amount of data they require, and applicable to a wide variety of problems.
The result of the machine learning process should be a prediction rule that makes predictions as accurately as possible, and one that human experts can understand easily.

Applications

Applications of machine learning include text classification, language processing, speech recognition, computational biology, games, medical diagnosis, computer vision, data mining, robot control, and so on.