Unit-5

UNIT 5 COMPUTER DICTIONARIES AND ONLINE DICTIONARIES
Structure
5.0 Objectives
5.1 Introduction
5.2 Computer and Areas of its Applications
5.2.1 Main Functions of Computer
5.2.2 Secondary Functions of Computer
5.2.3 Application Software
5.3 Functions of the Dictionary
5.3.1 Digital dictionary
5.4 What is a Computer Dictionary?
5.4.1 Computer Dictionaries and Printed Dictionaries : Salient Features
5.4.2 Role of Computers in Dictionary-Making
5.4.3 Dictionaries for Computers
5.5 Online Dictionaries
5.5.1 What is Online Dictionary?
5.5.2 Online Dictionaries and Computer Dictionaries : Salient features
5.6 Knowledge Representation
5.6.1 Syntactic
5.6.2 Semantics
5.6.3 Use of Knowledge Representation by Computer
5.7 Computer and Lexical Resources
5.8 Let Us Sum Up
5.9 Glossary
5.10 Exercises
5.11 Suggested Readings
5.0 Objectives
Until the end of the last century, we understood a dictionary to be a
reference book that helps readers of a language understand the meaning of
difficult words. In the last quarter of the last century, attempts were made to accomplish
various language-related tasks with the help of computers, and this prompted fresh thinking
about the functions of the dictionary. A computer can handle a variety of inputs such as
printed text, sound, pictures, movies, etc., and this led to a radical change in the form of
dictionaries. In this context, we will see in this unit how a computer dictionary is made
and what its functions are. After reading this unit you
should be able to:
• explain the functions of a dictionary;
• introduce various language-related computer programmes;
• explain the nature and characteristics of a digital dictionary;
• describe the role of computers in linguistic computation;
• introduce the different types of knowledge representation;
• explain the concept of knowledge representation; and
• differentiate between computer terminology and lexical resources.
5.1 Introduction
The computer is often called a magic box. It can solve complex problems in areas such as missile
engineering, space flight and weather forecasting very quickly and with near-perfect
accuracy. It is said that the computer can work many times faster than the human brain. This
is especially true of mathematical problems and detailed statistics. But there is
one area where computers cannot compete with the human brain: the use of
natural languages. This is because a computer does not have a brain of its own. It is a
mechanical device that performs its work according to stated rules. It does not have
any knowledge or thinking of its own. It has therefore been thought that if artificial
intelligence is provided to the computer, it will be able to perform linguistic tasks more easily
and reliably. Here artificial intelligence means the same kind of knowledge that the
speakers of a language have in their brains. With it, the computer will not only be able to
present the most useful dictionary to us, but will also mark up the parts of the dictionary in
such a way that it can use them itself in programmes such as translation.
5.2 Computer and Areas of its Applications
We have discussed that a computer can perform a variety of tasks. When computers came
into existence, they were called ‘number crunching machines’ because they could multiply
and divide thousands of numbers in minutes, and do so reliably, whereas a person would
take hours for this work and the result would still not be reliable. Later it was thought that
computers could also be used for linguistic tasks.
The first linguistic application that computer scientists envisioned was
translation. It was thought that computers would let us do translation work
faster and better, which is one of the biggest needs of the modern era. Behind this vision
was a belief in the immense power of the computer. The biggest obstacle to realizing
it was how to provide the computer with information about the structure of the language and
the meaning of words, so that it could do this work well.
Why a computer needs linguistic knowledge can be explained with the help of an example.
Take the Hindi sentence ‘आनंद घर में नहीं है।’. A machine may translate it as ‘There is no
happiness at home’ while the expected translation is ‘Anand is not at home.’ It is a feature
of the language that a word can be used as the name of a person and also as a common noun.
If we can give this information about word categories to the computer, it will be able to
translate correctly.
In this way, we can say that if the knowledge we have about the use of language can
also become a part of computer programmes, then we will be able to carry out many
language-related tasks naturally.
In this context, we will further discuss the applications of the computer and examine the
nature of various applications. This will show us which language resources we
need for language tasks.
5.2.1 Main Functions of Computer

We discussed that the computer performs operations like addition, multiplication and division
on numbers. Similarly, using the characters of languages like English and Hindi,
it creates text in those languages. We can edit these texts, for example by correcting spelling.
We call these two functions ‘data processing’ and ‘word processing’ respectively.
Data Processing : We discussed that computers solve mathematical problems very quickly.
We call this work ‘data processing’. Data processing is a simple task not only for computers,
but also for modest calculators. The cells of the computer’s memory represent numbers. The
computer does not recognize decimal digits directly. It detects the presence or absence of
electric charge in the cells as ‘1’ (one) and ‘0’ (zero) respectively. Decimal numbers are
represented by this ‘binary system’. The table below gives the place value of each position.
The decimal value of a binary number is obtained by adding the place values of the positions
that hold a 1. The examples go only up to 255, which requires 8 places in the binary system.
Place              8    7    6    5    4    3    2    1    Decimal value
Place value      128   64   32   16    8    4    2    1
Binary number 1    0    1    0    1    0    1    0    1    = 85
Binary number 2    1    0    0    1    1    0    1    0    = 154
Binary number 3    1    1    0    1    1    0    0    1    = 217
Notice that the binary numbers above have eight places. In this way, we can represent numbers
from 0 to 255 (e.g. 1 = 00000001; 255 = 11111111) with the help of 8 bits.
By increasing the number of bits, we can represent even larger numbers.
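The place-value addition described above can be tried out with a short programme. The following is only a minimal sketch in Python; the names PLACE_VALUES and binary_to_decimal are our own and are used purely for illustration.

```python
# Place values of the eight positions, from position 8 down to position 1.
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def binary_to_decimal(bits):
    """Add the place value of every position that holds a 1."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight characters, each 0 or 1")
    return sum(value for bit, value in zip(bits, PLACE_VALUES) if bit == "1")


# The three binary numbers from the table above.
for bits in ["01010101", "10011010", "11011001"]:
    print(bits, "=", binary_to_decimal(bits))
```

Python's built-in int(bits, 2) performs the same conversion; the function above only spells out the addition step by step.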
A group of 8 bits is called a ‘byte’. If we treat each eight-bit pattern (byte) as the symbol
of a letter of the language, then with the 256 possible patterns we can represent 256
characters in the computer. While numbers are infinite, the alphabet of a
language is limited. English requires a maximum of about 80 characters, including 52 letters
and punctuation marks. The 256 available patterns can hold all the letters of Hindi (about
100), the 52 letters of English and the associated diacritical marks. That is, in the same
programme we can work in two languages. Nowadays, in the era of 32-bit and 64-bit computers,
a character need not be limited to a single byte, so we can represent lakhs of characters.
That is, on the same computer we can work in all the
languages of the world. The character-encoding system called ‘Unicode’
(Universal Code) was made the basis for this. To work globally, we should adopt
Unicode.
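To see how Unicode gives every character its own number, we can ask the computer for the number of a Latin letter and of a Devanagari letter. The sketch below is in Python. It also shows how many bytes each character occupies in UTF-8, one common way of storing Unicode text; UTF-8 is not discussed in this unit and is brought in here only as an assumption for illustration, as is the function name describe_character.

```python
def describe_character(ch):
    """Return (Unicode code point, number of bytes in UTF-8) for one character."""
    return ord(ch), len(ch.encode("utf-8"))


# One English letter and one Hindi letter.
for ch in ["A", "क"]:
    code_point, size = describe_character(ch)
    print(ch, "code point:", code_point, "bytes in UTF-8:", size)
```

The English letter fits in one byte, while the Hindi letter has a number above 255 and therefore needs more than one byte.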
Word Processing : Working with the text of a language on a computer is called ‘word processing’.
Here is a brief introduction to it.
When we copy out a text by hand, we often need to cut, add or move material from
one place to another in order to edit it. These tasks are difficult on paper but
very easy on a computer. Some of the facilities
available on the computer are described below:
1. Insert : It is difficult to add something to text written on paper, because there is a
lack of space. On the computer you can add anything (text, pictures, diagrams, animations,
etc.) anywhere.
2. Cut : If we want to delete any written part, we select (highlight) it and give the
instruction to delete it. If you cut a part by accident, you can undo the action by pressing
the Control and ‘Z’ keys together or by clicking on ‘undo’.
3. Copy and Paste : If we want any text in another place as well, that is, if we want to copy
it, we select (highlight) that part and paste it in the other place.
4. Cut and Paste : We can cut the selected part from its place and paste it at another place.
5. Find and Replace : Suppose you think a name has been misspelled and you want to
correct it. You type the wrong word in the search box, and the computer
finds that word for you. You can then type in a new word to replace it. With the instruction
‘Replace all’, the computer replaces hundreds or thousands of occurrences of the wrong form
in the text at once (within a few seconds).
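The ‘Replace all’ instruction described in the list above can be imitated in a few lines. This is only a sketch in Python; the function name replace_all and the sample sentence are hypothetical. Note that it replaces every matching run of letters, even inside a longer word, which a real word processor lets you control with a ‘whole words only’ option.

```python
def replace_all(text, wrong, right):
    """Return the corrected text and the number of replacements made."""
    count = text.count(wrong)
    return text.replace(wrong, right), count


# Hypothetical sample text with a misspelled name.
text = "Romesh went home. Later Romesh called Sita."
fixed, count = replace_all(text, "Romesh", "Ramesh")
print(count, "replacements made:", fixed)
```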
When a copy of the original manuscript is sent for printing, the visible
form of the text also has to be indicated. We call this work ‘copy editing’. A ‘copy edit’
includes the following: font selection (different fonts/letter layouts for each item), font
size (ranging from 6, the smallest, for footnotes etc., to 72, the largest, for posters etc.),
font colour, etc. Other features available through the instruction buttons of the computer
are bold, italics, underline and alignment (left, right, centre). Even if the text runs to
40-50 pages, the computer carries out these instructions in a few seconds. Many other similar
facilities are available on the computer. You can learn about them through self-study.
5.2.2 Secondary Functions of Computer

The programmes we are going to discuss here are called ‘utilities’ in computer terminology. We
call them secondary functions because we can do ‘word processing’ even without them.
In fact, these functions are not available in all languages. For example,
earlier there was no programme for checking spelling and grammar in Hindi. When such
programmes become available, they are a great convenience to users working in that language.
Let us have a look at some of the major utility programmes:
1. Spell check :
With this we can find all the spelling mistakes in the typed text and choose the correct
spelling. The computer recognizes the wrong words and marks them, so
we can improve the text quickly and accurately. This programme not only
recognizes wrong words but also suggests correct substitutes, and if you choose one of the
options the computer replaces the wrong word without any retyping.
One of the great features of Spell Check is the ability to correct a misspelled word in all
places at once with the help of ‘Find and Replace’. For example, if we have typed
‘Romesh’ instead of ‘Ramesh’ throughout a novel, then by pressing one key the word can be
corrected in hundreds of places simultaneously. The computer also lets us add
words such as names of people and names of cities to its memory (i.e. to the
dictionary stored in the computer) so that the spelling of these words can also be
checked in future.
How does the Spell Check programme work? It matches the typed words with the
words entered in the dictionary inside the computer and marks the words which are new or
wrong. This matching is not done on the whole word at once; the computer
compares the words character by character, in sequence. Because of the speed of the computer,
this work is done quickly, even though matching a single word may require
hundreds of operations.
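The working of a spell check can be sketched as follows. This is a simplified illustration in Python, not the method of any particular word processor. The word list DICTIONARY and the function spell_check are hypothetical; the suggestions come from get_close_matches in Python's standard difflib module, and the lookup uses Python's built-in set membership rather than an explicit character-by-character loop.

```python
import difflib

# A tiny stand-in for the dictionary stored in the computer (hypothetical word list).
DICTIONARY = {"the", "computer", "checks", "spelling", "of", "every", "word"}


def spell_check(text, dictionary=DICTIONARY):
    """Return {unknown word: list of suggested corrections}."""
    report = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?")
        if word and word not in dictionary:
            # Suggest up to three dictionary words that look similar.
            report[word] = difflib.get_close_matches(word, sorted(dictionary), n=3)
    return report


print(spell_check("The computer cheks speling of every word."))

# The user can add names to the dictionary so that they are no longer marked.
DICTIONARY.add("ramesh")
print(spell_check("Ramesh checks every word."))
```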
2. Grammar Check :
Grammar Check is used to find grammatical mistakes in text written in a particular
language. It flags incorrect structures of that language. But it can work only
when it has been prepared for that particular language. That is, if we want to check the
grammar of content typed in Hindi, then a grammar-checking programme
for Hindi has to be developed.
It is important to note that the grammar check function is different from the spell check
function. At the same time, it is said that spell check without grammar check is not very
useful. A grammar check programme helps in correcting many grammatical errors. This type of
programme works successfully for English, while there is a lack of an effective
grammar-checking programme for Hindi.
3. Auto Correct :
It is a part of word processing programmes. Suppose people type ‘grammar’ as ‘grammer’; then
the computer automatically corrects the word as it is typed. How does auto-correct work in a
computer? It matches the typed words against a list of words stored in the computer and
automatically corrects those that are misspelled. The computer does this self-correction
on the basis of this pre-set list of words. Apart from this, you can also add your own words
to the list. This facility is also available in Hindi. Because of the speed of computers, this
work is done automatically and typographical errors are reduced.
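Auto-correct can be pictured as a simple look-up in a list of known mistakes. The sketch below is in Python; the list AUTOCORRECT and the function auto_correct are hypothetical, and for simplicity the sketch ignores capital letters and punctuation, which a real word processor would preserve.

```python
# Hypothetical pre-set list: misspelled form -> correct form.
AUTOCORRECT = {"grammer": "grammar", "teh": "the", "recieve": "receive"}


def auto_correct(text, corrections=AUTOCORRECT):
    """Replace every word found in the list; leave all other words unchanged."""
    corrected = [corrections.get(word.lower(), word) for word in text.split()]
    return " ".join(corrected)


# The user can add an entry of their own to the list.
AUTOCORRECT["adn"] = "and"

print(auto_correct("teh grammer adn spelling"))
```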
4. Creating Number Order or Alphabetical Order (Sorting) :
With the help of this we can arrange thousands of words in alphabetical order in minutes. Not
only words but also numbers can be put in the correct order with its help,
either in ascending or in reverse order. Doing this work on paper takes a lot of time and
labour, and errors still remain. The alphabetical order is determined by the order
of the alphabet of the language. Just as the computer presents the numbers 1, 2, 3
etc. in ascending or descending order, it can also present the words of the language
in alphabetical order in either direction (increasing or decreasing). But for this, it is very
important that the alphabetical order is fixed and is harmonized across all
keyboards and fonts.
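The following sketch in Python shows sorting in both directions, and also why a fixed alphabetical order matters. By default the computer compares the internal numbers of the characters, so capital letters come before small letters; when we supply our own alphabet, the words come out in the expected order. The names ALPHABET, RANK and sort_key are our own, chosen for illustration.

```python
words = ["mango", "apple", "cherry", "banana"]
numbers = [42, 7, 19, 3]

print(sorted(words))                # ascending alphabetical order
print(sorted(words, reverse=True))  # descending (reverse) order
print(sorted(numbers))              # numbers in ascending order

# A fixed alphabet decides the order; position in this string is the rank.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def sort_key(word):
    """Translate a word into the ranks of its letters (unknown characters go last)."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


mixed = ["Banana", "apple", "Cherry"]
print(sorted(mixed))                # default order: capitals first
print(sorted(mixed, key=sort_key))  # order according to our fixed alphabet
```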
5.2.3 Application Software
We type the text of a language on the computer and make necessary amendments and additions
to it. Can the computer not do other language-related tasks that are necessary for
humans? Computers can compose poems and stories, and can be a good tool for teaching
language. If we can use computers to translate, then a great need of the modern age can be
met. For all these tasks, we have to give specific types of information to the computer. The
use of that information by the computer is what we call ‘artificial intelligence’. That is,
it acts as if it were performing the task after careful thought, on the basis of knowledge
of the subject.
Playing chess, driving a robot etc. are also tasks accomplished by artificial intelligence, but
language plays no role in them. We call the language-related tasks of artificial intelligence
‘Natural Language Processing’ (NLP) programmes, because they involve the same kind of
language-related tasks that humans perform in a natural and instinctive way. Let us get
acquainted with some of the major programmes of natural language processing. One of
these is related to ‘character recognition’ and two to sound and writing. The two
programmes related to sound and writing are (1) reading out a written text; and (2) writing
down a spoken text. In addition, there are programmes for
‘transliteration’ and ‘translation’. Let us first learn about the programme called ‘OCR’.
(a) Optical Character Recognition (OCR) :
The basis of the character recognition programme is scanning. Scanning is the process in which
the computer captures a picture etc. as points of light and digitizes it.
The system saves the result either in pictorial form (i.e. as a picture) or in the
form of text. When a computer ‘reads’ printed or typed (and sometimes handwritten) text, it
displays it on the computer screen as text. We can edit this text, whereas text in
pictorial form cannot be modified or edited in any way. This type of computer programme for
recognizing characters in text is called ‘Optical Character Recognition (OCR)’.
How does this programme work? When we select the OCR option, the computer
recognizes each character of the text one by one and saves them as editable text (as if typed).
Character recognition is a different process from scanning a picture or page with a
scanner programme. After the printed text is scanned, the programme recognizes each character
and saves them as characters. Sometimes the computer makes mistakes in character recognition.
We can then run the ‘Spell Check’ programme to correct them.
Why is a character recognition programme necessary? What are its benefits? Many
times typed work is available on paper but, for some reason, cannot be found on the hard disk
of the computer, on a CD or on a pen drive. Or publishers have to republish a book
several times, and if the book is not available on a hard disk, pen drive, CD etc., then
re-typing and proof-reading of the entire book become necessary. In this type of situation, a
character recognition programme proves to be very useful.
In English, this type of character recognition programme works
successfully, which makes it possible to save text available in English as a picture or as text.
On the other hand, there is a lack of effective programmes to save content as
text in Hindi. C-DAC, a government organization, has made a character recognition
programme, but it is not commercially available for everyone’s use.
Let us now look at the two programmes related to sound and writing: reading out written
text and writing down spoken text. The first of these is called a ‘text-to-speech’ and the
second a ‘speech-to-text’ programme.
(b) Text to Speech :

The ‘Text to Speech’ programme reads out the printed text and presents its spoken
(pronounced) form. This programme is a good means of providing education to the lakhs of blind
people in the country. Language students, tourists etc. can also benefit by listening to
written material through it. You may ask how this is possible
on a computer.
The programme takes words from a lexicon pre-recorded in a human voice on the computer and
presents them as normal reading. That is, if the spoken form of all the words is recorded on
the computer, then the reader programme can fetch their pronunciation and present it. This
requires material recorded beforehand in human pronunciation. Along with this, the features of
pronunciation are also added to it. If the written text is not in the standard spelling or if a
word is not in the computer’s dictionary, then the computer cannot pronounce it.
The computer’s pronunciation sounds artificial; you must have heard the mechanical
voice of robots etc. in films. To produce natural, human-like speech, we have to
give information about proper stress, intonation etc., which complicates the task. A programme
of this kind has been prepared for Hindi as an experiment. In some programmes available on the
net, you enter some sentences and they pronounce them.
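The lexicon look-up described above can be sketched as follows. This Python sketch does not play any sound; it only decides which recorded clips would be played and which words cannot be pronounced because they are missing from the lexicon. The lexicon LEXICON, the file names in it and the function clips_for are all hypothetical.

```python
# Hypothetical pronunciation lexicon: written word -> name of a recorded audio clip.
LEXICON = {"good": "good.wav", "morning": "morning.wav", "india": "india.wav"}


def clips_for(text, lexicon=LEXICON):
    """Return (clips to play in order, words that cannot be pronounced)."""
    clips, missing = [], []
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in lexicon:
            clips.append(lexicon[word])
        else:
            missing.append(word)
    return clips, missing


clips, missing = clips_for("Good morning Bharat!")
print("Play:", clips)
print("Not in the lexicon:", missing)
```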
(c) Speech to Text :

This programme listens to speech and presents its written form. While using it, you
speak and the computer listens and types what you say as text. This
programme is extremely useful because writers do not have to depend on typists and work
can be done at a faster pace. It is often seen that stenographers take dictation but cannot
resolve a confusion on the spot. When the computer takes dictation, we can immediately see
which spoken word it has not caught. In this way, quality comes along with speed. It is an
indispensable assistant for working on computers in offices. Taking dictation of numerical
data containing decimals or fractions is a difficult task for a stenographer; we can
enter such numbers in the computer by dictation and check them on the
computer screen itself.
How does the computer do this work? The computer stores the written form of the language
as well as the spoken form. When you speak, the computer matches your speech against what it
has stored. When a match is found, it brings the written form to the screen. In other words,
the computer produces the word in typed form by matching your pronunciation with the
pronunciation of the words recorded in its memory. It does not make mistakes in matching
long words like ‘unconsciousness’ because there is no similar word, but with short words like
‘fine’ and ‘pain’, the chances of a mistake are greater. The basic reason for this is that the
computer does not recognize each sound, but tries to capture the entire pronunciation of the
word. This shows that getting the pronunciation right is vital to the successful use of this
type of programme.
We all have individual characteristics of speaking style and pronunciation. How then can the
computer recognize everyone’s pronunciation? For this, each of us has to train the computer.
When we read out words stored in the computer, the computer records our pronunciation and
intonation/style and later matches our speech against this record. For this
reason, training is essential for every person who uses the programme.
The following two programmes, Transliteration and Translation, are related to the field of
translation. As you are studying a translation course, you will study these two topics in
detail elsewhere. Here we give only a general introduction:
(d) Transliteration :
In this, we present text written in one script in another script. Transliteration between
languages of the same language family (such as Hindi, Gujarati, Marathi etc.) is easier, but
transliteration between languages of two different language families (such as English and
Indian languages) is somewhat more difficult.
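One reason why transliteration between Indian scripts is comparatively easy on a computer is that Unicode arranges the Devanagari and Gujarati letters in parallel blocks, so most letters can be converted by adding a fixed number to the character code. The following is a minimal sketch in Python under that assumption. The function name is our own, and a few positions in one block have no counterpart in the other, so a practical tool would use an explicit table of letters instead of pure arithmetic.

```python
# Unicode ranges: Devanagari starts at U+0900, Gujarati at U+0A80.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F
OFFSET = 0x0A80 - 0x0900  # fixed distance between the two blocks


def devanagari_to_gujarati(text):
    """Shift every Devanagari character to the same position in the Gujarati block."""
    converted = []
    for ch in text:
        code = ord(ch)
        if DEVANAGARI_START <= code <= DEVANAGARI_END:
            converted.append(chr(code + OFFSET))
        else:
            converted.append(ch)  # leave spaces, digits, Latin letters unchanged
    return "".join(converted)


print(devanagari_to_gujarati("नमस्ते"))
```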
(e) Translation :
Translation is one of the most important applications in the field of Natural Language
Processing (NLP), because in the modern era there is a daily need for large amounts of
translation between the major languages of the world. Translation between languages of the
same family is relatively simple, and the language of science or law is easier to translate
than colloquial or literary language. As our knowledge of language improves, we will be able
to give better instructions to the machine. Translation done through a computer is called
‘Computer Translation’. It is also called ‘Machine Translation (MT)’. You will study this in
detail in Block 4 of MTT-020 titled ‘The Process of Translation’. Here we would like to point
out that the research and development in the field of MT so far has shown that 100% accurate
translation is not possible through it. That is why human-machine collaboration is essential.
It is because of this collaboration that the concepts of ‘Human-Assisted Machine Translation
(HAMT)’ and ‘Machine-Assisted Human Translation (MAHT)’ have taken shape. However, the
translation done in both these ways is also not 100% correct. That is why today
‘Computer-Assisted Translation (CAT)’ is talked about. Overall, it can be said that the machine
can never replace the human. We should see the machine only as our reliable ally.
5.3 Functions of the Dictionary
Why do we use dictionaries? Words are the signs and carriers of
meaning in language. If we do not know the meaning of a word, we will not be able to know the
meaning of the sentence in which it is used. Alphabetically arranged dictionaries help us to
reach that word and understand its meaning.
How do we identify the meaning from the dictionary? The relationship between word and
meaning is very complex in many cases. In fact, a word has a wide range of meaning,
and different shades of meaning are expressed in different situations. We can see
this difference in meaning in English expressions like ‘to work’, ‘useless’, ‘getting a new
job’ (project, job) etc. In fact, explaining the meanings of each word through appropriate
example sentences is an effective strategy for making a dictionary. Such dictionaries can be
given the name ‘Dictionary of Usage’. If we do not give the ‘usage’, then the user or learner
of the language will not be able to use the word properly.
The developed form of the ‘Dictionary of Usage’ is the ‘Learner’s Dictionary’. In the modern
era, the learner’s dictionary has become a benchmark for the creation of scientific, useful
and purposeful/functional dictionaries, because it
tries to clarify the meaning of words by moving away from the traditional method of
defining them. That is to say, the learner’s dictionary does not merely list
meanings: it explains the meanings of similar words, gives appropriate expressions
according to the context, and presents meanings and word formation in a logical manner, which
helps the user to understand the language. Users can attend to the meaning at
the level of cognition. Due to these features, the ‘Oxford Advanced Learner’s Dictionary’ of
English has become one of the most popular and authoritative dictionaries. It has
become an essential reference book for teachers and writers of the English language.
On the lines of the Oxford Advanced Learner’s Dictionary, the Hindi ‘Chhatrakosh’ prepared by
Prof. V.R. Jagannathan is extremely useful for users of the Hindi language. This dictionary is
the first Hindi learner’s dictionary, and is useful for Hindi teachers, translators and writers.
It can be called an essential reference book for information about the
nuances of the language. Presenting this type of dictionary as computer software increases its
usefulness. This means that it is a treasure trove of features that can be used by
people at all levels according to their needs.
Dictionary and thesaurus complement each other. A dictionary gives the meaning of words;
a thesaurus suggests words according to a meaning, idea or feeling. In a thesaurus,
all the ideas of the language are arranged in about 1000 classes or categories. If you look at
the category of ‘anger’ in the broad category of emotions, you will find all the words related
to it. The user can choose the appropriate word from them. The thesaurus has one shortcoming:
it presents all the words related to an idea in one place, but cannot give their meanings there.
Before selecting a word, if a person wants to see the meaning and usage context of that word,
then he will have to go back to the learner’s dictionary. In this context, we can say that for
the users or learners of a language, a coordinated form of the learner’s dictionary and
thesaurus is most useful. This coordinated form is extremely rare in
printed books. You constantly have to keep turning pages to move from one
book to another and from one entry to another. Thus, the student can get the perfect
combination of dictionary and thesaurus only on a computer, because there he can not only
access the desired material faster, but also keep in front of him all the provisionally
selected entries before making the final decision.
5.3.1 Digital dictionary
In the previous section of this unit, we discussed that coordination between dictionary and
thesaurus is the most important requirement of a learner’s dictionary. An integrated form of
dictionary and thesaurus can serve this purpose. If this form of the dictionary is prepared
digitally, then it is easy for the student to consult the dictionary.
There can be two forms of digital dictionary on the computer. In the first, we
keep the pages of the entire dictionary as separate images on the computer. We call this the
‘PDF’ format. Suppose the dictionary compiled by Hindi Sahitya Sammelan or Kashi Nagari
Pracharini Sabha is in 15 volumes (4500 pages). If anyone wants to look up 4-5 words in the
printed form, many pages of the 15 volumes will have to be turned. The only advantage of
this form of digital dictionary is that you can open any page in seconds. But to keep 4500
pages as images we need a large amount of memory, and accessing the pages can also take time.
Another way to keep the entire dictionary on the computer is to store the entries as typed text.
This takes less memory, and we can access words directly rather than
pages. The third advantage is that related words can be reached by moving from one entry to
another (surfing). If, while making the entries of this dictionary, we enter information such
as the source of the word in separate fields, then it will be easy to find the related
information. We will discuss this in the next section.
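To see what it means to keep the entries as typed text in separate fields, consider the following sketch in Python. The class Entry, its field names and the two sample entries are hypothetical and serve only as an illustration; a real dictionary would have many more fields.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    headword: str
    meaning: str
    source: str = ""                              # origin of the word
    related: list = field(default_factory=list)   # cross-references to other headwords


# Hypothetical sample entries, for illustration only.
DICTIONARY = {
    "fire": Entry("fire", "burning that gives out heat and light", "Old English", ["flame", "blaze"]),
    "flame": Entry("flame", "a glowing body of burning gas", "Latin", ["fire"]),
}


def related_entries(word, dictionary=DICTIONARY):
    """Follow the cross-references of an entry ('surfing' from word to word)."""
    entry = dictionary.get(word)
    if entry is None:
        return []
    return [dictionary[w] for w in entry.related if w in dictionary]


def words_from_source(source, dictionary=DICTIONARY):
    """Use the 'source' field to list all headwords of a given origin."""
    return sorted(e.headword for e in dictionary.values() if e.source == source)


print([e.headword for e in related_entries("fire")])
print(words_from_source("Latin"))
```

Because each piece of information sits in its own field, the same set of entries can answer different questions without turning a single page.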
The biggest advantage of the digital method is that, along with the text, we can make
pronunciation, pictures, movies, animation etc. a part of the dictionary, which is not possible
in a printed dictionary.
5.4 What is a Computer Dictionary?
There is a popular saying in English about ‘democracy’: it is a government ‘of the people, for
the people and by the people’. Almost the same thing can be said about the computer
dictionary. This dictionary belongs to the computer, is for the computer and is created by the
computer itself. In the light of this explanation, we can learn about the functions and uses
of the dictionary kept in the computer.
5.4.1 Computer Dictionaries and Printed Dictionaries : Salient Features

A computer dictionary is a dictionary of the computer. When we say that this ‘dictionary belongs
to the computer’, it means that even though a normal printed dictionary may contain very
useful material, its size and form make it less useful than a computer dictionary. The
computer dictionary has certain features which are not found in the printed dictionary. You
can also call these features the ‘differences’ between a computer dictionary and
a normal printed dictionary. These points of difference can be discussed under the following
heads:
1. Possibility of Continuous Updating : Continuous development is taking place in the
field of knowledge. As a result, a large number of new words are being coined and
new meanings are being given to existing words. It therefore becomes necessary
to include them in the dictionary. But printed dictionaries, especially
voluminous ones, have the limitation that they cannot be
modified and expanded to take in new words and new meanings until
they are reprinted. Some printed dictionaries are never revised due to
practical difficulties. Because language changes continuously, continuous
updating of the dictionary is essential. This can happen only with a digital
dictionary. If we wish, we can make necessary amendments daily. The process is
simple, time-saving and low in cost. Thus, adding new
words and modifying and editing existing entries in a computer dictionary is simpler and
quicker than in printed dictionaries.
2. Possibility of Quick Availability/Access : Searching for a word in printed
dictionaries, especially large dictionaries printed in several volumes, is laborious and
time-consuming. In a computer dictionary, we can not only access the word quickly,
but also quickly look up all the words linked to it by cross-references.
3. Ability to Search for Information : The biggest feature of the digital method is
search. Suppose we want to find out how many times the word ‘lotus’ appears in a
200-page novel; this is practically impossible in a printed book. The computer will tell us
in a second where that word has appeared. Suppose we want to know where ‘lotus’
appears as the subject of a sentence. This search cannot be done on the typed text alone;
for this we have to write a parsing programme. Suppose we want to know how many
synonyms there are for ‘fire’. We can search for them if the
computer dictionary programme is designed for it. While making the entries, we have to
provide fields for synonyms, antonyms etc. with each entry. Then all the words
related to the headword can be found. We will discuss this in detail in the
next section.
4. Limitlessness of Size : There is a limit to the size of printed dictionaries. They can be
monolingual, bilingual, trilingual or quadrilingual, but it is not possible to go on adding
languages indefinitely. The specialty of the computer dictionary is that a large number of
languages can be included in it and their equivalents can be kept.
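The two searches described in point 3 can be sketched in a few lines of Python. This is only a minimal illustration: the sample text, the entries and the function name `related_words` are assumptions made up for the example, not part of any real dictionary programme.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the novel.
text = "The lotus opened at dawn. She picked a lotus, and the Lotus pond was still."

# Count every word, ignoring case and punctuation.
words = re.findall(r"[a-z']+", text.lower())
counts = Counter(words)
lotus_count = counts["lotus"]

# Hypothetical entries: each one reserves space for synonyms and antonyms.
entries = {
    "fire": {"synonyms": ["flame", "blaze"], "antonyms": ["water"]},
    "lotus": {"synonyms": ["water lily"], "antonyms": []},
}

def related_words(headword, relation):
    """Return the words stored under one relation of an entry (empty if absent)."""
    return entries.get(headword, {}).get(relation, [])

print(lotus_count)                      # 3
print(related_words("fire", "synonyms"))  # ['flame', 'blaze']
```

Finding where ‘lotus’ occurs as the subject would need more than this word count, as the text notes; it requires a parser.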
5.4.2 Role of Computers in Dictionary-Making

While discussing the ‘Role of Computer in Dictionary-Making Process’ in Section 4.6 of
Unit 4 of this course, it was stated that the computer is also used as a tool for dictionary
making. It was also stated that computers are used for word storage, editing and revision,
indexing and concordance. In this unit we will discuss how a computer dictionary can be made
effectively with the help of a computer.
Before discussing this, it is necessary to know how a dictionary is made in the
traditional way. The determination of entries and the collection of the particulars given under
each entry are the two major tasks in the making of a traditional dictionary. Entries must contain
information such as meaning, usage, source of the word, synonyms, antonyms, idioms,
cross-references (references to other related words) etc. Lexicographers usually collect all this
information on cards and give the dictionary its book form by putting the cards in
alphabetical order. This work is very labour-intensive and mistakes are possible at
many places. The greatest difficulty is to find out the various meanings of a word. We
cannot do this work from memory. Reading books and extracting all the meanings from them,
recognizing new meanings and entering them in the proper place is a very difficult task.
The computer is our biggest help in performing this work systematically. For this we need a
collection of natural language texts, which may be of written language or of spoken language.
Let’s say that our dictionary will eventually consist of 5 lakh (500,000) words. To know the
different meanings of all these words, we would need a huge collection. Good computer
dictionaries analyze about 20 million words (about 50,000 pages) of material from a variety of
sources, such as literature, journals, science topics, everyday usage, etc. A ‘Concordance’
programme brings before us all the sentences in which a word occurs, showing the various
contexts of its use. We can analyze all these usages together and finalize them as an entry.
Similarly, from the point of view of meaning, the details of many other features can also be
known immediately. We can also analyze the characteristics of related usages in the
concordance programme itself.
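What a concordance programme does can be shown with a very small sketch. The four-sentence corpus and the function name `concordance` are assumptions for illustration; a real concordance would run over millions of words and usually show each hit with its surrounding context.

```python
import re

def concordance(sentences, word):
    """Return every sentence that contains `word` as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Hypothetical mini-corpus; a real one would hold millions of words.
corpus = [
    "The fire spread through the forest.",
    "They decided to fire the manager.",
    "A firefly glowed in the dark.",
    "Fire! shouted the guard.",
]

for line in concordance(corpus, "fire"):
    print(line)
```

The three sentences returned show ‘fire’ as a noun, as a verb and as an exclamation, which is the kind of evidence a lexicographer uses to separate the meanings of an entry. The sentence with ‘firefly’ is not returned, because the search matches whole words only.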
Another purpose of the concordance is to determine the ‘lemma’. Let’s take the example of
any verb that has many forms in Hindi. Lexicography considers the root form of the verb as
basic and all the rest as derived from it. It takes one of these forms as the word of the
main entry (i.e. the lemma) and explains the meaning of all the other forms within it. In other
words, the dictionary identifies a base form of related words and gives the meaning of all the
words under it. We can call this process ‘lemmatization’. ‘Lemmatization’ is an essential part
of the process of making any computer dictionary.
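Lemmatization can be sketched as grouping word forms under a base form. The example below uses English forms for readability, and the table `LEMMA_OF` is a hand-made assumption; in practice such a table would be compiled from the concordance or from rules of the language.

```python
from collections import defaultdict

# Hypothetical form-to-lemma table; in practice it would be compiled from a concordance.
LEMMA_OF = {
    "go": "go", "goes": "go", "going": "go", "went": "go", "gone": "go",
    "boy": "boy", "boys": "boy",
}

def lemmatize(tokens):
    """Group the word forms found in `tokens` under their lemma."""
    groups = defaultdict(set)
    for token in tokens:
        form = token.lower()
        lemma = LEMMA_OF.get(form, form)  # unknown forms act as their own lemma
        groups[lemma].add(form)
    return {lemma: sorted(forms) for lemma, forms in groups.items()}

print(lemmatize(["Went", "goes", "boys", "gone", "home"]))
# {'go': ['goes', 'gone', 'went'], 'boy': ['boys'], 'home': ['home']}
```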
What is the need for the lemma? Many traditional dictionaries do not analyze the meanings of
related words together. They treat each visible form of a word as a separate entry and give
its meaning there. Grouping the forms under a lemma is therefore part of the ‘design’ of the
dictionary: a good design helps us to create the dictionary correctly and also lets us view
the information in the finished dictionary in the desired way.
The deeper and better thought out the ‘design’ is, the more useful the dictionary will be.
Because of its design, a computer dictionary can record, along with the meaning of a word, the
context in which it is used, such as literary use, abuse, curse, blessing etc. If we then type
the name of a context as the search condition, we can see all the words of that context. For
this, it is necessary to set aside a separate place in the design layout for mentioning such
usage areas.
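A minimal sketch of such a design is given below. The field name `usage_area`, the sample entries and the function name are all assumptions chosen for the illustration.

```python
# Hypothetical entries; the "usage_area" field is the separate place the design sets aside.
ENTRIES = [
    {"word": "thee", "meaning": "you", "usage_area": "literary"},
    {"word": "godspeed", "meaning": "a wish for a safe journey", "usage_area": "blessing"},
    {"word": "begone", "meaning": "go away", "usage_area": "literary"},
]

def words_in_context(usage_area):
    """Return the headwords whose entries are marked with the given usage area."""
    return [e["word"] for e in ENTRIES if e["usage_area"] == usage_area]

print(words_in_context("literary"))  # ['thee', 'begone']
```

If the field had not been planned in the layout, this search would not be possible, which is why the design decides the functionality of the dictionary.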
The concept of the lemma reflects the scientific method of dictionary making. The basis of this
concept is the semantic analysis of language through words. It is believed that a computer
lexicon should be based on semantic analysis; only then can we take full advantage of it in
programmes like automatic machine translation. We cannot expect smooth translation
from bilingual dictionaries based only on semantic equivalence.
When we identify all the forms of the word through concordance and determine their lemma,
then we also concord/index all the words that are part of the lemma and analyze their
meaning. The machine (computer) places before us all the forms of the lemma and we can
see their applications in a systematic way and include their information in the entry.
Now we can consider another aspect of dictionary making by computer. Some computer
software manufacturers have created programmes for dictionary making. When you enter
your information in it, then the computer dictionary takes shape. But if you want to create a
dictionary according to your wish, then you should prepare your own programme. We shall
call this aspect the ‘design of the computer dictionary’. This design controls both the size and
‘functionality’ of the dictionary.
5.4.3 Dictionaries for Computers

The title may seem a little surprising, because we tend to view the computer as a tool, not as
a consumer. Before this discussion, we need to know how the computer performs language-related
work. Let’s say we have a good programme for spell checking. It will catch the misspelling of
words, recognizing wrong forms by comparing them with the words stored in it. But it will
accept ‘that boys goes’ as correct, because all three words are spelled correctly and the
error is only grammatical. The computer does not know language as humans do; it only follows
rules.
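The limitation can be seen in a minimal sketch of a spell checker. The word list `KNOWN_WORDS` is a tiny assumed stand-in for the stored dictionary, and the function name is made up for the example.

```python
# Hypothetical word list standing in for the spell checker's stored dictionary.
KNOWN_WORDS = {"that", "those", "boy", "boys", "go", "goes"}

def misspelled(sentence):
    """Return the words of `sentence` that are not in the stored word list."""
    return [w for w in sentence.lower().split() if w not in KNOWN_WORDS]

print(misspelled("that boyz goes"))  # ['boyz']
print(misspelled("that boys goes"))  # []  (the grammatical error goes unnoticed)
```

The programme only compares each word with its list; nothing in it knows that ‘that’ and ‘boys’ disagree in number.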
In the past few decades, artificial intelligence has begun to be incorporated into computer
programmes. The computer can now play chess. For this it has to take stock of the situation,
calculate the consequences of the three or four moves possible for it and choose the most
suitable move. In this process, the computer displays the same type of intelligence as
humans do.
The computer translates with the help of given grammatical rules and bilingual dictionaries.
So if we say ‘The man is pregnant’, the computer will produce its translated form in another
language. But if we add to our dictionary the information that the word ‘pregnant’ refers
only to women or to female animals, then the computer will not translate the sentence, but
will report that the sentence is incorrect. In this way, the dictionary kept in the computer
will not be just a collection of information; the meaning and usage characteristics of words
will be recorded in such a way that the computer can use them properly in the execution of its
other programmes. This is what we call the system of keeping lexical resources in the computer
in a machine-readable form.
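One simple way to keep such information in a machine-readable form is to attach semantic features to words. The sketch below is only an assumption about how this could be stored; the feature names and the function `is_acceptable` are hypothetical.

```python
# Hypothetical machine-readable entries: each noun carries semantic features,
# and each adjective lists the features its subject must have.
NOUNS = {
    "man": {"human", "male"},
    "woman": {"human", "female"},
    "cow": {"animal", "female"},
}
ADJECTIVES = {
    "pregnant": {"female"},
    "tall": set(),  # no restriction
}

def is_acceptable(subject, adjective):
    """True if the subject has every feature the adjective requires."""
    return ADJECTIVES[adjective] <= NOUNS[subject]

print(is_acceptable("man", "pregnant"))    # False
print(is_acceptable("woman", "pregnant"))  # True
```

A translation programme could run such a check first and report the sentence as incorrect instead of translating it.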
Why is there a need to give the knowledge of the dictionary to the computer? In this context,
we can discuss some programmes in which the computer can use this knowledge. We have talked
about spell checking programmes. There is another similar programme known as ‘grammar
check’. We can keep grammatical information related to words in the computer, but we
cannot mark nouns as subject or object in the dictionary, because almost any noun can occur
as a subject or as an object. To identify the case relations of words, we need a
programme called a ‘Parser’. The Parser can do three things: it can recognize the base form
of a word from its inflected forms; it can recognize the categories of words
(like noun, pronoun etc.); and it can identify the function of words in a sentence, indicate
their relation to each other and show how the various phrases are formed. Only by using this
information about words can the computer complete tasks such as grammatical or
stylistic checking, translation etc.
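The three tasks of a parser can be illustrated with a toy sketch that handles only sentences of the pattern determiner + noun + verb. The lexicon, its number labels (the verb ‘go’ is simply marked as the form used with a plural subject) and the function `parse` are simplifying assumptions; a real parser is far more elaborate.

```python
# Hypothetical lexicon: surface form -> (base form, category, number).
LEXICON = {
    "that": ("that", "determiner", "singular"),
    "those": ("that", "determiner", "plural"),
    "boy": ("boy", "noun", "singular"),
    "boys": ("boy", "noun", "plural"),
    "goes": ("go", "verb", "singular"),
    "go": ("go", "verb", "plural"),
}

def parse(sentence):
    """Analyze a determiner + noun + verb sentence and report agreement errors."""
    analysis = [LEXICON[w] for w in sentence.lower().split()]
    categories = [cat for _, cat, _ in analysis]
    if categories != ["determiner", "noun", "verb"]:
        raise ValueError("this sketch only handles determiner + noun + verb")
    det, noun, verb = analysis
    errors = []
    if det[2] != noun[2]:
        errors.append("determiner and noun disagree in number")
    if noun[2] != verb[2]:
        errors.append("subject and verb disagree in number")
    return {
        "base_forms": [base for base, _, _ in analysis],  # task 1: base forms
        "categories": categories,                         # task 2: word categories
        "subject": noun[0],                               # task 3: function in sentence
        "errors": errors,
    }

print(parse("that boys goes")["errors"])
print(parse("those boys go")["errors"])  # []
```

Unlike the spell checker, this programme finds two errors in ‘that boys goes’, because it uses grammatical information stored with each word.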
Grammatical information is needed not only for computers; advanced dictionaries of languages
also need to provide it. We have already mentioned that the main purpose of the dictionary is
to list all the possible usages of a word, so that we can understand its meaning according
to the context. In this way, the grammatical information related to the word is also important
in determining the meaning. The learner or user of the language wants to know where the use
of which word is appropriate. We know that some adjectives come before nouns, but cannot
come as predicate adjectives in verb phrases. For example:
The immense wealth of the state -- The wealth of the state is immense.
His pure conduct -- His conduct was pure.
Smooth motion -- Motion is smooth.
Nowadays, good dictionaries also give grammatical information, e.g.
Leisured people Adjclassif: ATTRIB

(Source: Collins Cobuild English Language Dictionary (1987), Collins, UK)
Here ‘ATTRIB’ means that the word is not used as a predicative adjective. When printed
dictionaries can provide grammatical information, it is essential for computers to do the
same.
Another programme in which the computer makes use of its knowledge of semantics is called
a ‘Query’ programme. In a query programme, the computer has to recognize the question
asked and then give the correct answer. For example, ‘What time is the Super Fast train
arriving?’ Answers such as ‘at half past ten’ or ‘at ten twenty’ are relatively simple; they
can be spoken on the basis of numbers such as 10.30 or 10.20. A truly successful query
programme is one in which the computer can respond to the situation by logical
inference. There is a famous example in this context :
The text is:
All three went to the restaurant. They ordered food. They sat for a long time. Then they left,
irritated.
Question to computer - Did all three eat the food?
Computer answer - No.
The computer inferred the answer on the basis of semantics.
Had they eaten, the text would have contained expressions such as - payment of the bill, being
satisfied, completing the meal.
Since they did not eat, it contains expressions such as - getting irritated, angry,
dissatisfied with the service, being sad about the delay.
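The inference in the restaurant example can be imitated, very roughly, by looking for cue expressions. The cue lists and function names below are assumptions for illustration; real query systems rely on much richer knowledge than a list of phrases.

```python
import re

# Hypothetical cue lists: expressions expected if the meal was eaten or not eaten.
EATEN_CUES = ["paid the bill", "satisfied", "finished the meal"]
NOT_EATEN_CUES = ["irritated", "angry", "dissatisfied", "delay"]

def count_cues(text, cues):
    """Count how many cue expressions occur in the text as whole words."""
    return sum(
        bool(re.search(r"\b" + re.escape(cue) + r"\b", text, re.IGNORECASE))
        for cue in cues
    )

def did_they_eat(text):
    """Answer 'Yes', 'No' or 'Unknown' by comparing the two cue counts."""
    eaten = count_cues(text, EATEN_CUES)
    not_eaten = count_cues(text, NOT_EATEN_CUES)
    if eaten > not_eaten:
        return "Yes"
    if not_eaten > eaten:
        return "No"
    return "Unknown"

story = ("All three went to the restaurant. They ordered food. "
         "They sat for a long time. Then they left, irritated.")
print(did_they_eat(story))  # No
```

The answer ‘No’ is never stated in the text; it is derived from stored knowledge about what usually accompanies eating or not eating in a restaurant.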
That is, we are giving the computer the same type of knowledge that a normal person has. The
computer then reasons in the same way as we humans do. Such a knowledge-rich computer can
perform linguistic tasks naturally and easily. We call this process of giving knowledge to the
computer, that is, of making linguistic information available for the use of the computer,
‘Knowledge Representation’.
Before learning about ‘Knowledge Representation’ in the context of the computer dictionary, it
would be worthwhile to consider online dictionaries.
5.5 Online Dictionaries
Along with computer dictionaries, online dictionaries are also often used. It has been clarified
in Section 5.4 that the computer dictionary is of the computer, for the computer and
created by the computer itself. Now let us see what an online dictionary is and how it
differs from a computer dictionary.
5.5.1 What is an Online Dictionary?

Online dictionaries are often discussed along with computer dictionaries. As stated in
Section 5.4 of this unit, the computer dictionary belongs to the computer, is meant for the
computer and is created by the computer itself. An online dictionary is basically also a
computer dictionary, but its special feature is that we use it through the internet. That is,
to use an online dictionary, it is necessary to have an internet connection. The user has to
go to the dictionary’s website and therefore needs to know how to use a website, although
general computer knowledge is enough for this.
5.5.2 Online Dictionaries and Computer Dictionaries: Salient Features

To conclude the discussion so far, we will see what kind of dictionaries are suitable for
computers or for use on computers. In this context, let us first see what the difference is
between online dictionaries and computer dictionaries.
We have discussed in Section 5.4 of this unit that digital dictionaries have four
characteristics: quick access to words, the facility to be updated from time to time, the
facility to find required information, and limitlessness of size. These features distinguish
digital dictionaries from traditional printed dictionaries, but they are common to both
computer dictionaries and online dictionaries. In the light of these characteristics, it can
be said overall that computer dictionaries and online dictionaries have the quality of
reciprocity, while printed dictionaries lack it.
Here we also have to keep in mind that the computer dictionary and the dictionary available on
the internet are similar in terms of properties such as instant access to words, the facility
to update from time to time, the facility to find required information and no limit of size.
Functionally, however, there is a slight difference between them. The computer dictionary is
located on the computer itself (it is also called an offline dictionary), whereas to use a
dictionary available on the internet, the computer must have an internet connection. Without
an internet connection, it is not possible to use the dictionaries available on the internet.
Some online dictionaries are available free of cost, while others can be used only after
paying a certain amount.
The question that remains is what kinds of digital dictionaries are available. Hundreds
of dictionaries are available in English, and each has its own characteristics.
The important thing is that a digital dictionary created on the computer can be used in other
programmes as well. Content in the form of such a dictionary is called a ‘Computer lexicon’.
Let us now move on to the process of making linguistic information available for use by the
computer, i.e. ‘Knowledge Representation’. But before that, it would be appropriate to attempt
the following self-examination questions.
5.6 Knowledge Representation
In Section 5.4.3 of this unit, ‘Dictionaries for Computers’, we explained that a computer
with knowledge can perform linguistic tasks naturally and easily. Giving knowledge to the
computer actually means making linguistic information available for the computer to use.
‘Knowledge Representation’ is the process of making linguistic information available for
computer use.
‘Knowledge Representation’ is a vast topic in itself. From the point of view of language, it
is a major part of the field of Natural Language Processing (NLP). From the point of view of
computing, fields such as knowledge management and knowledge engineering have developed
around it. In the context of knowledge representation, we will discuss here only those topics
which are related to words and meaning.
As discussed above, the purpose of Knowledge Representation is to put linguistic objects
in such a form that the computer, i.e. the programmes run by the computer, also gets the
necessary knowledge about the language. In this way, in the field of language computing, we
move away from the ‘data base’ (data accumulation) towards ‘knowledge accumulation’.
‘Knowledge accumulation’ is actually a replica of the knowledge accumulated in the
human brain. We know that birds fly and humans cannot. If we associate the verb ‘fly’ only
with birds in the computer, then the computer may wrongly reject ‘he flew to the office’ by
inference. The complexity of language is that we can also say ‘he is flying’ in an
idiomatic sense, so the computer must also be able to see that a word like ‘flying’ can be
applied to humans figuratively. The question is how we keep this information in the computer.
In other words, what is the methodology for knowledge representation?
We will discuss only four types of knowledge representation here. Two of these are related to
the ‘function of words in sentences’ and two are related to ‘semantics’.
5.6.1 Knowledge Representation : Syntactic
The first is logic. Logic is the ability to make inferences and draw conclusions from
statements. For example, take two statements. One statement is ‘Animals have four
legs.’ The other is ‘X is an animal.’ From these two statements it follows that every animal
that can take the place of ‘X’ (such as elephant, lion, deer etc.) has the characteristic of
having four legs. Man is not an ‘animal’ in this sense; he is a ‘creature’. ‘Predicate logic’
is the dominant form of sentence analysis in the modern era, and Boolean algebra is a major
tool for linguistic operations. An introduction to both can be found on the internet.
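The inference from the two statements can be written as a small programme. This is only a sketch: the dictionaries `IS_A` and `CLASS_PROPERTIES` and the function `infer` are assumed names, and the rule is the single one used in the example above.

```python
# Facts and one rule, written as plain data (all names are illustrative).
IS_A = {"elephant": "animal", "lion": "animal", "deer": "animal"}
CLASS_PROPERTIES = {"animal": {"has four legs"}}

def infer(individual):
    """Apply the rule: if X is an animal and animals have P, then X has P."""
    category = IS_A.get(individual)
    return CLASS_PROPERTIES.get(category, set())

print(infer("elephant"))  # {'has four legs'}
print(infer("table"))     # set()
```

Nothing is concluded about ‘table’, because no statement says what a table is.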
The second major type of Syntactic knowledge representation is the ‘Parser’, which interprets
the properties of forms, words, and phrases by analyzing their functions.
5.6.2 Knowledge Representation : Semantics

We will discuss here the technique of keeping words in the form of knowledge accumulation
in the computer in terms of their composition and meaning.
The first type in this is ‘Semantic network’. To understand this, we are presenting a diagram,
which is as follows:
[Figure: semantic network diagram. Nodes: Animal, Monkey, Cat, Pet, tail, ‘Lives on trees’,
‘Lives in home’, ‘Catches mice’. Links are labelled ISA, HASA and DOES; one link is marked ‘?’.]
We can discuss three types of relations of each object with other objects in the language.
These relations are: what it is, what qualities it has, and what it does. For example,
• Salmon is a fish. (ISA)
• Cat has a tail. (HASA)
• Kite /Eagle flies. (DOESA)
These three relations of a word reveal its meaning like a woven net. The relation ‘is’ (ISA)
indicates its place in the order of things, the relation ‘has’ (HASA) indicates its
characteristics, and the relation DOESA refers to all the functions of that object. All animals
share the characteristic of having four legs. That is why, if we keep extending the network as
mentioned above, we can see the meaning of words on a wider scale.
There are some drawbacks of the Semantic Network. For example, we cannot show an absent
(negative) relation like ‘Monkey is not a pet.’ Predicate logic serves as a means of filling
this gap.
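A semantic network can be stored as a list of triples. The links below are an assumed reading of the cat and monkey example, using the labels ISA, HASA and DOES; the function `related` is a hypothetical helper that also follows ISA links so that characteristics of a class pass down to its members.

```python
# Triples (subject, relation, object); the edges are an assumed reading of the
# cat/monkey example, written with the ISA, HASA and DOES labels.
TRIPLES = [
    ("monkey", "ISA", "animal"),
    ("cat", "ISA", "animal"),
    ("cat", "ISA", "pet"),
    ("animal", "HASA", "four legs"),
    ("monkey", "HASA", "tail"),
    ("cat", "HASA", "tail"),
    ("monkey", "DOES", "lives on trees"),
    ("cat", "DOES", "catches mice"),
]

def related(subject, relation):
    """Objects linked to `subject` by `relation`, including those inherited via ISA."""
    found = [o for s, r, o in TRIPLES if s == subject and r == relation]
    if relation != "ISA":
        for parent in related(subject, "ISA"):
            found += related(parent, relation)
    return found

print(related("cat", "HASA"))     # ['tail', 'four legs']
print(related("monkey", "ISA"))   # ['animal']
```

The drawback mentioned above is visible here: the network has no link between ‘monkey’ and ‘pet’, but a missing link is not the same as the statement ‘Monkey is not a pet.’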
Another type of word and meaning representation is ‘WordNet’. It is also a network, which
organizes the words of the language into ‘SYNSETs’ (sets of synonyms) and presents the
relations between them in a hierarchical manner. Let us try to understand this by taking the
example of a synset :
‘good’ - good opportunity to plant tomatoes. (good time)
‘appropriate’ - appropriate time for social changes (time is ripe)
‘proper, good’ - proper to start work (the right time)
You can see that we can consider these three words as a synonym set from the point of
view of usage context. It is also noteworthy that the same set of English synonyms appears in
Hindi as well, since such semantic groupings are often shared across languages.
‘WordNet’ presents synonym sets for four types of words: noun, verb, adjective and adverb.
A synonym set consists of close synonyms; at the same time, co-usages in context are also
included, such as the use of the Hindi word ‘Sagar’ to indicate abundance or vastness of
quantity - ‘Vidyasagar’, the word ‘Sagar’ etc. If a word has different meanings, we keep
them in different synonym sets.
We can see the relation of different synonym sets to other sets in three major arrangements in
‘WordNet’. These are :
a) ISA Relation : ‘Lion is an animal.’ - In this sentence ‘lion’ is a ‘member noun’
(hyponym) and ‘animal’ is a ‘class noun’ (hypernym). All members of the class (lion, deer,
cow etc.) are ‘co-ordinate terms’ under the class name. In this context, we can mention
that class members have some common characteristics, which we can call their HASA
relation. An example of this is given below.
b) HASA Relation : ‘The cow has two horns.’ - In this sentence ‘cow’ is a holonym, and
‘horn’ is a meronym. The meronym ‘horn’ applies to member names such as cow, deer, goat etc.
Class: animal
Member (holonym)    Organ (meronym): leg    Organ (meronym): horn
cow                 yes                     yes
deer                yes                     yes
goat                yes                     yes
rabbit              yes                     --
c) Relation of Synonymy and Antonymy : It is the combined form of both mentioned
above. We can denote a synonym by ‘is the same as’. An antonym indicates the opposite
meaning. For example,
• the sense of the Hindi word ‘Ishwar’ is synonymous with ‘God’.
• ‘Theist’ is an antonym of ‘Atheist’. (not quite the same)
• The well is deep. (The well has depth.)
• The pond is shallow. (The pond doesn’t have depth.)
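The arrangements described above can be sketched as a miniature WordNet-style store. The synset identifiers, glosses and the tables `HYPERNYM` and `MERONYMS` are assumptions invented for this illustration and are not taken from the real WordNet data.

```python
# Hypothetical miniature of a WordNet-style store (not the real WordNet data).
SYNSETS = {
    "good.time": {"words": ["good", "appropriate", "proper"], "gloss": "suitable (of time)"},
    "cow.animal": {"words": ["cow"], "gloss": "a domestic bovine animal"},
    "animal.class": {"words": ["animal"], "gloss": "a living creature"},
    "horn.part": {"words": ["horn"], "gloss": "a bony outgrowth on the head"},
}
HYPERNYM = {"cow.animal": "animal.class"}   # ISA: member -> class
MERONYMS = {"cow.animal": ["horn.part"]}    # HASA: whole -> parts

def synonyms(word):
    """All words that share a synset with `word`, excluding the word itself."""
    result = []
    for synset in SYNSETS.values():
        if word in synset["words"]:
            result += [w for w in synset["words"] if w != word]
    return result

print(synonyms("good"))                             # ['appropriate', 'proper']
print(SYNSETS[HYPERNYM["cow.animal"]]["words"])     # ['animal']
print(SYNSETS[MERONYMS["cow.animal"][0]]["words"])  # ['horn']
```

Because relations are stored between synsets rather than between single words, a word with several meanings can appear in several synsets without confusion.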
Having considered the context of knowledge representation, let us now consider its use by
computers.
5.6.3 Use of Knowledge Representation by Computer

For the use of computers, we can call such linguistic information an ‘Ontology’, because in it
the various aspects related to the meaning of words are presented in a logical manner and the
computer can recognize these arrangements and use them. In an Ontology, we get information
about classes (generic nouns) and their expressed forms. Then we see the relation between the
class and its members (e.g. animal; cow). All properties of a class are properties of its
members: the general characteristics of a class are inherited by its members, such as the
characteristic ‘four-legged’ applied to all animals. Then the interrelationships of
all the words are specified in the manner described above. In this context, we can say that
the lexical information in an Ontology is more systematic and useful than that in a general
thesaurus and leaves less scope for ambiguity.
An international organization, the W3C (World Wide Web Consortium), develops standards for the
Web, including languages for representing this kind of knowledge. This organization has listed
four main features of OWL (Web Ontology Language). Let us look at all four of them :
1. We can organize all the words for the concepts of a language into classes and sub-classes.
For example, the words ‘animal - bird - pigeon’ form a series of class
names and member names. This is the primary basis for the accumulation of
knowledge of the language.
2. Then we describe the characteristics of classes and class members. Classes have
shared characteristics. Each member of a class also has a unique characteristic of its own,
which distinguishes it from all other members. All the characteristics of a word together
make up the meaning of that word.
3. Class and class member are generic words, abstract concepts. Objects existing in real
life are manifestations (instances) of those classes or sub-classes. An instance may also have
its own specific characteristics. For example, ‘A dog has four legs, but Ramesh’s dog Moti has
only three legs.’ The Hindi sentence ‘Ramesh Ghar Gayi’, however, is wrong, because the verb
form is feminine while an attribute of the class ‘man’ is ‘masculine’. The expressed
form ‘Ram’ or ‘John’ is always masculine, because gender indicates an ISA relation, not
a HASA relation. Hence ‘This cow is not an animal.’ is also always wrong.
4. The last feature is ‘Operations’. Let us take two sets in the context of
human beings - ‘Indian’ (i.e. all the people of India) and ‘African’ - and another two, ‘man’
and ‘woman’. From these we can create subclasses like ‘African men’, ‘Indian
women’ etc. Each set brings its own characteristics, from which the
characteristics of the subclass can be known.
With the help of an Ontology, the computer will be able to use these words unambiguously in
programmes for translation, inquiry etc.
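The four features can be imitated with plain Python classes and sets. This sketch only mirrors the ideas; it is not OWL syntax, and all class names, attribute names and personal names in it are assumptions made for the illustration.

```python
# A sketch of the four features using plain Python classes and sets;
# it only imitates the ideas and is not OWL syntax.

class Animal:                  # 1. a class
    legs = 4                   # 2. characteristic shared by the whole class

class Dog(Animal):             # 1. a sub-class; it inherits legs = 4
    sound = "bark"             # 2. a characteristic of its own

moti = Dog()                   # 3. an individual (instance) of the class
moti.legs = 3                  #    with its own exceptional value

# 4. Operations on sets of individuals (all names are illustrative).
indian = {"Ramesh", "Sita"}
african = {"Kofi", "Amina"}
men = {"Ramesh", "Kofi"}
women = {"Sita", "Amina"}

indian_women = indian & women  # intersection gives a sub-class
african_men = african & men

print(Dog.legs, moti.legs)     # 4 3
print(isinstance(moti, Animal))  # True: the ISA relation cannot be overridden
print(indian_women, african_men)
```

The instance can override a HASA-type characteristic (the number of legs), but it cannot stop being an Animal, which corresponds to the point that ‘This cow is not an animal.’ is always wrong.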
5.7 Computer and Lexical Resources
In Section 5.6 of this unit we discussed that the process of organizing Natural Language
Resources, i.e. linguistic units, for various linguistic programmes is called ‘Knowledge
Representation’. In ‘Knowledge Representation’, the meaning and grammatical
characteristics of words are arranged in a variety of ways for use by computer programmes.
For this reason, we cannot refer to all of them simply as ‘Semantic Net’ or ‘WordNet’. We call
dictionaries, word lists, dictionaries of multi-word expressions, thesauruses, wordnets,
semantic nets, etc. ‘Lexical Resources’.
The development of lexical resources is the goal of those who do linguistic work on
computers. These lexical resources can be used in programmes for language testing, analysis
of language composition, translation, language teaching, ‘enquiry’ and other areas of
information and communication.
Now we will discuss some of the dictionaries or lexical resources available on the computer
or on the internet. If you search for ‘Hindi dictionary’, you will find hundreds of
dictionaries, such as www.hindi-dictionary.net, www.babylon.com, en.bab.ladictionary etc. The
last two are truly multilingual dictionaries and give the meaning of a given word in the other
language depending on the chosen source language and target language. However, they lack
usage examples and cover a limited number of words.
Some of the dictionaries available on the Internet are free, while many can be
obtained by paying a fee. www.shobdkosh.com contains the usages and co-usages of the
word. Along with the usages, English equivalents etc. are also available. As soon as
characters are typed, many words appear in the drop-down menu, and their translations can be
seen. There are also examples of word usage. But these too have gaps.
Now, we will discuss ‘Wordnet’ and ‘Thesaurus’ in the context of other lexical resources.
‘Hindi Wordnet’ (Hindi wordnet : www.cfilt.iitb.ac.in) : Prof. Pushpak Bhattacharya created
this site. It contains 92456 words and 36910 synonyms. 23636 synonym sets are related to
each other. There are hundreds of rare words among the synonyms. The collection of words
has been painstakingly done.
‘Thesaurus’ : Arvind Lexicon ‘arvindlexicon.com’ is the largest thesaurus. Its free version
contains 85000 words and 73000 English synonyms in 8500 entries. If there are three
meanings of a word, then all the equivalents and English words come to the fore. Each
synonym is associated with a hyperlink and on clicking it, it comes up with its meaning as in
the original entry. Many synonyms are rare. If you look at the depth of meaning, derived
words and the convenience of quick search, it is a very useful tool. It is a true bilingual
thesaurus; there are no usages in it like a learner’s dictionary.
We discussed that a computer dictionary can be a combination of a usage dictionary and a
thesaurus. Such an effort has been made in the ‘Chhatra Kosha’ (learner’s dictionary) available
on www.hindi-dictionary.com. It explains meaning with usage and provides quick access to
thesaurus-like etymologies and related words.
5.8 Let Us Sum Up
The computer solves mathematical problems and keeps track of a company’s accounts. We call
this task ‘data processing’. The computer is also used for editing texts; we call this ‘word
processing’.
We can do many other things related to language through computers. This field is named
‘Natural Language Processing’ (NLP). The computer can type out spoken language by listening
to it, and it can read written language aloud. It can read our written text and
correct spelling and grammar mistakes. To do all these tasks, the computer needs the
vocabulary, grammatical rules etc. of the language. With the help of the same resources,
the computer completes many tasks like translation, writing poetry, answering questions,
etc. The computer’s ability to do this work is named ‘Artificial Intelligence’ in Computer
Science.
What is the basis of ‘artificial intelligence’? The dictionary (i.e. list of words) kept in the
computer is just data. We have to put words and expressions into the computer in such a way
that it can recognize the features of word usage and sentence structure. That is, if linguistic
resources are in a machine-readable state, the computer can use that information in a variety
of programmes. We call the system of keeping the elements of language in the computer
‘Knowledge Representation’.
We can store the words of the language in the computer in two ways. One way is to keep the
words in the form of a ‘WordNet’, in which the interrelationships of the words are
made clear. For example, ‘Animal’ and ‘Cow’ are the names of a class and a class member.
There is a relation between ‘cow’ and ‘horn’. We can also show these types of relations
through the ‘Semantic Net’. The other way of storing words in a computer is to store them as
a ‘dictionary’. The dictionary mainly provides grammatical information about words. It tells
how the word is formed, which grammatical class the word belongs to, such as noun,
pronoun or adjective, and what kind of relationship the word has with other words in the
sentence etc. On this basis we create a programme called a ‘Parser’, which analyzes the
sentence, i.e. identifies the compositional relations of the words. On the basis of this
analysis, we are able to complete tasks such as grammatical checking and translation.
‘Knowledge Representation’ rests on logic. This logic is based on our knowledge of the world,
such that ‘the elephant flew’ is a false sentence, because ‘flying’ is not a characteristic of
animals. This knowledge of reality is the basis of lexical resources like the semantic net. We
refer to relations like synonymy, antonymy etc. in dictionaries. They actually reveal the
logical nature of meaning. Sentences with synonyms convey the same meaning and sentences with
antonyms convey the opposite meaning, like ‘I am good’ = ‘I am fine’, ‘This house is good,
but that house is bad’. In the context of knowledge representation, we can say that both words
and syntax are essential for computers, and in this unit we have focused on words.
In what form should the dictionary be kept on the computer so that the computer itself can use
it? If the dictionary is kept in its printed form, it will be of no use to the computer. We
have to keep in the computer only a dictionary that contains logical, meaning-based analysis.
In this context, we have to keep the dictionary in the form of ‘Wordnet’, ‘Semantic Net’ etc.,
which should show all the characteristics such as class-member relations, part-whole
relations, antonyms and synonyms, co-usages etc. We would like to call these resources
‘Lexical Resources’ instead of calling them dictionaries, because their purpose and functions
are diverse.
There are many types of dictionaries. Dictionaries of synonyms and explanatory dictionaries
are of the general category, as they do not show how the word is used in a sentence. ‘Learner’s
Dictionary’ explains the meaning of the word with its use in the sentence. From this point of
view, this dictionary is useful not only for learners, but also for computer programmes like
machine translation, creative writing etc. Both these types of dictionaries are suitable for the
knowledge of the meaning of the word used, but it is not possible to choose the word
according to the thoughts. Thesauruses give us a variety of options, but the usage context is
not clear. Computer dictionaries can be an effective combination of learner’s dictionaries and
thesauruses, as computers allow quick access to desired information with ease of word
search.
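How a computer dictionary might combine the learner’s dictionary and the thesaurus can be sketched as follows. This is a minimal illustration in Python, not the design of any existing product: the names Entry, ComputerDictionary and alternatives are hypothetical, and the meanings given for the sample words are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    headword: str
    meaning: str                                   # explanatory-dictionary part
    example: str                                   # learner's-dictionary part
    synonyms: list = field(default_factory=list)   # thesaurus part


class ComputerDictionary:
    """Hypothetical store that answers both 'what does it mean?' and 'what else can I say?'."""

    def __init__(self):
        # A hash table gives direct access to a headword without
        # scanning the entries page by page as in a printed book.
        self._entries = {}

    def add(self, entry):
        self._entries[entry.headword] = entry

    def lookup(self, word):
        return self._entries.get(word)

    def alternatives(self, word):
        """Synonyms of a word, each paired with its usage example when one is stored."""
        entry = self.lookup(word)
        if entry is None:
            return []
        result = []
        for synonym in entry.synonyms:
            synonym_entry = self.lookup(synonym)
            example = synonym_entry.example if synonym_entry else None
            result.append((synonym, example))
        return result


# Illustrative sample data only.
dictionary = ComputerDictionary()
dictionary.add(Entry("good", "of high quality", "This house is good.",
                     ["fine", "excellent"]))
dictionary.add(Entry("fine", "in a satisfactory state", "I am fine.", ["good"]))

print(dictionary.lookup("good").meaning)
print(dictionary.alternatives("good"))
```

The thesaurus part offers the options, and the stored example sentence supplies the usage context that a plain thesaurus lacks. Where no entry exists for a synonym (here ‘excellent’), the sketch returns the word without an example.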
In this unit we have also explained the difference between the traditional printed dictionary
and the computer dictionary. We have seen that digital dictionaries have four characteristics:
quick access to words, the facility to update from time to time, the facility to search for
required information, and no limit on size. It can be said that digital dictionaries are
endowed with the property of interactiveness, while printed dictionaries lack it. It has also
been clarified that the computer dictionary and the online dictionary, i.e. the dictionary
available on the net, share the characteristics of the digital dictionary, but differ slightly
from the functional point of view. The difference lies in the internet connection: the
computer dictionary is located on the computer itself, whereas the online dictionary can be
used only if the computer has internet access. Some of the dictionaries available on the
internet are free, while many others can be obtained by paying a fee. Today, dictionaries of
many languages are available on the internet. For the English language alone, there are
hundreds of dictionaries available on the net, each with its own characteristics. At the end
of the unit, we have also discussed ‘WordNet’ and ‘Thesaurus’ in the context of lexical
resources.
In fact, the development of lexical resources is the ultimate goal of those who do linguistic
work on computers, because these resources supply authentic material for many areas such
as language testing, analysis of language composition, translation, language teaching,
‘inquiry’ etc.
5.9 Glossary
Digital : Based on the computer’s two-digit (binary) system. Not only text (data etc.) but
also sound, images etc. can be stored and entered by the digital method.
Computing : Processing tasks performed using computers.
Knowledge Representation : The storage of knowledge in the form of information for its
use in a computer.
Knowledge base : Information stored in a computer in the form of knowledge.
Artificial intelligence : A computer’s ability to perform tasks such as playing chess in a way
that makes it appear as if the computer is ‘thinking’.
Surfing: Moving from one website or web-page to another site or page on the internet.
Index : An ordered list of words that appear in the text.
Concordance : A list of all the usages of a word in a text.
Reasoning : Drawing conclusions in a logical manner.
Aggregate Set: A set of objects.
Interactiveness : A way to ‘communicate’ with the computer.
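The glossary terms ‘Index’ and ‘Concordance’ can be illustrated with a short sketch in Python. It is a simplified illustration only: the function names tokenize, build_index and concordance are hypothetical, the tokenizer handles only plain lower-case English letters, and the sample sentence is taken from this unit.

```python
import re


def tokenize(text):
    # Simplifying assumption: a word is a run of letters or apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def build_index(text):
    """Index: an ordered list of the distinct words that appear in the text."""
    return sorted(set(tokenize(text)))


def concordance(text, word, width=2):
    """Concordance: every use of `word`, with `width` words of context on each side."""
    tokens = tokenize(text)
    lines = []
    for position, token in enumerate(tokens):
        if token == word.lower():
            left = tokens[max(0, position - width):position]
            right = tokens[position + 1:position + 1 + width]
            # The keyword is shown in capitals so that it stands out.
            lines.append(" ".join(left + [token.upper()] + right))
    return lines


text = "This house is good, but that house is bad."

print(build_index(text))
# ['bad', 'but', 'good', 'house', 'is', 'that', 'this']

print(concordance(text, "house"))
# ['this HOUSE is good', 'but that HOUSE is bad']
```

The index lists each word once, whereas the concordance lists every occurrence of the chosen word together with its neighbouring words.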
5.10 Exercises
1. Throw light on the main and secondary functions of the computer.
2. Give an introduction to the major programmes of Natural Language Processing
related to the field of translation.
3. Write a note on ‘Functions of Dictionary’.
4. What is Computer Dictionary? What is the difference between Online dictionary and
Computer dictionary?
5. Write an essay on ‘Knowledge Representation’.
6. Throw light on ‘Computer and Lexical Resources’.
5.11 Suggested Readings
• Malhotra, Vijay Kumar, 1996. Computer Ke Bhashik Anuprayog (Linguistic Applications
of Computer), Vani Prakashan, Delhi.
• Sethi, Harish Kumar, 2009. E-Anuvad Aur Hindi (E-Translation and Hindi), Kitabghar,
New Delhi.
• Allen, James, 1995 (Indian edition 2003). Natural Language Understanding, Pearson
Education, Delhi.
• Jurafsky, Daniel & James H. Martin, 2000. Speech and Language Processing, Pearson
Education, Delhi.
• Coppin, Ben, 2004. Artificial Intelligence Illuminated, Narosa Publishing House (P) Ltd,
New Delhi.