Lecture Notes
ON
Multimedia Systems
Course Code: OEC-CS701B
SYLLABUS
Module I:
Introduction: Multimedia today, Impact of Multimedia, Multimedia Systems, Components and Its
Applications
Module II:
Text and Audio, Image and Video
Text: Types of Text, Ways to Present Text, Aspects of Text Design, Character, Character Set, Codes,
Unicode, Encryption; Audio: Basic Sound Concepts, Types of Sound, Digitizing Sound, Computer
Representation of Sound (Sampling Rate, Sampling Size, Quantization), Audio Formats, Audio tools,
MIDI
Image: Formats, Image Color Scheme, Image Enhancement; Video: Analogue and Digital Video,
Recording Formats and Standards (JPEG, MPEG, H.261), Transmission of Video Signals, Video
Capture, and Computer-based Animation.
Module III:
Synchronization, Storage Models and Access Techniques: Temporal relationships, synchronization
accuracy specification factors, quality of service; magnetic media, optical media, file systems
(traditional, multimedia); Multimedia devices – Output devices, CD-ROM, DVD, Scanner, CCD
Module IV:
Image and Video Database, Document Architecture and Content Management: Image
representation, segmentation, similarity-based retrieval, image retrieval by color, shape and texture;
indexing – k-d trees, R-trees, quad trees; Case studies – QBIC, Virage. Video content, querying, video
segmentation, indexing.
Content Design and Development, General Design Principles
Hypertext: Concept, Open Document Architecture (ODA), Multimedia and Hypermedia Experts
Group (MHEG), Standard Generalized Markup Language (SGML), Document Type Definition (DTD),
Hypertext Markup Language (HTML) in Web Publishing. Case study of Applications.
Module V:
Multimedia Applications: Interactive television, Video-on-demand, Video Conferencing, Educational
Applications, Industrial Applications, Multimedia archives and digital libraries, media editors.
Text Books:
1. Ralf Steinmetz and Klara Nahrstedt, Multimedia: Computing, Communications & Applications, Pearson Ed.
2. Nalin K. Sharda, Multimedia Information System, PHI.
3. Fred Halsall, Multimedia Communications, Pearson Ed.
CONTENTS
Lecture 10: Computer Representation of Sound (Sampling Rate, Sampling Size, Quantization)
Lecture 26: Similarity-based retrieval, image retrieval by color, shape and texture
Lecture 33: General Design Principles Hypertext: Concept, Open Document Architecture (ODA),
LECTURE NOTE 1
Multimedia is an interactive medium that provides multiple, powerful ways to represent information to the user.
It provides an interaction between users and digital information. It is a medium of communication. Some
of the sectors where multimedia is used extensively are education, training, reference material, business
presentations, advertising and documentaries.
Definition of Multimedia
By definition, Multimedia is a representation of information in an attractive and interactive manner through a
combination of text, audio, video, graphics and animation. In other words, Multimedia is a
computerized method of presenting information that combines textual data, audio, visuals (video), graphics and
animations. Examples: E-Mail, Yahoo Messenger, Video Conferencing, and Multimedia Message Service
(MMS).
Multimedia, as the name suggests, is the combination of Multi and Media, that is, many types of media
(hardware/software) used for communication of information.
Components of Multimedia
Following are the common components of multimedia:
Text- All multimedia productions contain some amount of text. The text can have various types of fonts and sizes
to suit the professional presentation of the multimedia software.
Graphics- Graphics makes the multimedia application attractive. In many cases people do not like reading large
amount of textual matter on the screen. Therefore, graphics are used more often than text to explain a concept,
present background information etc. There are two types of Graphics:
Bitmap images- Bitmap images are real images that can be captured from devices such as digital cameras or
scanners. Generally bitmap images are not editable. Bitmap images require a large amount of memory.
Vector Graphics- Vector graphics are drawn on the computer and only require a small amount of memory. These
graphics are editable.
Audio- A multimedia application may require the use of speech, music and sound effects. These are called the audio or
sound elements of multimedia. Speech is also a perfect way of teaching. Audio is of two types: analog and digital.
Analog audio or sound refers to the original sound signal, while a computer stores sound in digital form. Therefore, the
sound used in a multimedia application is digital audio.
Video- The term video refers to a moving picture accompanied by sound, such as a picture on television. The video
element of a multimedia application conveys a lot of information in a short duration of time. Digital video is useful in
multimedia applications for showing real-life objects. Video places the highest performance demand on computer
memory and on bandwidth if placed on the internet. Digital video files can be stored like any other files in the
computer, and the quality of the video can still be maintained. Digital video files can be transferred within a
computer network, and digital video clips can be edited easily.
Animation- Animation is the process of making a static image look like it is moving. An animation is just a
continuous series of still images displayed in a sequence. Animation can be used effectively for
attracting attention. Animation also makes a presentation light and attractive, and it is very popular in
multimedia applications.
Applications of Multimedia
Following are the common areas of applications of multimedia.
Multimedia in Business- Multimedia can be used in many business applications. Multimedia technology,
along with communication technology, has opened the door for the formation of global work groups. Today the team
members may be working anywhere and can work for various companies; thus the workplace becomes global.
A multimedia network should support the following facilities:
Voice Mail
Electronic Mail
Multimedia based FAX
Office Needs
Employee Training
Sales and Other types of Group Presentation
Records Management
Multimedia in Marketing and Advertising- By using multimedia, the marketing of new products can be greatly
enhanced. Multimedia boosts communication at an affordable cost and has opened the way for marketing and
advertising personnel. Presentations with flying banners, video transitions, animations, and sound effects are
some of the elements used in composing a multimedia-based advertisement that appeals to the consumer in a way
never used before and promotes the sale of products.
Multimedia in Entertainment- The entertainment sector makes extensive use of multimedia, particularly for
special effects in films and for video games. Music and video applications, interactive single-player and
multiplayer games, and video-on-demand services are all built on multimedia, and integrated audio and visual
effects make games far more engaging.
Multimedia in Education- Many computer games with a focus on education are now available. Consider an example
of an educational game which plays various rhymes for kids. The child can paint pictures and increase or reduce
the size of various objects, etc., apart from just playing the rhymes. Several other multimedia packages are available
in the market which provide a lot of detailed information and playing capabilities to kids.
Multimedia in Banks- The bank is another public place where multimedia is finding more and more application in
recent times. People go to a bank to open savings/current accounts, deposit funds, withdraw money, learn about the
bank's various financial schemes, obtain loans, etc. Every bank has a lot of information which it wants to impart to its
customers. For this purpose, it can use multimedia in many ways. A bank also displays information about its various
schemes on a PC monitor placed in the rest area for customers. Today online and internet banking have become
very popular, and these use multimedia extensively. Multimedia is thus helping banks serve their customers
and also educate them about the banks' attractive finance schemes.
Multimedia in Hospitals- The best use of multimedia in hospitals is for real-time monitoring of the condition of
patients who are critically ill or injured. The conditions are displayed continuously on a computer screen and can alert
the doctor/nurse on duty if any changes are observed on the screen. Multimedia also makes it possible to consult a
surgeon or an expert, who can watch an ongoing surgery live on his PC monitor and give online advice at any crucial
juncture.
In hospitals multimedia can also be used to help diagnose illness, with CD-ROMs/Cassettes/DVDs full of
multimedia-based information about various diseases and their treatment. Some hospitals extensively use
multimedia presentations in training their junior staff of doctors and nurses. Multimedia displays are now
extensively used during critical surgeries.
Multimedia Pedagogues- Pedagogues are useful teaching aids only if they stimulate and motivate the students. The
audio-visual support of a pedagogue can actually help in doing so. A multimedia tutor can provide multiple
challenges to the student to stimulate his interest in a topic. The instruction provided by pedagogues
has moved beyond providing only button-level control to intelligent simulations, dynamic creation of links,
composition and collaboration, and system testing of user interactions.
Communication Technology and Multimedia Services- The advancement of high computing abilities,
communication technologies and relevant standards has begun an era in which you will be provided with
multimedia facilities at home. These services may include:
Basic Television Services
Interactive entertainment
Digital Audio
Video on demand
Home shopping
Financial Transactions
Interactive multiplayer or single player games
Digital multimedia libraries
E-Newspapers, e-magazines
LECTURE NOTE 2
The words “multi” and “media” are combined to form the word multimedia. The word “multi” signifies “many.”
Multimedia is a type of medium that allows information to be easily transferred from one location to another.
Multimedia is the presentation of text, pictures, audio, and video with links and tools that allow the user to
navigate, engage, create, and communicate using a computer.
Multimedia refers to the computer-assisted integration of text, drawings, still and moving images(videos)
graphics, audio, animation, and any other media in which any type of information can be expressed, stored,
communicated, and processed digitally.
To begin, a computer must be present to coordinate what you see and hear, as well as to interact with. Second,
there must be interconnections between the various pieces of information. Third, you’ll need navigational tools
to get around the web of interconnected data.
Multimedia is being employed in a variety of disciplines, including education, training, and business.
Categories of Multimedia
Linear Multimedia:
It is also called non-interactive multimedia. In the case of linear multimedia, the end-user cannot control the
content of the application; it has literally no interactivity of any kind. Some multimedia projects, such as movies,
present material in a linear fashion from beginning to end. A linear multimedia application lacks all the
features with the help of which a user can interact with the application, such as the ability to choose different
options, click on icons, control the flow of the media, or change the pace at which the media is displayed. Linear
multimedia works very well for providing information to a large group of people, such as at training sessions,
seminars, workplace meetings, etc.
Non-Linear Multimedia:
In non-linear multimedia, the end-user is given navigational control to move through the multimedia content
at will. The user can control the access of the application; non-linear multimedia offers the user interactivity to
control the flow of content. Examples include computer games, websites, self-paced computer-based training
packages, etc.
Applications of Multimedia
Multimedia indicates that, in addition to text, graphics/drawings, and photographs, computer information can be
represented using audio, video, and animation. Multimedia is used in:
Education
Multimedia is becoming increasingly popular in education. It is often used to produce study
materials for pupils and to ensure that they have a thorough comprehension of various disciplines. Edutainment,
which combines education and entertainment, has become highly popular in recent years; this approach delivers
learning to the user in the form of enjoyment.
Entertainment
The usage of multimedia in films creates a unique auditory and video impression. Today, multimedia has
completely transformed the art of filmmaking around the world. Multimedia is the only way to achieve difficult
effects and actions.
The entertainment sector makes extensive use of multimedia. It’s particularly useful for creating special effects
in films and video games. The most visible illustration of the emergence of multimedia in entertainment is music
and video apps. Interactive games become possible thanks to the use of multimedia in the gaming business.
Video games are more interesting because of the integrated audio and visual effects.
Business
Marketing, advertising, product demos, presentation, training, networked communication, etc. are applications
of multimedia that are helpful in many businesses. The audience can quickly understand an idea when
multimedia presentations are used. It gives a simple and effective technique to attract visitors’ attention and
effectively conveys information about numerous products. It’s also utilized to encourage clients to buy things in
business marketing.
Technology & Science
In the sphere of science and technology, multimedia has a wide range of applications. It can communicate audio,
films, and other multimedia documents in a variety of formats. Only multimedia can make live broadcasting
from one location to another possible.
It is beneficial to surgeons because they can rehearse intricate procedures such as brain surgery and
reconstructive surgery using images made from imaging scans of the human body. Plans can be produced more
efficiently to cut expenses and complications.
Fine Arts
Multimedia artists work in the fine arts, combining approaches employing many media and incorporating viewer
involvement in some form. For example, a variety of digital mediums can be used to combine movies and
operas.
"Digital artist" is a new term for these types of artists. Digital painters make digital paintings, matte paintings, and
vector graphics of many varieties using computer applications.
Engineering
Multimedia is frequently used by software engineers in computer simulations for military or industrial training.
It’s also used for software interfaces created by creative experts and software engineers in partnership. Only
multimedia is used to perform all the minute calculations.
Components of Multimedia
Multimedia consists of the following 5 components:
Text
Characters are used to form words, phrases, and paragraphs in the text. Text appears in all multimedia creations
of some kind. The text can be in a variety of fonts and sizes to match the multimedia software’s professional
presentation. Text in multimedia systems can communicate specific information or serve as a supplement to the
information provided by the other media.
Graphics
Non-text information, such as a sketch, chart, or photograph, is represented digitally. Graphics add to the appeal
of the multimedia application. In many circumstances, people dislike reading big amounts of material on
computers. As a result, pictures are more frequently used than words to clarify concepts, offer background
information, and so on. Graphics are at the heart of any multimedia presentation. The use of visuals in
multimedia enhances the effectiveness and presentation of the concept. Windows Picture, Internet Explorer, and
other similar programs are often used to view graphics. Adobe Photoshop is a popular graphics-editing program that
allows you to effortlessly change graphics and make them more effective and appealing.
Animations
Animation is a sequence of still images flipped through in rapid succession; it is a set of visuals that gives the
impression of movement. Animation is the process of making a still image appear to move. A presentation can also
be made lighter and more appealing by using animation, and animation is quite popular in multimedia applications.
The following are some of the most regularly used animation-viewing programs: Fax Viewer, Internet Explorer, etc.
Video
Video consists of photographic images that appear to be in full motion and are played back at speeds of 15 to 30
frames per second. The term video refers to a moving image accompanied by sound, such as a television picture. Of
course, text can be included in videos, either as captioning for spoken words or as text embedded in an image, as
in a slide presentation. The following programs are widely used to view videos: RealPlayer, Windows Media
Player, etc.
Audio
Audio is any sound, whether music, conversation, or something else. Sound is the most serious aspect of multimedia,
delivering the joy of music, special effects, and other forms of entertainment. Decibels are the unit of
measurement for volume and sound pressure level. Audio files are used as part of the application context as well
as to enhance interaction. Audio files must occasionally be distributed using plug-in media players when they
appear within online applications and web pages. MP3, WMA, WAV, MIDI, and RealAudio are examples of
audio formats. The following programs are widely used to play audio: RealPlayer, Windows Media Player, etc.
LECTURE NOTE 3
Words and symbols in any form, spoken or written, are the most common system of communication. They
deliver the most widely understood meaning to the greatest number of people. Most academic text, such
as journals and e-magazines, is available in web-browser-readable form.
The size of text is usually measured in points. One point is approximately 1/72 of an inch, i.e. about 0.0138 inch;
a 36-point character body, for example, is about half an inch tall. The size of a font does not exactly describe the
height or width of its characters. This is because the x-height (the height of the lowercase character x) of two
fonts may differ.
Typefaces can be described in many ways, but the most common characterization of a typeface is serif
versus sans serif. The serif is the little decoration at the end of a letter stroke. Times, Times New Roman and Bookman
are fonts which come under the serif category; Arial, Optima and Verdana are examples of sans serif fonts.
Serif fonts are generally used for the body of the text for better readability, and sans serif fonts are generally used
for headings.
(Example: the letter F rendered in a serif font and in a sans serif font.)
• Any number of typefaces can be used in a single presentation; this concept of
using many fonts in a single page is called ransom-note typography.
• For small type, it is advisable to use the most legible font.
• In large size headlines, the kerning (spacing between the letters) can be adjusted
• In text blocks, the leading for the most pleasing line can be adjusted.
• Drop caps and initial caps can be used to accent the words.
• The different effects and colors of a font can be chosen in order to make the text
look distinct.
• Anti-aliasing can be used to make the text look gentle and blended.
• For special attention to the text the words can be wrapped onto a sphere or bent
like a wave.
• Meaningful words and phrases can be used for links and menu items.
• In the case of text links (anchors) on web pages, the messages can be accented.
The most important text in a web page, such as a menu, can be put in the top 320 pixels.
Fonts :
PostScript fonts are a method of describing an image in terms of mathematical constructs (Bezier curves); PostScript is
used not only to describe the individual characters of a font but also to describe illustrations and whole pages of
text. Since PostScript makes use of mathematical formulae, it can easily be scaled bigger or smaller.
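As a rough illustration of the idea (a hedged Python sketch, not PostScript itself), the following evaluates a point on a cubic Bezier curve from its four control points; because the curve is defined by a formula, scaling the control points scales the whole outline with no loss of quality:

    # A minimal sketch: evaluate a cubic Bezier curve at parameter t (0..1).
    # Control points are (x, y) tuples; scaling them scales the outline exactly.
    def cubic_bezier(p0, p1, p2, p3, t):
        u = 1.0 - t
        x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
        y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
        return (x, y)

    # Midpoint of a small arch; doubling every control point doubles the result.
    print(cubic_bezier((0, 0), (0, 1), (1, 1), (1, 0), 0.5))   # (0.5, 0.75)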
Apple and Microsoft announced a joint effort to develop a better and faster outline font methodology based on
quadratic curves, called TrueType. In addition to printing smooth characters on printers, TrueType can draw
characters on a low-resolution (72 dpi or 96 dpi) monitor.
The American Standard Code for Information Interchange (ASCII) is the 7-bit character coding system most commonly
used by computer systems in the United States and abroad. ASCII assigns a numeric value to each of 128 characters,
including both lowercase and uppercase letters, punctuation marks, Arabic numerals and mathematical symbols. 32 control
characters are also included; these control characters are used for device control messages, such as carriage return,
line feed, tab and form feed.
A byte, which consists of 8 bits, is the most commonly used building block for computer processing. ASCII uses only
7 bits to code its 128 characters; the 8th bit of the byte is unused. This extra bit allows another 128 characters to be
encoded before the byte is used up, and computer systems today use these extra 128 values for an extended character
set. The extended character set is commonly filled with ANSI (American National Standards Institute) standard
characters, including frequently used symbols.
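These numeric assignments are easy to inspect. The short Python sketch below shows 7-bit ASCII values and one byte from the extended 128-255 range (Latin-1 is assumed here as the extended set):

    # ASCII assigns values 0-127: 'A' is 65, and control codes sit below 32.
    print(ord('A'), ord('a'), ord('\t'))     # 65 97 9

    # Values 128-255 depend on which extended character set is chosen.
    print(bytes([65]).decode('ascii'))       # 'A'  - within the 7-bit range
    print(bytes([233]).decode('latin-1'))    # 'é'  - an extended-set character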
Unicode
Unicode makes use of a 16-bit architecture for multilingual text and character encoding. Unicode accommodates about
65,000 characters covering the known languages and alphabets of the world.
Where several languages share a set of symbols that have a historically related derivation, the shared symbols of each
language are unified into collections of symbols (called scripts). A single script can work for tens or even
hundreds of languages.
Microsoft, Apple, Sun, Netscape, IBM, Xerox and Novell are participating in the development of this standard, and
Microsoft and Apple have incorporated Unicode into their operating systems.
There are several software tools that can be used to create customized fonts. These tools help a multimedia developer to
communicate his idea or graphic feeling. Using this software, different typefaces can be created. Some
multimedia projects may require the creation of special characters; using font editing tools it is possible to
create a special symbol and use it in the entire text.
Following is the list of software that can be used for editing and creating fonts:
• Fontographer
• Fontmonger
• Cool 3D text
Special font editing tools can be used to make your own type so you can communicate an idea or graphic feeling
exactly. With these tools professional typographers create distinct text and display faces.
Fontographer:
It is a Macromedia product: a specialized graphics editor for both Macintosh and Windows platforms. You can use
it to create PostScript, TrueType and bitmapped fonts for Macintosh and Windows.
Making Pretty Text:
To make your text look pretty you need a toolbox full of fonts and special graphics applications that can stretch,
shade, color and anti-alias your words into real artwork. Pretty text can be found in bitmapped drawings where
characters have been tweaked, manipulated and blended into a graphic image.
Hypermedia and Hypertext:
Multimedia – the combination of text, graphic, and audio elements into a single collection or presentation – becomes
interactive multimedia when you give the user some control over what information is viewed and when it is viewed.
LECTURE NOTE 4
1. Text
- Text is the basic element of multimedia. It involves the use of text types, sizes, colors and background
colors.
- In multimedia, text is mostly used for titles, headlines, menus, paragraphs, lists, etc.
- In a multimedia application, other media or screens can be linked through the use of text. This is what is
called Hypertext.
- The most commonly used software for viewing text files are Microsoft Word, Notepad, WordPad,
etc. Mostly the text files are formatted with .DOC, .TXT, etc. extensions.
2. Graphics
- It is a digital representation of non-text information, such as a drawing, chart, or photograph.
- Graphics make the multimedia application attractive. They help to illustrate ideas through still pictures.
- There are two types of graphics used: bitmaps (paint graphics) and vectors (draw graphics).
- Bitmap graphics are also called raster graphics. A bitmap represents an image as an array of dots called
pixels. Bitmap graphics are resolution-dependent and generate large file sizes.
- Vector graphics are images drawn on the computer with software that uses geometrical formulas to
represent images; they require only a small amount of memory.
3. Video
- Video is the technology of electronically capturing, recording, processing, storing, transmitting, and
reconstructing a sequence of still images representing scenes in motion.
- Video consists of photographic images that are played back at speeds of 15 to 30 frames a second and
provide the appearance of full motion.
The use of video:
- The embedding of video in multimedia applications is a powerful way to convey information which can
incorporate a personal element which other media lack.
- Promoting television shows, films, or other non-computer media that traditionally have used trailers in their
advertising.
- Giving users an impression of a speaker’s personality.
- Showing things that move. For example a clip from a motion picture. Product demos of physical products
are also well suited for video.
4. Audio
- In multimedia, audio means anything related to recording, playing sound, etc.
- Audio is an important component of multimedia because it increases understandability and
improves the clarity of the concept.
- Audio includes speech, music or any other sound.
5. Animation
- Animation is the rapid display of a sequence of images of 2-D artwork or model positions in order to create
an illusion of movement.
- It is an optical illusion of motion due to the phenomenon of persistence of vision, and can be created and
demonstrated in a number of ways.
- Entertainment multimedia titles in general, and children’s titles specifically, rely heavily on animation.
The use of animation:
- To attract attention
- To inform about the state of process
- Demonstrations
- Interactive simulations
LECTURE NOTE 5
A character set is the key component behind displaying, manipulating and editing text, numbers and symbols on a
computer. A character set is created through a process known as encoding i.e. each character is assigned with a
unique code or value.
All word and/or data processing applications are embedded with one or more character sets. The characters within a
character set can be text, numbers or even symbols. Each character is represented by a number. The ASCII character set
is one of the most popular character sets, used by general computers to display text or numbers on the computer screen.
It represents uppercase and lowercase English alphabets, numbers, mathematical operators and symbols with the numbers
0–127.
A character set (also called a repertoire) is a collection of characters that have been grouped together for a specific
purpose.
A character is a minimal unit of text that has semantic value. For example, the letter A is a character, as is the
number 1. But the number 10 consists of two characters — the 1 character, and the 0 character.
A character doesn't need to be a letter or number. It could be a symbol or icon of some sort. For example, the
greater-than sign > is a character, as are each of the various smiley face characters.
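A quick Python illustration of this definition: the string "10" is two characters, each with its own code, and a symbol is a character too.

    # "10" is not one character but two: '1' followed by '0'.
    print(len("10"))               # 2
    print([ord(c) for c in "10"])  # [49, 48]
    print(ord(">"))                # 62 - the greater-than sign is a character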
Some character sets are used by multiple languages. For example, the Latin character set is used by many
languages, including Italian, Spanish, Portuguese, French, German, Dutch, English, Danish, Swedish, Norwegian,
and Icelandic. Other character sets (such as the Thai character set) are used in only one language.
Some character sets are used for other purposes, such as punctuation marks, diacritics, mathematical symbols,
technical symbols, arrows, dingbats, emoji, and more.
The term "character set" tends to have slightly different conotations depending on the context, and it is often used
quite loosely. The term is also avoided by some in favor of more precise terminology. This is in large part due to
the introduction of the Universal Coded Character Set (UCS) as defined by both ISO/IEC 10646 and Unicode
standards. The UCS now encompasses most other character sets, and it has changed the way that characters are
encoded on computer systems.
The terms character set and character encoding are often used interchangeably despite the fact that they have
different meanings.
Character encoding is a set of mappings between the bytes in the computer and the characters in the character set.
It's how a character is encoded so that computer systems can read, write, and display that character as intended.
This means that any system that supports that encoding method can use the characters that the encoding method
supports.
Probably the main reason these terms have been used interchangeably is because, historically, the same standard
would be used to define the repertoire (character set) as well as how it was going to be encoded.
There have been many hundreds of different encoding systems over the years. This has caused all sorts of
problems, especially when users from different systems tried to share data.
Users would often open a document, only to find it unreadable, with weird looking characters being displayed all
through it. This is because the person who created the document used a different encoding system to the person
who was trying to read it.
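The effect is easy to reproduce. In this small Python sketch, the same bytes are decoded with a different encoding than the one used to write them, producing exactly the weird-looking characters described above:

    # Encode text under one system, then decode it under another.
    data = "café".encode("utf-8")     # the bytes 63 61 66 c3 a9
    print(data.decode("utf-8"))       # 'café'  - correct round trip
    print(data.decode("latin-1"))     # 'cafÃ©' - classic garbled output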
Things have changed a lot since the 1990s when this issue was prevalent.
The Unicode Consortium was founded in 1991 to develop, extend and promote use of the Unicode Standard, which
specifies the representation of text in modern software products and standards.
The Unicode Standard now encodes almost all characters used on computers in the world, and it is also a superset
of most other character sets (i.e. the character sets of many existing international, national and corporate standards
are incorporated within the Unicode Standard).
The way Unicode works is, it assigns each character a unique numeric value and name. It provides a unique
number (also known as a code point) for every character, regardless of the platform, the program, or the language.
This resolves issues that would arise when multiple encoding systems used the same code points for different
characters.
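In a language such as Python, which exposes Unicode directly, each character's code point and official name can be inspected; the values printed below are standard Unicode assignments:

    import unicodedata

    for ch in "Aé€":
        print(ch, hex(ord(ch)), unicodedata.name(ch))
    # A 0x41   LATIN CAPITAL LETTER A
    # é 0xe9   LATIN SMALL LETTER E WITH ACUTE
    # € 0x20ac EURO SIGN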
The Unicode Standard has been adopted by such industry leaders as Apple, HP, IBM, JustSystems, Microsoft, Oracle,
SAP, Sun, Sybase, Unisys and many others. It is also required by standards such as XML, Java, ECMAScript
(JavaScript), LDAP, CORBA 3.0, WML, and more.
See this Unicode reference for a list of commonly used Unicode characters, along with the code for adding them to
a web page or other HTML document.
The ISO Working Group responsible for ISO/IEC 10646 and the Unicode Consortium have been working together
since 1991 to create one universal standard for coding multilingual text.
Although ISO/IEC 10646 is a separate standard to Unicode, both standards use the same character codes and
encoding forms.
Also, each version of the Unicode Standard identifies the corresponding version of ISO/IEC 10646.
However, the Unicode Standard imposes additional constraints on implementations to ensure that they treat
characters uniformly across platforms and applications.
UTF-8
UTF-8 is a character encoding defined by Unicode, which is capable of encoding all possible
characters. UTF stands for Unicode Transformation Format. The 8 means it uses 8-bit blocks to represent a
character.
The W3C recommends that web authors use UTF-8 as the character encoding for all web content.
Other UTF encodings include UTF-7, UTF-16, and UTF-32, however, UTF-8 is by far the most commonly used.
HTML5 defines over 2,000 named character references that correspond to Unicode characters/code points. These
named character references (often referred to as "named entities" or "HTML entities") can be used within HTML
documents as an alternative to the character's numeric value.
For example, to display a copyright symbol, authors can use &copy; (the HTML named character
reference), &#xA9; (the hexadecimal value of the Unicode code point), or &#169; (the decimal conversion of the
hexadecimal value).
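All three forms render identically. As a quick check, Python's standard html module resolves each reference to the same character:

    import html

    # Named, hexadecimal and decimal references all unescape to U+00A9.
    for ref in ["&copy;", "&#xA9;", "&#169;"]:
        print(ref, "->", html.unescape(ref))   # each prints the © symbol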
LECTURE NOTE 6
Unicode is a standard for character encoding. The ASCII character set was not enough to cover all the world's
languages; to overcome this limitation, the Unicode Consortium introduced the Unicode encoding scheme.
Internal Storage Encoding of Characters
We know that a computer understands only binary language (0 and 1). Moreover, it is not able to directly understand or
store any alphabets, other numbers, pictures, symbols, etc. Therefore, we use certain coding schemes so that it can
understand each of them correctly. Besides, we call these codes alphanumeric codes.
UNICODE
Unicode is a universal character encoding standard. The standard includes roughly 100,000 characters to represent the
characters of different languages. While ASCII uses only 1 byte, Unicode uses up to 4 bytes to represent a character;
hence, it provides a very wide range of encodings. It has three main forms, namely UTF-8, UTF-16 and UTF-32. Among
them, UTF-8 is used most widely; it is also the default encoding for many programming languages.
UCS
It is a very common acronym in the Unicode scheme. It stands for Universal Character Set. Furthermore, it is the
encoding scheme for storing the Unicode text.
The UTF is the most important part of this encoding scheme. It stands for Unicode Transformation Format and
defines how the code represents Unicode. Its forms are as follows:
UTF-7
This scheme is designed to represent the ASCII standard, since ASCII uses a 7-bit encoding. It represents the ASCII
characters in emails and messages which use this standard.
UTF-8
It is the most commonly used form of encoding, with the capacity to use up to 4 bytes to represent a
character. It uses:
• 1 byte to represent ASCII characters, and 2 bytes to represent additional Latin and Middle Eastern letters and symbols.
• 3 or 4 bytes for the remaining characters, so it can also represent emoji, which are today a very important
feature of most apps.
UTF-16
It is an extension of UCS-2 encoding. It uses 2 bytes to represent each of the 65,536 characters of the basic plane,
and it also supports 4 bytes for additional characters. It is used for internal processing, as in Java, Microsoft
Windows, etc.
UTF-32
It is a multibyte encoding scheme that uses a fixed 4 bytes to represent each character.
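The practical difference between the UTF forms is how many bytes each character costs, as the Python sketch below shows (note that Python's utf-16 and utf-32 codecs prepend a 2- or 4-byte byte-order mark):

    text = "Aé€"   # a 1-byte, a 2-byte and a 3-byte character in UTF-8

    print(len(text.encode("utf-8")))   # 6  = 1 + 2 + 3 bytes
    print(len(text.encode("utf-16")))  # 8  = 2-byte BOM + 3 x 2 bytes
    print(len(text.encode("utf-32")))  # 16 = 4-byte BOM + 3 x 4 bytes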
Other alphanumeric coding schemes include:
• ASCII
• ISCII
Importance of Unicode
• As it is a universal standard therefore, it allows writing a single application for various platforms. This
means that we can develop an application once and run it on various platforms in different languages.
Hence we don’t have to write the code for the same application again and again, and therefore the
development cost is reduced.
• We can use it to convert from one coding scheme to another. Since Unicode is a superset of all
encoding schemes, we can convert a code into Unicode and then convert it into another coding
standard.
• It is preferred by many coding languages. For example, XML tools and applications use this standard
only.
What are the Different Types of Sound
Sound can be of different types—soft, loud, pleasant, unpleasant, musical, audible (can be heard), inaudible
(cannot be heard), etc. Some sounds may fall into more than one category. For instance, the sound produced when
an aeroplane takes off is both loud and unpleasant. The sound produced by a marble cutter, on the other hand, may
not be as loud, but some people might find it irritating and unpleasant.
We find certain sounds pleasant and associate them with music. In a musical sound, there are a number of
frequencies present in a definite ratio or relation to each other. Broadly, musical instruments are classified into
three categories: stringed instruments, wind instruments and percussion instruments.
In stringed instruments like violin, guitar, and sitar, sound is produced by a vibrating string. The shrillness or pitch
of the sound is altered by changing the length of the vibrating portion of the string. For example, a sitar player
plucks the string with the right hand while the pitch of the sound produced is changed by pressing the string with
the index finger of the left hand. These instruments also have an air chamber, which helps increase the loudness of
the sound produced.
Activity
1. Keep all the bowls in a row. Fill all of them with varying levels of water.
2. Strike the bowls one after the other with the drum sticks. What do you find?
Observation: On striking the bowls one after the other, you
get tinkling sounds of different pitches. The pitch is lowest in the bowl with the maximum amount of water, and
highest in the bowl with the least amount of water.
LECTURE NOTE 9
Digitization of Sound
A source generates sound; the destination receives it (senses the sound wave pressure changes) and must deal with it
accordingly:
Source -- Generates Sound
Destination -- Receives Sound
• To get audio or video into a computer, we have to digitize it (convert it into a stream of
numbers); this Analog-to-Digital conversion needs specialised hardware.
• So, we have to understand discrete sampling (both time and voltage)
• Sampling - divide the horizontal axis (the time dimension) into discrete pieces. Uniform sampling
is ubiquitous.
• Quantization - divide the vertical axis (signal strength) into pieces. Sometimes, a non-linear
function is applied.
o 8 bit quantization divides the vertical axis into 256 levels. 16 bit gives you 65536 levels.
• Sampling - Sampling is a process of measuring air pressure amplitude at equally spaced moments in
time, where each measurement constitutes a sample. A sampling rate is the number of times the
analog sound is taken per second. A higher sampling rate implies that more samples are taken during
the given time interval and ultimately, the quality of reconstruction is better. The sampling rate is
measured in Hertz (Hz for short), the term for cycles per second. A sampling rate of
5000 Hz (or 5 kHz, which is the more common usage) implies that 5000 samples are taken per second. The three
sampling rates most often used in multimedia are 44.1 kHz (CD-quality), 22.05 kHz and 11.025 kHz.
• Quantization - Quantization is a process of representing the amplitude of each sample as integers or
numbers. How many numbers are used to represent the value of each sample known as sample size
or bit depth or resolution. Commonly used sample sizes are either 8 bits or 16 bits. The larger the
sample size, the more accurately the data will describe the recorded sound. An 8-bit sample size
provides 256 equal measurement units to describe the level and frequency of the sound in that slice
of time. A 16-bit sample size provides 65,536 equal units to describe the sound in that sample slice
of time. The value of each sample is rounded off to the nearest integer (quantization) and if the
amplitude is greater than the intervals available, clipping of the top and bottom of the wave occurs.
• Encoding - Encoding converts the integer base-10 number to a base-2 that is a binary number. The
output is a binary expression in which each bit is either a 1 (pulse) or a 0 (no pulse).
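The three steps above can be sketched in a few lines of Python. This is an illustration under assumed parameters (a 1 kHz tone, an 8 kHz sampling rate, 8-bit samples), not production signal-processing code:

    import math

    SAMPLE_RATE = 8000   # samples per second (Hz)
    FREQ = 1000          # tone frequency (Hz)
    BITS = 8             # sample size: 2**8 = 256 quantization levels

    def record(n_samples):
        samples = []
        for n in range(n_samples):
            t = n / SAMPLE_RATE                            # sampling: discrete time
            amp = math.sin(2 * math.pi * FREQ * t)         # analog value in [-1, 1]
            level = round((amp + 1) / 2 * (2**BITS - 1))   # quantization to 0..255
            samples.append(level)                          # encoding: stored as an integer
        return samples

    print(record(8))   # one millisecond of the tone as 8-bit sample values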
LECTURE NOTE 10
Quantization of Audio
Quantization is a process to assign a discrete value from a range of possible values to each sample. Number of
samples or ranges of values are dependent on the number of bits used to represent each sample. Quantization results
in stepped waveform resembling the source signal.
• Quantization Error/Noise - The difference between sample and the value assigned to it is known
as quantization error or noise.
• Signal to Noise Ratio (SNR) - Signal to Noise Ratio refers to signal quality versus quantization error.
The higher the Signal to Noise Ratio, the better the voice quality. Working with very small signal levels often
introduces more error, so instead of uniform quantization, non-uniform quantization is used, known as
companding. Companding is a process of distorting the analog signal in a controlled way by
compressing large values at the source (before quantization takes place) and then expanding them at the
receiving end.
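Companding can be illustrated with the standard mu-law curve (mu = 255, as used in telephony); the Python sketch below is illustrative rather than a complete codec:

    import math

    MU = 255.0

    def compress(x):
        """Compress a sample x in [-1, 1]; small values keep finer resolution."""
        return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

    def expand(y):
        """Inverse operation, applied at the receiving end."""
        return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

    x = 0.01                   # a quiet sample near the noise floor
    y = compress(x)            # lifted to about 0.23 before quantization
    print(y, expand(y))        # expands back to ~0.01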
Transmission of Audio
In order to send the sampled digital sound/audio over the wire, that is, to transmit the digital audio, it first has to be
recovered as an analog signal. This process is called demodulation.
• PCM Demodulation - A PCM demodulator reads each sampled value, applies analog filters to
suppress energy outside the expected frequency range, and outputs the resulting analog signal,
which can be used to carry the digital audio over the network.
LECTURE NOTE 11
1. M4A
The M4A is an MPEG-4 audio file. It is a compressed audio file used in the modern setting due to the increased
quality demand that has come with cloud storage and bigger hard drives in contemporary computers. Its high quality
keeps it relevant, as users who need to hear distinct sounds in audio files will need this over more common file
types.
M4A file types are compressed audio files used by Apple iTunes.
Music download software like Apple iTunes uses M4A instead of MP3 because it’s smaller in size and higher in
quality. Its limitations come in the form of compatibility, as a lot of software is unable to recognize the M4A,
making it ideal for only a select type of user.
2. FLAC
The FLAC audio file is the Free Lossless Audio Codec. It is an audio file compressed into a smaller size than the
original file. It’s a sophisticated file type that is lesser-used among audio formats. This is because, even though it
has its advantages, it often needs special downloads to function. When you consider that audio files are shared
often, this can make for quite an inconvenience for each new user who receives one.
The FLAC is a lossless audio file.
What makes the FLAC so important is that its lossless compression can save size and promote sharing of an audio
file while being able to return to the original quality standard. A FLAC file typically requires about sixty percent of
the storage space of the original audio file – this saves a lot of hard drive space and time spent uploading or
downloading.
3. MP3
The MP3 audio file is an MPEG audio layer 3 file format. The key feature of MP3 files is the compression that
saves valuable space while maintaining near-flawless quality of the original source of sound. This compression
makes the MP3 very popular for all mobile audio-playing devices, particularly the Apple iPod.
The MP3 stays relevant among newer audio file types due to its high quality and small size.
MP3 continues to be relevant in today’s digital landscape because it’s compatible with nearly every device capable
of reading audio files. The MP3 is probably best used for extensive audio file sharing due to its manageable size. It
also works well for websites that host audio files. Finally, the MP3 remains popular because of its overall sound
quality. Though not the highest quality, it has enough other benefits to compensate.
4. MP4
An MP4 audio file is often mistaken as an improved version of the MP3 file. However, this couldn’t be further
from the truth. The two are completely different and the similarities come from their namesake rather than their
functionality. Also note that the MP4 is sometimes referred to as a video file instead of an audio file. This isn’t an
error, as in fact it’s both an audio and video file.
There are plenty of differences between the MP4 and MP3.
An MP4 audio file type is a comprehensive media extension, capable of holding audio, video and other media. The
MP4 contains the media data in the file, rather than the code needed to play it; this is important to note, as MP4
files require separate codecs to interpret the data and allow it to be played.
5. WAV
A WAV audio file is a Waveform Audio File that stores waveform data. The waveform data stored presents an
image that demonstrates strength of volume and sound in specific parts of the WAV file. It is entirely possible to
transform a WAV file using compression, though it’s not standard. Also, the WAV is typically used on Windows
systems.
The WAV offers an uncompressed format.
The easiest way to envision this concept is by thinking of ocean waves. The water is loudest, fullest and strongest
when the wave is high. The same holds true for the waveform in the WAV. The visuals are high and large when the
sound increases in the file. WAV files are usually uncompressed audio files, though it’s not a requirement of the
format.
6. WMA
The WMA (Windows Media Audio) is a Windows-based alternative to the more common and popular MP3 file
type. What makes it so beneficial is its lossless compression, retaining high audio quality throughout all types of
restructuring processes. Even though it’s such a quality audio format, it’s not the most popular due to the fact it’s
inaccessible to many users, especially those who don’t use the Windows operating system.
The WMA is a great file for Windows users.
If you’re a Windows user, simply double-click any WMA file to open it. The file will open with Windows Media
Player (unless you’ve changed the default program). If you’re not using Windows, there are some alternatives to
help you out. The first option is to download a third-party system that plays the WMA. If this isn’t something you
want to do, consider converting the WMA to a different audio format. There are plenty of conversion tools
available.
7. AAC
The AAC (Advanced Audio Coding) is an audio file that delivers decently high-quality sound and is enhanced
using advanced coding. It has never been one of the most popular audio formats, especially when it comes to music
files, but the AAC does still serve some purpose for major systems. This includes popular mobile devices and
video gaming units, where the AAC is a standard audio component.
The AAC is a highly-practical audio file.
To open an AAC file, the most common and direct format for most users is through iTunes. All this entails is
launching the iTunes system and opening the AAC file from your computer in the ‘File’ menu. If you don’t have
iTunes and want an alternative, consider downloading third-party software capable of opening the AAC. If that
doesn’t suit your needs, convert the AAC to a more common audio file type.
Many computers support stereo and higher-quality audio, such as CD and DAT quality. Audio Tool automatically
recognizes the capabilities of the computer's audio devices. Therefore, the features of Audio Tool that you can use
depend on your particular system configuration. This chapter describes all of the features and controls of Audio
Tool. However, if your computer supports only monaural audio, you will not see some controls, such as a balance
slider.
Required Accessories
For recording (creating) voice audio files, use the microphone that is shipped with your computer or a commercial-
variety microphone. Connect the microphone to your computer's microphone jack.
You can listen to audio output in one of these ways: from your computer's speaker, from headphones, or from
externally powered speakers connected to the speaker output.
Refer to the audiotool(1) manual page for information about the Audio Tool command's syntax.
The top center of the Audio Tool window displays the name of a sound file along with the file's status.
Three menu buttons are located near the top of the window (File, Edit, and Volume). Use these buttons to edit
audio files and the audio configuration parameters, such as play and record volume. The four buttons located near
the bottom of the window (Rev, Play, Fwd, and Rec) function like buttons on a tape recorder.
To open and play a sound file, do the following:
1. With the pointer inside Audio Tool, choose File -> Open....
2. Double-click SELECT on the directory that contains the audio file you want to hear.
3. Click SELECT on the name of the audio file.
4. Click SELECT on the Open button.
Note -
Depending on the audio capabilities of your computer, you may or may not see all of the choices shown in
the figure (e.g., if your computer does not support stereo, the balance slider is not displayed)
Note that you can choose just a small segment of the sound file to be played. To do this, click SELECT on
the location in the recording where you want to start, and click ADJUST at the end of the segment that you
want to hear.
Figure 11-2 Open Window
Recording Sound
To record an audio file, do the following:
1. If you're recording from a microphone, place the microphone near the sound source.
2. Press MENU on the Volume button and choose Record...
3. In the Audio Control: Record window, select a source; either click SELECT on the Microphone button or
on the Line In button.
4. Click SELECT on the Auto-Adjust button, and speak into the microphone for three to five seconds. Or,
play a selection from the sound source that's connected to your audio hardware's "line in."
Auto adjust automatically sets the recording level. You can also move the Record Volume: slider to adjust
the recording level manually.
If you make a mistake, you can either choose File -> New (which cannot be undone) to clear all the sound
data and start over, or you can edit the sound later. Refer to "Editing an Audio File" for information on
how to do this.
1. Choose File -> Save As... to display the Save As window.
2. Double-click SELECT on the directory destination for the file from the scrolling directory list.
3. Type the file name in the Save As text field.
4. Click MENU on the Format: menu button to choose an appropriate format.
You can cancel a file-save operation by clicking SELECT on the Cancel button in the Save As window.
Or, you can press the keyboard's Stop key while the mouse cursor is in Audio Tool's base window.
To update the file that is being edited, follow the previous steps to modify the file, then choose File -> Save
from Audio Tool's base window.
Figure 11-6 Saving a File in the Save As Window
Editing an Audio File
Because you can see an audio file graphically, you can determine which portions are sound and which are silence.
This feature allows you to edit audio files such as voice messages. For instance, you might want to edit out long
pauses (silence) between sentences or phrases. To modify the sound file, you edit the audio file as you would edit a
text file in Text Editor, using cut, copy, and paste commands from the Edit menu.
To cut a portion of sound from one location and paste it in another, do the following:
1. Click SELECT at the beginning of the sound portion you want to cut.
2. Click ADJUST at the end of the portion you want to cut.
Or, instead of steps 1 and 2, position the hairline cursor at the beginning of the sound portion, hold down
the SELECT mouse button, and drag the pointer to the end of the portion.
3. Choose Edit -> Cut.
4. Click SELECT on the destination point for the portion of sound you cut.
5. Choose Edit -> Paste.
The cut portion of sound is pasted into the new location, as shown in Figure 11-8.
Undoing an Edit
You can undo the last edit you made by choosing Edit -> Undo.
To attach a voice message to a mail message:
1. Click SELECT on the Compose button in the Mail Tool control area, and address the mail.
2. Click SELECT on the Attach... -> Voice... button in the Mail Tool Compose window control area.
The Audio Tool window appears, as shown in Figure 11-9. Note that the mail button labeled "Done"
appears only when Audio Tool is invoked from the Mail Tool Attach button.
Figure 11-9 Audio Tool Window Opened From the Mail Tool Compose Window
The audio attachment glyph appears in the Attachments window (Figure 11-10).
Note also that if you have a pre-recorded message in Audio Tool that you want to attach to a mail message,
you can drag the file's glyph from Audio Tool's drag and drop target to the Compose window's
Attachments subwindow.
Figure 11-10 Audio File Attached to a Mail Message
The mail message containing the file is displayed in the View Message window.
3. Double-click SELECT on the audio attachment glyph in the view window Attachments area.
Musical Instrument Digital Interface (MIDI) is a standard to transmit and store music, originally designed
for digital music synthesizers. MIDI does not transmit recorded sounds. Instead, it includes musical notes, timings
and pitch information, which the receiving device uses to play music from its own sound library.
Before MIDI, digital piano keyboards, music synthesizers and drum machines from different manufacturers could
not talk to each other.
MIDI was developed in the early 1980s to provide interoperability between digital music devices. It was
spearheaded by the president of Roland instruments and developed with Sequential Circuits, an early synthesizer
company that Yamaha purchased in 1987. Other early adopters included Yamaha, Korg, Kawai and Moog.
MIDI does not send the sound wave made by an instrument; instead, it sends information about the music notes,
and the receiving device uses its own internal mechanisms to generate the sounds.
To illustrate how MIDI works: other music formats, such as an RCA-cable signal or MP3 files, are like sending a
phonograph record, whereas a MIDI signal is like the punched paper sheet music of a player piano. MIDI only sends
information about the notes, not what they sounded like. It is like recording everything about how the keys are
pressed in a piano performance and then using that key-press information on a different piano.
MIDI messages carry musical event data, for example:
• Velocity and aftertouch pressure. How hard the key was pressed and held.
• Program change. Change which program the receiver is using. This will often change what type of
instrument the receiver sounds like.
MIDI can also be extended by SysEx (System Exclusive) commands, which are defined by the device manufacturer. These can give one device the ability to completely control another. SysEx also allows MIDI to be used in show control, such
as turning on and off lights or animatronics. For example, some amusement park rides use MIDI to trigger sound,
motion and effects playback in time with the movement of the ride.
The MIDI protocol uses 8-bit serial transmission with one start bit and one stop bit. It has a 31.25 kilobits-per-
second (Kbps) data rate and is asynchronous. Because it is a serial protocol, if a lot of data needs to be transmitted
on a single cable at one time, the music might become out of sync or have other timing issues.
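To make the protocol concrete, here is a small Python sketch (hypothetical note and velocity values) that builds the three raw bytes of a Note On message and estimates its time on the wire, given the framing of one start bit, eight data bits, and one stop bit per byte at 31.25 Kbps described above.

NOTE_ON = 0x90            # status byte: Note On, channel 1 (channels are 0-15)
note = 60                 # middle C (hypothetical example values)
velocity = 100            # how hard the key was pressed

message = bytes([NOTE_ON, note, velocity])   # status, note number, velocity

bits_on_wire = len(message) * 10             # 1 start + 8 data + 1 stop per byte
seconds = bits_on_wire / 31250
print(f"{seconds * 1e6:.0f} microseconds")   # 960 microseconds for one Note On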
Most digital music synthesizers can emulate many different instruments. These are called programs or patches.
MIDI can be used to select specific programs on the playing instrument. The General MIDI specification defines
many of these instruments to help standardize playback.
MIDI is one-way, from transmitter to receiver. The cable must be connected from one device's output to another
device's input.
A through port allows multiple devices to be daisy-chained together. One cable can carry 16 MIDI channels.
Some MIDI devices use USB cables. These are common in instruments designed to be used with a PC, such as
small keyboards or MIDI interfaces. They use USB to send the MIDI information to the computer.
The standard MIDI file, or SMF, is the file format for sharing MIDI information. It typically has the .mid file
extension.
The ubiquity of MIDI music from early video games has led to some bands using game consoles as instruments.
This is called chiptunes.
Most digital electronic instruments are MIDI compatible. This includes equipment such as synthesizers,
sequencers, drum machines, keyboards and electronic drum kits. Some people also make their own specialized
MIDI-compatible controllers for performances. For example, an artist might place sensors in their clothing that
trigger MIDI notes and then play a song by tapping on their arms or legs.
Advantages of MIDI:
• open standard
• wide adoption
• easy to implement
Disadvantage:
• inexact timing
Analog and Digital Video
The analog process encodes video and audio in complete frames (modulation), with the receiving device
interpreting and translating the signal into video and audio on a monitor (de-modulation). This process can
introduce a progressive loss of data leading to a general loss of video quality. NTSC can only deliver 720 pixels
wide video or stills from video.
Digital video, or DV, on the other hand, remains digital (streams of 0s and 1s), with the data describing the colors
and brightness of each video frame. On the receiving end of this data transmission, there is no translation or
interpretation, just the delivery of pure data. The consistency of delivery is the crucial advantage that digital video
has over analog video when it comes to working with images on a PC. As opposed to NTSC, there is no limit to resolution, so images or movies as wide as 4000 pixels are easily obtainable with the digital cameras used for microscopy.
When introduced in 1995, Firewire (also referred to as iLink, IEEE 1394, or simply 1394), one of many electronic protocols for A/V, provided both the transfer speed, at 400 Mbps, and the consistency needed to allow the average user to edit video on a PC.
In the past, there was a clear distinction between USB and Firewire. USB 1.1 could not transfer high quality DV;
loosely defined as 25 frames per second (fps) with each frame being 640x480 resolution, due to USB1.1's transfer
limit of around 11Mbps (or around 1.5MB per second).
Transferring DV requires a transfer rate of at least 3.6MB per second, which at the time left Firewire as one of few
options due to its ability to work at 400 Mbps, or up to around 50 MB per second. Then along came USB 2.0, with a published transfer rate of 480 Mbps, or 60 MB per second. Not to be outdone, at least on paper, Firewire 800 (IEEE 1394b) was introduced; the new standard also provides for even faster 1.6 Gbps and 3.2 Gbps transfer rates across 100 meters of copper twisted-pair wire.
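These rates are easy to sanity-check. The Python sketch below compares the interface speeds quoted above against the roughly 3.6 MB per second a DV stream needs, and against the bandwidth that raw, uncompressed 640x480 video would demand.

dv_mbytes_per_sec = 3.6                      # DV stream rate quoted above

# Raw (uncompressed) 640x480, 24 bits per pixel, 25 frames per second:
raw_mbps = 640 * 480 * 24 * 25 / 1e6         # ~184 Mbps before any compression

for name, mbps in [("USB 1.1", 11), ("Firewire 400", 400), ("USB 2.0", 480)]:
    mbytes = mbps / 8                        # convert megabits/s to megabytes/s
    verdict = "can" if mbytes >= dv_mbytes_per_sec else "cannot"
    print(f"{name}: {mbps} Mbps = {mbytes:.1f} MB/s, {verdict} carry DV")
print(f"Raw 640x480 video would need about {raw_mbps:.0f} Mbps")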
At first glance, it appears that USB 2.0 is faster than Firewire400; however speed is not the only issue when it
comes to DV. One of the worst issues with USB 2.0 is that it cannot guarantee a sustained data transfer rate. This is
due to USB adopting a "master-slave" topology, which means it needs the computer's CPU to coordinate data
transfers. While this is not usually a problem for low-demand peripherals such as webcams, scanners, printers, etc.,
digital video requires dependable high-performance to avoid dropping video frames.
Firewire was designed from the beginning as a real time protocol. Firewire also works in a peer-to-peer topology.
As a result, many professional DV users can download their video clips from a DV camcorder directly to a hard
drive without the use of a PC. More importantly, Firewire delivers data consistently at a specified high rate. If you
want to do serious work with video, even to edit a family movie, it is best to go with Firewire.
Analog CCD cameras are still a good way to go if you want to use a video projector or a big-screen TV or monitor, as in classroom or training situations. To work with high-resolution digital video or digital still images for print publication, CMOS or CCD digital video cameras are the better choice.
Digital video cameras that use USB 2.0 are fine for work done with microscopes, because videos captured on microscopes are relatively short clips. Also, when using a USB 2.0 video system, we suggest using a
newer, fast computer with no other applications or TSR's running in the background or better yet, using a dedicated
computer to do microscope video imaging.
USB2.0 is found on all new PC or Macintosh type computers, whereas Firewire is typically an add-on card for
most PC's. All Meiji Techno Digital Video Cameras use USB2.0 with no current plans for Firewire cameras.
Firewire is native to most Macintosh computers so finding a Firewire camera is probably the best way to go if you
are a Mac user.
JPEG stands for Joint Photographic Experts Group. JPEG is primarily a type of digital image compression. It is often thought of as an image format, but strictly speaking it is an image compression technique, used inside many file formats such as EPS, PDF, and even TIFF. The extension JPG is simply a variant spelling of JPEG; the two are the same thing.
This image compression technique was developed by the Joint Photographic Experts Group, hence the name JPEG. It uses a lossy compression algorithm, so some information is removed from the image during compression. The JPEG standard works by averaging color variation and discarding information that the human eye cannot perceive.
How JPEG compression works
JPEG compresses either full-color or grayscale images. In the case of color images, RGB is transformed into a luminance/chrominance color space. JPEG compression mainly works by identifying similar areas of color inside the image and converting them to the same color code. JPEG uses the DCT (Discrete Cosine Transform) method as its coding transformation. Steps of Compression:
1. The raw image is first converted to a different color model, which separates the color of a pixel from its brightness: RGB is converted to Y-Cb-Cr, since JPEG uses the Y-Cb-Cr model instead of RGB.
2. The image is divided into small blocks of 8x8 pixels each.
3. DCT is applied to each block of pixels, converting the image from the spatial domain to the frequency domain. For an 8x8 block f(x, y), the DCT is
$$F(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16},$$
where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.
4. The resulting coefficients are quantized. Because human eyes cannot see high frequencies well, coarser quantization is applied to the high-frequency coefficients.
5. After quantization, a zigzag scan is performed on the quantized 8x8 blocks to group the low-frequency coefficients first.
6. The coefficients are then encoded by run-length and Huffman coding algorithms to get the final image.
History of JPEG
In 1982, the ISO (International Organization for Standardization) formed a Photographic Experts Group to research how to transmit video and still images over data links. Three years later, the CCITT formed a group to work on image compression techniques. In 1987, the two groups were merged to form the Joint Photographic Experts Group (JPEG), which worked on a new standard that uses data compression to make graphics files small. The JPEG standard was issued in 1992, and its latest version was released in 1994.
Characteristics of JPEG
• The main characteristic of JPEG is that it uses a lossy compression technique, so the size of the image is reduced.
• The JPEG standard works by averaging color variation and discarding information that the human eye cannot see; this is why the compression is lossy.
• JPEG has an improved way to compress a file: it automatically looks over the file and chooses the best way to compress it.
• The JPEG standard is composed of several separate parts:
• JPEG ISO/IEC 10918-1: defines the core coding technology of JPEG and the options for encoding photographic images.
• JPEG ISO/IEC 10918-2: specifies rules for compliance testing of software.
• JPEG ISO/IEC 10918-3: defines a set of extensions to the coding technologies of Part 1, including the Still Picture Interchange File Format (SPIFF).
• JPEG ISO/IEC 10918-4: covers the registration of files that use JPEG extensions.
• JPEG ISO/IEC 10918-5: defines the file format known as the JPEG File Interchange Format (JFIF).
• JPEG can work with multiple files, that is, it can process several images at the same time.
Advantages of JPEG
• It offers a very good compression ratio and image quality, as well as a good transmission rate.
• The JPEG standard supports 24-bit color, i.e., up to 16 million colors, so color reproduction is excellent.
• JPEG files are very small in size, yet the quality is not proportionally low, so we can save disk space when storing JPEG files without noticeably affecting image quality.
• Image processing time is much faster than with other image standards.
• It is suitable for full-color, realistic images with many color and contrast transitions.
• JPEG is compatible with virtually every computer, mobile device, camera, and photo editor.
• The user can independently select the quality/compression ratio of a JPEG image.
• No editing is required to print an image; JPEG files can be printed directly from camera devices.
Disadvantages of JPEG
• An image may lose important details because of the lossy compression: the image is divided into 8x8 blocks, and much information is discarded.
• JPEG is not flexible: it is not efficient for images that contain text, sharp lines, or hard edges. JPEG is mainly good for portraits and nature photographs.
• Image quality is reduced after JPEG compression, especially for text-based images. For other images, the effect is not very noticeable unless the image is inspected in detail.
• The JPEG standard does not support opacity or transparency; in most cases, the transparent portion of an image is rendered as a white area in JPEG.
• JPEG does not handle 1-bit black-and-white images or motion pictures.
• JPEG images can have less color depth than other image formats.
• Layered images are not supported in JPEG.
MPEG
• MPEG stands for the Moving Picture Experts Group. MPEG is an ISO/IEC working group, established in 1988 to develop standards for digital audio and video formats.
• There are five MPEG standards being used or in development. Each compression standard was designed
with a specific application and bit rate in mind, although MPEG compression scales well with increased bit
rates.
• Bitrate, as the name implies, describes the rate at which bits are transferred from one location to another. In
other words, it measures how much data is transmitted in a given amount of time. Bitrate is commonly
measured in bits per second (bps), kilobits per second (Kbps), or megabits per second (Mbps).
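As a quick worked example (Python), a bitrate translates directly into data volume over time; 1.5 Mbit/s is the design target of MPEG-1, described next.

bitrate_bps = 1_500_000                   # 1.5 Mbps
seconds = 60
megabytes = bitrate_bps * seconds / 8 / 1e6
print(f"One minute at 1.5 Mbps is about {megabytes:.2f} MB")   # ~11.25 MB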
MPEG1
• Designed for up to 1.5 Mbit/sec
• Standard for the compression of moving pictures and audio.
• This was aimed at CD-ROM video applications, and it is a popular standard for video on the Internet, transmitted as .mpg files.
• In addition, Layer 3 of MPEG-1 audio is the most popular standard for digital compression of audio, known as MP3.
MPEG2
• MPEG-2 is an enhanced version of MPEG-1
• Designed for between 1.5 and 15 Mbit/sec.
• It was specially designed for HDTV.
• HDTV is designed to provide a sense of immersion to the viewer. Hence it uses high resolution image
frames.
• It is based on MPEG-1, but designed for the compression and transmission of digital broadcast television.
The most significant enhancement from MPEG-1 is its ability to efficiently compress interlaced video.
• MPEG-2 scales well to HDTV resolution and bit rates, obviating the need for an MPEG-3
• The interlaced signal contains two fields of a video frame captured at two different times. This technique uses two fields to create a frame: one field contains all odd-numbered lines in the image; the other contains all even-numbered lines.
MPEG4
• MPEG-4 is an advanced version of MPEG-2.
• MPEG-4 uses object-based compression: individual objects within a scene are tracked and compressed separately, then composed into an MPEG-4 file.
• This results in very efficient compression that is very scalable, from low bit rates to very high.
• It also allows developers to control objects independently in a scene, and therefore introduce interactivity.
H.261
H.261 is an earlier digital video compression standard. Because its principle of motion - compensation - based
compression is very much retained in all later video compression standards, we will start with a detailed discussion
of H.261.
The International Telegraph and Telephone Consultative Committee (CCITT) initiated development of H.261 in 1988. The final recommendation was adopted by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T).
The standard was designed for videophone, video conferencing, and other audiovisual services over ISDN telephone lines. Initially, it was intended to support multiples (from 1 to 5) of 384 kbps channels. In the end, however, the video codec supports bitrates of p x 64 kbps, where p ranges from 1 to 30. Hence the standard was once known as p * 64, pronounced "p star 64". The standard requires the video encoder's delay to be less than 150 msec, so that the video can be used for real-time, two-way communication.
H.261 belongs to a set of ITU recommendations for visual telephony systems (the H.320 family).
H.261 supports two video frame formats, CIF (352 x 288 luminance pixels) and QCIF (176 x 144); chroma subsampling in H.261 is 4:2:0. Considering the relatively low bitrates available in network communications at the time, support for QCIF is required, while support for CIF is optional.
The following figure illustrates a typical H.261 frame sequence. Two types of image frames are defined: intra-frames (I-frames) and inter-frames (P-frames).
I-frames are treated as independent images. Basically, a transform coding method similar to JPEG is applied within each I-frame.
P-frames are not independent. They are coded by a forward predictive coding method in which current macroblocks are predicted from similar macroblocks in the preceding I- or P-frame, and the differences between the macroblocks are coded. Temporal redundancy removal is hence included in P-frame coding, whereas I-frame coding performs only spatial redundancy removal. It is important to remember that prediction from a previous P-frame is allowed, not only from a previous I-frame.
The interval between pairs of I-frames is variable and is determined by the encoder. Usually, an ordinary digital video has a couple of I-frames per second. Motion vectors in H.261 are always measured in units of full pixels and have a limited range of +/-15 pixels.
I-frame coding
Macroblocks are of size 16 x 16 pixels for the Y frame of the original image. For Cb and Cr frames, they correspond to areas of 8 x 8, since 4:2:0 chroma subsampling is employed. Hence, a macroblock consists of four Y blocks, one Cb block, and one Cr block.
The following figure shows the H.261 P-frame coding scheme based on motion compensation. For each macroblock in the Target frame, a motion vector is allocated by one of the search methods discussed earlier. After the prediction,
a difference macroblock is derived to measure the prediction error. It is also carried in the form of four Y blocks, one Cb block, and one Cr block. Each of these 8 x 8 blocks goes through DCT, quantization, zigzag scan, and entropy coding.
Sometimes, a good match cannot be found, i.e., the prediction error exceeds a certain acceptable level. The macroblock itself is then encoded (treated as an intra macroblock); in this case, it is termed a non-motion-compensated macroblock.
P-frame coding encodes the difference macroblock (not the Target macroblock itself). Since the difference macroblock usually has a much smaller entropy than the Target macroblock, a large compression ratio is attainable. In fact, even the motion vector is not directly coded. Instead, the difference MVD between the motion vectors of the preceding and current macroblocks is sent for entropy coding:
$$\mathrm{MVD} = \mathrm{MV}_{\mathrm{current}} - \mathrm{MV}_{\mathrm{preceding}}$$
Quantization in H.261
The quantization in H.261 does not use 8 x 8 quantization matrices, as in JPEG and MPEG. Instead, it uses a constant step size for all DCT coefficients within a macroblock; the step size takes even values from 2 to 62 (step_size = 2 x scale, where scale ranges from 1 to 31). One exception, however, is made for the DC coefficient in intra mode, where a step size of 8 is always used. If we use DCT and QDCT to denote the DCT coefficients before and after quantization, then for DC coefficients in intra mode,
$$QDCT = \mathrm{round}\!\left(\frac{DCT}{8}\right),$$
and for all other coefficients,
$$QDCT = \left\lfloor \frac{DCT}{2 \cdot scale} \right\rfloor,\qquad scale \in [1, 31].$$
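The two rules transcribe directly into a short Python sketch; here dct is a single coefficient and scale is the GQuant/MQuant value described later in the bitstream syntax.

import math

def quantize_h261(dct, scale, dc_intra=False):
    # DC coefficient in intra mode: fixed step size of 8.
    if dc_intra:
        return round(dct / 8)
    # All other coefficients: uniform step size 2 * scale, scale in [1, 31].
    return math.floor(dct / (2 * scale))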
The following figure shows a relatively complete picture of how the H.261 encoder and decoder work. Here, Q and Q^-1 stand for quantization and its inverse, respectively. Switching of the intra- and inter-frame modes can be readily controlled.
To illustrate the operational detail of the encoder and decoder, let's use a scenario where frames I, P1, and P2 are encoded and then decoded. The data that flows through the observation points (the circled numbers in the figure) is summarized in the accompanying tables. We will use I, P1, P2 for the original data; Ĩ, P̃1, P̃2 for the decoded data (usually a lossy version of the original); and P′1, P′2 for the predictions in the inter-frame mode.
For the encoder, when the Current Frame is an intra-frame, its macroblocks go through the DCT, Quantization, and Entropy Coding steps, and the result is sent to the Output Buffer, ready to be transmitted. Meanwhile, the quantized DCT coefficients for I are also sent to Q^-1 and IDCT and hence reappear as the decoded frame Ĩ. Combined with a zero prediction input, the data remains Ĩ and is stored in Frame Memory, waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for the subsequent frame P1.
Quantization Control serves as feedback; that is, when the Output Buffer is too full, the quantization step size is increased, so as to reduce the size of the coded data. This is known as an encoding rate control process.
When the subsequent Current Frame P1 arrives, the Motion Estimation process is invoked to find the motion vector for the best matching macroblock in frame I for each of the macroblocks in P1. The estimated motion vector is sent to both Motion-Compensation-based Prediction and Variable-Length Encoding (VLE). The MC-based Prediction yields the predicted macroblock, denoted P′1.
The prediction error is then obtained as D1 = P1 - P′1. D1 undergoes DCT, Quantization, and Entropy Coding, and the result is sent to the Output Buffer. As before, the DCT coefficients for D1 are also sent to Q^-1 and IDCT, yielding the decoded difference D̃1. Added to P′1, this gives the decoded frame P̃1 = P′1 + D̃1, which is stored in Frame Memory, waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for the subsequent frame P2. The steps for encoding P2 are similar to those for P1, except that P2 will be the Current Frame and P1 becomes the Reference Frame.
For the decoder, the input code for frames is first decoded by Entropy Decoding, Q^-1, and IDCT. For intra-frame mode, the first decoded frame Ĩ is sent out as the first output and, at the same time, stored in the Frame Memory.
Subsequently, the input code for the inter-frame P1 is decoded, and the prediction error D̃1 is recovered. Since the motion vector for the current macroblock is also entropy-decoded and sent to Motion-Compensation-based Prediction, the corresponding predicted macroblock P′1 can be located in frame Ĩ. Combined with D̃1, this gives P̃1 = P′1 + D̃1, which is sent out as the decoded frame and also stored in the Frame Memory. Again, the steps for decoding P2 are similar to those for P1.
Let's take a brief look at the H.261 video bitstream syntax. It consists of a hierarchy of four layers: Picture, Group of Blocks (GOB), Macroblock, and Block.
Each GOB has its Start Code (GBSC) and Group Number (GN). The GBSC is unique and can be identified without decoding the entire variable-length code in the bitstream. In case a network error causes a bit error or the loss of some bits, H.261 video can be recovered and resynchronized at the next identifiable GOB, preventing the possible propagation of errors.
GQuant indicates the quantizer to be used in the GOB, unless it is overridden by a subsequent Macroblock Quantizer (MQuant). GQuant and MQuant are referred to as scale. Each macroblock (MB) has its own Address indicating its position within the GOB, a quantizer (MQuant), and six 8 x 8 image blocks (4 Y, 1 Cb, 1 Cr). Type denotes whether it is an intra- or inter-, motion-compensated or non-motion-compensated macroblock. MVD is the difference between the motion vectors of the preceding and current macroblocks. Moreover, since some blocks in the macroblocks match well and some match poorly in Motion Estimation, a bitmask called the Coded Block Pattern (CBP) is used to indicate this information; only well-matched blocks have their coefficients transmitted.
Block layer: for each 8 x 8 block, the bitstream starts with the DC value, followed by pairs of the length of a zero-run (Run) and the subsequent nonzero value (Level) for AC coefficients, and finally the End of Block (EOB) code. The range of Run is [0, 63].
With advances in technology video has become a very popular medium for audio-visual communication. The advent
of digital video and its processing on normal desktop computers and miniaturization leading to small hand held
camcorders have been mainly responsible for this popularity.
• Cinema is based on recording images on film and playing them back rapidly enough to simulate real-time motion. Video and television, on the other hand, scan a stream of electrons across a photosensor while recording and across a phosphorescent screen while displaying images.
• This scanning process has a bearing on various parameters such as aspect ratio (size of the video frame), colours
and their range, recording and playback, digitization and transmission of video signals.
• Analog video can be converted into digital video using a video capture interface. Once digitized, video can be
stored as a digital file in a variety of formats. They can also be converted from one format to another using an appropriate
converter. Digital video can be stored on a variety of media like video tape, CD/DVD, hard disks and flash memory
devices like memory sticks, etc.
• Digital video formats are selected based on the purpose of the video and the storage and transmission medium.
Broadcast video has the highest resolution and hence the highest file size. Video for the web uses the highest
compression and small image sizes, making it suitable for playback on low speed internet connections.
• Video communication is a creative medium and calls for a range of abilities. Developing a communication on video
requires systematic planning, organizing, production and post production activities, requiring a range of educational,
technical and professional skills.
• While professional video production is an involved process, you can also use a small hand-held camcorder and computer-based software to compile, edit, and produce an amateur video communication. Software like Windows Movie Maker has features designed for the amateur video enthusiast. You may use an online facility like Kaltura to develop web-ready video.
LECTURE NOTE 16
Generally, computer animation is a visual digital display technology that processes moving images on screen. In simple words, it can be defined as the art of giving life, energy, and emotion to any non-living or inanimate object via computers, and it can be presented in the form of a video or movie. Computer animation has the ability to make any static image come alive. The key concept behind computer animation is to play a sequence of defined images at a fast enough rate that the viewer interprets them as continuous motion.
Computer animation is a subfield of computer graphics and animation. Nowadays, animation can be seen in many areas around us. It is used in movies, films, games, education, e-commerce, computer art, training, etc. It is a big part of the entertainment industry, as most sets and backgrounds are built up through VFX and animation.
History of Computer Animation:
Year  Development
1913  "Felix the Cat" and "Old Doc Yak" cartoon series were developed.
Methods of computer animation include the following:
2. Procedural:
In the procedural method, a set of rules is used to animate the objects. The animator specifies the initial rules and the procedure to follow and then runs the simulation. Often, the rules or procedure are based on real-world physical rules, expressed as mathematical equations.
3. Behavioral:
According to this technique, to a certain extent the character or object determines its own actions, which allows the character to improvise and frees the animator from specifying every detail of the character's motion.
4. Key Framing:
A key frame in computer animation is a frame where we define changes in the animation. In key framing, a storyboard is required, as the animator draws the major frames of the animation from it. The character's or object's key positions must be defined by the animator; the missing frames between those key positions are then filled in automatically by the computer, as sketched below.
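A minimal sketch of this in-betweening idea, using plain linear interpolation between two hypothetical key positions (production systems typically use splines and easing curves):

def inbetween(key_a, key_b, n_frames):
    # Yield n_frames positions from key_a to key_b, endpoints included.
    for f in range(n_frames):
        t = f / (n_frames - 1)            # 0.0 at key_a, 1.0 at key_b
        yield tuple(a + (b - a) * t for a, b in zip(key_a, key_b))

# Key positions (x, y) of an object at two key frames:
for pos in inbetween((0, 0), (100, 40), 5):
    print(pos)    # (0.0, 0.0), (25.0, 10.0), ..., (100.0, 40.0)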
5. Motion Capture:
This method of animation uses the live action/motion footage of a living human character which is
recorded to the computer via video cameras and markers and later, that action or motion is
used/applied to animate the character which gives the real feel to the viewers as if the real human
character has been animated. Motion Capture is quite famous among the animators because of the
fact that the human action or motion can be captured with relative ease.
6. Dynamics:
In this method, simulations are used to produce quite different sequences while maintaining physical realism. The laws of physics are used in the simulations to create the motion of pictures/characters.
High level of interactivity can be achieved in this method, via the use of real-time simulations,
where a real person performs the action or motions of a simulated character.
LECTURE NOTE 17
A temporal database is a database that needs some aspect of time for the organization of information. In a temporal database, each tuple in a relation is associated with time. It stores information about the states of the real world over time. A conventional database, by contrast, does not store information about past states; it only stores the current state, and whenever the state of the database changes, the information in the database is overwritten. In many fields, it is very necessary to store information about past states. For example, a stock database must store information about past stock prices for analysis. Historical information can otherwise only be stored manually in the schema.
There are various terminologies in the temporal database:
• Valid Time: The valid time is a time in which the facts are true with respect to the real world.
• Transaction Time: The transaction time of the database is the time at which the fact is currently present
in the database.
• Decision Time: Decision time in the temporal database is the time at which the decision is made about
the fact.
Temporal databases are typically built on relational databases for support, but relational databases have some problems with temporal data: they do not provide support for the complex operations required, and standard query operations also provide poor support for performing temporal queries.
Applications of Temporal Databases
• Finance: maintaining stock price histories.
• Factory monitoring: storing information about current and past readings of sensors in the factory.
• Healthcare: the histories of patients need to be maintained for giving the right treatment.
• Banking: maintaining the credit histories of users.
For example, an EMPLOYEE table may record the department the employee is assigned to. If an employee is transferred to another department at some point in time, this can be tracked if the EMPLOYEE table is an application time-period table that assigns the appropriate time periods to each department he or she works for.
Temporal Relation
A temporal relation is defined as a relation in which each tuple in a table of the database is associated with time; the time can be either transaction time or valid time, as illustrated in the sketch below.
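The idea can be sketched in a few lines of Python with hypothetical employee data: each tuple carries a valid-time interval, so past department assignments are preserved and the table can be queried "as of" any year.

from dataclasses import dataclass

@dataclass
class Assignment:
    employee: str
    department: str
    valid_from: int    # year the assignment became valid
    valid_to: int      # exclusive; a large sentinel means "still current"

table = [
    Assignment("Alice", "Sales", 2018, 2021),
    Assignment("Alice", "Marketing", 2021, 9999),  # transfer tracked, not overwritten
]

def as_of(table, year):
    # Return the tuples whose valid-time interval covers the given year.
    return [row for row in table if row.valid_from <= year < row.valid_to]

print(as_of(table, 2019))   # Alice was in Sales in 2019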
Synchronization between the media objects of a presentation can be specified in several ways:
• Interval-based: specifications of the temporal relations between the time intervals of the presentation of
media objects
• Axes-based: allows presentation events to be synchronized according to shared axes, e.g., a global timer
• Control flow-based: at specified points in presentations, they are synchronized
• Event-based: Events in the presentation trigger presentation actions
If the multimedia presentation is live and multiple parties are involved, then none of the approaches above is suitable
for delivering synchronization information to the sink(s) in a timely fashion. Figure 11.2 shows typical
communication patterns.
Of particular interest here is that if multiple sinks are involved, they will receive identical data. It would be
inefficient if the data were replicated at the source for separate transmission to each of the sinks. It would also be
inefficient if the same operation was carried out at different sinks. Multicasting or broadcasting of streams is the
responsibility of the stream layer, whereas efficient planning of operation execution in the different communication
patterns is a responsibility of the object layer.
Multi-Step Synchronization
In a distributed environment, synchronization is typically a multi-step process, during which synchronization information must be maintained at each step so as to enable the sink to perform the final synchronization.
With many different points at which synchronization must occur, decisions must be made about how to implement it. A first decision is the selection of the type of transport for the synchronization specification. At runtime, decisions
must be taken concerning the location of synchronization operations, keeping clocks in synchrony (if used to provide
common timing information), and the handling of multicast and broadcast messages. Coherent planning of the steps
in the synchronization process, together with the necessary operations of the objects, e.g., decompression, must also
be done. In addition, presentation manipulation operations demand additional replanning at runtime.
LECTURE NOTE 19
A synchronization specification should comprise the temporal relations among the media objects of a presentation, together with the associated QoS requirements.
In addition, the form, or alternate forms, of a multimedia object may be described. For example, a text could be
presented as text on the screen or as a generated audio sequence. In the case of live synchronizations, the temporal
relations are implicitly defined during capture. QoS requirements are specified before the start of the capture. In the
case of synthetic synchronization, the specification must be created explicitly.
The term 'magnetic media' is used to describe any record format where analogue or digital information is recorded
to and retrieved from a coated matrix that can be magnetised.
Magnetic tape has a plastic carrier coated with a matrix of metal or metal oxide particles, a resin binder and other
ingredients such as lubricants and fungicides. Sometimes the tape has an antistatic coating on the back to reduce
static charge build-up and to improve its winding capability.
Magnetic hard disks usually have an aluminium base, coated on both sides with a metal or metal oxide matrix.
They have wide application in computing as the principal storage medium. Floppy disks and diskettes consist of a
plastic base with a magnetic matrix on one or both sides. They are enclosed in a rigid, plastic protective jacket.
Although an obsolete medium, they are still likely to be found in collections and are a priority for transfer to new
media.
All materials degrade at different rates over time. We cannot prevent this inevitable deterioration, but we can slow
it down. Below are examples of the types of deterioration to which magnetic media are prone.
• Older acetate audio tapes can become brittle and easily broken. The magnetic coating on tapes and disks
can deteriorate and subsequently flake off the base.
• Print-through, which is the transfer of a signal from one loop of tape onto an adjacent loop. This takes the
form of a pre-echo and can be obviated by storing audio tapes 'tail-out' on their reels.
• High temperature, high humidity, and fluctuations in either may cause the magnetic and base layers in a reel of tape to
separate, or cause adjacent loops to block together. High temperatures may also weaken the magnetic
signal, and ultimately demagnetise the magnetic layer.
• Tapes are particularly susceptible to mould because pockets of air trapped in the windings can create
microclimates which will support mould growth.
• Dust, dirt, grease and chemical pollutants can promote moisture condensation and oxidative deterioration.
These contaminants also interfere with the contact between the playback head and the tape, causing audio
signal drop-outs.
Magnets or magnetic fields can cause information loss on a tape or disk if it is in close proximity for long enough
because information is encoded on magnetic media by the alignment of magnetic particles. The degree of risk
depends on a number of factors; proximity of the media to the source of the field; strength of the field, and duration
of exposure. Running a vacuum cleaner past the shelves will probably not cause any damage, because magnetic
effects decrease with distance.
Magnetic media is sometimes supplied in cardboard enclosures. These can be used for storage when in good condition. However, as they age or become damaged, they tend to generate dust.
Tapes should be stored in cases made of non-magnetisable material, preferably an inert plastic such as
polypropylene. Cases should have internal lugs to securely hold the tapes by the hub. They should be strong
enough to protect the cassettes from external damage and close tightly to keep out dust.
Reels or cores used for winding tapes should be clean and undamaged. Reels should be made of aluminium or a
stable inert plastic.
Floppy disks and diskettes should be stored in protective envelopes that have a non-abrasive surface and are
resistant to the build-up of static electricity. Tyvek envelopes are widely available and are suitable for this purpose.
Measures, such as the installation of an air-lock, or the maintenance of positive internal air pressure, can be taken
to prevent dust entering from the outside.
Magnetic media should ideally be stored in closed metal cabinets to provide extra protection against heat and dust.
However, if adequate environmental controls are in place, storage on open shelves, in their cases is acceptable.
Storage equipment should be sturdy, allow tapes and disks to be stored vertically, and most importantly, be
electrically grounded.
The National Archives provides a Standard for the Physical Storage of Commonwealth Records (pdf, 400kb) that
recommends environmental conditions suitable for magnetic media.
Turn off lights in storage areas when not in use to minimise the exposure of records. An ideal storage area would
have no windows, but if windows are present, they should be covered with curtains or blinds.
Cleanliness is very important in storage areas, for reasons of records protection and, work health and safety. Never
allow food or drink to be taken into a records storage area, and ensure the area is cleaned regularly. Insects and
rodents, once attracted to a records storage area by food, may begin to eat the records. For further information the
National Archives provides information on Integrated Pest Management.
The Archives Disaster Preparedness Manual for Commonwealth Agencies (pdf,429kb) offers specific advice for
recovery of records.
It is essential that recording and replay equipment for magnetic tapes is maintained in good condition because
information held on magnetic media is mechanically processed. Poorly maintained equipment can damage records.
The heads, disk drive and tape drive elements of playback and recording equipment should be cleaned regularly in
accordance with the manufacturers' recommendations.
To minimise deterioration due to handling and use, copies of important and frequently used tapes should be made
for reference purposes. Ideally, a preservation master copy, a duplicating copy and a reference copy should be
produced, and clearly labelled as such. As a disaster preparedness measure, the preservation master copy should be
stored in a different location to the others. The duplicating copy may be used to produce further reference copies
when required.
Long-term preservation of magnetic media is affected by two major factors: its intrinsic instability and the likelihood of hardware obsolescence. The equipment used to access magnetic media today will almost certainly be superseded in the coming decades, at which point, for all practical purposes, the records will be unusable, even media in good condition. The main prospect for long-term retention of the information held on magnetic media seems to be
in regular copying or data migration, thus maintaining a good quality signal that can be read using available
equipment. Copying can either be to fresh tape or disk, or to some other machine-readable format. Once copied to
an uncompressed digital format the information can be copied without loss of quality.
LECTURE NOTE 20
What is an Optical Disc?
An optical disc is an electronic data storage medium, also referred to as an optical disk, optical storage, or optical media, that is read and written using optical (laser-based) storage techniques and technology.
An optical disc, which may be used as a portable and secondary storage device, was first developed in the late 1960s.
James T. Russell invented the first optical disc, which could store data as micron-sized light and dark dots.
An optical disc can store more data and has a longer lifespan than the preceding generation of magnetic storage
medium. To read and write to CDs and DVDs, computers use a CD writer or DVD writer drive, and to read and write
to Blu-ray discs, they require a Blu-ray drive. Recordable drives, such as CD-R and DVD-R drives, are used to write information onto blank discs. CDs, DVDs, and Blu-ray discs are the most common types of optical media and are usually used to store music, video, software, and data backups.
With the introduction of an all-new generation of optical media, the storage capacity to store data has increased. CDs
have the potential to store 700 MB of data, whereas DVDs allow you to store up to 8.4 GB of data. Blu-ray discs,
the newest type of optical media, can hold up to 50 GB of data. This storage capacity is the most convenient benefit
as compared to the floppy disk storage media, which can store up to 1.44 MB of data.
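To put these capacities in perspective, a short Python comparison using the figures quoted above:

capacities_mb = {"floppy disk": 1.44, "CD": 700, "DVD": 8_400, "Blu-ray": 50_000}
for name, mb in capacities_mb.items():
    print(f"{name}: {mb} MB, about {mb / 1.44:,.0f} floppy disks")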
Optical discs are impervious to most environmental threats, such as magnetic disturbances or power surges, and they are inexpensive to manufacture. This makes optical disc storage well suited for archival storage.
Sony said in 2016 that it would release a Blu-ray disc with the capacity to hold 3.3 terabytes (TB) of data.
It should be noted that a CD drive can only read CDs, while a DVD drive can read both CDs and DVDs. Additionally, a Blu-ray drive can read CDs, DVDs, and Blu-ray discs. In other words, older drives are not able to read newer optical discs, but the latest drives are able to read older optical discs.
All recent optical disc formats use the same basic sandwich structure of materials. The base is formed by a hard plastic substrate, with a reflective layer of metallic foil used for encoding the digital data. A clear polycarbonate layer then protects the foil and lets the laser beam pass through to the reflective layer.
Optical discs include different materials in the sandwich, depending on whether the disc is write-once or rewritable. In write-once discs such as CD-R, an organic dye layer is located between the polycarbonate and the unwritten reflective foil. Rewritable optical discs may be erased and rewritten several times because they replace the aluminium foil with an alloy that is a phase-change material.
Blu-ray discs, which are the newest type of optical media, have the potential to store the most, up to 50 GB of data.
BD-R (Blu-ray recordable) is available in the market with a storage capacity of 25 GB or 50 GB.
Russell's creation bears little resemblance to later CDs or DVDs; however, he developed the first method of storing digital information on an optical medium. As the medium, Russell used transparent foil, and instead of reflecting a laser off it, the data was read by shining a light through it. Russell's system could also be any size, not simply a disc, because the data was not read as it spun.
In 1969, the physicist Peter Kramer, working for Philips Research, developed the technology on which later CDs and DVDs are based: Kramer devised a method for encoding data on a reflective metallic foil that could be read with a low-powered laser. Initially, his work was used to hold analog video on the first laserdisc, but gradually it became the basis of all digital optical storage media.
Philips and Sony formed a collaboration in 1979 to produce the first audio CD, which was the first commercial use of digital optical storage. Five years later, Sony, this time in conjunction with Denon, developed the first CD-ROM for the storage of arbitrary digital data. The CD-ROM, at that time, had the potential to store up to 680 MB of data. Around 10 years later, Sony, again teamed up with Philips and joined by Panasonic and Toshiba, created the DVD, which increased data capacity to 4.7 GB.
The next generation of optical storage, the Blu-ray disc, took another 10 years to come into the market. The Blu-ray
disc was developed by a partnership led by Sony, and it came with up to 27 GB of storage. Toshiba had attempted
to offer its own format, HD-DVD, but it was not a member of the consortium at the time. After a brief format war,
Blu-ray became the industry standard.
What is a File System?
A file system, also referred to as file management or FS, is the method of managing how and where data is stored on a storage disk. It is a logical disk component that organizes files into groups known as directories. It is abstracted from the human user and managed by the computer; hence, it handles a disk's internal operations. Directories can contain files and additional directories. Although there are various file systems for Windows, NTFS is the most common in modern times. Without file management, two files could not even share the same name, it would be impossible to remove installed programs or recover specific files, and files would have no organization. The file system enables you to view a file in the current directory, as files are often managed in a hierarchy.
A disk (e.g., a hard disk drive) has a file system, regardless of its type and usage. The file system contains information about file size, file name, file location and fragment information, and where disk data is stored, and it also describes how a user or application may access the data. Operations such as metadata handling, file naming, storage management, and directories/folders are all managed by the file system.
On a storage device, files are stored in sectors, and data is organized in groups of sectors called blocks. The file system identifies the size and location of the files, and it also tracks which sectors are free to be used.
Besides Windows, with its FAT and NTFS file systems, other operating systems use different ones; Apple products (like iOS and macOS) use HFS+, as the operating-system landscape is home to many different kinds of file systems.
Sometimes the term "file system" is used in the reference of partitions. For instance, saying, "on the hard drive, two
files systems are available," that does not have to mean the drive is divided between two file systems, NTFS and FAT.
But it means two separate partitions are there that use the same physical disk.
In order to work, a file system is required by most of the applications you come into contact with; therefore, each partition should have one. Furthermore, if a program is built for use on macOS, you may be unable to use it on Windows, because programs can be file-system-dependent.
FAT: FAT, which stands for File Allocation Table, is a type of file system developed for hard drives. First introduced in 1977, it originally used 12 or 16 bits for each cluster entry in the file allocation table. It is used to manage files on Microsoft operating systems, on hard drives and other computer systems. It is also often found in devices like digital cameras, flash memory, and other portable devices, where it is used to store file information. It also helps to extend the life of a hard drive, as it minimizes wear and tear on the disc. Later versions of Microsoft Windows, such as Windows XP, Vista, 7, and 10, no longer use FAT as their primary file system; they use NTFS. FAT8, FAT12, FAT16, and FAT32 are the different variants of FAT.
GFS: GFS, which stands for Global File System, enables multiple computers to act as an integrated machine. It was first developed at the University of Minnesota but is now maintained by Red Hat. When two or more computers are physically far apart and unable to send files directly to each other, a GFS file system makes them capable of sharing a group of files directly. A computer can organize its I/O to preserve file system consistency with the help of a global file system.
HFS: HFS (Hierarchical File System) is the file system used on Macintosh computers; it creates the directory structure when a hard disk is formatted. Generally, its basic function is to organize and hold the files on a Macintosh hard disk. Apple has not supported writing to or formatting HFS disks since OS X came on the market. Also, HFS-formatted drives are not recognized by Windows computers, as HFS is a Macintosh format; Windows hard drives are instead formatted with the FAT32 or NTFS file systems.
NTFS: NTFS is the file system that stands for NT File System; it stores and retrieves files on the Windows NT operating system and other versions of Windows, such as Windows 2000, Windows XP, Windows 7, and Windows 10. It is sometimes known as the New Technology File System. Compared to the FAT and HPFS file systems, it provides better methods of file recovery and data protection, and it offers a number of improvements in terms of extendibility, security, and performance.
UDF: UDF, which stands for Universal Disk Format, is a file system first developed by OSTA (the Optical Storage Technology Association) in 1995 to ensure consistency among data written to various optical media. It is used with CD-ROMs and DVD-ROMs and is supported on all operating systems. It is also used in the packet-writing process on CD-R and CD-RW discs.
Architecture of the File System
A file system consists of two or three layers. Sometimes the layers are explicitly separated, and sometimes their functions are combined. The logical file system provides the API (Application Program Interface) for file operations, such as OPEN, CLOSE, and READ, because it is accountable for interaction with the user application; it forwards the requested operation to the layer below it for processing. The second, optional layer is the virtual file system, which allows support for multiple concurrent instances of physical file systems; each concurrent instance is called a file system implementation.
The third layer, called the physical file system, is responsible for handling buffering and memory management. It is concerned with the physical operation of the storage device and processes the physical blocks being read or written. Furthermore, it interacts with the channel and the device drivers to drive the storage device.
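In everyday programming, only the logical layer is visible. A minimal Python illustration (hypothetical file name): each call below is served by the logical file system's API and mapped down through the lower layers to physical blocks on the device.

# OPEN (create) and WRITE through the logical file system.
with open("notes.txt", "w") as f:
    f.write("multimedia file systems")   # buffered, then flushed to disk blocks

# OPEN for reading and READ; CLOSE happens when the with-block is left.
with open("notes.txt") as f:
    data = f.read()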
On disk storage media, a disk file system has the ability to randomly address data within a short amount of time. It also uses anticipation (read-ahead) to speed up access to data. Multiple users can access several items of data on the disk with the help of a disk file system, without regard to the sequential location of the data.
A flash file system accounts for the restrictions, performance characteristics, and special abilities of flash memory. While a disk file system can use a flash memory device as its basic storage medium, it is better to use a file system specifically designed for flash devices.
A tape file system is a file system and tape format designed to hold files on tape. Compared to disks, magnetic tapes take far longer to randomly access data, which poses challenges for a general-purpose file system in terms of the creation and efficient management of files.
A database-based file system is another method of file management, in which files are identified by their characteristics (such as file type, author, or topic) rather than via hierarchically structured management.
Some programs need to make several file-system changes or, if one or more of the changes fail for any reason, make none of the changes at all. For instance, a program may write configuration files, libraries, and executables while installing or updating software. The software may be unusable or broken if the installation or update is stopped partway through, and the entire system may be left in an unusable state if the process of installing or updating the software is incomplete.
A shared-disk file system allows the same external disk subsystem to be accessed by multiple machines; however, when several machines access the same external disk subsystem, collisions may occur. To prevent collisions, the file system decides which subsystem is to be accessed.
In the 1970s, disk and digital tape devices were too expensive for some early microcomputer users, so cheaper basic data storage systems using common audio cassette tape were designed. When the system needed to write data, the user was instructed to press "RECORD" on the cassette recorder and then press "RETURN" on the keyboard to notify the system. Likewise, the user had to press the "PLAY" button on the cassette recorder when the system needed to read data.
A flat file system has no subdirectories: it contains only a single directory, and all files are held in that one directory. This type of file system was adequate when floppy disk media first became available, owing to the relatively small amount of data space involved.
LECTURE NOTE 21
Multimedia Hardware
Most computers nowadays come equipped with the hardware components required to develop and view multimedia applications. The various types of hardware required for multimedia applications can be grouped into the following categories.
• Processor - The heart of any multimedia computer is its processor. Today, a Core i5 or higher processor is recommended for a multimedia computer.
o CPU is considered as the brain of the computer.
o CPU performs all types of data processing operations.
o It stores data, intermediate result and instructions (program).
o It controls the operations of all parts of computer.
• Memory and Storage Devices - You need memory for storing various files used during production, original audio and video clips, edited pieces, and final mixed pieces. You also need memory for backup of your project files.
o Primary Memory- Primary memory holds only those data and instructions on which
computer is currently working. It has limited capacity and data gets lost when power
is switched off. It is generally made up of semiconductor device. These memories
are not as fast as registers. The data and instructions required to be processed earlier
reside in main memory. It is divided into two subcategories RAM and ROM.
o Cache Memory - Cache memory is a very high speed semiconductor memory, which
can speed up CPU. It acts as a buffer between the CPU and main memory. It is used
to hold those parts of data and program which are most frequently used by CPU. The
parts of data and programs are transferred from disk to cache memory by operating
system, from where CPU can access them.
o Secondary Memory: This type of memory is also known as external memory or
non-volatile. It is slower than main memory. These are used for storing
Data/Information permanently. CPU directly does not access these memories;
instead they are accessed via input-output routines. Contents of secondary memories
are first transferred to main memory and then CPU can access it. For example, disk,
CD-ROM, DVD, etc.
• Input Devices - Following are the various types of input devices which are used in multimedia
systems.
o Keyboard - The keyboard is the most common and very popular input device. It helps in inputting data to the computer. The layout of the keyboard is like that of a traditional typewriter, although there are some additional keys provided for performing additional functions. Keyboards come in sizes of 84 keys or 101/102 keys, but 104-key or 108-key keyboards are now also available for Windows and the Internet. The keys are as follows:
Sr. No.  Keys                  Description
1        Typing Keys           These keys include the letter keys (A-Z) and digit keys (0-9), which generally give the same layout as that of typewriters.
3        Function Keys         The twelve function keys are present on the keyboard, arranged in a row along the top of the keyboard. Each function key has a unique meaning and is used for some specific purpose.
5        Special Purpose Keys  The keyboard also contains some special purpose keys such as Enter, Shift, Caps Lock, Num Lock, Space bar, Tab, and Print Screen.
o Mouse - The mouse is the most popular pointing device. It is a very famous cursor-control device. It is a small palm-size box with a round ball at its base, which senses the movement of the mouse and sends corresponding signals to the CPU when the buttons are pressed. Generally, it has two buttons, called the left and right buttons, and a scroll wheel is present in the middle. A mouse can be used to control the position of the cursor on screen, but it cannot be used to enter text into the computer.
o Joystick - Joystick is also a pointing device, which is used to move cursor position
on a monitor screen. It is a stick having a spherical ball at both its lower and upper
ends. The lower spherical ball moves in a socket. The joystick can be moved in all
four directions. The function of joystick is similar to that of a mouse. It is mainly
used in Computer Aided Designing (CAD) and playing computer games.
o Light Pen - Light pen is a pointing device, which is similar to a pen. It is used to
select a displayed menu item or draw pictures on the monitor screen. It consists of a
photocell and an optical system placed in a small tube. When light pen's tip is moved
over the monitor screen and pen button is pressed, its photocell sensing element
detects the screen location and sends the corresponding signal to the CPU.
o Track Ball - A track ball is an input device that is mostly used in notebook or laptop computers instead of a mouse. It is a half-inserted ball; by moving fingers on the ball, the pointer can be moved. Since the whole device is not moved, a track ball requires less space than a mouse. A track ball comes in various shapes, like a ball, a button, or a square.
o Scanner - Scanner is an input device, which works more like a photocopy machine.
It is used when some information is available on a paper and it is to be transferred to
the hard disc of the computer for further manipulation. Scanner captures images from
the source which are then converted into the digital form that can be stored on the
disc. These images can be edited before they are printed.
o Magnetic Ink Card Reader (MICR) - The MICR input device is generally used in banks because of the large number of cheques to be processed every day. The bank's code
number and cheque number are printed on the cheques with a special type of ink that
contains particles of magnetic material that are machine readable. This reading
process is called Magnetic Ink Character Recognition (MICR). The main advantage
of MICR is that it is fast and less error prone.
o Optical Character Reader (OCR) - OCR is an input device used to read printed text. OCR scans the text optically, character by character, converts it into a machine-readable code, and stores the text in the system memory.
o Bar Code Readers - A bar code reader is a device used for reading bar-coded data (data in the form of light and dark lines). Bar-coded data is generally used in labelling goods, numbering books, etc. It may be a hand-held scanner or may be embedded in a stationary scanner. A bar code reader scans a bar code image and converts it into an alphanumeric value, which is then fed to the computer to which the bar code reader is connected.
o Optical Mark Reader (OMR) - OMR is a special type of optical scanner used to
recognize the type of mark made by pen or pencil. It is used where one out of a few
alternatives is to be selected and marked. It is specially used for checking the answer
sheets of examinations having multiple choice questions.
o Voice Systems - Voice systems comprise the devices used for audio input and output in multimedia systems.
▪ Microphone- Microphone is an input device to input sound that is
then stored in digital form. The microphone is used for various
applications like adding sound to a multimedia presentation or for
mixing music.
▪ Speaker- The speaker is an output device that produces sound which is stored in digital form. The speaker is used for various applications like adding sound to a multimedia presentation or for movie playback.
o Digital Camera - A digital camera is an input device used to capture images, which are then stored in digital form. The digital camera is used for various applications like adding images to a multimedia presentation or for personal purposes.
• Output Devices - Following are a few of the important output devices used in computer systems:
o Monitors - The monitor, commonly called a Visual Display Unit (VDU), is the main output device of a computer. It forms images from tiny dots, called pixels, that are arranged in a rectangular form. The sharpness of the image depends upon the number of pixels. There are two kinds of viewing screens used for monitors:
▪ Cathode-Ray Tube (CRT) Monitor- In the CRT, the display is made up of small picture elements, called pixels for short. The smaller the pixels, the better the image clarity or resolution. It takes more than one illuminated pixel to form a whole character, such as the letter 'e' in the word help. A finite number of characters can be displayed on a screen at once. The screen can be divided into a series of character boxes - fixed locations on the screen where a standard character can be placed. Most screens are capable of displaying 80 characters of data horizontally and 25 lines vertically.
▪ Flat-Panel Display Monitor- The flat-panel display refers to a class of video devices that have reduced volume, weight and power requirements in comparison to the CRT. Current uses of flat-panel displays include calculators, video games, monitors and laptop computers.
• Printers - The printer is the most important output device, used to print information on paper.
o Dot Matrix Printer- One of the most popular printers on the market is the dot matrix printer, because of its ease of printing and economical price. Each character is printed as a pattern of dots; the head consists of a matrix of pins of size 5*7, 7*9, 9*7 or 9*9 which come out to form a character, which is why it is called a dot matrix printer.
o Daisy Wheel- The head lies on a wheel, and the pins corresponding to characters are like the petals of a daisy (the flower), which is why it is called a daisy wheel printer. These printers are generally used for word processing in offices which require a few letters to be sent here and there with very high quality.
o Line Printers- Line printers are printers, which print one line at a time.
o Laser Printers- These are non-impact page printers. They use laser lights to produce
the dots needed to form the characters to be printed on a page.
• Screen Image Projector - A screen image projector, or simply projector, is an output device used to project information from a computer onto a large screen so that a group of people can see it simultaneously. For example, a presenter first makes a PowerPoint presentation on the computer; a screen image projector is then plugged into the computer system, and the presenter can present to a group of people by projecting the information on a large screen. A projector makes the presentation more understandable.
• Speakers and Sound Card - Computers need both a sound card and speakers to play audio such as music, speech and sound effects. Most motherboards provide an on-board sound card, and this built-in sound card is fine for most purposes. The basic function of a sound card is to convert digital sound signals to analog signals for the speakers and to make the sound louder or softer.
Multimedia Software
Multimedia software tells the hardware what to do. For example, multimedia software tells the hardware to display the color blue, play the sound of cymbals crashing, etc. To produce these media elements (movies, sound, text, animation, graphics, etc.) there are various software packages available in the market, such as Paint Brush, Photo Finish, Animator, Photoshop, 3D Studio, Corel Draw, Sound Blaster, IMAGINET, Apple HyperCard, Photo Magic and Picture Publisher.
Multimedia Software Categories
Following are the various categories of Multimedia software
• Device Driver Software- This software is used to install and configure the multimedia peripherals.
• Media Players- Media players are applications that can play one or more kinds of multimedia file format.
• Media Conversion Tools- These tools are used for encoding/decoding multimedia content and for converting one file format to another.
• Multimedia Editing Tools- These tools are used for creating and editing digital multimedia data.
• Multimedia Authoring Tools- These tools are used for combining different kinds of media formats and delivering them as multimedia content.
Multimedia Application:
Multimedia applications are created with the help of the tools and packages mentioned below.
Sound, text, graphics, animation and video are integral parts of multimedia software. To produce and edit these media elements, there are various software tools available in the market. The categories of basic software tools are:
• Text Editing Tools- These tools are used to create letters, resumes, invoices, purchase orders, user manuals for a project and other documents. MS-Word is a good example of a text tool. It has the following features:
o Creating new file, opening existing file, saving file and printing it.
o Insert symbol, formula and equation in the file.
o Correct spelling mistakes and grammatical errors.
o Align text within margins.
o Insert page numbers on the top or bottom of the page.
o Mail-merge the document and make letters and envelopes.
o Making tables with variable number of columns and rows.
• Painting and Drawing Tools- These tools generally come with a graphical user interface with pull-down menus for quick selection. You can create almost all kinds of possible shapes and resize them using these tools. Drawing files can be imported or exported in many image formats like .gif, .tif, .jpg, .bmp, etc. Some examples of drawing software are Corel Draw, Freehand, Designer, Photoshop, Fireworks and Paint. These software tools have the following features:
o Tools to draw a straight line, rectangular area, circle etc.
o Different colour selection option.
o Pencil tool to draw a shape freehand.
o Eraser tool to erase part of the image.
o Zooming for magnified pixel editing.
• Image Editing Tools- Image editing tools are used to edit or reshape the existing images and
pictures. These tools can be used to create an image from scratch as well as images from scanners,
digital cameras, clipart files or original artwork files created with painting and drawing tools.
Examples of Image editing or processing software are Adobe Photoshop and Paint Shop Pro.
• Sound Editing Tools- These tools are used to integrate sound into a multimedia project very easily. You can cut, copy, paste and edit segments of a sound file using these tools. The presence of
sound greatly enhances the effect of a mostly graphic presentation, especially in a video. Examples
of sound editing software tools are: Cool Edit Pro, Sound Forge and Pro Tools. These software have
following features:
o Record your own music, voice or any other audio.
o Record sound from CD, DVD, Radio or any other sound player.
o You can edit, mix the sound with any other audio.
o Apply special effects such as fade, equalizer, echo, reverse and more.
• Video Editing Tools- These tools are used to edit, cut, copy, and paste your video and audio files. Video editing used to require expensive, specialized equipment and a great deal of knowledge. The artistic process of video editing consists of deciding which elements to retain, delete or combine from various sources so that they come together in an organized, logical and visually pleasing manner. Today computers are powerful enough to handle this job, disk space is cheap, and storing and distributing your finished work on DVD is very easy. Examples of video editing software are Adobe Premiere and Adobe After Effects.
• Animation and Modeling Tools- Animation means showing still images at a certain rate to create a visual effect, and animation and modeling tools help produce this. These tools have features like multiple
windows that allow you to view your model in each dimension, ability to drag and drop primitive
shapes into a scene, color and texture mapping, ability to add realistic effects such as transparency,
shadowing and fog etc. Examples of Animations and modeling tools are 3D studio max and Maya.
LECTURE NOTE 22
What is CD ROM?
A CD-ROM, which stands for Compact Disc Read-Only Memory, is an optical disc with audio or software data
that has read-only memory. The tool used to read data from it is a CD-ROM drive or optical drive. The speed of a CD-ROM drive can range from 1x to 72x, meaning a 72x drive reads a CD roughly 72 times faster than a 1x drive. These drives can read and play data from CDs, including CD-R and CD-RW discs, as well as audio CDs, as you might expect.
Note: A DVD, including a data or movie DVD, cannot be read by a CD-ROM player. A CD-ROM
drive is not made to read the format of a DVD because it differs from a CD. A DVD can only be read
using a DVD-ROM drive.
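Since the 1x rating corresponds to the original audio-CD data rate of 150 KB/s, the transfer rate of any drive speed can be worked out by simple multiplication. A minimal Python sketch (the drive speeds chosen are just examples):

    # CD-ROM transfer rates, assuming the standard 1x rate of 150 KB/s.
    BASE_RATE_KB_S = 150  # 1x audio-CD data rate

    for speed in (1, 8, 24, 52, 72):
        rate = speed * BASE_RATE_KB_S
        print(f"{speed}x drive: {rate} KB/s (about {rate / 1024:.1f} MB/s)")

A 72x drive therefore delivers roughly 10,800 KB/s, i.e. about 10.5 MB/s.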
History
Independent researchers in the United States, such as David Paul Gregg (1958) and James Russell, produced the first
theoretical studies on optical disc storage (1965-1975). Gregg's inventions, in particular, served as the foundation for
the LaserDisc specification, which was jointly created by MCA and Philips after MCA bought Gregg's patents and
the business he formed, Gauss Electrophysics. The LaserDisc was the CD's direct predecessor; the main distinction between the two was how information was encoded: LaserDisc used analog encoding, whereas CDs utilized digital encoding.
With the release of the PC Engine CD-ROM2 (TurboGrafx-CD) in 1988, CD-ROMs were first employed in home
video game consoles. By the end of the 1980s, personal computers also received CD-ROM drives. In 1990, Data
East showed off a board for an arcade system that accepted CD-ROMs. This board was reminiscent of laserdisc video
games from the 1980s but used digital data, giving it greater flexibility than older laserdisc games. [10] Early in
1990, Japan sold roughly 300,000 CD-ROM drives, while the US was producing 125,000 CD-ROM discs each month
at the time. [11] Due to the inclusion of a CD-ROM drive, which permitted the transport of several hundred
megabytes of video, picture, and music data, several computers sold in the 1990s were referred to as "multimedia"
computers.
Many contemporary computers no longer include disc (CD) drives. Look at the front of your computer to see if it has a CD drive; a CD-reading slot or tray should be located on the front of the computer. If you can't see a CD drive, check your computer's drive list to see if your operating system displays one.
Your computer lacks a CD drive if there is no disc drive visible. However, a desktop computer can be equipped with a CD drive, and a laptop or desktop can be connected to an external CD drive.
Pressing the tray eject button on the front of the drive will open the CD-ROM drive automatically; pressing the eject button again will close it.
If the eject button is not functioning, you can access My Computer or "This PC" in later versions of Windows to
open or eject the tray. Locate the drive list in My Computer, right-click the CD-ROM drive, and then choose
Eject from the pop-up menu. Another way to manually open a CD-ROM drive is to push the point of a paperclip through the eject hole in the drive. Gently push it in until resistance is encountered, then press slightly further to engage the release mechanism. If everything is done correctly, the tray should open slightly, allowing you to gently pull it out with your fingertips. Manually opening or ejecting the tray in this way may be helpful if a CD becomes stuck in the CD-ROM drive.
The various interfaces that enable a computer to connect to a CD-ROM and other disc drives are listed below.
1. USB - External disc drives are most frequently connected using this interface.
2. SATA - Has replaced IDE as the preferred method for connecting internal disc drives.
3. SCSI - Another typical interface for disks and disc drives.
4. Panasonic - An older proprietary interface.
5. PCMCIA (PC Card) - An interface occasionally used to connect external disc drives to laptop computers.
6. IDE/ATA - One of the most widely utilized disc drive interfaces.
7. Parallel - The interface used by older external CD-ROM drives.
LECTURE NOTE 23
What is Scanner?
A scanner is an electrical device that reads and converts documents such as photos and pages of text into a digital signal. This converts the documents into a form that can be viewed and/or modified on a computer system using software applications. There are numerous kinds of scanners available in the market that have different resolutions. Most scanners are flatbed devices with a flat scanning surface, which are mainly used for scanning magazines, photographs, and various documents. Furthermore, because most flatbed scanners have a cover that lifts up, they can scan books and other heavy things. A sheet-fed scanner is another type of scanner that is only able to accept paper documents. Although sheet-fed scanners have no capability of scanning books, some of their models include an automatic document feeder (ADF) that allows multiple pages to be scanned in sequence.
The scanner interacts with computer software applications to execute tasks. The data from the scanner is imported into
these apps. Most of the scanners contain basic scanning software that makes users capable of configuring, initiating,
and importing scans. Scanners are also able to import scanned images directly through various software. The software
accomplishes this by scanning the computer's installed plug-ins. If a scanner plug-in for Adobe Photoshop is installed,
for example, users can create new photos directly from the linked scanner.
Although some programs like OmniPage and Acrobat can identify scanned text, the scanned images can also be edited
by Photoshop. It is done by a technology, which is known as optical character recognition (OCR). Scanning software
that includes optical character recognition has the ability to convert scanned text documents into digital text in a form
that can be viewed and modified with the help of a word processor. Some OCR programs can also capture page and text formatting, making it possible to generate electronic copies of physical documents. Scanning is also the most dependable and cost-effective method of delivering images in the world of electronic data transmission.
Types of Scanner
There are various types of scanners that are used with a computer for different functions. Such are as follows:
Flatbed Scanners
The most popular type of optical scanner is a flatbed scanner, which scans documents on a flat surface. These scanners
do not require the document to be moved and are capable of capturing all of the document's elements. Flatbed scanners
come in a couple of different sizes for standard paper and are effective for delicate materials, like fragile documents, including vintage photographs and papers. There are also some models available that help reduce the required desk space; for example, you can minimize the amount of desk space required by purchasing an all-in-one model, which includes a scanner and a printer.
These scanners are also effective for scanning books, articles, newspapers, and even DVD cases. If you have purchased a high-resolution scanner, they are also good for scanning photos. Because each object to be scanned must be put onto the flatbed and processed individually, it is a time-consuming option and not the best choice for individuals who need to scan a large number of papers.
Unlike other types of scanners, the process of scanning documents of the flatbed scanner is very easy. Users merely
need to place the paper on the glass and close the lid to scan the document. Additionally, some other models of flatbed
scanners can include advanced features such as Bluetooth or wireless connectivity as well as automatic document
feeders.
Flatbed scanners are more versatile than sheet-fed scanners because they can scan thicker objects. Furthermore, unlike drum or handheld scanners, a flatbed does not necessitate document movement, resulting in a significant reduction in the danger of document damage during scanning. There are disadvantages too: flatbed scanners can be expensive, and they take up more space compared to other scanners.
Sheetfed Scanners
A scanner that can only scan loose sheets of paper is known as a sheetfed scanner. These scanners are a little smaller than regular flatbed scanners, and they feature a lower image resolution. They are great for scanning enormous amounts of paper, and they are useful if you have a limited amount of room to work with. They are commonly used by businesses to scan office papers, and less commonly by archives and libraries, since they are built specifically for scanning loose sheets of paper rather than books. These scanners are rated by their duplex capability, the paper weights and sizes they can handle, their duty cycle, and their speed (pages per minute).
Sheetfed scanners allow you to scan multiple documents at once instead of turning pages manually after each scan. Like photocopiers, these scanners allow you to insert papers into a feeder tray and then scan one page at a time. Also, compared to other kinds of scanners, sheetfed document scanners can be a bit more costly, but the extra investment could be beneficial if time is a concern.
Handheld Scanner
A handheld scanner is a compact manual scanning device that functions similarly to a flatbed scanner. It is positioned above the thing to be scanned. With flatbed and sheetfed scanners, you must place the document inside the scanner to scan it; the handheld scanner, on the other hand, is dragged over the page to be scanned. It scans physical documents into digital form, which makes it possible to store, modify, forward, and email them digitally. As flatbed scanners take up more space, the handheld scanner is a particularly useful device where space is a concern.
When utilising a handheld scanner, the hand must remain steady at all times, making it a difficult operation; even a little movement of the hand can cause distortion of the scanned pictures. Typically, it is mainly used to evaluate goods in shopping stores, and the barcode scanner is one of the great examples of a handheld scanner. These scanners are very popular despite being lower-quality scanners. Compared to their flatbed counterparts, they are less expensive and smaller in size. Additionally, they have the potential to scan items that would not fit in a flatbed scanner owing to size or placement. Some models of handheld scanners available on the market include additional features such as storing and sending scanned content to computers and other devices, as well as translations, definitions, and reading printed text aloud.
Drum Scanner
A scanner that uses a photomultiplier tube to capture the highest resolution from an image is known as a drum scanner. It scans with a photomultiplier tube rather than a charge-coupled device, the sensor commonly seen in flatbed scanners. The photomultiplier tubes used by drum scanners are vacuum tubes that are extremely sensitive to light. A glass drum is included in the scanner, and the image is mounted on it. When the scanner starts to scan the image, a beam of light moves over the image, and the photomultiplier tubes (PMTs) pick up its reflection and process it.
Drum scanners are noted for their high resolution, which may reach more than 10,000 dots per inch (dpi). Furthermore,
due to their cost and large size, they are not more popular than flatbed scanners in the market.
Photo Scanner
A photo scanner is a type of optical scanner mainly used to scan photographs. Photo scanners provide high resolution and colour depth. They are smaller compared to general-purpose scanners. Typically, a photo scanner can scan 3x5-inch or 4x6-inch photographs at high resolution. Negatives and slides can also be scanned by high-end photo scanners. Some photo scanners come with software that can help you clean and restore old photos.
Film Scanner
A film scanner is a device that scans photographic film directly into a computer, without the need for any printmaking intermediates. Compared to scanning a print on a flatbed scanner, it offers several benefits: the photographer works directly with the original, unmolested image on the film and can control aspects such as cropping and the aspect ratio of the original image. Also, many film scanners can remove film grain and scratches and improve colour reproduction through special software or hardware.
Portable Scanners
Portable scanners are designed so that they can be easily carried around, as they are small in size. Some can even be carried in a pocket, as they are as small as a PDA. They are effective for text document scanning, but they have limitations in terms of resolution. They are also available with a wireless facility.
These are not capable of scanning photographs or serving applications that need high-resolution scanning. Nowadays you do not even need a desktop to get your work done, because many smartphones come with applications that turn your smartphone into a pocket-sized scanner. These applications can be used for scanning pictures and editing them, scanning documents and converting them into PDFs, and scanning bar codes. If you want to scan a sharp and detailed image, however, a flatbed or drum scanner is the way to go. Otherwise, you can complete your work with the help of these productivity apps.
Advantages of scanner
There are many advantages to using a scanner. Today's multifunction printers are designed to include capable scanners, allowing you to scan documents without buying anything separately, and they do not take up extra space. Some of the prominent benefits are given below.
Reliability: Unlike some modes of data transmission, scanning simply involves the conversion of physical images to
digital ones. In the case of scanning, the role of the end-user is limited. They can also assist in the transmission or
storage of crucial information because they are not reliant on two-way communication.
Quality: Scanners are capable of reproducing images with high resolution and accuracy. Scanning, as opposed to fax
machines, ensures the highest possible resolution for digital photos, whereas fax machines may struggle to replicate
correct details. Scanners are also more useful in the photography and engineering fields.
Efficiency: Modern scanners come with ease of use as well as convenience, and they are designed to offer better speed and efficiency.
Cost-saving: The conversion of physical files into digital forms is one of the biggest advantages of scanning. Using a
scanner offers environmental benefits as well, since it helps to conserve physical space that would otherwise be utilised
for storage.
Ease of use: Scanners are electronic devices that are very easy to use. In modern times, the scanners that are built into
multifunction printers can be used freely without instruction or worry. Users only need to select basic options like
document or photograph or color versus black and white because most settings are automatically adjusted and fine-
tuned. You can also send the file to an email account or a computer after scanning is completed. Furthermore, users
can also save the scanned file in a different format, such as PDF documents.
Disadvantages of scanner
Although there are multiple benefits to using a scanner, scanners have their disadvantages as well. If you are using a desktop scanner at home or in the office, it can be a valuable tool for completing your work. In addition, both desktop and high-volume scanners can be useful for business. Before investing in a pricey scanning system, both home users and company owners should be aware of all scanner limits. The major drawbacks are listed below:
The quality of the scanned output depends on a number of factors, including the quality of the scanner's lens, the condition of the original documents, and the cleanliness of the scanner glass. If the original papers are already in electronic format, a tool like Adobe Acrobat is usually the better option; this program can convert those files to PDF format, which can be read by anyone who has an internet connection.
In terms of maintenance, the use of a scanner can be expensive. Numerous companies need to process a large amount of paperwork and therefore use high-volume scanners, which can be more costly. Although these high-volume scanners can be useful tools, owners must replace the lamps on a regular basis to keep them working at their best, and maintenance must also be performed on the camera and lens. Thus, the maintenance cost can be quite high.
Uses of Scanner
A scanner captures images using reflected light and converts them into files that the computer can read and interpret in order to display them. Scanners can scan images in black-and-white or colour, and they come in high- and low-resolution models. The scanner can be used for a wide variety of purposes, depending on the user's requirements.
Copying
Copying is one of the most common uses of a scanner. A scanner can be used to make multiple copies of a poster,
brochure, worksheet, or other document so that it can be printed as many times as necessary. It will function as if your
PC were connected to a printer. In addition, in contrast to a copier, a scanner offers users the benefit of modifying their
documents before they print their copies.
Research
Scanners also play an important role in research projects. Long-term research projects, whether for school or business, nearly always involve acquiring information from borrowed library books or other privately held sources. By scanning the relevant material into your computer, the information collected from these sources remains available for later research and can be referred to at a later time without access to the original document. This enables users to return the source without losing the information found in it.
Archiving
Digital archiving is another one of the popular uses of the scanner. It's a method for making and saving digital copies
of hard copies of documents. Business records, personal documents, and tax paperwork, as well as family letters, are
examples of these documents. It contains many copies of important papers to aid recovery in the event that the originals
are lost, stolen, or destroyed.
Sharing Photos
Through the internet, users can also use scanners to share hard-copy photos with friends and relatives. Although digital photography is now the prevalent format among professional and amateur photographers, many people still have old family pictures that were never recorded digitally because these photos were captured with traditional film cameras.
Although the USB cable is the interface most used today to connect a scanner to the computer, there are several different interfaces that can be used. They are as follows:
Firewire
Parallel
USB
SCSI
Firewire connection: This is the fastest method compared to the others and is referred to as IEEE-1394. It was developed by Apple in 1995 and has been introduced in the latest high-end scanners. It is well suited to scanning high-resolution images, as it is a digital bus with a bandwidth of 400-800 Mbps. It is hot-swappable, can transfer data at a maximum speed of 800 Mbps, and can handle up to 63 units on the same bus.
Parallel Connection: This is the oldest and slowest method of connecting a scanner to a computer, and it's also known
as the Centronics interface, Centronics port, or Centronics connection after the firm that invented it. Epson later turned
it into a 25-pin (type DB-25) computer interface. It has a data transfer rate of 70 kbps and is often used to connect
printers to computers.
Universal Serial Bus (USB) Connection: This is the most economical and most recent method of data transfer; USB stands for universal serial bus. It is a simple plug-and-play interface that connects to the scanner quickly and enables a computer to communicate with peripherals and other devices at speeds of up to 60 megabytes per second (USB 2.0).
SCSI Connection: SCSI, pronounced "scuzzy", is implemented with a SCSI interface card; a specialized SCSI card was used in older scanners. The standard was first completed in 1982, and it can accommodate eight or sixteen devices using Wide SCSI.
The basic difference in how old scanners and modern scanners work is the sort of image sensor they use. Modern scanners use a charge-coupled device (CCD), whereas old scanners used a photomultiplier tube. A CCD sensor's principal job is to capture light from the scanner and convert it into proportional electrons; the charge created is higher when the intensity of the light that hits the sensor is higher. A flatbed scanner incorporates a number of components, including the following:
Power supply
Scan head
Stepper motor
Glass plate
Lamp
Filters
Stabilizer bar
Belt
Cover
Control circuitry
Interface ports
Mirrors
Although the configuration of the above components differs according to the manufacturer's design, the basic functioning is almost the same.
A flat transparent glass bed is included in the scanner, and under it the lenses, lamp, filters, CCD sensors, and mirrors are built. When a document is to be scanned, it is placed on the glass bed. A scanner also comes with a cover, which can be white or black and is used to close the scanner. The colour of this cover provides a uniform background, and this uniformity helps in analyzing the size of the document to be scanned. You may not be able to close the cover if you are scanning a page from a book, due to the book's size. Most scanners utilize a CCFL (cold cathode fluorescent lamp).
A stepper motor included in the scanner moves the scan head from one end to the other. The scan head comprises the CCD sensors, mirrors, filters, and lenses. The scan head moves parallel to the glass bed along a constant path; a stabilizer bar is provided to prevent any deviation in its motion. The scanning depends on the scan head: the scanning of the document is completed when the scan head, travelling from one end of the machine to the other, reaches the far end. In some scanners, a two-way scan is performed, in which the scan head needs to return to its original location to complete the scan.
When the scan head slides under the glass bed, the lamp's light strikes the paper and is reflected back using angled mirrors. Depending on the device's design, there may be either two or three mirrors. The mirrors are angled so that the reflected image meets a smaller surface. Finally, the image passes through a filter and reaches a lens, which focuses it onto the CCD sensors, where the light is converted into electrical signals.
The electrical signals are then translated into image format inside the computer. This step may differ as a result of differences in lens and filter design. One method employed is three-pass scanning: with each movement of the scan head from one end to the other, one composite colour is passed between the lens and the CCD sensors. After the three composite colours are scanned, the scanner software combines the three filtered images into a single full-colour image.
Another method is single-pass scanning. In this method, the image captured by the lens is split into three pieces, each piece passes through one of the colour composite filters, and the CCD sensors then receive the output. The scanner combines the pieces into a single full-colour image.
In some recent scanners, the CCD sensor has been replaced by a contact image sensor (CIS). Compared to a CCD scanner, this method is less costly, but it offers lower image quality and resolution.
Modern scanners are modeled around early telephotography and fax input devices. The pantelegraph, invented by
Giovanni Caselli, was an early type of facsimile machine that transmitted over standard telegraph lines. It was the
apparatus that was employed for the first time in practical service in the 1860s. It employed electromagnets to drive
and synchronize pendulum movement and scanned and reproduced images from afar. It was capable of transmitting
drawings up to 150 by 100 mm in size, including signatures and handwriting.
By scanning with the help of a photocell and transmitting over standard phone lines, Édouard Belin's Belinograph of
1913 laid the groundwork for the AT&T Wirephoto service. From the 1920s through the mid-1990s, a Belino, a service
similar to a wirephoto, was utilized in Europe. It was used by news organisations and consisted of a rotating drum with
a single photodetector. It rotated at 60 to 120 rotations per minute on average and sent a linear analogue AM signal over conventional phone lines to receptors, which printed the proportional intensity on special paper. Color photographs were delivered as three distinct RGB-filtered images due to transmission expenses.
The first scanners were invented in the 1860s. However, it was at the National Bureau of Standards in the United States that a researcher named Russell Kirsch developed the forerunner of the scanners still in use today. For the first time, this device scanned a photograph: a picture of Kirsch's son. This black-and-white image had a resolution of 176 pixels on each side and measured only 5x5 cm.
A computer scanner, often known as a digitizer, is a type of input device. It takes data from a document or a photograph and converts it to digital data. Unlike a printer (which is an output device), a scanner cannot receive data from the computer; it can only give data to the computer. Therefore, it is known as an input device.
Interpolated resolution refers to increasing the resolution of the picture through scanning software. This is done by inserting additional pixels between those actually scanned by the CCD array; these extra pixels can only be an average of the adjacent pixels. For example, if a manufacturer declares the interpolated resolution of a scanner to be 600×300 dpi while its true resolution is 300×300 dpi, the software adds an extra pixel between each pair of pixels in each row read from the CCD sensor. The size of the file increases as the resolution increases; lossy compression techniques such as JPEG can help to reduce the file size, at the cost of only a small reduction in picture quality.
A scanner typically has a true resolution of at least approximately 300×300 dots per inch (dpi). Resolution improves with the precision of the stepper motor, as well as with an increase in the number of CCD sensors per row.
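To make the averaging idea above concrete, here is a minimal NumPy sketch that doubles the horizontal resolution of a grayscale scan by inserting, between each pair of true pixels, their average; the function name and sample row are purely illustrative.

    import numpy as np

    def interpolate_row_pixels(scan: np.ndarray) -> np.ndarray:
        """Insert the average of each pair of neighbouring pixels between
        them, doubling the horizontal (interpolated, not optical) resolution."""
        left = scan[:, :-1].astype(np.float32)
        right = scan[:, 1:].astype(np.float32)
        averages = ((left + right) / 2).astype(scan.dtype)
        out = np.empty((scan.shape[0], scan.shape[1] * 2 - 1), dtype=scan.dtype)
        out[:, 0::2] = scan        # the true, optically scanned pixels
        out[:, 1::2] = averages    # the interpolated (software) pixels
        return out

    row = np.array([[0, 100, 200]], dtype=np.uint8)
    print(interpolate_row_pixels(row))   # [[  0  50 100 150 200]]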
Image clarity improves when the scanner's lamp brightness is increased in conjunction with the use of high-quality optics. Another setting is the density range, which aids the scanner in reproducing fine shadow and brightness details.
Another parameter is colour depth, which refers to the number of colours the scanner is capable of reproducing in colour scanning. Although scanners with 30 and 36 bits are available on the market, a 24 bit/pixel scanner will suffice.
LECTURE NOTE 24
Stands for "Charge-Coupled Device."
A CCD is a type of image sensor used to capture still and moving imagery in digital cameras and scientific
instruments. A CCD sensor is a flat integrated circuit that contains an array of light-sensitive elements, called pixels,
that capture incoming light and turn it into electrical signals. Other components in the device use these signals to
construct a digital recreation of the original image.
The quality of the image produced by a CCD depends on its resolution — the number of physical pixels present on
the sensor. A sensor's resolution is measured in Megapixels (the total number of pixels in millions). For example, a
16 Megapixel sensor produces an image with twice as many total pixels as an 8 Megapixel sensor, resulting in a
more detailed image. A physically-larger sensor can either pack in more pixels than a smaller one, or include larger
pixels that are more sensitive to light. CCD sensors can also detect light outside the visible spectrum, enabling
infrared photography and night-vision video recording.
Since CCDs only capture the amount of light that hits the sensor, not the color of that light, an unfiltered sensor only
produces a monochrome image. CCDs can use red, green, and blue (RGB) filters to capture colored light separately
before combining the data into a full-color image. Some devices even include three separate CCDs, one each for red,
blue, and green, to capture full-color image data in less time.
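A minimal sketch of that combining step: three separately filtered monochrome captures are stacked into one full-colour image, as a three-CCD device does. The random arrays are placeholders standing in for real sensor readouts.

    import numpy as np

    height, width = 4, 4  # a tiny sensor, for illustration only
    red_pass = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    green_pass = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    blue_pass = np.random.randint(0, 256, (height, width), dtype=np.uint8)

    # Stack the three filtered captures along a new last axis so that
    # every pixel carries one (R, G, B) triple.
    full_colour = np.stack([red_pass, green_pass, blue_pass], axis=-1)
    print(full_colour.shape)  # (4, 4, 3)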
While many early digital cameras used CCDs to capture images, other types of image sensors came along that were
faster and less prone to overexposure. Eventually, most consumer- and professional-level digital cameras switched
to CMOS image sensors. However, CCDs produce higher-quality images when overexposure is not a concern, so
they're still used in scientific and medical instruments. They're even used in the harsh environment of space, taking
photos of far-off galaxies from orbiting satellite telescopes.
LECTURE NOTE 25
What is Image Segmentation?
Image segmentation is a branch of digital image processing which focuses on partitioning an image into different
parts according to their features and properties. The primary goal of image segmentation is to simplify the image for
easier analysis. In image segmentation, you divide an image into various parts that have similar attributes. The parts into which you divide the image are called image objects.
It is the first step of image analysis. Without performing image segmentation, computer vision implementations would be nearly impossible.
By using image segmentation techniques, you can divide and group specific pixels from an image, assign them labels, and classify further pixels according to these labels. You can draw lines, specify borders, and separate particular objects (important components) in an image from the rest of the objects (unimportant components).
In machine learning, you can use the labels you generated from image segmentation for supervised and unsupervised
training. This would allow you to solve many business problems.
If you want to identify every chair present in an image as a separate object, rather than as one combined "chair" region, you'll have to use instance segmentation.
Why is Image Segmentation Necessary?
Image segmentation is a large aspect of computer vision and has many applications in numerous industries. Some of
the notable areas where image segmentation is used profusely are:
1. Face Recognition
The facial recognition technology present in your iPhone and advanced security systems uses image segmentation
to identify your face. It must be able to identify the unique features of your face so that any unwanted party cannot
access your phone or system.
2. Number Plate Identification
Many traffic lights and cameras use number plate identification to charge fines and help with searches. Number plate identification technology allows a traffic system to recognize a car and get its ownership-related information. It uses image segmentation to separate the number plate and its information from the rest of the objects present in its vision. This technology has simplified the fining process considerably for governments.
3. Image-Based Search
Google and other search engines that offer image-based search facilities use image segmentation techniques to
identify the objects present in your image and compare their findings with the relevant images they find to give you
search results.
4. Medical Imaging
In the medical sector, we use image segmentation to locate and identify cancer cells, measure tissue volumes, run virtual surgery simulations, and perform intra-surgery navigation. Image segmentation has many applications in the medical sector; it helps in identifying affected areas and planning out treatments for them.
Apart from these applications, image segmentation has uses in manufacturing, agriculture, security, and many other
sectors. As our computer vision technologies become more advanced, the uses of image segmentation techniques
will increase accordingly.
For example, some manufacturers have started using image segmentation techniques to find faulty products.
Here, the algorithm would capture only the necessary components from the object’s image and classify them as faulty
or optimal. This system reduces the risk of human errors and makes the testing process more efficient for the
organization.
Image segmentation is a very broad topic and has different ways to go about the process. We can classify image
segmentation according to the following parameters:
1. Approach-Based Classification
In its most basic sense, image segmentation is object identification: an algorithm cannot classify the different components without identifying an object first. From simple to complicated implementations, all image segmentation methods work on the basis of object identification.
So, we can classify image segmentation methods based on the way algorithms identify objects, which means,
collecting similar pixels and separating them from dissimilar pixels. There are two approaches to performing this
task:
Region-based Approach (Detecting Similarity)
In this method, you detect similar pixels in the image according to a selected threshold, region merging, region
spreading, and region growing. Clustering and similar machine learning algorithms use this method to detect
unknown features and attributes. Classification algorithms follow this approach for detecting features and separating
image segments according to them.
Boundary-based Approach (Detecting Discontinuity)
The boundary-based approach is the opposite of the region-based approach for object identification. Unlike region-based detection, where you find pixels having similar features, in the boundary-based approach you find pixels that are dissimilar to each other. Point detection, edge detection, line detection, and similar algorithms follow this method: they detect the edges of dissimilar pixels and separate them from the rest of the image accordingly.
2. Technique-Based Classification
Both of the approaches have their distinct image segmentation techniques. We use these techniques according to the
kind of image we want to process and analyse and the kind of results we want to derive from it.
Based on these parameters, we can divide image segmentation algorithms into the following categories:
Structural Techniques
These algorithms require you to have the structural data of the image you are using. This includes the pixels,
distributions, histograms, pixel density, colour distribution, and other relevant information. Then, you must have the
structural data on the region you have to separate from the image.
You’ll need that information so your algorithm can identify the region. The algorithms we use for these
implementations follow the region-based approach.
Stochastic Techniques
These algorithms require information about the discrete pixel values of the image, instead of the structure of the
required section of the image. Due to this, they don’t require a lot of information to perform image segmentation
and are useful when you have to work with multiple images. Machine learning algorithms such as K-means clustering
and ANN algorithms fall in this category.
Hybrid Techniques
As you can guess from the name, these algorithms use both stochastic and structural methods. This means they
use the structural information of the required region and the discrete pixel information of the whole image for
performing image segmentation.
Now that we know the different approaches and kinds of techniques for image segmentation, we can start discussing
the specifics. Following are the primary types of image segmentation techniques:
1. Thresholding Segmentation
2. Edge-Based Segmentation
3. Region-Based Segmentation
4. Watershed Segmentation
5. Clustering-Based Segmentation Algorithms
6. Neural Networks for Segmentation
Let’s discuss each one of these techniques in detail to understand their properties, benefits, and limitations:
1. Thresholding Segmentation
The simplest method for segmentation in image processing is the threshold method. It divides the pixels in an image by comparing each pixel's intensity with a specified value (the threshold). It is useful when the required object has a higher intensity than the background (unnecessary parts).
You can consider the threshold value (T) to be a constant but it would only work if the image has very little noise
(unnecessary information and data). You can keep the threshold value constant or dynamic according to your
requirements.
The thresholding method converts a grey-scale image into a binary image by dividing it into two segments (required
and not required sections).
According to the different threshold values, we can classify thresholding segmentation in the following categories:
Simple Thresholding
In this method, you replace the image’s pixels with either white or black. Now, if the intensity of a pixel at a particular
position is less than the threshold value, you’d replace it with black. On the other hand, if it’s higher than the
threshold, you’d replace it with white. This is simple thresholding and is particularly suitable for beginners in image
segmentation.
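A minimal NumPy sketch of simple thresholding; the threshold value of 127 is an arbitrary illustrative choice.

    import numpy as np

    def simple_threshold(gray: np.ndarray, t: int = 127) -> np.ndarray:
        """Replace every pixel below threshold t with black (0)
        and every other pixel with white (255)."""
        return np.where(gray < t, 0, 255).astype(np.uint8)

    gray = np.array([[10, 120, 200], [90, 130, 250]], dtype=np.uint8)
    print(simple_threshold(gray))
    # [[  0   0 255]
    #  [  0 255 255]]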
Otsu’s Binarization
In simple thresholding, you picked a constant threshold value and used it to perform image segmentation. However,
how do you determine that the value you chose was the right one? While the straightforward method for this is to
test different values and choose one, it is not the most efficient one.
Take an image with a histogram having two peaks, one for the foreground and one for the background. By using
Otsu binarization, you can take the approximate value of the middle of those peaks as your threshold value.
In Otsu binarization, you calculate the threshold value from the image’s histogram if the image is bimodal.
This process is quite popular for scanning documents, recognizing patterns, and removing unnecessary colours from
a file. However, it has many limitations. You can’t use it for images that are not bimodal (images whose histograms
have multiple peaks).
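One common way to apply Otsu's method is through OpenCV, as in the minimal sketch below; the file name is hypothetical. Passing THRESH_OTSU makes the library ignore the supplied threshold and compute one from the image's histogram instead.

    import cv2

    # Load a (hypothetical) scanned document as a single-channel image.
    gray = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)

    # The threshold argument (0) is ignored when THRESH_OTSU is set;
    # OpenCV returns the value it chose from the histogram.
    t, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    print("Otsu threshold chosen from the histogram:", t)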
Adaptive Thresholding
Having one constant threshold value might not be a suitable approach to take with every image. Different images
have different backgrounds and conditions which affect their properties.
Thus, instead of using one constant threshold value for performing segmentation on the entire image, you can
keep the threshold value variable. In this technique, you’ll keep different threshold values for different sections
of an image.
This method works well with images that have varying lighting conditions. You’ll need to use an algorithm that
segments the image into smaller sections and calculates the threshold value for each of them.
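A minimal sketch of adaptive thresholding with OpenCV; the file name, the 11x11 neighbourhood size, and the constant 2 are illustrative choices.

    import cv2

    gray = cv2.imread("unevenly_lit_page.png", cv2.IMREAD_GRAYSCALE)

    # Each pixel is compared against a threshold computed from its own
    # 11x11 neighbourhood (a Gaussian-weighted mean minus 2), so
    # differently lit regions of the image get different thresholds.
    binary = cv2.adaptiveThreshold(
        gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)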
2. Edge-Based Segmentation
Edge-based segmentation is one of the most popular implementations of segmentation in image processing. It
focuses on identifying the edges of different objects in an image. This is a crucial step as it helps you find the features
of the various objects present in the image as edges contain a lot of information you can use.
Edge detection is widely popular because it helps you in removing unwanted and unnecessary information from
the image. It reduces the image’s size considerably, making it easier to analyse the same.
Algorithms used in edge-based segmentation identify edges in an image according to the differences in texture,
contrast, grey level, colour, saturation, and other properties. You can improve the quality of your results by
connecting all the edges into edge chains that match the image borders more accurately.
There are many edge-based segmentation methods available. We can divide them into two categories:
Search-based edge detection methods focus on computing a measure of edge strength and look for local directional
maxima of the gradient magnitude through a computed estimate of the edge’s local orientation.
Zero-crossing based edge detection methods look for zero crossings in a derivative expression retrieved from the
image to find the edges.
Typically, you’ll have to pre-process the image to remove unwanted noise and make it easier to detect edges. Canny,
Prewitt, Deriche, and Roberts cross are some of the most popular edge detection operators. They make it easier to
detect discontinuities and find the edges.
In edge-based detection, your goal is to obtain at least a partial segmentation, in which you can group all the local edges into a binary image. In this newly created binary image, the edge chains must match the existing components of the image in question.
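As a concrete example of a search-based detector, the Canny operator is available in OpenCV; a minimal sketch (the file name and hysteresis thresholds are illustrative):

    import cv2

    gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

    # Smooth first so noise is not mistaken for edges, then run Canny.
    # The two numbers are the hysteresis thresholds: gradients above 200
    # are definite edges, those below 100 are discarded, and those in
    # between are kept only if connected to a definite edge.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 100, 200)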
3. Region-Based Segmentation
Region-based segmentation algorithms divide the image into sections with similar features. These regions are simply groups of pixels, and the algorithm finds these groups by first locating a seed point, which could be a small section or a large portion of the input image.
After finding the seed points, a region-based segmentation algorithm would either add more pixels to them or shrink
them so it can merge them with other seed points.
Based on these two methods, we can classify region-based segmentation into the following categories:
Region Growing
In this method, you start with a small set of pixels and then start iteratively merging more pixels according to
particular similarity conditions. A region growing algorithm would pick an arbitrary seed pixel in the image, compare
it with the neighbouring pixels and start increasing the region by finding matches to the seed point.
When a particular region can’t grow further, the algorithm will pick another seed pixel which might not belong to
any existing region. One region can have too many attributes causing it to take over most of the image. To
avoid such an error, region growing algorithms grow multiple regions at the same time.
You should use region growing algorithms for images that have a lot of noise as the noise would make it difficult
to find edges or use thresholding algorithms.
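A minimal pure-NumPy sketch of growing a single region from a seed pixel. The intensity tolerance and 4-connectivity are simplifying assumptions; a real implementation would grow several regions in parallel, as noted above.

    from collections import deque
    import numpy as np

    def region_grow(gray: np.ndarray, seed: tuple, tol: int = 10) -> np.ndarray:
        """Grow a region from `seed`, absorbing 4-connected neighbours whose
        intensity is within `tol` of the seed pixel; returns a boolean mask."""
        h, w = gray.shape
        seed_val = int(gray[seed])
        mask = np.zeros((h, w), dtype=bool)
        mask[seed] = True
        queue = deque([seed])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                        and abs(int(gray[ny, nx]) - seed_val) <= tol):
                    mask[ny, nx] = True
                    queue.append((ny, nx))
        return mask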
Region Splitting and Merging
As the name suggests, a region splitting and merging method performs two actions together: splitting and merging portions of the image. It first splits the image into regions that have similar attributes and then merges the adjacent portions which are similar to one another. In region splitting, the algorithm considers the entire image, while in region growing, the algorithm focuses on a particular point.
The region splitting and merging method follows a divide and conquer methodology. It divides the image into
different portions and then matches them according to its predetermined conditions. Another name for the algorithms
that perform this task is split-merge algorithms.
4. Watershed Segmentation
In image processing, a watershed is a transformation on a grayscale image. It refers to the geological watershed
or a drainage divide. A watershed algorithm would handle the image as if it was a topographic map. It considers
the brightness of a pixel as its height and finds the lines that run along the top of those ridges.
Watershed has many technical definitions and has several applications. Apart from identifying the ridges of the
pixels, it focuses on defining basins (the opposite of ridges) and floods the basins with markers until they meet
the watershed lines going through the ridges.
As basins have a lot of markers while the ridges don’t, the image gets divided into multiple regions according to the
‘height’ of every pixel.
The watershed method converts every image into a topographical map; the watershed segmentation method reflects the topography through the grey values of the pixels.
Now, a landscape with valleys and ridges would certainly have three-dimensional aspects. The watershed would
consider the three-dimensional representation of the image and create regions accordingly, which are called
“catchment basins”.
It has many applications in the medical sector such as MRI, medical imaging, etc. Watershed segmentation is a
prominent part of medical image segmentation so if you want to enter that sector, you should focus on learning
this method for segmentation in image processing particularly.
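The marker-based watershed pipeline below follows the standard OpenCV recipe; the file name is hypothetical, and the 0.6 foreground cutoff is an illustrative choice.

    import cv2
    import numpy as np

    img = cv2.imread("coins.png")  # hypothetical image of touching objects
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Sure background (dilated objects) and sure foreground (basin bottoms,
    # i.e. pixels far from any background by the distance transform).
    kernel = np.ones((3, 3), np.uint8)
    sure_bg = cv2.dilate(binary, kernel, iterations=3)
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.6 * dist.max(), 255, 0)
    sure_fg = sure_fg.astype(np.uint8)
    unknown = cv2.subtract(sure_bg, sure_fg)

    # Label every basin, keep 0 for the still-unknown band, then flood:
    # watershed writes -1 along the ridge lines where basins meet.
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0
    markers = cv2.watershed(img, markers)
    img[markers == -1] = (0, 0, 255)  # paint the watershed lines red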
If you’ve studied classification algorithms, you must have come across clustering algorithms. They are unsupervised
algorithms and help you in finding hidden data in the image that might not be visible to a normal vision. This hidden
data includes information such as clusters, structures, shadings, etc.
As the name suggests, a clustering algorithm divides the image into clusters (disjoint groups) of pixels that have
similar features. It would separate the data elements into clusters where the elements in a cluster are more similar in
comparison to the elements present in other clusters.
Some of the popular clustering algorithms include fuzzy c-means (FCM), k-means, and improved k-means
algorithms. In image segmentation, you’d mostly use the k-means clustering algorithm as it’s quite simple and
efficient. On the other hand, the FCM algorithm puts the pixels in different classes according to their varying degrees
of membership.
The most important clustering algorithms for segmentation in image processing are:
K-means Clustering
K-means is a simple unsupervised machine learning algorithm. It classifies an image through a specific number of clusters. It starts the process by picking k points in the image's feature space to act as the k group centroids.
It then assigns each pixel to the group whose centroid is nearest. When the algorithm has assigned all pixels to clusters, it recomputes the centroids and repeats the assignment until the clusters stabilise.
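A minimal sketch of k-means colour segmentation, assuming scikit-learn is available; the helper name and the choice of k are illustrative:

import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(img, k=4):
    # Cluster the pixels of an (H, W, 3) image into k colour groups.
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(float)        # one row per pixel
    km = KMeans(n_clusters=k, n_init=10).fit(pixels)
    # Replace every pixel by the centroid of its cluster.
    segmented = km.cluster_centers_[km.labels_].reshape(h, w, c)
    return segmented.astype(img.dtype), km.labels_.reshape(h, w)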
Fuzzy C Means
With the fuzzy c-means clustering method, the pixels in the image can get clustered in multiple clusters. This
means a pixel can belong to more than one cluster. However, every pixel would have varying levels of similarities
with every cluster. The fuzzy c-means algorithm has an optimization function which affects the accuracy of your
results.
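For reference, the fuzzy c-means update rules can be sketched compactly in NumPy. This toy version clusters 1-D pixel intensities; the fuzziness exponent m and the iteration count are illustrative parameters:

import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, iters=50, eps=1e-9):
    # x: (N,) array of intensities. Returns (centers, memberships of shape (c, N)).
    rng = np.random.default_rng(0)
    u = rng.random((c, x.size))
    u /= u.sum(axis=0)                    # memberships sum to 1 for each pixel
    for _ in range(iters):
        um = u ** m
        centers = um @ x / um.sum(axis=1) # fuzzy-weighted means
        d = np.abs(x[None, :] - centers[:, None]) + eps
        u = d ** (-2.0 / (m - 1))
        u /= u.sum(axis=0)                # renormalise the memberships
    return centers, u

Each pixel ends up with a degree of membership in every cluster, which is exactly the behaviour described above.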
Clustering algorithms can take care of most of your image segmentation needs. Perhaps, though, you don't want to do everything by yourself; perhaps you want an AI to do most of your tasks, which you can certainly do with neural networks for image segmentation.
The experts at Facebook AI Research (FAIR) created a deep learning architecture called Mask R-CNN which can make a pixel-wise mask for every object present in an image. It is an enhanced version of the Faster R-CNN object detection architecture. Faster R-CNN produces two pieces of data for every object in an image: the bounding box coordinates and the class of the object. Mask R-CNN adds a third output: the object mask produced by the segmentation.
In this process, you first pass the input image to a ConvNet, which generates the feature map for the image. Then the system applies the region proposal network (RPN) to the feature maps and generates the object proposals with their objectness scores.
After that, an RoI pooling layer is applied to the proposals to bring them down to one size. In the final stage, the system passes the proposals to the fully connected layers for classification and generates the output with the bounding boxes for every object.
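As a hedged illustration, the torchvision library ships a pre-trained Mask R-CNN that exposes exactly these outputs (boxes, labels, scores and per-pixel masks). This is a usage sketch, not the original FAIR pipeline; the input file name and the score threshold are illustrative:

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Older torchvision uses pretrained=True; newer versions use weights="DEFAULT".
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()  # inference mode

img = to_tensor(Image.open("input.jpg").convert("RGB"))  # hypothetical input
with torch.no_grad():
    out = model([img])[0]  # one result dict per input image

# Every detection has a bounding box, a class label, a score and a soft mask.
for box, label, score, mask in zip(out["boxes"], out["labels"],
                                   out["scores"], out["masks"]):
    if score > 0.5:
        binary_mask = mask[0] > 0.5  # threshold the per-pixel mask
        print(label.item(), score.item(), box.tolist())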
LECTURE NOTE 26
An image retrieval system is a computer system used for browsing, searching and retrieving images from a
large database of digital images. Most traditional and common methods of image retrieval utilize some method of
adding metadata such as captioning, keywords, title or descriptions to the images so that retrieval can be performed
over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this,
there has been a large amount of research done on automatic image annotation. Additionally, the increase in
social web applications and the semantic web have inspired the development of several web-based image
annotation tools.
Image search is a specialized data search used to find images. To search for images, a user may provide query
terms such as keyword, image file/link, or click on some image, and the system will return images "similar" to the
query. The similarity used for search criteria could be meta tags, color distribution in images, region/shape
attributes, etc.
• Image meta search - search of images based on associated metadata such as keywords, text, etc.
• Content-based image retrieval (CBIR) – the application of computer vision to image retrieval.
CBIR aims at avoiding the use of textual descriptions and instead retrieves images based on
similarities in their contents (textures, colors, shapes etc.) to a user-supplied query image or user-
specified image features; a toy sketch of colour-histogram matching appears after this list.
o List of CBIR Engines - a list of engines which search for images based on image visual
content such as color, texture, shape/object, etc.
• Image collection exploration - search of images based on the use of novel exploration paradigms
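As a toy illustration of colour-based similarity, the sketch below compares two images by histogram intersection of their normalised colour histograms; NumPy is assumed and all names are illustrative:

import numpy as np

def colour_histogram(img, bins=8):
    # img: (H, W, 3) uint8 array. Returns a flattened, normalised
    # 3-D colour histogram with `bins` bins per channel.
    hist, _ = np.histogramdd(img.reshape(-1, 3),
                             bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def similarity(img_a, img_b):
    # Histogram intersection: 1.0 means identical colour distributions.
    ha, hb = colour_histogram(img_a), colour_histogram(img_b)
    return np.minimum(ha, hb).sum()

A CBIR engine would precompute such histograms for the whole collection and rank images against the query image by this score.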
It is crucial to understand the scope and nature of image data in order to determine the complexity of image search
system design. The design is also largely influenced by factors such as the diversity of user-base and expected user
traffic for a search system. Along this dimension, search data can be classified into the following categories:
• Archives - usually contain large volumes of structured or semi-structured homogeneous data pertaining
to specific topics.
• Domain-Specific Collection - this is a homogeneous collection providing access to controlled users
with very specific objectives. Examples of such a collection are biomedical and satellite image
databases.
• Enterprise Collection - a heterogeneous collection of images that is accessible to users within an
organization's intranet. Pictures may be stored in many different locations.
• Personal Collection - usually consists of a largely homogeneous collection and is generally small in
size, accessible primarily to its owner, and usually stored on a local storage media.
• Web - World Wide Web images are accessible to everyone with an Internet connection. These image
collections are semi-structured, non-homogeneous and massive in volume, and are usually stored in
large disk arrays.
LECTURE NOTE 27
A K-D Tree is a binary tree in which each node represents a k-dimensional point. Every non-leaf node in the tree
acts as a hyperplane, dividing the space into two partitions. This hyperplane is perpendicular to the chosen axis,
which is associated with one of the K dimensions.
There are different strategies for choosing an axis when dividing, but the most common one is to cycle through each of the K dimensions repeatedly and select a midpoint along it to divide the space. For instance, in the case of 2-dimensional points with x and y axes, we first split along the x-axis, then the y-axis, and then the x-axis again, continuing in this manner until all points are accounted for.
The construction of a K-D Tree involves recursively partitioning the points in the space, forming a binary tree. The
process starts by selecting an axis. We then choose the middle (median) point along the axis and split the space, so that the remaining points fall into two subsets based on their position relative to the splitting hyperplane.
The left child of the root node is then created with the points in the subset that lie to the left of the hyperplane,
while the right child is created with the points that lie to the right. This process is repeated recursively for each
child node, selecting a new axis and splitting the points based on their position relative to the new hyperplane.
If the above algorithm is executed correctly, the resulting tree will be balanced, with each leaf node being
approximately equidistant from the root. To achieve this, it’s essential to select the median point every time. It is
also worth noting that finding the median can add some complexity, as it requires the usage of another algorithm. If
the median point is not selected, there is no guarantee that the tree will be balanced.
One approach to finding the median is to use a sorting algorithm, sort the points along the selected axis and take
the middle point. To address the added complexity, we can sort a fixed number of randomly selected points and use
the median of those points as the splitting plane. This alternative practice can be less computationally expensive
than sorting the entire array of points.
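A minimal recursive construction in Python, cycling the splitting axis with depth and splitting at the median (for simplicity it sorts at every level; the names are illustrative):

from collections import namedtuple

Node = namedtuple("Node", "point axis left right")

def build_kdtree(points, depth=0):
    # points: list of k-dimensional tuples. Returns the root Node (or None).
    if not points:
        return None
    k = len(points[0])
    axis = depth % k                 # cycle through the k axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2           # median split keeps the tree balanced
    return Node(points[mid], axis,
                build_kdtree(points[:mid], depth + 1),
                build_kdtree(points[mid + 1:], depth + 1))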
4. Inserting and Removing Nodes
Inserting and removing nodes are essential operations in maintaining a K-D tree’s structure and performance.
However, due to the tree’s special characteristics, they must be implemented with care. That is why in this section,
we’ll discuss both operations in detail.
Here we will see the R-Tree data structure. R-Trees are used to store spatial data indexes in an efficient manner. This structure is very useful for holding spatial data queries and storage. R-trees have some real-life applications, such as:
• Indexing multidimensional information
• Handling game data
• Holding geospatial coordinates
• Implementing virtual maps
Comparison of quad-trees and R-trees:
• Quad-tree: can be formed on a B-tree; nearest-neighbour querying is slower, but window querying is faster.
• R-tree: does not follow the structure of a B-tree; nearest-neighbour querying is faster, but window querying is slower.
Lecture 30: Video Content, querying
Content-based image retrieval, also known as query by image content (QBIC) and content-based
visual information retrieval (CBVIR), is the application of computer vision techniques to the image
retrieval problem, that is, the problem of searching for digital images in large databases. Content-based
image retrieval is opposed to traditional concept-based approaches (see Concept-based image
indexing).
"Content-based" means that the search analyzes the contents of the image rather than the metadata such
as keywords, tags, or descriptions associated with the image. The term "content" in this context might
refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR
is desirable because searches that rely purely on metadata are dependent on annotation quality and
completeness.
Lecture 31: Video segmentation, indexing
Digital Image Processing means processing a digital image by means of a digital computer. We can also say that it is the use of computer algorithms in order to get an enhanced image or to extract some useful information from it.
Digital image processing is the use of algorithms and mathematical models to process and analyze digital images. The goal of digital image processing is to enhance the quality of images, extract meaningful information from images, and automate image-based tasks. The main stages are:
1. Image acquisition: This involves capturing an image using a digital camera or scanner, or importing
an existing image into a computer.
2. Image enhancement: This involves improving the visual quality of an image, such as increasing
contrast, reducing noise, and removing artifacts.
3. Image restoration: This involves removing degradation from an image, such as blurring, noise, and
distortion.
4. Image segmentation: This involves dividing an image into regions or segments, each of which
corresponds to a specific object or feature in the image.
5. Image representation and description: This involves representing an image in a way that can be
analyzed and manipulated by a computer, and describing the features of an image in a compact and
meaningful way.
6. Image analysis: This involves using algorithms and mathematical models to extract information
from an image, such as recognizing objects, detecting patterns, and quantifying features.
7. Image synthesis and compression: This involves generating new images or compressing existing
images to reduce storage and transmission requirements.
Digital image processing is widely used in a variety of applications, including medical imaging, remote sensing, computer vision, and multimedia.
Types of Segments :
• Posted Segment : When the visible attribute of a segment is set to 1, it is called a posted segment. It is included in the active segment list.
• Unposted Segment : When the visible attribute of a segment is set to 0, it is called an unposted segment. It is not included in the active segment list.
Functions for Segmenting the display :
1. Segment Creation : A segment must be created or opened only when no other segment is open, since two segments cannot be open at the same time; otherwise it would be difficult to assign a drawing instruction to a particular segment. The segment created must be given a valid name to identify it, and no existing segment may have the same name. After this, we initialize the items in the segment table under our segment name; the first instruction of this segment is allocated at the next free storage in the display file, and the attributes of the segment are initialized to their defaults. Algorithm :
1. If any segment is open, give error message : “Segment is still open” and go to step 8.
2. Read the name of the new segment.
3. If the segment name is not valid, give error message : “Segment name not a valid name” and
go to step 8.
4. If given segment name already exists, give error message : “Segment name already exists in
name list” and go to step 8.
5. Make next free storage area in display file as start of new segment.
6. Initialize size of new segment to 0 and all its attributes to their default values.
7. Inform that the new segment is now open.
8. Stop.
2. Closing a Segment : After entry of all display file instructions is complete, the segment needs to be closed, for which it has to be renamed: the name of the currently open segment is changed to 0. Now the segment with name 0, i.e. the unnamed segment, is open, and if two unnamed segments are present in the display file, one needs to be deleted. Algorithm :
1. If any segment is not open, give error message : “No segment is open now” and go to step 6.
2. Change the name of the currently opened segment to the unnamed segment, let's say 0.
3. Delete any other unnamed segment instruction which may have been saved and initialize above
unnamed segment with no instructions.
4. Make the next free storage area available in display file as start of the unnamed segment.
5. Initialize size of unnamed segment to 0.
6. Stop.
3. Deleting a Segment : To delete a particular segment from the display file, we must delete just that one segment, without destroying or re-forming the entire display, and recover the space occupied by the segment so it can be used by some other segment. The method to achieve this depends upon the data structure used to represent the display file. In the case of arrays, the gap left by the deleted segment is filled by shifting up the segments stored after it and updating their entries in the segment table.
LECTURE 33
The Open Document Architecture (ODA) is an internationally standardized electronic representation for
document content and structure. ODA has been ratified by the International Organization for Standardization as ISO 8613.
ODA is a critical standard for anyone who wants to share documents without sacrificing control over
content, structure and layout of those documents. It is designed to solve difficulties created by the variety
of document formats that exist. An ODA document can be opened, changed, exchanged, stored and
reproduced by any ODA-compliant program.
The value of ODA increases with time. That is, while proprietary stored files may be incompatible with
the new formats of software upgrades, ODA files remain readable regardless of new formats. With ODA,
massive file conversions will become a thing of the past. ODA not only protects office automation
investments, it also allows for easy document transfer and common document storage. Moreover, ODA
goes beyond unformatted text by using logical structure, so that text can be formatted on output in a form
most suitable for the output device.
ODA has established subsets of functionality, suitable for particular applications. Such a subset is
specified in a functional profile or Document Application Profile (DAP). At present, functional profiles
for ODA have been developed for three specific types of use. WordPerfect Exchange, the ODA solution from WordPerfect (the Novell Applications Group), supports a profile known as FOD26. This DAP aims to provide support for typical word processing and simple desktop publishing.
Working with word processors has become a natural part of today's office life for most of us. Word
processors, editors and other document handlers provide electronic assistance in the production of
documents, which frequently include graphics and images as well as text.
The story is very different where document handling is concerned; it is still not possible for us to send
electronic documents to other people without first having to inquire about their document handling
facilities. Is there really no alternative to sending all our texts in ASCII form?
What is needed is a single, self-contained standard under which documents, including both text and
graphics, can be transmitted with all attributes intact from one system to another, for further editing,
processing, storing, printing and retransmission.
Open Document Architecture (ODA) is precisely such a standard. The Open Document Architecture and
Interchange Format (ODA/ODIF) is a new compound standard for use in the expanding world of open
systems. Compound, or multimedia documents are those made up of several different types of content;
for example, character text, graphics and images.
In the early 1990s, six major computer companies joined together to form the ODA Consortium. The
ODA Consortium (ODAC), a European Economic Interest Grouping, promotes the ODA standard and
provides means to implement it in software applications by use of Toolkit APIs. The companies are: Bull
SA, DEC, IBM, ICL PLC, Siemens-Nixdorf and Unisys. In 1994, WordPerfect (the Novell applications group) joined the consortium.
ODA has been designed to facilitate inter-operability between different document processing systems.
Document interchange occurs whenever one person sends a document to another. Users may prefer electronic document interchange for a variety of reasons; the essential benefit within the context of document interchange is vendor independence, since documents are not tied to any one vendor's software.
Because ODA has been designed to be extensible, the ODA standard is written to have a very broad
scope. In practice, however, no device can support every possible feature. How then is it possible to
guarantee that the sender and the recipient of a document support a compatible set of features?
The answer to this problem is a set of defined document application profiles (DAP). These are arranged
in levels of increasing functionality. Loosely speaking, each DAP defines a list of supported features,
which any system at the same or higher level must be able to accept or interpret correctly.
ODA prescribes, first of all, an organization for the information in a document. This is divided into the
following categories:
• Logical structure: like SGML, a sequential order of objects in the file.
• Layout: the placement of content on the page.
• Content: text, geometric graphics, and raster graphics (raster graphics are also called facsimile images).
For example, a memo document (where layout may not be critical) could have logical and content
components in ODA format.
All ODA files can be classified as Formatted, Formatted Processable, or Processable. Formatted files are
not to be edited. Processable and Formatted Processable files can be edited.
The development of MHEG arose directly out of the increasing convergence of broadcast and interactive
technologies. It specifies an encoding format for multimedia applications independently of service
paradigms and network protocols. Like QuickTime and OMFI, it is concerned with time-based media objects, whose encodings are determined by other standards. However, the scope of MHEG is larger in that it directly supports interactive media and real-time delivery over networks.
There has been a progression of MHEG standards (much like MPEG). The current widespread standard is MHEG-5, but draft standards exist up to MHEG-7.
Every design generally represents a compromise between conflicting goals. MHEG's design is no
exception, especially if you consider that MHEG-5 (and later) targets a continuously growing and fiercely
competitive market where broadcast and interactive technologies converge almost daily.
Converging technologies have often stimulated adopting standard solutions. Multimedia applications
standards provide more than just the obvious objectives of portability and interoperability. A good
multimedia standard can become a reference solution for system developers and application
programmers. It also promotes the use of modular architectures that rely on common components that
accomplish a specific functionality, such as interpreting and presenting MHEG applications to users. This
task is performed by a compliant runtime engine (RTE), a resident software component that schedules
delivery of an application to the user. It's aimed at a wide installation base within complete solutions, like
a Video on Demand or an Interactive TV system. RTEs help improve a product's return on investment,
abate a product's per unit costs, and provide high quality, robust products due to extensive product
testing.
SGML stands for Standard Generalized Markup Language. It can be defined as a standard for defining generalized markup languages for documents.
It was developed and designed by the International Organization for Standardization (ISO).
HTML was theoretically an example of an SGML-based language until HTML5, which browsers cannot parse as SGML for compatibility reasons. SGML was derived from GML, and HTML and XML were in turn derived from SGML.
The extension of SGML files is:
.sgml
Syntax:
<NAME TYPE="user">
Geeks for Geeks
</NAME>
SGML code typically looks like:
<EMAIL>
<SENDER>
<PERSON>
<FIRSTNAME>GEEKSFORGEEKS</FIRSTNAME>
</PERSON>
</SENDER>
<BODY>
</BODY>
</EMAIL>
Characteristics
• The SGML declaration.
• The prologue, containing a DOCTYPE declaration with the various markup declarations that together make up a DTD, i.e. a Document Type Definition.
• The instance itself, containing one top-most element and its contents.
Components of an SGML Document :
There are mainly three components of an SGML document. They are:
1. SGML Declaration
2. Prolog
3. Document instance.
Advantages
• It has the capability to encode the full structure of the document and can support any media type.
• It is of much more use than HTML, which provides capabilities to code the visual representation but not the structure of the real piece of information.
• Separates content from appearance.
• SGML encoding allows more complex formatting than HTML.
• Stylesheets allow the same SGML content to be used for different purposes.
• Extremely flexible.
• Well supported, with many tools available because it is an ISO standard.
Disadvantages
• It can be difficult to write software that processes SGML.
• Tools that are used with SGML are expensive.
• It is not widely used.
• Special software is required to run or display the documents.
• Creating DTDs requires exacting software engineering.
LECTURE 36
A Document Type Definition (DTD) describes the tree structure of a document and something about its data. It is a set of markup declarations that define a type of document for the SGML family of languages, such as GML, SGML, HTML and XML.
A DTD can be declared inside an XML document (inline) or as an external reference. A DTD determines how many times an element may appear and how its child elements are ordered.
There are 2 data types, PCDATA and CDATA
• PCDATA is parsed character data.
• CDATA is character data, not usually parsed.
Syntax:
<!DOCTYPE element DTD identifier
[
first declaration
second declaration
.
.
nth declaration
]>
Example:
<?xml version="1.0"?>
<!DOCTYPE address [
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
]>
<address>
<name>
<first>Rohit</first>
<last>Sharma</last>
</name>
<email>sharmarohit@gmail.com</email>
<phone>9876543210</phone>
<birthday>
<year>1987</year>
<month>June</month>
<day>23</day>
</birthday>
</address>
The same document can instead reference an external DTD:
<?xml version="1.0"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>
<first>Rohit</first>
<last>Sharma</last>
</name>
<email>sharmarohit@gmail.com</email>
<phone>9876543210</phone>
<birthday>
<year>1987</year>
<month>June</month>
<day>23</day>
</birthday>
</address>
address.dtd:
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
LECTURE 37
HTML (HyperText Markup Language) is the code that is used to structure a web page and its content. For
example, content could be structured within a set of paragraphs, a list of bulleted points, or using images
and data tables. As the title suggests, this article will give you a basic understanding of HTML and its
functions.
So what is HTML?
HTML is a markup language that defines the structure of your content. HTML consists of a series
of elements, which you use to enclose, or wrap, different parts of the content to make it appear a certain
way, or act a certain way. The enclosing tags can make a word or image hyperlink to somewhere else, can
italicize words, can make the font bigger or smaller, and so on. For example, take the following line of content:
My cat is very grumpy
If we wanted the line to stand by itself, we could specify that it is a paragraph by enclosing it in paragraph tags:
<p>My cat is very grumpy</p>
Anatomy of an HTML element
1. The opening tag: This consists of the name of the element (in this case, p), wrapped in opening
and closing angle brackets. This states where the element begins or starts to take effect — in this
case where the paragraph begins.
2. The closing tag: This is the same as the opening tag, except that it includes a forward slash before
the element name. This states where the element ends — in this case where the paragraph ends.
Failing to add a closing tag is one of the standard beginner errors and can lead to strange results.
3. The content: This is the content of the element, which in this case, is just text.
4. The element: The opening tag, the closing tag, and the content together comprise the element.
Elements can also have attributes, which look like the following:
<p class="editor-note">My cat is very grumpy</p>
Attributes contain extra information about the element that you don't want to appear in the actual content.
Here, class is the attribute name and editor-note is the attribute value. The class attribute allows you to give the element a non-unique identifier that can be used to target it (and any other elements with the same class value) with style information and other things. Some attributes have no value, such as required. An attribute should always have:
1. A space between it and the element name (or the previous attribute, if the element already has one
or more attributes).
2. The attribute name followed by an equal sign.
3. The attribute value wrapped by opening and closing quotation marks.
Note: Simple attribute values that don't contain ASCII whitespace (or any of the characters " ' ` = < >) can
remain unquoted, but it is recommended that you quote all attribute values, as it makes the code more
consistent and understandable.
Nesting elements
You can put elements inside other elements too — this is called nesting. If we wanted to state that our cat
is very grumpy, we could wrap the word "very" in a <strong> element, which means that the word is to be
strongly emphasized:
<p>My cat is <strong>very</strong> grumpy.</p>
You do however need to make sure that your elements are properly nested. In the example above, we
opened the <p> element first, then the <strong> element; therefore, we have to close the <strong> element first,
then the <p> element. The following is incorrect:
<p>My cat is <strong>very grumpy.</p></strong>
The elements have to open and close correctly so that they are clearly inside or outside one another. If they
overlap as shown above, then your web browser will try to make the best guess at what you were trying to
say, which can lead to unexpected results. So don't do it!
Void elements
Some elements have no content and are called void elements. Take the <img> element that we already have
in our HTML page:
<img src="images/firefox-icon.png" alt="My test image" />
This contains two attributes, but there is no closing </img> tag and no inner content. This is because an
image element doesn't wrap content to affect it. Its purpose is to embed an image in the HTML page in the
place it appears.
That wraps up the basics of individual HTML elements, but they aren't handy on their own. Now we'll look at how individual elements are combined to form an entire HTML page. Let's look at a complete index.html example:
<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>My test page</title>
</head>
<body>
<img src="images/firefox-icon.png" alt="My test image" />
</body>
</html>
• <!DOCTYPE html> — doctype. It is a required preamble. In the mists of time, when HTML was
young (around 1991/92), doctypes were meant to act as links to a set of rules that the HTML page
had to follow to be considered good HTML, which could mean automatic error checking and other
useful things. However, these days, they don't do much and are basically just needed to make sure
your document behaves correctly. That's all you need to know for now.
• <html></html> — the <html> element. This element wraps all the content on the entire page and
is sometimes known as the root element. It also includes the lang attribute, setting the primary
language of the document.
• <head></head> — the <head> element. This element acts as a container for all the stuff you want
to include on the HTML page that isn't the content you are showing to your page's viewers. This
includes things like keywords and a page description that you want to appear in search results, CSS
to style our content, character set declarations, and more.
• <meta charset="utf-8"> — This element sets the character set your document should use to UTF-8
which includes most characters from the vast majority of written languages. Essentially, it can now
handle any textual content you might put on it. There is no reason not to set this, and it can help
avoid some problems later on.
• <meta name="viewport" content="width=device-width"> — This viewport element ensures the
page renders at the width of viewport, preventing mobile browsers from rendering pages wider than
the viewport and then shrinking them down.
• <title></title> — the <title> element. This sets the title of your page, which is the title that appears in the browser tab the page is loaded in. It is also used to describe the page when you bookmark/favorite it.
• <body></body> — the <body> element. This contains all the content that you want to show to web
users when they visit your page, whether that's text, images, videos, games, playable audio tracks,
or whatever else.
Images
<img src="images/firefox-icon.png" alt="My test image" />
As we said before, it embeds an image into our page in the position it appears. It does this via the src (source)
attribute, which contains the path to our image file.
We have also included an alt (alternative) attribute. In the alt attribute, you specify descriptive text for users
who cannot see the image, possibly because of the following reasons:
1. They are visually impaired. Users with significant visual impairments often use tools called screen
readers to read out the alt text to them.
2. Something has gone wrong causing the image not to display. For example, try deliberately changing
the path inside your src attribute to make it incorrect. If you save and reload the page, you should
see something like this in place of the image:
The keywords for alt text are "descriptive text". The alt text you write should provide the reader with
enough information to have a good idea of what the image conveys. In this example, our current text of
"My test image" is no good at all. A much better alternative for our Firefox logo would be "The Firefox
logo: a flaming fox surrounding the Earth."
Try coming up with some better alt text for your image now.
Marking up text
This section will cover some essential HTML elements you'll use for marking up the text.
Headings
Heading elements allow you to specify that certain parts of your content are headings — or subheadings.
In the same way that a book has the main title, chapter titles, and subtitles, an HTML document can too.
HTML contains 6 heading levels, <h1> - <h6>, although you'll commonly only use 3 to 4 at most:
<!-- 4 heading levels: -->
<h1>My main title</h1>
<h2>My top level heading</h2>
<h3>My subheading</h3>
<h4>My sub-subheading</h4>
Note: Anything in HTML between <!-- and --> is an HTML comment. The browser ignores comments as
it renders the code. In other words, they are not visible on the page - just in the code. HTML comments are
a way for you to write helpful notes about your code or logic.
Now try adding a suitable title to your HTML page just above your <img> element.
Note: You'll see that your heading level 1 has an implicit style. Don't use heading elements to make text
bigger or bold, because they are used for accessibility and other reasons such as SEO. Try to create a
meaningful sequence of headings on your pages, without skipping levels.
Paragraphs
As explained above, <p> elements are for containing paragraphs of text; you'll use these frequently when
marking up regular text content:
<p>This is a single paragraph</p>
Add your sample text (you should have it from What will your website look like?) into one or a few
paragraphs, placed directly below your <img> element.
Lists
A lot of the web's content is lists and HTML has special elements for these. Marking up lists always consists
of at least 2 elements. The most common list types are ordered and unordered lists:
1. Unordered lists are for lists where the order of the items doesn't matter, such as a shopping list.
These are wrapped in a <ul> element.
2. Ordered lists are for lists where the order of the items does matter, such as a recipe. These are
wrapped in an <ol> element.
Each item inside the lists is put inside an <li> (list item) element.
For example, suppose we wanted to turn part of the following paragraph fragment into a list:
<p>
At Mozilla, we're a global community of technologists, thinkers, and builders
working together…
</p>
We could modify the markup to this:
<ul>
<li>technologists</li>
<li>thinkers</li>
<li>builders</li>
</ul>
<p>working together…</p>
Links
Links are very important — they are what makes the web a web! To add a link, we need to use a simple
element — <a> — "a" being the short form for "anchor". To make text within your paragraph into a link,
follow these steps:
1. Choose some text. We chose the text "Mozilla Manifesto".
2. Wrap the text in an <a> element, like this:
<a>Mozilla Manifesto</a>
3. Give the <a> element an href attribute, like this:
<a href="">Mozilla Manifesto</a>
4. Fill in the value of this attribute with the web address that you want the link to point to:
<a href="https://www.mozilla.org/en-US/about/manifesto/">
Mozilla Manifesto
</a>
You might get unexpected results if you omit the https:// or http:// part, called the protocol, at the beginning
of the web address. After making a link, click it to make sure it is sending you where you wanted it to.
Note: href might appear like a rather obscure choice for an attribute name at first. If you are having trouble
remembering it, remember that it stands for hypertext reference.
LECTURE 39
Video On Demand has become increasingly popular. Giant television providers in the United States have
committed to provide VOD services in the near future. Interactive Video On Demand (IVOD) is an
extension of VOD in which additional functionalities such as Fast Forward, Fast Rewind, and Pause are
implemented. These functionalities pose new requirements and challenges on the system implementation.
An IVOD system has three components: Client's "set-top box", network, and servers with archives. The
clients' set-top boxes are their interfaces to the IVOD system. It has a network interface, a decoder, buffers,
and synchronization hardware. Clients input their commands using remote controls. The network of an IVOD system must be a high-speed network. Currently available technologies that are suitable for transferring IVOD data include SONET, ATM, ADSL, and HFC. Servers with archives are places where user
commands are processed and where movies are stored. Issues such as admission control, servicing policies,
and the storage subsystem structure must be considered when designing the IVOD system. In addition to
the technical issues described above, non-technical issues such as standards, property rights, and cost must
also be considered.
Video-on-demand (VOD) is a technology for delivering video content, such as movies and television shows, directly to individual customers for immediate viewing.
In a cable television VOD system, video content is stored on a centralized server in the form of compressed
digital files. A customer navigates a programming menu via the cable set-top box and makes a selection,
available either at no cost or for a charge. The server immediately begins streaming the program. The
viewer may pause, fast-forward, rewind, or stop and later resume the program. Sometimes the program will
be available for viewing only for a short set time period. VOD systems may also use a download-based
model, in which the program is stored on a hard disk in the set-top box, or they may transmit over
the Internet to a personal computer. Satellite television services, which broadcast the same signal over an
entire service area, cannot accommodate true VOD, though they often offer Internet VOD.
Cable providers experimented with VOD in the 1990s, but the services failed to achieve much success until
the next decade, when equipment and bandwidth became less expensive and content providers began
allowing more programming to be offered by VOD. By the middle of the decade, VOD had largely
supplanted schedule-driven pay-per-view service on cable systems, and by 2010 most television networks
were offering many of their programs on VOD.
In the same time frame, Internet-based VOD grew more pervasive in the video-rental market, allowing
customers immediate access to an expansive library of programming at the click of a button. New
development efforts focused on bridging the gap between Internet and television sets, allowing online rental
services to compete with cable providers in bringing content to customers’ television screens. Subscription
services such as Netflix and Amazon Prime allowed users to stream video for a monthly fee. The video site
YouTube also hosted many videos that could be streamed on demand. In the early 21st century more than
70 percent of Internet traffic was streaming video. The many options available for Internet VOD have even led some viewers to "cut the cord", that is, to reject cable television in favor of Internet streaming.
LECTURE 40
Video conferencing is a live video-based meeting between two or more people in different locations using
video-enabled devices. Video conferencing allows multiple people to meet and collaborate face to face
long distance by transmitting audio, video, text and presentations in real time through the internet.
• 80% of executives say turning on video for internal communications is becoming the norm; 84% say the same for external meetings
• 87% more people use video conferencing today than two years ago
• 75% of remote workers experience increased productivity and an enhanced work-life balance
Project Management
Project management tools like Asana and Monday.com are designed to help teams organize, track and
manage tasks and projects. All projects go through a customized task workflow that controls the status of
the project as well as the rules by which it transitions to other statuses. These tools make it easy to see
project due dates, assignees and progress to ensure every project is completed correctly and on time.
Communication Tools
Instant messaging tools like Microsoft Teams and Slack® are great for quick, impromptu, text-based
conversations to help keep your team connected throughout the workday. In addition to texts, you can also
send GIFs and videos on the platforms, making the messages more exciting. All files and chats are synced,
archived and searchable, so you can find chats and documents in the future. You can also create unique
channels so smaller groups of people or departments can work together on tasks and projects.
Whiteboarding
Whiteboards are great for quickly sharing ideas, brainstorming with your team and making visual concepts
easier to understand, but if you’re calling into the meeting over video it’s easy to feel left out. Life-size
Share™ with the Kaptivo™ whiteboard camera system lets you digitally capture and share content from
any standard whiteboard. Kaptivo removes people, reflection and glares from the whiteboard session and
presents a clear and sharp digital image to everyone on the call in real time. After the meeting you can
easily download or share your whiteboard session with your entire team or people outside your
organization.
LECTURE 41
Applications of multimedia
Multimedia finds its application in various areas including, but not limited to, advertisements, art,
education, entertainment, engineering, medicine, mathematics, business, scientific research and
spatial, temporal applications. A few application areas of multimedia are listed below:
• Creative industries: Creative industries use multimedia for a variety of purposes ranging from
fine arts, to entertainment, to commercial art, to journalism, to media and software services
provided for any of the industries listed below. An individual multimedia designer may cover
the spectrum throughout their career request for their skills range from technical, to analytical
and to creative.
• Commercial: Much of the electronic old and new media utilized by commercial artists is
multimedia. Exciting presentations are used to grab and keep attention in advertising.
Industrial, business-to-business, and interoffice communications are often developed by creative services firms as advanced multimedia presentations, beyond simple slide shows, to sell ideas or liven up training. Commercial multimedia developers may be hired to design for
governmental services and nonprofit services applications as well.
• Entertainment and Fine Arts: In addition, multimedia is heavily used in the entertainment
industry, especially to develop special effects in movies and animations. Multimedia games
are a popular pastime and are software programs available either as CD-ROMs or online. Some
video games also use multimedia features.
Multimedia applications that allow users to actively participate instead of just sitting by as
passive recipients of information are called Interactive Multimedia.
• Education: In Education, multimedia is used to produce computer-based training courses
(popularly called CBTs) and reference books like encyclopedia and almanacs. A CBT lets the
user go through a series of presentations, text about a particular topic, and associated
illustrations in various information formats. Edutainment is an informal term used to describe
combining education with entertainment, especially multimedia entertainment.
• Engineering: Software engineers may use multimedia in Computer Simulations for anything
from entertainment to training such as military or industrial training. Multimedia for software
interfaces are often done as collaboration between creative professionals and software
engineers.
• Industry: In the Industrial sector, multimedia is used as a way to help present information to
shareholders, superiors and coworkers. Multimedia is also helpful for providing employee
training, advertising and selling products all over the world via virtually unlimited web-based
technologies.
• Mathematical and Scientific Research: In Mathematical and Scientific Research, multimedia
is mainly used for modeling and simulation. For example, a scientist can look at a molecular
model of a particular substance and manipulate it to arrive at a new substance. Representative
research can be found in journals such as the Journal of Multimedia.
• Medicine: In Medicine, doctors can get trained by looking at a virtual surgery or they can
simulate how the human body is affected by diseases spread by viruses and bacteria and then
develop techniques to prevent it.
• Multimedia in Public Places: In hotels, railway stations, shopping malls, museums, and
grocery stores, multimedia will become available at stand-alone terminals or kiosks to provide
information and help. Such installations reduce demand on traditional information booths and personnel, add value, and can work around the clock, even in the middle of the night, when live help is off duty.
A menu screen from a supermarket kiosk, for example, can provide services ranging from meal planning to coupons. Hotel kiosks list nearby restaurants and maps of the city, show airline schedules, and provide guest services such as automated checkout. Printers are often attached so users can walk away with a printed copy of the information. Museum kiosks are not only used to guide patrons through the exhibits but, when installed at each exhibit, provide great added depth, allowing visitors to browse through richly detailed information specific to that display.
LECTURE 42
A digital library can be characterized as:
• a collection of services
• a collection of information objects
• support for users in dealing with those information objects
• organization and presentation of those objects
• available directly or indirectly
• electronic/digital availability
A digital library is much more than just the collection of material in its depositories. It provides a variety of services to all of its users.
that provide the content in the form of digital resource. The goal of the digital library is to satisfy user needs
for management, access, storage and manipulation of the variety of information stored in the collection of
material that represents the holding of the library. The information objects may be digital objects or they
may be in other media but represented in the library via digital means (e.g. metadata). They may be
available directly over the network or indirectly. Although the object may not even be electronic, and
although the objects themselves may not be available directly over the network, they must be represented
electronically in some manner.
There are many definitions of a digital library. Terms such as "electronic library" and "virtual library" are often used synonymously. The elements that have been identified as common to these definitions are:
— The digital library is not a single entity.
— The digital library requires technology to link the resources of many libraries and information services.
— The linkages between the many digital libraries and information services are transparent to the end user.
— Universal access to digital libraries and information services is a goal.
— Digital library collections are not limited to document surrogates; they extend to digital artifacts that cannot be represented or distributed in printed formats.
The aim of a digital library may be to expedite the systematic development of digital resources collection;
the means to collect, store and organize information and knowledge in digital form.
Important characteristics of a digital library are :
i) Digital collection – In the digital environment a digital library is expected to develop document
collection in a digital format.
ii) Technology – It is understood that a digital library will have digital material in its collection.
But in the present day context, both digital and non-digital information belonging to a digital
library are to be handled using digital technologies.
iii) Work and Service – The professionals who work in a digital library should have the necessary training in handling digital information in order to provide the optimum level of effective service.
The most important component of a digital library is the digital collection it holds or has access to. A
digital library can have a wide range of resources. It may contain both paper-based conventional documents and information in computer-processible form. The collection of a digital library may include a combination of structured/unstructured texts, numerical data, scanned images,
graphics, audio and video recordings.
Slot Question & Assignment