
Adbms 2070-2076


Institute of Science and Technology

2070

Bachelor Level/ Fourth Year/ Seventh Semester/ Science Full Marks: 60

Computer Science and Information Technology (CSc. 401) Pass Marks: 24

(Advanced Database Management System) Time: 3 hours

Candidates are required to give their answers in their own words as far as practicable.

The questions are of equal value.

Attempt all questions. (10x6=60)

1.) Explain the following terms:


• Extent

Ans: An extent is a logical unit of database storage space allocation made up of a number of contiguous
data blocks. One or more extents in turn make up a segment. When the existing space in a segment is
completely used, Oracle allocates a new extent for the segment.

When Extents Are Allocated

When you create a table, Oracle allocates to the table's data segment an initial extent of a specified
number of data blocks. Although no rows have been inserted yet, the Oracle data blocks that
correspond to the initial extent are reserved for that table's rows.

If the data blocks of a segment's initial extent become full and more space is required to hold new data,
Oracle automatically allocates an incremental extent for that segment. An incremental extent is a
subsequent extent of the same or greater size than the previously allocated extent in that segment.

For maintenance purposes, the header block of each segment contains a directory of the extents in that
segment.

• Temporal database
Ans: A temporal database is a database that has certain features that support time-sensitive status for
entries. Where some databases are considered current databases and only support factual data
considered valid at the time of use, a temporal database can establish at what times certain entries are
accurate.
A temporal database stores data relating to time instances. It offers temporal data types and stores
information relating to past, present and future time. Its two major temporal aspects are valid time
and transaction time, which can be combined to form bitemporal data.
  • Valid time is the time period during which a fact is true in the real world.
  • Transaction time is the time period during which a fact stored in the database was known.
  • Bitemporal data combines both valid time and transaction time.
It is possible to have timelines other than valid time and transaction time, such as decision time, in
the database. In that case the database is called a multitemporal database as opposed to a bitemporal
database. However, this approach introduces additional complexities such as dealing with the validity of
(foreign) keys.
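
The two time dimensions can be illustrated with a minimal sketch. The `SalaryFact` record, the `as_of` helper, the `FOREVER` sentinel and the sample data are all hypothetical names chosen for this illustration; a real temporal DBMS would provide its own types and query syntax.

```python
from dataclasses import dataclass
from datetime import date

# Sentinel meaning "still true" / "still current" (an assumption of this sketch).
FOREVER = date.max

@dataclass(frozen=True)
class SalaryFact:
    employee: str
    salary: int
    valid_from: date  # valid time: when the fact became true in the real world
    valid_to: date    # valid time: when it stopped being true (exclusive)
    tx_from: date     # transaction time: when the database recorded this row
    tx_to: date       # transaction time: when the row was superseded (exclusive)

def as_of(rows, employee, valid_on, known_on):
    """Salary that was valid on `valid_on`, as the database knew it on `known_on`."""
    for r in rows:
        if (r.employee == employee
                and r.valid_from <= valid_on < r.valid_to
                and r.tx_from <= known_on < r.tx_to):
            return r.salary
    return None

rows = [
    # Recorded on 1 Jan: salary is 1000 from 1 Jan onward.
    SalaryFact("asha", 1000, date(2020, 1, 1), FOREVER, date(2020, 1, 1), date(2020, 3, 1)),
    # Correction recorded on 1 Mar: the salary had actually been 1200 since 1 Feb.
    SalaryFact("asha", 1000, date(2020, 1, 1), date(2020, 2, 1), date(2020, 3, 1), FOREVER),
    SalaryFact("asha", 1200, date(2020, 2, 1), FOREVER, date(2020, 3, 1), FOREVER),
]

# Same real-world date, two different states of database knowledge.
before_correction = as_of(rows, "asha", date(2020, 2, 15), date(2020, 2, 20))
after_correction = as_of(rows, "asha", date(2020, 2, 15), date(2020, 4, 1))
```

Asking about the same valid date at two different transaction times gives two different answers, which is what makes the data bitemporal.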

• Degree of homogeneity of DBMS


Ans: If all servers (or individual local DBMSs) use identical software and all users (clients) use identical
software, the DDBMS is called homogeneous; otherwise, it is called heterogeneous. Another factor
related to the degree of homogeneity is the degree of local autonomy. If there is no provision for the
local site to function as a stand-alone DBMS, then the system has no local autonomy. On the other
hand, if direct access by local transactions to a server is permitted, the system has some degree of local
autonomy.

At one extreme of the autonomy spectrum, we have a DDBMS that "looks like" a centralized DBMS to
the user. A single conceptual schema exists, and all access to the system is obtained through a site that
is part of the DDBMS—which means that no local autonomy exists. At the other extreme we encounter
a type of DDBMS called a federated DDBMS (or a multidatabase system). In such a system, each server
is an independent and autonomous centralized DBMS that has its own local users, local transactions,
and DBA and hence has a very high degree of local autonomy. The term federated database system
(FDBS) is used when there is some global view or schema of the federation of databases that is shared
by the applications. On the other hand, a multidatabase system does not have global schema and
interactively constructs one as needed by the application. Both systems are hybrids between
distributed and centralized systems and the distinction we made between them is not strictly followed.
We will refer to them as FDBSs in a generic sense.

In a heterogeneous FDBS, one server may be a relational DBMS, another a network DBMS, and a third
an object or hierarchical DBMS; in such a case it is necessary to have a canonical system language and
to include language translators to translate subqueries from the canonical language to the language of
each server.

• XPath
• Classification and clustering
Ans: Classification:
Data classification is the process of organizing data into categories for its most effective and efficient
use. A well-planned data classification system makes essential data easy to find and retrieve. This can
be of particular importance for risk management, legal discovery, and compliance. Written procedures
and guidelines for data classification should define what categories and criteria the organization will use
to classify data and specify the roles and responsibilities of employees within the organization
regarding data stewardship. Once a data-classification scheme has been created, security standards
that specify appropriate handling practices for each category and storage standards that define
the data's lifecycle requirements should be addressed.

Clustering:

Clustering, in the context of databases, refers to the ability of several servers or instances to connect to
a single database. An instance is the collection of memory and processes that interacts with a database,
which is the set of physical files that actually store data.
Clustering offers two major advantages, especially in high-volume database environments:

  • Fault tolerance: Because there is more than one server or instance for users to connect to,
clustering offers an alternative in the event of individual server failure.
  • Load balancing: The clustering feature is usually set up to allow users to be automatically
allocated to the server with the least load.

• OLAP
Ans: Stands for "Online Analytical Processing." OLAP allows users to analyze database information from
multiple database systems at one time. While relational databases are considered to be two-
dimensional, OLAP data is multidimensional, meaning the information can be compared in many
different ways. For example, a company might compare their computer sales in June with sales in July,
then compare those results with the sales from another location, which might be stored in a different
database.
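
The sales comparison described above can be sketched as a roll-up over dimensions. The fact rows, the dimension names and the `roll_up` helper are hypothetical and stand in for what an OLAP engine would do over a much larger cube.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, location, product, units_sold)
sales = [
    ("June", "Kathmandu", "computer", 40),
    ("June", "Pokhara", "computer", 25),
    ("July", "Kathmandu", "computer", 55),
    ("July", "Pokhara", "computer", 30),
]
DIMENSIONS = ("month", "location", "product")

def roll_up(facts, by):
    """Sum units_sold over every dimension that is not listed in `by`."""
    idx = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)

# Compare June with July, then break the same figures down by location.
by_month = roll_up(sales, ["month"])
by_month_location = roll_up(sales, ["month", "location"])
```

The same facts are viewed along one dimension and then along two, which is the sense in which OLAP data is multidimensional.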
2.) Draw an ER Diagram for a hospital with a set of patients and set of doctors associated with each
patient a log of various tests and examinations conducted.
Ans:
3.) What is the difference between an object and a….. in the object oriented data
model (OOBM) ?
4.) What are the main differences between designing a relational database and an
object database?
Ans: A relational database stores data in tables made up of rows and columns. Every row has its own
key and every column has a specific name. Database designers refer to a field as an attribute, a record
as a tuple, and a file as a relation, whereas users refer to them as a column, a row, and a table.

An object-oriented database stores data in the form of objects. Each object holds its data together
with the actions (methods) that process or read that data. For example, a Student object would contain
data about a student such as address, first name, last name, ID, fee record, and results.

A relational database relies on the relational model, whereas an object database relies on the
object-oriented programming (OOP) paradigm.

The relational model organizes information in a set of tables, each composed of rows and columns.
Each column represents a property and each row represents an entity.

In an object-oriented database, each element resembles an object from the object-oriented paradigm.
5.) Discuss some applications of active database. How do spatial databases differ from
regular database?
Ans:
Applications that depend on data-monitoring activities can benefit greatly from integration with an
active database. Examples include:
  • Production control, e.g., power plants.
  • Maintenance tasks, e.g., inventory control.
  • Financial applications, e.g., stock and bond trading (program trading).
  • Telecommunication and network management.
  • Air traffic control.
  • Computer-integrated manufacturing (CIM).
  • Medical and financial support systems.
  • Statistics gathering and authorization tools.

A spatial database differs from a regular (non-spatial) database in the following ways:
a) A spatial database supports special data types for geometric objects and allows you to store
geometric data (usually of a geographic nature) in tables, while a non-spatial database does not
support such types.

b) A spatial database provides special functions and indexes for querying and manipulating
geospatial data using a language such as Structured Query Language (SQL), while a non-spatial
database does not provide such functions and indexes.

c) A spatial database is often used as a storage container for geospatial data, but it can do
much more than that, whereas a non-spatial database is often used only as a storage container for
non-spatial data.

d) A spatial database uses spatial queries with geometric functions to answer questions about
space and objects in space, while a non-spatial database does not support spatial queries.

e) In addition to being able to answer questions about the use of space, spatial database
functions allow you to create and modify objects in space. This portion of spatial analysis is
often referred to as geometric or spatial processing.

f) A spatially enabled database can intrinsically work with data types like rivers (modeled as line
strings), land parcels (modeled as polygons), and trees (modeled as points), while a non-spatial
database cannot work with these kinds of models.
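
As an illustration of the kind of geometric function a spatial database supplies, the sketch below tests whether a tree (a point) lies inside a land parcel (a polygon). It uses the standard ray-casting method in plain Python; the function name and the coordinates are hypothetical, and a real spatial database would expose this as a built-in containment predicate backed by a spatial index.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside

parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]  # land parcel modeled as a polygon
tree_a = (1, 1)                            # tree modeled as a point
tree_b = (5, 1)

a_inside = point_in_polygon(tree_a, parcel)
b_inside = point_in_polygon(tree_b, parcel)
```

A non-spatial database would store the coordinates as ordinary numbers and leave this computation to the application.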

6.) Write a schema that provides tags for a person’s first name, last name, weight, and
shoe size. Weight and shoe size tags should have attributes to designate measuring
systems.
7.) Distinguish between structured and unstructured complex objects.
Ans: See the solution to Q.5 of 2071.
8.) What is data warehouse? List the characteristics of data warehouse.
Ans: A data warehouse is a federated repository for all the data that an enterprise's various business
systems collect. The repository may be physical or logical.
Data warehousing emphasizes the capture of data from diverse sources for useful analysis and access,
but does not generally start from the point-of-view of the end user who may need access to specialized,
sometimes local databases. The latter idea is known as the data mart.
Characteristics of Data Warehouse:
i. Subject-oriented:
The warehouse organizes data around the essential subjects of the business (customers and products)
rather than around applications such as inventory management or order processing.

ii. Integrated:
It is consistent in the way that data from several sources is extracted and transformed. For example,
coding conventions are standardized: M = male, F = female.

iii. Time-variant:
Data are organized by various time periods (e.g. months).

iv. Non-volatile:
The warehouse's database is not updated in real time. There is periodic bulk uploading of transactional
and other data. This makes the data less subject to momentary change.

There are a number of steps and processes in building a warehouse. First, you must identify where the
relevant data is stored. This can be a challenge. When the Commonwealth Bank opted to implement CRM
in its retail banking business, it found that relevant customer data were resident on over 80 separate
systems.
9.) What are the advantages and disadvantages of extending the relational data model
by means of ORDBMS?
Ans: Advantages and Disadvantages of ORDBMSs
ORDBMSs can provide appropriate solutions for many types of advanced database applications.
However, there are also disadvantages.

Advantages of ORDBMSs
ORDBMSs have the following advantages:
Reuse and Sharing: The main advantages of extending the relational data model come from reuse and
sharing. Reuse comes from the ability to extend the DBMS server to perform standard functionality
centrally, rather than have it coded in each application.
Increased Productivity: An ORDBMS provides increased productivity both for the developer and for the
end user.
Use of experience in developing RDBMS: Another obvious advantage is that the extended relational
approach preserves the significant body of knowledge and experience that has gone into developing
relational applications. This is a significant advantage, as many organizations would find it prohibitively
expensive to change. If the new functionality is designed appropriately, this approach should allow
organizations to take advantage of the new extensions in an evolutionary way without losing the
benefits of current database features and functions.
Disadvantages of ORDBMSs
The ORDBMS approach has the obvious disadvantages of complexity and associated increased costs.
Further, there are proponents of the relational approach who believe that the essential simplicity and
purity of the relational model are lost with these types of extension.
ORDBMS vendors are attempting to portray object models as extensions to the relational model with
some additional complexities. This potentially misses the point of object orientation, highlighting the
large semantic gap between these two technologies. Object applications are simply not as data-centric
as relational-based ones.

10.) Enumerate the limitations of conventional database compared to multimedia
database.

Tribhuvan University

Institute of Science and Technology

2071

Bachelor Level/ Fourth Year/ Seventh Semester/ Science Full Marks: 60

Computer Science and Information Technology (CSc. 401) Pass Marks: 24

(Advanced Database Management System) Time: 3 hours

(NEW COURSE)

Candidates are required to give their answers in their own words as far as practicable.

The questions are of equal value.

Attempt all questions. (10x6=60)

1. Explain the following terms :
a. Data Warehouse

Ans: A data warehouse is a relational database that is designed for query and analysis rather
than for transaction processing. It usually contains historical data derived from transaction data,
but it can include data from other sources. It separates analysis workload from transaction
workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction,
transportation, transformation, and loading (ETL) solution, an online analytical processing
(OLAP) engine, client analysis tools, and other applications that manage the process of gathering
data and delivering it to business users.

b. Distribution Transparency
Ans: Distribution transparency is the property of distributed databases by virtue of which
the internal details of the distribution are hidden from the users. The DDBMS designer may
choose to fragment tables, replicate the fragments and store them at different sites. However,
since users are oblivious of these details, they find the distributed database easy to use like any
centralized database.

The three dimensions of distribution transparency are:

  • Location transparency
  • Fragmentation transparency
  • Replication transparency

c. XQuery
d. Distributed transaction
Ans: A distributed transaction is a database transaction in which two or more network hosts
are involved. Usually, hosts provide transactional resources, while the transaction
manager is responsible for creating and managing a global transaction that encompasses all
operations against such resources. Distributed transactions, as any other transactions, must
have all four ACID (atomicity, consistency, isolation, durability) properties, where atomicity
guarantees all-or-nothing outcomes for the unit of work (operations bundle).
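
The all-or-nothing outcome across several hosts can be sketched as follows. The answer above does not name a protocol; this sketch uses two-phase commit, one common choice, and the `Participant` class and `run_global_transaction` function are hypothetical names for the transactional resources and the transaction manager.

```python
class Participant:
    """A host that provides a transactional resource (illustrative only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local part of the work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def run_global_transaction(participants):
    """Transaction manager: commit everywhere or nowhere."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2a: every host voted yes, so all of them commit.
        for p in participants:
            p.commit()
        return "committed"
    # Phase 2b: at least one host voted no, so all of them roll back.
    for p in participants:
        p.rollback()
    return "aborted"

ok_hosts = [Participant("bank_a"), Participant("bank_b")]
ok_outcome = run_global_transaction(ok_hosts)

bad_hosts = [Participant("bank_a"), Participant("bank_b", can_commit=False)]
bad_outcome = run_global_transaction(bad_hosts)
```

The sketch leaves out failure handling such as timeouts and recovery logs, which a real transaction manager needs.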

e. Knowledge base
Ans: In general, a knowledge base is a centralized repository for information: a public library,
a database of related information about a particular subject, and whatis.com could all be
considered to be examples of knowledge bases. In relation to information technology (IT), a
knowledge base is a machine-readable resource for the dissemination of information,
generally online or with the capacity to be put online. An integral component of knowledge
management systems, a knowledge base is used to optimize information collection,
organization, and retrieval for an organization, or for the general public.
f. Classification and clustering
2. Distinguish multiple inheritance and selective inheritance in OO concepts.

3. Define state of an object. Distinguish between persistent and transient objects.


Ans: The state of an object is a condition or situation during its life in which it satisfies some
condition, performs some activity, or waits for some event.

  • Persistent objects are those that are stored in the database (objects created using abstract
data types, varrays, nested tables, etc.). They can be used both with SQL commands and in
PL/SQL blocks. They reside in the data dictionary and remain available to the user
until they are deleted explicitly. They can be implemented as tables, columns or
attributes. More generally, a persistent object is one that outlives the process in which it is
created. Note that this does not by itself mean that the object is stored in a database or that any
recovery is guaranteed. Rather, it means that the lifetime of such objects persists across server
process activation and deactivation cycles.

  • Transient objects exist only within the scope of a PL/SQL block. They are automatically
deallocated once they go out of the scope of the PL/SQL block. Examples of transient objects are
PL/SQL variables. Transient objects have a lifetime bounded by the lifetime of the process in
which they are created.
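
The difference in lifetime can be sketched outside Oracle as well. In this hypothetical Python illustration, a JSON file stands in for the database, and deleting the in-memory references stands in for the end of the creating process; the `Account` class and the `store` and `fetch` helpers are assumptions of the sketch.

```python
import json
import os
import tempfile

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

def store(account, path):
    """Write the object's state to stable storage so it can outlive this process."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vars(account), f)

def fetch(path):
    """Rebuild an object from its stored state."""
    with open(path, encoding="utf-8") as f:
        return Account(**json.load(f))

path = os.path.join(tempfile.mkdtemp(), "account.json")

persistent = Account("Gita", 250)
store(persistent, path)           # made persistent: its state now exists outside memory
transient = Account("Hari", 100)  # transient: exists only in this process's memory

# Simulate the end of the creating process by dropping both in-memory objects.
del persistent, transient

restored = fetch(path)            # only the stored object can be recovered
```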

4. Discuss how time is represented in temporal databases and compare the
different time dimensions.
Ans: Temporal data stored in a temporal database differs from the data stored in a non-temporal
database in that a time period attached to the data expresses when it was valid or stored in the
database. Conventional databases consider the data stored in them to be valid at the time instant
"now"; they do not keep track of past or future database states. By attaching a time
period to the data, it becomes possible to store different database states. In a temporal
database, time is represented as an ordered sequence of points at some granularity, which is
determined by the application. Temporal database researchers use the term 'chronon' instead of
'point' to describe the minimal granularity for a particular application.

5. What is the difference between structured and unstructured complex object?
Differentiate identical versus equal objects with examples.
Ans:
Structured Data:
Structured data is highly organized information that uploads neatly into a relational database (think
traditional row database structures), lives in fixed fields, and is easily detectable via search operations
or algorithms. Structured data is relatively simple to enter, store, query, and analyze, but it must be
strictly defined in terms of field name and type (e.g. alpha, numeric, date, currency), and as a result is
often restricted by character numbers or specific terminology. Analysts typically use simple or more
complex VLOOKUP queries in Excel spreadsheets or Structured Query Language (SQL) to perform
queries on structured data within relational databases.
Structured data leaves out immense amounts of material that do not fit simply into a firm’s
organization of information. Until recently, structured data was supplemented by this additional
information in the form of paper or microfiche. With the improvement of processing by computers,
lowered cost of data storage, and the spread of new formats of data, the age of unstructured data
began. Now, structured and unstructured data must both be consulted, queried, assimilated and
leveraged to make the best business decisions.
Unstructured Data:
Unstructured data may have its own internal structure, but does not fit neatly into a
spreadsheet or database. While unruly in nature, it is also incredibly valuable and increasingly
available in the form of complex data sources, such as web logs, multimedia content, email, customer
service interactions, sales automation, and social media data. Most business interactions, in fact, are
unstructured in nature.
The fundamental challenge of unstructured data sources is that they are difficult for nontechnical
business users and data analysts alike to unbox, understand, and prepare for analytic use. Beyond
issues of structure, is the sheer volume of this type of data. Because of this, current data mining
techniques often leave out valuable information and make analyzing unstructured data laborious and
expensive.

6. What are the advantages and disadvantages of OODBMS?


Ans: Advantages and Disadvantages of OODBMSs

Enriched modeling capabilities

The object-oriented data model allows the 'real world' to be modeled more closely. The object,
which encapsulates both state and behavior, is a more natural and realistic representation of real-
world objects. An object can store all the relationships it has with other objects, including many-to-
many relationships, and objects can be formed into complex objects that the traditional data
models cannot cope with easily.

Extensibility

OODBMSs allow new data types to be built from existing types. The ability to factor out common
properties of several classes and form them into a superclass that can be shared with subclasses can
greatly reduce redundancy within the system and is regarded as one of the main advantages of object
orientation. Further, the reusability of classes promotes faster development and easier
maintenance of the database and its applications.

Capable of handling a large variety of data types

Unlike traditional databases (such as hierarchical, network or relational), object-oriented
databases are capable of storing different types of data, for example pictures, voice and video, as
well as text, numbers and so on.

Removal of impedance mismatch

A single language interface between the Data Manipulation Language (DML) and the programming
language overcomes the impedance mismatch. This eliminates many of the inefficiencies that occur in
mapping a declarative language such as SQL to an imperative language such as C. Most OODBMSs
provide a DML that is computationally complete compared with SQL, the standard language of
RDBMSs.

More expressive query language

Navigational access from the object is the most common form of data access in an OODBMS. This is
in contrast to the associative access of SQL (that is, declarative statements with selection based on
one or more predicates). Navigational access is more suitable for handling parts explosion, recursive
queries, and so on.
Support for schema evolution

The tight coupling between data and applications in an OODBMS makes schema evolution more
feasible.

Support for long-duration transactions

Current relational DBMSs enforce serializability on concurrent transactions to maintain database
consistency. OODBMSs use a different protocol to handle the types of long-duration transaction
that are common in many advanced database applications.

Applicability to advanced database applications

There are many areas where traditional DBMSs have not been particularly successful, such as,
Computer-Aided Design (CAD), Computer-Aided Software Engineering (CASE), Office Information
System(OIS), and Multimedia Systems. The enriched modeling capabilities of OODBMSs have made
them suitable for these applications.

Improved performance

There have been a number of benchmarks that have suggested OODBMSs provide significant
performance improvements over relational DBMSs. The results showed an average 30-fold
performance improvement for the OODBMS over the RDBMS.
Disadvantages:
OODBMSs have the following disadvantages:
Lack of universal data model: There is no universally agreed data model for an OODBMS, and most
models lack a theoretical foundation. This disadvantage is seen as a significant drawback, and is
comparable to pre-relational systems.
Lack of experience: In comparison to RDBMSs, the use of OODBMSs is still relatively limited. This
means that we do not yet have the level of experience that we have with traditional systems.
OODBMSs are still very much geared towards the programmer, rather than the naïve end-user. Also,
there is resistance to the acceptance of the technology. While the OODBMS is limited to a small
niche market, this problem will continue to exist.
Lack of standards: There is a general lack of standards for OODBMSs. We have already mentioned
that there is no universally agreed data model. Similarly, there is no standard object-oriented query
language.
Competition: Perhaps one of the most significant issues that face OODBMS vendors is the
competition posed by the RDBMS and the emerging ORDBMS products. These products have an
established user base with significant experience available. SQL is an approved standard, the
relational data model has a solid theoretical foundation, and relational products have many
supporting tools to help both end-users and developers.
Query optimization compromises encapsulation: Query optimization requires an understanding of
the underlying implementation to access the database efficiently. However, this compromises the
concept of encapsulation.
Locking at object level may impact performance: Many OODBMSs use locking as the basis for
the concurrency control protocol. However, if locking is applied at the object level, locking of an
inheritance hierarchy may be problematic, as well as impacting performance.
Complexity: The increased functionality provided by the OODBMS (such as the illusion of a single-
level storage model, pointer swizzling, long-duration transactions, version management, and schema
evolution) makes the system more complex than traditional DBMSs. Increased complexity leads to
products that are more expensive and more difficult to use.
Lack of support for views: Currently, most OODBMSs do not provide a view mechanism, which, as
we have seen previously, provides many advantages such as data independence, security, reduced
complexity, and customization.
Lack of support for security: Currently, OODBMSs do not provide adequate security mechanisms.
The user cannot grant access rights on individual objects or classes.
If OODBMSs are to expand fully into the business field, these deficiencies must be rectified.

7. What are the differences and similarities between objects and literals in the
ODMG object model?
Ans: Objects and literals are the basic building blocks of the ODMG object model.
The differences are:
a) An object has both an object identifier (OID) and a state; a literal has a value but no object
identifier.
b) An object's state can change over time by modifying the object's value; a literal is basically a
constant value that does not change.
c) Objects are identified by their OIDs, whereas literals are identified by their values.
d) An object has a lifetime, which depends on whether it is a persistent or a transient object;
lifetime is not applicable to a literal.
e) Copying an object results in a shallow copy, whereas copying a literal results in a logical copy.
The similarity is that both objects and literals can be atomic or structured.
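
The identity-versus-value distinction can be sketched in Python, with the caveat that this is an analogy rather than the ODMG binding: Python's object identity (tested with `is`) plays the role of the OID, an immutable tuple plays the role of a literal, and the `Department` class is hypothetical.

```python
import copy
from dataclasses import dataclass

@dataclass
class Department:  # a dataclass compares instances by their field values with ==
    name: str
    budget: int

d1 = Department("CS", 100)
d2 = copy.copy(d1)                 # a second object with the same state

equal_before = d1 == d2            # same state, so the objects are equal
identical_before = d1 is d2        # different identities, so they are not identical

d2.budget = 200                    # an object's state can change; its identity does not
equal_after = d1 == d2

# A literal-like value: immutable and known only by its value.
lit1 = (2070, "ADBMS")
lit2 = (2070, "ADBMS")
literals_equal = lit1 == lit2
```

This also illustrates identical versus equal objects: two objects are identical when they share one identity, and equal when their states match.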
8. Describe the main reasons for the potential advantage for distributed database.
What additional functions does it have over centralized DBMS?
Ans:
9. Describe the characteristics of mobile computing environment in detail.
Ans: Mobile computing is human–computer interaction by which a computer is expected to be
transported during normal usage, which allows for transmission of data, voice and video. Mobile
computing involves mobile communication, mobile hardware, and mobile software. Communication
issues include ad hoc networks and infrastructure networks as well as communication
properties, protocols, data formats and concrete technologies. Hardware includes mobile devices or
device components. Mobile software deals with the characteristics and requirements of mobile
applications.
Characteristics of mobile computing are:

  • Portability: Devices/nodes connected within the mobile computing system should facilitate
mobility. These devices may have limited device capabilities and limited power supply, but should
have sufficient processing capability and physical portability to operate in a movable environment.
  • Connectivity: This defines the quality of service (QoS) of the network connectivity. In a mobile
computing system, the network availability is expected to be maintained at a high level with a
minimal amount of lag/downtime, without being affected by the mobility of the connected nodes.
  • Interactivity: The nodes belonging to a mobile computing system are connected with one another
to communicate and collaborate through active transactions of data.
  • Individuality: A portable device or a mobile node connected to a mobile network often denotes an
individual; a mobile computing system should be able to adapt the technology to cater to
individual needs and also to obtain contextual information about each node.
10. Differentiate between XML schema and XML DTD with suitable example.
Ans: The critical difference between DTDs and XML Schema is that XML Schema utilizes an XML-
based syntax, whereas DTDs have a unique syntax held over from SGML DTDs. Although DTDs are
often criticized because of this need to learn a new syntax, the syntax itself is quite terse. The
opposite is true for XML Schema, which are verbose, but also make use of tags and XML so that
authors of XML should find the syntax of XML Schema less intimidating.
Differences between an XML Schema Definition (XSD) and a Document Type Definition (DTD) include:

  • XML schemas are written in XML, while DTDs use a syntax derived from SGML.
  • XML schemas define datatypes for elements and attributes, while DTDs have no comparable
datatype system.
  • XML schemas support namespaces, while DTDs do not.
  • XML schemas can state exact occurrence counts for child elements (minOccurs and maxOccurs),
while DTDs can express order but only coarse repetition (?, * and +).
  • Because XML schemas are XML documents, they can be processed with standard XML tools such as
the XML DOM; this is not possible with DTDs.
  • With XML Schema the user need not learn a new syntax, whereas working with DTDs requires
learning a separate one.
  • XML Schema supports more reliable data communication: the sender can describe the data in a way
that the receiver will understand, whereas with a DTD the data can be misunderstood by the receiver.
  • XML schemas are extensible, while DTDs are not.
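
As a small example, the sketch below describes the same hypothetical `person` element once as a DTD and once as an XML Schema, then feeds both to an ordinary XML parser. Only the schema parses, which shows the first difference in practice. The element names and the `is_well_formed_xml` helper are assumptions of this sketch; it checks well-formedness only and does not validate documents against either definition.

```python
import xml.etree.ElementTree as ET

# DTD version: compact, but written in its own non-XML syntax.
DTD_TEXT = """<!ELEMENT person (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>"""

# XML Schema version: more verbose, but itself an XML document with datatypes.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"

def is_well_formed_xml(text):
    """True if a plain XML parser accepts the text as a document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

xsd_is_xml = is_well_formed_xml(XSD_TEXT)
dtd_is_xml = is_well_formed_xml(DTD_TEXT)

# Because the schema is XML, ordinary XML tooling can inspect it.
schema_root = ET.fromstring(XSD_TEXT)
declared_types = [e.get("type") for e in schema_root.iter(XS + "element") if e.get("type")]
```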

Tribhuvan University

Institute of Science and Technology

2072
Bachelor Level/ Fourth Year/ Seventh Semester/ Science Full Marks: 60

Computer Science and Information Technology (CSc. 401) Pass Marks: 24

(Advanced Database Management System) Time: 3 hours

(NEW COURSE)

Candidates are required to give their answers in their own words as far as practicable.

The questions are of equal value.

Attempt all questions. (10x6=60)

1. Explain the following terms:

a. Database performance tuning

Ans: A fundamental requirement of running a mission-critical application is being able to achieve and
sustain high performance with your database.

Database performance tuning encompasses the steps you can take to optimize performance with the
goal of maximizing the use of system resources for greater efficiency. By fine-tuning certain database
elements such as index use, query structure, data models, system configuration (e.g., hardware and OS
settings), and application design, you can significantly impact the overall performance of your
application.

MongoDB is a database built for high performance deployments at scale. For this reason and more,
thousands of organizations and over a third of the Fortune 100 count on MongoDB to help them deliver
innovative, lower cost applications that result in competitive advantage.

b. UML
Ans: UML is an acronym that stands for Unified Modeling Language. Simply put, UML is a modern
approach to modeling and documenting software. In fact, it’s one of the most popular business process
modeling techniques.

It is based on diagrammatic representations of software components. As the old proverb says: “a
picture is worth a thousand words”. By using visual representations, we are able to better understand
possible flaws or errors in software or business processes.
c. Subclass vs Superclass

d. XQuery

Ans: See in 2071[Q.1.c]

e. Calendars

f. Active Database
Ans: An active database is a database that includes an event-driven architecture (often in the
form of ECA rules) which can respond to conditions both inside and outside the database. Possible
uses include security monitoring, alerting, statistics gathering and authorization.

2. What are query optimization techniques? Explain.

3. Differentiate between specialization and generalization with example.


Ans:

GENERALIZATION | SPECIALIZATION
It proceeds in a bottom-up manner. | It proceeds in a top-down manner.
Generalization extracts the common features of multiple entities to form a new entity. | Specialization splits an entity to form multiple new entities that inherit some features of the splitting entity.
The higher level entity must have lower level entities. | The higher level entity may not have lower level entities.
Generalization reduces the size of a schema. | Specialization increases the size of a schema.
Generalization is applied on a group of entities. | Specialization is applied on a single entity.
Generalization results in forming a single entity from multiple entities. | Specialization results in forming multiple entities from a single entity.

4. How do single inheritance, multiple inheritance and selective inheritance differ?
Ans:

BASIS FOR COMPARISON | SINGLE INHERITANCE | MULTIPLE INHERITANCE
Basic | Derived class inherits a single base class. | Derived class inherits two or more base classes.
Implementation | class derived_class : access_specifier base_class | class derived_class : access_specifier base_class1, access_specifier base_class2, ....
Access | Derived class accesses the features of a single base class. | Derived class accesses the combined features of the inherited base classes.
Visibility | Public, Private, Protected | Public, Private, Protected
Run time | Requires a small amount of run-time overhead. | Requires additional run-time overhead compared to single inheritance.

Selective inheritance, in contrast, occurs when a subclass inherits only some of the functions of its superclass; the other functions are not inherited.
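The comparison can be sketched in code. The table's implementation row uses C++ syntax; the example below uses Python purely for illustration, and the class names (`Person`, `Student`, `Employee`, `TeachingAssistant`) are hypothetical.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"


class Student(Person):
    """Single inheritance: exactly one base class."""

    def study(self):
        return f"{self.name} studies"


class Employee(Person):
    """Also single inheritance, used as a second base below."""

    def work(self):
        return f"{self.name} works"


class TeachingAssistant(Student, Employee):
    """Multiple inheritance: combines the features of two base classes."""


ta = TeachingAssistant("Asha")

single_bases = [b.__name__ for b in Student.__bases__]
multiple_bases = [b.__name__ for b in TeachingAssistant.__bases__]

# The method resolution order shows the extra lookup work that
# multiple inheritance introduces compared with a single chain.
mro = [c.__name__ for c in TeachingAssistant.__mro__]
```

`ta` can call both `study()` and `work()`, which is the "combined features" row of the table; the longer resolution order hints at why multiple inheritance carries more run-time overhead.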

5. What are the differences between structured and unstructured complex objects? Explain.
Ans: See 2071[Q.5]

6. What are the object relational features that have been included in SQL-99?
Ans: SQL-1999 introduced object support into the SQL standard. The SQL-1999 standard had to be
backward compatible with the existing SQL-1992 standard, so object support was implemented as
an extension to the existing standard. The types defined by SQL-1992 were retained, and the
standard modified to support user defined types (UDT) with object-like features.

Several new types were introduced by the SQL-1999 standard:

 A reference type,
 Distinct types,
 Structured types.
A reference type is essentially an object identifier (OID) which can be used to uniquely identify an
instance of an object, and is used to point to another type. This implies that a structured type
is being pointed to.

Distinct types provide the ability to ‘rename’ an existing pre-defined type. For example, you could
define a type METERS to be an INTEGER. You could similarly define FEET as INTEGER. The
benefit of this is that the database does not allow type mixing, so adding FEET to METERS without
conversion would result in an error.
Structured types are more interesting, as a set of data and associated methods can be grouped
into a user defined type. For example, you could define an address type, which would contain
street number, street, state, zip, country, etc. You could also, depending on the application, define
associated functionality.
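The behaviour of distinct and structured types can be imitated outside SQL. The sketch below is a Python analogy, not SQL-1999 syntax: `Meters`, `Feet`, `Address` and their methods are hypothetical names, and the rounding in `to_meters` is an assumption made to stay with integer values as in the METERS/FEET example.

```python
from dataclasses import dataclass


class DistinctInt:
    """Base for distinct types defined over an integer."""

    def __init__(self, value):
        self.value = int(value)

    def __add__(self, other):
        # Only values of exactly the same distinct type may be added.
        if type(other) is not type(self):
            return NotImplemented
        return type(self)(self.value + other.value)


class Meters(DistinctInt):
    pass


class Feet(DistinctInt):
    def to_meters(self):
        # Explicit conversion; rounding to whole meters is an assumption.
        return Meters(round(self.value * 0.3048))


@dataclass
class Address:
    """Structured type: data grouped with associated functionality."""
    street_number: int
    street: str
    state: str
    zip_code: str
    country: str

    def label(self):
        return f"{self.street_number} {self.street}, {self.state} {self.zip_code}, {self.country}"


try:
    Meters(3) + Feet(10)          # type mixing is rejected
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

total = Meters(3) + Feet(10).to_meters()   # allowed after conversion
```

Both classes wrap the same underlying integer, yet the addition fails until an explicit conversion is applied, which is the point of a distinct type.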

7. Discuss how time is represented in temporal database and compare different time dimensions.
Ans:

8. What are the difference and similarities between objects and literals in
the ODMG Object Model?

Ans: See in 2071[Q.7]

9. Describe multimedia database and what are the different types of multimedia data that are available in current systems?

Ans: A Multimedia database (MMDB) is a collection of related multimedia data.

The multimedia data include one or more primary media data types such as text, images, graphic
objects (including drawings, sketches and illustrations), animation sequences, audio and video.
There are a number of data types that can be characterized as multimedia data types. These are
typically the elements or the building blocks of more generalized multimedia environments,
platforms, or integrating tools. The basic types can be described as follows:

1. Text: The form in which text is stored can vary greatly. In addition to ASCII based
files, text is typically stored in word processor files, spreadsheets, databases and annotations on
more general multimedia objects. With the availability and proliferation of GUIs and text fonts, the
job of storing text is becoming more complex, since special effects (color, shades, ...) must be stored too.
2. Images: There is great variance in the quality and storage size of still images. Digitized
images are sequences of pixels that represent a region in the user's graphical display. The
space overhead for still images varies on the basis of resolution, size, complexity, and the
compression scheme used to store the image. Popular image formats are jpg, png, bmp and tiff.
3. Audio: An increasingly popular datatype being integrated into most applications is audio. It is
quite space intensive. One minute of sound can take up 2-3 MB of space. Several
techniques are used to compress it into a suitable format.
4. Video: One of the most space consuming multimedia data types is digitized video.
Digitized videos are stored as sequences of frames. Depending upon its resolution and size, a
single frame can consume up to 1 MB. Also, to have realistic video playback, the transmission,
compression, and decompression of digitized video require a continuous transfer rate.
5. Graphic Objects: These consist of special data structures used to define 2D and 3D shapes
through which we can define multimedia objects. These include the various formats used by
image and video editing applications.

10. Explain XML schema and XML DTD.


Ans: An XML Schema is a language for expressing constraints about XML documents. There are
several different schema languages in widespread use, but the main ones are Document Type
Definitions (DTDs), Relax-NG, Schematron and W3C XSD (XML Schema Definitions). DTDs and
W3C XSD are the primary schema languages defined at W3C.

A Schema can be used:

 to provide a list of elements and attributes in a vocabulary;


 to associate types, such as integer, string, etc., or more specifically such as hatsize, sock_colour,
etc., with values found in documents;
 to constrain where elements and attributes can appear, and what can appear inside those
elements, such as saying that a chapter title occurs inside a chapter, and that a chapter must
consist of a chapter title followed by one or more paragraphs of text;
 to provide documentation that is both human-readable and machine-processable;
 to give a formal description of one or more documents.
The XML Document Type Definition, commonly known as DTD, is a way to describe an XML language
precisely. DTDs check the vocabulary and validity of the structure of XML documents against the
grammatical rules of the appropriate XML language.

An XML DTD can be either specified inside the document, or it can be kept in a separate document and
then linked separately.

Basic syntax of a DTD is as follows −

<!DOCTYPE element DTD identifier
[
   declaration1
   declaration2
   ........
]>

Internal DTD
A DTD is referred to as an internal DTD if the elements are declared within the XML file. To mark it as an
internal DTD, the standalone attribute in the XML declaration must be set to yes. This means the declaration
works independently of any external source.

Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element [element-declarations]>

External DTD
In an external DTD, elements are declared outside the XML file. They are accessed by specifying the
SYSTEM identifier, which may be either a legal .dtd file or a valid URL. To mark it as an external
DTD, the standalone attribute in the XML declaration must be set to no. This means the declaration includes
information from an external source.

Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
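To make the internal DTD syntax concrete, here is a small sketch that embeds a DTD in a document and inspects the DOCTYPE with Python's `xml.dom.minidom`. The `address` vocabulary and its content are hypothetical, and note that this standard-library parser reads the DOCTYPE but does not validate the document against it.

```python
from xml.dom import minidom

# Document with an internal DTD: the declarations sit between "[" and "]".
DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE address [
  <!ELEMENT address (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<address>
  <name>Sita</name>
  <phone>555-0100</phone>
</address>
"""

doc = minidom.parseString(DOC)
doctype = doc.doctype

root_name = doctype.name                    # name given after <!DOCTYPE
is_external = doctype.systemId is not None  # a SYSTEM "file-name" would set this

# The DOM exposes the inline declarations as text (may be empty or None).
internal_subset = doctype.internalSubset or ""
```

For an external DTD the DOCTYPE would instead read `<!DOCTYPE address SYSTEM "file-name">`, and `systemId` would carry that file name.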
©ASCOL CSIT

Tribhuvan University

Institute of Science and Technology

2073

Bachelor Level/ Fourth Year/ Seven Semester/ Science Full Marks: 60

Computer Science and Information Technology (CSc. 401) Pass Marks: 24

(Advanced Database Management System) Time: 3 hours

(NEW COURSE)

Candidates are required to give their answers in their own words as far as practicable.

The questions are of equal value.


Attempt all questions. (10x6=60)

1. Discuss different constraints of specialization and generalization.


Ans:
There are three constraints that may apply to a specialization/generalization: membership constraints, disjoint
constraints and completeness constraints.
 Membership constraints
Condition defined: Membership of a specialization/generalization relationship can be defined as a condition
in the requirements
User defined: Sometimes the designer can define the superclass-subclass relationship. This can be done to
simplify the design model or represent a complex relationship that exists between entities.
 Disjoint constraints
Disjoint: The disjoint constraint only applies when a superclass has more than one subclass. If the subclasses
are disjoint, then an entity occurrence can be a member of only one of the subclasses, e.g.

Figure 7.13
Overlapping: This applies when an entity occurrence may be a member of more than one subclass, e.g.

Figure 7.14
 Completeness constraints
Total: Each superclass occurrence (higher-level entity) must belong to at least one subclass (lower-level entity set), e.g. a student
must be postgrad or undergrad. To represent completeness in the specialization/generalization relationship,
the keyword Mandatory is used.
Figure 7.15
Partial: Some superclass occurrences may not belong to any subclass (lower-level entity set), e.g. some people at UCT are
neither student nor staff. The keyword Optional is used to represent a partial specialization/generalization
relationship.

Figure 7.16
We can show both disjoint and completeness constraints in the ER diagram. Following our examples, we can combine
disjoint and completeness constraints.

Figure 7.17
Some members of a university are both students and staff. Not all members of the university are staff and students.
Figure 7.18
A student in the university must be either an undergraduate or postgraduate, but not both.

2. Draw an ER diagram for a hospital with a set of patients and a set of doctors. Associated with each patient a log of various tests and
examinations conducted.
Ans: see in Q(2) 2070

3. Define encapsulation. How is it used to create abstract data types?
Ans: In general, encapsulation is the inclusion of one thing within another thing so that the
included thing is not apparent. Decapsulation is the removal or the making apparent a thing
previously encapsulated.
Encapsulation is the object model concept of including processing or behavior with the object instances
defined by the class. Encapsulation allows code and data to be packaged together.

The definition of methods for a class is an integral part of encapsulation. A method is programming
code that performs the behavior an object instance can exhibit. Calculating the age of a person would
be an example of such behavior. The figure shows a way of looking at encapsulating the age method
with an instance object. The code for the age method is "attached" to or encapsulated with the object
rather than part of the application.

An object-oriented database must provide support for all data types, not just the built-in data
types such as character, integer, and float. To understand abstract data types, let's take two steps back
by taking off the words abstract and then data from abstract data type. We now have a type, which is
defined as a collection of values. A simple example is the Integer type, which consists of the
values 0, 1, 2, 3, etc. If we add the word data back in, we define a data type as a type together with the set of
operations that manipulate it. Extending the integer example, an integer variable is a member of the
integer data type, and addition, subtraction, and multiplication are examples of operations that can be
performed on it.

If we now add the word abstract back in, we can define an abstract data type (ADT) as a data type, that
is, a type and the set of operations that manipulate it, in which the operations are defined only
by their inputs and outputs. The ADT does not specify how the data type will be implemented; all of the
ADT's details are hidden from its user. This process of hiding the details is called
encapsulation. If we extend the integer example to an abstract data type, the
operations might be to delete an integer, add an integer, print an integer, and check whether a certain
integer exists. Notice that we do not care how an operation is carried out, only how to invoke it.
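The two ideas in this answer, behaviour packaged with the object and an ADT whose representation is hidden, can be sketched as follows. The class and method names are hypothetical, and the choice of a Python `set` as the hidden storage is only an assumption; callers never depend on it.

```python
from datetime import date


class Person:
    """The age behaviour is encapsulated with the object, not the application."""

    def __init__(self, name, birth_date):
        self.name = name
        self.birth_date = birth_date

    def age(self, today):
        years = today.year - self.birth_date.year
        # Subtract one year if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.birth_date.month, self.birth_date.day):
            years -= 1
        return years


class IntegerCollection:
    """ADT: users see add/delete/contains/show, never the storage."""

    def __init__(self):
        self._items = set()   # hidden detail; a list or tree would work too

    def add(self, value):
        self._items.add(int(value))

    def delete(self, value):
        self._items.discard(int(value))

    def contains(self, value):
        return int(value) in self._items

    def show(self):
        return ", ".join(str(v) for v in sorted(self._items))


numbers = IntegerCollection()
for n in (3, 1, 2):
    numbers.add(n)
numbers.delete(1)

person_age = Person("Ram", date(2000, 5, 20)).age(date(2024, 5, 19))
```

Swapping the internal `set` for another structure would not change any calling code, which is the practical benefit of defining operations only by their inputs and outputs.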

4. What is versioning? Why is it important? What is the difference between versions and configurations?
Ans: Versioning is the creation and management of multiple releases of a product, all of which have the
same general function but are improved, upgraded or customized. The term applies especially
to operating systems (OSs), software and Web services. Version control is the practice of ensuring
collaborative data sharing and editing among users of systems that employ different versions of a
product. The terms "versioning" and "version control" are sometimes used interchangeably even
though their technical meanings are different.

Configuration control refers to setting runtime dependencies and we often discuss "configuring" an
application to run. An example would be a JMX control or even more basic - specifying whether you are
accessing a QA/UAT or production database. There are lots of jobs out there where you focus on
configuration management in the sense of configuring a package to run (actually customizing the
runtime experience). This is often done through XML or properties files such as an application server
(e.g. WebSphere).
Version control refers to checking in and storing specific versions of the source code, and today there is a
real difference between configuration control and version control. Years ago the terms were used
almost interchangeably, although back then (around the 80s and early 90s) there were not many real
version control tools. On mainframes there was Panvalet, and CA Librarian probably came soon after.
Configuration control applies to service assets as a whole, of which, systems configuration is a subset.
In places where configuration baselines are adopted, configurations of the service assets must be within
the limits that are recommended in the baselines - so, in a way, configuration control also means that
the configurations of the Configuration Items (CI) don't cross above/below the limits defined in the
baselines. That way, we ensure that all the assets follow uniform configurations. If and when there is a
need to cross these limits, the concerned team must get the approval of Change Advisory Board (CAB)
in order to make those changes.

5. What is object relational database? Discuss object relational features of SQL.
Ans: An object-relational database (ORD) is a database management system (DBMS) that's composed of
both a relational database (RDBMS) and an object-oriented database (OODBMS). ORD supports the
basic components of any object-oriented database model in its schemas and the query language used,
such as objects, classes and inheritance.
An object-relational database may also be known as an object-relational database management system
(ORDBMS).

6. Define active database. Discuss some applications of active databases.
Ans: An active database management system (ADBMS) is an event-driven system in which schema or data
changes generate events monitored by active rules. Active database management systems are invoked by
synchronous events generated by user or application programs as well as external asynchronous data change
events such as a change in sensor value or time.
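An active rule is commonly written as an event-condition-action (ECA) rule, and a database trigger is one way to express it. The sketch below uses SQLite through Python's `sqlite3` module; the `sensor` and `alert` tables, the trigger name and the threshold of 100 are all assumptions chosen to mirror the sensor-value example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sensor (id INTEGER PRIMARY KEY, reading REAL);
CREATE TABLE alert (sensor_id INTEGER, reading REAL);

-- Event: UPDATE of reading. Condition: new value above 100. Action: log an alert.
CREATE TRIGGER high_reading
AFTER UPDATE OF reading ON sensor
WHEN NEW.reading > 100
BEGIN
    INSERT INTO alert (sensor_id, reading) VALUES (NEW.id, NEW.reading);
END;

INSERT INTO sensor (id, reading) VALUES (1, 20.0);
""")

# The application only changes data; the rule reacts inside the database.
conn.execute("UPDATE sensor SET reading = 50.0 WHERE id = 1")    # condition false
conn.execute("UPDATE sensor SET reading = 120.5 WHERE id = 1")   # condition true

alerts = conn.execute("SELECT sensor_id, reading FROM alert").fetchall()
```

Only the second update satisfies the condition, so a single alert row is recorded without any alerting code in the application itself.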

7. What are the differences between valid time, transaction time, and bitemporal relations?

8. What is data mining? Discuss data mining as a part of knowledge discovery process.
Ans: Data mining is the process of sorting through large data sets to identify patterns and
establish relationships to solve problems through data analysis. Data mining tools allow
enterprises to predict future trends.
Some people do not differentiate data mining from knowledge discovery, while others view data mining as
an essential step in the process of knowledge discovery. The steps involved in the knowledge
discovery process are −
1. Data Cleaning − noise and inconsistent data are removed.
2. Data Integration − multiple data sources are combined.
3. Data Selection − data relevant to the analysis task are retrieved from the database.
4. Data Transformation − data is transformed or consolidated into forms appropriate for mining by
performing summary or aggregation operations.
5. Data Mining − intelligent methods are applied in order to extract data patterns.
6. Pattern Evaluation − the extracted data patterns are evaluated.
7. Knowledge Presentation − the discovered knowledge is presented to the user.
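The seven steps can be sketched as a small pipeline. Everything below is hypothetical: the two in-memory "stores", the field names and the threshold are illustrative, and the mining step is a simple total-quantity filter standing in for a real mining algorithm.

```python
from collections import Counter

store_a = [
    {"customer": "c1", "item": "milk", "qty": 2},
    {"customer": "c2", "item": "bread", "qty": None},   # noisy record
    {"customer": "c1", "item": "bread", "qty": 1},
]
store_b = [
    {"customer": "c3", "item": "milk", "qty": 1},
    {"customer": "c3", "item": "eggs", "qty": 12},
]


def clean(records):                     # 1. remove noisy/inconsistent data
    return [r for r in records if r["qty"] is not None]


def integrate(*sources):                # 2. combine multiple sources
    return [r for source in sources for r in source]


def select(records):                    # 3. keep only task-relevant fields
    return [(r["item"], r["qty"]) for r in records]


def transform(pairs):                   # 4. aggregate into a mining-ready form
    totals = Counter()
    for item, qty in pairs:
        totals[item] += qty
    return totals


def mine(totals, min_total):            # 5. extract patterns (stand-in method)
    return {item: t for item, t in totals.items() if t >= min_total}


def evaluate(patterns):                 # 6. rank patterns by strength
    return sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))


def present(ranked):                    # 7. present the knowledge
    return [f"{item}: {total} units" for item, total in ranked]


records = integrate(clean(store_a), clean(store_b))
report = present(evaluate(mine(transform(select(records)), min_total=2)))
```

Data mining proper is only step 5; the surrounding steps prepare the data and turn the raw patterns into something a user can act on.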

9. What is data warehouse? Discuss the typical functionality of data warehouse.
Ans: See in Q(8)2070

10. What is mobile database? Discuss the characteristics of mobile environments.
Ans:

A mobile database is a database that resides on a mobile device such as a PDA, a smart phone, or a
laptop. Such devices are often limited in resources such as memory, computing power, and battery
power. Due to device limitations, a mobile database is often much smaller than its counterpart residing
on servers and mainframes. A mobile database is managed by a Database Management System (DBMS).
Again, due to resource constraints, such a system often has limited functionality compared to a full
blown database management system. For example, mobile databases are single user systems, and
therefore a concurrency control mechanism is not required. Other DBMS components such as query
processing.

IOST, TU
1. Data replication: Data replication is the process of storing data in more than one site or node. It is
useful in improving the availability of data. It is simply copying data from a database from one server to
another server so that all the users can share the same data without any inconsistency.

2. UML: Unified Modeling Language (UML) is a standardized modeling language enabling developers to
specify, visualize, construct and document artifacts of a software system. Thus, UML makes these
artifacts scalable, secure and robust in execution. UML is an important aspect involved in
object-oriented software development.

3. SQL 2003: SQL 2003 defines two collection types, namely the array type and the multiset type. In
addition to the nested table datatype, Oracle supports the array type as the varray datatype since Oracle 8.
Nested collections of varrays and nested tables have been supported in Oracle databases since Oracle 9i.

4. XPath: In XPath, using a single line of code, one can traverse the data in the XML document. This single
line of code is called an expression. These expressions are used to get the details from the documents
rather than manually traversing the document. Let us elaborate our contact detail example to understand how
expressions work.
Tribhuvan University
Institute of Science and Technology
2075
Bachelor Level / Fourth Year / Seven Semester / Science Full Marks: 60
Computer Science and Information Technology (CSc. 401) Pass Marks: 24
(Advanced Database Management System)

Time: 3 hours.
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.
Attempt all questions. (10 x 6=60)
1. How do you increase performance of the database? Explain any one database performance tuning
technique with example.

Ans: Databases are among the most important applications in an enterprise. Make sure that your databases have
enough resources available so that they can perform at their best, and check the health of the host.
Sometimes you can improve database performance by buying more hardware. Also, make sure that you
are using the latest database version. Various query optimizers are available on the market for
optimizing SQL queries. First investigate the cause of bad database performance, and then
remove the bottlenecks you find.

Database performance tuning techniques are:

1. Database statistics
2. Create optimized indexes
3. Avoid functions on RHS of the operator
4. Predetermine expected growth
5. Specify optimizer hints in SELECT
6. Use EXPLAIN
7. Avoid foreign key constraints: Foreign key constraints ensure data integrity at the cost of performance.
Therefore, if performance is your primary goal, you can push the data integrity rules to your application
layer. A good example of a database design that avoids foreign key constraints is the system tables in most
databases. Every major RDBMS has a set of tables known as system tables. These tables contain metadata
about user databases. Although there are relationships among these tables, there is no
foreign key relationship. This is because the client, in this case the database itself, enforces these rules.

8. Select limited data: The less data retrieved, the faster the query will run. Rather than filtering on the client,
push as much filtering as possible on the server-end. This will result in less data being sent on the wire and
you will see results much faster. Eliminate any obvious or computed columns. Consider the following
example.
Select FirstName, LastName, City
From Person      -- table name assumed; the original query omitted the FROM clause
Where City = 'New York City'

In the above example, you can easily eliminate the "City" column, which will always be "New York City".
Although this may not seem to have a large effect, it can add up to a significant value for large result sets.

9. Drop indexes before loading data: Consider dropping the indexes on a table before loading a large batch
of data. This makes the insert statement run faster. Once the inserts are completed, you can recreate the
index again.
If you are inserting thousands of rows in an online system, use a temporary table to load data. Ensure that
this temporary table does not have any index. Since moving data from one table to another is much faster
than loading from an external source, you can now drop indexes on your primary table, move data from
temporary to final table, and finally recreate the indexes.
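Techniques 6, 8 and 9 from the list above can be tried together on a small in-memory SQLite database. This is a sketch under assumptions: the `person` table, the index name and the generated rows are hypothetical, and the actual speed-up depends on the DBMS and data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (first_name TEXT, last_name TEXT, city TEXT);
CREATE INDEX idx_person_city ON person (city);
""")

rows = [
    (f"First{i}", f"Last{i}", "New York City" if i % 2 == 0 else "Boston")
    for i in range(1000)
]

# Technique 9: drop the index, bulk load, then rebuild the index once.
conn.execute("DROP INDEX idx_person_city")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_person_city ON person (city)")

# Technique 8: filter inside the database and leave out the City column,
# whose value is already known from the WHERE clause.
result = conn.execute(
    "SELECT first_name, last_name FROM person WHERE city = ?",
    ("New York City",),
).fetchall()

# Technique 6: ask the engine how it intends to run the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT first_name, last_name FROM person WHERE city = ?",
    ("New York City",),
).fetchall()
plan_details = [row[-1] for row in plan]   # human-readable plan steps
```

Printing `plan_details` shows whether the engine chose the rebuilt index or a full table scan for this query.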

2. What is query processing? How is it different from query optimization? Discuss heuristic query optimization.
Ans:
Query processing denotes the compilation and execution of a query specification
usually expressed in a declarative database query language such as the structured
query language (SQL). Query processing consists of a compile-time phase and a
runtime phase. At compile-time, the query compiler translates the query
specification into an executable program. This translation process (often
called query compilation) is comprised of lexical, syntactical, and semantical
analysis of the query specification as well as a query optimization and code
generation phase.

Query Optimization: A single query can be executed through different algorithms or
re-written in different forms and structures. Hence, the question of query
optimization comes into the picture – Which of these forms or pathways is the most
optimal? The query optimizer attempts to determine the most efficient way to
execute a given query by considering the possible query plans.

Heuristic Based Optimization

Heuristic based optimization uses rule-based optimization approaches for query optimization.
These algorithms have polynomial time and space complexity, which is lower than the exponential
complexity of exhaustive search-based algorithms. However, these algorithms do not necessarily
produce the best query plan.
Some of the common heuristic rules are −
• Perform select and project operations before join operations. This is done by moving the
select and project operations down the query tree. This reduces the number of tuples
available for join.
• Perform the most restrictive select/project operations first, before the other operations.
• Avoid cross-product operations, since they result in very large intermediate tables.
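The first heuristic, performing selection before join, can be checked on toy data. The relations, the city values and the nested-loop join below are hypothetical and serve only to count how much work each plan does.

```python
employees = [{"id": i, "dept": i % 5} for i in range(100)]
departments = [
    {"dept": d, "city": "Kathmandu" if d == 0 else "Pokhara"} for d in range(5)
]


def join(left, right, key):
    """Nested-loop equi-join; returns the rows and the comparisons made."""
    out, comparisons = [], 0
    for l_row in left:
        for r_row in right:
            comparisons += 1
            if l_row[key] == r_row[key]:
                out.append({**l_row, **r_row})
    return out, comparisons


# Plan A: join everything first, apply the selection afterwards.
joined, cost_a = join(employees, departments, "dept")
plan_a = [row for row in joined if row["city"] == "Kathmandu"]

# Plan B: push the selection down the query tree, then join.
kathmandu = [d for d in departments if d["city"] == "Kathmandu"]
plan_b, cost_b = join(employees, kathmandu, "dept")
```

Both plans return the same rows, but plan B feeds fewer tuples into the join, which is exactly what moving the select operation down the query tree is meant to achieve.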

3. What are the benefits of using distributed databases? Discuss different types of distributed database systems.
Ans:

Distributed databases bring the advantages of distributed
computing to the database management domain. A distributed
database is a collection of multiple interrelated databases distributed over a computer
network, and a distributed database management system is a software system that
manages a distributed database while making the distribution transparent to the
user.
Distributed database management has been proposed for various reasons, ranging from
organizational decentralization and economical processing to greater autonomy. Some of
the advantages are as follows:
1. Management of data with different levels of transparency –
Ideally, a database should be distribution transparent, in the sense of hiding the details of
where each file is physically stored within the system. The following types of
transparency are possible in a distributed database system:
• Network transparency:
This refers to freeing the user from the operational details of
the network. It is of two types: location transparency and naming transparency.
• Replication transparency:
Copies of data may be stored at multiple sites for better availability, performance
and reliability. Replication transparency makes the user unaware of the existence
of these copies.
• Fragmentation transparency:
This makes the user unaware of the existence of fragments, whether the
fragmentation is vertical or horizontal.
2. Increased Reliability and availability –
Reliability is defined as the probability that a system is running at a certain time,
whereas availability is defined as the probability that the system is continuously available
during a time interval. When the data and DBMS software are distributed over several
sites, one site may fail while the other sites continue to operate; only the data that
exist at the failed site cannot be accessed. This leads to improvement in
reliability and availability.
3. Easier Expansion –
In a distributed environment, expansion of the system in terms of adding more data,
increasing database sizes, or adding more processors is much easier.
4. Improved Performance –
We can achieve interquery parallelism by executing multiple queries at
different sites, and intraquery parallelism by breaking up a query into a number of
subqueries that execute in parallel. Both lead to improvement in performance.

Distributed databases can be broadly classified into homogeneous and heterogeneous distributed database
environments, each with further sub-divisions, as shown in the following illustration.
Homogeneous Distributed Databases

In a homogeneous distributed database, all the sites use identical DBMS and operating systems.
Its properties are −
• The sites use very similar software.
• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other sites to process user requests.
• The database is accessed through a single interface as if it is a single database.

Types of Homogeneous Distributed Database

There are two types of homogeneous distributed database −


• Autonomous − Each database is independent and functions on its own. The databases are integrated
by a controlling application and use message passing to share data updates.
• Non-autonomous − Data is distributed across the homogeneous nodes and a central or
master DBMS co-ordinates data updates across the sites.

Heterogeneous Distributed Databases

In a heterogeneous distributed database, different sites have different operating systems, DBMS
products and data models. Its properties are −
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited co-operation in processing user
requests.

Types of Heterogeneous Distributed Databases


• Federated − The heterogeneous database systems are independent in nature and integrated
together so that they function as a single database system.
• Un-federated − The database systems employ a central coordinating module through which
the databases are accessed.

4. What are the benefits of using object oriented databases over relational databases? Discuss different type of
constructors used in object oriented databases.
Ans:

◼ Type Constructors:

◼ In OO databases, the state (current value) of a complex object may be constructed from other
objects (or other values) by using certain type constructors.

◼ The three most basic constructors are atom, tuple, and set.

◼ Other commonly used constructors include list, bag, and array.

◼ The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character
strings, Booleans, and any other basic data types that the system supports directly.

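◼ This example illustrates the difference between the two definitions for comparing object states for
equality: identical states (the states match in every respect, including the OIDs they reference) and equal
states (the states match once the references are followed down to the atomic values).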

◼ o1 = (i1, tuple, <a1:i4, a2:i6>)

◼ o2 = (i2, tuple, <a1:i5, a2:i6>)

◼ o3 = (i3, tuple, <a1:i4, a2:i6>)

◼ o4 = (i4, atom, 10)

◼ o5 = (i5, atom, 10)

◼ o6 = (i6, atom, 20)

◼ In this example, the objects o1 and o2 have equal states, since their states at the atomic level are the
same, but the values are reached through distinct objects o4 and o5.

◼ However, the states of objects o1 and o3 are identical, even though the objects themselves are not,
because they have distinct OIDs.

◼ Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but
not identical, because they have distinct OIDs.
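
The two comparisons can be sketched in Python. This is only an illustration and not part of the original answer: the dictionary `objects` and the helper names `identical_states` and `equal_states` are assumptions made for the example, and only the atom and tuple constructors are handled.

```python
# Each object is stored as OID -> (constructor, state); references are kept as OIDs.
objects = {
    "i1": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i2": ("tuple", {"a1": "i5", "a2": "i6"}),
    "i3": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i4": ("atom", 10),
    "i5": ("atom", 10),
    "i6": ("atom", 20),
}


def identical_states(x, y):
    """States match exactly, including the OIDs they reference."""
    return objects[x] == objects[y]


def equal_states(x, y):
    """States match once references are followed down to atomic values."""
    cons_x, state_x = objects[x]
    cons_y, state_y = objects[y]
    if cons_x != cons_y:
        return False
    if cons_x == "atom":
        return state_x == state_y
    # Tuple constructor: same attribute names, and each referenced object equal.
    if state_x.keys() != state_y.keys():
        return False
    return all(equal_states(state_x[a], state_y[a]) for a in state_x)


print(identical_states("i1", "i3"))  # True: same references i4 and i6
print(identical_states("i1", "i2"))  # False: a1 refers to i4 versus i5
print(equal_states("i1", "i2"))      # True: i4 and i5 both hold the atom 10
```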

5. Discuss different implementation issues related with object relational database. (6)
Ans:

Several Implementation Challenges Due to Enhanced Functionality of ORDBMS

Storage & Access Method

Query Processing

Query Optimization

Storage & Access Method : efficiently store ADT objects and structured objects and provide efficient indexed access to
both

● Large ADTs, like BLOBs(Binary Large Object), require special storage, typically in a different location on disk from
the tuples that contain them

● Disk-based pointers are maintained from the tuples to the objects they contain.

● A complication arises with array types. Arrays are broken into contiguous chunks, which are then stored in some
order on disk.

Query Processing : ADTs and structured types call for new functionality in processing queries

● To register an aggregation function, a user must implement three methods, which we call initialize, iterate and
terminate.

● ADTs give users the power to add code to the DBMS; this power can be abused.

● A buggy or malicious ADT method can bring down the database server or even corrupt the database.

● User-defined ADT methods can be very expensive to execute.
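
The initialize, iterate and terminate protocol for a user-defined aggregate can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the registration API of any particular ORDBMS: the class `SecondLargest` and the driver `run_aggregate` are hypothetical names, and the driver stands in for what the query engine would do while scanning a column.

```python
class SecondLargest:
    """Hypothetical user-defined aggregate: second-largest value seen."""

    def initialize(self):
        # Set up the internal state before any row is seen.
        self.top = []

    def iterate(self, value):
        # Update the state for one input value; keep only the two largest.
        self.top = sorted(self.top + [value], reverse=True)[:2]

    def terminate(self):
        # Compute the final result from the accumulated state.
        return self.top[1] if len(self.top) == 2 else None


def run_aggregate(aggregate, values):
    """Stand-in for the DBMS: calls the three methods in the required order."""
    aggregate.initialize()
    for value in values:
        aggregate.iterate(value)
    return aggregate.terminate()


print(run_aggregate(SecondLargest(), [3, 9, 4, 7]))  # 7
```

Because the engine only needs these three entry points, it can evaluate the aggregate without knowing what state the method keeps internally.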

Query Optimization : To handle new query processing functionality, an optimizer must know about the new
functionality and use it appropriately.

6. Define GIS. Discuss different data modeling and representation for GIS data.
Ans:

A geographic information system (GIS) is a framework for gathering, managing, and analyzing data. Rooted in
the science of geography, GIS integrates many types of data. It analyzes spatial location and organizes layers of
information into visualizations using maps and 3D scenes. With this unique capability, GIS reveals deeper insights into
data, such as patterns, relationships, and situations—helping users make smarter decisions.

Representing the “real world” in a data model has been a challenge for GIS since its inception in the 1960s. A GIS
data model enables a computer to represent real geographical elements as graphical elements. Two representational
models are dominant: raster (grid-based) and vector (line-based).

Raster. Based on a cellular organization that divides space into a series of units. Each unit
is generally similar in size to another. Grid cells are the most common raster
representation. Features are divided into cellular arrays and a coordinate (X,Y) is assigned
to each cell, as well as a value. This allows for registration with a geographic reference
system. A raster representation also relies on tessellation: geometric shapes that can
completely cover an area.

Vector. The concept assumes that space is continuous, rather than discrete, which gives
an infinite (in theory) set of coordinates. A vector representation is composed of three main
elements: points, lines and polygons. Points are spatial objects with no area but can have
attached attributes since they are a single set of coordinates (X and Y) in a coordinate
space. Lines are spatial objects made up of connected points (nodes) that have no
width. Polygons are closed areas that can be made up of a circuit of line segments.

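
The two representations can be contrasted with a small Python sketch. All values here are made up for illustration: the grid contents, the `cell_size`, the sample coordinates and the helper names are assumptions, and the raster origin is assumed to be the top-left corner with Y increasing downward.

```python
import math

# Raster: space divided into equally sized cells, each holding a value
# (here a hypothetical land-cover code).
raster = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
cell_size = 10.0  # assumed ground units per cell side


def cell_value_at(x, y):
    """Look up the cell that contains the coordinate (x, y)."""
    col = int(x // cell_size)
    row = int(y // cell_size)
    return raster[row][col]


# Vector: points, lines and polygons described by coordinates.
point = (12.5, 7.0)
line = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def line_length(vertices):
    """Sum of the segment lengths between consecutive nodes."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(cell_value_at(25.0, 5.0))  # 1
print(line_length(line))         # 15.0
print(polygon_area(polygon))     # 12.0
```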
7. Define multimedia database. Discuss benefits of multimedia databases. How do you query image database?
Ans:

A Multimedia database (MMDB) is a collection of related multimedia data. The multimedia data include one or
more primary media data types such as text, images, graphic objects (including drawings, sketches and illustrations),
animation sequences, audio and video.

Benefits of using Multimedia Databases are:

1. Query in Multimedia DBMS


2. Charts and Graphs: A multimedia database supports all types of charts and graphs, which is considered a
great feature of an MMDB. Creating those charts and graphs requires special queries, and those
queries are written in SQL+D.
3. Multimedia Presentation: The terms multimedia document, multimedia presentation and temporal presentation
have been used interchangeably in the literature. A multimedia database can store large amounts
of media such as audio and video, all of which are also used in the presentation process. An MMDB also
stores multimedia documents and presentations.

To query an image in a database:

Multimedia information is expressive and largely self-explanatory. The development of digital
media, advanced network infrastructure and readily available consumer electronics has driven
the multimedia revolution at a rapid rate. As database technology has advanced to incorporate
multimedia data, an open question has been how to retrieve or search images in multimedia
databases. A large body of research focuses on search mechanisms for efficient retrieval from
image databases. The growth of digital media (digital cameras, digital video, digital TV, e-books,
cell phones, etc.) has given rise to very large multimedia databases, which in turn require
efficient storage, organization and retrieval of multimedia content. Researchers have developed
and used many approaches to image retrieval, together with techniques for evaluating them.

8. Discuss data warehouse and its functionality. Discuss association rule mining with example.
Ans:

A Data Warehouse works as a central repository where information arrives from one or more data sources. Data
flows into a data warehouse from the transactional system and other relational databases.

Data may be:

1. Structured

2. Semi-structured

3. Unstructured data

Functions of Data warehouse:
A data warehouse works as a collection of data organized for use by various groups,
with features for retrieving that data. It stores facts about the tables
that have high transaction levels. The major functions involved in data
warehousing are listed below:
1. Data consolidation
2. Data Cleaning
3. Data Integration

In data mining, association rules are useful for analyzing and predicting customer behavior. They play an
important part in customer analytics, market basket analysis, product clustering, catalog design and store
layout.
Programmers use association rules to build programs capable of machine learning. Machine learning is a
type of artificial intelligence (AI) that seeks to build programs with the ability to become more efficient
without being explicitly programmed.

Examples of association rules in data mining


A classic example of association rule mining refers to a relationship between diapers
and beers. The example, which seems to be fictional, claims that men who go to a store
to buy diapers are also likely to buy beer. Data that would point to that might look like
this:

A supermarket has 200,000 customer transactions. About 4,000 transactions, or about
2% of the total number of transactions, include the purchase of diapers. About 5,500
transactions (2.75%) include the purchase of beer. About 3,500 transactions,
1.75% of the total, include the purchase of both diapers and beer. If the two purchases were
unrelated, that number should be much lower: about 2% of 2.75% of all transactions, or
roughly 110. The fact that about 87.5% of diaper
purchases include the purchase of beer therefore indicates a link between diapers and beer.
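
The figures in this example can be checked with a short Python calculation. The variable names are chosen for illustration, and lift is a standard association-rule measure that the answer above does not name explicitly.

```python
# Transaction counts taken from the diapers-and-beer example.
total = 200_000
diapers = 4_000
beer = 5_500
both = 3_500

# Support: fraction of all transactions that contain the item(s).
support_diapers = diapers / total   # 2% of transactions
support_beer = beer / total         # 2.75% of transactions
support_both = both / total         # 1.75% of transactions

# Confidence of the rule diapers -> beer.
confidence = both / diapers         # 87.5% of diaper purchases include beer

# Number of joint purchases expected if the two items were independent.
expected_if_independent = support_diapers * support_beer * total  # about 110

# Lift: how much more often beer appears with diapers than it does overall.
lift = confidence / support_beer

print(f"confidence = {confidence:.3f}")
print(f"expected joint purchases if independent = {expected_if_independent:.0f}")
print(f"lift = {lift:.1f}")
```

A lift well above 1 means the two items occur together far more often than chance would predict, which is what makes the rule interesting.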
9. What is web service? Discuss SOAP in detail.
Ans:
A web service is a collection of open protocols and standards used for exchanging data between
applications or systems. Software applications written in various programming languages and running on
various platforms can use web services to exchange data over computer networks like the Internet in a
manner similar to inter-process communication on a single computer. This interoperability (e.g., between
Java and Python, or Windows and Linux applications) is due to the use of open standards.

Components of Web Services


The basic web services platform is XML + HTTP. All the standard web services work using the
following components −
• SOAP (Simple Object Access Protocol)
• UDDI (Universal Description, Discovery and Integration)
• WSDL (Web Services Description Language)

SOAP
SOAP is an acronym for Simple Object Access Protocol. It is an XML-based messaging protocol for
exchanging information among computers. SOAP is an application of the XML specification.



• SOAP is a communication protocol.
• SOAP is for communication between applications.
• SOAP is a format for sending messages.
• SOAP is designed to communicate via Internet.
• SOAP is platform independent.
• SOAP is language independent.
• SOAP is simple and extensible.
• SOAP allows you to get around firewalls.
• SOAP is a W3C standard (SOAP 1.2 became a W3C Recommendation in 2003).

Although SOAP can be used in a variety of messaging systems and can be delivered via a variety
of transport protocols, the initial focus of SOAP is remote procedure calls transported via HTTP.
Other frameworks including CORBA, DCOM, and Java RMI provide similar functionality to SOAP,
but SOAP messages are written entirely in XML and are therefore uniquely platform- and language-
independent.
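
To make the message format concrete, the following Python sketch builds and reads a SOAP 1.1 envelope with the standard library. The service namespace `http://example.com/stock`, the `GetPrice` operation and its `Item` parameter are hypothetical and chosen only for illustration; a real service would define these in its WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/stock"                 # hypothetical service namespace


def build_request(item):
    """Build a SOAP envelope whose body carries a GetPrice call."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetPrice")
    ET.SubElement(call, f"{{{SERVICE_NS}}}Item").text = item
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Extract the operation name and its arguments from an envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]
    arguments = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, arguments


message = build_request("Apples")
print(message)
print(read_request(message))  # ('GetPrice', {'Item': 'Apples'})
```

In practice the XML text would be sent as the body of an HTTP POST request; the transport step is left out here so the sketch stays self-contained.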
10. Write short notes on: (2x3)
a) Integrity constraint:

Ans:

o Integrity constraints are a set of rules used to maintain the quality of information.

o Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.

o Thus, integrity constraint is used to guard against accidental damage to the database.

Types of Integrity Constraint:

1. Domain constraints

o Domain constraints can be defined as the definition of a valid set of values for an attribute.

o The data type of domain includes string, character, integer, time, date, currency, etc. The
value of the attribute must be available in the corresponding domain.

2. Entity integrity constraints

o The entity integrity constraint states that primary key value can't be null.

o This is because the primary key value is used to identify individual rows in relation and if
the primary key has a null value, then we can't identify those rows.

o A table can contain null values in fields other than the primary key.

3. Referential Integrity Constraints

o A referential integrity constraint is specified between two tables.


o In a referential integrity constraint, if a foreign key in Table 1 refers to the primary key
of Table 2, then every value of the foreign key in Table 1 must either be null or be available in
Table 2.

4. Key constraints

o A key is an attribute, or set of attributes, used to identify an entity uniquely within its entity set.

o An entity set can have multiple keys, out of which one is chosen as the primary key. A
primary key must contain unique values and cannot contain a null value in the relational table.
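
The four kinds of constraint can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the `department` and `student` tables, their columns and the helper `violates` are made up for illustration, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE department (
    dept_id TEXT PRIMARY KEY NOT NULL,               -- key + entity integrity
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE student (
    regd_no  TEXT PRIMARY KEY NOT NULL,              -- key + entity integrity
    semester INTEGER CHECK (semester BETWEEN 1 AND 8), -- domain constraint
    dept_id  TEXT REFERENCES department(dept_id)     -- referential integrity
);
INSERT INTO department VALUES ('CS', 'Computer Science');
INSERT INTO student VALUES ('S1', 7, 'CS');
""")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


print(violates("INSERT INTO student VALUES ('S2', 12, 'CS')"))   # True: domain
print(violates("INSERT INTO student VALUES (NULL, 2, 'CS')"))    # True: entity integrity
print(violates("INSERT INTO student VALUES ('S1', 2, 'CS')"))    # True: duplicate key
print(violates("INSERT INTO student VALUES ('S3', 2, 'XX')"))    # True: referential
print(violates("INSERT INTO student VALUES ('S4', 2, NULL)"))    # False: null foreign key is allowed
```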

Tribhuvan University

Institute of Science and Technology

2076

Bachelor Level / seventh-semester / Science Full marks: 60

Computer Science and Information Technology(CSC401) Pass marks: 24

(Advanced Database Management System)

Time: 3 hours

Candidates are required to give their answers in their own words as far as practicable.

The figures in the margin indicate full marks.

2. What is data fragmentation? Discuss horizontal and vertical fragmentation in detail.

Ans:
Fragmentation is the task of dividing a table into a set of smaller tables. The
subsets of the table are called fragments. Fragmentation can be of three types:
horizontal, vertical, and hybrid (combination of horizontal and vertical). Horizontal
fragmentation can further be classified into two techniques: primary horizontal
fragmentation and derived horizontal fragmentation.
Fragmentation should be done in such a way that the original table can be reconstructed
from the fragments whenever required. This requirement is called “reconstructiveness.”

Vertical Fragmentation
In vertical fragmentation, the fields or columns of a table are grouped into fragments. In
order to maintain reconstructiveness, each fragment should contain the primary key
field(s) of the table. Vertical fragmentation can be used to enforce privacy of data.
For example, let us consider that a University database keeps records of all registered
students in a Student table having the following schema.
STUDENT

Regd_No | Name | Course | Address | Semester | Fees | Marks

Now, the fees details are maintained in the accounts section. In this case, the designer
will fragment the database as follows −
CREATE TABLE STD_FEES AS
SELECT Regd_No, Fees
FROM STUDENT;

Horizontal Fragmentation
Horizontal fragmentation groups the tuples of a table according to the values of one or
more fields. Horizontal fragmentation should also conform to the rule of
reconstructiveness. Each horizontal fragment must have all columns of the original base
table.
For example, in the student schema, if the details of all students of the Computer Science
course need to be maintained at the School of Computer Science, then the designer
will horizontally fragment the database as follows −
CREATE TABLE COMP_STD AS
SELECT * FROM STUDENT
WHERE COURSE = 'Computer Science';
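
The reconstructiveness rule can be checked with a small sqlite3 sketch in Python. It uses a reduced STUDENT schema and made-up rows; the complementary fragments `STD_INFO` and `OTHER_STD` are assumptions added so that each fragmentation covers the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT, Fees INTEGER);
INSERT INTO STUDENT VALUES (1, 'Asha', 'Computer Science', 5000);
INSERT INTO STUDENT VALUES (2, 'Bikash', 'Physics', 4000);
INSERT INTO STUDENT VALUES (3, 'Chandra', 'Computer Science', 5000);

-- Vertical fragments: each one keeps the primary key Regd_No.
CREATE TABLE STD_FEES AS SELECT Regd_No, Fees FROM STUDENT;
CREATE TABLE STD_INFO AS SELECT Regd_No, Name, Course FROM STUDENT;

-- Horizontal fragments: each one keeps every column.
CREATE TABLE COMP_STD AS SELECT * FROM STUDENT WHERE Course = 'Computer Science';
CREATE TABLE OTHER_STD AS SELECT * FROM STUDENT WHERE Course <> 'Computer Science';
""")

original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()

# Vertical fragments are recombined with a join on the primary key.
from_vertical = conn.execute("""
    SELECT i.Regd_No, i.Name, i.Course, f.Fees
    FROM STD_INFO i JOIN STD_FEES f ON i.Regd_No = f.Regd_No
    ORDER BY i.Regd_No
""").fetchall()

# Horizontal fragments are recombined with a union.
from_horizontal = conn.execute("""
    SELECT * FROM COMP_STD
    UNION ALL
    SELECT * FROM OTHER_STD
    ORDER BY Regd_No
""").fetchall()

print(from_vertical == original)    # True
print(from_horizontal == original)  # True
```

The join works only because both vertical fragments carry Regd_No, which is why each vertical fragment must include the primary key.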

Hybrid Fragmentation
In hybrid fragmentation, a combination of horizontal and vertical fragmentation
techniques is used. This is the most flexible fragmentation technique since it generates
fragments with minimal extraneous information. However, reconstruction of the original
table is often an expensive task.
Hybrid fragmentation can be done in two alternative ways −
• At first, generate a set of horizontal fragments; then generate vertical fragments
from one or more of the horizontal fragments.
• At first, generate a set of vertical fragments; then generate horizontal fragments
from one or more of the vertical fragments.

3. Why do we need extended ER modeling? Discuss class/subclass relationship with example.
Ans:

Enhanced entity-relationship (EER) diagrams are essentially an expanded
version of ER diagrams. EER models are helpful tools for designing
databases with high-level models. With their enhanced features, you can plan
databases more thoroughly by delving into the properties and constraints with
more precision.
An EER diagram provides you with all the elements of an ER diagram while adding:
• Attribute or relationship inheritances
• Category or union types
• Specialization and generalization
• Subclasses and superclasses

When to use which


Overall, both diagrams provide the ability to design your database with precision.
An ER diagram gives you the visual outlook of your database. It details the
relationships and attributes of its entities, paving the way for smooth database
development in the steps ahead.
EER diagrams, on the other hand, are perfect for taking a more detailed look at your
information. When your database contains a larger amount of data it is best to turn to
an enhanced model to more deeply understand your model.
So when should you use which? Honestly, both are useful, and it depends mostly on
the size and detail of your data. The more complicated the data,

8. Discuss benefits and applications of data mining.

Ans:
Data mining is a process used by an organization to turn raw data into useful
information. By using software to find patterns in large data sets, organizations can learn more
about their customers to develop more efficient business strategies, boost sales, and
reduce costs. Data mining depends on effective data collection, storage, and processing.
Data mining methods are also used to develop machine learning
models.

Data mining has many advantages, as explained below:

• If the user can interact directly with the data mining tool, the user
can make better and smarter marketing decisions for the company.
• Communication is important when working directly with data mining so that strong
relationships and connections can be determined.
• By the 80/20 principle, roughly 20% of customers account for about 80% of the profit.
• Data mining helps identify this important 20% of customers so that the company does not lose
them, while it aims to increase the profit drawn from the remaining 80%.
• There are two concepts called segmentation and clustering that are important in
advertising and the connection of customers to successfully use the data mining on
the details.
• Data mining was also used as part of the strategy for preventing health fraud, waste
and abuse in society in the area of CMIP of the Medicaid Integrity Program.
• If you have knowledge of data mining techniques, you can manage applications in
various areas such as Market Analysis, Production Control, Sports, Fraud Detection,
Astrology, etc.
• If you have a website for shopping, then data mining will help in defining a shopping
pattern. If you are having issues with designing or selecting the products, data
mining techniques can be useful to identify all the shopping patterns.
• Data mining also helps in data optimization.
• One of the most important factors of data mining is that it determines hidden
profitability.
• The risk factor in business can be taken care of because data mining provides clear
identification of hidden profitability.
• Fraud and malware are among the most dangerous threats on the internet and are
increasing day by day. Credit card services and telecommunication are the main
targets. With the help of data mining techniques, professionals can obtain
fraud-related data such as caller ID, location, duration of the call, and the exact date and
time, which can help to find the person or group responsible for the fraud.
• Also in the Corporate world where time is money, data mining techniques can help
organizations in real-time for planning finances and resources, evaluation of assets,
an idea about business competitors, etc.
Applications:
1. Future Healthcare
2. Market Basket Analysis
3. Education
4. Fraud Detection
5. Lie Detection
6. Research Analysis
7.
10. Write short notes on:
Deductive database
Ans:
A deductive database is a type of database that can draw conclusions, or
deductions, using a set of well-defined rules and facts that are stored in the database. In today’s
world, as we deal with large amounts of data, a deductive database provides many
advantages. It helps to combine the RDBMS with logic programming. To design a deductive
database, a purely declarative programming language called Datalog is used.
The implementations of deductive databases can be seen in LDL (Logic Data Language), NAIL
(Not Another Implementation of Logic), CORAL, and VALIDITY.
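
How stored facts and rules yield new conclusions can be sketched in Python. This is an illustration only, not how LDL, NAIL, CORAL or VALIDITY are implemented: the `parent` facts and the function `derive_ancestors` are made up, and the two Datalog-style rules appear as comments.

```python
# Facts stored in the database: parent(X, Y) means X is a parent of Y.
parent = {("ram", "sita"), ("sita", "hari"), ("hari", "gita")}


def derive_ancestors(parent_facts):
    """Apply the rules repeatedly until no new facts can be deduced."""
    # Rule 1: ancestor(X, Y) :- parent(X, Y).
    ancestor = set(parent_facts)
    while True:
        # Rule 2: ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
        new_facts = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestor
            if y == y2
        } - ancestor
        if not new_facts:
            return ancestor
        ancestor |= new_facts


ancestors = derive_ancestors(parent)
print(("ram", "gita") in ancestors)  # True: deduced, never stored as a fact
print(len(ancestors))                # 6
```

Only three facts are stored, yet six ancestor relationships can be answered, which is the sense in which the database deduces conclusions.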
The uses of LDL and VALIDITY in a variety of business/industrial applications are as follows.

1. LDL Applications:
This system has been applied to the following application domains:
. Enterprise modeling
. Hypothesis testing or data dredging
. Software reuse

2. VALIDITY Applications:
Validity combines deductive capabilities with the ability to manipulate
complex objects (OIDs, inheritance, methods, etc). It provides a DOOD
data model and language called DEL (Datalog Extended Language), an
engine working along a client-server model and a set of tools for schema
and rule editing, validation, and querying.
The following are some application areas of the VALIDITY system:
. Electronic commerce
. Rules-governed processes
. Knowledge discovery
. Concurrent engineering

ODMG:
Ans:
ODMG (Object Data Management Group) 2.0 builds on
database, object and programming language standards to give
developers portability and ease of use.

The Internet explosion has fundamentally changed application
development goals and strategies, requiring new applications that are
dynamic and data-rich. To build them, developers are employing multi-tier
architectures and object programming languages like Java.
Consequently, object storage, especially Java language object storage,
has become a crucial problem that many application developers face.

6. Discuss mobile computing architecture. Discuss mobile data management in detail.
Ans:




Key Differences Between Classification and Clustering
1. Classification is the process of assigning data to classes with the help of
predefined class labels. Clustering is similar to classification,
but there are no predefined class labels.
2. Classification is associated with supervised learning, whereas
clustering is a form of unsupervised learning.
3. Labeled training samples are provided in classification, while in
clustering no training data is provided.
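
The difference can be seen in a small Python sketch on one-dimensional data. The algorithms used here (nearest-centroid classification and a two-cluster k-means) are chosen for brevity and are only one possible instance of each approach; the sample values, labels and function names are made up.

```python
def train_nearest_centroid(samples):
    """Classification: learn one centroid per class label from labeled samples."""
    sums = {}
    for value, label in samples:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def classify(centroids, value):
    """Assign the label of the closest learned centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def cluster_two_groups(values, iterations=10):
    """Clustering: split unlabeled values into two groups (k-means with k = 2)."""
    centers = [min(values), max(values)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for value in values:
            nearest = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[nearest].append(value)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Supervised: the training samples come with class labels.
labeled = [(1.0, "low"), (1.2, "low"), (5.0, "high"), (5.3, "high")]
model = train_nearest_centroid(labeled)
print(classify(model, 4.0))  # high

# Unsupervised: only raw values are given, and the groups have no names.
unlabeled = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
print(cluster_two_groups(unlabeled))  # ([1.0, 1.2, 0.8], [5.0, 5.3, 4.9])
```

The classifier returns a named class because names were supplied during training, while the clustering result is just two anonymous groups.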
