ADBMS 2070-2076
2070
Candidates are required to give their answers in their own words as far as practicable.
• Extent
Ans: An extent is a logical unit of database storage space allocation made up of a number of contiguous
data blocks. One or more extents in turn make up a segment. When the existing space in a segment is
completely used, Oracle allocates a new extent for the segment.
When you create a table, Oracle allocates to the table's data segment an initial extent of a specified
number of data blocks. Although no rows have been inserted yet, the Oracle data blocks that
correspond to the initial extent are reserved for that table's rows.
If the data blocks of a segment's initial extent become full and more space is required to hold new data,
Oracle automatically allocates an incremental extent for that segment. An incremental extent is a
subsequent extent of the same or greater size than the previously allocated extent in that segment.
For maintenance purposes, the header block of each segment contains a directory of the extents in that
segment.
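As a rough, hedged illustration (the table and storage values are hypothetical, and the clause is honored mainly in dictionary-managed tablespaces), the size of the initial and incremental extents can be requested with Oracle's STORAGE clause:

-- Hypothetical table: ask for a 1 MB initial extent and 512 KB incremental
-- (NEXT) extents for the table's data segment. Locally managed tablespaces
-- may override or ignore these values.
CREATE TABLE orders (
  order_id   NUMBER PRIMARY KEY,
  order_date DATE
)
STORAGE (INITIAL 1M NEXT 512K);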
• Temporal database
Ans: A temporal database is a database that has certain features that support time-sensitive status for
entries. Where some databases are considered current databases and only support factual data
considered valid at the time of use, a temporal database can establish at what times certain entries are
accurate.
A temporal database stores data relating to time instances. It offers temporal data types and stores
information relating to past, present and future time. A temporal database has two major temporal
notions, or attributes: valid time and transaction time. These attributes can be combined to form
bitemporal data.
Valid time is the time period during which a fact is true in the real world.
Transaction time is the time period during which a fact stored in the database was known.
Bitemporal data combines both Valid and Transaction Time.
It is possible to have timelines other than Valid Time and Transaction Time, such as Decision Time, in
the database. In that case the database is called a multitemporal database as opposed to a bitemporal
database. However, this approach introduces additional complexities such as dealing with the validity of
(foreign) keys.
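A minimal sketch of a bitemporal relation, with hypothetical table and column names: each salary fact carries a valid-time interval (when it was true in the real world) and a transaction-time interval (when the database recorded it).

CREATE TABLE emp_salary (
  emp_id     INTEGER,
  salary     DECIMAL(10,2),
  valid_from DATE,        -- start of validity in the real world
  valid_to   DATE,        -- end of validity in the real world
  tx_start   TIMESTAMP,   -- when this version was recorded
  tx_end     TIMESTAMP    -- when this version was superseded
);

-- "What did the database believe on 2020-01-15 about the salary valid on 2019-06-01?"
SELECT salary
FROM emp_salary
WHERE emp_id = 101
  AND DATE '2019-06-01' BETWEEN valid_from AND valid_to
  AND TIMESTAMP '2020-01-15 00:00:00' BETWEEN tx_start AND tx_end;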
At one extreme of the autonomy spectrum, we have a DDBMS that "looks like" a centralized DBMS to
the user. A single conceptual schema exists, and all access to the system is obtained through a site that
is part of the DDBMS—which means that no local autonomy exists. At the other extreme we encounter
a type of DDBMS called a federated DDBMS (or a multidatabase system). In such a system, each server
is an independent and autonomous centralized DBMS that has its own local users, local transactions,
and DBA and hence has a very high degree of local autonomy. The term federated database system
(FDBS) is used when there is some global view or schema of the federation of databases that is shared
by the applications. On the other hand, a multidatabase system does not have a global schema and
constructs one interactively as needed by the application. Both systems are hybrids between
distributed and centralized systems, and the distinction made between them is not strictly followed.
We will refer to them as FDBSs in a generic sense.
In a heterogeneous FDBS, one server may be a relational DBMS, another a network DBMS, and a third
an object or hierarchical DBMS; in such a case it is necessary to have a canonical system language and
to include language translators to translate subqueries from the canonical language to the language of
each server.
• XPath
• Classification and clustering
Ans: Classification:
Data classification is the process of organizing data into categories for its most effective and efficient
use. A well-planned data classification system makes essential data easy to find and retrieve. This can
be of particular importance for risk management, legal discovery, and compliance. Written procedures
and guidelines for data classification should define what categories and criteria the organization will use
to classify data and specify the roles and responsibilities of employees within the organization
regarding data stewardship. Once a data-classification scheme has been created, security standards
that specify appropriate handling practices for each category and storage standards that define
the data's lifecycle requirements should be addressed.
Clustering:
Clustering, in the context of databases, refers to the ability of several servers or instances to connect to
a single database. An instance is the collection of memory and processes that interacts with a database,
which is the set of physical files that actually store data.
Clustering offers two major advantages, especially in high-volume database environments:
Fault tolerance: Because there is more than one server or instance for users to connect to,
clustering offers an alternative in the event of an individual server failure.
Load balancing: The clustering feature is usually set up to allow users to be automatically
allocated to the server with the least load.
• OLAP
Ans: Stands for "Online Analytical Processing." OLAP allows users to analyze database information from
multiple database systems at one time. While relational databases are considered to be two-
dimensional, OLAP data is multidimensional, meaning the information can be compared in many
different ways. For example, a company might compare their computer sales in June with sales in July,
then compare those results with the sales from another location, which might be stored in a different
database.
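As a small illustration of such multidimensional comparisons (the sales table and column names are assumed), SQL's CUBE grouping produces totals and subtotals over every combination of the chosen dimensions:

-- Compare computer sales by month and by location in one result set;
-- GROUP BY CUBE also emits subtotals per location, per month, and overall.
SELECT location, sale_month, SUM(amount) AS total_sales
FROM sales
WHERE product = 'Computer'
  AND sale_month IN ('June', 'July')
GROUP BY CUBE (location, sale_month);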
2.) Draw an ER diagram for a hospital with a set of patients and a set of doctors. Associate with each
patient a log of the various tests and examinations conducted.
Ans:
3.) What is the difference between an object and a….. in the object oriented data
model (OOBM) ?
4.) What are the main difference between designing a relational database and an
object database?
Ans: A relational database stores data in tables made up of rows and columns. Every row has its own key
and every column has a specific name. The designer of a relational database refers to a field as an
attribute, a record as a tuple, and a file as a relation, whereas the user refers to a field as a column, a
record as a row, and a file as a table.
The difference between a relational database and an object-oriented database is that the relational
database stores data in tables of rows and columns: every column in a table has a specific name, every
row has its own primary key, and in a file-processing environment the terms field, record, and file are
used to represent data. In an object-oriented database, by contrast, data is stored in the form of objects,
and the data is stored together with the operations (methods) that process or read it. For example, we
can define an object named Student that contains data about a student such as the address, first name,
last name, ID, fee record, result, and so on.
A relational database relies on the relational model; an object database, on the other hand, relies on
object-oriented programming concepts.
The relational model organizes information in a set of tables, each composed of rows and columns.
Each column represents a property and each row represents an entity.
In an object-oriented database, each element resembles an object from the object-oriented paradigm.
5.) Discuss some applications of active database. How do spatial databases differ from
regular database?
Ans:
Applications that depend on data-monitoring activities, such as CIM, telecommunications network
management, program trading, and medical and financial support systems, can greatly benefit from
integration with an active database. Examples include:
Production control, e.g., power plants.
Maintenance tasks, e.g., inventory control.
Financial applications, e.g., stock and bond trading.
Telecommunication and network management.
Air traffic control.
Computer integrated manufacturing (CIM).
Statistics gathering and authorization tools.
a) A spatial database supports special data types for geometric objects and allows you to store
geometric data (usually of a geographic nature) in tables, while a non-spatial database does not
support such types.
b) A spatial database provides special functions and indexes for querying and manipulating
geospatial data, typically through something like Structured Query Language (SQL), while a
non-spatial database does not provide such functions and indexes.
c) A spatial database is often used as a storage container for geospatial data, but it can do
much more than that, while a non-spatial database is used only as a storage container for
non-spatial data.
d) A spatial database uses spatial queries and geometric functions to answer questions about
space and objects in space, while a non-spatial database does not support spatial queries.
e) In addition to being able to answer questions about the use of space, spatial database
functions allow you to create and modify objects in space. This portion of spatial analysis is
often referred to as geometric or spatial processing.
f) A spatially enabled database can intrinsically work with data types like rivers (modeled as line
strings), land parcels (modeled as polygons), and trees (modeled as points), whereas a
non-spatial database cannot work with these kinds of models (see the query sketch after this list).
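A hedged sketch of such a spatial query using PostGIS-style functions (the table, column names, and coordinates are assumed for illustration):

-- Find the land parcels that contain a given point, and the distance from
-- each matching parcel to that point.
SELECT p.parcel_id,
       ST_Distance(p.geom, ST_SetSRID(ST_MakePoint(85.32, 27.71), 4326)) AS dist
FROM land_parcels p
WHERE ST_Contains(p.geom, ST_SetSRID(ST_MakePoint(85.32, 27.71), 4326));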
6.) Write a schema that provides tags for a person’s first name, last name, weight, and
shoe size. Weight and shoe size tags should have attributes to designate measuring
systems.
7.) Distinguish between structured and unstructured complex objects.
Ans: see q(5)2071 solution.
8.) What is data warehouse? List the characteristics of data warehouse.
Ans: A data warehouse is a federated repository for all the data that an enterprise's various business
systems collect. The repository may be physical or logical.
Data warehousing emphasizes the capture of data from diverse sources for useful analysis and access,
but does not generally start from the point-of-view of the end user who may need access to specialized,
sometimes local databases. The latter idea is known as the data mart.
Characteristics of Data Warehouse:
i. Subject-oriented:
The warehouse organizes data around the essential subjects of the business (customers and products)
rather than around applications such as inventory management or order processing.
ii. Integrated:
It is consistent in the way that data from several sources is extracted and transformed. For example,
coding conventions are standardized: M = male, F = female.
iii. Time-variant:
Data are organized by various time periods (e.g. months).
iv. Non-volatile:
The warehouse’s database is not updated in real time. There is periodic bulk uploading of transactional
and other data. This makes the data less subject to momentary change. There are a number of steps
and processes in building a warehouse.
First, you must identify where the relevant data is stored. This can be a challenge. When the
Commonwealth Bank opted to implement CRM in its retail banking business, it found that relevant
customer data were resident on over 80 separate systems.
9.) What are the advantages and disadvantages of extending the relational data model
by means of ORDBMS?
Ans: Advantages and Disadvantages of ORDBMSS
ORDBMSs can provide appropriate solutions for many types of advanced database applications.
However, there are also disadvantages.
Advantages of ORDBMSs
There are following advantages of ORDBMSs:
Reuse and Sharing: The main advantages of extending the Relational data model come from reuse and
sharing. Reuse comes from the ability to extend the DBMS server to perform standard functionality
centrally, rather than have it coded in each application.
Increased Productivity: ORDBMSs provide increased productivity both for the developer and for the
end user.
Use of experience in developing RDBMSs: Another obvious advantage is that the extended relational
approach preserves the significant body of knowledge and experience that has gone into developing
relational applications. This is a significant advantage, as many organizations would find it prohibitively
expensive to change. If the new functionality is designed appropriately, this approach should allow
organizations to take advantage of the new extensions in an evolutionary way without losing the
benefits of current database features and functions.
Disadvantages of ORDBMSs
The ORDBMS approach has the obvious disadvantages of complexity and associated increased costs.
Further, there are proponents of the relational approach who believe the essential simplicity and
purity of the relational model are lost with these types of extension.
ORDBMS vendors are attempting to portray object models as extensions to the relational model with
some additional complexities. This potentially misses the point of object orientation, highlighting the
large semantic gap between these two technologies. Object applications are simply not as data-centric
as relational-based ones.
2071
(NEW COURSE)
Candidates are required to give their answers in their own words as far as practicable.
The questions are of equal value.
Attempt all questions.
1. Explain the following terms :
a. Data Warehouse
Ans: A data warehouse is a relational database that is designed for query and analysis rather
than for transaction processing. It usually contains historical data derived from transaction data,
but it can include data from other sources. It separates analysis workload from transaction
workload and enables an organization to consolidate data from several sources.
b. Distribution Transparency
Ans: Distribution transparency is the property of distributed databases by the virtue of which
the internal details of the distribution are hidden from the users. The DDBMS designer may
choose to fragment tables, replicate the fragments and store them at different sites. However,
since users are oblivious of these details, they find the distributed database easy to use like any
centralized database.
Distribution transparency has three dimensions:
Location transparency
Fragmentation transparency
Replication transparency
c. XQuery
d. Distributed transaction
Ans: A distributed transaction is a database transaction in which two or more network hosts
are involved. Usually, hosts provide transactional resources, while the transaction
manager is responsible for creating and managing a global transaction that encompasses all
operations against such resources. Distributed transactions, as any other transactions, must
have all four ACID (atomicity, consistency, isolation, durability) properties, where atomicity
guarantees all-or-nothing outcomes for the unit of work (operations bundle).
e. Knowledge base
Ans: In general, a knowledge base is a centralized repository for information: a public library,
a database of related information about a particular subject, and whatis.com could all be
considered to be examples of knowledge bases. In relation to information technology (IT), a
knowledge base is a machine-readable resource for the dissemination of information,
generally online or with the capacity to be put online. An integral component of knowledge
management systems, a knowledge base is used to optimize information collection,
organization, and retrieval for an organization, or for the general public.
f. Classification and clustering
2. Distinguish multiple inheritance and selective inheritance in OO concepts.
Persistent Objects: are those that are stored in the database [Objects created using abstract
data types varrays, nested tables etc.]. These can be used both with SQL commands and also in
PL/SQL blocks. These reside in the data dictionary. Persistent objects are available to the user
until they are deleted explicitly. They can be implemented as tables, columns or
attributes. A persistent object is one that outlives the process in which it was created. Note that
this does not necessarily mean that the object is stored in a database or that any recovery is
guaranteed; rather, it means that the lifetime of such objects persists across server process activation
and deactivation cycles.
A transient object exists only within the scope of the PL/SQL block. Transient objects are automatically
de-allocated once they go out of the scope of the PL/SQL block. Examples of transient objects are
PL/SQL variables. Transient objects have a lifetime bounded by the lifetime of the process in
which they are created.
The object-oriented data model allows the 'real world' to be modeled more closely. The object,
which encapsulates both state and behavior, is a more natural and realistic representation of real-
world objects. An object can store all the relationships it has with other objects, including many-to-
many relationships, and objects can be formed into complex objects that the traditional data
models cannot cope with easily.
Extensibility
OODBMSs allow new data types to be built from existing types. The ability to factor out common
properties of several classes and form them into a superclass that can be shared with subclasses can
greatly reduce redundancy within a system, and this is regarded as one of the main advantages of
object orientation. Further, the reusability of classes promotes faster development and easier
maintenance of the database and its applications.
Unlike traditional databases (such as hierarchical, network or relational), the object oriented
database are capable of storing different types of data, for example, pictures, voice video, including
text, numbers and so on.
A single language interface between the Data Manipulation Language (DML) and the programming
language overcomes the impedance mismatch. This eliminates many of the inefficiencies that occur in
mapping a declarative language such as SQL to an imperative language such as C. Most OODBMSs
provide a DML that is computationally complete compared with SQL, the standard language of
RDBMSs.
Navigational access from the object is the most common form of data access in an OODBMS. This is
in contrast to the associative access of SQL (that is, declarative statements with selection based on
one or more predicates). Navigational access is more suitable for handling parts explosion, recursive
queries, and so on.
Support for schema evolution
The tight coupling between data and applications in an OODBMS makes schema evolution more
feasible.
There are many areas where traditional DBMSs have not been particularly successful, such as,
Computer-Aided Design (CAD), Computer-Aided Software Engineering (CASE), Office Information
System(OIS), and Multimedia Systems. The enriched modeling capabilities of OODBMSs have made
them suitable for these applications.
Improved performance
There have been a number of benchmarks that have suggested OODBMSs provide significant
performance improvements over relational DBMSs. The results showed an average 30-fold
performance improvement for the OODBMS over the RDBMS.
Disadvantages:
There are following disadvantages of OODBMSs:
Lack of universal data model: There is no universally agreed data model for an OODBMS, and most
models lack a theoretical foundation. This disadvantage is seen as a significant drawback, and is
comparable to pre-relational systems.
Lack of experience: In comparison to RDBMSs, the use of OODBMSs is still relatively limited. This
means that we do not yet have the level of experience that we have with traditional systems.
OODBMSs are still very much geared towards the programmer rather than the naïve end user. There is
also resistance to the acceptance of the technology. While the OODBMS is limited to a small
niche market, this problem will continue to exist.
Lack of standards: There is a general lack of standards for OODBMSs. We have already mentioned
that there is no universally agreed data model. Similarly, there is no standard object-oriented query
language.
Competition: Perhaps one of the most significant issues that face OODBMS vendors is the
competition posed by the RDBMS and the emerging ORDBMS products. These products have an
established user base with significant experience available. SQL is an approved standard, the
relational data model has a solid theoretical foundation, and relational products have many
supporting tools to help both end users and developers.
Query optimization compromises encapsulation: Query optimization requires an understanding of
the underlying implementation to access the database efficiently. However, this compromises the
concept of encapsulation.
Locking at object level may impact performance: Many OODBMSs use locking as the basis for their
concurrency control protocol. However, if locking is applied at the object level, locking of an
inheritance hierarchy may be problematic, as well as impacting performance.
Complexity: The increased functionality provided by the OODBMS (such as the illusion of a single-
level storage model, pointer swizzling, long-duration transactions, version management, and schema
evolution) makes the system more complex than that of traditional DBMSs. This complexity leads to
products that are more expensive and more difficult to use.
Lack of support for views: Currently, most OODBMSs do not provide a view mechanism, which, as
we have seen previously, provides many advantages such as data independence, security, reduced
complexity, and customization.
Lack of support for security: Currently, OODBMSs do not provide adequate security mechanisms.
The user cannot grant access rights on individual objects or classes.
If OODBMSs are to expand fully into the business field, these deficiencies must be rectified.
7. What are the differences and similarities between objects and literals in the
ODMG object model?
Ans: Objects and literals are the basic building blocks of the ODMG object model.
The differences are:
a) Object has both object identifier and state, literal has no object identifier.
b) Object state can change overtime by modifying object value; literal is basically a constant
value that does not change.
c) Objects are identified by their OID’s whereas literals are identified by their value.
d) An object has a lifetime, which depends on whether it is a persistent object or a transient object;
lifetime is not applicable to a literal.
e) Copying an object results in a shallow copy, whereas copying a literal results in a logical (deep)
copy. Both objects and literals can be atomic or structured.
8. Describe the main reasons for the potential advantage for distributed database.
What additional functions does it have over centralized DBMS?
Ans:
9. Describe the characteristics of mobile computing environment in detail.
Ans: Mobile computing is human–computer interaction by which a computer is expected to be
transported during normal usage, which allows for transmission of data, voice and video. Mobile
computing involves mobile communication, mobile hardware, and mobile software. Communication
issues include ad hoc networks and infrastructure networks as well as communication
properties, protocols, data formats and concrete technologies. Hardware includes mobile devices or
device components. Mobile software deals with the characteristics and requirements of mobile
applications.
Characteristics of mobile computing are:
Portability: Devices/nodes connected within the mobile computing system should facilitate
mobility. These devices may have limited device capabilities and limited power supply, but should
have a sufficient processing capability and physical portability to operate in a movable environment.
Connectivity: This defines the quality of service (QoS) of the network connectivity. In a mobile
computing system, the network availability is expected to be maintained at a high level with the
minimal amount of lag/downtime without being affected by the mobility of the connected nodes.
Interactivity: The nodes belonging to a mobile computing system are connected with one another
to communicate and collaborate through active transactions of data.
Individuality: A portable device or a mobile node connected to a mobile network often denotes an
individual; a mobile computing system should be able to adapt the technology to cater to
individual needs and also to obtain contextual information about each node.
10. Differentiate between XML schema and XML DTD with suitable example.
Ans: The critical difference between DTDs and XML Schema is that XML Schema utilizes an XML-
based syntax, whereas DTDs have a unique syntax held over from SGML DTDs. Although DTDs are
often criticized because of this need to learn a new syntax, the syntax itself is quite terse. The
opposite is true for XML Schema, which is verbose but makes use of tags and XML, so that
authors of XML should find the syntax of XML Schema less intimidating.
Some of the main differences are:
Differences between an XML Schema Definition (XSD) and Document Type Definition (DTD) include:
XML schemas are written in XML while DTD are derived from SGML syntax.
XML schemas define datatypes for elements and attributes while DTD doesn't support
datatypes.
XML schemas allow support for namespaces while DTD does not.
XML schemas define number and order of child elements, while DTD does not.
XML schemas can be manipulated programmatically with the XML DOM, which is not possible with a
DTD.
With XML Schema, users need not learn a new syntax, whereas working with DTDs requires learning a
separate, non-XML syntax.
XML Schema supports safer data communication, i.e. the sender can describe the data in a way that
the receiver will understand, whereas with a DTD the data can be misunderstood by the receiver.
XML schemas are extensible while DTD is not extensible.
Tribhuvan University
2072
Bachelor Level/ Fourth Year/ Seventh Semester/ Science Full Marks: 60
(NEW COURSE)
Candidates are required to give their answers in their own words as far as practicable.
Ans: A fundamental requirement of running a mission-critical application is being able to achieve and
sustain high performance with your database.
Database performance tuning encompasses the steps you can take to optimize performance with the
goal of maximizing the use of system resources for greater efficiency. By fine-tuning certain database
elements such as index use, query structure, data models, system configuration (e.g., hardware and OS
settings), and application design, you can significantly impact the overall performance of your
application.
MongoDB is a database built for high performance deployments at scale. For this reason and more,
thousands of organizations and over a third of the Fortune 100 count on MongoDB to help them deliver
innovative, lower cost applications that result in competitive advantage.
b. UML
Ans: UML is an acronym that stands for Unified Modeling Language. Simply put, UML is a modern
approach to modeling and documenting software. In fact, it’s one of the most popular business process
modeling techniques.
d. XQuery
e. Calendars
f. Active Database
Ans: An active database is a database that includes an event-driven architecture (often in the
form of ECA rules) which can respond to conditions both inside and outside the database. Possible
uses include security monitoring, alerting, statistics gathering and authorization.
GENERALIZATION vs. SPECIALIZATION
Basic: Generalization proceeds in a bottom-up manner, combining lower-level entities into a higher-level
entity, whereas specialization proceeds in a top-down manner, splitting a higher-level entity into
lower-level entities.
Subclasses: In generalization, the higher-level entity must have lower-level entities; in specialization,
the higher-level entity may not have lower-level entities.
Schema size: Generalization reduces the size of a schema, whereas specialization expands the size of a
schema.
Applied to: Generalization is applied to a group of entities, whereas specialization is applied to a single
entity.
BASIS FOR COMPARISON: SINGLE INHERITANCE vs. MULTIPLE INHERITANCE
Basic: In single inheritance, the derived class inherits from a single base class; in multiple inheritance,
the derived class inherits from two or more base classes.
Declaration: class derived : access_specifier base_class { }; versus
class derived : access_specifier base_class1, access_specifier base_class2, ... { };
Access: In single inheritance, the derived class accesses the features of the one base class; in multiple
inheritance, the derived class accesses the combined features of all inherited base classes.
Run time: Single inheritance requires a small amount of runtime overhead, whereas multiple inheritance
requires additional runtime overhead compared to single inheritance.
6. What are the object relational features that have been included in SQL-99?
Ans: SQL-1999 introduced object support into the SQL standard. The SQL-1999 standard had to be
backward compatible with the existing SQL-1992 standard, so object support was implemented as
an extension to the existing standard. The types defined by SQL-1992 were retained, and the
standard modified to support user defined types (UDT) with object-like features.
A reference type,
Distinct types,
Structured types.
A reference type is essentially an object identifier (OID) which can be used to uniquely identify an
instance of an object, and is used to point to another type. This implies the use of a structured type
being pointed to.
Distinct types were an ability to ‘rename’ an existing pre-defined type. For example, you could
define a type METERS to be an INTEGER. You could similarly define FEET as INTEGER also. The
benefit of this is that the database does not allow type mixing, so adding FEET to METERS without
conversion would result in an error.
Structured types are more interesting, as a set of data and associated methods can be grouped
into a user-defined type. For example, you could define an address type, which would contain
street number, street, state, zip, country, etc. You could also, depending on the application, define
associated functionality (methods) for the type.
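A minimal sketch of the three features in SQL:1999-style syntax (type, table, and column names are illustrative, and the exact syntax varies between vendors):

-- Distinct type: METERS behaves like INTEGER but cannot be mixed with plain
-- integers without an explicit cast.
CREATE TYPE meters AS INTEGER FINAL;

-- Structured type: related attributes grouped into a user-defined type.
CREATE TYPE address_t AS (
  street VARCHAR(40),
  city   VARCHAR(30),
  zip    VARCHAR(10)
) NOT FINAL;

-- Typed table whose rows are instances of address_t; the system-generated
-- REF values act as object identifiers.
CREATE TABLE addresses OF address_t
  REF IS addr_oid SYSTEM GENERATED;

-- Reference type: a column that points to a row (object) in the typed table.
CREATE TABLE person (
  name VARCHAR(50),
  home REF(address_t) SCOPE addresses
);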
8. What are the difference and similarities between objects and literals in
the ODMG Object Model?
1. Text: The form in which text can be stored can vary greatly. In addition to ASCII-based
files, text is typically stored in word processor files, spreadsheets, databases and annotations on
more general multimedia objects. With the availability and proliferation of GUIs and text fonts, the
job of storing text is becoming more complex, allowing special effects (color, shades, etc.).
2. Images: There is great variance in the quality and size of storage for still images. Digitized
images are sequences of pixels that represent a region in the user's graphical display. The
space overhead for still images varies on the basis of resolution, size, complexity, and the
compression scheme used to store the image. Popular image formats are jpg, png, bmp and tiff.
3. Audio: An increasingly popular data type being integrated into most applications is audio. It is
quite space-intensive: one minute of sound can take up to 2-3 MB of space. Several
techniques are used to compress it into a suitable format.
4. Video: One of the most space-consuming multimedia data types is digitized video.
Digitized videos are stored as sequences of frames. Depending upon its resolution and size, a
single frame can consume up to 1 MB. In addition, realistic video playback requires a continuous
transfer rate for the transmission, compression, and decompression of the digitized video.
5. Graphic Objects: These consist of special data structures used to define 2D and 3D shapes
through which we can define multimedia objects. These include various formats used by
image and video editing applications.
An XML DTD can either be specified inside the document, or it can be kept in a separate document and
then linked separately. The basic syntax of a DTD is as follows:
<!DOCTYPE root-element [
declaration1
declaration2
........
]>
Internal DTD
A DTD is referred to as an internal DTD if the elements are declared within the XML file. To refer to it as
an internal DTD, the standalone attribute in the XML declaration must be set to yes. This means the
declaration works independently of an external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element [element-declarations]>
External DTD
In an external DTD, elements are declared outside the XML file. They are accessed by specifying the
system attributes, which may be either a legal .dtd file or a valid URL. To refer to it as an external
DTD, the standalone attribute in the XML declaration must be set to no. This means the declaration
includes information from an external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
Tribhuvan University
2073
(NEW COURSE)
Candidates are required to give their answers in their own words as far as practicable.
Figure 7.13
Overlapping: This applies when an entity occurrence may be a member of more than one subclass, e.g.
Figure 7.14
Completeness constraints
Total: Each superclass (higher-level entity) must belong to subclasses (lower-level entity sets), e.g. a student
must be postgrad or undergrad. To represent completeness in the specialization/generalization relationship,
the keyword Mandatory is used.
Figure 7.15
Partial: Some superclasses may not belong to subclasses (lower-level entity sets), e.g. some people at UCT are
neither student nor staff. The keyword Optional is used to represent a partial specialization/generalization
relationship.
Figure 7.16
We can show both disjoint and completeness constraints in the ER diagram. Following our examples, we can combine
disjoint and completeness constraints.
Figure 7.17
Some members of a university are both students and staff. Not all members of the university are staff and students.
Figure 7.18
A student in the university must be either an undergraduate or postgraduate, but not both.
The definition of methods for a class is an integral part of encapsulation. A method is programming
code that performs the behavior an object instance can exhibit. Calculating the age of a person would
be an example of such behavior. The figure shows a way of looking at encapsulating the age method
with an instance object. The code for the age method is "attached" to or encapsulated with the object
rather than part of the application.
An object-oriented database must provide support for all data types, not just the built-in data
types such as character, integer, and float. To understand abstract data types, let us take two steps
back by removing first the word abstract and then the word data from abstract data type. We now have
a type; a type can be defined as a collection of values. A simple example of this is the Integer type: it
consists of the values 0, 1, 2, 3, etc. If we add the word data back in, we define a data type as a type
together with the set of operations that manipulate that type. Expanding on our integer example, an
integer variable is a member of the integer data type, and addition, subtraction, and multiplication are
examples of operations that can be performed on the integer data type.
If we now add the word abstract back in, we can define an abstract data type (ADT) as a data type, that
is, a type and the set of operations that manipulate the type, where the operations are defined only by
their inputs and outputs. The ADT does not specify how the data type will be implemented; all of the
ADT's details are hidden from the user of the ADT. This process of hiding the details is called
encapsulation. If we extend the integer example to an abstract data type, the operations might be
delete an integer, add an integer, print an integer, and check whether a certain integer exists. Notice
that we do not care how an operation is implemented but simply how to invoke it.
Configuration control refers to setting runtime dependencies and we often discuss "configuring" an
application to run. An example would be a JMX control or even more basic - specifying whether you are
accessing a QA/UAT or production database. There are lots of jobs out there where you focus on
configuration management in the sense of configuring a package to run (actually customizing the
runtime experience). This is often done through XML or properties files such as an application server
(e.g. WebSphere).
Version control refers to checking in and storing specific versions of the source code and now there is a
real difference between configuration control and version control. Years ago the terms were used
almost interchangeably although back then (around the 80s and early 90s) we didn't have too many real
version control tools. On mainframes we had Pan valet and I think that CA Librarian came soon after.
Configuration control applies to service assets as a whole, of which, systems configuration is a subset.
In places where configuration baselines are adopted, configurations of the service assets must be within
the limits that are recommended in the baselines - so, in a way, configuration control also means that
the configurations of the Configuration Items (CI) don't cross above/below the limits defined in the
baselines. That way, we ensure that all the assets follow uniform configurations. If and when there is a
need to cross these limits, the concerned team must get the approval of Change Advisory Board (CAB)
in order to make those changes.
A mobile database is a database that resides on a mobile device such as a PDA, a smart phone, or a
laptop. Such devices are often limited in resources such as memory, computing power, and battery
power. Due to device limitations, a mobile database is often much smaller than its counterpart residing
on servers and mainframes. A mobile database is managed by a Database Management System (DBMS).
Again, due to resource constraints, such a system often has limited functionality compared to a full
blown database management system. For example, mobile databases are single user systems, and
therefore a concurrency control mechanism is not required. Other DBMS components, such as query
processing, are similarly scaled down.
Data replication: Data replication is the process of storing data in more than one site or node. It is
useful in improving the availability of data. It is simply copying data from a database from one
server to another server so that all the users can share the same data without any inconsistency.
XPath: In XPath, using a single line of code, one can traverse the data in an XML document. This single
line of code is called an expression. These expressions are used to get the details from the document
rather than traversing the document manually.
Tribhuvan University
Institute of Science and Technology
2075
Bachelor Level / Fourth Year /Seventh Semester/Science Full Marks: 60
Computer Science and Information Technology-(CSc.401) Pass Marks: 60
(Advanced Database Management System)
Time: 3 hours.
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.
Attempt all questions. (10 x 6 = 60)
1. How do you increase performance of the database? Explain any one database performance tuning
technique with example.
Ans: Databases are the most important applications for your enterprise. Make sure that your databases have
enough resources available. This will help your database in performing at their best. You should check your host
health. Sometimes you can improve your database performance by buying more hardware. Also, make sure that you
are using the latest database version. There are various query optimizers available in the market that you can use for
optimizing your SQL queries. You need to first investigate the cause of bad database performance. After that, you
must try to remove these bottlenecks.
1. Database statistics
2. Create optimized indexes
3. Avoid functions on RHS of the operator
4. Predetermine expected growth
5. Specify optimizer hints in SELECT
6. Use EXPLAIN
7. Avoid foreign key constraints: Foreign keys constraints ensure data integrity at the cost of performance.
Therefore, if performance is your primary goal you can push the data integrity rules to your application
layer. A good example of a database design that avoids foreign key constraints is the System tables in most
databases. Every major RDBMS has a set of tables known as system tables. These tables contain meta data
information about user databases. Although there are relationships among these tables, there is no
foreign key relationship. This is because the client, in this case the database itself, enforces these rules.
8. Select limited data: The less data retrieved, the faster the query will run. Rather than filtering on the client,
push as much filtering as possible on the server-end. This will result in less data being sent on the wire and
you will see results much faster. Eliminate any obvious or computed columns. Consider the following
example.
SELECT FirstName, LastName, City
FROM Customers          -- table name assumed for illustration
WHERE City = 'New York City'
In the above example, you can easily eliminate the "City" column, which will always be "New York City".
Although this may not seem to have a large effect, it can add up to a significant value for large result sets.
9. Drop indexes before loading data: Consider dropping the indexes on a table before loading a large batch
of data. This makes the insert statement run faster. Once the inserts are completed, you can recreate the
index again.
If you are inserting thousands of rows in an online system, use a temporary table to load data. Ensure that
this temporary table does not have any index. Since moving data from one table to another is much faster
than loading from an external source, you can now drop the indexes on your primary table, move data
from the temporary table to the final table, and finally recreate the indexes (see the sketch after this
list).
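A minimal sketch of the drop-and-recreate pattern from point 9 (all object names are assumed, and DROP INDEX syntax differs slightly between products):

-- 1. Drop the index before a large bulk load.
DROP INDEX idx_orders_customer;

-- 2. Load the batch, e.g. from a temporary/staging table that has no indexes.
INSERT INTO orders (order_id, customer_id, amount)
SELECT order_id, customer_id, amount
FROM orders_staging;

-- 3. Recreate the index once the load is complete.
CREATE INDEX idx_orders_customer ON orders (customer_id);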
2. What is query processing? How is it different from query optimization? Discuss heuristic query optimization.
Ans:
Query processing denotes the compilation and execution of a query specification
usually expressed in a declarative database query language such as the structured
query language (SQL). Query processing consists of a compile-time phase and a
runtime phase. At compile-time, the query compiler translates the query
specification into an executable program. This translation process (often
called query compilation) is comprised of lexical, syntactical, and semantical
analysis of the query specification as well as a query optimization and code
generation phase.
Heuristic based optimization uses rule-based optimization approaches for query optimization.
These algorithms have polynomial time and space complexity, which is lower than the exponential
complexity of exhaustive search-based algorithms. However, these algorithms do not necessarily
produce the best query plan.
Some of the common heuristic rules are −
• Perform select and project operations before join operations. This is done by moving the
select and project operations down the query tree. This reduces the number of tuples
available for join.
• Perform the most restrictive select/project operations at first before the other operations.
• Avoid cross-product operations, since they result in very large intermediate tables (an
illustrative query is sketched below).
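An illustrative query over hypothetical EMPLOYEE and DEPARTMENT tables; the comments describe how the heuristic rules above would transform its query tree:

-- A heuristic optimizer pushes the selection Dname = 'Research' down to the
-- DEPARTMENT relation and projects only the columns needed for the join and
-- the result (Dnumber, Dno, Lname) before joining, so the join operates on
-- far fewer and narrower tuples than a cross product followed by selection.
SELECT E.Lname
FROM   EMPLOYEE E
JOIN   DEPARTMENT D ON E.Dno = D.Dnumber
WHERE  D.Dname = 'Research';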
3. What are the benefits of using distributed databases? Discuss different types of distributed database systems.
Ans:
Distributed databases can be broadly classified into homogeneous and heterogeneous distributed
database environments, each with further sub-divisions.
Homogeneous Distributed Databases
In a homogeneous distributed database, all the sites use identical DBMS and operating systems.
Its properties are −
• The sites use very similar software.
• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other sites to process user requests.
• The database is accessed through a single interface as if it is a single database.
Heterogeneous Distributed Databases
In a heterogeneous distributed database, different sites have different operating systems, DBMS
products and data models. Its properties are −
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited co-operation in processing user
requests.
4. What are the benefits of using object oriented databases over relational databases? Discuss different type of
constructors used in object oriented databases.
Ans:
◼ Type Constructors:
◼ In OO databases, the state (current value) of a complex object may be constructed from other
objects (or other values) by using certain type constructors.
◼ The three most basic constructors are atom, tuple, and set.
◼ The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character
strings, Booleans, and any other basic data types that the system supports directly.
◼ This example illustrates the difference between the two definitions for comparing object states for
equality.
◼ In this example, the objects o1 and o2 have equal states, since their states at the atomic level are the
same, even though those values are reached through distinct intermediate objects (o4 and o5).
◼ However, the states of objects o1 and o3 are identical, even though the objects themselves are not
identical, because they have distinct OIDs.
◼ Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but
not identical, because they have distinct OIDs.
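The same constructors can be approximated in an object-relational system; a hedged Oracle-style sketch (type and table names are assumed) in which the object type plays the role of the tuple constructor, the nested table plays the role of the set constructor, and the built-in scalar types act as atoms:

-- Tuple constructor: an object type grouping atomic attributes.
CREATE TYPE address_t AS OBJECT (
  street VARCHAR2(40),
  city   VARCHAR2(30)
);
/
-- Set (collection) constructor: a multiset of atomic values.
CREATE TYPE phone_set_t AS TABLE OF VARCHAR2(20);
/
-- A complex object built from the constructors above.
CREATE TABLE customer (
  cust_id INTEGER,
  addr    address_t,
  phones  phone_set_t
) NESTED TABLE phones STORE AS customer_phones;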
5. Discuss different implementation issues related with object relational database. (6)
Ans:
Query Processing
Query Optimization
Storage & Access Method : efficiently store ADT objects and structure objects and provide efficient indexed access to
both
● Large ADTs, like BLOBs(Binary Large Object), require special storage, typically in a different location on disk from
the tuples that contain them
● Disk-based pointers are maintained from the tuples to the objects they contain.
● A complication arises with array types. Arrays are broken into contiguous chunks, which are then stored in some
order on disk.
Query Processing : ADTs and structured types call for new functionality in processing queries
● To register an aggregation function, a user must implement three methods, which we call initialize,
iterate, and terminate (a sketch of this pattern appears at the end of this answer).
● ADTs give users the power to add code to the DBMS; this power can be abused.
● A buggy or malicious ADT method can bring down the database server or even corrupt the database.
Query Optimization : To handle the new query processing functionality, an optimizer must know about
the new functionality and use it appropriately.
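A hedged sketch of the initialize/iterate/terminate pattern using PostgreSQL's user-defined aggregate facility (names are illustrative): INITCOND plays the role of initialize, SFUNC of iterate, and FINALFUNC of terminate.

-- State transition ("iterate") function: accumulate a running sum and count.
CREATE FUNCTION avg_iterate(state NUMERIC[], val NUMERIC)
RETURNS NUMERIC[] AS $$
  SELECT ARRAY[state[1] + val, state[2] + 1];
$$ LANGUAGE SQL STRICT;

-- Final ("terminate") function: turn the accumulated state into the result.
CREATE FUNCTION avg_terminate(state NUMERIC[])
RETURNS NUMERIC AS $$
  SELECT CASE WHEN state[2] = 0 THEN NULL ELSE state[1] / state[2] END;
$$ LANGUAGE SQL;

-- The aggregate itself; INITCOND provides the "initialize" step.
CREATE AGGREGATE my_avg(NUMERIC) (
  SFUNC     = avg_iterate,
  STYPE     = NUMERIC[],
  FINALFUNC = avg_terminate,
  INITCOND  = '{0,0}'
);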
6. Define GIS. Discuss different data modeling and representation for GIS data.
Ans:
A geographic information system (GIS) is a framework for gathering, managing, and analyzing data. Rooted in
the science of geography, GIS integrates many types of data. It analyzes spatial location and organizes layers of
information into visualizations using maps and 3D scenes. With this unique capability, GIS reveals deeper insights into
data, such as patterns, relationships, and situations—helping users make smarter decisions.
Representing the “real world” in a data model has been a challenge for GIS since their inception in the 1960s. A GIS
data model enables a computer to represent real geographical elements as graphical elements. Two representational
models are dominant; raster (grid-based) and vector (line-based):
Raster. Based on a cellular organization that divides space into a series of units. Each unit
is generally similar in size to another. Grid cells are the most common raster
representation. Features are divided into cellular arrays and a coordinate (X,Y) is assigned
to each cell, as well as a value. This allows for registration with a geographic reference
system. A raster representation also relies on tessellation: geometric shapes that can
completely cover an area.
Vector. The concept assumes that space is continuous, rather than discrete, which gives
an infinite (in theory) set of coordinates. A vector representation is composed of three main
elements: points, lines and polygons. Points are spatial objects with no area but can have
attached attributes since they are a single set of coordinates (X and Y) in a coordinate
space. Lines are spatial objects made up of connected points (nodes) that have no
width. Polygons are closed areas that can be made up of a circuit of line segments.
7. Define multimedia database. Discuss benefits of multimedia databases. How do you query image database?
Ans:
A multimedia database (MMDB) is a collection of related multimedia data. The multimedia data include one or
more primary media data types such as text, images, graphic objects (including drawings, sketches and illustrations)
animation sequences, audio and video.
Multimedia information is very expressive, self-explanatory and narrative. Nowadays the
development of digital media, advanced network infrastructure and easily available consumer
electronics have caused multimedia data to grow at an alarming rate. In line with the advancement
of database technology that incorporates multimedia data, an open question is how to retrieve and
search images in multimedia databases, and a large number of research works focus on searching
mechanisms in image databases for efficient retrieval. The growth of digital media (digital cameras,
digital video, digital TV, e-books, cell phones, etc.) has given rise to very large multimedia databases,
in which efficient storage, organization and retrieval of multimedia content is essential. Image
databases are typically queried in two ways: by descriptive attributes and keywords (metadata)
attached to the images, or by content-based image retrieval (CBIR), in which visual features such as
color, texture and shape are automatically extracted from the images and compared for similarity
with the features of a query image.
8. Discuss data warehouse and its functionality. Discuss association rule mining with example.
Ans:
A data warehouse works as a central repository where information arrives from one or more data
sources. Data flows into a data warehouse from transactional systems and other relational databases.
The data may be:
1. Structured
2. Semi-structured
3. Unstructured
Functions of a data warehouse:
A data warehouse works as an organized collection of data whose features make it easy to recover and
analyze the data. It stores facts drawn from operational tables with high transaction levels, which are
observed and consolidated using data warehousing techniques. The major functions involved are
mentioned below:
1. Data consolidation
2. Data cleaning
3. Data integration
In data mining, association rules are useful for analyzing and predicting customer behavior. They play an
important part in customer analytics, market basket analysis, product clustering, catalog design and store
layout.
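As a small worked illustration (the transactions table and item names are assumed), the two standard measures behind an association rule such as {bread} => {butter} can be computed directly in SQL: support is the fraction of all transactions containing both items, and confidence is the fraction of bread-transactions that also contain butter.

-- Hypothetical transactions(tid, item) table.
SELECT
  xy.support_xy * 1.0 / t.total_tx  AS support,
  xy.support_xy * 1.0 / x.support_x AS confidence
FROM
  (SELECT COUNT(DISTINCT tid) AS total_tx FROM transactions) AS t,
  (SELECT COUNT(DISTINCT tid) AS support_x
     FROM transactions WHERE item = 'bread') AS x,
  (SELECT COUNT(*) AS support_xy
     FROM (SELECT tid FROM transactions WHERE item = 'bread'
           INTERSECT
           SELECT tid FROM transactions WHERE item = 'butter') AS both_items) AS xy;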
Programmers use association rules to build programs capable of machine learning. Machine learning is a
type of artificial intelligence (AI) that seeks to build programs with the ability to become more efficient
without being explicitly programmed.
SOAP
SOAP is an acronym for Simple Object Access Protocol. It is an XML-based messaging protocol for
exchanging information among computers. SOAP is an application of the XML specification.
Although SOAP can be used in a variety of messaging systems and can be delivered via a variety
of transport protocols, the initial focus of SOAP is remote procedure calls transported via HTTP.
Other frameworks including CORBA, DCOM, and Java RMI provide similar functionality to SOAP,
but SOAP messages are written entirely in XML and are therefore uniquely platform- and language-
independent.
10. Write short notes on: (2x3)
a) Integrity constraint:
Ans:
o Integrity constraints are a set of rules. It is used to maintain the quality of information.
o Integrity constraints ensure that the data insertion, updating, and other processes have to
be performed in such a way that data integrity is not affected.
o Thus, integrity constraint is used to guard against accidental damage to the database.
1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of values for an attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc. The
value of the attribute must be available in the corresponding domain.
2. Entity integrity constraints
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation, and if
the primary key had a null value, we could not identify those rows.
o A table can contain null values in columns other than the primary key field.
4. Key constraints
o Keys are the entity set that is used to identify an entity within its entity set uniquely.
o An entity set can have multiple keys, but one of them will be the primary key. A primary key
value must be unique and cannot be null in the relational table.
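A minimal sketch showing how these constraints are declared in SQL (table and column names are illustrative):

CREATE TABLE department (
  dept_id INTEGER PRIMARY KEY,
  dname   VARCHAR(30)
);

CREATE TABLE student (
  regd_no INTEGER     PRIMARY KEY,                       -- key + entity integrity: unique, not null
  name    VARCHAR(50) NOT NULL,
  age     INTEGER     CHECK (age BETWEEN 15 AND 99),     -- domain constraint
  dept_id INTEGER     REFERENCES department (dept_id)    -- referential integrity
);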
Tribhuvan University
2076
Time: 3 hours
Candidates are required to give their answers in their own words as far as practicable.
Vertical Fragmentation
In vertical fragmentation, the fields or columns of a table are grouped into fragments. In
order to maintain reconstructiveness, each fragment should contain the primary key
field(s) of the table. Vertical fragmentation can be used to enforce privacy of data.
For example, let us consider that a University database keeps records of all registered
students in a Student table having the following schema.
STUDENT
Now, the fees details are maintained in the accounts section. In this case, the designer
will fragment the database as follows −
CREATE TABLE STD_FEES AS
SELECT Regd_No, Fees
FROM STUDENT;
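For the vertical fragmentation to be lossless, a complementary fragment holding the remaining columns would normally be created as well, with each fragment keeping the primary key (the non-fee column names below are assumed for illustration):

CREATE TABLE STD_INFO AS
SELECT Regd_No, Name, Course, Address
FROM STUDENT;

-- The original table can be reconstructed by joining the fragments on the key:
-- SELECT * FROM STD_INFO NATURAL JOIN STD_FEES;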
Horizontal Fragmentation
Horizontal fragmentation groups the tuples of a table in accordance with the values of one or
more fields. Horizontal fragmentation should also conform to the rule of
reconstructiveness. Each horizontal fragment must have all columns of the original base
table.
For example, in the student schema, if the details of all students of Computer Science
Course needs to be maintained at the School of Computer Science, then the designer
will horizontally fragment the database as follows −
CREATE TABLE COMP_STD AS
SELECT * FROM STUDENT
WHERE COURSE = "Computer Science";
Hybrid Fragmentation
In hybrid fragmentation, a combination of horizontal and vertical fragmentation
techniques is used. This is the most flexible fragmentation technique since it generates
fragments with minimal extraneous information. However, reconstruction of the original
table is often an expensive task.
Hybrid fragmentation can be done in two alternative ways −
• At first, generate a set of horizontal fragments; then generate vertical fragments
from one or more of the horizontal fragments (as sketched below).
• At first, generate a set of vertical fragments; then generate horizontal fragments
from one or more of the vertical fragments.
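Continuing the same STUDENT example, a hedged sketch of the first alternative: starting from the horizontal fragment COMP_STD created above, a vertical fragment of it is derived while keeping the primary key.

CREATE TABLE COMP_STD_FEES AS
SELECT Regd_No, Fees
FROM COMP_STD;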
• If the user can interact directly with the data mining tool, then the user can make better and
smarter marketing choices for a corporation.
• Communication is important when dealing directly with data mining so that strong
relationships and connections can be determined.
• Due to the 80/20 principle, roughly 20% of the customers generate about 80% of the profit.
• Data mining helps identify this important 20% of customers so that the company can retain
them and also aim at increasing the profit contributed by the remaining 80%.
• Segmentation and clustering are two concepts that are important in advertising and in
connecting with customers, and data mining helps apply them successfully to customer
details.
• Data mining has also been used as part of the strategy for preventing health fraud, waste
and abuse, for example in the Comprehensive Medicaid Integrity Plan (CMIP) of the Medicaid
Integrity Program.
• If you have knowledge of data mining techniques, you can manage applications in
various areas such as Market Analysis, Production Control, Sports, Fraud Detection,
Astrology, etc.
• If you have a website for shopping, then data mining will help in defining a shopping
pattern. If you are having issues with designing or selecting the products, data
mining techniques can be useful to identify all the shopping patterns.
• Data mining also helps in data optimization.
• One of the most important factors of data mining is that it determines hidden
profitability.
• The risk factor in business can be taken care of because data mining provides clear
identification of hidden profitability.
• Frauds and malware are the most dangerous threats on the internet which are
increasing day by day. Credit card services and telecommunication are the main
reasons for that. With the help of the Data mining techniques, professionals can get
fraud related data such as caller ID, location, duration of the call, the exact date and
time, etc which can help to find a person or group who is responsible for that fraud.
• Also in the Corporate world where time is money, data mining techniques can help
organizations in real-time for planning finances and resources, evaluation of assets,
an idea about business competitors, etc.
Applications:
1. Future Healthcare
2. Market Basket Analysis
3. Education
4. Fraud Detection
5. Lie Detection
6. Research Analysis
10.Write short notes on:
Deductive database
Ans:
A deductive database is a type of database that can make conclusions, or deductions, using a set of
well-defined rules and facts that are stored in the database. In today's world, as we deal with large
amounts of data, deductive databases provide a lot of advantages. They help to combine the RDBMS
with logic programming. To design a deductive database, a purely declarative programming language
called Datalog is used.
The implementations of deductive databases can be seen in LDL (Logic Data Language), NAIL
(Not Another Implementation of Logic), CORAL, and VALIDITY.
The use of LDL and VALIDITY in a variety of business/industrial applications are as follows.
1. LDL Applications:
This system has been applied to the following application domains:
• Enterprise modeling
• Hypothesis testing or data dredging
• Software reuse
2. VALIDITY Applications:
Validity combines deductive capabilities with the ability to manipulate
complex objects (OIDs, inheritance, methods, etc). It provides a DOOD
data model and language called DEL (Datalog Extended Language), an
engine working along a client-server model and a set of tools for schema
and rule editing, validation, and querying.
The following are some application areas of the VALIDITY system:
• Electronic commerce
• Rules-governed processes
• Knowledge discovery
• Concurrent engineering
ODMG:
Ans
ODMG (Object Data Management Group) 2.0 builds on
database, object and programming language standards to give
developers portability and ease of use.
Key Differences Between Classification and Clustering
1. Classification is the process of classifying the data with the help of
class labels. On the other hand, Clustering is similar to classification
but there are no predefined class labels.
2. Classification uses supervised learning, whereas
clustering is known as unsupervised learning.
3. A training sample is provided in the classification method, while in
clustering no training data is provided.