IG System Task 7 Chapter 6
SUBJECT:
RESEARCH WORK #7
CHAPTER 6
STUDENT:
PROFESSOR:
9th "A" OF FOREIGN TRADE
ANSWERS TO THE QUESTIONS
REVIEW QUESTIONS
6-1 What are the problems of managing data resources in a traditional file
environment?
• Define and explain the meaning of entities, attributes, and key fields.
Entity: A person, place, thing, or event about which information is stored and
maintained.
Attribute: A characteristic or quality that describes a particular entity.
Key field: A field that uniquely identifies each record (row) of a table; the
primary key cannot contain duplicate values.
The problems of managing data resources in a traditional file environment are:
Data redundancy and inconsistency: The presence of duplicate data in multiple data
files, so that the same data is stored in more than one place.
Program-data dependence: The close relationship between data stored in files and the
specific programs required to update and maintain those files, such that changes in
programs require changes to the data.
Lack of flexibility: A traditional file system can deliver routine scheduled reports
after extensive programming effort, but it cannot deliver ad hoc reports or respond
in a timely manner to unanticipated information requests.
Poor security: Because there is little control or management of data, access to and
dissemination of information can get out of control.
Lack of data sharing and availability: Because information is fragmented across
different files that cannot be related to one another, data cannot be shared or
accessed in a timely manner.
6-2 What are the main capabilities of database management systems (DBMS) and why
is a relational DBMS so powerful?
Database: A set of data organized to serve many applications efficiently by
centralizing the data and controlling its redundancy.
Database management system (DBMS): The software that allows an organization to
centralize data, manage it efficiently, and provide access to the stored data through
application programs.
A DBMS contains tools to access and manipulate information in databases. Most DBMSs
have a specialized language called data manipulation language which is used to add,
modify, delete and retrieve data in the database. This language contains commands that
allow end users and programming specialists to extract data from the database to satisfy
information requests and develop applications.
The most prominent data manipulation language today is Structured Query Language, or
SQL. Users of DBMSs for large and midrange computers, such as DB2, Oracle, or SQL
Server, can use SQL to retrieve the information they need from the database.
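A minimal sketch of these DML commands, using SQL through Python's built-in sqlite3 module; the SUPPLIER table and its values are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT)")

# Add data
cur.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.')")
# Modify data
cur.execute("UPDATE supplier SET supplier_name = 'CBM Incorporated' WHERE supplier_number = 8259")
# Retrieve data to satisfy an information request
cur.execute("SELECT supplier_number, supplier_name FROM supplier")
print(cur.fetchall())
# Delete data
cur.execute("DELETE FROM supplier WHERE supplier_number = 8259")
conn.close()
```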
Currently the most popular type of DBMS for PCs, as well as for larger computers and
mainframes, is the relational DBMS: a database model that keeps track of entities,
attributes, and relationships. It organizes the data as two-dimensional tables (which
can be thought of as files), where each table contains data about an entity and its
attributes.
Three basic operations are used in a relational database to develop useful data sets
(see the sketch after this list):
Select: Creates a subset consisting of all the records in the file that meet the
stated criteria.
Join: Combines relational tables to provide the user with more information than is
available in the individual tables.
Project: Creates a subset of the columns in a table, with which the user can create
new tables that contain only the required information.
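A minimal sketch of the three operations in SQL, again via Python's sqlite3; the PART and SUPPLIER tables are invented examples:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE part (part_number INTEGER, part_name TEXT, supplier_number INTEGER);
CREATE TABLE supplier (supplier_number INTEGER, supplier_name TEXT);
INSERT INTO part VALUES (137, 'Door latch', 8259), (152, 'Door handle', 8444);
INSERT INTO supplier VALUES (8259, 'CBM Inc.'), (8444, 'Bryant Corp.');
""")

# Select: only the rows that meet the stated criteria
cur.execute("SELECT * FROM part WHERE supplier_number = 8259")
print(cur.fetchall())

# Join the two tables on their common column, then
# project only the two columns the user actually needs.
cur.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier
      ON part.supplier_number = supplier.supplier_number
""")
print(cur.fetchall())
conn.close()
```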
Non-relational database management systems use a more flexible data model and are
designed to manage large data sets across many distributed machines, and to scale up
or down easily. They are useful for accelerating simple queries against large volumes
of structured and unstructured data, including Web and social media data, graphs, and
other forms of data that are difficult to analyze with traditional SQL-based tools.
• Define and describe normalization and referential integrity; explain how they
contribute to a well-designed relational database.
Relational database systems try to enforce referential integrity rules to ensure that
the relationships between linked tables remain consistent. When one table has a foreign
key that points to another table, it is not possible to add a record containing that
foreign key value unless a corresponding record exists in the linked table.
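A minimal sketch of referential integrity enforcement, assuming SQLite (where foreign key checking must be switched on explicitly); the tables and values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE part (
        part_number INTEGER PRIMARY KEY,
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    )
""")
conn.execute("INSERT INTO supplier VALUES (8259)")
conn.execute("INSERT INTO part VALUES (137, 8259)")      # allowed: supplier exists
try:
    conn.execute("INSERT INTO part VALUES (152, 9999)")  # rejected: no such supplier
except sqlite3.IntegrityError as e:
    print("Rejected:", e)
conn.close()
```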
Conceptual database design describes how data elements should be grouped in the
database. The design process identifies the relationships between data elements and the
most efficient way to group them together to satisfy the information requirements of the
company. This process also identifies redundant data elements and data element groupings
required for certain specific application programs.
Groups of data are organized, refined, and optimized until an overall logical view of the
relationships between all the data in the database emerges. To use a relational database
model effectively, complex groupings of data must be optimized to minimize redundant
data elements and cumbersome many-to-many relationships. The process of creating small,
stable, yet flexible and adaptive data structures from complex groups of data is called
normalization.
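A minimal sketch of the idea behind normalization, using invented order data in plain Python: the single wide table repeats the supplier's facts on every order row, so it is split into two smaller, stable structures linked by supplier_number:

```python
unnormalized = [
    # order_number, supplier_number, supplier_name, supplier_city
    (1001, 8259, "CBM Inc.", "Dayton"),
    (1002, 8259, "CBM Inc.", "Dayton"),   # supplier data stored redundantly
    (1003, 8444, "Bryant Corp.", "Akron"),
]

# After normalization: each supplier fact is stored exactly once.
orders = [(o, s) for (o, s, _, _) in unnormalized]
suppliers = sorted({(s, name, city) for (_, s, name, city) in unnormalized})

print(orders)     # [(1001, 8259), (1002, 8259), (1003, 8444)]
print(suppliers)  # [(8259, 'CBM Inc.', 'Dayton'), (8444, 'Bryant Corp.', 'Akron')]
```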
6-3 What are the main tools and technologies to access information from databases
and improve both business performance and decision making?
• Define Big Data and describe the technologies to manage and analyze it.
The term big data describes data sets with volumes so large that they are beyond the
ability of a typical DBMS to capture, store, and analyze. Big data does not refer to a
specific quantity; it usually refers to data in the petabyte and exabyte range, that
is, billions to trillions of records, all from different sources.
Companies are interested in big data because it can reveal more patterns and
interesting anomalies than smaller data sets, with the potential to provide new
insights into customer behavior, weather patterns, financial market activity, or other
phenomena.
1. Data warehouses
Suppose you want concise, reliable information about current operations, trends, and
changes across the company. If you worked in a large company, you would need to gather
the necessary data from separate systems, such as sales, manufacturing, and accounting,
and even from external sources, such as demographic or competitor data. Increasingly,
you might also need to use big data.
A data warehouse is a database that stores current and historical information of potential
interest to company decision makers. The data originates from many core operational
transaction systems, such as sales systems, customer accounts, manufacturing, and may
include website transaction data.
The data warehouse makes the data available to anyone who needs it, but the data
cannot be altered. A data warehouse system also provides a range of ad hoc and
standardized query tools, analytical tools, and graphical reporting facilities.
2. Hadoop
Relational DBMS and data warehouse products are not well suited to organizing and
analyzing big data, or data that does not easily fit into the columns and rows of
their data models. To handle unstructured and semi-structured data in vast quantities,
as well as structured data, organizations use Hadoop, an open source software
framework managed by the Apache Software Foundation that enables distributed parallel
processing of enormous amounts of data across inexpensive computers.
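A minimal sketch of the MapReduce model on which Hadoop's distributed parallel processing is based; here both phases run in a single Python process purely to illustrate the idea, and this is not Hadoop's actual API:

```python
from collections import defaultdict

documents = ["big data needs parallel processing",
             "hadoop processes big data in parallel"]

# Map phase: each document is turned into (word, 1) pairs independently,
# so different machines could each process their own share of documents.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle + reduce phase: pairs with the same key are grouped and summed.
counts = defaultdict(int)
for word, n in mapped:
    counts[word] += n

print(dict(counts))  # e.g. {'big': 2, 'data': 2, 'parallel': 2, ...}
```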
3. In-memory computing
Another way to facilitate big data analysis is in-memory computing, which relies
primarily on the computer's main memory (RAM) for data storage (conventional DBMSs use
disk-based storage systems). The main commercial products for in-memory computing
include SAP's High-Performance Analytics Appliance (HANA) and Oracle Exalytics. Each
offers a set of integrated software components, including in-memory database software
and specialized analytics software, that run on hardware optimized for in-memory
computing work.
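A minimal sketch of the in-memory principle, using SQLite's ability to hold an entire database in RAM so that queries avoid disk I/O; commercial platforms such as HANA or Exalytics work at a far larger scale, and this only illustrates the idea:

```python
import sqlite3

ram_db = sqlite3.connect(":memory:")   # the database lives entirely in main memory
ram_db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
ram_db.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("East", 100.0), ("West", 250.0), ("East", 75.0)])
print(ram_db.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall())
ram_db.close()
```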
4. Analytical platforms
Online analytical processing (OLAP) supports multidimensional data analysis, which
allows users to view the same data in different ways using multiple dimensions. Each
aspect of information (product, pricing, cost, region, or time period) represents a
different dimension.
OLAP allows users to obtain online answers to ad hoc questions in a fairly short
amount of time, even when the data is stored in very large databases, such as sales
figures spanning several years.
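A minimal sketch of multidimensional analysis in plain Python, with invented sales figures viewed along two dimensions, product and region:

```python
from collections import defaultdict

sales = [("Nuts", "East", 50), ("Nuts", "West", 60),
         ("Bolts", "East", 35), ("Bolts", "West", 40)]

# Aggregate the facts into a small two-dimensional "cube".
cube = defaultdict(int)
for product, region, units in sales:
    cube[(product, region)] += units

# Slice along one dimension: all regions for one product.
print({r: u for (p, r), u in cube.items() if p == "Nuts"})
# Rotate the view: all products for one region.
print({p: u for (p, r), u in cube.items() if r == "East"})
```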
• Define data mining and describe how it differs from OLAP and the types of
information it provides.
Data mining
Data mining is more discovery-oriented, as it provides insights into corporate data that
cannot be obtained through OLAP, by finding patterns and relationships hidden in large
databases and inferring rules from these patterns and relationships, to predict future
behavior.
The types of information that can be obtained from data mining include the following
(a brief sketch follows the list):
Classification: Recognizes the patterns that describe the group to which an item
belongs by examining existing items that have already been classified and inferring a
set of rules from them.
Clustering: Works in a manner similar to classification when no groups have yet been
defined; a data mining tool can discover the different groupings that exist within the
data.
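A minimal sketch of both techniques, assuming the scikit-learn library; the customer data and labels are invented. Each row is [age, average monthly spend]:

```python
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

X = [[25, 50], [30, 60], [55, 400], [60, 420]]

# Classification: infer rules from examples that have already been labeled,
# then apply them to a new, unclassified customer.
y = ["low_value", "low_value", "high_value", "high_value"]
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[58, 390]]))   # -> ['high_value']

# Clustering: discover groupings when no labels have been defined yet.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)                 # e.g. [0, 0, 1, 1]
```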
• Explain how text mining and Web mining differ from conventional data mining.
Text mining tools are now available to help companies analyze unstructured text data.
These tools can extract key elements from large unstructured data sets, discover
patterns and relationships, and summarize the information. Companies might use text
mining to analyze transcripts of customer service call center conversations to
identify the main service and repair issues, or to measure customer sentiment toward
the company.
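A minimal text mining sketch using only the Python standard library: extracting the most frequent issue terms from invented call center transcripts (the stopword list is also invented):

```python
from collections import Counter
import re

transcripts = [
    "customer reports the washer leaks after repair",
    "caller angry, dryer leaks again, second repair visit",
]
stopwords = {"the", "after", "again", "second"}

# Tokenize, lowercase, and drop uninformative words before counting.
words = [w for t in transcripts for w in re.findall(r"[a-z]+", t.lower())
         if w not in stopwords]
print(Counter(words).most_common(3))  # e.g. [('leaks', 2), ('repair', 2), ...]
```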
Web mining looks for patterns in data through content mining, structure mining, and
usage mining. Web content mining is the process of extracting knowledge from the
content of Web pages, which may include text, image, audio, and video data. Web
structure mining examines data related to the structure of a particular Web site. Web
usage mining examines the user interaction data recorded by a Web server whenever
requests for a Web site's resources are received.
• Describe how users can access information from a company's internal databases via
the Web.
Chances are you have used a Web site linked to an internal corporate database. Many
companies now use the Web to make some of the information in their internal databases
available to customers and business partners.
The user accesses the seller's Web site over the Internet using Web browser software
on his or her client PC. The user's Web browser software requests data from the
organization's database, communicating with the Web server over HTTP.
There are several advantages to using the Web to access an organization's internal
databases. First, Web browser software is much easier to use than proprietary query
tools. Second, the Web interface requires little or no changes to the internal
database. It is much less expensive to add a Web interface to a legacy system than
to redesign and rebuild the system to improve user access.
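A minimal sketch of this pattern, assuming Flask as the Web server framework and SQLite as the internal database; the database file, table, and route names are hypothetical, and the data is seeded at startup so the sketch is self-contained:

```python
import sqlite3
from flask import Flask

app = Flask(__name__)

def init_db():
    # Hypothetical internal database, created and seeded for the example.
    conn = sqlite3.connect("internal.db")
    conn.execute("CREATE TABLE IF NOT EXISTS product (name TEXT, price REAL)")
    if conn.execute("SELECT COUNT(*) FROM product").fetchone()[0] == 0:
        conn.execute("INSERT INTO product VALUES ('Door latch', 22.50)")
    conn.commit()
    conn.close()

@app.route("/products")
def products():
    # The Web server queries the internal database on the browser's behalf.
    conn = sqlite3.connect("internal.db")
    rows = conn.execute("SELECT name, price FROM product").fetchall()
    conn.close()
    return {"products": [{"name": n, "price": p} for n, p in rows]}

if __name__ == "__main__":
    init_db()
    app.run()  # a browser can then request http://localhost:5000/products
```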
6-4 Why are information policy, data management, and data quality assurance
essential for managing enterprise data resources?
An information policy specifies the organization's rules for sharing, disseminating,
acquiring, standardizing, classifying, and inventorying information: it establishes
specific procedures and accountabilities, identifying which users and organizational
units can share information, where information can be distributed, and who is
responsible for updating and maintaining it.
Data management is responsible for the specific policies and procedures through which
data can be managed as an organizational resource. These responsibilities include
developing the information policy, planning for data, overseeing logical database
design and data dictionary development, and monitoring how information systems
specialists and end-user groups use data.
• Explain why data quality audits and data cleansing are essential.
Data quality analysis begins with a data quality audit , which is a structured survey of the
accuracy and completeness of an information system. Data quality audits can be performed
by inspecting entire data files, inspecting samples from data files, or by surveying end users
about their perceptions of data quality.
Data cleansing consists of activities for detecting and correcting data in a database
that are incorrect, incomplete, improperly formatted, or redundant. Data cleansing not
only corrects errors but also enforces consistency among different sets of data that
originated in separate information systems. Specialized data cleansing software is
available to automatically survey data files, correct errors in the data, and
integrate the data in a consistent, company-wide format. (Laudon & Laudon, 2016)
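A minimal data cleansing sketch in plain Python, detecting and correcting inconsistently formatted and redundant records; the customer data is invented:

```python
raw = [
    {"name": " Ana Lopez ", "state": "ny"},
    {"name": "ANA LOPEZ",   "state": "N.Y."},   # duplicate in a different format
    {"name": "Ben Cruz",    "state": "CA"},
]

def clean(rec):
    # Enforce one consistent format for names and state codes.
    return {"name": rec["name"].strip().title(),
            "state": rec["state"].replace(".", "").upper()}

seen, cleaned = set(), []
for rec in map(clean, raw):
    key = (rec["name"], rec["state"])
    if key not in seen:          # drop redundant duplicates
        seen.add(key)
        cleaned.append(rec)

print(cleaned)  # [{'name': 'Ana Lopez', 'state': 'NY'}, {'name': 'Ben Cruz', 'state': 'CA'}]
```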
DISCUSSION QUESTIONS
6-5 It has been said that there is no bad data, but poor data management. Discuss the
implications of this statement.
Data in itself is not bad; the problem arises from poor data management. Companies
need data management software and practices to keep their data from becoming poorly
organized, because poorly organized data is a serious problem when it is used for
decision making.
6-6 To what degree should end users be involved in selecting a database management
system and database design?
To a high degree: if end users do not master these systems, they will not know how to
work with them, and this could generate losses for the company, whether monetary or in
its relationships with partners and clients.
6-7 What are the consequences of an organization not having an information policy?
Without an information policy, anyone inside or outside the company could have access
to its information. This would be a problem, since there is data that not everyone in
the company should be able to access, for example employee salaries and social
security numbers (which should be available only to human resources).
CONCLUSION
The competitive environment in which companies operate today is no longer the same as
in previous years. Organizations now focus on addressing their customers in a more
personalized way, based on each customer's history, which means being able to project
behavior from past experience. For managers in any industry, analyzing data from any
department has also become more dynamic thanks to the tools that have been developed,
which facilitate the diagnosis of different phenomena and can sometimes even prevent
situations of low performance.
BIBLIOGRAPHY
Laudon, K. C., & Laudon, J. P. (2016). Sistemas de Información Gerencial (Decimocuarta
Edición). México.