MBA IT Unit 2
What is Data Collection?
Data collection is the process of gathering and evaluating information from
multiple sources to answer research questions, evaluate outcomes, and forecast
trends and probabilities. It is an essential phase in all types of research,
analysis, and decision-making, including work done in the social sciences,
business, and healthcare.
During data collection, researchers must identify the data types, the sources of data,
and the methods to be used. We will soon see that there are many different data
collection methods. Research, commercial, and government fields all rely heavily on
data collection.
Before an analyst begins collecting data, they must answer three questions first:
What methods and procedures will be used to collect, store, and process the
information?
Additionally, we can break up data into qualitative and quantitative types. Qualitative
data covers descriptions such as color, size, quality, and appearance. Quantitative data,
unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.
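The qualitative/quantitative split can be illustrated with a short Python sketch. It is only an illustration under simple assumptions: a column counts as quantitative when every value is a number, and qualitative otherwise. The function name classify_column and the survey records are hypothetical, invented for this example.

```python
from numbers import Number


def classify_column(values):
    """Return 'quantitative' if every value is numeric, else 'qualitative'."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_numeric else "qualitative"


# Hypothetical survey records, invented for illustration only.
survey = {
    "color": ["red", "blue", "red"],
    "satisfaction_pct": [72.5, 88.0, 64.0],
}

kinds = {name: classify_column(column) for name, column in survey.items()}
print(kinds)
```

Real data sets often store numbers as text or use numeric codes for categories, so a rule this simple is a starting point rather than a reliable classifier.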
There are three main types of data classification that are considered industry standards:
Data is increasingly seen as a corporate asset that can be used to make better-informed
business decisions, improve marketing campaigns, optimize business
operations and reduce costs, all with the goal of increasing revenue and profits. But a
lack of proper data management can saddle organizations with incompatible data silos,
inconsistent data sets and data quality problems that limit their ability to run business
intelligence (BI) and analytics applications -- or, worse, lead to faulty findings.
Big data management refers to the organization, administration and governance of large
volumes of unstructured and structured data. A high level of data quality and
accessibility for business intelligence and big data analytics applications is the aim of
big data management. Businesses, enterprises, and governments use big data
management strategies to handle vast and rapidly expanding data pools, which typically
hold hundreds of terabytes or even petabytes of data stored in various file formats.
Facebook, for instance, takes in more than 500 terabytes of new data every day.
A company's ability to locate valuable information in extensive stacks of unstructured
and semi-structured data from a variety of disparate sources, such as call records,
system logs, images, social media sites, and sensors, is aided by effective big data
management.
Using a centralized interface or dashboard to monitor and ensure the availability of all
big data resources
Implementing and monitoring big data analytics, big data reporting, and other
similar solutions
Better customer service: Big data initiatives almost always name customer service as
the primary objective, and effective big data management helps deliver it.
Cost-effective: Big data management makes efforts to reduce expenses more efficient,
so processes become more cost-effective once big data is implemented.
Accurate analytics: Big data management practices improve the accuracy and
dependability of big data analytics. When well-formed data enters the analytics
solution, the organization can rely on the business insights the solution produces.
"Outliers" refer to the data points that exist outside of what is to be expected. The major
thing about outliers is what you do with them. Any analysis of a data set rests on
assumptions about how the data was generated. If you find data points that are likely
to contain some form of error, these are outliers, and depending on the context you
will want to correct or remove those errors. The data mining process involves
analyzing data and making predictions from the information the data holds. In 1969,
Grubbs gave an early and widely cited definition of outliers.
Noise is any unwanted error or variance in a previously measured variable. Before
looking for the outliers in a data set, it is recommended to remove the noise first.
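One common way to reduce noise is smoothing by bin means: sort the values, split them into bins of equal size, and replace each value with the mean of its bin. The sketch below is a minimal illustration of that idea, not a method prescribed by these notes. The function name smooth_by_bin_means and the sample values are assumptions made for the example.

```python
from statistics import mean


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins, and replace each by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        # Every value in the bin is replaced by the same bin mean.
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


# Illustrative measurements, invented for this example.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, 3)
print(smoothed_prices)
```

Note that the result follows the sorted order of the input, not the original order. Larger bins smooth more aggressively, so the bin size is a judgment call that depends on the data.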
Types of Outliers
Global Outliers
Global outliers, also called point outliers, are the simplest form of outlier. A data
point is a global outlier when it deviates from all the rest of the data points in a
given data set. Most outlier detection procedures aim to find global outliers. In the
accompanying figure, the green data point is the global outlier.
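As a rough sketch of how a global outlier might be detected in practice, the Python example below applies the interquartile range (IQR) rule: a value is flagged when it lies more than 1.5 times the IQR below the first quartile or above the third quartile. The IQR rule is one conventional choice among several, not the method these notes prescribe. The function name find_global_outliers, the multiplier k, and the sample readings are assumptions for illustration.

```python
from statistics import quantiles


def find_global_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Illustrative readings, invented for this example.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = find_global_outliers(readings)
print(outliers)
```

The IQR rule is used here instead of a mean and standard deviation threshold because a single extreme value inflates the standard deviation, which can hide the outlier in a small sample.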
What is Data Visualization?
Raw data can be hard to comprehend and use. Hence, data
scientists prepare and present data in the right context. They give it a
visual form so that decision-makers can identify the relationships
between data and detect hidden patterns or trends. Data visualization
creates stories that advance business intelligence and support data-
driven decision-making and strategic planning.
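To make the idea of giving data a visual form concrete, here is a minimal Python sketch that turns a few numbers into a text bar chart. It uses only the standard library so it runs anywhere; real projects would normally use a charting library or a BI tool instead. The function name text_bar_chart and the regional sales figures are hypothetical.

```python
def text_bar_chart(data, width=30):
    """Return one text line per label, with a bar scaled to the largest value."""
    peak = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale each bar so the largest value fills the full width.
        bar = "#" * round(value * width / peak)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return lines


# Hypothetical regional sales figures, invented for illustration.
sales = {"North": 120, "South": 80, "East": 40}
chart = text_bar_chart(sales)
print("\n".join(chart))
```

Even in this crude form, the relative sizes of the regions are visible at a glance, which is harder to see in a bare list of numbers.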
Strategic decision-making