Exploratory Data Analysis in ML

Exploratory data analysis (EDA) involves analyzing and investigating datasets to understand their main characteristics and relationships between variables. EDA helps identify issues, patterns, and anomalies to better prepare data for machine learning tasks. It is an important initial step that improves model performance, helps select appropriate techniques, and tests assumptions about the data.

Uploaded by

Suresh Kumar

Exploratory Data Analysis

If you are a data scientist or a machine learning enthusiast, you probably know that EDA stands for
Exploratory Data Analysis.
But do you know why EDA is so important in the ML workflow?
In this blog post, I will try to explain why EDA is not just a preparatory step, but a critical one that can
make or break your ML project.

What EDA is
EDA is the process of exploring and understanding your data before applying any ML algorithms or
models.
It involves visualizing, summarizing, and finding patterns, outliers, and anomalies in your data.
EDA helps you to gain insights and intuition about your data, which can guide your ML choices and
improve your results.

Why EDA is performed in ML:

1. To identify and fix data quality issues:
This includes missing values, incorrect labels, duplicates, or errors.
These issues can affect the performance and accuracy of your ML models, so it is better to deal with
them early on.

2. To understand the data:

EDA helps you become familiar with the distribution, range, and variability of your data.

3. To choose the appropriate ML techniques:

Understanding the data through EDA helps you choose appropriate preparation techniques, such as
scaling, normalization, transformation, or feature engineering, that can enhance your data and make it
more suitable for machine learning.

4. To select the most relevant features:

EDA helps you to discover the relationships and correlations between your variables. This can help
you to select the most relevant and informative features for your ML models, and avoid
multicollinearity or redundancy.

5. To generate/ engineer new features:

EDA provides inspiration and reveals avenues for creating new features, e.g. by combining or
transforming existing ones.

6. To detect and handle outliers and anomalies:

Outliers are extreme values that deviate from the normal range of your data, while anomalies are
values that do not conform to the expected pattern or behaviour of your data.
Both outliers and anomalies can affect the performance and generalization of your ML models, so it is
important to identify them and decide how to deal with them (e.g., remove them, replace them, or
keep them).

7. To test your assumptions and hypotheses about your data:

Note that EDA is not enough to draw definitive conclusions, but it helps you test your intuitive
assumptions and hypotheses about your data.
For example, you might have some prior knowledge or expectations about what your data should look
like, or how your variables should interact with each other. EDA can help you to validate or invalidate
these assumptions and hypotheses, and adjust them accordingly.

8. To communicate and present your findings and insights to others:

Your insights are only valuable if others understand them.

EDA often involves creating visualizations, such as charts, graphs, plots, or maps, that can help you to
convey complex information in a simple and intuitive way.
Visualizations can also help you to tell a story with your data, and highlight the key points and
takeaways for your audience.
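
Several of the checks in the list above (duplicates from point 1, correlations from point 4, and outliers from point 6) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the records, column meanings, and the 1.5 × IQR threshold (Tukey's rule) are assumptions made for the example, not part of the original post.

```python
import math
from collections import Counter
from statistics import quantiles

# Hypothetical records: (customer_id, age, monthly_spend)
rows = [
    (1, 25, 200.0),
    (2, 32, 260.0),
    (3, 41, 330.0),
    (3, 41, 330.0),   # exact duplicate of the previous row
    (4, 38, 310.0),
    (5, 29, 2500.0),  # suspiciously large spend
    (6, 45, 370.0),
    (7, 35, 280.0),
]

# Point 1: find rows that occur more than once, then keep one copy of each
duplicates = [row for row, count in Counter(rows).items() if count > 1]
unique_rows = list(dict.fromkeys(rows))  # preserves first-seen order

# Point 6: flag values further than 1.5 * IQR from the quartiles (Tukey's rule)
spend = [row[2] for row in unique_rows]
q1, _, q3 = quantiles(spend, n=4)
iqr = q3 - q1
outliers = [s for s in spend if s < q1 - 1.5 * iqr or s > q3 + 1.5 * iqr]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Point 4: correlation between age and spend, computed without the flagged rows
clean = [row for row in unique_rows if row[2] not in outliers]
r_age_spend = pearson([row[1] for row in clean], [row[2] for row in clean])

print("duplicates:", duplicates)
print("outliers:", outliers)
print("correlation(age, spend):", round(r_age_spend, 3))
```

Whether a flagged value should be removed, replaced, or kept remains a judgment call, as point 6 notes; the code only surfaces candidates.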

In conclusion
Building your ML models and selecting features based on intuition alone, without carefully carrying
out EDA, is bad practice and will undermine the performance of your model.

Exploratory Data Analysis in [ML]

What is exploratory data analysis?

Exploratory data analysis (EDA) is used by data scientists to analyze and investigate data sets
and summarize their main characteristics, often employing data visualization methods. It
helps determine how best to manipulate data sources to get the answers you need, making it
easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check
assumptions.
EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis
testing task, and it provides a better understanding of data set variables and the
relationships between them. It can also help determine if the statistical techniques you are
considering for data analysis are appropriate. Originally developed by American
mathematician John Tukey in the 1970s, EDA techniques continue to be a widely used
method in the data discovery process today.

Why is exploratory data analysis important in data science?

The main purpose of EDA is to help you look at data before making any assumptions. It can help
identify obvious errors, better understand patterns within the data, detect outliers
or anomalous events, and find interesting relations among the variables.
Data scientists can use exploratory analysis to ensure the results they produce are valid and
applicable to any desired business outcomes and goals. EDA also helps stakeholders by
confirming they are asking the right questions. EDA can help answer questions about
standard deviations, categorical variables, and confidence intervals. Once EDA is complete
and insights are drawn, its features can then be used for more sophisticated data analysis or
modeling, including machine learning.

Programming Language Used

Python: an interpreted, object-oriented programming language with dynamic semantics. Its
high-level, built-in data structures, combined with dynamic typing and dynamic binding,
make it very attractive for rapid application development, as well as for use as a scripting or
glue language to connect existing components together. Python and EDA can be used
together to identify missing values in a data set, which is important so you can decide how to
handle missing values for machine learning.
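
As a minimal sketch of the missing-value check mentioned above, the snippet below counts missing entries per column using only the Python standard library. The records and column names are hypothetical, and missing entries are assumed to be represented as `None`.

```python
# Hypothetical records; None marks a missing entry
records = [
    {"age": 34, "income": 52000, "city": "Pune"},
    {"age": None, "income": 61000, "city": "Delhi"},
    {"age": 29, "income": None, "city": None},
    {"age": 41, "income": None, "city": "Pune"},
]
columns = ["age", "income", "city"]

# Number of missing entries in each column
missing_counts = {
    col: sum(1 for row in records if row.get(col) is None) for col in columns
}

# Share of rows affected, which helps decide between dropping and imputing
missing_share = {col: count / len(records) for col, count in missing_counts.items()}

for col in columns:
    print(f"{col}: {missing_counts[col]} missing ({missing_share[col]:.0%})")
```

In practice a data-frame library such as pandas is commonly used for this kind of check; the plain-Python version is shown here only to keep the example self-contained.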
TYPES OF EXPLORATORY DATA ANALYSIS:
1. Univariate Non-graphical
2. Multivariate Non-graphical
3. Univariate graphical
4. Multivariate graphical

1. Univariate Non-graphical: This is the simplest form of data analysis because it examines only one
variable at a time. The basic objective of univariate non-graphical EDA is to understand the sample
distribution and the underlying data in order to draw conclusions about the population. The
analysis also includes outlier detection. The characteristics of the population distribution include:
• Central tendency: The central tendency, or location, of a distribution concerns its typical or
middle values. The mean, the median, and occasionally the mode are the usual measures of
central tendency, with the mean being the most common. The median may be preferred when
the distribution is skewed or when outliers are a concern.
• Spread: Spread indicates how far from the centre we should expect to find the data values.
The variance and the standard deviation are two useful measures of spread. The variance is
the mean of the squared deviations from the mean, and the standard deviation is the square
root of the variance.
• Skewness and kurtosis: Two more useful univariate descriptors are the skewness and kurtosis
of the distribution. Skewness is a measure of asymmetry, and kurtosis is a more subtle
measure of peakedness compared to a normal distribution.
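
The descriptors above can be computed directly. The following is a rough sketch on a made-up, right-skewed sample; it uses the population (divide-by-n) formulas, and the moment-based definitions of skewness and excess kurtosis are one common convention among several.

```python
import statistics

# Hypothetical right-skewed sample with one large value
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]
n = len(data)

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Spread: variance is the mean squared deviation, std_dev is its square root
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Central moments used for the shape descriptors
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n

skewness = m3 / m2 ** 1.5          # > 0 indicates a longer right tail
excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, std_dev={std_dev:.2f}")
print(f"skewness={skewness:.2f}, excess_kurtosis={excess_kurtosis:.2f}")
```

Here the mean lies above the median, which is the pattern the text describes for a skewed distribution where the median may be the better summary.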
2. Multivariate Non-graphical: Multivariate non-graphical EDA techniques are typically used to show
the relationship between two or more variables through cross-tabulation or statistics.
• For categorical data, an extension of tabulation called cross-tabulation is extremely
useful. For two variables, build a two-way table whose column headings are the levels of
one variable and whose row headings are the levels of the other, then fill each cell with
the count of subjects that share that pair of levels.
• For one categorical variable and one quantitative variable, compute statistics for the
quantitative variable separately for each level of the categorical variable, then compare
the statistics across those levels.
• Comparing the means is an informal version of one-way ANOVA, and comparing the medians is
a robust version of one-way ANOVA.
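
To make the two techniques above concrete, here is a small sketch that builds a two-way count table and then compares a quantitative variable across the levels of a categorical one. The variable names (`plan`, `churned`, `monthly_spend`) and values are invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical rows: (plan, churned, monthly_spend)
rows = [
    ("basic", "yes", 20.0),
    ("basic", "no", 25.0),
    ("basic", "yes", 22.0),
    ("premium", "no", 60.0),
    ("premium", "no", 55.0),
    ("premium", "yes", 70.0),
]

# Cross-tabulation: count of rows for each (plan, churned) pair of levels
crosstab = Counter((plan, churned) for plan, churned, _ in rows)

plans = sorted({plan for plan, _, _ in rows})
answers = sorted({churned for _, churned, _ in rows})
print("plan".ljust(10) + "".join(a.ljust(6) for a in answers))
for plan in plans:
    print(plan.ljust(10) + "".join(str(crosstab[(plan, a)]).ljust(6) for a in answers))

# Quantitative variable summarized separately for each level of the categorical one
spend_by_plan = defaultdict(list)
for plan, _, spend in rows:
    spend_by_plan[plan].append(spend)

group_means = {plan: mean(values) for plan, values in spend_by_plan.items()}
group_medians = {plan: median(values) for plan, values in spend_by_plan.items()}
print(group_means)
print(group_medians)
```

Comparing `group_means` across plans is the informal counterpart of one-way ANOVA described above; `group_medians` gives the more robust comparison.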
3. Univariate graphical: Non-graphical methods are quantitative and objective, but they do not give
the complete picture of the data; graphical methods, which involve a degree of subjective
analysis, are therefore also required. Common types of univariate graphics are:
• Histogram: The most basic graph is the histogram, a bar plot in which each bar represents
the frequency (count) or proportion (count/total count) of cases for a range of values.
Histograms are one of the simplest ways to quickly learn a lot about your data, including
central tendency, spread, modality, shape and outliers.
• Stem-and-leaf plots: A simple substitute for a histogram is the stem-and-leaf plot. It shows
all data values and the shape of the distribution.
• Boxplots: Another very useful univariate graphical technique is the boxplot. Boxplots are
excellent at presenting information about central tendency and show robust measures of
location and spread, as well as providing information about symmetry and outliers, although
they can be misleading about aspects such as multimodality. One of the best uses of boxplots
is in the form of side-by-side boxplots.
• Quantile-normal plots: The final univariate graphical EDA technique is the most intricate.
It is called the quantile-normal or QN plot or, more generally, the quantile-quantile or QQ
plot. It is used to see how well a particular sample follows a particular theoretical
distribution. It allows detection of non-normality and diagnosis of skewness and kurtosis.
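
The binning behind a histogram can be sketched without any plotting library. The example below is an illustration only: the sample values, the bin width of 2, and the text rendering with `#` characters are all assumptions chosen to keep the code self-contained.

```python
def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each bin [start + k*w, start + (k+1)*w)."""
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return counts


# Hypothetical sample with one value far from the rest
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]
bin_width = 2
counts = histogram_counts(values, bin_width)

# Render each bin as a row of '#' characters, one per observation
for k, count in enumerate(counts):
    left = k * bin_width
    print(f"[{left:>2}, {left + bin_width:>2}) {'#' * count}")
```

Even in text form, the run of empty bins before the last one makes the isolated value stand out, which is the kind of outlier cue the histogram bullet refers to.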
4. Multivariate graphical: Multivariate graphical EDA uses graphics to display relationships
between two or more sets of data. The only one used commonly is a grouped bar plot, with each
group representing one level of one of the variables and each bar within a group representing a
level of the other variable.
Other common types of multivariate graphics are:
• Scatterplot: For two quantitative variables, the basic graphical EDA technique is the
scatterplot, which has one variable on the x-axis, one on the y-axis, and a point for each
case in your dataset.
• Run chart: A line graph of data plotted over time.
• Heat map: A graphical representation of data where values are depicted by color.
• Multivariate chart: A graphical representation of the relationships between factors and a
response.
• Bubble chart: A data visualization that displays multiple circles (bubbles) in a
two-dimensional plot.

Exploratory Data Analysis (EDA) and Customer Segmentation of Credit Score Classification Dataset

Introduction
Exploratory Data Analysis (EDA) is a crucial phase in any data science project, enabling data
scientists to gain insights, identify patterns, and prepare data for further analysis. This article
serves as a comprehensive guide to conducting EDA effectively. We will break down the
process into key steps and provide detailed insights at each stage.

1. Project Overview 📝
Project Title: Create a concise, descriptive title that encapsulates the main theme of your
analysis.
Goal of the project: Break down the project goals into specific objectives or research
questions. For instance, if your goal is to understand customer churn, you might have sub-
goals like “Identify key factors affecting churn rates” or “Segment customers based on
churn behavior.”
Dataset(s) used: List the dataset names, sources, formats, and any data preprocessing steps
you applied (e.g., cleaning, merging datasets).
Team Members: Specify the roles and responsibilities of each team member. Who was
responsible for data cleaning, analysis, visualization, and reporting?

2. Data Overview 📁
Source(s) of the Data: Provide detailed information about the data sources, including URLs,
databases, and any data retrieval methods.

Data Size: Include information on the number of records, the number of features (columns),
and the memory usage (e.g., “10,000 records with 20 features, consuming 5 MB of
memory”).

Brief Description of the Data: Elaborate on the context and significance of the data, including
how it relates to the project’s goals. Mention any data collection issues or peculiarities.

3. Data Cleaning and Preparation 🔧

Missing Value Treatment: Provide a step-by-step account of how you dealt with missing
data, which may involve data imputation methods such as mean, median, or advanced
techniques like regression imputation.
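
As a rough illustration of the simplest imputation options named above, the sketch below fills missing entries with the mean and with the median of the observed values. The `ages` column is hypothetical, and missing entries are assumed to be stored as `None`.

```python
from statistics import mean, median

# Hypothetical column with two missing entries and one extreme value
ages = [34, None, 29, 41, None, 95]

# Only observed values are used to compute the fill value
observed = [a for a in ages if a is not None]
fill_mean = mean(observed)
fill_median = median(observed)

mean_filled = [a if a is not None else fill_mean for a in ages]
median_filled = [a if a is not None else fill_median for a in ages]

print("mean imputation:  ", mean_filled)
print("median imputation:", median_filled)
```

The extreme value pulls the mean well above the median here, which is one reason the median is often preferred for skewed columns.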

Outlier Detection & Treatment: Detail the process of identifying outliers and your approach
to handling them, such as using visualization, statistical tests, or filtering.

Feature Engineering: Elaborate on feature engineering, including the creation of new
features, transformations, and scaling. Explain the rationale behind these engineering
decisions.
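
Two common scaling transformations, min-max scaling and z-score standardization, can be sketched as follows. The input values are made up, and the choice of the population standard deviation for the z-score is an assumption of this example.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def z_score(values):
    """Centre values on their mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values
income_k = [10, 20, 30, 40, 50]

scaled = min_max_scale(income_k)
standardized = z_score(income_k)

print(scaled)
print([round(z, 3) for z in standardized])
```

Both functions assume the values are not all identical; a constant column would cause a division by zero and is usually dropped before scaling.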

4. Exploratory Data Analysis
Univariate Analysis: Include a breakdown of the univariate analysis. Provide summary
statistics for each feature, histograms, kernel density plots, and discuss the distribution of
variables. Mention any data skewness.
Bivariate Analysis: Extend the analysis by explaining the bivariate exploration, including
scatter plots, correlation matrices, and any significant relationships discovered.

Multivariate Analysis: Describe how you conducted multivariate exploration, such as
clustering or principal component analysis, to uncover complex interactions between
variables.

5. Visualization 📊
Graphs and Plots: Provide a detailed inventory of the types of graphs and plots used for
visualization, with explanations of how each visualization was chosen for specific insights.
Insights Derived from the Visuals: Elaborate on the insights gained from the visualizations,
including patterns, anomalies, and trends. Relate these findings to the project’s goals.

6. Hypothesis Testing and Insights 🎯

Statistical Tests Used: Describe the specific statistical tests performed, including the null and
alternative hypotheses, test statistics, and degrees of freedom.
Findings and Insights: Explain the results of hypothesis tests and their implications. Discuss
any significant relationships or differences and their relevance to the project’s goals.
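
As one possible example of reporting a test statistic and its degrees of freedom, the sketch below computes Welch's two-sample t statistic with the Welch-Satterthwaite approximation. The choice of this particular test and the two samples are assumptions for illustration; the null hypothesis is that the two group means are equal. Turning the statistic into a p-value requires the t distribution, which is left to a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n)
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical measurements for two customer segments
segment_a = [5.1, 4.9, 5.6, 5.8, 6.0]
segment_b = [4.1, 4.5, 4.0, 4.8, 4.6]

t_stat, df = welch_t(segment_a, segment_b)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```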

7. Final Report and Presentation 📑

Key Findings: Summarize the most critical findings in a concise and impactful manner. Use
visuals and key statistics to highlight these findings.
Limitations of the Analysis: Discuss potential limitations in the data, methodology, or
analysis. Address any sources of bias or uncertainty.
Recommendations and Next Steps: Offer actionable recommendations based on your
insights and propose what actions or analyses should follow this EDA.

8. References and Data Sources 📚

Dataset Links: Provide direct links to the sources of the data. Verify that the links are
accessible and up-to-date.
Research Papers or Articles Referenced: List external sources, research papers, or articles
you referenced during the analysis, with proper citations.

9. Project Files 📂
EDA Code Files: List and organize the code files, scripts, or notebooks used for different
stages of the analysis. Include comments and explanations in the code.
Data Files: Specify the data files used, including their names, formats, and descriptions.
Presentations or Reports: Include the final presentations, reports, or documents created for
communication and sharing of your EDA project results.

Conclusion
Exploratory Data Analysis is a fundamental step in the data science workflow. This guide
provides a comprehensive framework to help you navigate the process effectively, from
project initiation to data exploration, hypothesis testing, and beyond. By following these
steps, you’ll be better equipped to uncover valuable insights and make informed decisions
based on your data.
