This project uses a fictitious Sleep Health and Lifestyle dataset to explore how sleep quality, stress, physical activity, and health indicators interact across demographic categories.
⚠️ Note: All insights are based on synthetic data and should not be interpreted as medically accurate.
More about the dataset here: Sleep Health and Lifestyle Dataset (Kaggle)
The dataset includes 15 attributes such as age, gender, occupation, sleep duration, physical activity, heart rate, blood pressure, BMI category, daily steps, and sleep disorder presence.
- What is the relationship between Stress and Sleep Quality?
- What is the relationship between Sleep Disorders and Physical Activity?
- What is the relationship between Health Factors and Sleep Patterns?
Each question is supported by a primary visualization, interaction capabilities, and a set of secondary insights drawn from demographic comparisons (age, gender, occupation).
This stacked bar chart shows how stress levels (color-coded from red to green) are distributed across sleep quality categories.
- Encoding:
  - X-axis: Sleep Quality categories
  - Y-axis: Number of people
  - Color: Stress level intensity
- Purpose: Visually highlights how stress influences reported sleep quality.
- Interactivity: Filtering by age, gender, occupation.
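
As a rough illustration of the encoding above, the sketch below counts people per sleep quality value and stress band and, if ggplot2 is available, builds the stacked bars. The toy values, the `Stress.Band` cut points, and the exact palette are assumptions for illustration; only the column names `Quality.of.Sleep` and `Stress.Level` come from the dataset description.

```r
# Toy data standing in for the real dataset (values are made up).
sleep <- data.frame(
  Quality.of.Sleep = c(4, 4, 6, 6, 6, 8, 8, 9),
  Stress.Level     = c(8, 7, 6, 5, 6, 3, 4, 3)
)

# Bin stress into three ordered bands (cut points are an assumption).
sleep$Stress.Band <- cut(
  sleep$Stress.Level,
  breaks = c(-Inf, 4, 6, Inf),
  labels = c("Low", "Medium", "High")
)

# Counts behind the stacked bars: rows = sleep quality, columns = stress band.
counts <- table(sleep$Quality.of.Sleep, sleep$Stress.Band)
print(counts)

# Draw the chart only when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(sleep, aes(x = factor(Quality.of.Sleep), fill = Stress.Band)) +
    geom_bar(position = "stack") +
    scale_fill_manual(values = c(Low = "green", Medium = "orange", High = "red")) +
    labs(x = "Sleep Quality", y = "Number of people", fill = "Stress level")
}
```

Each bar's total height is the number of people reporting that sleep quality, and the colored segments split that total by stress band.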
- Shows distribution of sleep disorders.
- Slice angles and colors distinguish between:
  - Green: None
  - Orange: Insomnia
  - Red: Sleep Apnea
- Compares physical activity levels across sleep disorder categories.
- Reveals relationships between activity and disorder type.
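
The two views above boil down to a share per disorder category and a summary of activity per category. A minimal base R sketch follows; the column names `Sleep.Disorder` and `Physical.Activity.Level` and all values are assumptions, not taken from the project's code.

```r
# Toy data (made up); column names are assumed, not confirmed by the project.
activity <- data.frame(
  Sleep.Disorder = c("None", "None", "Insomnia", "Insomnia",
                     "Sleep Apnea", "Sleep Apnea"),
  Physical.Activity.Level = c(60, 70, 40, 50, 75, 85)
)

# Share of each disorder category: the quantity the slice angles encode.
shares <- prop.table(table(activity$Sleep.Disorder))

# Mean physical activity per disorder category.
by_disorder <- aggregate(
  Physical.Activity.Level ~ Sleep.Disorder,
  data = activity,
  FUN = mean
)
print(by_disorder)
```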
- Axes represent:
  - X-axis: Health Factors (Heart Rate, BP, BMI)
  - Y-axis: Sleep Patterns (Sleep Duration, Quality, Disorder)
- Color Gradient: From white (no correlation) to purple (strong correlation)
- Purpose: Helps identify health metrics most linked to sleep patterns.
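
One way such a heatmap could be computed is sketched below, assuming Pearson correlation on numeric columns only and a color scale driven by the absolute correlation. The project may handle categorical fields such as disorder type differently. `Systolic.BP` is a hypothetical numeric column, and all values are made up.

```r
# Toy numeric columns (made up). Systolic.BP is a hypothetical column name.
health <- data.frame(
  Heart.Rate  = c(65, 70, 72, 80, 85),
  Systolic.BP = c(115, 120, 125, 135, 140)
)
sleep_patterns <- data.frame(
  Sleep.Duration   = c(8.0, 7.5, 7.0, 6.0, 5.5),
  Quality.of.Sleep = c(9, 8, 7, 5, 4)
)

# Cross-correlation matrix: rows = health factors, columns = sleep patterns.
corr <- cor(health, sleep_patterns)

# Map |correlation| onto a white-to-purple ramp with 100 steps.
strength <- abs(corr)
palette <- colorRampPalette(c("white", "purple"))(100)
idx <- pmin(100, pmax(1, ceiling(strength * 100)))
cell_colour <- matrix(palette[idx], nrow = nrow(corr), dimnames = dimnames(corr))

print(round(corr, 2))
```

Using the absolute value means strong negative and strong positive correlations both appear dark purple, which matches the "no correlation" to "strong correlation" legend.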
- Allows users to select two quantitative variables to cluster and explore.
- Uses PAM (K-Medoids) clustering, which is more robust to outliers than k-means.
- Outputs:
  - A scatter plot showing clusters
  - A dendrogram representing hierarchical relationships
  - A reference line to suggest an optimal number of clusters
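
The clustering step can be sketched with `cluster::pam()` on two scaled variables. The variable choice and toy values are made up, and two parts are assumptions rather than the app's confirmed method: choosing the number of clusters by average silhouette width, and building the dendrogram with `hclust()`.

```r
library(cluster)  # provides pam()

# Two toy variables with two obvious groups (values are made up).
x <- cbind(
  Sleep.Duration = c(5.8, 6.0, 6.1, 6.2, 5.9, 7.9, 8.0, 8.1, 8.2, 7.8),
  Daily.Steps    = c(3900, 4000, 4100, 4200, 3800, 8900, 9000, 9100, 9200, 8800)
)

# Scale so that steps (thousands) do not dominate hours (single digits).
x_scaled <- scale(x)

# K-medoids with k = 2; each cluster is represented by an actual data point.
fit <- pam(x_scaled, k = 2)
print(fit$clustering)
print(fit$medoids)

# Assumed heuristic: average silhouette width for k = 2..5, higher is better.
ks <- 2:5
sil <- sapply(ks, function(k) pam(x_scaled, k = k)$silinfo$avg.width)
best_k <- ks[which.max(sil)]

# Assumed approach for the dendrogram: hierarchical clustering on the same data.
tree <- as.dendrogram(hclust(dist(x_scaled)))
```

A scatter plot colored by `fit$clustering` and a vertical line at `best_k` on the silhouette curve would correspond to the first and third outputs listed above.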
- Aggregates state-level values for any selected quantitative attribute.
- Uses a purple gradient where:
  - Darker = worse values (e.g., low steps)
  - Lighter = healthier values (e.g., high steps)
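
A minimal sketch of the aggregation and color mapping, assuming a `State` column, a mean as the aggregate, and two illustrative purple hex values. None of these are confirmed by the project, and for attributes where high values are worse the ramp would need to be reversed.

```r
# Toy records (made up). The State column name is an assumption.
records <- data.frame(
  State       = c("CA", "CA", "TX", "TX", "NY"),
  Daily.Steps = c(8000, 10000, 4000, 6000, 7000)
)

# One value per state: here the mean of the selected attribute.
by_state <- aggregate(Daily.Steps ~ State, data = records, FUN = mean)

# Rescale to [0, 1]: 0 = lowest steps (worst), 1 = highest steps (healthiest).
rng <- range(by_state$Daily.Steps)
scaled <- (by_state$Daily.Steps - rng[1]) / diff(rng)

# Dark purple for low values, light purple for high values.
ramp <- colorRamp(c("#3F007D", "#EFEDF5"))
by_state$Fill <- rgb(ramp(scaled), maxColorValue = 255)

print(by_state)
```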
All visualizations support dynamic filtering:
- By age range
- By gender
- By occupation
This allows users to slice the data and examine group-specific trends interactively.
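
The filtering logic can be sketched as a plain function. `filter_people()` is a hypothetical helper, not a function from the project; in a Shiny app it would typically be called inside a reactive expression with the current input values.

```r
# Hypothetical helper: keep rows matching an age range and optional
# gender / occupation selections. NULL means "no filter on this field".
filter_people <- function(df, age_range = c(-Inf, Inf),
                          genders = NULL, occupations = NULL) {
  keep <- df$Age >= age_range[1] & df$Age <= age_range[2]
  if (!is.null(genders)) {
    keep <- keep & df$Gender %in% genders
  }
  if (!is.null(occupations)) {
    keep <- keep & df$Occupation %in% occupations
  }
  df[keep, , drop = FALSE]
}

# Toy data (made up).
people <- data.frame(
  Age        = c(25, 35, 45, 55),
  Gender     = c("Male", "Female", "Female", "Male"),
  Occupation = c("Nurse", "Doctor", "Engineer", "Nurse")
)

# Women aged 30 to 50, any occupation.
selected <- filter_people(people, age_range = c(30, 50), genders = "Female")
print(selected)
```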
- The visualizations help explore patterns such as the link between stress and sleep quality, differences in physical activity across sleep disorders, and correlations between health factors and sleep. The dataset, however, is synthetic and meant for educational purposes only.
- The findings should not be interpreted as real-world insights. The purpose of this project is to showcase interactive data visualization techniques.
- R (>= 4.2.0)
- shiny (>= 1.8.0)
- dplyr (>= 1.1.4)
- plotly (>= 4.10.3)
- ggplot2 (>= 3.4.4)
- forcats (>= 1.0.0)
- tidyr (>= 1.3.0)
- stats (>= 4.3.1)
- cluster (>= 2.1.6)
- sf (>= 1.0.14)
- leaflet (>= 2.2.1)
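
A quick way to see which of the packages listed above are missing is sketched below. It only checks presence, not the minimum versions, and the install call is left commented out so nothing is downloaded unintentionally.

```r
# Packages from the list above ("stats" ships with R itself, so it is omitted).
required <- c("shiny", "dplyr", "plotly", "ggplot2", "forcats",
              "tidyr", "cluster", "sf", "leaflet")

# requireNamespace() returns FALSE instead of erroring when a package is absent.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All required packages are available.")
}
```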
- Clone this repository.
- Open the project in PyCharm with the R plugin installed.
- Run the application.R file.
- Access the app in your browser at the URL provided in the console (e.g., http://127.0.0.1:7907).
The app uses resources/dataset_usa.csv, which contains the following columns:
ID, Gender, Age, Occupation, Stress.Level, Quality.of.Sleep, etc.
- Ensure all required R packages are installed.
- If you encounter errors, check the console for details and verify the dataset format.
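
To verify the dataset format, a small check along these lines may help. `missing_columns()` is a hypothetical helper, and it only tests the column names explicitly listed in this README, since the full column list is not given here.

```r
# Column names named in this README; the real file contains more.
required_cols <- c("ID", "Gender", "Age", "Occupation",
                   "Stress.Level", "Quality.of.Sleep")

# Hypothetical helper: which required columns are absent from a data frame?
missing_columns <- function(df, required = required_cols) {
  setdiff(required, names(df))
}

path <- "resources/dataset_usa.csv"
if (file.exists(path)) {
  dataset <- read.csv(path)
  absent <- missing_columns(dataset)
  if (length(absent) > 0) {
    stop("Dataset is missing columns: ", paste(absent, collapse = ", "))
  }
  str(dataset)  # quick look at column types
} else {
  message("File not found: ", path, " (run this from the project root)")
}
```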





