This repository contains code for performing linear regression with the scikit-learn library. The code predicts the air quality index (AQI) from various atmospheric parameters.
This code uses the "city_day.csv" dataset, which contains daily air quality data for multiple cities in India. The dataset is available on Kaggle.
Please refer to the Kaggle dataset documentation for more details on the data format and columns.
Note: The dataset may require preprocessing steps, such as handling missing values or removing irrelevant columns, before using it for linear regression.
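The preprocessing mentioned in the note above could look roughly like the following sketch. It is not taken from the repository code. The column names (`City`, `Date`, `AQI_Bucket`, `AQI`, `PM2.5`, `NO2`), the choice of median imputation, and the helper name `preprocess` are assumptions made for illustration; check them against the Kaggle documentation. A small made-up frame stands in for the real CSV so the example is self-contained.

```python
import numpy as np
import pandas as pd

# Assumed column names; verify against the Kaggle dataset documentation.
TARGET = "AQI"
NON_FEATURE_COLUMNS = ["City", "Date", "AQI_Bucket"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a numeric frame with no missing values (hypothetical helper)."""
    # Drop identifier/label columns if they are present.
    df = df.drop(columns=[c for c in NON_FEATURE_COLUMNS if c in df.columns])
    # Rows without a target value cannot be used for supervised training.
    df = df.dropna(subset=[TARGET])
    # Fill remaining gaps in each feature with that column's median
    # (one possible strategy among several).
    return df.fillna(df.median(numeric_only=True))


# Made-up rows standing in for pd.read_csv("city_day.csv").
sample = pd.DataFrame({
    "City": ["Delhi", "Delhi", "Mumbai", "Mumbai"],
    "Date": ["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"],
    "PM2.5": [150.0, np.nan, 40.0, 60.0],
    "NO2": [45.0, 50.0, np.nan, 20.0],
    "AQI": [300.0, 280.0, np.nan, 95.0],
    "AQI_Bucket": ["Very Poor", "Poor", None, "Satisfactory"],
})

clean = preprocess(sample)
print(clean)
```

With the real data, `sample` would be replaced by the frame loaded from "city_day.csv". Whether to impute or drop rows with missing features depends on how much data is missing in each column.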
The following dependencies are required to run the code:
- pandas
- numpy
- matplotlib
- scikit-learn
The code fits a linear regression model to the training data and predicts AQI values for the held-out test set. It reports the mean squared error (MSE) and the R² score as evaluation metrics for the model.
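As a minimal sketch of that workflow (fit, predict, evaluate), the snippet below uses scikit-learn's `LinearRegression`, `train_test_split`, `mean_squared_error`, and `r2_score`. It is an illustration, not the repository's actual script: the data is synthetic, and the 80/20 split, the random seeds, and the coefficients used to generate the target are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the pollutant features and AQI target (assumption):
# three features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 200, size=(200, 3))
y = 1.5 * X[:, 0] + 0.8 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 10, size=200)

# Hold out 20% of the rows as the test set (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit ordinary least squares on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the unseen test rows and score the predictions.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"MSE: {mse:.2f}")
print(f"R2:  {r2:.3f}")
```

MSE is in squared target units, so lower is better. An R² of 1 means perfect prediction and 0 means the model does no better than always predicting the mean of the test targets.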
Contributions are welcome! If you find any issues or want to enhance the code, feel free to submit a pull request.