
ShreyanshDubey09/Fraud-Detection-Project


🔎 Fraud Detection with Machine Learning

[Image: Confusion Matrix]

Detecting fraudulent transactions using advanced machine learning models: RandomForest and LightGBM.
This project demonstrates end-to-end data preprocessing, feature engineering, model training, evaluation, and visualization.



Features

✅ Data cleaning (handling duplicates & missing values)
✅ One-hot encoding for categorical variables
✅ Scaled numerical features for better model performance
✅ Two models compared: RandomForest (robust) & LightGBM (fast)
✅ Performance evaluation: Confusion Matrix, ROC-AUC, Precision-Recall, Feature Importance
✅ Ready-to-use saved models (.pkl) and generated plots (.png)
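The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of fraud_detection.py: the toy data generator, the column names (`amount`, `type`, `is_fraud`), and the helper names are all hypothetical, and only the RandomForest branch is shown (a LightGBM classifier could presumably be dropped into the same pipeline).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_transactions(n_rows=2000, seed=0):
    """Hypothetical stand-in for Fraud.csv; the real columns may differ."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "amount": rng.exponential(scale=100.0, size=n_rows),
        "type": rng.choice(["TRANSFER", "CASH_OUT", "PAYMENT"], size=n_rows),
    })
    # Toy labelling rule: large transfers are much more likely to be fraud.
    p_fraud = np.where((df["type"] == "TRANSFER") & (df["amount"] > 200), 0.8, 0.02)
    df["is_fraud"] = (rng.random(n_rows) < p_fraud).astype(int)
    return df


def clean(df):
    """Data cleaning: drop exact duplicate rows and rows with missing values."""
    return df.drop_duplicates().dropna()


def build_pipeline(numeric_cols, categorical_cols):
    """Scale numeric columns, one-hot encode categorical ones, then fit a forest."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    model = RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=42
    )
    return Pipeline([("preprocess", preprocess), ("model", model)])


df = clean(make_toy_transactions())
X, y = df[["amount", "type"]], df["is_fraud"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

pipeline = build_pipeline(["amount"], ["type"])
pipeline.fit(X_train, y_train)

# Evaluation: confusion matrix on hard predictions, ROC-AUC on probabilities.
cm = confusion_matrix(y_test, pipeline.predict(X_test))
auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])
print(cm)
print(f"ROC-AUC: {auc:.3f}")
```

Wrapping the scaler and encoder in a `Pipeline` means they are fitted on the training split only, which avoids leaking test-set statistics into preprocessing.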


Project Structure

fraud-detection/
│
├── fraud_detection.py          # Main Python script
├── rf_model.pkl                # Saved RandomForest model
├── lgb_model.pkl               # Saved LightGBM model
├── rf_confusion.png            # Confusion Matrix plot
├── rf_roc.png                  # ROC Curve
├── rf_pr.png                   # Precision-Recall Curve
├── rf_feature_importance.png   # Feature Importance
└── README.md                   # This file
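Since the install list includes joblib, the saved .pkl files are presumably written and read with it. The sketch below shows a save/load round trip under that assumption; it trains a small throwaway model on random data and writes it to a temporary directory, so it does not depend on the repository's actual rf_model.pkl or on the feature layout the real model expects.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Throwaway training data (hypothetical; not the project's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

model = RandomForestClassifier(n_estimators=20, random_state=42).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rf_model.pkl")
    joblib.dump(model, path)        # save the fitted model to disk
    restored = joblib.load(path)    # load it back as a new object

# The restored model should reproduce the original predictions.
same = np.array_equal(model.predict(X), restored.predict(X))
print(same)
```

When loading the repository's own files, the input passed to `predict` would need the same columns, in the same order and with the same preprocessing, as the data the model was trained on.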


⚙️ Installation

  1. Clone the repository:
    git clone https://github.com/ShreyanshDubey09/fraud-detection.git
    cd fraud-detection

  2. Install the required libraries:
    pip install pandas numpy matplotlib seaborn scikit-learn lightgbm joblib

Usage

  1. Place your Fraud.csv dataset in the working directory or update the path in fraud_detection.py.

  2. Run the script:

python fraud_detection.py

If you are working in Google Colab instead, mount Google Drive and load the dataset from there:

from google.colab import drive
import pandas as pd

drive.mount('/content/drive')
df = pd.read_csv('/content/drive/MyDrive/Fraud.csv')

Future Improvements

  • Implement SMOTE or undersampling for extreme class imbalance
  • Deploy as a REST API using FastAPI or Flask
  • Experiment with deep learning models (e.g., LSTM or Autoencoder)
  • Add hyperparameter tuning (e.g., GridSearchCV, Optuna)
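Two of the items above, undersampling and hyperparameter tuning, could look something like the sketch below. It is only an illustration on synthetic data: `undersample_majority` is a hypothetical helper written with plain NumPy (not the imbalanced-learn API), it assumes the fraud class is labelled 1, and the parameter grid is arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def undersample_majority(X, y, ratio=1.0, seed=0):
    """Keep every minority (label 1) row and a random subset of majority rows.

    ratio is the number of majority rows kept per minority row.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == 1)
    majority_idx = np.flatnonzero(y == 0)
    n_keep = min(len(majority_idx), int(ratio * len(minority_idx)))
    kept_majority = rng.choice(majority_idx, size=n_keep, replace=False)
    idx = np.sort(np.concatenate([minority_idx, kept_majority]))
    return X[idx], y[idx]


# Synthetic, imbalanced training data (hypothetical stand-in for a training split).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 1.8).astype(int)

# Undersample the training data only; a held-out test set should stay untouched.
X_bal, y_bal = undersample_majority(X, y, ratio=1.0)

grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="average_precision",  # area under the precision-recall curve
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=42),
)
grid.fit(X_bal, y_bal)
print(grid.best_params_, round(grid.best_score_, 3))
```

Average precision is used for scoring here because plain accuracy tends to be uninformative when one class is rare.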

Contributing

Pull requests are welcome! For major changes, open an issue first to discuss what you’d like to change.

If you find this project helpful, please ⭐ the repo — it means a lot!

👩‍💻 Author Shreyansh Dubey

🌐 Portfolio - https://shreyanshdubey09.github.io/shreyansh-dubey.github.io/

💼 LinkedIn - https://www.linkedin.com/in/shreyanshdubey/

🐙 GitHub - https://github.com/ShreyanshDubey09

✉️ Email: sdubey0009999@gmail.com

About

Developed anomaly detection pipeline processing 40K+ transactions using Isolation Forest and LOF algorithms. Achieved 94% precision and 89% recall on test dataset. Designed custom features, evaluated models using F1-score and documented validation procedures.
