
Algorithm to Deduce Parameter from Data

Optuna Hyperparameter Optimization Report

1. Introduction

● Objective: The goal of this project was to optimize the hyperparameters for multiple
outlier detection models using Optuna.

2. Dataset

● Description: The dataset used contains two scaled features (OC, IC) for clustering
and outlier detection.
● Data Preparation: Data was scaled and preprocessed as required for each model
type.
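The scaling step above can be sketched with scikit-learn's StandardScaler; the values below are made-up stand-ins for the OC/IC data, not the actual dataset:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical raw (OC, IC) rows; the last row is an obvious outlier.
X_raw = np.array([[12.0, 300.0],
                  [15.0, 320.0],
                  [11.5, 290.0],
                  [90.0, 900.0]])

scaler = StandardScaler()
X = scaler.fit_transform(X_raw)  # each column now has zero mean, unit variance
```

Distance-based models such as DBSCAN and KMeans are sensitive to feature scale, which is why scaling precedes every model in this report.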

3. Models and Hyperparameters

● DBSCAN:
○ eps: Maximum distance between two samples for one to be
considered as in the neighborhood of the other.
○ min_samples: Minimum number of samples in a neighborhood for a
point to be considered a core point.
● KMeans:
○ n_clusters: Number of clusters to form.
○ init: Method for initialization.
○ max_iter: Maximum number of iterations of the k-means algorithm
for a single run.
● Isolation Forest:
○ n_estimators: Number of base estimators in the ensemble.
○ max_samples: Number of samples to draw to train each base
estimator.
● ABOD:
○ n_neighbors: Number of neighbors to use for the angle-based
calculation.
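As an illustration, the models above can be instantiated with the hyperparameters just listed. The values shown are placeholders, not the tuned optima, and ABOD is assumed to come from PyOD rather than scikit-learn:

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.ensemble import IsolationForest

# Placeholder hyperparameter values for each model discussed above.
dbscan = DBSCAN(eps=0.5, min_samples=5)
kmeans = KMeans(n_clusters=3, init="k-means++", max_iter=300, n_init=10)
iforest = IsolationForest(n_estimators=100, max_samples=256, random_state=0)

# ABOD (from PyOD, if installed) would take n_neighbors similarly, e.g.:
# from pyod.models.abod import ABOD
# abod = ABOD(n_neighbors=10)
```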

4. Optimization Methodology

● Optuna Framework: Optuna was used to automate the hyperparameter tuning process.
The study was configured to maximize outlier detection performance, evaluated
through metrics such as the silhouette score, the Davies-Bouldin index, or other
relevant outlier metrics.
● Search Space:
○ DBSCAN: eps and min_samples.
○ KMeans: n_clusters, init, and max_iter.
○ Isolation Forest: n_estimators and max_samples.
○ ABOD: n_neighbors.
● Optimization Algorithm: Optuna's Tree-structured Parzen Estimator (TPE)
was used for efficient exploration of the hyperparameter space.

5. Results

● Best Hyperparameters:
○ DBSCAN:
■ eps: [Optimal Value]
■ min_samples: [Optimal Value]
○ KMeans:
■ n_clusters: [Optimal Value]
■ init: [Optimal Method]
■ max_iter: [Optimal Value]
○ Isolation Forest:
■ n_estimators: [Optimal Value]
■ max_samples: [Optimal Value]
○ ABOD:
■ n_neighbors: [Optimal Value]
● Performance Metrics:
○ [Include relevant performance metrics for each model before and
after optimization.]

6. Conclusion

● Summary: This algorithm works well for all models except the clustering-based
ones, because clustering-based algorithms require an explicit threshold through
which the anomalies are identified.

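The thresholding issue noted in the conclusion can be illustrated: KMeans produces no anomaly score by itself, so one common workaround is to flag points whose distance to their assigned centroid exceeds a chosen percentile. The 95th percentile below is an arbitrary choice, and the data is synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic clusters plus one obvious outlier at index 100.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),
               rng.normal(3, 0.3, (50, 2)),
               [[10.0, 10.0]]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# Distance from each point to its assigned cluster centroid.
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
# The threshold is a modelling choice the optimizer cannot supply on its own.
threshold = np.percentile(dist, 95)
outliers = np.where(dist > threshold)[0]
```

Because the threshold is external to the clustering model, tuning `n_clusters` or `max_iter` alone does not determine which points count as anomalies.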
Hyperopt Hyperparameter Optimization Report

1. Introduction

● Objective: The purpose of this project was to optimize the hyperparameters for several
outlier detection models using Hyperopt.

2. Dataset

● Description: The dataset comprises two scaled features (OC, IC) used for clustering
and outlier detection.
● Data Preparation: Data was preprocessed and scaled appropriately for each model.

3. Models and Hyperparameters

● DBSCAN:
○ eps: The maximum distance between two samples for one to be
considered as in the neighborhood of the other.
○ min_samples: The minimum number of samples in a neighborhood
for a point to be considered a core point.
● KMeans:
○ n_clusters: Number of clusters to form.
○ init: Method for initialization.
○ max_iter: Maximum number of iterations of the k-means algorithm.
● Isolation Forest:
○ n_estimators: Number of base estimators in the ensemble.
○ max_samples: Number of samples to draw to train each base
estimator.
● ABOD:
○ n_neighbors: Number of neighbors to use for the angle-based
calculation.

4. Optimization Methodology

● Hyperopt Framework: Hyperopt was used to automate the hyperparameter tuning
process. The objective was to maximize outlier detection performance, assessed
through metrics such as the silhouette score, the Davies-Bouldin index, or other
relevant metrics.
● Search Space:
○ DBSCAN: eps and min_samples.
○ KMeans: n_clusters, init, and max_iter.
○ Isolation Forest: n_estimators and max_samples.
○ ABOD: n_neighbors.
● Optimization Algorithm:
○ Search Algorithm: Hyperopt's Tree-structured Parzen Estimator (TPE) was
employed to efficiently explore the hyperparameter space.
○ Trials: The number of trials conducted to explore different combinations of
hyperparameters.

5. Results

● Best Hyperparameters:
○ DBSCAN:
■ eps: [Optimal Value]
■ min_samples: [Optimal Value]
○ KMeans:
■ n_clusters: [Optimal Value]
■ init: [Optimal Method]
■ max_iter: [Optimal Value]
○ Isolation Forest:
■ n_estimators: [Optimal Value]
■ max_samples: [Optimal Value]
○ ABOD:
■ n_neighbors: [Optimal Value]
● Performance Metrics:
○ [Include relevant performance metrics for each model before and
after optimization.]

6. Conclusion

● Summary: This algorithm works well for all models except the clustering-based
ones, because clustering-based algorithms require an explicit threshold through
which the anomalies are identified.
