Decision_Tree_Regression.ipynb - Colab

The document explains the Decision Tree algorithm, detailing its application in regression and classification tasks. It includes code examples for implementing Decision Tree Regression on a synthetic dataset and Decision Tree Classification on the Iris dataset, showcasing model training, prediction, and evaluation. Visualizations of the results are also provided to illustrate the performance of the models.

Uploaded by

mgiri63021
2/20/25, 9:32 AM Decision_Tree_Regression.ipynb - Colab

Decision Tree

A Decision Tree is a popular machine learning algorithm used for both classification and regression tasks.
In regression, the Decision Tree algorithm predicts a continuous target variable by splitting the data into subsets based on feature
values.
The splits are made to minimize the variance (or mean squared error) in the target variable within each subset.
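
The criterion can be written out as follows. This is a sketch of the usual variance-reduction formula for a binary split; the symbols are notation assumed here, not taken from the notebook: S is the set of samples at a node, S_L and S_R are the two subsets produced by a candidate split, and \bar{y}_S is the mean target value in S.

```latex
\operatorname{Var}(S) = \frac{1}{|S|} \sum_{i \in S} \left( y_i - \bar{y}_S \right)^2,
\qquad
\Delta = \operatorname{Var}(S)
       - \frac{|S_L|}{|S|} \operatorname{Var}(S_L)
       - \frac{|S_R|}{|S|} \operatorname{Var}(S_R)
```

A split is preferred when it gives a larger \Delta. Because each subset is predicted by its own mean, Var(S) is also the mean squared error of that prediction on S, which is why the two criteria coincide.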

How Decision Tree Regression Works


Splitting

The algorithm starts at the root node and splits the data into subsets (exactly two in binary-tree implementations such as scikit-learn's), choosing the feature and threshold that give the largest reduction in variance (or the best value of another criterion).
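
The search for a split can be sketched in a few lines. This is a minimal illustration for a single numeric feature, not scikit-learn's actual implementation; the function name `best_split` and the toy arrays are hypothetical.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, variance_reduction) for the best binary split of a 1-D feature."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    n = len(y_sorted)
    parent_var = np.var(y_sorted)
    best_threshold, best_gain = None, 0.0
    for i in range(1, n):
        # A threshold can only be placed between two distinct feature values
        if x_sorted[i] == x_sorted[i - 1]:
            continue
        left, right = y_sorted[:i], y_sorted[i:]
        # Size-weighted average of the child variances
        weighted = (len(left) * np.var(left) + len(right) * np.var(right)) / n
        gain = parent_var - weighted
        if gain > best_gain:
            best_threshold = (x_sorted[i - 1] + x_sorted[i]) / 2
            best_gain = gain
    return best_threshold, best_gain

# Toy data (assumed for illustration): two clearly separated groups
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
threshold, gain = best_split(x, y)
print(threshold, gain)
```

On this toy data the best threshold falls midway between 3 and 10, and the split removes all of the parent node's variance. A full tree would repeat this search over every feature and then recurse into each subset.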

Leaf Nodes

The process continues recursively, creating branches until a stopping criterion is met (e.g., maximum depth, minimum samples per leaf). Nodes that are not split any further are the leaf nodes.
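
The effect of the stopping criteria can be checked directly. The sketch below assumes a synthetic noisy sine dataset similar to the one used later in the notebook; `max_depth` and `min_samples_leaf` are real `DecisionTreeRegressor` parameters, while the variable names and the chosen values are only for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumed): noisy sine curve on [0, 5]
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

# Same data, three different stopping rules
unrestricted = DecisionTreeRegressor(random_state=0).fit(X, y)
shallow = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
min_leaf = DecisionTreeRegressor(min_samples_leaf=10, random_state=0).fit(X, y)

for name, tree in [("unrestricted", unrestricted),
                   ("max_depth=3", shallow),
                   ("min_samples_leaf=10", min_leaf)]:
    print(f"{name}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}")
```

A depth limit of 3 allows at most 2^3 = 8 leaves, and requiring 10 samples per leaf allows at most 8 leaves on 80 points, whereas the unrestricted tree keeps splitting until no further split is possible.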

Prediction

For a new data point, the algorithm traverses the tree from the root to a leaf node, and the prediction is typically the mean of the target values
in that leaf node.
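
The leaf-mean rule can be verified against scikit-learn. This is a small check under assumed synthetic data; it relies on the default squared-error criterion, for which the value stored in a leaf is the mean of the training targets that reach it. The query point 2.5 is arbitrary.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumed): noisy sine curve on [0, 5]
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

x_new = np.array([[2.5]])            # a new data point
leaf_id = tree.apply(x_new)[0]       # index of the leaf it lands in
train_leaf_ids = tree.apply(X)       # leaf index of every training sample

# Mean of the training targets that share that leaf
leaf_mean = y[train_leaf_ids == leaf_id].mean()
prediction = tree.predict(x_new)[0]
print(leaf_mean, prediction)
```

The two printed numbers should agree up to floating-point rounding.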

import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

# Create a synthetic dataset
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)  # 80 random points in the range [0, 5]
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])  # Sine function with noise

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Decision Tree Regressor
regressor = DecisionTreeRegressor(max_depth=3)  # Limiting the depth to avoid overfitting

# Fit the model
regressor.fit(X_train, y_train)

# Make predictions
y_pred = regressor.predict(X_test)

# Visualize the results
plt.figure(figsize=(10, 6))
plt.scatter(X_train, y_train, color='blue', label='Training data')
plt.scatter(X_test, y_test, color='red', label='Test data')
plt.scatter(X_test, y_pred, color='green', label='Predictions', marker='x')
plt.title('Decision Tree Regression')
plt.xlabel('Feature (X)')
plt.ylabel('Target (y)')
plt.legend()
plt.show()
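
The regression cell above only plots the predictions. A numeric evaluation could be added along the following lines; this is a sketch that rebuilds the same synthetic dataset so that it runs on its own, and the choice of mean squared error and R² as metrics is an assumption, not something the notebook reports.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Rebuild the synthetic dataset with the same seed and split as above
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

regressor = DecisionTreeRegressor(max_depth=3).fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Lower MSE is better; R^2 is at most 1, and 1 means a perfect fit
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Test MSE: {mse:.4f}")
print(f"Test R^2: {r2:.4f}")
```

Because a depth-3 tree has at most 8 leaves, the test predictions take at most 8 distinct values, which is why the green markers in the plot sit on a few horizontal levels.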

https://colab.research.google.com/drive/1WpDf5vvlmg_lXutKgPNgFx3q5-SsOWQ0#scrollTo=iKsTLoVaduyU&printMode=true 1/3

Iris Dataset

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data  # Features (4 features)
y = iris.target  # Target labels (3 classes)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Decision Tree Classifier
classifier = DecisionTreeClassifier(max_depth=3, random_state=42)

# Fit the model
classifier.fit(X_train, y_train)

# Make predictions
y_pred = classifier.predict(X_test)

# Evaluate the model
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# Visualize the decision boundaries (using only the first two features for 2D visualization)
X_train_2d = X_train[:, :2]  # Use only the first two features
X_test_2d = X_test[:, :2]  # Use only the first two features

# Create a mesh grid for plotting decision boundaries
x_min, x_max = X_train_2d[:, 0].min() - 1, X_train_2d[:, 0].max() + 1
y_min, y_max = X_train_2d[:, 1].min() - 1, X_train_2d[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))

# Train a new classifier on the 2D data for visualization
classifier_2d = DecisionTreeClassifier(max_depth=3, random_state=42)
classifier_2d.fit(X_train_2d, y_train)

# Predict the class for each point in the mesh grid
Z = classifier_2d.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plotting
plt.figure(figsize=(10, 6))
plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.coolwarm)
plt.scatter(X_train_2d[:, 0], X_train_2d[:, 1], c=y_train, edgecolor='k', marker='o', label='Training data')
plt.scatter(X_test_2d[:, 0], X_test_2d[:, 1], c=y_test, edgecolor='k', marker='x', label='Test data')
plt.title('Decision Tree Classifier on Iris Dataset (2D Visualization)')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.legend()
plt.show()

Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

<ipython-input-4-84cca00a73bc>:53: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ign
plt.scatter(X_test_2d[:, 0], X_test_2d[:, 1], c=y_test, edgecolor='k', marker='x', label='Test data')
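
The rules learned by the four-feature classifier can also be inspected as text, which complements the 2D boundary plot. The sketch below retrains the same model so that it runs on its own; printing the rules with `sklearn.tree.export_text` is an addition for illustration and is not part of the original notebook.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Same data, split, and hyperparameters as the classifier above
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
classifier = DecisionTreeClassifier(max_depth=3, random_state=42)
classifier.fit(X_train, y_train)

# Render the fitted tree as indented if/else rules
rules = export_text(classifier, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line is a threshold test on one feature, and each `class:` line is a leaf, so a prediction can be traced by hand from the root to a leaf.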
