
Complex Engineering Problem-ES205-Fa2023


Faculty of Engineering Sciences, GIK Institute

Advanced Linear Algebra (ES205)


Project: Complex Engineering Problem

Section: ES205 Instructor: Dr. Babar Zaman


Semester: Fall 2023 Weightage: 10%
Concerned CLOs:
✓ CLO4 - Analyze and solve applied engineering problems requiring tools from
advanced linear algebra. PLO-4 (Investigation)
✓ CLO5 - Efficiently work in a team to investigate and solve problems related to
applied linear algebra. PLO-9 (Individual and Teamwork)
Title: Application of linear algebra in Machine Learning
1.1 Objective:
The primary objective of this project is to comprehensively explore the application
of linear algebra concepts within machine learning methodologies. Through practical
implementation on a real-world dataset, the project aims to demonstrate the pivotal role of
linear algebra in tasks such as linear regression, dimensionality reduction, matrix
factorization, optimization algorithms, and neural networks. Additionally, the project
seeks to analyze the obtained results, extracting insights to showcase the significance of
linear algebra in enhancing machine learning models and algorithms.
1.2 Background:
The realm of Machine Learning (ML) revolves around the concept of enabling
machines to learn from data and make predictions or decisions without explicit
programming. One of the foundational pillars that underpins numerous ML algorithms and
techniques is linear algebra. Linear algebra serves as the mathematical framework that
provides tools and methods for representing and manipulating data efficiently. In the
context of ML, datasets are often depicted as matrices and vectors, where each data point
corresponds to a vector and the entire dataset is a matrix. This representation allows for
streamlined operations, including matrix multiplication, addition, and manipulation, which
are crucial for data preprocessing, feature extraction, and model optimization. Key concepts
of linear algebra, such as vectors, matrices, eigenvalues, eigenvectors, singular value
decomposition (SVD), and matrix factorization, are instrumental in various ML tasks. For
instance, linear regression, a fundamental technique in ML for modeling relationships
between variables, heavily relies on linear algebraic methods to minimize errors and
optimize parameters.
Principal Component Analysis (PCA) utilizes eigenvalues and eigenvectors to
reduce data dimensionality while preserving essential information, enabling efficient feature
selection and visualization. Moreover, optimization algorithms like gradient descent, used
extensively in training ML models, leverage vector operations and matrix calculus to update
model parameters iteratively for convergence towards an optimal solution. Deep learning
models, particularly neural networks, capitalize on linear algebra for representing layers,
weights, and activations, making computations feasible through matrix operations during
both forward and backward propagation. In essence, the application of linear algebra in ML
lays the groundwork for understanding, developing, and optimizing algorithms capable of
learning from data. By employing linear algebra concepts in tasks such as regression,
dimensionality reduction, matrix factorization, optimization, and neural networks, this
project aims to elucidate the indispensable role of linear algebra in augmenting the
capabilities and performance of machine learning models and methodologies.

2. Milestones and Deadlines:

2.1 Milestone 1: CEP Group and data selection


• Please form a group of 4 students and submit the names and reg. nos. of the
group members, together with your chosen dataset (from Kaggle, the UCI Machine
Learning Repository, GitHub, etc.), by the due date for milestone 1.
• Each group will be assigned a group no., which the instructor and the students will
use in the CEP reports and viva.
• Late submissions will be penalized.
Due Date: 28/11/2023
2.2 Milestone 2: Source-code and CEP report submission

• The final submission must include both the source code and the report. A
printed copy of the report must be handed in to the course teaching assistant
by the deadline. The report and source code must also be submitted electronically
through the MS Teams platform as one zip file.
• Late submissions will be penalized. In the same way, there will be
consequences for not following the guidelines in this handout.
• The source code needs to be clearly formatted, easily executable, and include
comments that explain what the different parts of the code do. Use
Jupyter notebooks or the MATLAB Live Editor, if possible, as they display
significant intermediate outcomes and graphics directly within the code.
• The report should follow a standard structure and be no more than five pages in
double column format; the IEEE Word template for conferences is a good
choice for this.
• The report should include the following: Title; Authors' names, reg. nos., and
group no.; Abstract; Introduction; separate sections detailing each CEP task's
approach, implementation specifics, and outcomes; Task distribution; Conclusion;
and References (this section should be clearly described because this
assignment has a separate CLO for it). Anything not listed here may be added
to the appendices.
• Please adhere to standard report writing practices, which include avoiding
formatting and language errors, using a suitable font type and size for
headings and body paragraphs, referencing figures and tables appropriately, and
avoiding pointless, repetitive, or cursory discussions. You may consult a few
pertinent sources on report writing for guidance.
Due Date: 11/12/2023
2.3 Milestone 3: CEP Demo and Viva

• Vivas will be arranged afterwards where each group will get a time slot to present
their project and its understanding. Each team-member will be asked questions
related to the project and may be given a small task regarding any aspect of the
project, and marks will be awarded individually.
• Failure to register for a viva on one of the available dates, or a no-show for the
viva, will result in a penalty.

2.4 Tips (which will help you in your project):


I. Understand the Problem and Define Objectives Clearly:

• Grasp the problem domain thoroughly and define clear objectives.
• Identify what needs to be achieved through the machine learning model.
II. Quality Data Collection and Preprocessing:

• Acquire high-quality data relevant to the problem statement.
• Perform thorough data cleaning, handle missing values, outliers, and normalize or scale features
appropriately.
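
As an illustration of the cleaning and scaling step, the following is a minimal sketch in Python with NumPy (the choice of language and library is an assumption; MATLAB works equally well). The function names `impute_column_means` and `standardize` and the small arrays are hypothetical and only serve as an example. Mean imputation is just one simple way to handle missing values and may not suit every dataset.

```python
import numpy as np

def impute_column_means(X):
    """Replace NaN entries with the mean of their column (one simple strategy)."""
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X_filled = X.copy()
    X_filled[rows, cols] = col_means[cols]
    return X_filled

def standardize(X, eps=1e-12):
    """Scale each column to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Guard against constant columns, whose standard deviation is zero.
    std = np.where(std < eps, 1.0, std)
    return (X - mean) / std, mean, std

# Hypothetical data: two features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 180.0],
              [3.0, 220.0],
              [4.0, 210.0],
              [5.0, 190.0]])
X_scaled, mean, std = standardize(X)

# Hypothetical data with one missing value in the second column.
X_missing = np.array([[1.0, 100.0],
                      [2.0, np.nan],
                      [3.0, 300.0]])
X_filled = impute_column_means(X_missing)
```

When a separate test set is used, the mean and standard deviation should be computed on the training data only and then reused to scale the validation and test data.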
III. Feature Selection and Engineering:

• Select relevant features that contribute most to the model's performance.
• Engineer new features that might enhance the model's predictive power.
IV. Choose the Right Model and Evaluation Metrics:

• Select the appropriate machine learning algorithm based on the nature of the problem
(classification, regression, clustering, etc.).
• Use suitable evaluation metrics (accuracy, precision, recall, F1-score, RMSE, etc.) based on the
problem type to measure model performance.
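
The metrics named above can be computed directly from arrays of labels and predictions. The sketch below assumes Python with NumPy and binary labels encoded as 0 and 1; the function names and the small label lists are hypothetical examples, not part of the handout.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    accuracy = np.mean(y_true == y_pred)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return accuracy, precision, recall, f1

def rmse(y_true, y_pred):
    """Root mean squared error for regression targets."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean(diff ** 2))

# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
error = rmse([3.0, 5.0], [1.0, 5.0])
```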
V. Train-Validation-Test Split and Cross-Validation:

• Split the dataset into training, validation, and test sets.
• Utilize cross-validation techniques for model validation to ensure robustness.
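
One possible way to produce such splits is to shuffle the row indices once and slice them. The sketch below assumes Python with NumPy; the function names, the 60/20/20 proportions, and the fixed seed are illustrative assumptions rather than requirements of the project.

```python
import numpy as np

def train_val_test_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Return shuffled index arrays for the training, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(test_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return train_idx, val_idx, test_idx

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

train_idx, val_idx, test_idx = train_val_test_split(100)
```

The index arrays can then be used to select rows, for example `X[train_idx]` and `y[train_idx]`, so that the feature matrix and the target vector stay aligned.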
VI. Hyperparameter Tuning and Model Optimization:

• Optimize model performance through hyperparameter tuning.

• Experiment with different parameters and algorithms to achieve the best results.
VII. Interpret and Communicate Results:

• Interpret model predictions and understand how the model arrives at conclusions.
• Communicate results effectively to stakeholders using visualizations, reports, or presentations.
VIII. Iterate and Improve:

• Continuously iterate and refine the model based on feedback and new data.
• Monitor performance and adapt the model to changing circumstances or new insights.
IX. Document the Process:

• Keep detailed documentation of the entire process, including data preprocessing steps, model
selection, hyperparameters, and results obtained.
• This documentation aids in reproducibility and future reference.
X. Ethical Considerations and Bias Awareness:

• Be mindful of ethical implications related to data collection, model bias, fairness, and privacy.
• Regularly check for biases in the data and model predictions to ensure fairness.
By adhering to these tips, a machine learning project can be approached systematically, ensuring that each
step contributes effectively to the development of a robust and reliable model, while also addressing
ethical considerations and continuous improvement.

Guidelines for Project Formatting: The following pages offer a comprehensive outline for
structuring your project, providing a clear understanding of its content and organization.

Important Message: If you run across any problems while working on this project, please
get in touch with me. I wish you luck.

Project Outline
Title:

Application of linear algebra in Machine Learning


Abstract:
1. Introduction
✓ Overview of Machine Learning and its significance in real-world Applications
✓ Importance of Linear Algebra in Machine Learning
✓ Objectives of the Project
2. Literature Review
Review of key linear algebra concepts relevant to Machine Learning:
✓ Vectors, Matrices, and Operations
✓ Eigenvalues and Eigenvectors
✓ Singular Value Decomposition (SVD)
✓ Matrix Factorization Techniques
✓ Gradient Descent and Optimization Methods
✓ Application of Linear Algebra in specific ML algorithms (Linear Regression, PCA,
Neural Networks)
3. Dataset Selection
✓ Selection of a real-world dataset (e.g., from Kaggle, UCI Machine Learning
Repository, etc.)
✓ Brief description of the chosen dataset and its attributes
4. Data Preprocessing
✓ Data exploration and understanding
✓ Data cleaning (handling missing values, outlier detection, etc.)
✓ Representing the dataset using vectors and matrices
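
To make the last point concrete, here is a small sketch, assuming Python with NumPy, of how a table of records becomes a feature matrix and a target vector. The records (area, number of rooms, price) and the weight vector are made up purely for illustration.

```python
import numpy as np

# Hypothetical records: (area in square metres, number of rooms, price).
records = [
    (50.0, 2, 150.0),
    (80.0, 3, 240.0),
    (120.0, 4, 330.0),
]
data = np.array(records, dtype=float)

X = data[:, :2]   # feature matrix: one row per data point, one column per feature
y = data[:, 2]    # target vector: one entry per data point
x0 = X[0]         # a single data point is a vector of length 2

# Prepending a column of ones lets an intercept be handled as an ordinary weight.
X_design = np.column_stack([np.ones(len(X)), X])

# With this layout, predictions for all data points are one matrix-vector product.
w = np.array([0.0, 3.0, 0.0])   # illustrative weights: intercept, area, rooms
y_hat = X_design @ w
```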

5. Tasks and Implementation (apply at least two methods)

I. Linear Regression:
✓ Implementing linear regression to model relationships between features and target
variables
✓ Evaluating the model's performance (e.g., using Mean Squared Error)
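
A possible starting point for this task, assuming Python with NumPy, is an ordinary least-squares fit followed by a Mean Squared Error computation. The function names and the synthetic data are hypothetical; a real submission would use the selected dataset instead.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares weights (intercept first) for the model y ~ [1, X] w."""
    X_design = np.column_stack([np.ones(X.shape[0]), X])
    # lstsq minimizes ||X_design w - y||^2 and is more stable than inverting X^T X.
    w, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return w

def predict(X, w):
    return np.column_stack([np.ones(X.shape[0]), X]) @ w

def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

# Synthetic data generated from known weights plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.0, 2.0, -3.0])   # intercept, then one weight per feature
y = true_w[0] + X @ true_w[1:] + 0.1 * rng.normal(size=200)

w = fit_linear_regression(X, y)
mse = mean_squared_error(y, predict(X, w))
```

For a fair performance estimate, the MSE should also be reported on data that was not used for fitting.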
II. Principal Component Analysis (PCA):
✓ Applying PCA for dimensionality reduction
✓ Visualizing data after dimensionality reduction
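
PCA can be implemented from the eigendecomposition of the covariance matrix. The following is a minimal sketch under the assumption of Python with NumPy; the function name `pca` and the synthetic data are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto the top principal components of its covariance matrix."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]     # columns are principal directions
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return X_centered @ components, components, explained_ratio

# Synthetic 3-D data that lies close to a single line.
rng = np.random.default_rng(0)
t = rng.normal(size=300)
X = np.column_stack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(300, 3))

Z, components, ratio = pca(X, n_components=2)
```

The two columns of `Z` can be passed to any plotting tool as the x and y coordinates of a scatter plot, and `ratio` indicates how much of the total variance each retained component explains.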
III. Singular Value Decomposition (SVD):
✓ Utilizing SVD for matrix factorization or compression tasks
✓ Analyzing the resulting components or reconstructed data
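
For the compression use case, a truncated SVD gives a low-rank approximation of a matrix. The sketch below assumes Python with NumPy; the function name and the synthetic matrix (rank 2 plus small noise) are hypothetical stand-ins for real data such as an image or a ratings matrix.

```python
import numpy as np

def low_rank_approximation(A, k):
    """Best rank-k approximation of A in the Frobenius norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return A_k, s

# Synthetic 20 x 15 matrix that is approximately rank 2.
rng = np.random.default_rng(0)
A = (rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
     + 0.01 * rng.normal(size=(20, 15)))

A_k, s = low_rank_approximation(A, k=2)
rel_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)
```

Storing the rank-k factors takes k(m + n + 1) numbers instead of m·n for the full matrix, and the Frobenius-norm reconstruction error equals the square root of the sum of the squared discarded singular values, which is a useful check when analyzing the results.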
IV. Optimization Algorithms:
✓ Implementing gradient descent or its variants for model optimization
✓ Demonstrating the impact of different learning rates on convergence
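
The effect of the learning rate can be demonstrated with batch gradient descent on a least-squares loss. This is a minimal sketch assuming Python with NumPy; the function name, the synthetic data, and the three learning rates are illustrative choices, not prescribed values.

```python
import numpy as np

def gradient_descent(X, y, lr, n_iters=100):
    """Batch gradient descent on the mean squared error of a linear model."""
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    losses = []
    for _ in range(n_iters):
        residual = X @ w - y
        losses.append(np.mean(residual ** 2))
        grad = (2.0 / n) * X.T @ residual   # gradient of the MSE with respect to w
        w = w - lr * grad
    return w, np.array(losses)

# Synthetic, noise-free data so that the minimum loss is zero.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([0.5, -1.0, 2.0])

# Too small, reasonable, and too large learning rates (illustrative values).
results = {lr: gradient_descent(X, y, lr) for lr in (0.001, 0.1, 1.5)}
```

Plotting the three loss curves in `results` on a logarithmic axis should show slow progress for the smallest rate, fast convergence for the middle one, and divergence for the largest. For this loss, divergence occurs once the learning rate exceeds 2 divided by the largest eigenvalue of (2/n)·XᵀX.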
V. Neural Networks:
✓ Building a simple neural network using matrix operations (forward and backward
propagation)
✓ Training the network on the selected dataset and evaluating its performance
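
As one possible shape for this task, the sketch below builds a network with a single hidden layer using only matrix operations, assuming Python with NumPy. The architecture (tanh hidden layer, sigmoid output, binary cross-entropy loss), the function names, the hyperparameters, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "W1": 0.5 * rng.normal(size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": 0.5 * rng.normal(size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Forward propagation: two affine maps with nonlinearities."""
    A1 = np.tanh(X @ params["W1"] + params["b1"])   # hidden activations, (n, hidden)
    A2 = sigmoid(A1 @ params["W2"] + params["b2"])  # output probabilities, (n, 1)
    return A1, A2

def bce_loss(A2, y):
    p = np.clip(A2[:, 0], 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def backward(params, X, y, A1, A2):
    """Backward propagation: the chain rule written as matrix products."""
    n = X.shape[0]
    dZ2 = (A2 - y[:, None]) / n           # gradient at the output pre-activation
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dZ1 = (dZ2 @ params["W2"].T) * (1.0 - A1 ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}

def train(X, y, n_hidden=4, lr=0.1, n_iters=1000, seed=0):
    params = init_params(X.shape[1], n_hidden, seed)
    losses = []
    for _ in range(n_iters):
        A1, A2 = forward(params, X)
        losses.append(bce_loss(A2, y))
        grads = backward(params, X, y, A1, A2)
        for name in params:
            params[name] -= lr * grads[name]   # plain gradient descent update
    return params, losses

# Synthetic binary classification data: two well-separated clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

params, losses = train(X, y)
_, probs = forward(params, X)
accuracy = np.mean((probs[:, 0] > 0.5) == y)
```

A useful sanity check when writing backpropagation by hand is to compare a few entries of the analytic gradient against a finite-difference estimate of the loss. Performance on the real dataset should be reported on held-out data rather than on the training set used here.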
6. Results and Analysis
✓ Presenting the results obtained from each task.
✓ Analyzing and discussing the insights gained from applying linear algebra in these
tasks.
7. Conclusion
✓ Summary of key findings and learnings
✓ Reflection on the significance of linear algebra in Machine Learning
✓ Future prospects and potential extensions of the project
8. References
✓ List of resources, papers, and tools used during the project.
