1. Every project, regardless of its size, starts with business understanding, which lays the foundation for successful resolution of the business problem.
2. If the problem is to determine probabilities of an action, then a predictive model might be used.
4. If the problem requires a yes/no answer, then a classification approach to predicting a response would be suitable.
5. Techniques such as descriptive statistics and visualization can be applied to the data set to assess its content, quality, and initial insights about the data.
A capstone project is a project where students must research a topic independently to develop a deep understanding of the subject matter. It gives students an opportunity to integrate all their knowledge and demonstrate it through a comprehensive project.
The premise that underlies all Machine Learning disciplines is that there needs to be a pattern. If there is no pattern, then the problem cannot be solved with AI technology. It is fundamental that this question is asked before deciding to embark on an AI development journey.
4. List the different problem categories that come under predictive analysis. Write one example for each.
5. What is design thinking? Draw the diagram and briefly explain each stage of design thinking.
1. Empathize
2. Define
3. Ideate
4. Prototype
5. Test
6. What is problem decomposition? Write down the steps involved in problem decomposition.
Problem decomposition is the process of breaking down a problem into smaller units before coding.
• 1. Understand the problem, then restate it in your own words.
• 2. Break the problem down into a few large pieces.
• 3. Break complicated pieces down into smaller pieces.
• 4. Code one small piece at a time: think about how to implement it, write the code/query, test it on its own, and fix any problems.
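The steps above can be sketched on a toy problem (chosen here only for illustration): "report the average word length in a text," decomposed into small pieces that can each be coded and tested on their own.

```python
def split_words(text):
    """Piece 1: break the text into words."""
    return text.split()

def word_lengths(words):
    """Piece 2: measure each word."""
    return [len(w) for w in words]

def average(numbers):
    """Piece 3: combine the measurements."""
    return sum(numbers) / len(numbers)

def average_word_length(text):
    """Assemble the tested pieces into the full solution."""
    return average(word_lengths(split_words(text)))

print(average_word_length("break problems into small pieces"))  # 5.6
```

Each piece is simple enough to verify in isolation before the full solution is assembled, which is the point of decomposition.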
• The train-test split is a technique for evaluating the performance of a machine learning algorithm.
• It can be used for classification or regression problems and with any supervised learning algorithm.
• The procedure involves taking a dataset and dividing it into two subsets.
• The first subset is used to fit the model and is referred to as the training dataset.
• The second subset is not used to train the model; it is used to evaluate the fitted machine learning model. It is referred to as the testing dataset.
8.
The split is most commonly expressed as a proportion between 0 and 1 for either the train or test dataset.
For example, a training set size of 0.67 (67 percent) means that the remaining 0.33 (33 percent) is assigned to the test set.
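The split described above can be sketched in plain Python (a minimal illustration; in practice a library routine such as scikit-learn's train_test_split is typically used):

```python
import random

def train_test_split(data, train_size=0.67, seed=42):
    """Shuffle the data and divide it into train and test subsets."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = data[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_size)
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train, test = train_test_split(data, train_size=0.67)
print(len(train), len(test))  # 67 33
```

Shuffling before cutting matters: if the data is ordered (e.g. by class), an unshuffled split would give the model an unrepresentative training set.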
• On small datasets, the extra computational burden of running cross-validation isn't a big deal, so if your dataset is smaller, you should run cross-validation (for example, K=10 for 10-fold cross-validation). It is more reliable, though it takes longer to run.
• If your dataset is larger, you can use the train-test split method.
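The K-fold idea referenced above can be sketched by generating the fold indices directly (a minimal illustration; libraries such as scikit-learn provide a KFold utility for this):

```python
def k_fold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each of the k folds serves as the test set exactly once, and the
    remaining k-1 folds form the training set for that round.
    """
    fold_size = n // k
    indices = list(range(n))
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n  # last fold takes the remainder
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx

folds = list(k_fold_indices(20, k=10))
print(len(folds))    # 10 train/test pairs
print(folds[0][1])   # first test fold: [0, 1]
```

Averaging a model's score over all k rounds is what makes the estimate more reliable than a single train-test split.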
Hyperparameters are parameters whose values govern the learning process; they influence the values of the model parameters that a learning algorithm learns. Eg: the ratio of the train-test split, the number of hidden layers in a neural network, the number of clusters in a clustering task.
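The distinction can be made concrete with a tiny gradient-descent example (a sketch; the data and values are invented for illustration): the learning rate and epoch count are set by hand before training, while the weight w is learned from the data.

```python
# Hyperparameters: chosen before training, they govern the learning process.
learning_rate = 0.1   # step size of each gradient-descent update
epochs = 50           # number of passes over the training data

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # underlying relationship: y = 2x

# Model parameter: learned by the algorithm, not set by hand.
w = 0.0
for _ in range(epochs):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad

print(round(w, 3))  # converges close to 2.0
```

Changing the hyperparameters (say, a much larger learning rate) changes how, and whether, w converges, which is exactly the sense in which hyperparameters govern the learned parameters.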
12. What is a loss function? What are the different categories of loss functions?
A loss function is a measure of how well a prediction model does in terms of being able to predict the expected outcome. Loss functions can be broadly categorized into two types: classification loss and regression loss. Regression loss functions are used when the model predicts a quantity, and classification loss functions when it predicts a label.
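One common loss from each category can be computed directly (a minimal sketch: mean squared error for regression, binary cross-entropy for classification; the sample values are invented):

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: a typical regression loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_prob):
    """Binary cross-entropy: a typical classification loss on predicted probabilities."""
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(y_true, y_prob)) / len(y_true)

print(mse([3.0, 5.0], [2.5, 5.5]))                              # 0.25
print(round(binary_cross_entropy([1, 0], [0.9, 0.1]), 4))       # low loss: confident, correct
```

In both cases a lower value means better predictions, which is what lets training minimize the loss.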
14. Draw the diagram of the Analytic Approach and explain each stage / Explain the foundational methodology of data science.
1. Business understanding
2. Analytic approach
3. Data requirements
4. Data collection
• Where is the data coming from (identify all sources), and how will you get it?
• The Data Scientist identifies and collects data resources (structured, unstructured, and semi-structured) that are relevant to the problem area.
• If the data scientist finds gaps in the data collection, he may need to review the data requirements and collect more data.
5. Data understanding
• Is the data that you collected representative of the problem to be solved?
• Descriptive statistics and visualization techniques can help a data scientist understand the content of the data, assess its quality, and obtain initial information about the data.
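As a minimal sketch of the descriptive-statistics idea above (the column name and values are hypothetical), Python's standard library is enough for a first quality check:

```python
import statistics

# Hypothetical column of collected values; the goal is a quick quality check.
ages = [23, 25, 25, 29, 31, 35, 35, 35, 41, 120]  # 120 looks like a data-entry error

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": round(statistics.stdev(ages), 1),
}
print(summary)
```

A mean well above the median, as here, is the kind of initial insight that flags outliers worth investigating before modeling.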
6. Data preparation
• What additional work is required to manipulate and work with the data?
• The data preparation step includes all the activities used to create the data set used during the modeling phase.
• This includes cleansing data, combining data from multiple sources, and transforming data into more useful variables.
• In addition, feature engineering and text analysis can be used to derive new structured variables that enrich the set of predictors and improve model accuracy.
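A minimal Python sketch of the cleansing and feature-engineering activities above, assuming a hypothetical record layout (the field names "income" and "spend" are invented for illustration):

```python
# Raw collected records, including a row with a missing value.
raw = [
    {"income": 50000, "spend": 20000},
    {"income": None,  "spend": 15000},   # missing income: dropped during cleansing
    {"income": 80000, "spend": 30000},
]

# 1. Cleansing: drop rows with a missing income value.
clean = [row for row in raw if row["income"] is not None]

# 2. Feature engineering: derive a spend-to-income ratio as a new predictor.
for row in clean:
    row["spend_ratio"] = row["spend"] / row["income"]

print([row["spend_ratio"] for row in clean])  # [0.4, 0.375]
```

The derived ratio is an example of transforming raw fields into a more useful variable for the modeling phase.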
7. Model training
• In what way can the data be visualized to get the answer that is required?
• From the first version of the prepared data set, data scientists use a training data set (historical data in which the desired result is known) to develop predictive or descriptive models.
• The modeling process is very iterative.
8. Model evaluation
• Does the model used really answer the initial question, or does it need to be adjusted?
• The Data Scientist evaluates the quality of the model and verifies that the business problem is handled in a complete and adequate manner.
9. Deployment
10. Feedback