Performance Measurement for Machine Learning
2. Confusion Matrix (1)
• A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known.
• It allows the visualization of the performance of an algorithm.
3. Confusion Matrix (2)
• It allows easy identification of confusion between classes, e.g. one class is commonly mislabeled as the other.
• Most performance measures are computed from the confusion matrix.
4. Confusion Matrix (3)
• A confusion matrix is a summary of prediction results on a classification problem.
• The numbers of correct and incorrect predictions are summarized with count values and broken down by each class. This is the key to the confusion matrix.
5. Confusion Matrix (4)
• The confusion matrix shows the ways in which your classification model is confused when it makes predictions.
• It gives us insight not only into the errors being made by a classifier but, more importantly, into the types of errors that are being made.
7. Confusion Matrix (6)
• Here, Class 1 : Positive, Class 2 : Negative.
Definition of the Terms:
• Positive (P) : Observation is positive (for example: is an apple).
• Negative (N) : Observation is not positive (for example: is not an apple).
8. Confusion Matrix (7)
• True Positive (TP) : Observation is positive, and is predicted to be positive.
• False Negative (FN) : Observation is positive, but is predicted to be negative.
• True Negative (TN) : Observation is negative, and is predicted to be negative.
• False Positive (FP) : Observation is negative, but is predicted to be positive.
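As a minimal sketch of how these four counts are tallied in practice (not from the slides; labels are assumed to be encoded as 1 for positive and 0 for negative, and the example data is made up):

```python
# Count TP, FN, TN, FP from paired true labels and model predictions.
def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # positive, predicted positive
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # positive, predicted negative
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # negative, predicted negative
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # negative, predicted positive
    return tp, fn, tn, fp

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical ground truth
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical predictions
print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
```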
12. Sensitivity and Specificity
• Sensitivity and specificity values can be used to quantify the performance of a case definition or the results of a diagnostic test.
• Even with a highly specific diagnostic test, if a disease is uncommon among those people tested, a large proportion of positive test results will be false positives, and the positive predictive value will be low.
13. Sensitivity and Specificity
• If the test is applied more selectively, such that the proportion of people tested who truly have the disease is greater, the test's predictive value will be improved.
• Thus, sensitivity and specificity are characteristics of the test, whereas predictive values depend both on test sensitivity and specificity and on the disease prevalence in the population in which the test is applied.
14. Sensitivity and Specificity
• Sensitivity/Recall
• Sensitivity (Se) is defined as the proportion of individuals who truly have the condition that receive a positive test result: Se = TP / (TP + FN).
15. Sensitivity and Specificity
• Specificity
• Specificity (Sp) is defined as the proportion of individuals who truly do not have the condition that receive a negative test result: Sp = TN / (TN + FP).
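To make the distinction between test characteristics and predictive values concrete, here is a short illustration (hypothetical counts, not from the slides) computing sensitivity, specificity, and the positive predictive value in a low-prevalence screening scenario:

```python
# Hypothetical screening results: 100 diseased and 1000 healthy people tested.
tp, fn, tn, fp = 90, 10, 950, 50

sensitivity = tp / (tp + fn)   # 0.90: the test finds 90% of diseased cases
specificity = tn / (tn + fp)   # 0.95: the test clears 95% of healthy cases
ppv = tp / (tp + fp)           # positive predictive value ~ 0.64
print(sensitivity, specificity, round(ppv, 2))
```

Even with 95% specificity, roughly a third of the positive results here are false positives, because only about 9% of those tested actually have the disease.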
16. Precision
• To get the value of precision we divide the total number of correctly classified positive examples by the total number of predicted positive examples.
• High precision indicates that an example labeled as positive is indeed positive (a small number of FP).
17. Precision
• Precision is the fraction of true positive examples among the examples that the model classified as positive; in other words, the number of true positives divided by the number of true positives plus false positives.
• Recall, also known as sensitivity, is the fraction of the actual positive examples that the model classified as positive; in other words, the number of true positives divided by the number of true positives plus false negatives.
• TP: the number of true positives classified by the model.
• FN: the number of false negatives classified by the model.
• FP: the number of false positives classified by the model.
18. F1 Score
• The F-score, also called the F1-score, is a measure of a model’s accuracy on a dataset. It is used to evaluate binary classification systems, which classify examples into ‘positive’ or ‘negative’.
• The F-score is a way of combining the precision and recall of the model, and it is defined as the harmonic mean of the model’s precision and recall: F1 = 2 · (precision · recall) / (precision + recall).
19. Calculating F-score
• Let us imagine we have a tree with ten apples on it. Seven are ripe and three are still unripe, but we do not know which is which. We have an AI which is trained to recognize which apples are ripe for picking, and to pick all the ripe apples and no unripe apples. We would like to calculate the F-score, and we consider both precision and recall to be equally important, so we use the F1-score.
20. The AI picks five ripe apples but also picks one unripe apple.
24. Precision and Recall for model 1
• With TP = 5, FP = 1, and FN = 2, Precision = 5/6 ≈ 0.83
• Recall = 5/7 ≈ 0.71
• F1 Score ≈ 0.77
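A quick check of these numbers in Python (a sketch, with the counts taken from the apple example above):

```python
# Apple example: 7 ripe apples in total; the AI picks 5 ripe (TP = 5)
# and 1 unripe (FP = 1), and misses 2 ripe apples (FN = 2).
tp, fp, fn = 5, 1, 2

precision = tp / (tp + fp)                          # 5/6 ~ 0.83
recall = tp / (tp + fn)                             # 5/7 ~ 0.71
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.77
print(round(precision, 2), round(recall, 2), round(f1, 2))
```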
25. Conclusion
• High recall, low precision: this means that most of the positive examples are correctly recognized (low FN), but there are a lot of false positives.
• Low recall, high precision: this shows that we miss a lot of positive examples (high FN), but those we predict as positive are indeed positive (low FP).
26. F-score vs Accuracy
• There are a number of metrics which can be used to evaluate a binary classification model, and accuracy is one of the simplest to understand. Accuracy is defined as simply the number of correctly categorized examples divided by the total number of examples. Accuracy can be useful but does not take into account the subtleties of class imbalances, or differing costs of false negatives and false positives.
• The F1-score is useful where there are either differing costs of false positives or false negatives, or where there is a large class imbalance, such as if 10% of apples on trees tend to be unripe. In this case accuracy would be misleading, since a classifier that classifies all apples as ripe would automatically get 90% accuracy but would be useless for real-life applications.
• Accuracy has the advantage that it is very easily interpretable, but the disadvantage that it is not robust when the data is unevenly distributed, or where there is a higher cost associated with a particular type of error.
27. Mean Absolute Error or MAE
• We know that an error basically is the absolute difference between the actual or true values and the values that are predicted. Absolute difference means that if the result has a negative sign, it is ignored.
• Hence, the error for one sample is |true value − predicted value|.
• MAE takes the average of this error over every sample in a dataset: MAE = (1/n) Σ |yᵢ − ŷᵢ|.
28. Mean Squared Error or MSE
• MSE is calculated by taking the average of the square of the difference between the original and predicted values of the data.
• Hence, MSE = (1/n) Σ (yᵢ − ŷᵢ)².
31. Where to use which Metric to determine the Performance of a Machine Learning Model?
• MAE: It is not very sensitive to outliers in comparison to MSE, since it doesn't punish huge errors. It is usually used when performance is measured on continuous variable data. It gives a linear value, which weights all individual differences equally. The lower the value, the better the model's performance.
• MSE: It is one of the most commonly used metrics, but it is least useful when a single bad prediction would ruin the entire metric, i.e. when the dataset contains a lot of noise, because squaring gives large errors a disproportionate weight. That same property makes it most useful when large errors are particularly undesirable and should be penalized heavily.
• RMSE: In RMSE, the errors are squared before they are averaged, and the square root is then taken of the result. This means RMSE assigns a higher weight to larger errors, which makes it much more useful when large errors are present and drastically affect the model's performance. It avoids taking the absolute value of the error, a trait that is useful in many mathematical calculations. In this metric too, the lower the value, the better the performance of the model.
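As a small illustration (hypothetical values, not from the slides), the three metrics can be computed side by side in Python:

```python
import math

# Regression error metrics on a tiny made-up set of true vs. predicted values.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
n = len(y_true)

mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n    # 0.75
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n  # 0.875
rmse = math.sqrt(mse)                                        # ~ 0.94
print(mae, mse, round(rmse, 2))
```

Note how the single largest error (1.5) contributes 0.375 of the 0.75 MAE but 2.25 of the 3.5 summed squared error, illustrating why MSE and RMSE weight large errors more heavily.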
33. Cross Validation (1)
• A basic idea in machine learning is to not use the entire data set when training a learner.
• Some of the data is removed before training begins.
• Then, when training is done, the data that was removed can be used to test the performance of the learned model on "new" data.
• This is the basic idea for a whole class of model evaluation methods called cross validation.
34. Cross Validation (2)
• A method of estimating expected prediction error.
• Helps select the best-fit model.
• Helps ensure the model is not overfit.
35. Cross Validation (3)
1) Holdout method
2) K-fold CV
3) Leave-one-out CV
4) Bootstrap methods
36. Holdout method
• The holdout cross validation method is the simplest of all.
• In this method, you randomly assign data points to two sets, a training set and a test set. The relative sizes of the sets are arbitrary, though the test set is typically smaller than the training set.
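A minimal sketch of a holdout split in Python (the 20% test fraction and the index-based representation are illustrative choices, not from the slides):

```python
import random

# Holdout sketch: shuffle sample indices, reserve a fraction for testing,
# and train on the rest.
def holdout_split(n_samples, test_fraction=0.2, seed=0):
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train indices, test indices)

train_idx, test_idx = holdout_split(10)
print(train_idx, test_idx)
```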
37. K-FOLD
• K-fold cross validation is one way to improve over the holdout method. The data set is divided into k subsets and the holdout method is repeated k times.
• Each time, one of the k subsets is used as the test set and the other k-1 subsets are put together to form a training set.
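The index bookkeeping behind k-fold can be sketched in a few lines of Python (a toy generator over sample indices, not from the slides; in practice scikit-learn's KFold does the same job):

```python
# K-fold sketch: split indices into k folds; each fold serves once as the
# test set while the remaining k-1 folds form the training set.
def k_fold_indices(n_samples, k):
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

for train, test in k_fold_indices(6, 3):
    print("train:", train, "test:", test)
```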
39. Leave one out CV (1)
• Leave-one-out cross validation is K-fold cross validation taken to its logical extreme, with K equal to N, the number of data points in the set.
• That means that, N separate times, the function approximator is trained on all the data except for one point, and a prediction is made for that point.
• As before, the average error is computed and used to evaluate the model.
40. Leave one out CV (2)
• A special case of K-fold validation with K = N.
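A standalone sketch of the leave-one-out loop (illustrative, not from the slides):

```python
# Leave-one-out sketch: K-fold with K = N, so each point is held out once.
def leave_one_out(n_samples):
    for i in range(n_samples):
        train = [j for j in range(n_samples) if j != i]
        yield train, i  # training indices and the single held-out index

for train, test_point in leave_one_out(4):
    print("train:", train, "held-out point:", test_point)
# In practice, fit the model on the training indices, predict the held-out
# point, and average the prediction errors over all N rounds.
```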
42. Bootstrap (1)
• Randomly draw datasets from the training sample, sampling with replacement.
• Each bootstrap sample is the same size as the training sample.
• Refit the model on each of the bootstrap samples.
• Examine the fitted models.
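A minimal sketch of drawing bootstrap samples in Python (index-based and illustrative, not from the slides):

```python
import random

# Bootstrap sketch: draw B resamples of the training indices, each the same
# size as the original sample and drawn with replacement, then refit on each.
def bootstrap_samples(n_samples, n_boot, seed=0):
    rng = random.Random(seed)
    for _ in range(n_boot):
        yield [rng.randrange(n_samples) for _ in range(n_samples)]

for sample in bootstrap_samples(6, n_boot=3):
    print(sample)  # indices can repeat; those never drawn are "out-of-bag"
```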