Data Science Syllabus
INTRODUCTION - 24 HOURS
BATCH LAUNCH
Intro to Program | Curriculum Overview | Learning Methodology | Guest Lecture
ALL ABOUT DATA
Data | Variables | Data Types | Measures of Central Tendency in Data | Understanding Skewness in Data | Measures of Dispersion |
Data Distribution
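As an illustration of the topics in this module, the following is a minimal Python sketch of the measures of central tendency, dispersion, and skewness. The sample values are hypothetical and not part of the course material. Skewness is computed here as the population moment coefficient, which is one of several common definitions.

```python
import statistics

# Hypothetical sample (an assumption for illustration): one large value
# pulls the mean above the median, giving a right-skewed distribution.
values = [21, 23, 23, 25, 26, 28, 30, 95]

# Measures of central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Measures of dispersion (population versions)
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)
value_range = max(values) - min(values)

# Skewness as the third standardized moment (population form)
n = len(values)
skewness = sum((x - mean) ** 3 for x in values) / (n * std_dev ** 3)

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "std dev:", std_dev, "range:", value_range)
print("skewness:", skewness)
```

A positive skewness together with a mean larger than the median indicates a long right tail in the data.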
ANOVA/REGRESSION ANALYSIS
Analysis of Variance and Covariance | One-way Analysis of Variance | Assumptions of ANOVA |
Statistics Associated with One-way Analysis of Variance | Interpreting the ANOVA Results | Two-way Analysis of Variance |
Interpreting the ANOVA Results | Analysis of Covariance | Examine Regression Results | What is Regression Analysis |
Linear and Logistic Regression | Statistics Associated with Regression
PREDICTIVE MODELLING
Decision Trees and Neural Networks | Introduction to Predictive Modelling with Decision Trees | Assumptions | Formulate the Model |
Estimate the Parameters | Check the Prediction Accuracy
TREE AND BAYESIAN NETWORK MODELS
Decision Trees, Bagging | Random Forests, Boosted Trees | Bayesian Classification Models
R - 66 HOURS
R BASICS
R Base Software | Understanding CRAN | RStudio, the IDE | Basic Building Blocks in R | Sequence of Numbers in R |
Understanding Vectors in R | Basic Operations, Operators and Types
R FUNCTIONS
Handling Missing Values in R | Subsetting Vectors in R | Matrices and Data Frames in R | Logical Statements in R |
lapply, sapply, vapply and tapply Functions
LINEAR REGRESSION THEORY - R
Covariance and Correlation | Multivariate Analysis | Assumptions of Linearity | Hypothesis Testing | Limitations of Regression
BUSINESS CASE: MANAGING CREDIT RISK
Business Case: Managing Credit Risk | Meaning of Credit Risk | Impact of Credit Default | Sources of Data for Managing Risk |
Understanding Loss Given Default | Understanding Default
LOSS GIVEN DEFAULT LINEAR REGRESSION R
Loss Given Default Linear Regression R | Extract Data in R | Univariate Analysis of Data | Apply Data Transformations |
Bivariate Analysis of Data | Identify Multicollinearity in Data | Treatment on Data | Identify Heteroscedasticity |
Discuss What Could be the Reason for Heteroscedasticity | Modelling of Data | Variable Significance Identification |
Model Significance Test | Predict using Testing Data Set | Validate the Model Performance
LOGISTIC REGRESSION THEORY - R
Reason for Logistic Regression | The Logistic Transform | Logistic Regression Modelling | Model Optimisation |
Understanding ROC Curve
DECISION TREES
Introduction to Decision Trees | Theory of Entropy & Information Gain | Stopping Rules | Overfitting Problem |
Cross-Validation for Overfitting Problem | Pruning as a Solution for Overfitting | Ensemble Learning Notion |
Concept of Bootstrap Aggregation | Concept of Random Forest
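To illustrate the entropy and information gain topics listed for this module, the following is a minimal Python sketch. The function names and the toy "yes"/"no" labels are hypothetical and chosen only for illustration. It uses the standard Shannon entropy in bits and the usual definition of information gain as parent entropy minus the size-weighted entropy of the child nodes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with a 50/50 class mix (an assumption for illustration)
parent = ["yes"] * 5 + ["no"] * 5

# A split that separates the classes completely
perfect_split = [["yes"] * 5, ["no"] * 5]

# A split whose children keep the same 50/50 mix as the parent
useless_split = [
    ["yes", "yes", "no", "no"],
    ["yes", "yes", "yes", "no", "no", "no"],
]

print("parent entropy:", entropy(parent))
print("gain, perfect split:", information_gain(parent, perfect_split))
print("gain, useless split:", information_gain(parent, useless_split))
```

A decision tree learner that uses this criterion would prefer the first split, since it removes all uncertainty about the class, while the second split leaves the class mix unchanged.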
BUSINESS CASE
Business Case: Intrusion Detection in IT Network | Meaning of Intrusion in IT | Cost of Intrusion |
Meaning of Intrusion Detection System
PROJECT 3
Project 3 - Network Intrusion Detection using Decision Tree & Ensemble Learning in R
PYTHON - 35 HOURS
PYTHON BASICS
What is Python? | Installing Anaconda | Understanding the Spyder Integrated Development Environment (IDE) |
Lists, Tuples, Dictionaries, Variables
DATA STRUCTURES IN PYTHON USED FOR DATA ANALYSIS
Intro to NumPy Arrays | Creating ndarrays | Indexing | Data Processing using Arrays | File Input and Output |
Getting Started with Pandas
DATA FRAME MANIPULATION
Data Acquisition (Import & Export) | Indexing | Selection and Filtering | Sorting & Summarizing | Descriptive Statistics |
Combining and Merging Data Frames | Removing Duplicates | Discretization and Binning | String Manipulation |
PLUS: Project Work on Python
OTHER PREDICTIVE MODELLING TOOLS
Intro to Machine Learning | Random Forests | Sklearn Library & Statsmodels
SAS - 40 HOURS
INTRODUCTION TO SAS AND SAS PROGRAMS
What is SAS? | Key Features | Submitting a SAS Program | SAS Program Syntax | Examining SAS Datasets |
Accessing SAS Libraries | Sorting and Grouping Reporting Data | Using SAS Formats
READING AND MANIPULATING DATA
Reading SAS Datasets | Reading Excel Data | Reading Raw Files | Reading Database Data | Creating Summary Reports |
Combining Datasets
DATA TRANSFORMATIONS
Writing Observations | Writing to Multiple Datasets | Accumulating Total | Creating Accumulating Total for a Group of Data |
Data Transformations
MACROS
Introduction to Macro Variables | Automatic Macro Variables | User Defined Macro Variables | Macro Variable Reference |
Defining and Calling Macros | Macro Parameters | Global and Local Symbol Table | Creating Macro Variables in the Data Step
SQL
Introduction to SQL | How Does RDBMS Work? | SQL Procedures | Specifying Columns | Specifying Rows | Presenting Data |
Summarizing Data | Writing Join Queries using SQL | Working with Subqueries, Indexes and Views | Set Operators |
Creating Tables and Views using Proc SQL
TABLEAU - 10 HOURS
TABLEAU BASIC
Introduction to Visualization | Working with Tableau | Visualization in Depth | Data Organisation | Advanced Visualization |
Mapping | Enterprise Dashboards | Data Presentation
BEST PRACTICES FOR DASHBOARDING AND REPORTING AND CASE STUDY
Have a Methodology | Know Your Audience | Define Resulting Actions | Classify Your Dashboard | Profile Your Data |
Use Visual Features Properly | Design Iteratively
RESUME BUILDING AND INTERVIEW PREP
Resume Building | Personal Branding | Tips and Resources | Interview Skills
1:1 MOCK INTERVIEWS
1:1 mock interviews with industry veterans help you clear the technical round of interviews and give you the confidence to
face real-world scenarios.
GROUP PROJECT PRESENTATION
Groups present their projects in front of their peers, and industry experts evaluate the solution (refresher session
for online batches).
HANDS-ON PROJECTS