
Course Title: Introduction to Data Science    Course Type: HC

Course Code: B23CS0104    Credits: 2    Class: II Semester


LTP        Credits   Contact Hours   Work Load   Total Number of Classes Per Semester   Assessment Weightage
                                                 Theory     Practical                   CIE     SEE
Lecture    2         2               2
Tutorial   -         -               -
Practice   -         -               -
Total      2         2               2           28         -                           50%     50%
COURSE OVERVIEW:

Data Science is an interdisciplinary, problem-solving oriented subject that applies scientific techniques
to practical problems. The course focuses on the preparation of datasets and the programming of data analysis tasks.
It covers the following topics: Set Theory, Probability Theory, Tools for Data Science, ML algorithms, and demonstration
of experiments using MS-Excel, Python, or R.

COURSE OBJECTIVE (S):


The objectives of this course are to:
1. Explain the fundamental concepts of Excel.
2. Illustrate the use of basic concepts of Data Science in real-world applications.
3. Demonstrate the use of SQL commands in real-world applications.
4. Discuss the functional components of Data Science for real-world applications.

COURSE OUTCOMES (COs)

After the completion of the course, the student will be able to:

CO#   Course Outcomes                                                                                                     POs                 PSOs
CO1   Make use of the basic concepts of Data Science in developing real-world applications.                               1 to 4, 12          1, 2, 3
CO2   Apply SQL commands in developing real-world applications.                                                           1 to 5, 12          1, 2, 3
CO3   Build data analytics solutions for real-world problems; perform analysis, interpretation and reporting of data.     1 to 5              1, 2, 3
CO4   Demonstrate visualization of data using Python libraries.                                                           1 to 5, 8 to 10     1, 2, 3
CO5   Find the modeling error in linear regression.                                                                       1 to 5              1, 2, 3
CO6   Use statistical principles to compute the mean and standard deviation of given data.                                1 to 4, 12          1, 2, 3

BLOOM’S LEVEL OF THE COURSE OUTCOMES:

Bloom’s Level
CO#   Remember (L1)   Understand (L2)   Apply (L3)   Analyze (L4)   Evaluate (L5)   Create (L6)
CO1   ✓
CO2   ✓
CO3   ✓ ✓
CO4   ✓ ✓ ✓ ✓
CO5   ✓
CO6   ✓
COURSE ARTICULATION MATRIX:

CO#/POs PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 2 2 2 2 3 1 1

CO2 2 3 2 1 2 2 2 3 2 2

CO3 2 3 3 2 2 3 3 3

CO4 3 3 3 2 2 2 2 2 3 3 3
CO5 2 3 2 2 2 3 3 3
CO6 3 3 2 2 2 3 3 3
Note: 1-Low, 2-Medium, 3-High

COURSE CONTENT
THEORY
Contents
UNIT – 1
Introduction to Microsoft Excel:
History and importance of Microsoft Excel, Creating Excel tables, understand how to Add, Subtract, Multiply,
Divide in Excel. Excel Data Validation, Sorting, Filtering, Grouping, Ungrouping and Subtotal. Introduction to
formulas and functions in Excel. Logical functions (operators) and conditions. Visualizing data using charts
in Excel. Import XML Data into Excel, How to Import CSV Data (Text) into Excel, How to Import MS Access
Data into Excel, Working with Multiple Worksheets.
UNIT – 2
Introduction to Data Science:
What is Data Science? Applications of Data Science, Data science life cycle, Tools for data science, definition of
AI, types of machine learning (ML), list of ML algorithms for classification, clustering, and feature selection.
Probability theory, Bayes' theorem, Bayes probability; Cartesian plane, equations of lines, graphs; exponents.

Introduction to SQL: Experimental demonstrations of SQL commands: DDL, DML, DCL, TCL, DQL. Import SQL
Database Data into Excel.
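
The SQL command categories listed above can be tried out without installing a database server. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a hypothetical student table with made-up rows; neither comes from the syllabus. SQLite has no user accounts, so DCL commands (GRANT, REVOKE) are only mentioned in a comment rather than executed.

```python
import sqlite3


def run_sql_demo():
    """Run one statement from each SQL category on an in-memory database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL (Data Definition Language): define the schema.
    # The table name and columns are hypothetical.
    cur.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
    )

    # DML (Data Manipulation Language): insert and update rows.
    cur.executemany(
        "INSERT INTO student (id, name, marks) VALUES (?, ?, ?)",
        [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)],
    )
    cur.execute("UPDATE student SET marks = marks + 5 WHERE id = 2")

    # TCL (Transaction Control Language): COMMIT makes the changes permanent.
    conn.commit()

    # TCL again: a change followed by ROLLBACK is undone.
    cur.execute("DELETE FROM student WHERE id = 3")
    conn.rollback()

    # DCL (Data Control Language) such as GRANT/REVOKE is not supported by
    # SQLite; it would be demonstrated on a server such as MySQL or PostgreSQL.

    # DQL (Data Query Language): read the data back.
    cur.execute("SELECT name, marks FROM student ORDER BY marks DESC")
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, marks in run_sql_demo():
        print(name, marks)
```

The deleted row reappears in the final query because the DELETE was rolled back, while the committed UPDATE persists.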

UNIT – 3
Data Relationship Methods:
Introduction to Correlation, Description of linear regression and Logistic Regression, Introducing the Gaussian,
Introduction to Standardization, Standard Normal Probability Distribution in Excel, Calculating Probabilities
from Z-scores, Central Limit Theorem, Algebra with Gaussians, Markowitz Portfolio Optimization,
Standardizing x and y Coordinates for Linear Regression, Standardization Simplifies Linear Regression,
Modeling Error in Linear Regression, Information Gain from Linear Regression.
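
The topic "Standardization Simplifies Linear Regression" can be illustrated with a short sketch, assuming NumPy and a small made-up dataset (hours studied versus exam score, not taken from the syllabus). When both variables are converted to z-scores using the population standard deviation, the least-squares slope equals the correlation coefficient r, the intercept is zero, and the root-mean-square modeling error is sqrt(1 - r^2).

```python
import numpy as np


def standardize(values):
    """Convert values to z-scores using the population standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def fit_line(x, y):
    """Return the least-squares slope and intercept of y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


# Made-up illustrative data.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 71]

zx, zy = standardize(hours), standardize(score)
slope, intercept = fit_line(zx, zy)

# Pearson correlation of the original (unstandardized) data.
r = np.corrcoef(hours, score)[0, 1]

# Modeling error: root-mean-square residual of the standardized fit.
residuals = zy - (slope * zx + intercept)
rmse = np.sqrt(np.mean(residuals ** 2))

if __name__ == "__main__":
    print("slope on z-scores:", slope)
    print("correlation r:", r)
    print("modeling error:", rmse)
```

The same identities hold for any dataset in which neither variable is constant, which is why standardizing first reduces the regression to a single number, r.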
UNIT – 4

Data visualization using scatter plots, charts, graphs, histograms, and maps. Statistical Analysis:
Descriptive statistics - Mean and Standard Deviation for Continuous Data; Frequency and Percentage for
Categorical Data.
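
The descriptive statistics named above can be sketched with Python's standard library alone. The data values below are made up for illustration, and the function names are hypothetical. Note that statistics.stdev computes the sample standard deviation (dividing by n - 1); statistics.pstdev would give the population version.

```python
from collections import Counter
from statistics import mean, stdev


def describe_continuous(values):
    """Mean and sample standard deviation for continuous data."""
    return {"mean": mean(values), "std": stdev(values)}


def describe_categorical(labels):
    """Frequency and percentage of each category."""
    counts = Counter(labels)
    total = len(labels)
    return {
        label: {"frequency": count, "percentage": 100 * count / total}
        for label, count in counts.items()
    }


# Made-up illustrative data.
heights_cm = [150, 160, 170, 180, 190]
grades = ["A", "B", "A", "C", "A", "B", "A", "B"]

continuous_summary = describe_continuous(heights_cm)
categorical_summary = describe_categorical(grades)

if __name__ == "__main__":
    print(continuous_summary)
    print(categorical_summary)
```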

Introduction to Python: Python basics, Strings, Lists, Tuples, Sets, Dictionaries. Introduction to Python libraries:
NumPy, Matplotlib, Pandas, Scikit-Learn. Implementation of ML.
TEXT BOOKS:

1. B.S. Grewal, “Higher Engineering Mathematics”, 43rd Edition, Khanna Publishers, 2015.
2. Ramakrishnan and Gehrke, “Database Management Systems”, 3rd Edition, McGraw Hill Publications, 2003.
3. “Mastering Data Analysis in Excel” - https://www.coursera.org/learn/analytics-excel.
4. Kenneth N. Berk, Carey, “Data Analysis with Microsoft Excel”, S. Chand & Company, 2004.
5. Joel Grus, “Data Science from Scratch - First Principles with Python”, O'Reilly, 2015.

REFERENCE BOOKS:

1. B.V. Ramana, “Higher Engineering Mathematics”, 19th edition, Tata McGraw Hill Publications, 2013.
2. Erwin Kreyszig, “Advanced Engineering Mathematics”, 9th edition, Wiley Publications, 2013.
3. Seymour Lipschutz, John J. Schiller, “Schaum's Outline of Introduction to Probability and Statistics”,
McGraw Hill Professional, 1998.

JOURNALS/MAGAZINES:

1. https://www.journals.elsevier.com/computational-statistics-and-data-analysis
2. https://www.springer.com/journal/41060 (International Journal on Data Science and Analytics)
3. https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8254253 (IEEE Magazine on Big Data and Analytics)

SWAYAM/NPTEL/MOOCs

1. Excel Skills for Business: Essentials, Macquarie University (https://www.coursera.org/learn/excel-essentials)
2. SQL for Data Science, University of California, Davis (https://www.coursera.org/learn/sql-for-data-science)
3. Data Science Math Skills, Duke University (https://www.edx.org/course/subject/data-science)
4. https://onlinecourses.nptel.ac.in/noc19_cs60/preview

SELF-LEARNING EXERCISES:
1. Relational database management system.
2. Advanced MS-Excel.
