Azure SQL Trainings: Contact: +91 90 32 82 44 67
Introduction to Azure
Introduction to Storage
1) Azure Storage
Azure Blob
Azure Table
Azure Message
Azure Queue
2) Azure Data Lake Storage (Gen1 and Gen2)
U-SQL Language
Introduction to U-SQL
U-SQL vs. SQL
Transforming Row Sets
Declare Parameters
Data Types
Expressions
Reading and Writing files
File sets
Grouping and Aggregation
U-SQL Catalog
Window functions
Set Operations
Joins
Complex Types
Extending U-SQL
Azure Streaming
Azure Databricks
Azure Spark
Sqoop
Oozie
Hive
Scala
Spark
Spark SQL
BIG DATA
Evolution of Data - Introduction to Big Data - Classification - Size Hierarchy - Why Big Data is Trending
(IoT, DevOps, Cloud Computing, Enterprise Mobility) - Challenges in Big Data - Characteristics - Tools for
Big Data - Why Big Data draws attention in the IT industry - What we do with Big Data - How Big Data can
be analyzed - Typical Distributed System - Drawbacks of traditional distributed systems
LINUX
History and Evolution - Architecture - Development Commands - Environment Variables - File Management -
Directory Management - Admin Commands - Advanced Commands - Shell Scripting - User and Group
Management - Permissions - Important Directory Structure - Disk Utilities - Compression
Techniques - Miscellaneous Commands
SQOOP – RDBMS
Introduction and History - Installation and configuration - Why Sqoop - In-depth Architecture -
Sqoop Import Properties - Sqoop Export Architecture - Commands (Import into HDFS, Hive, HBase from
MySQL) - Export - Incremental Import - Saved Jobs - Import All Tables - Sqoop workouts -
Sqoop best practices and performance tuning - Sqoop import/export use cases - Mock test on Sqoop
HIVE
Introduction - Architecture - Hive vs RDBMS - Detailed Installation (Metastore, integrating with Hue) -
Starting Metastore and Hive Server - Data types (primitive, collection) - Create tables (managed,
external) and DML operations (load, insert, export) - Managed vs external tables - HiveQL queries (select,
where, group by, having, sort by, order by) - Hive access through Hive Client, Beeline and Hue - File
formats (RC, ORC, Sequence) - Partitioning (static and dynamic), partition with external table, dropping
partitions and corresponding configuration parameters - Bucketing, partitioning vs bucketing - Views,
different types of joins (inner, outer) - Queries (union, union all, intersection, minus) - Add files to the
distributed cache, jars to the class path - Optimized joins (map-side join, bucket join) - Compression
on tables (LZO, Snappy) - SerDe (XML SerDe, JSON SerDe) - Parallel execution, sampling data, speculative
execution - Two POCs using a large dataset on the above topics - Mock test on Hive and its
architecture
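
As a rough illustration of the "partitioning vs bucketing" topic above, the sketch below mimics in plain Python how rows might be laid out: partitioning creates one directory per distinct column value, while bucketing spreads rows over a fixed number of files by hashing a column. The table name, column names, bucket count and helper functions are all hypothetical, and a plain modulo on an integer id stands in for Hive's own hash function.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, fixed when the table is created


def partition_path(table, row, partition_col):
    """Partitioning: every distinct value of the column gets its own directory."""
    return f"{table}/{partition_col}={row[partition_col]}"


def bucket_index(row, bucket_col, num_buckets=NUM_BUCKETS):
    """Bucketing: hash the column value into a fixed number of buckets.

    Assumption: integer ids, with a simple modulo standing in for Hive's hash.
    """
    return row[bucket_col] % num_buckets


def layout(table, rows, partition_col, bucket_col):
    """Group the bucket-column values by (partition directory, bucket number)."""
    files = defaultdict(list)
    for row in rows:
        key = (partition_path(table, row, partition_col),
               bucket_index(row, bucket_col))
        files[key].append(row[bucket_col])
    return dict(files)


# Hypothetical sample rows for an "orders" table
rows = [
    {"order_id": 1, "country": "IN"},
    {"order_id": 2, "country": "US"},
    {"order_id": 5, "country": "IN"},
    {"order_id": 6, "country": "IN"},
]
files = layout("orders", rows, partition_col="country", bucket_col="order_id")
```

The number of partitions grows with the data (one per country here), whereas the number of buckets stays fixed, which is why bucketing tends to suit high-cardinality columns such as ids.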
MAPREDUCE AND YARN
Hadoop Ecosystem Road Map - MapReduce Flow - MapReduce job submission in a YARN cluster in
detail - What is MapReduce? - How MapReduce works at a high level - Types of Input and Output Formats
- MapReduce in detail - Different types of files supported (Text, Sequence, Map and Avro) - Storage and
Processing Daemons Architecture Version 1 - Role of Job Tracker and Task Tracker - YARN daemons
(Resource Manager, Application Master, Node Manager), architecture and failure
handling - Schedulers - Resource Manager high availability - YARN Architecture
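
The MapReduce flow listed above can be sketched, under simplifying assumptions, as a single-process word count in plain Python. The function names and sample lines are illustrative only; a real job would run the map and reduce steps as distributed tasks, with the framework doing the shuffle.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input split: two lines of text
sample_lines = ["big data draws attention", "big data can be analyzed"]
counts = word_count(sample_lines)
```

Each mapper sees only its own line and each reducer sees only one key, which is what lets the same logic be spread across many nodes.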
SCALA
Scala Introduction - History - Why Scala - Scala Installation - Get deep insights into the functioning of
Scala - Execute Pattern Matching in Scala - OOP concepts (Classes, Objects, Collections, Inheritance,
Abstraction and Encapsulation) - Functional Programming in Scala (Closures, Currying, Expressions,
Anonymous Functions) - Know the concepts of classes in Scala - Object Orientation in Scala (Primary and
Auxiliary Constructors, Singleton Objects, Companion Objects) - Traits - Abstract classes
SPARK
Introduction - Scala/Python - History - Overview - MR vs Spark - Spark Libraries - Why Spark - RDDs -
Spark Internals - Transformations - Actions - DAG - Fault Tolerance - Lineage - Terminologies - Cluster
types - Hadoop Integration - Spark SQL - DataFrames - Datasets - Optimizers - AST - Session -
Structured Streaming - RDDs to Relations - Spark Streaming - Why Spark Streaming - Data masking
techniques - SCD implementation - Real-time use cases - End-to-end real-time integration with NiFi,
Kafka, Spark Streaming, EC2, Cassandra, RDBMS, different filesystems, Hive, Oozie and HBase
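
To give a feel for the "Transformations - Actions - Lineage" topics above, here is a toy sketch in plain Python. It does not use Spark; `MiniRDD` is a hypothetical stand-in class. It only shows the idea that transformations are recorded lazily and nothing runs until an action is called.

```python
class MiniRDD:
    """Toy stand-in for an RDD: records transformations, runs them on an action."""

    def __init__(self, source, lineage=()):
        self._source = source      # the original input data
        self._lineage = lineage    # recorded (name, function) steps, not yet run

    def map(self, fn):
        """Transformation: returns a new MiniRDD, executes nothing."""
        return MiniRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, fn):
        """Transformation: returns a new MiniRDD, executes nothing."""
        return MiniRDD(self._source, self._lineage + (("filter", fn),))

    def lineage(self):
        """Names of the recorded steps, in the order they will be applied."""
        return [name for name, _ in self._lineage]

    def collect(self):
        """Action: replay the recorded steps over the source and return a list."""
        data = iter(self._source)
        for name, fn in self._lineage:
            data = map(fn, data) if name == "map" else filter(fn, data)
        return list(data)


calls = []  # records every element that square() actually processes


def square(x):
    calls.append(x)
    return x * x


evens_squared = MiniRDD([1, 2, 3, 4]).map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)   # still 0: transformations alone run nothing
result = evens_squared.collect()   # the action triggers the whole chain
```

Because the object keeps its source and its recorded steps, `collect()` can be called again to recompute the result from scratch, which is a simplified picture of how lineage supports fault tolerance.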