
Aim: Data Engineer


Data Engineer. <<>> https://www.youtube.com/user/krishnaik06 <datascience><<>>

<<certification in AWS>>

1.Programming language (Python).

2.Operating systems
Windows and Linux.

3.Data structure and algorithms


arrays
strings
linked list
stack
queue
Tree
Graph(basics)
Dynamic programming
searching
sorting

4.DBMS
DDL, DCL, DML, integrity constraints, data schema, basic operations, ACID
properties, transactions, concurrency control, deadlock, indexing, hashing,
normal forms (normalization), views, stored procedures, ER diagrams.

5. SQL scripting:
a.Transactional databases: MySQL/PostgreSQL.
b.all types of joins
c.Nested queries.
d.Group by
e.Use of Case When statements.
f.window functions.
g.WITH clause (CTEs).
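
The SQL topics above (joins, GROUP BY, CASE WHEN, window functions, CTEs) can be tried together in one small query. This is only a sketch: it uses Python's built-in sqlite3 module instead of MySQL/PostgreSQL so that it runs without a server, and the customers/orders tables and their rows are invented for illustration. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory SQLite database; table names and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 40.0);
""")

query = """
WITH totals AS (                                   -- CTE
    SELECT c.name AS name,
           COALESCE(SUM(o.amount), 0) AS total     -- aggregate per customer
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id  -- keeps customers with no orders
    GROUP BY c.id, c.name
)
SELECT name,
       total,
       CASE WHEN total >= 100 THEN 'high'
            WHEN total > 0 THEN 'low'
            ELSE 'none' END AS tier,               -- CASE WHEN
       RANK() OVER (ORDER BY total DESC) AS rnk    -- window function
FROM totals
ORDER BY rnk, name;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same query text should carry over to PostgreSQL or MySQL 8 with little or no change, but that is an assumption worth checking against the database you actually use.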

6.Basic terminologies in bigdata:


a.what is big data?
b.5 V's of big data
c.Distributed computation
d.Distributed storage.
e.vertical vs horizontal scaling.
f.commodity hardware
g.clusters
h.File Formats
1.CSV
2.JSON
3. AVRO
4.Parquet.
5.ORC
i.Types of data
1.Structured
2.Unstructured
3.Semi-structured
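
To make the difference between a flat (structured) format and a semi-structured one concrete, here is a minimal sketch that writes the same records as JSON and as CSV using only the Python standard library. The sensor records are made up for illustration. Avro, Parquet and ORC need third-party libraries and are not shown.

```python
import csv
import io
import json

# Hypothetical sample records with a nested field ("tags").
records = [
    {"id": 1, "name": "sensor-a", "tags": ["indoor", "temp"]},
    {"id": 2, "name": "sensor-b", "tags": []},
]

# JSON is semi-structured: nesting and value types survive a round trip.
json_text = json.dumps(records)
from_json = json.loads(json_text)

# CSV is flat: the nested list has to be flattened into one text column,
# and every value comes back as a string.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "tags"])
writer.writeheader()
for r in records:
    writer.writerow({"id": r["id"], "name": r["name"], "tags": "|".join(r["tags"])})
buf.seek(0)
from_csv = list(csv.DictReader(buf))

print(from_json[0])
print(from_csv[0])
```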

7. data exploration libraries:


a.pandas.
b.Numpy
c.Boto3 (AWS SDK for Python)

8.Data warehousing concepts:
a. OLAP vs OLTP.
b.Dimension tables.
c.Fact tables.
d.star schema.
e. snowflake schema.
f. warehouse design questions
g. many more topics.
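
A star schema from the list above can be sketched as one fact table joined to its dimension tables. The example below is an illustration only: it uses SQLite through Python's sqlite3 module, and the table names (fact_sales, dim_product, dim_date) and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
-- The fact table holds measures plus foreign keys to each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2023), (11, 2024);
INSERT INTO fact_sales VALUES (1, 10, 2, 30.0), (2, 10, 1, 60.0), (1, 11, 3, 45.0);
""")

# A typical OLAP-style question: revenue by year and product category.
report = conn.execute("""
SELECT d.year, p.category, SUM(f.revenue) AS revenue
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY d.year, p.category
ORDER BY d.year, p.category
""").fetchall()

for row in report:
    print(row)
```

In a snowflake schema the dimensions would be normalized further, for example by moving category into its own table referenced from dim_product.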

9.Big Data frameworks:


a.Apache Hadoop (understanding the architecture is most important)
1.HDFS
2.MapReduce
3.YARN
b.Apache Hive
1.How to load data in different file formats
2.Internal tables.
3.External tables.
4.Querying table data stored in HDFS
5.Partitioning
6.Bucketing
7.Map-side join
8.Sort-merge join
9.UDFs in Hive
10.SerDe in Hive
C.Apache Spark (Most Important)
1.Spark Core.
2.Spark SQL
3.Spark Streaming
D.Apache SQOOP
e.Apache NIFI
F.Apache flume.
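
The MapReduce model listed under Hadoop above can be sketched in plain Python as a word count. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop code; the function names and input lines are invented for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


lines = ["big data big clusters", "data pipelines"]  # hypothetical input
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

On a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; here everything runs in one process to keep the idea visible.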

10.Workflow Schedulers, Dependency management:


a.Apache Airflow
b.Azkaban
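
The core idea behind a workflow scheduler is running tasks in an order that respects their dependencies. The sketch below shows only that idea, using graphlib from the Python standard library (Python 3.9+). It is not Airflow or Azkaban code, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields the tasks in an order that satisfies every dependency.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Real schedulers add retries, scheduling intervals and monitoring on top of this ordering step.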

11.NoSQL Databases:
a.HBASE
b.Datastax cassandra (Recommended)
c.ElasticSearch
d.MongoDB.

12.Messaging Queue Frameworks:


a.Apache KAFKA

13.Dashboarding tools:
a.Tableau
b.PowerBI
C.Grafana.
D.Kibana (part of the ELK stack: Elasticsearch, Logstash, Kibana)

14.BigData Services in Cloud(AWS):


a.Ondemand machines
1. AWS ec2.
b.Access management
1.AWS IAM.
C.For Storing and Accessing credentials
1. AWS Secrets Manager.
D.distributed file storage.
1.AWS S3.
E.Database and query services
1. AWS RDS (transactional databases).
2.AWS Athena (SQL queries over data in S3).
3.AWS Redshift (data warehouse).
F.NoSQL Database Services
1.AWS DynamoDB.
G.serverless
1.AWS Lambda.
H.ETL Services
1.AWS Glue.
I.Scheduler
1.AWS CloudWatch (scheduled events).

AWS SNS, AWS SQS: messaging queues.

AWS Kinesis: real-time data processing.

python:- 1.https://www.youtube.com/watch?v=_uQrJ0TkZlc&t=16281s
2.https://www.programiz.com/python-programming

linux:- https://practice.geeksforgeeks.org/batch/linux-shell-script
free:- ELEARNINGBLINUX

datastructure:- 1.https://www.youtube.com/watch?v=5_5oE5lgrhw&list=PLu0W_9lII9ahIappRPN0MCAgtOu3lQjQi&t=0s
2.https://www.geeksforgeeks.org/
3.https://leetcode.com/

DBMS:- 1.https://www.youtube.com/watch?v=kBdlM6hNDAE&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&t=0s
2.https://www.studytonight.com/dbms/

SQLSCRIPTING:- 1.https://www.youtube.com/watch?v=HXV3zeQKqGY&t=0s
2.https://www.youtube.com/watch?v=7S_tz1z_5bA&t=0s
3.https://www.w3schools.com/sql/

Basic terminologies in bigdata:- 1.https://data-flair.training/blogs/what-is-big-data/
2.https://www.edureka.co/blog/what-is-big-data/

Data exploration libraries:- 1.pandas:- https://www.youtube.com/watch?v=UB3DE5Bgfx4&t=0s
2.NumPy:- https://www.youtube.com/watch?v=DI8wg3SRV90&t=0s

Data warehousing concepts:-https://www.youtube.com/watch?v=J326LIUrZM8&t=0s
:-https://www.tutorialspoint.com/dwh/dwh_data_warehousing.htm

Big data frameworks (Hadoop, Hive, Spark, Sqoop)
:-https://www.youtube.com/results?search_query=learning+journal
:-https://www.youtube.com/user/edurekaIN
:-https://data-flair.training/
:-https://www.edureka.co/

Workflow schedulers, dependency management
:-https://www.youtube.com/watch?v=niJ063dV5nA&t=0s
:-https://www.youtube.com/watch?v=6RebQR-5Kh8&t=0s

NoSQL Databases:- 1:-HBase:-https://www.youtube.com/watch?v=NOX6-nDtrFQ&t=0s
2:-Cassandra:-https://www.youtube.com/watch?v=iDhIjrJ7hG0&list=PL9ooVrP1hQOGJ4Yz9vbytkRmLaD6weg8k&t=0s
3:-Elasticsearch:-https://www.youtube.com/watch?v=1EnvkPf7t6Y&t=0s
4:-MongoDB:-https://www.youtube.com/watch?v=pWbMrx5rVBE&t=0s

Apache Kafka:- https://www.youtube.com/watch?v=daRykH67_qs&t=0s


Dashboard tools:-Tableau:-https://www.youtube.com/watch?v=aHaOIvR00So&t=0s
PowerBI:-https://www.youtube.com/watch?v=3u7MQz1EyPY&t=0s
Grafana:-https://www.youtube.com/watch?v=CjABEnRg9NI&t=0s
Kibana:- https://www.youtube.com/watch?v=gQ1c1uILyKI&t=0s

Bigdata services in cloud (AWS):-https://www.youtube.com/watch?v=k1RI5locZE4&t=0s
:-https://www.youtube.com/watch?v=8PyLr0Zzczw&t=0s
:-https://www.simplilearn.com/aws-big-data-article
