CS341: Project in Mining Massive Datasets: Michele Catasta, Jure Leskovec, Jeffrey Ullman
Agenda
● Intro by Michele
● Onboarding: 4/3. We will meet every Wednesday in April, then on an as-needed basis
Advice on conducting research
How to prepare for a meeting
Grading
Google Cloud Platform
CS341
● Founded a company in 2014 (Denizen)
2. Command line (gcloud SDK tools): useful for working with resources once
they are provisioned, e.g. to SSH into instances, submit jobs, and copy files
● BigQuery: https://cloud.google.com/bigquery
● Dataprep: https://cloud.google.com/dataprep/
● Dataproc: https://cloud.google.com/dataproc/
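The command-line tasks mentioned above (SSH into an instance, copy files, submit a job) could look roughly like the sketch below. It only builds and prints the gcloud command lines; it does not run them. The instance, zone, cluster, region, and file names are hypothetical placeholders, not values from the course, and the flags should be checked against `gcloud help` for the installed SDK version.

```python
import shlex

# Hypothetical placeholders: replace with your own project resources.
INSTANCE = "cs341-vm"
ZONE = "us-west1-b"
CLUSTER = "cs341-cluster"
REGION = "us-west1"


def ssh_command(instance, zone):
    """Build the command that opens an SSH session on a Compute Engine VM."""
    return ["gcloud", "compute", "ssh", instance, "--zone", zone]


def copy_command(local_path, instance, remote_path, zone):
    """Build the command that copies a local file to a VM."""
    return [
        "gcloud", "compute", "scp",
        local_path, f"{instance}:{remote_path}",
        "--zone", zone,
    ]


def submit_pyspark_command(script, cluster, region):
    """Build the command that submits a PySpark script to a Dataproc cluster."""
    return [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", script,
        f"--cluster={cluster}", f"--region={region}",
    ]


if __name__ == "__main__":
    commands = [
        ssh_command(INSTANCE, ZONE),
        copy_command("data.csv", INSTANCE, "~/data.csv", ZONE),
        submit_pyspark_command("job.py", CLUSTER, REGION),
    ]
    for cmd in commands:
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

To actually execute one of these, the printed line can be pasted into a terminal, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where the gcloud SDK is installed and authenticated.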