


CAIM: Cerca i Anàlisi d’Informació Massiva

FIB, Grau en Enginyeria Informàtica

Slides by Marta Arias, José Luis Balcázar, Ramon Ferrer-i-Cancho, Ricard Gavaldà
Department of Computer Science, UPC

Fall 2020
http://www.cs.upc.edu/~caim

0. Presentation
COVID 19

- Follow the instructions that FIB has sent to you.
- Always sit in the same place.
- Write down your row and column somewhere so that you can remember them.

Instructors

- Ramon Ferrer-i-Cancho (lectures + exercises 10 & 20; lab 12)
  - rferrericancho@cs.upc.edu
  - Omega S124, 93 413 4028
- Ignasi Gómez (lab 11, 21 & 22)
  - ignasi.gomez@upc.edu
- Javier Béjar (lab 13)
  - bejar@cs.upc.edu
  - Omega 204, 93 413 7879

Class Logistics

- Fridays, 12–14 (A6E01), 15–17 (A6E02)
  - Theory and exercises. Exercises will often be proposed in advance.
- Thursdays, lab sessions
  - Guided lab activities; expect to complement each session with about 2 additional hours of autonomous work on average.
  - Some lab sessions will finish by handing in a short written report; these reports count towards the evaluation of the course.

Lab work - important rules

- Lab work is done in pairs. Exceptions require prior permission.
- This semester: keep the same partner for the whole semester (see instructions at Racó).
- Do not exchange information with others beyond general ideas; anything more will be considered plagiarism.

Exercises

- In class, we will solve only some of the proposed exercises.
- You are strongly encouraged to try to solve the rest of the exercises.
- Self-study: one or more small topics will not be explained in class, but they will appear in the exam.

Evaluation

- Evaluation: as per the “Guia Docent”
- Partial exam 1 (P1): November 5, 16:00-17:30 (during the partial exam week); Partial exam 2 (P2): 11 January 2021, 15:00-18:00
- On the day of P2 you may instead choose to take a final exam (F) covering the whole course
- Final grade: 40 % Lab + max(30 % P1 + 30 % P2, 60 % F)
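As a worked illustration of the grading formula above, here is a minimal Python sketch. The function name `final_grade`, the 0 to 10 grading scale, and the choice to count an exam that was not taken as 0 are assumptions made for this example; the “Guia Docent” remains the authoritative source.

```python
def final_grade(lab, p1, p2=0.0, final=0.0):
    """Sketch of: 40% Lab + max(30% P1 + 30% P2, 60% F).

    All grades are assumed to be on a 0-10 scale (an assumption,
    not stated on the slide). An exam that was not taken is
    assumed to count as 0.
    """
    partials_part = 0.3 * p1 + 0.3 * p2   # route via the two partial exams
    final_part = 0.6 * final              # route via the final exam
    return 0.4 * lab + max(partials_part, final_part)


if __name__ == "__main__":
    # Student who takes both partial exams.
    print(round(final_grade(lab=8, p1=6, p2=7), 2))
    # Student who takes the final exam instead of P2.
    print(round(final_grade(lab=8, p1=4, final=7), 2))
```

Because of the `max`, taking the final exam can only help when 60 % of F exceeds the combined contribution of the partial exams.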

Contents I

First half (until midterm):

- Core Information Retrieval:
  - Introduction: Concept. The IR process
  - Information Retrieval Models
  - Indexing and Searching, Implementation
  - Information Retrieval Evaluation, Feedback Models
- Web Search:
  - Link analysis: PageRank
  - Crawling the web
  - Architecture of a Web search system

Contents II

Second half:

- The “Big Data” Slogan
- Architecture of large-scale web search systems
- The Map-Reduce paradigm
- Introduction to NoSQL databases
- The Apache ecosystem for web search
- Social Network Analysis:
  - Characterization of real complex networks
  - Communities, influence, information diffusion
- Clustering and Locality Sensitive Hashing
- Recommender Systems

Bibliography

- R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval (2nd ed.). Addison Wesley, 2010.
- I.H. Witten, A. Moffat, T. Bell: Managing Gigabytes. Morgan Kaufmann, 1999.
- C.D. Manning, P. Raghavan, H. Schütze: Introduction to Information Retrieval. Cambridge University Press, 2008.
- Z. Markov, D.T. Larose: Data Mining the Web. Wiley, 2007.
- M. Russell: Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites. O’Reilly, 2011.
- ... There’s a whole web out there
