introduction to

elasticsearch.
Ruslan Zavacky

@ruslanzavacky | ruslan.zavacky@gmail.com
Released in 2010
In 2014, $70 million in Series C funding

2
real time data
Data flows into your system all the time. The question is … how quickly can that data become an insight? With Elasticsearch, real-time is the only time.

real time analytics
Search isn’t just free text search anymore - it’s about exploring your data. Understanding it. Gaining insights that will make your business better or improve your product.

high availability
Elasticsearch clusters are resilient - they will detect and remove failed nodes, and reorganise themselves to ensure that your data is safe and accessible.

multi-tenancy
A cluster can host multiple indices which can be queried independently or as a group. Index aliases allow you to add indexes on the fly, while being transparent to your application.

3
full text search
Elasticsearch uses Lucene under the covers to provide the most powerful full text search capabilities available in any open source product. Search comes with multi-language support, a powerful query language, support for geolocation, context aware did-you-mean suggestions, autocomplete and search snippets.

document oriented
Store complex real world entities in Elasticsearch as structured JSON documents. All fields are indexed by default, and all the indices can be used in a single query, to return results at breathtaking speed.

conflict management
Optimistic version control can be used where needed to ensure that data is never lost due to conflicting changes from multiple processes.

schema free
Elasticsearch allows you to get started easily. Toss it a JSON document and it will try to detect the data structure, index the data and make it searchable. Later, apply your domain specific knowledge of your data to customise how your data is indexed.
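
The optimistic version control mentioned on this slide can be pictured with a small in-memory sketch. It is not Elasticsearch code: the class and method names (VersionedStore, index, expected_version) are hypothetical and only illustrate the idea that a write is rejected when the writer's view of the document is stale.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected version no longer matches the stored one."""


class VersionedStore:
    """Toy document store that bumps a version number on every write."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        """Return (version, source) for a stored document."""
        return self._docs[doc_id]

    def index(self, doc_id, source, expected_version=None):
        """Store a document; fail if expected_version is given and stale."""
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(source))
        return new_version


if __name__ == "__main__":
    store = VersionedStore()
    version = store.index("1", {"pages": 230})
    store.index("1", {"pages": 231}, expected_version=version)
    try:
        # A second writer still holding the old version is rejected.
        store.index("1", {"pages": 999}, expected_version=version)
    except VersionConflict as error:
        print("conflict:", error)
```

The losing writer is expected to re-read the document and retry, so no change is silently overwritten.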

4
restful api
Elasticsearch is API driven. Almost any action can be performed using a simple RESTful API using JSON over HTTP. An API already exists in the language of your choice.

per-operation persistence
Elasticsearch puts your data safety first. Document changes are recorded in transaction logs on multiple nodes in the cluster to minimise the chance of any data loss.

apache 2 open source license
Elasticsearch can be downloaded, used and modified free of charge. It is available under the Apache 2 license, one of the most flexible open source licenses available.

built on top of apache lucene™
Apache Lucene is a high performance, full-featured Information Retrieval library, written in Java. Elasticsearch uses Lucene internally to build its state of the art distributed search and analytics capabilities.

5
who

6
7
8
Unstructured search

9
Structured search

10
Enrichment

11
Sorting

12
Pagination

13
Aggregation

14
Suggestions

15
Elasticsearch in 10 seconds

• Schema-free, REST & JSON based distributed document store

• Open Source: Apache License 2.0

• Zero configuration

• Written in Java, extensible

16
The most
important question

17
18
Exploding kittens
on Kickstarter
> 195,794 backers
> $7,840,830 pledged
… and yes, Kickstarter uses Elasticsearch

19
Capabilities

20
Capabilities
Store schema-less data
Or create a schema for your data
Manipulate your data record by record
Or use multi-document APIs to do bulk operations
Perform queries/filters on your data for insights
Or, if you are a DevOps person, use APIs to monitor
Do not forget about built-in full-text search and analysis
Document API, Search APIs, Indices API, Cat APIs, Cluster API, Query DSL, Validate API, Search API, More Like This API, Mapping, Analysis, Modules
21
Auto Completion

SELECT name
FROM product
WHERE name LIKE 'd%'

1k records 500k records 20m records

22
Auto Completion

Yea, sure…

23
Auto Completion: FST

24
Auto Completion
Multiple Inputs
Single Unified Output
Scoring
Payloads
Synonyms
Ignoring stopwords
Going fuzzy
Statistics

25
Auto Completion
curl -X PUT localhost:9200/hotels/hotel/2 -d '
{
"name" : "Hotel Monaco",
"city" : "Munich",
"name_suggest" : {
"input" : [
"Monaco Munich",
"Hotel Monaco"
],
"output": "Hotel Monaco",
"weight": 10
}
}'
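
The document above carries several inputs, one unified output and a weight. As a rough illustration of what that buys you, here is a toy in-memory prefix index. It is only a sketch: Elasticsearch builds an FST for this, while the code below swaps in a sorted list with binary search, and the names SuggestIndex, add and suggest are hypothetical. The second hotel in the usage section is made-up sample data.

```python
from bisect import bisect_left


class SuggestIndex:
    """Toy prefix index: many inputs map to one output with a weight."""

    def __init__(self):
        self._entries = []  # sorted list of (lowercased input, output, weight)

    def add(self, inputs, output, weight=1):
        """Register every input string as a way to reach the same output."""
        for text in inputs:
            self._entries.append((text.lower(), output, weight))
        self._entries.sort()

    def suggest(self, prefix, size=5):
        """Return outputs whose inputs start with prefix, heaviest first."""
        prefix = prefix.lower()
        start = bisect_left(self._entries, (prefix,))
        best = {}
        for text, output, weight in self._entries[start:]:
            if not text.startswith(prefix):
                break  # sorted order: no later entry can match
            best[output] = max(weight, best.get(output, 0))
        ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
        return [output for output, _ in ranked[:size]]


if __name__ == "__main__":
    index = SuggestIndex()
    index.add(["Monaco Munich", "Hotel Monaco"], "Hotel Monaco", weight=10)
    index.add(["Hotel Adlon"], "Hotel Adlon", weight=5)  # hypothetical sample
    print(index.suggest("m"))
    print(index.suggest("hotel"))
```

Typing either "m" or "hotel" reaches "Hotel Monaco", which is the point of listing multiple inputs for a single output.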

26
Faceted Navigation

27
Aggregation & Filtering

Documents

Query

Buckets

Metrics 123 344 545
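
The flow on this slide (documents, then a query, then buckets, then metrics per bucket) can be sketched in a few lines of plain Python. This is an illustration of the idea, not the Elasticsearch aggregation API; the function name aggregate and the sample products are hypothetical.

```python
from collections import defaultdict


def aggregate(documents, query, bucket_key, metric_field):
    """Filter documents, group the matches into buckets, compute metrics."""
    buckets = defaultdict(list)
    for doc in documents:
        if query(doc):  # the query narrows down the document set
            buckets[doc[bucket_key]].append(doc[metric_field])
    # one metric set per bucket: document count and average
    return {
        key: {"doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in buckets.items()
    }


if __name__ == "__main__":
    products = [  # hypothetical sample data
        {"brand": "acme", "price": 10.0, "in_stock": True},
        {"brand": "acme", "price": 30.0, "in_stock": True},
        {"brand": "globex", "price": 50.0, "in_stock": True},
        {"brand": "globex", "price": 5.0, "in_stock": False},
    ]
    print(aggregate(products, lambda doc: doc["in_stock"], "brand", "price"))
```

Faceted navigation is the same pattern: each facet value is a bucket and the number shown next to it is the bucket's document count.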

32
Faceted Navigation

33
Snapshot / Restore
Snapshot
curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"

Restore
curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore"

34
Percolate API
Store queries in Elasticsearch.
Pass documents as queries. 
Observe matched queries.

WUT?

35
Percolate API
Use Case
You promise customers a notification when a plane ticket becomes available and cheaper.
Solution
Store each customer's criteria for the desired flight
- departure, destination, max price
When you store flight data, match it against the saved percolator queries.
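
The flight example boils down to search in reverse: criteria are stored first and each new document is matched against them. A minimal in-memory sketch of that idea follows. It does not call the Percolate API; FlightAlerts, register and percolate are hypothetical names, and the airport codes are sample data.

```python
class FlightAlerts:
    """Toy reverse search: stored criteria are matched against new flights."""

    def __init__(self):
        self._criteria = {}  # alert_id -> (departure, destination, max_price)

    def register(self, alert_id, departure, destination, max_price):
        """Store one customer's criteria, like registering a percolator query."""
        self._criteria[alert_id] = (departure, destination, max_price)

    def percolate(self, flight):
        """Return the ids of all stored criteria that the flight satisfies."""
        return sorted(
            alert_id
            for alert_id, (departure, destination, max_price) in self._criteria.items()
            if flight["departure"] == departure
            and flight["destination"] == destination
            and flight["price"] <= max_price
        )


if __name__ == "__main__":
    alerts = FlightAlerts()
    alerts.register("alice", "RIX", "MUC", 100.0)
    alerts.register("bob", "RIX", "MUC", 60.0)
    # Only customers whose price limit is met should be notified.
    print(alerts.percolate({"departure": "RIX", "destination": "MUC", "price": 80.0}))
```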
36
Percolate API
Store Query
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"match" : {
"message" : "bonsai tree"
}
}
}'

Match document
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
"doc" : {
"message" : "A new bonsai tree in the office"
}
}'

37
Percolate API
{
"took" : 19,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"total" : 1,
"matches" : [
{
"_index" : "my-index",
"_id" : "1"
}
]
}

38
More like this API
curl -XGET 'http://localhost:9200/memes/meme/1/_mlt?mlt_fields=face&min_doc_freq=1'

39
scalability

40
Distributed & scalable
Replication
Read scalability
Removing SPOF

Sharding
Split logical data over several machines
Write scalability
Control data flows
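
Sharding relies on a deterministic rule that maps each document to one shard. The sketch below shows the general shape of such a rule (hash of the id, modulo the number of primary shards). It is an assumption-laden illustration: zlib.crc32 stands in for whatever hash function Elasticsearch actually uses, and shard_for and distribute are hypothetical helper names.

```python
import zlib
from collections import Counter


def shard_for(doc_id, number_of_shards):
    """Map a document id to a shard number in [0, number_of_shards)."""
    # crc32 is a stand-in hash chosen for this sketch only
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def distribute(doc_ids, number_of_shards):
    """Count how many of the given ids land on each shard."""
    return Counter(shard_for(doc_id, number_of_shards) for doc_id in doc_ids)


if __name__ == "__main__":
    ids = [f"order-{n}" for n in range(1000)]
    print(distribute(ids, 4))
```

Because the shard number depends on the modulus, changing the shard count would send existing ids to different shards, which is why the number of primary shards is chosen when the index is created.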

41
Distributed & scalable

node 1
orders: shards 1 2 3 4
products: shards 1 2

curl -X PUT localhost:9200/orders -d '{
"settings.index.number_of_shards" : 4,
"settings.index.number_of_replicas": 1
}'

curl -X PUT localhost:9200/products -d '{
"settings.index.number_of_shards" : 2,
"settings.index.number_of_replicas": 0
}'

42
Distributed & scalable

node 1 node 2
orders orders

1 2 1 2

3 4 3 4

products products

1 2

43
Distributed & scalable

node 1 node 2 node 3

orders orders orders

1 2 2 1

4 3 3 4

products products products

1 2

44
API tour

45
Create

» curl -X PUT localhost:9200/books/book/1 -d '

{
"title" : "Elasticsearch - The definitive guide",
"authors" : "Clinton Gormley",
"started" : "2013-02-04",
"pages" : 230
}'

46
Update

» curl -X PUT localhost:9200/books/book/1 -d '

{
"title" : "Elasticsearch - The definitive guide",
"authors" : [ "Clinton Gormley", "Zachary Tong"],
"started" : "2013-02-04",
"pages" : 230
}'

47
Delete

» curl -X DELETE localhost:9200/books/book/1

Get

» curl -X GET localhost:9200/books/book/1

48
Search

» curl -X GET localhost:9200/books/_search?q=elasticsearch

{
"took" : 2, "timed_out" : false,
"_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
"hits" : {
"total" : 1, "max_score" : 0.076713204,
"hits" : [ {
"_index" : "books", "_type" : "book", "_id" : "1",
"_score" : 0.076713204, "_source" : {
"title" : "Elasticsearch - The definitive guide",
"authors" : [ "Clinton Gormley", "Zachary Tong" ],
"started" : "2013-02-04", "pages" : 230
}
}]
}
}
49
Search Query DSL
» curl -XGET 'localhost:9200/books/book/_search' -d '{
"query": {
"filtered" : {
"query" : {
"match": {
"text" : {
"query" : "To Be Or Not To Be",
"cutoff_frequency" : 0.01
}
}
},
"filter" : {
"range": {
"price": {
"gte": 20.0,
"lte": 50.0
…
}
}'

50
Use case: Product Search Engine

51
Product Search Engine

Just index all your products and be happy?

Search is not that easy

Synonyms, Suggestions, Faceting, De-compounding,

Custom scoring, Analytics, Price agents,
Query optimisation, beyond search

52
Neutrality? Really?
Is full-text search relevancy really your
preferred scoring algorithm?

Possible influential factors

Age of the product, been ordered in last 24h

In stock?
Special offer
Provision
No shipping costs
Rating (product, seller)
Returns
….
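
One way to picture custom scoring is to multiply the full-text relevance score by boosts derived from the factors listed above. The sketch below is purely illustrative: the function name business_score, the field names and every boost value are assumptions made up for the example, not values from the talk or from Elasticsearch.

```python
def business_score(text_score, product):
    """Combine a full-text relevance score with hypothetical business boosts."""
    boost = 1.0
    if product.get("in_stock"):
        boost *= 1.5  # assumed boost for items that can ship now
    if product.get("special_offer"):
        boost *= 1.2  # assumed boost for promoted items
    if product.get("free_shipping"):
        boost *= 1.1  # assumed boost for no shipping costs
    # assumed: rating on a 0..5 scale adds up to 50% on top
    boost *= 1.0 + product.get("rating", 0.0) / 10.0
    return text_score * boost


if __name__ == "__main__":
    hits = [  # hypothetical search hits with their text relevance scores
        (0.9, {"name": "A", "in_stock": False, "rating": 3.0}),
        (0.7, {"name": "B", "in_stock": True, "rating": 4.5}),
    ]
    ranked = sorted(hits, key=lambda hit: business_score(*hit), reverse=True)
    print([product["name"] for _, product in ranked])
```

With these made-up weights the in-stock, better-rated product overtakes the one with the higher pure text score, which is the non-neutral ranking the slide argues for.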
53
ecosystem

56
Ecosystem

• Plugins
• Clients for many languages
• Kibana
• Logstash
• Hadoop integration
• Marvel

57
spoiler alert!

59
what is data?

60
Whatever provides value for your business.

61
Internal
Domain data: orders, products
Application data: log files, metrics
External
Social media streams, email

62
63
Logstash
• Managing events and logs

• Collect data

• Parse data

• Enrich data

• Store data (search and visualising)

64
Why collect and centralise data?

• Access log files without system access

• Shell scripting: Too limited or slow

• Using unique ids for errors, aggregate it across your stack
• Reporting (everyone can create his/her own report)

• Bonus points: Unify your data to make it easily searchable

65
Unify dates
• apache [19/Feb/2015:19:00:00 +0000]

• unix timestamp 1424372400

• log4j [2015-02-19 19:00:00,000]

• postfix.log Feb 19 19:00:00

• ISO 8601 2015-02-19T19:00:00+02:00
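
To make the five formats above comparable, each one can be parsed and rendered as a single UTC ISO 8601 string. The following is a minimal sketch using only the Python standard library, not Logstash's date filter. Assumptions are labeled in the comments: the log4j and postfix lines carry no time zone, so UTC is assumed, and the postfix line has no year, so the caller supplies one. The function name to_utc_iso and the source labels are hypothetical, and month names are parsed with the English locale in mind.

```python
from datetime import datetime, timezone


def to_utc_iso(source, text, year=2015):
    """Parse a timestamp from a known log source and return it as UTC ISO 8601."""
    if source == "apache":
        parsed = datetime.strptime(text.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")
    elif source == "unix":
        parsed = datetime.fromtimestamp(int(text), tz=timezone.utc)
    elif source == "log4j":
        # assumption: no zone in the log line, treat it as UTC
        parsed = datetime.strptime(
            text.strip("[]"), "%Y-%m-%d %H:%M:%S,%f"
        ).replace(tzinfo=timezone.utc)
    elif source == "postfix":
        # assumption: no year and no zone in the log line
        parsed = datetime.strptime(
            f"{year} {text}", "%Y %b %d %H:%M:%S"
        ).replace(tzinfo=timezone.utc)
    elif source == "iso8601":
        parsed = datetime.fromisoformat(text)
    else:
        raise ValueError(f"unknown source: {source}")
    return parsed.astimezone(timezone.utc).isoformat()


if __name__ == "__main__":
    samples = [
        ("apache", "[19/Feb/2015:19:00:00 +0000]"),
        ("unix", "1424372400"),
        ("log4j", "[2015-02-19 19:00:00,000]"),
        ("postfix", "Feb 19 19:00:00"),
        ("iso8601", "2015-02-19T19:00:00+02:00"),
    ]
    for source, text in samples:
        print(source, to_utc_iso(source, text))
```

Note that the ISO 8601 sample carries a +02:00 offset, so it normalises to 17:00 UTC while the other four land on 19:00 UTC under the stated assumptions.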

66
Logstash

• Managing events and logs
• Collect data } Input
• Parse data } Filter
• Enrich data } Filter
• Store data (search and visualise) } Output
67
kibana

68
Kibana

69
Thank You!

73
Feedback

☺ ! ☹
Sponsors of XXVIII DevClub.lv
