Delivering Search for Today's Local,
Social, and Mobile Applications
Brian Pinkerton
VP and GM Amazon A9 and CloudSearch
July 18, 2013
Agenda
• Local, social, and mobile search
• Building with CloudSearch
• CloudSearch in action
• ManageFlitter on CloudSearch, James Peter, CTO & Co-Founder, 89n
The Rise of Mobile Search
• 45% of users 18-29 use mobile search daily (Icebreaker Consulting)
• Mobile searches (85.9B) are projected to exceed desktop searches
(84B) in 2015 (eMarketer)
• 17% of users make a purchase after a mobile search (Juniper
Research)
Local, Social, Mobile Search Needs
• Location-based
• Rapidly scale in response to social trends
• Server-side importance
• Social relevance based on friends
• Optimized for the small screen
– Better autocorrect, spelling suggestions/corrections
Search Challenges
• Complex, expertise required
• Costly, often with upfront expenditure
• Long time to market, slow innovation & experimentation
• Operational overhead is undifferentiated work
AWS Summit 2013 | Singapore - Delivering Search for Today's Local, Social, and Mobile Applications
Amazon CloudSearch
• Pay for infrastructure you need when you need it
• Low cost
• No need to guess capacity
• Experiment fast with low risk
• We do the undifferentiated heavy lifting
• Go global in minutes
CloudSearch in Context
Amazon CloudSearch Overview
[Diagram: requests pass through DNS / Load Balancing to three services in front of a Search Domain]
• SEARCH SERVICE (Search API, Console): Search Documents
• DOCUMENT SERVICE (Doc Svc API, Command Line Tools, Console): Add Documents, Update Documents, Delete Documents
• CONFIG SERVICE (AWS Query Config API, Command Line Tools, Console): Create Domains, Configure Domains, Delete Domains
Automatic Scaling
[Diagram: a grid of SEARCH INSTANCE boxes, each holding one Index Partition (1, 2, ... n) and one Copy (1, 2, ... n)]
• DATA: Document Quantity and Size
• TRAFFIC: Search Request Volume and Complexity
Automatic Scaling
[Diagram: a grid of SEARCH INSTANCE boxes, each holding one Index Partition (1, 2, ... n) and one Copy (1, 2, ... n)]
• Compute, Storage, Load Balancing, Security
• DATA: Document Quantity and Size
• Search Request Volume and Complexity
Search
[Diagram: App Server sends a Query (URL) to CloudSearch; CloudSearch returns Results (JSON/XML)]
Text Search
Highly Relevant Results
Faceted Drilldown
Integer Range Searching
Complex Queries
Text fields for matching user terms
• Result enabled to retrieve source data
Literal fields for Faceting
• Facet enabled to retrieve facets
• Search enabled for narrowing
Integer fields for ranking, narrowing
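The field roles on these slides can be summarised as a small lookup table. This is only an illustrative sketch: the field names are borrowed from the Kindle sample document later in the deck, and `FieldSpec`, its flags, and `fields_with` are hypothetical names, not part of the CloudSearch configuration API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical description of one index field (not a CloudSearch type)."""
    kind: str             # "text", "literal", or "integer"
    search: bool = False  # usable for matching or narrowing
    facet: bool = False   # facet counts can be returned
    result: bool = False  # stored value can be returned with hits


# Assumed example schema; the field names are illustrative only.
SCHEMA = {
    "title": FieldSpec("text", search=True, result=True),
    "description": FieldSpec("text", search=True, result=True),
    "category": FieldSpec("literal", search=True, facet=True),
    "price": FieldSpec("integer", search=True),
}


def fields_with(flag: str) -> list[str]:
    """Return the sorted names of fields that have the given flag enabled."""
    return sorted(name for name, spec in SCHEMA.items() if getattr(spec, flag))
```

Calling `fields_with("facet")` on this assumed schema lists the fields that could drive a faceted drilldown.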
BUILDING WITH AMAZON CLOUDSEARCH
Create An Amazon CloudSearch Domain
Configure the Domain
Data Preparation and Upload
[Diagram labels: SDF Batch, Amazon CloudSearch, Search, Documents]
CloudSearch SDF
[{"type":"add",
"id": "b007oznzg0",
"version": 1,
"lang": "en",
"fields": {
"title":"Kindle Paperwhite",
"description":"World's most advanced e-reader",
"category": ["Electronics","eBook Readers"],
"price":11900
} }, ...]
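A minimal sketch of assembling a batch in the shape shown above, using only the Python standard library. The helper names `add_op` and `to_sdf` are hypothetical; the field values are the ones from the slide.

```python
import json


def add_op(doc_id, version, fields, lang="en"):
    """Build one SDF 'add' operation in the shape used on the slide."""
    return {"type": "add", "id": doc_id, "version": version,
            "lang": lang, "fields": fields}


def to_sdf(ops):
    """Serialize a list of operations as a JSON SDF batch."""
    return json.dumps(ops, ensure_ascii=False)


batch = to_sdf([
    add_op("b007oznzg0", 1, {
        "title": "Kindle Paperwhite",
        "description": "World's most advanced e-reader",
        "category": ["Electronics", "eBook Readers"],
        "price": 11900,  # integer value exactly as on the slide
    }),
])
```

Building the batch from plain dictionaries keeps the serialization step in one place, so quoting and escaping are left to `json.dumps`.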
Document Service API
http(s)://< document service endpoint >/2011-02-01/documents/batch
Accept: application/json
Content-Length: 1176
Content-Type: application/json
Host: doc.imdb-movies-rr2f34ofg56xneuemujamut52i.us-east-1.cloudsearch.amazonaws.com
[{"type":"add","id":"b007oznzg0","version": 1,"lang": "en","fields":
{"title":"Kindle Paperwhite","description":"World's most advanced e-reader","category":["Electronics","eBook Readers"],"price":11900} },
{"type":"delete","id": "tt0434409", "version": 1337648735 } ]
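The request above could be prepared along these lines. This is a sketch under assumptions: `DOC_ENDPOINT` is a made-up placeholder, `build_batch_request` is a hypothetical helper, and the operation shapes (an `add` with fields, a `delete` with only id and version) follow the 2011-02-01 batch format as understood here. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute your domain's document service endpoint.
DOC_ENDPOINT = "doc-example-domain.us-east-1.cloudsearch.amazonaws.com"


def build_batch_request(ops, endpoint=DOC_ENDPOINT):
    """Build (but do not send) a POST request for the documents/batch resource."""
    body = json.dumps(ops).encode("utf-8")
    url = f"https://{endpoint}/2011-02-01/documents/batch"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


ops = [
    {"type": "add", "id": "b007oznzg0", "version": 1, "lang": "en",
     "fields": {"title": "Kindle Paperwhite", "price": 11900}},
    {"type": "delete", "id": "tt0434409", "version": 1337648735},
]
request = build_batch_request(ops)
# To send it against a real endpoint: urllib.request.urlopen(request)
```

`urllib` fills in `Content-Length` from the body when the request is sent, so it is not set by hand here.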
Search Service API
http(s)://< search service endpoint>/2011-02-01/search?
• Simple searches
– q= text
• Boolean combination of fields
– bq= (or field:'value1' (and field:'value2' field:'value3'))
• Faceting
– facet= comma separated list of facet fields
• Pagination
– start=, size=
• Customized ranking
– rank= sort results based on the rank expression provided e.g. -text_relevance * log10(unit_sales)
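The parameters listed above can be combined into a query URL roughly as follows. `SEARCH_ENDPOINT` is a made-up placeholder and the helper names (`bq_field`, `bq_and`, `bq_or`, `search_url`) are hypothetical. The sketch does not escape quotes inside values, and the default page size of 10 is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute your domain's search service endpoint.
SEARCH_ENDPOINT = "search-example-domain.us-east-1.cloudsearch.amazonaws.com"


def bq_field(field, value):
    """Match a value in one field, e.g. field:'value1' (no quote escaping)."""
    return f"{field}:'{value}'"


def bq_and(*clauses):
    """Combine boolean-query clauses with (and ...)."""
    return "(and " + " ".join(clauses) + ")"


def bq_or(*clauses):
    """Combine boolean-query clauses with (or ...)."""
    return "(or " + " ".join(clauses) + ")"


def search_url(q=None, bq=None, facet=(), start=0, size=10, rank=None,
               endpoint=SEARCH_ENDPOINT):
    """Build a search URL from the parameters named on the slide."""
    params = {}
    if q is not None:
        params["q"] = q
    if bq is not None:
        params["bq"] = bq
    if facet:
        params["facet"] = ",".join(facet)  # comma-separated facet fields
    params["start"] = start
    params["size"] = size
    if rank is not None:
        params["rank"] = rank
    return f"https://{endpoint}/2011-02-01/search?" + urlencode(params)


example_bq = bq_or(
    bq_field("field", "value1"),
    bq_and(bq_field("field", "value2"), bq_field("field", "value3")),
)
```

Here `example_bq` reproduces the boolean expression from the slide, and `search_url(bq=example_bq, facet=["category"])` would percent-encode it into the query string.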
Search Results
{"rank": "-text_relevance",
"match-expr": "(label 'kindle paperwhite')",
"hits": { "found": 204, "start": 0,
"hit": [ { "id": "sontsst12cf5f88b42" },
{ "id": "sopvopr12ab017f082" },
{ "id": "sorzrpw12ac468a13b" }
] },
...
}
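A response of this shape could be consumed as sketched below. `SAMPLE` repeats the slide's values with the elided parts left out, and `hit_ids` and `next_start` are hypothetical helpers, with `next_start` showing one way to drive the `start=` and `size=` pagination parameters.

```python
import json

# Sample response in the shape shown on the slide (elided parts omitted).
SAMPLE = """
{"rank": "-text_relevance",
 "match-expr": "(label 'kindle paperwhite')",
 "hits": {"found": 204, "start": 0,
          "hit": [{"id": "sontsst12cf5f88b42"},
                  {"id": "sopvopr12ab017f082"},
                  {"id": "sorzrpw12ac468a13b"}]}}
"""


def hit_ids(response_text):
    """Return (total found, list of hit ids) from a JSON search response."""
    hits = json.loads(response_text)["hits"]
    return hits["found"], [hit["id"] for hit in hits["hit"]]


def next_start(found, start, size):
    """Return the start offset of the next page, or None when exhausted."""
    following = start + size
    return following if following < found else None


found, ids = hit_ids(SAMPLE)
```

The hits carry only document ids here, so the application would typically look up display data in its own store unless fields are result-enabled.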
CLOUDSEARCH IN ACTION
The Week – Mobile App
• Mobile news app with commentary and analysis
of breaking news
• CloudSearch enabled keyword search, results
ranking, social sharing
• Impact
– Increased usage, user engagement
– More page views
– Higher user growth
Viddy
• Capture, create, and share 30-second social videos with friends
• Migrated entirely from Lucene to CloudSearch in 3 days
• Operation costs reduced from $5-6K/mo. to $1600/mo.
• 5x increase in search usage
• Freed up development resources for innovation
James Peter, 89n
ManageFlitter on CloudSearch
ManageFlitter Background
• ManageFlitter helps businesses and
personal brands manage their Twitter
accounts
• Work faster and smarter with Twitter
– Unfollowing & Following
– Analytics
– Engagement
– Search
• 1.7 million users over 3 years
• Customers from over 100 countries
ManageFlitter Background
• Billions of social connections
processed daily
• ManageFlitter makes the ‘Big Data’
analysis associated with social
accounts accessible to everyone
• We require tools that can keep up
with our fast growth and high
throughput
ManageFlitter Background
Every 24 hours
• Read 732,916,544 social graph connections
• Write 7,517,513 social graph updates
(940,622 follows + 6,576,891 unfollows)
ManageFlitter Background
Cached data
• 90 million unique Twitter accounts
? Search service
Search Infrastructure Requirements
• Low operational overhead (no 3am wakeup calls)
• Genuinely scales automatically
• Handle large amounts of frequently updated data
• Easy to prototype and low cost to try
ManageFlitter CloudSearch Deploy Timeline
Prototype
(1 day)
Test
(1 week)
Launch
Now
May 29, 2012
Search Architecture
[Diagram, repeated across four slides: Social Data, Data Provider, CloudSearch, Memcache, MySQL, Web Servers, Visitors]
• Labels: 1000
• Per second: 950 records added/updated, 400 records deleted
http://manageflitter.com/search
Key Benefits
• Operational overhead has been incredibly low
– Saving $100,000+ per year on headcount by not needing dedicated engineers
– Saving $131,856 per year on hardware by running in the cloud
• CloudSearch team has been very responsive
• From prototype to production with very little work
• HTTP API has been easy to work with & platform independent
• Can handle 90,000,000+ frequently updated documents
Future Plans
• Take advantage of transparent
scaling & drive more people to
our search product
• Bring location-based searches
to the forefront
Thank You
James Peter
CTO & Co-Founder 89n
Web: http://www.manageflitter.com
Twitter: @zemaj
Email: james@89n.com
Summary
• Powerful search is a critical component of today's local, social
and mobile applications
• Amazon CloudSearch makes adding search easy
• Just create a domain, POST documents, GET search results
Resources and Q&A
Get started for free: 30 days
Amazon CloudSearch Overview Page
http://aws.amazon.com/cloudsearch/
• FAQs
• Community Forum
• Documentation & Getting Started Tutorial (IMDb)
Technical Track
