#SMX #32A @dawnieando
SOME CURRENT CHALLENGES WITH VOICE & CONVERSATIONAL SEARCH
…And how you can overcome some of them

Who is Dawn Anderson?
• From rainy Manchester, UK
• A bit of a ‘pracademic’ (hybrid of academic and practitioner)
• International SEO consultant
• Move It Marketing
• I lecture on search and digital marketing strategy
• But I mostly ‘do’ SEO
• 11 years in SEO now
• Googlebot hunter ;P
• Consulting with brands, in-house teams and start-ups
• My Pomeranian Bert is often featured in tweets and posts ;P

Interest over time on Alexa and Google Home

Seasonal social media demonstrates mass engagement

Eyes-free device sales are sky-rocketing

Search Engines are Getting Better At Voice Recognition & Question Answering

2017 was the year of “questions”

Google rater guidelines for voice search published

What does a good result look like?
SPOILER
• Meets informational needs
• In short answers (as applicable)
• Or the answer is at the beginning of the paragraph or result
• Grammatically correct (syntactically well-formed)
• No spelling mistakes
• With accurate pronunciation

What does a bad result look like?

Bad Result - Confusion between ‘actions’ & ‘queries’
• [Skip]
• [play mumford and sons reminder] - Action Response: Set a Reminder. Time: Please specify a time. Fails to Meet: The user wanted to play a specific song, and the device instead set a reminder. No users would be satisfied with this response.

Who knows how many times Google Home cannot help?
• Only Google knows
• But they aren’t sharing
• Search engine embarrassment?

RECOGNITION IS NOT NATURAL LANGUAGE UNDERSTANDING

Information Retrieval Lectures
ESSIR2017: European Summer School on Information Retrieval

Enrique Alfonseca – Google Research Team

Better ranking needed because the user tends to focus on a single answer

Better Ranking Is Needed As The User Focuses On A Single Result
• One shot at the answer
• Berrypicking ‘evolving search’ may not apply so easily
• Does not benefit from query refinement and user feedback as desktop SERPs do
  – May be why there are still many unanswered queries

Accurate spelling and grammar matter a lot in voice search
Query diversity ‘clusters’ in keyboard ‘evolving’ user search

Query refinement (via user feedback) is not possible with voice search

#SMXInsights
• No query expansion or relaxation
  – Precision more important than recall
  – Because there can be only one (or two) results

Precision > Recall in voice search
Accuracy > Diversity
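
To make the precision-versus-recall point concrete, here is a minimal Python sketch. The document IDs, the relevance judgments and the `precision_recall` helper are all hypothetical, invented for illustration. It compares a ten-result desktop SERP with a single spoken answer.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a list of returned document IDs."""
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical relevance judgments for one query.
relevant = {"d1", "d2", "d3", "d4"}

# A desktop SERP can afford some misses: the user scans and refines.
serp = ["d1", "d7", "d2", "d8", "d3", "d9", "d5", "d6", "d4", "d10"]
# A voice assistant reads out exactly one answer.
voice_good = ["d1"]
voice_bad = ["d7"]

print(precision_recall(serp, relevant))        # (0.4, 1.0)
print(precision_recall(voice_good, relevant))  # (1.0, 0.25)
print(precision_recall(voice_bad, relevant))   # (0.0, 0.0)
```

With one spoken result, recall is capped at a low value however good the system is, so the only thing left to optimise is whether that one result is right.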

A rambling answer that only gets to the point at the end is the worst possible result

“There is no re-ordering in voice search – no paraphrasing – just extraction and compression.”
(Alfonseca, 2017, ESSIR2017)

Example of classic IR teaching query interpretation system

#SMXInsights
• No paraphrasing with conversational search
  – Paraphrasing likely needs full understanding of query & intent to reformulate

Knowledge base first, web text second
• The knowledge base is checked first
• Then the web is checked to ‘fill in gaps’
• Taking from the messy unstructured data of web pages

The different types of data & the problem with unstructured data
• Structured data (tables and data stored in databases)
• Semi-structured data (XML, JSON, meta headings [h1-h6])
• Semantically-enriched data (marked up schema, entities)
• Unstructured data (normal web text copy)
• The web is messy and noisy
• Unstructured data is difficult to make sense of (no topical strength)

Structured data has never been more important for disambiguation

Structured Data is very, very useful here
• Adds meaning
• Disambiguates
• Adds structure
• Helps with context
• The web is noisy
• Unstructured data is voluminous
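
As an illustration of how markup can disambiguate, the sketch below builds a small schema.org JSON-LD snippet that pins the ambiguous name "House" to one entity. The `entity_jsonld` helper is hypothetical, and the choice of `TVSeries` with a `sameAs` link is an assumption made for this example. The slides do not say which markup any particular assistant reads.

```python
import json

def entity_jsonld(schema_type, name, same_as):
    """Build minimal schema.org JSON-LD that ties a name to one entity."""
    return {
        "@context": "https://schema.org",
        "@type": schema_type,
        "name": name,
        "sameAs": same_as,  # link to an unambiguous reference page
    }

# Illustrative values only: the TV drama, not a building.
markup = entity_jsonld(
    "TVSeries", "House", "https://en.wikipedia.org/wiki/House_(TV_series)"
)
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(snippet)
```

The same name with `"@type": "House"` (schema.org's type for a dwelling) would describe something else entirely, which is the disambiguation the slide refers to.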

#SMXInsights
• Simply adding topical H1 – H6 headings turns unstructured web data into semi-structured data
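
A rough sketch of why headings help: once a page has h1 to h6 elements, a program can recover an outline from it without understanding the prose. The example uses Python's standard `html.parser`. The `HeadingOutline` class and the sample page are hypothetical, and this is not a description of how any search engine actually parses pages.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element."""
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None

page = """
<h1>Voice search</h1><p>Some copy.</p>
<h2>What is voice search?</h2><p>More copy.</p>
<h2>How do I optimise for it?</h2><p>Even more copy.</p>
"""
parser = HeadingOutline()
parser.feed(page)
for level, text in parser.outline:
    print("  " * (level - 1) + text)
```

Without the headings, the same page is one undifferentiated block of text, with no machine-readable hint about which paragraph answers which question.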

#SMXInsights
• Tables are problematic for voice search
  – Support tabular data with well-formed paragraphs and sentences
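
One way to support a table with well-formed sentences is to generate a sentence per row. The sketch below assumes a simple table held as a list of dictionaries. The `rows_to_sentences` helper, the sentence template and the pricing data are all hypothetical.

```python
def rows_to_sentences(subject_column, rows):
    """Turn each table row (a dict) into one plain-English sentence."""
    sentences = []
    for row in rows:
        subject = row[subject_column]
        facts = [
            f"a {column.lower()} of {value}"
            for column, value in row.items()
            if column != subject_column
        ]
        if not facts:
            continue  # nothing to say about a row with only a subject
        if len(facts) > 1:
            joined = ", ".join(facts[:-1]) + " and " + facts[-1]
        else:
            joined = facts[0]
        sentences.append(f"The {subject} {subject_column.lower()} has {joined}.")
    return sentences

# Hypothetical pricing table.
table = [
    {"Plan": "Basic", "Price": "$5 per month", "Storage limit": "10 GB"},
    {"Plan": "Pro", "Price": "$12 per month", "Storage limit": "100 GB"},
]
for sentence in rows_to_sentences("Plan", table):
    print(sentence)
```

Each generated sentence stands alone, so it can be extracted and read aloud without the column headers that give a table cell its meaning.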

Tables are currently problematic
• What may be good for featured snippets (tabular data) may not be good for voice search
• You may need an additional strategy for voice search & tabular data in featured snippets
• Pete Meyers from Moz found that only 30% of voice search results on Google Home came from tables in featured snippets (Image credit: Pete Meyers, Moz)

Multi-turn conversations are still challenging
CONFIRMED BY:
• Google’s Enrique Alfonseca (2017)
• Microsoft’s Harry Shum (2018)
• Conversational contextual search is difficult

Anaphoric Resolution
• An “anaphoric” reference points upward, to previously mentioned words
• Resolution means working out which previously mentioned word or phrase the reference points to

Cataphoric Resolution
• A “cataphoric” reference points downward, to subsequent words
• Resolution means working out which subsequent word or phrase the reference points to

Pronouns still seem problematic
Likely relates to anaphoric (likely) & cataphoric (far less likely) resolution
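
To show what anaphoric resolution has to achieve in a multi-turn conversation, here is a deliberately naive Python sketch. It assumes the entity from the previous turn is already known and simply substitutes it for any pronoun in the follow-up. The `resolve_follow_up` helper and the pronoun list are hypothetical. Real systems rely on trained coreference models, and the heuristic fails on many ordinary sentences (possessive "her", several candidate entities, cataphora).

```python
import re

PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def resolve_follow_up(previous_entity, follow_up):
    """Replace pronouns in a follow-up query with the previous turn's entity."""
    def swap(match):
        word = match.group(0)
        return previous_entity if word.lower() in PRONOUNS else word
    return re.sub(r"[A-Za-z']+", swap, follow_up)

# Turn 1: "Who is Ludwig van Beethoven?"   Turn 2: "When was he born?"
print(resolve_follow_up("Ludwig van Beethoven", "When was he born?"))
# When was Ludwig van Beethoven born?
```

The hard part, which this sketch skips entirely, is deciding which earlier entity a pronoun refers to and whether the new turn relates to the previous one at all.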

Our ‘Previous’ Work

Pygmalion are carrying out Part of Speech (POS) & Named Entity Tagging (NE tags) manually
AKA – Word category disambiguation
• Function words – POS (Syntax)
• Content words – POS (relevant)
• Verbs – POS
• Nouns – POS
• Pronouns – POS
• Plural-pronouns – POS

WORD DISAMBIGUATION

Ambiguous queries need context – ‘House’

Linguistics are complex
Homophora Endophora Exophora
Hyponyms Hypernyms Homonyms

COREFERENCE RESOLUTION IS A CHALLENGING PROBLEM FOR DISAMBIGUATION

THE IMPORTANCE OF CO-OCCURRENCE

“You shall know a word by the company it keeps”
(Firth)
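
Firth's idea can be sketched in a few lines of Python: represent each word by counts of the words that appear near it, then compare those count vectors. The tiny corpus, the window size and both helper functions are hypothetical. Production systems use far larger corpora and learned embeddings rather than raw counts.

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Tiny hypothetical corpus.
corpus = [
    "i paid with cash at the shop",
    "i paid with money at the shop",
    "a tiger hunted deer overnight",
]
vectors = cooccurrence_vectors(corpus)
print(cosine(vectors["cash"], vectors["money"]))  # same neighbours, high score
print(cosine(vectors["cash"], vectors["tiger"]))  # no shared neighbours
```

"cash" and "money" never appear in the same sentence here, yet they score as similar because they keep the same company.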

Other ‘Previous’ Work – Similarity & Relatedness

WordSimilarity353 Test Collection (word pair followed by mean human rating, 0 to 10)
money cash 9.08
money currency 9.04
football soccer 9.03
magician wizard 9.02
gem jewel 8.96
car automobile 8.94
boy lad 8.83
furnace stove 8.79
Maradona football 8.62
king queen 8.58
money bank 8.5
Jerusalem Israel 8.46
vodka gin 8.46
planet star 8.45
money dollar 8.42
vodka brandy 8.13
bank money 8.12
physics proton 8.12
planet galaxy 8.11
stock market 8.08
psychology psychiatry 8.08
planet moon 8.08
planet constellation 8.06
planet sun 8.02
tiger feline 8
planet astronomer 7.94
movie theater 7.92
planet space 7.92
baby mother 7.85
wood forest 7.73
money deposit 7.73
psychology mind 7.69
Jerusalem Palestinian 7.65
Arafat terror 7.65
computer keyboard 7.62
computer internet 7.58
money property 7.57
tennis racket 7.56
psychology cognition 7.48
book paper 7.46
book library 7.46
media radio 7.42
psychology depression 7.42
jaguar cat 7.42
movie star 7.38
bird crane 7.38
tiger cat 7.35
physics chemistry 7.35
money possession 7.29
jaguar car 7.27
cup drink 7.25
psychology health 7.23
bird cock 7.1
company stock 7.08
tiger carnivore 7.08

#SMXInsights
• Secondary or 3-way strategy may be needed
  – Add a TL;DR
  – Or an executive summary
  – Or Q & A based table of contents
  – Or a ‘Short Answer’ then ‘Longer Answer’

#SMXInsights
• Mine forums, customer service, chat & emails
  – Build word clouds to find the topics which matter to your audience, then provide answers to them
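
The counting step behind a word cloud is simple enough to sketch in Python. The messages, the stopword list and the `top_terms` helper below are hypothetical. In practice you would export real tickets, forum threads or chat logs, and use a fuller stopword list.

```python
from collections import Counter
import re

STOPWORDS = {
    "a", "an", "and", "can", "do", "does", "for", "how", "i", "in", "is",
    "it", "my", "of", "on", "the", "to", "what", "with", "you", "your",
}

def top_terms(messages, n=5):
    """Count non-stopword terms across messages; the counts can feed a word cloud."""
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[a-z']+", message.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)

# Hypothetical support-inbox questions.
messages = [
    "How do I reset my password?",
    "Can I change the delivery address?",
    "Password reset email not arriving",
    "What is your delivery time?",
]
print(top_terms(messages, 3))
```

The most frequent terms point to the questions worth answering first, ideally with a short answer near the top of the page.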

Accurate spelling and grammar matter a lot in voice search
Soundex, Metaphone or similar ‘misspelling’ algorithms may not apply to voice search
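
For readers who have not met Soundex, the sketch below is a compact Python version of the standard American Soundex rules. It maps similar-sounding spellings to the same four-character code, which is how typed misspellings can be matched. One plausible reading of the slide, offered here as an assumption, is that a speech recogniser already outputs correctly spelled words, so this kind of repair has less to work on.

```python
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        code = CODES.get(letter, "")
        if code and code != previous:
            result += code
        if letter not in "hw":  # h and w do not separate equal codes
            previous = code
    return (result + "000")[:4]

print(soundex("Smith"), soundex("Smyth"))    # S530 S530
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```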

THANK YOU!

Sources & References
• WordSimilarity353 Test Collection - http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/
• Miller, G.A. and Charles, W.G., 1991. Contextual correlates of semantic similarity. Language and cognitive processes, 6(1), pp.1-28.
• Shum, H., 2018. From Search to Research. LinkedIn. [ONLINE] Available at: https://www.linkedin.com/pulse/from-search-research-harry-shum/. [Accessed 22 February 2018].
• Coreference Resolution - The Stanford Natural Language Processing Group. 2018. The Stanford Natural Language Processing Group. [ONLINE] Available at: https://nlp.stanford.edu/projects/coref.shtml. [Accessed 19 February 2018].

APPENDIX

Terms can have many ‘surface forms’
EXAMPLES
• Look at Wikipedia Redirects
• Alternative names redirect to the most appropriate article title (for example, Edison Arantes do Nascimento redirects to Pelé) (Wikipedia)
• SPARQL and DBpedia identify many variations
• (Beethoven example)
• https://dbpedia.org/sparql
• https://en.wikipedia.org/wiki/Wikipedia:Redirect
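
The Beethoven example is not reproduced in the slide text, so the Python sketch below is a guess at what such a lookup might look like, not the author's query. It builds a SPARQL query asking DBpedia for the English labels of pages that redirect to a resource, using the `dbo:wikiPageRedirects` property, plus a request URL for the endpoint listed above. The helper names and the `format` parameter value are assumptions. The code only constructs the request and does not contact the endpoint.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # endpoint listed on the slide

def surface_forms_query(resource):
    """SPARQL asking for English labels of pages that redirect to `resource`."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?label WHERE {{
  ?alias dbo:wikiPageRedirects <http://dbpedia.org/resource/{resource}> ;
         rdfs:label ?label .
  FILTER (lang(?label) = "en")
}}
LIMIT 50
""".strip()

def request_url(resource):
    """Build a GET URL for the query; fetching it is left to the reader."""
    params = {
        "query": surface_forms_query(resource),
        "format": "application/sparql-results+json",  # assumed output format
    }
    return ENDPOINT + "?" + urlencode(params)

print(surface_forms_query("Ludwig_van_Beethoven"))
```

Each returned label is a candidate surface form: a different string that people use for the same entity.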

“It is concluded…the more often two words can be substituted into the same contexts the more similar in meaning they are judged to be.”
(Miller & Charles, 1991)

Difficult to deal with ‘query ambiguity’
Result ‘diversity’ assists with query ambiguity in desktop or non-voice results

Page Length ‘Normalization’ may not apply as with traditional results??
(Me musing)

Long numbers should be rounded
• 60,999,888.999999999
  – It reads terribly
  – Needs to be rounded
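
A minimal Python sketch of the rounding idea, using the number from the slide. The `speakable` helper, its "about" prefix and the one-decimal rule are hypothetical choices, and edge cases just below a unit boundary (for example 999,999) are not handled gracefully.

```python
def speakable(number):
    """Round a large number to one decimal place of its largest unit."""
    units = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]
    for size, name in units:
        if abs(number) >= size:
            scaled = round(number / size, 1)
            # Drop a trailing ".0" so 61.0 is read as "61".
            text = str(int(scaled)) if scaled == int(scaled) else str(scaled)
            return f"about {text} {name}"
    return str(round(number))

print(speakable(60_999_888.999999999))  # about 61 million
```

Publishing the rounded figure in the copy, alongside the exact one where precision matters, gives an extractive system something readable to lift.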

Conversational Context & Microsoft
• First checks whether the next ‘turn’ of question relates to the previous question
• Using LSTMs (Long Short-Term Memory)
• Bi-directional context embedding
• Query and its context are both used as input

Katja Filippova – Google Research Team

Query expansion and query relaxation

Example of cataphoric and anaphoric resolution
https://www.ntid.rit.edu/sea/processes/referencewords/practice/phoric

Voice Search Challenges For Search and Information Retrieval and SEO
