
Information Retrieval and Artificial Intelligence

Mohand Boughanem, Imen Akermi, Gabriella Pasi and Karam Abdulahhad

Abstract Information Retrieval (IR) is a process involving activities related to human cognition and to knowledge management; as such, the definition of Information Retrieval Systems can benefit from the application of artificial intelligence techniques to account for the intrinsic uncertainty and imprecision that characterize the subjectivity of this task. This chapter presents a synthetic analysis of the IR task from an AI perspective and explores how AI techniques are employed within IR.

1 Introduction

Information retrieval (IR) systems (aka search engines) are widely employed in a variety of applications, among which Web search engines are the best known example. These tools are used by millions of users and have become an essential part of our daily lives.
Examples of applications that benefit from Information Retrieval Systems (IRSs) are digital libraries, medical applications, and desktop search. Regardless of the application domain, the IR task is conditioned by several important factors. First, an IRS constitutes a so-called pull technology, as it implies that a user proactively specifies a keyword-based query to express a specific information need. However, user queries hardly capture the complexity of a user need, and the few specified keywords

M. Boughanem (B) · I. Akermi
IRIT, Université Toulouse III, Toulouse, France
e-mail: mohand.boughanem@irit.fr
I. Akermi
e-mail: imen.akermi@irit.fr
G. Pasi
Università degli Studi di Milano-Bicocca, Milan, Italy
e-mail: pasi@disco.unimib.it
K. Abdulahhad
GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
e-mail: karam.abdulahhad@gesis.org

© Springer Nature Switzerland AG 2020
P. Marquis et al. (eds.), A Guided Tour of Artificial Intelligence Research,
https://doi.org/10.1007/978-3-030-06170-8_5
are often imprecise, incomplete, ambiguous, and inadequate to express the real
user intent. A user’s query imposes a set of constraints on texts (we only consider
textual documents in this chapter) that are written in natural language, which is
known to be ambiguous. This is another important barrier that the IR system must
deal with. Finally, the IR system has to decide which documents to present to the
user, i.e. it has to estimate their relevance, based on an analysis of the textual content
of these documents.
The above observations show some of the aspects related to the difficulty under-
lying the IR task. Artificial Intelligence (AI) techniques may help to better tackle the
IR task by addressing the incompleteness, vagueness and subjectivity intrinsic to
the IR process. Possible applications of AI to IR have been extensively discussed
in the literature (Jones 1983, 1991; Croft 1987; Mandl 2009), by raising two main
questions: how the IR task can be addressed from an AI point of view, and how IR
is considered within AI.
The purpose of this synthetic survey is to highlight different aspects related to the
first question. We give in Sect. 3 an overview related to AI and IR. Then, we discuss
in the subsequent sections the different AI approaches that have been employed in
IR.

2 Information Retrieval: Background

An IR system aims at selecting, from a huge document collection, the documents that are deemed relevant to a particular user need expressed by means of a query (Salton and McGill 1986). This definition points out three key concepts: documents, information need/query and relevance, which can be defined as follows:
– A document (or item) constitutes a unit of retrievable information that can be selected to satisfy a user need expressed by a query. A document can be a text, an image, a video, an audio file, or a multimedia item. In this chapter we only consider textual documents.
– A query is a formal representation of the user’s information need. It is usually composed of a set of keywords, possibly connected with Boolean operators.
– Relevance is the core notion in IR. It can be defined as a relationship between a user need and a document. This notion is very subjective and difficult to model. Several IR models have been proposed, based on a variety of theoretical frameworks, namely probability theory, linear algebra, set theory, fuzzy logic, etc. Most of these models address topical relevance, usually relying on query-document matching.
A typical IR system is composed of three main components, namely, document
representation, user need representation and query-document matching. The formal
representation of a document is produced by the indexing process, which usually
consists in extracting a set of features, basically keywords, that characterize the
content of the document. The representation of a user need is defined through the
query formulation phase.
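As an illustration, the indexing and matching components just described can be sketched in a few lines of Python. This is a deliberately minimal TF-IDF/cosine pipeline (whitespace tokenization, no stemming or stop-word removal are simplifying assumptions), not the weighting scheme of any particular system:

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    """Indexing: turn each text into a bag-of-words vector weighted by TF-IDF."""
    toks = [t.lower().split() for t in texts]
    n = len(toks)
    df = Counter(w for ws in toks for w in set(ws))  # document frequencies
    vecs = [{w: c * math.log(1 + n / df[w]) for w, c in Counter(ws).items()}
            for ws in toks]
    return vecs, df, n

def cosine(u, v):
    """Matching: cosine similarity between two sparse term vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def search(query, texts):
    """Rank document indices by decreasing query-document cosine similarity."""
    vecs, df, n = tfidf_vectors(texts)
    q = {w: c * math.log(1 + n / df.get(w, n))
         for w, c in Counter(query.lower().split()).items()}
    return [i for _, i in sorted(((cosine(q, d), i) for i, d in enumerate(vecs)),
                                 reverse=True)]
```

A query sharing rare terms with a document thus ranks that document higher, which is exactly the topical-relevance estimate discussed above.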

IR models are aimed at formalizing the query-document matching, and they allow one to estimate the so-called topical relevance. Query expansion can also be part of the IR process, with the aim of modifying the original query by adding terms extracted from the retrieved documents or drawn from external resources such as dictionaries, ontologies, etc.

3 Artificial Intelligence for Information Retrieval

Intelligent Information Retrieval was introduced in the 1980s, when AI techniques were perceived as a promising means to define effective IRSs. As a consequence, there has been a shift from classical IRSs, i.e. those based on the Boolean model, to ranking-based systems and probabilistic approaches. Various IR approaches were thus developed involving AI techniques that helped to better express the document content, to better learn the users' needs and to more effectively formalize the concept of relevance. Consequently, several formal definitions were proposed to address the concept of Intelligent Information Retrieval (IIR).
Generally, IIR systems can be described as systems using AI techniques to replicate intelligence throughout the IR process. Sparck Jones (1983) defined an IIR system as “a system with a knowledge base and inferential capabilities that can be used to establish connections between a request and a set of documents”. In Van Rijsbergen (1986), IIR is considered as an inference process described as “given a document representation D and a request R, IR is the process of establishing a probability for “D → R”.
A two-fold definition was presented in Chen and Dhar (1989), Belkin and
Marchetti (1989), stating that intelligence may intervene when “users do not know
what information they need before accessing the system so they have to be helped
in forming the query to the information retrieval system” (Chen and Dhar 1989),
and that “users become aware of their information need only through this process of
interacting with the system” (Belkin and Marchetti 1989).
Machine- and human-oriented perspectives have also been put forward to define IIR systems. Belkin et al. (1987) consider that “an intelligent IR system was one in which the functions of the human intermediary were performed by a program, interacting with the human user”. In Maes (1994), the author states that “intelligent IR is performed by a computer program (a so-called intelligent agent), which, acting on (perhaps minimal or even no explicit) instructions from a human user, retrieves and presents information to the user without any other interaction”.
Following these definitions, Cole (1998) concludes that the main goal of using AI techniques is to support IR systems in the process of assisting users to discover documents relevant to their information need by interacting with the system.
Consequently, a recurrent question arises: how could AI techniques benefit IR systems? One answer that provides some insight into this question was given by Karen Sparck Jones (1991), who stated that “IR is seen as a search for unknown, and under-specified, information in a world of information as conveyed by
natural language texts, it is easy to conclude that what AI discovers about the rep-
resentation of knowledge, reasoning under uncertainty, and learning, will be clearly
applicable to document retrieval”. In Ding (2001), Sparck Jones's claims were paraphrased according to three different aspects:

– “Knowledge representation: IR's representation of entities and relations is very weak. Concept names are not normalized, and descriptions are mere sets of independent terms without structure… Concepts and topics, term and description meanings are left implicit… The relation between terms is only association based on co-presence… While, the representation in AI is strong. There already exist various full-fledged methods and techniques to model the knowledge. Ontology can be considered as the generic term for generalizing these representation ideas.
– Reasoning: Reasoning in IR is also weak, looking at what is in common between descriptions and preferring one item over another because more is shared (whether as different words or, via weighting, occurrences of the same word)… The probabilistic network approach, that allows for more varied forms of search statement and matching condition, does not alter the basic style of reasoning. While development in knowledge representation of AI, especially ontology, provides the backbone for reasoning and also guarantees the reasoning.
– Learning: Loosely speaking, the relevance feedback of IR can be considered as a form of learning. This again is very weak in IR. In this part, machine learning will link IR and AI together to improve both sides.”

To sum up, these claims assert that several AI areas can help in handling IR tasks, particularly:

– Natural Language Processing techniques (see chapter “Artificial Intelligence and Natural Language” of this volume) and Knowledge Representation provide tools that allow one to better represent the document content;
– Reasoning under uncertainty, e.g. by modal logic, probabilistic reasoning, or fuzzy logic, can help both in the phase of query formulation and in relevance assessment;
– Machine Learning techniques may intervene at different levels of the IR process. Indeed, recent advances in neural networks have offered new perspectives to IR. Such approaches have been applied to handle different IR tasks such as learning document or query representations and learning the ranking model;
– Other related topics such as metaheuristics (evolutionary computation), game theory, and multi-agent systems have been applied in IR. They generally regard IR as an optimization problem where individuals, agents or players cooperate to realize a given task, namely building the “best” document or query representation, retrieving the most relevant set of documents, or building an effective relevance function.

The next sections will discuss how AI topics, particularly those listed above, can
help IR with respect to the three main components of the general IR process, namely
document representation, information need representation and relevance modelling.

4 Document Representation

The majority of document representation models are based on single words, com-
monly referred to as a bag of words representation. Document content (resp. query
content) is represented as a set of independent weighted words. This representation
has several limitations due to the lexical variety of words (synonym words) and the
semantic variation of words (polysemous words). This leads to a known issue called
“term-mismatch” or “word-mismatch”. Therefore, setting up a more sophisticated representation that goes beyond a simple bag of words has been considered necessary for decades. This was already obvious to the pioneers of IR (Cleverdon and Keen 1966; Sparck Jones 1972; Salton 1991; Luhn 1957), who proposed to represent texts by syntactic or semantic units much more appropriate to capture the meaning of the document’s components. Therefore, AI techniques, especially
those related to Natural Language Processing (NLP) and Knowledge Management
(KM), can be seen as natural tools that will help to better identify and extract the
meanings (word senses or concepts) conveyed in the document.
Several simple NLP techniques have been explored in IR including term extrac-
tion (tokenization), word stemming, compound phrase identification, part of speech
tagging (POS), chunking, word sense disambiguation and named entity recogni-
tion. All of these techniques, extensively discussed in chapter “Artificial Intelligence and Natural Language” of this volume, are useful to different extents in IR (Manning et al. 2008) and help to better extract different forms of term units,
including single words, phrases, word senses, topics, etc. (Li and Xu 2014). Without
being exhaustive, stemming algorithms (Porter 1980; Krovetz 1993) are clearly the
most used “NLP” technique in IR. They have relatively low-cost processing and often
bring slight improvements in document retrieval (Harman 1991; Hollink et al. 2004).
Part-of-speech tagging (e.g., verb, noun) has also been applied in IR for different purposes: POS-based term weighting (Lioma and Blanco 2009) and disambiguation
(Krovetz 1997). However, moderate improvements have been reported (Kraaij and
Pohlmann 1996; Chowdhury and McCabe 1998; Lioma and Blanco 2009).
We will focus, in this section, on the two classes of approaches that have been
widely investigated to cope with the term-mismatch issue, namely, compound term
(phrase) indexing and concept based representation. Phrase indexing consists in
indexing multiword units instead of single words. Concept (semantic) indexing attempts to represent terms according to their meaning, which might be taken from semantic resources such as thesauri, ontologies, knowledge bases, etc., or derived from a text corpus, as in word embedding approaches (Deerwester et al. 1990; Mikolov et al. 2013).

4.1 Phrase-Based Indexing

Phrase indexing consists in representing index units by multiword units. These units
can be addressed according to two classes of approaches: linguistic and statistical
approaches. Linguistic-based approaches employ pure NLP techniques, including
lexical, syntactic and semantic analysis and discourse processing, in order to extract
meaningful phrases (i.e., phrases with certain syntactic relations; Tong et al. 1997).
Several approaches, based on different linguistic clues, have been proposed and
developed in IR. Such approaches include linguistic phrases (Fagan 1987a; Evans
and Zhai 1996), lexical atoms (Sheridan and Smeaton 1992; Tong et al. 1997),
head-modifier pairs (Strzalkowski 1995; Zhai 1997). Most of the results that have
been reported showed no clear significant improvements of the retrieval performance
(Fagan 1987a; Lewis 1992).
Statistical approaches are the most widespread; they mainly rely on word collocation to determine the weight of word relationships. The current IR approaches based on such representations investigate different types of collocations based on pure
statistical clues such as term proximity (Tao and Zhai 2007; Zhao and Yun 2009) and
adjacent terms (inseparability) (Metzler and Croft 2005; Shi and Nie 2009). Other
techniques combining linguistic and statistic approaches (Fagan 1987b; Hammache
et al. 2014) have also been proposed. However, the impact of phrase-based indexing
in terms of performances is quite limited. A combination with single words is often
required (Hammache et al. 2014; Shi and Nie 2009). Furthermore, it has been shown
that positional approaches that capture term dependency without explicitly extracting
phrases are much more effective (Lv and Zhai 2009).
The conclusions that can be drawn from the reported results of phrase-based
indexing are that pure NLP techniques have a limited impact on search, as statistical approaches are capable of effectively handling term proximity without sophisticated linguistic analysis.
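To make the statistical side concrete, here is a minimal Python sketch of two of the clues mentioned above: adjacent-term bigrams as candidate compound terms, and term proximity as the smallest distance between two terms' occurrences. Whitespace tokenization is a simplifying assumption; real systems normalize the text first.

```python
from collections import Counter

def adjacent_bigrams(text):
    """Candidate compound terms: counts of adjacent word pairs."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def min_pair_distance(text, t1, t2):
    """Term proximity: smallest distance (in word positions) between
    occurrences of two terms, or None if either term is absent."""
    words = text.lower().split()
    p1 = [i for i, w in enumerate(words) if w == t1]
    p2 = [i for i, w in enumerate(words) if w == t2]
    return min(abs(i - j) for i in p1 for j in p2) if p1 and p2 else None
```

Frequent bigrams approximate inseparable compounds, while small pair distances can be turned into a proximity bonus added to a single-word retrieval score.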

4.2 Semantic-Based Representation

Semantic-based indexing consists in representing documents and queries according to the meanings conveyed by their terms. These meanings are obtained through external resources such as ontologies (WordNet,1 YAGO, …), controlled vocabularies (MeSH,2 …) or knowledge bases (Wikipedia, Freebase, …) (see chapter “Semantic Web” of this volume and chapter “Knowledge Engineering” of volume 1 for more details). Representing documents by means of the meaning of words in IR is also referred to as concept-based indexing. The two notions of semantic and concept-based indexing are often mixed up; although both are based on external resources,
semantic indexing uses linguistic resources, also called “light” ontologies, while concept indexing is based on formal ontological taxonomies. But the two approaches share the same purpose and intend to represent documents and queries as a set of individual entries taken from such resources. For instance, in the case of WordNet, semantic indexing consists in representing a document as a set of synsets (synonym sets). A more sophisticated representation based on sub-trees extracted from WordNet has also been proposed in Baziz et al. (2005). Such representations enable word sense disambiguation (Sanderson 2000), where words are represented by their meaning, and allow retrieving documents with words that are semantically related to those of the query. The literature abounds on this topic (Krovetz and Croft 1992; Voorhees 1993, 1994; Sanderson 1994, 2000; Gonzalo et al. 1998; Moon et al. 2004; Stokoe et al. 2003; Liu et al. 2004, 2005; Fang 2008; Cao et al. 2005), and relevant surveys can be found in Sanderson (2000), Li and Xu (2014).

1 https://wordnet.princeton.edu/.
2 http://mesh.inserm.fr/FrenchMesh/.
The results reported for such representations differ. Indeed, Sanderson (1994)
and Voorhees (1994) showed that there is no significant improvement in the search
performance. The work presented in Schütze and Pedersen (1995) is one of the first
works showing improvements on a large collection. Other improvements have been
reported in Gonzalo et al. (2014), Mihalcea and Mihalcea (2000), Baziz et al. (2005),
Dinh et al. (2013), Zakos (2005), where it was noted that these representations are
particularly effective in a domain-specific search environment (Li and Xu 2014) such
as the medical domain (Wang and Akella 2015).
What can be noticed from most of the semantic-based approaches listed above is that the presence of AI techniques is limited. In these works, ontologies are addressed from a linguistic perspective; neither reasoning nor inference processes are employed. The notion of inference with ontologies is rather developed in the
context of the Semantic Web (see chapter “Semantic Web” of this volume), and for
this reason this topic is not covered in this chapter.
Extensions based on fuzzy ontologies, where relationships between concepts are
weighted, have been proposed (Miyamoto 1990). These weights indicate the rela-
tive strength of these relationships. Possibilistic ontologies have also been explored
in Baziz et al. (2007), Boughanem et al. (2007). The links between concepts are
estimated by two degrees, possibility and necessity (see chapter “Representations of
Uncertainty in Artificial Intelligence: Probability and Possibility” of Volume 1). Two
types of relations have been considered in the above works, synonymy and hyper-
nymy. The necessity degree estimates to what extent it is certain that one concept is
a specialization of the other. Possibility indicates to what extent two concepts can
describe the same thing. Experiments have been conducted on small collections, and
moderate improvements have been reported.
To sum up, although ontologies are used in most of the approaches listed above, these approaches do not employ AI techniques for reasoning and inferring new knowledge. Works relying on AI for document representation and reasoning mainly provide formal document representations derived from logic. To this purpose, differ-
ent frameworks have been used such as case frame-based representations (Mauldin
1991), rule-based systems (Vickery and Brooks 1987), logic-based representation
(e.g. propositional, predicates, modal) (Fuhr 1995; Van Rijsbergen 1986; Nie 1988;
Meghini et al. 1993), and conceptual graphs (Chevallet and Chiaramella 1998). These models will be discussed in depth in the retrieval models section of this chapter.

4.3 Word Embedding Representation

The deep learning (neural network) wave has also reached the NLP domain; one of its success stories for document representation is word embedding, such as word2vec (Mikolov et al. 2013; Pennington et al. 2014). Word embedding approaches consist in representing terms, or more generally phrases, sentences or paragraphs, according to the contexts where they appear. They all share the idea popularized by Firth (1957): “You shall know a word by the company it keeps”. This leads to representing each term as a vector of attributes (real numbers) that captures syntactic and semantic word relationships.
Two classes of approaches have been used to build such representations (Onal et al. 2017): context-counting approaches based on algebraic methods such as singular value decomposition (Deerwester et al. 1990), and context-predicting approaches based on neural methods (Mikolov et al. 2013). The latter attempt to learn word embeddings from raw text. One of the first neural approaches dates back to the early 2000s (Bengio et al. 2003). In 2013, Mikolov et al. (2013) introduced word2vec in two variants: the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. These models are quite similar, except that CBOW predicts target words from source context words, while the skip-gram does the inverse and predicts source context words from the target words. Pennington et al. (2014) later released GloVe. Most of these approaches are based on feed-forward neural networks (Bebis and Georgiopoulos 1994). Other neural models have been employed, including convolutional neural networks (Huang et al. 2013; Mitra et al. 2016; Shen et al. 2014) and recurrent neural networks (Kiros et al. 2015; Wan et al. 2016).
The success of neural embeddings in NLP is mainly related to the unsupervised nature of the learning: no annotated data is needed, and these representations can be learned from any collection of texts. Pre-trained vectors are also available, such as those provided by Google, trained on part of the Google News dataset (about 100 billion words) and composed of 3 million words and phrases represented as 300-dimensional vectors.
From an IR point of view, the word vectors used during a search may be obtained from any pre-trained word vectors or can be derived from the same collection where the search is performed. They have been widely applied for document-query matching or for query expansion. In general, the query and the documents are either represented as bags of word vectors or as an aggregated vector. Aggregation can be obtained through different operators such as sum or average (Vulić and Moens 2015; Mitra et al. 2016; Nalisnick et al. 2016; Le and Mikolov 2014), non-linear combinations using the Fisher kernel (Clinchant and Gaussier 2010), k-means clustering (Ganguly et al. 2015) and maximum likelihood estimation (Zamani and Croft 2016b).
The query-document relevance score is computed either by comparing query and
document word vectors, aggregated or not, using a variety of similarity metrics such as cosine or dot-product (Mitra et al. 2016; Nalisnick et al. 2016). An alternative to computing the relevance score is to incorporate the term representations into existing IR models such as language models (Zuccon et al. 2015; Ganguly et al. 2015; Zamani and Croft 2016a; Ai et al. 2016) or BM25 (Kenter and De Rijke 2015; Rekabsaz 2016).
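A minimal sketch of this aggregation-plus-cosine scheme follows, using tiny hand-made 3-dimensional vectors in place of real pre-trained embeddings; the vectors and vocabulary are illustrative assumptions only:

```python
import math

# Toy 3-dimensional "embeddings", for illustration only; a real system would
# load pre-trained 100-300 dimensional vectors (e.g. word2vec or GloVe).
EMB = {
    "violin": [0.9, 0.1, 0.0],
    "fiddle": [0.85, 0.15, 0.05],
    "bank":   [0.0, 0.9, 0.2],
}

def avg_vector(words, emb):
    """Aggregate a text into a single vector by averaging its word embeddings."""
    vecs = [emb[w] for w in words if w in emb]
    return [sum(c) / len(vecs) for c in zip(*vecs)] if vecs else None

def cos(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))
```

With such vectors, a query mentioning "violin" scores a document about "fiddle" far above one about "bank", even though no term is shared, which is precisely what bag-of-words matching cannot do.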
Word embeddings have also been employed for query expansion. The basic approach consists in comparing query terms with the term embeddings of the whole collection, or of the top retrieved documents, to find expansion candidates (Diaz et al. 2016; Roy et al. 2016; Zamani and Croft 2016a; Zheng and Callan 2015).
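This basic expansion approach can be sketched as follows; the toy 2-dimensional embeddings are illustrative assumptions, standing in for vectors learned from the collection or taken from a pre-trained model:

```python
import math

def cos(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def expand_query(query_terms, emb, k=2):
    """Propose the k vocabulary terms closest (by cosine) to any query term
    as expansion candidates."""
    known = [q for q in query_terms if q in emb]
    if not known:
        return []
    scores = {t: max(cos(emb[q], v) for q in known)
              for t, v in emb.items() if t not in query_terms}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy 2-dimensional embeddings, for illustration only.
VOCAB = {"car": [1.0, 0.0], "automobile": [0.95, 0.1],
         "vehicle": [0.9, 0.3], "banana": [0.0, 1.0]}
```

The selected neighbours are then appended to the original query (possibly down-weighted) before retrieval.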

5 Information Need Representation

Query formulation is a crucial phase in the IR process: this is a subjective process that
should be tolerant to the uncertainty that intrinsically characterizes the identification
and the expression of an information need. As it has been widely advocated in
the literature, IR is an interactive process by which the user aims at locating the
documents useful to fulfill the needs behind his/her request. Despite the developments in the technologies for managing and accessing information, state-of-the-art and commercial search engines are still mainly based on keyword-based query formulation, which seldom makes use of knowledge resources to face the problem of word disambiguation.
The complexity of natural languages, with their nuances and their subjective usage, is still far from being effectively captured by computer applications. Moreover, the intended semantics of the few keywords specified in a user query should be disambiguated depending on both the user and the query context.
To cope with this uncertainty, a possibility is to allow the user to imprecisely or vaguely represent his/her information needs. In this context, the application of Fuzzy Set Theory has been aimed at modelling a tolerance to uncertainty in query formulation, by means of the definition of flexible query languages. In particular, flexible query
languages have been defined as generalizations of the Boolean query language. Two
main kinds of generalizations have been proposed: (1) to associate numeric or lin-
guistic weights to query terms; (2) to introduce linguistic quantifiers to aggregate
(weighted) query terms.
A query term weight expresses the importance of a term as a descriptor of the user's needs, and it is formally defined as a flexible constraint on the index term weights.
By this fuzzy extension, the structure of a Boolean query is maintained, by allowing
weighted query terms to be aggregated by the AND, OR connectives and negated by
the NOT operator. In this way, the exact matching of the Boolean model is relaxed to
a partial matching; in fact, the query evaluation mechanism applies a fuzzy decision
process that evaluates the degree of satisfaction of the query constraints by each
document representation, by applying a partial matching function. In the context of
Fuzzy Set Theory, the connectives AND and OR are defined as aggregation operators
belonging to the classes of T-norms and T-conorms respectively. Usually, the AND is
defined as the min aggregation operator, and the OR as the max aggregation operator.
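Under these definitions, evaluating a Boolean query against a document reduces to combining index term weights with min, max and complement. A minimal sketch (the document's term weights are invented for illustration):

```python
def fuzzy_and(*degrees):
    """AND as a T-norm; here the minimum."""
    return min(degrees)

def fuzzy_or(*degrees):
    """OR as a T-conorm; here the maximum."""
    return max(degrees)

def fuzzy_not(degree):
    """NOT as the standard fuzzy complement."""
    return 1.0 - degree

# Index term weights of one document (degrees in [0, 1]); invented values.
doc = {"retrieval": 0.8, "fuzzy": 0.6, "music": 0.1}

# Degree to which the document satisfies (retrieval AND fuzzy) OR (NOT music):
score = fuzzy_or(fuzzy_and(doc["retrieval"], doc["fuzzy"]), fuzzy_not(doc["music"]))
```

Here the score is max(min(0.8, 0.6), 1 − 0.1) = 0.9, a graded degree of satisfaction rather than the all-or-nothing answer of the crisp Boolean model.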
The first fuzzy models proposed the definition of numeric query term weights, in
the range [0, 1]. The flexible constraint identified by a query term weight depends on
its semantics; in the literature different semantics have been proposed, which have
introduced distinct fuzzy generalizations of the Boolean model (Yager 1988; Kraft
and Buell 1983; Bordogna et al. 1991; Kraft et al. 1999; Bordogna and Pasi 2001;
Boughanem et al. 2007).
The three main semantics that have been proposed for query term weights are:
the relative importance semantics (query weights express the relative importance
of pairs of terms in a query), the threshold semantics (a query weight expresses
a threshold on index term weights), and the ideal index term weight semantics (a
query weight expresses the perfect index term weight). The choice amongst the three
proposed query weight semantics implies a distinct modeling of the retrieval function evaluating a query against document representations.
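As an example of how the choice of semantics changes the matching function, one common rendering of the threshold semantics gives full satisfaction when the index term weight reaches the query weight, and a linearly decreasing degree below it; this particular piecewise form is one illustrative variant among those proposed in the cited literature, not the definitive one:

```python
def threshold_satisfaction(index_weight, query_weight):
    """Degree to which an index term weight satisfies a query weight read as
    a threshold: full satisfaction at or above the threshold, and a linearly
    decreasing degree below it (one illustrative variant among several)."""
    if query_weight <= 0:
        return 1.0
    return 1.0 if index_weight >= query_weight else index_weight / query_weight
```

Under the relative-importance or ideal-weight semantics, the same pair of weights would be mapped to a different satisfaction degree, which is exactly why the choice of semantics matters.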
To overcome the problem of imposing on the user the unnatural choice of a numeric value, thus forcing her/him to quantify a qualitative concept of importance, some recent models proposed in the literature have introduced linguistic query weights,
based on the concept of linguistic variable (Bordogna and Pasi 1993; Kraft et al.
1999). By this linguistic extension of the Boolean query language, query terms are
expressed by means of words such as important, and very important. Besides, lin-
guistic query term weights express flexible constraints on the index term weights. As
previously outlined, a second generalization of the Boolean query language has concerned the definition of linguistic quantifiers as aggregation operators. This proposal aims to improve query formulation by going beyond the usage of the AND and OR connectives (Bordogna and Pasi 2005). In fact, when the AND is used for aggregating the keywords specified in a user query, a document indexed by all keywords but one is not retrieved, thus causing the possible rejection of useful items. The opposite behavior characterizes the aggregation by OR. The use of linguistic quantifiers
(formally defined within Fuzzy Set Theory) was proposed to allow more expressive
and more natural query formulations. Linguistic quantifiers, such as at least 2 and
most, specify in fact more flexible selection strategies. Linguistic quantifiers have
been formally defined as averaging aggregation operators, the behavior of which lies
between the behavior of the AND and the OR connectives, which correspond to the
all and the at least one linguistic quantifiers. By adopting linguistic quantifiers, the
requirements of a complex Boolean query can be more easily and intuitively formu-
lated. For example, when desiring that at least 2 out of three selection conditions a,
b, c be satisfied, one should formulate the following Boolean query:
(a AND b) OR (a AND c) OR (b AND c)
which can be replaced by a simpler one: at least 2(a, b, c).
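With min/max as AND/OR, the two formulations coincide: the value of (a AND b) OR (a AND c) OR (b AND c) is exactly the second largest of the three satisfaction degrees, which is what a direct evaluation of at least k computes:

```python
def at_least_k(degrees, k):
    """Satisfaction degree of 'at least k of these criteria': the k-th
    largest of the individual satisfaction degrees."""
    return sorted(degrees, reverse=True)[k - 1]

# Crisp case: criteria a and c are satisfied, b is not.
a, b, c = 1.0, 0.0, 1.0
boolean_form = max(min(a, b), min(a, c), min(b, c))  # (a AND b) OR (a AND c) OR (b AND c)
quantified = at_least_k([a, b, c], 2)                # at least 2(a, b, c)
```

The quantified form also degrades gracefully with partial degrees, e.g. at_least_k([0.9, 0.4, 0.7], 2) yields 0.7, while the Boolean expansion grows combinatorially with the number of criteria.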
In Bordogna and Pasi (1995), a generalization of the Boolean query language that allows one to personalize search in structured documents was proposed; both content-based selection constraints and soft constraints on the document structure can be expressed. The atomic component of the query (basic selection criterion) is defined as follows: t in Q preferred sections, in which t is a search term expressing a
content-based selection constraint, and Q is a linguistic quantifier such as all, most, or at least k. Q expresses a part of the structure-based selection constraint. It is assumed that the quantification refers to the sections that are semantically meaningful to the user. Q is used to aggregate the significance degrees of t in the desired sections.

6 Retrieval Models: Relevance Modelling

Relevance is the most important notion in IR and one of the fundamental issues is
to define the formal and the theoretical frameworks allowing the interpretation of
this notion. The majority of the IR models consider relevance as a matching problem
between query and document characteristics, often represented as a set of weighted
terms (phrases). Probabilistic models including BM25 (Robertson and Walker 1994),
language models (Ponte and Croft 1998; Lavrenko and Croft 2017; Zhai 2008),
information theory-based models (Amati and Van Rijsbergen 2002; Clinchant and
Gaussier 2010), and algebraic models such as the vector space model (Salton et al. 1975), are currently the most widespread and best-performing models.
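As a reference point for the probabilistic family, the BM25 scoring function can be written in a few lines; this is the standard Okapi formulation with the usual k1 and b defaults, with tokenization and collection statistics left to the caller:

```python
import math
from collections import Counter

def bm25_score(query, doc_tokens, doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query (standard formulation).

    doc_freq maps each term to the number of documents containing it;
    avg_len is the average document length in the collection.
    """
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] * (k1 + 1) / (
            tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len))
        score += idf * norm
    return score
```

The k1 parameter controls term-frequency saturation and b the strength of document-length normalization; both are typically tuned per collection.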
However, there are other theoretical frameworks, more related to AI, that have been used to interpret the notion of relevance. These include logic (propositional, modal, description, …) (Van Rijsbergen 1986; Crestani et al. 2003; Abdulahhad 2014), fuzzy logic (Damiani et al. 2007; Boughanem et al. 2009, 2007), inferential and belief models (Turtle and Croft 2017; Silva et al. 2000), and optimization methods such as evolutionary computation [genetic algorithms (Kim and Zhang 2003; Vrajitoru 2013), swarm intelligence (Kennedy and Eberhart 1995)], game theory (Raifer et al. 2017; Zhai 2016), and multi-agent systems (Enembreck et al. 2004; Trifa et al. 2017).
Retrieval models have also been addressed with machine learning techniques. The first
work tackling learning to rank dates back to 2000. Since 2013, the neural network
trend has also been inspiring IR tasks: the impressive results obtained in vision
and image retrieval opened real opportunities for the document retrieval community.
We list, in the following, some IR approaches based on these theoretical frameworks.

6.1 Logic-Based Models

These models assume that the retrieval process has an inferential nature. For example,
the direct term-based comparison, between a document d discussing “violin” and a
user query q entailing “fiddle”, will lead to a mismatch. However, based on the
knowledge that “violin” and “fiddle” are synonymous, it is possible to infer that d
is a possible answer to q. Therefore, classical (bag-of-words) IR models alone
are unable to solve such an issue. On the other hand, using formal logics, which are
basically inference systems and well-adapted tools for knowledge representation, to
model the retrieval process is expected to make it more intelligent (i.e. closer to the
way a human expert decides about relevance).
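The violin/fiddle example above can be sketched as a tiny inference step over a toy synonymy knowledge base (a hypothetical illustration; actual logic-based models are far richer than this):

```python
# Hypothetical sketch of inferential matching: d -> q holds if every query
# term is either in the document or reachable via a known synonymy relation.
synonyms = {"violin": {"fiddle"}, "fiddle": {"violin"}}  # toy knowledge base

def entails(doc_terms, query_terms):
    def covered(qt):
        return qt in doc_terms or any(s in doc_terms for s in synonyms.get(qt, ()))
    return all(covered(qt) for qt in query_terms)

assert not {"violin"} >= {"fiddle"}     # plain bag-of-words matching fails
assert entails({"violin"}, {"fiddle"})  # inference over synonymy succeeds
```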
The use of logic in IR dates back to Cooper (1971), where the relevance is seen
as an inference process between a document d and a query q. The retrieval consists
of finding the documents that imply the query, denoted d → q where d and q are
normally logical sentences in the underlying logic. In the literature, logic-based IR
models have adopted six stands (interpretations) of the → operator between d and q
(Sebastiani 1998). However, this view of inference is strict: most formal logics allow
only True/False decisions. Thus, Van Rijsbergen (1986, 1989) proposed the notion of
Logical Uncertainty Principle (LUP), which allows a nuanced formulation of the
implication by associating a degree of uncertainty with it, denoted U (d → q)
(see also chapter “Constraint Reasoning” of Volume 2 for more details).
The main issue addressed in this line of research is to define the theoretical
framework for translating the queries, the documents, the implication, and the
uncertainty U . It is worth mentioning, in this context, that most logic-based
IR models differentiate between two tightly related notions, namely matching
(represented via →) and ranking (represented via U ). This distinction allows for
a finer-grained analysis.
Several frameworks have been proposed and adapted to IR, namely modal logic,
description logic, conceptual graphs, etc. In the same way, uncertainty has been
considered in different forms, including fuzzy logic, probability theory, logical imaging,
and belief revision (see chapters “Knowledge Representation: Modalities, Condi-
tionals and Nonmonotonic Reasoning” and “Representations of Uncertainty in Artifi-
cial Intelligence: Probability and Possibility” of Volume 1 and chapter “Automated
Deduction” of Volume 2 for details about some of these logics). We list in the fol-
lowing some logic-based IR models. This part is largely inspired by Crestani et al.
(2003), Abdulahhad (2014), Lalmas (1998), which provide many more details than
those given below. We present the models according to the formal logic used
to represent the different components of the implication, and also according to the
mathematical theory used to estimate uncertainty.
Propositional logic: Many IR models use propositional logic as a logical frame-
work to represent the retrieval process. In Losada and Barreiro (2001, 2003), both d
and q are logical sentences, and the IR implication d → q is the logical consequence
d |= q.3 The uncertainty is estimated using Belief Revision. They particularly used
Dalal’s operator for document ranking. Abdulahhad et al. (2017) opted for the
same choices to model d, q, and d → q. However, instead of using Belief Revi-
sion, they make use of the lattice structure that can be constructed between logical
sentences in order to obtain a probabilistic estimation of U .
Modal logic: Modal logic extends classical propositional and predicate logic
to include modality operators, namely necessity and possibility. In this context, two
mathematical frameworks, namely Kripke’s Possible Worlds (PW) semantics (Kripke

3 |= is a meta-language symbol, where s1 |= s2 means that in any interpretation, if s1 is true then s2 is also true.

1963) and Logical Imaging, have been used to build IR models. PW semantics assumes
that worlds (formal interpretations) are connected through accessibility relations.
The two modalities for a logical sentence s refer to the possibility of reaching a possible
world where s is true, starting from the current possible world and following the
accessibility relations between worlds. Logical Imaging evaluates the process of
moving probabilities from the worlds where a given sentence is false to the most
similar worlds where it is true. Propositional modal logic was used by Nie (1988,
1989), where documents are possible worlds or interpretations and queries are logical
sentences. According to Nie, a document d is relevant to a query q iff q is true in
d, or in a world accessible from d. Therefore, uncertainty is seen as the cost of the
path needed to move from the original document d, where q is not true, to a
document d′ that is accessible from d and in which q is true. In contrast, Crestani et
al. (Crestani and van Rijsbergen 1995; Crestani 1998) assume that each term is a
possible world and both documents and queries are logical sentences. A document
(resp. query) is true in a given term (world) t iff t appears in that document (resp.
query). Logical Imaging is then used to rank documents: terms’ scores are first
relocated from the terms that do not appear in the document to the most similar terms
inside the document, according to the accessibility relations; the relevance value
U (d → q) is then estimated based only on the terms that appear in d.
that evaluate the accessibility between two possible worlds have also been proposed
in a fuzzy framework (Nie et al. 1995).
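The score-relocation step of Logical Imaging described above can be sketched as follows (a rough, simplified illustration; `prior` and `sim` stand for hypothetical term priors and term similarities, which the actual models derive from the collection):

```python
# Rough sketch of Logical Imaging ranking: each term is a possible world with
# a prior probability; the probability of every term absent from the document
# is moved to its most similar term occurring in the document, and relevance
# sums the relocated mass over the query terms present in the document.
def imaging_score(doc_terms, query_terms, prior, sim):
    """prior: term -> probability; sim(t1, t2): similarity between terms."""
    mass = {t: 0.0 for t in doc_terms}
    for t, p in prior.items():
        # keep mass on t if t is in the document, else relocate it
        target = t if t in doc_terms else max(doc_terms, key=lambda w: sim(t, w))
        mass[target] += p
    # after imaging, all mass sits on the document's terms; the score is the
    # imaged probability of the query
    return sum(mass[t] for t in query_terms if t in doc_terms)
```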
Conceptual graph: The Conceptual Graph formalism of Sowa (1983) has been used
in IR (Chiaramella and Chevallet 1992; Chevallet and Chiaramella 1998; Amati and
Ounis 2000). It is a graphical formalism that is equivalent to first-order logic. In this
IR model, documents and queries are represented by conceptual graphs (i.e. logical
sentences), and the retrieval decision is carried out through conceptual graph operations
that establish a projection (i.e. material implication) between d and q. The uncertainty is
the cost of these operations.
Description/Terminological logic: Description Logic (DL) is a family of languages
to represent knowledge, widely used in the Semantic Web. It is more expressive
than propositional logic and less expressive than first-order logic, but it allows more
efficient reasoning than first-order logic. In Meghini et al. (1993), the query is a concept
and the document can be a concept or an individual. If the document is an individual,
the retrieval decision is to check whether the individual d is an instance of the concept
q; otherwise, it is to check whether the concept d is subsumed by the concept q.
In Meghini et al. (1993), relevance is binary; the model was then extended in
Sebastiani (1994), Meghini and Straccia (1996) to include probabilities, so as to fit the
LUP of Van Rijsbergen (1986, 1989). Other extensions have been proposed to estimate
uncertainty using the notion of possibility (Qi and Pan 2008).
Probabilistic Datalog: Datalog is a predicate logic that has been developed in
the database field. Probabilistic Datalog is an extension of Datalog using probability.
More precisely, predicates are associated with probabilities, denoted αg where g
is a classical predicate and α is the probability that g is true. Probabilistic Datalog
has been used in IR (Fuhr 1995; Rölleke and Fuhr 1996), where documents are
represented as sets of probabilistic predicates of the form α term(t, d), expressing
that document d is indexed by the term t, with α indicating the probability that d is
about t. Queries are written as Boolean expressions, and the retrieval decision is seen
as an inference rule.
Probabilistic Argumentation Systems: This is a logical framework that extends
propositional logic with a probabilistic mechanism to express uncertainty; it is able
to express both qualitative and quantitative uncertainty. Picard (1999), Picard
and Savoy (2000) proposed an IR model based on such logic. Documents and
queries are represented as sets of weighted rules indicating document-term aboutness,
inter-term, and inter-document relations, where weights indicate the strength of
the implications. Relevance is seen as the degree to which the document supports
the query.
Others: Other families of formal logics have been used to model IR process.
Situation theory was adopted by Lalmas and van Rijsbergen (1993), and Huibers
(1994) to build an IR model where the document d is a situation and the query q is
an infon or a set of infons. An infon is an atomic information carrier, and it refers
to the information that a particular relation holds / does not hold between a set of
objects. Accordingly, d is relevant to q iff d supports q. Abductive reasoning (Thiel
and Müller 1996) and default logic (Hunter 1997) have also been used; the latter
serves to represent semantic relations between objects, e.g. synonymy, polysemy, etc.
Although formal logics, as powerful inference and knowledge representation systems,
make the retrieval process more intelligent, using them to model the IR process is
not cost-free. Most logic-based IR models are too complex to be implemented
operationally. However, some recent studies were able to build operational logic-based
IR systems (Abdulahhad et al. 2017; Zuccon et al. 2009; Losada and Barreiro 2003).

6.2 Fuzzy Models

Fuzzy Set Theory has been applied to IR since the 1970s, with the aim of modeling both
the vagueness/uncertainty in the formulation of an information need and the subjectivity
of the notion of relevance. Fuzzy sets were initially applied to information retrieval
as a means to generalize the Boolean retrieval model (Bordogna and Pasi 1995;
Miyamoto 1990; Buell 1985; Bookstein 1980). As outlined in Sect. 5, an
outcome of the proposed generalizations was to enable flexible query formulation,
by allowing the specification of both numeric and linguistic query term weights,
interpreted as constraints on the document representation, formally expressed as a
fuzzy subset of index terms (Fox and Sharan 1986; Molinari and Pasi 1996; Herrera-
Viedma 2001).
The first fuzzy generalization of the Boolean IR model consisted in simply
extending the document representation, while maintaining the Boolean query language.
By representing a document as a fuzzy subset of index terms, instead of a classical
set, index term weights can be considered (e.g. normalized t f ∗ id f weights), and the
Boolean query evaluation mechanism can produce an RSV (relevance score) for each
document, thus allowing a ranking of the proposed results. One of the first models
proposing this extension is the MMM (Min, Max, and Mixed) model introduced in
Fox and Sharan (1986). The adaptation is quite simple: the document is seen as a
fuzzy subset of the index terms in the collection (dictionary), where the term weight
represents the degree of membership of the term in the document. The evaluation
of a Boolean query then relies on the interpretation of the AND and OR connectives as
conjunctive and disjunctive aggregation operators, generally the min and max operators respectively.
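The MMM evaluation scheme can be sketched as follows (a minimal illustration; the nested-tuple query encoding is ours, not part of the original model):

```python
# Sketch of the fuzzy extension of the Boolean model (MMM-style): a document
# is a fuzzy set of terms (weights in [0,1]); AND/OR are evaluated with
# min/max, yielding a graded RSV instead of a binary decision.
def evaluate(query, doc):
    """query: a term string, or a nested tuple ('and'|'or', subquery, ...)."""
    if isinstance(query, str):
        return doc.get(query, 0.0)          # membership degree of the term
    op, *subs = query
    scores = [evaluate(s, doc) for s in subs]
    return min(scores) if op == "and" else max(scores)

doc = {"violin": 0.8, "music": 0.4}
rsv = evaluate(("or", ("and", "violin", "music"), "piano"), doc)
# max(min(0.8, 0.4), 0.0) = 0.4
```

The RSV makes it possible to rank documents instead of partitioning them into retrieved/not retrieved.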
Subsequent generalizations of the Boolean model have been proposed, with the aim
of also extending the Boolean query language (besides generalizing the formal document
representation). As mentioned in Sect. 5, two kinds of extensions have been
defined: (1) the association of (numeric or linguistic) weights with the query terms,
and (2) the generalization of the AND and OR connectives (by means of linguistic
quantifiers, as shown in Sect. 5). In particular, different interpretations of query term
weights have given origin to distinct generalizations of the Boolean retrieval model.
As shortly introduced in Sect. 4, the three semantics associated with query term
weights are: relative importance (the weights express the relative importance among
terms), threshold (the query term weight expresses a threshold constraint on the index
term weights) and the ideal index term weight semantics (Yager 1988; Sanchez 1989;
Kraft and Buell 1983; Bordogna et al. 1992; Boughanem et al. 2007; Baziz et al.
2006).
As outlined in Sect. 5, to help users qualify the importance of query terms
as descriptors of their needs, the numeric query weights have been generalized to
linguistic query weights, by maintaining their semantics. Formally, these weights are
defined as values of the linguistic variable Importance (e.g., important, very important,
etc.), which still specify constraints on the index term weights (Bordogna and Pasi
1993; Kraft et al. 1999).
The other aspect related to the extension of the Boolean query language concerns
the connectives employed to aggregate the different search criteria, i.e. query terms.
Basically, in the Boolean query evaluation process, aggregation consists in evaluat-
ing a document on each term separately, and then aggregating the results according to the
Boolean structure of the query. When generalizing the document representation to
a fuzzy document representation, the aggregation process must account for index
term weights (or for scores expressing the satisfaction of the constraint imposed by
a query term weight, in the case of generalized Boolean queries). The AND and OR
connectives are associated with conjunctive and disjunctive aggregation operators
respectively, such as t-norms for AND and t-conorms for OR (Yager 1988; Dubois
and Prade 1985; De Baets and Fodor 1997). Linguistic variants of aggregation oper-
ators enabling to relax AND (all) and OR (at least 1), such as most or at least k
have also been proposed and used in IR (Hayashi et al. 1992; Sanchez 1989). To
this purpose ordered weighted aggregation (OWA) operators have been introduced
(Yager 1988; Bordogna and Pasi 1995; Marrara et al. 2017).
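A minimal sketch of an OWA operator, showing how the weight vector encodes linguistic quantifiers (illustrative only; the example scores are made up):

```python
# Sketch of an OWA (ordered weighted averaging) operator (Yager 1988): the
# weight vector determines the quantifier semantics, between "all" (min) and
# "at least 1" (max).
def owa(scores, weights):
    """Sort scores in decreasing order, then take the weighted sum."""
    assert abs(sum(weights) - 1.0) < 1e-9 and len(scores) == len(weights)
    return sum(w * s for w, s in zip(weights, sorted(scores, reverse=True)))

scores = [0.9, 0.6, 0.1]
at_least_2 = owa(scores, [0.0, 1.0, 0.0])   # the 2nd best score decides -> 0.6
all_of = owa(scores, [0.0, 0.0, 1.0])       # behaves like min -> 0.1
any_of = owa(scores, [1.0, 0.0, 0.0])       # behaves like max -> 0.9
```

Intermediate weight vectors (e.g. decreasing weights) realize soft quantifiers such as most.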
There exist two noticeable refinements of the min operation, called discrimin
and leximin (Dubois et al. 1997), which allow one to distinguish between values to be
aggregated that have the same minimal value. These operators have been applied to
IR in Boughanem et al. (2007), Baziz et al. (2006). An interesting survey on
aggregation in IR can be found in Marrara et al. (2017).
Other approaches based on Possibility Theory (See chapter “Representations of
Uncertainty in Artificial Intelligence: Probability and Possibility” of Volume 1) have
been defined. In particular, Loiseau et al. (2004) used fuzzy pattern matching (Dubois
et al. 1988) to formulate and evaluate flexible queries on documents represented by
fuzzy sets. In order to estimate the relevance of a document to a query, also called
compatibility, the possibility and necessity measures were used. The possibility
metric estimates to what extent it is possible that a query q and a document d refer
to the same value (terms); it corresponds to the intersection of the fuzzy set of values
compatible with q with the fuzzy set of possible values of d. The necessity metric
measures to what extent it is certain that the value corresponding to d is compatible
with q; it is computed as a degree of inclusion of the possible values of d in the
set of values compatible with the query terms. The compatibility is evaluated by
means of a possibilistic ontology that allows comparing the compatibility of terms
even if they are not similar.
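The possibility and necessity measures over fuzzy term sets can be sketched as follows (a simplified illustration of the standard definitions; the actual model operates on richer representations and a possibilistic ontology):

```python
# Sketch of possibilistic matching: q and d are fuzzy sets over terms
# (term -> membership degree in [0,1]).
def possibility(q, d):
    """Pi(q; d) = max_t min(q(t), d(t)) -- degree of overlap of q and d."""
    terms = set(q) | set(d)
    return max(min(q.get(t, 0.0), d.get(t, 0.0)) for t in terms)

def necessity(q, d):
    """N(q; d) = min_t max(q(t), 1 - d(t)) -- inclusion of d's values in q."""
    terms = set(q) | set(d)
    return min(max(q.get(t, 0.0), 1.0 - d.get(t, 0.0)) for t in terms)
```

Possibility gives an optimistic bound on the match, necessity a pessimistic (certain) one, and documents can be ranked on the pair of degrees.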
Other approaches have been defined within the framework of Possibility Theory
(Boughanem et al. 2009, 2007). Several surveys on fuzzy IR can be found in Kraft
et al. (1999), Tamir et al. (2015), Kraft et al. (2015), Pasi (2009), Kraft and Colvin
(2017), Marrara et al. (2017).

6.3 Bayesian Networks

Bayesian inference networks provide a probabilistic formalism for describing inference
relations with uncertainty (see chapter “Belief Graphical Models for Uncertainty
Representation and Reasoning” of Volume 2). Several IR models have been
proposed (Turtle and Croft 2017; Ribeiro and Muntz 1996; Silva et al. 2000; De
Campos et al. 2002; Lee et al. 2009) in which nodes represent document, term,
or query variables. The links indicate causality between nodes, and relevance
is related to the probability of logically inferring the query from the document
representations, or conversely (Van Rijsbergen 1986). The two best-known models are
the inference networks model of Turtle and Croft (2017) and the Belief model of
Ribeiro-Neto and Muntz (Ribeiro and Muntz 1996; Silva et al. 2000). In these models,
documents, index terms, and queries are represented by Boolean variables, and
relevance is seen either as the inference of the query from the documents (Turtle
and Croft 2017), or as the deduction of relevant documents given a query (Ribeiro and
Muntz 1996). Belief networks can thus generalize the Boolean, vector space,
probabilistic, and inference models. Other extensions based on Bayesian networks have
been proposed, either for optimizing the computation of conditional probabilities
(Bruza and van der Gaag 1994; Indrawan et al. 1996; Fung and Del Favero 1995), for
integrating dependence between term pairs (De Campos et al. 2003; Crestani et al.
2003) or document pairs (De Campos et al. 2002), for dealing with heterogeneous
documents (Crestani et al. 2003; Denoyer and Gallinari 2003), or for tweet search
(Jabeur et al. 2012).
The possibilistic framework has also been applied in IR to better characterize relevance.
These models use possibility and necessity as relevance metrics, instead
of a unique probability metric, which allows one to better reflect the subjectivity of
actual relevance. Such a model was proposed in Boughanem et al. (2009); it
is inspired by the Turtle model (Turtle and Croft 2017), but employs possibilities
instead of probabilities. The relevance of a document given a query is measured by
two degrees: necessity and possibility. The possibility degree is convenient
for filtering documents out of the retrieved set, while the necessity degree is useful
for confirming document relevance.

6.4 Machine Learning Based Models: Learning to Rank, Deep Learning

Although the theoretical frameworks underlying the traditional IR models differ, they
all combine the same relevance signals, such as t f (term frequency), id f (inverse
document frequency), and document length. However, when the number of signals increases
to reach hundreds, which is actually the case for search engines, these
models fail and are unable to process such an amount of signals. Machine learning
techniques provide a way to handle this issue, although on their side they require
annotated data, which are often not available.
The use of machine learning, particularly neural networks, dates back to the
1990s. These early works (Belew 1987; Kwok 1989; Boughanem SDC 1992) are
based on spreading activation networks, often composed of two layers. The search is
carried out by propagating the entry (the query) forward from the term layer to the
document layer, possibly with one more step back to the term layer to facilitate
learning. The first models with an effective ability to learn from hundreds
of features and combine them date back to the 2000s. They are known as learning
to rank (LTR) approaches; their goal is to learn the ranking function over a set of
hand-crafted features composed of tens or even hundreds of characteristics extracted
from documents and/or queries (Liu 2009; Li 2011). Such features include t f , id f ,
BM25 scores, occurrence of query terms in the document title or in anchor text, document
length, PageRank, number of unique words, document trust, etc. The learned model
is then used in the testing phase.
Several machine learning models including support vector machines (Herbrich
2000; Nallapati 2004; Yue et al. 2007), neural networks (Burges et al. 2005; Tsai
et al. 2007), and boosting (Wu et al. 2010), were developed to support IR tasks
(see chapter “Designing Algorithms for Machine Learning and Data Mining” of
Volume 2). The main issues, that arise in LTR, include training data creation, feature
construction and the machine training model. The use of a particular model depends
on the size and the type of the training data, and on the training objective (the type
of the desired output). Liu (2009) categorized three types of objectives:
• Pointwise, where the output is a query-document relevance, which can be rep-
resented as a degree of relevance, a binary relevance (relevant vs. irrelevant), or
multiple ordered categories (Perfect ≻ Excellent ≻ Good ≻ Fair ≻ Bad) (Crammer
and Singer 2002; Nallapati 2004; Shashua and Levin 2003; Li et al. 2008).
• Pairwise, based on document preferences: document d1 is more relevant than
document d2 (Burges et al. 2005; Freund et al. 2003; Wu et al. 2010).
• Listwise, based on a list of documents ranked according to their relevance; its
main objective is to optimize ranking metrics (MAP, NDCG) (Yue et al. 2007;
Tsai et al. 2007; Xia et al. 2008).
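As a minimal illustration of the pairwise objective, the following sketch trains a linear scoring function with a perceptron-style update on preference pairs (a toy stand-in for models such as RankNet; the feature vectors, update rule, and parameter values are illustrative, not from the cited works):

```python
# Minimal pairwise learning-to-rank sketch: a linear model trained on
# document pairs (d1 preferred over d2). Feature vectors stand for
# hypothetical hand-crafted signals (tf, idf, BM25 score, ...).
def train_pairwise(pairs, n_features, epochs=50, lr=0.1):
    """pairs: list of (features_of_preferred_doc, features_of_other_doc)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            margin = sum(wi * (b - x) for wi, b, x in zip(w, better, worse))
            if margin <= 0:  # pair mis-ordered: push the scores apart
                w = [wi + lr * (b - x) for wi, b, x in zip(w, better, worse)]
    return w

def score(w, features):
    return sum(wi * fi for wi, fi in zip(w, features))
```

At test time, documents are ranked by `score`; pointwise and listwise objectives would change only the loss being optimized, not this overall scheme.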
Unlike LTR models, which require hand-crafted features, deep learning approaches
have been used to automatically learn features useful for modeling relevance. This
class of neural IR models is called interaction-focused models, as they attempt to
extract salient features from the interaction between a query and a document (or a
set of documents).
Neural architectures that have been used to handle this task include MultiLayer
Perceptron (MLP), convolutional neural networks (CNN) and recurrent models.
Generally, MLPs are used to enable nonlinear combinations of the inputs (entries).
They help aggregate different input word vectors into a single representation vector
(Le and Mikolov 2014), and map a sparse vector to a low-dimensional representation
vector (Huang et al. 2013). CNNs are used to learn representation vectors
from raw text through a sequence of convolutional and pooling layers (Shen et al.
2014; Hu et al. 2014; Mitra et al. 2017). A CNN defines a set of linear filters (layers)
able to extract features by detecting regularities in inputs with spatial constraints,
such as images and texts; convolutional layers are typically followed by pooling
layers that perform aggregation. Recurrent neural models are also widely used in
IR for their ability to represent sequential inputs, such as continuous word sequences
(Kiros et al. 2015; Wan et al. 2016), and for their memorization capability, as they allow
one to remember the different pieces of information present in the input data while
processing them.
Most of the models proposed in the literature start by using a NN model, usually
a CNN or an RNN, to extract salient features, which are then given as input to an
MLP network to be aggregated or to learn relevance. Typically, these models operate
according to the network input, which may take different forms: term vectors
produced by word embedding approaches (see Sect. 4.3), or an interaction matrix
generated by comparing windows of text from the query and the document. The terms
within each window can be represented as one-hot vectors (Jozefowicz et al. 2016;
Kim et al. 2015; Huang et al. 2013) or as word embeddings (Hu et al. 2014).
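A query-document interaction matrix of the kind used by these models can be sketched as follows (the toy two-dimensional embeddings are made up for illustration; real models use learned embeddings with hundreds of dimensions):

```python
# Sketch of an interaction matrix: entry (i, j) is the cosine similarity
# between the embedding of query term i and document term j. This matrix is
# what a CNN then scans for local matching patterns.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def interaction_matrix(query, doc, emb):
    """query, doc: lists of terms; emb: term -> embedding vector."""
    return [[cosine(emb[qt], emb[dt]) for dt in doc] for qt in query]
```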
Approaches using interaction matrices have been applied to short text matching
(Lu and Li 2013; Yin and Schütze 2015; Pang et al. 2016) as well as to long document
ranking (Pang et al. 2016; Mitra et al. 2017). The Deep Structured
Semantic Models (DSSM) presented in Huang et al. (2013) were the first to introduce a NN-based
approach for ad-hoc retrieval. The proposed models were trained by maximizing
the conditional likelihood of the clicked documents given a query, using clickthrough
data. Shen et al. (2014) proposed the Convolutional Deep Structured Semantic
Models (C-DSSM) as an extension of DSSM for document/query matching,
combining a CNN with max-pooling. However, this kind of model fails when
dealing with rare terms and search intents. Indeed, a good neural IR model should
incorporate both lexical and semantic matching signals (Mitra et al. 2017).
Lu and Li (2013) developed a deep matching method called DeepMatch that allows
one to model the matching between two objects from heterogeneous domains. The
proposed model was applied to two tasks: finding relevant answers to a given question,
and matching tweets with comments. Likewise, Guo et al. (2016) proposed a
deep relevance matching model (DRMM) for ad-hoc retrieval that employs three
relevance matching factors: Exact matching signals, Query terms importance and
Diverse matching requirement. Their model is based on the interaction-focused mod-
els and uses a joint deep architecture at the query term level for relevance matching.
Recently, Zamani et al. (2018) explored how neural models address ranking documents
with multiple document fields. The proposed model handles short text fields,
like the document’s title, and long text fields, like the document’s body. They found that it is
more effective to learn separate embedding spaces to match the different document
fields against the query, rather than opting for a common embedding space. This
can be explained by the fact that the document fields can correspond to different
aspects of the query; thus, it is better to compare them with separate
representations of the query text.
Despite the major improvements achieved by neural models operating on
supervised data, one of the main challenges is to learn how to handle IR tasks with
weakly supervised or unsupervised data. Some recent works have attempted
to address this issue. Dehghani et al. (2017) proposed a “Pseudo-Labeling” approach
for query-dependent ranking that creates its own training data set using existing
unsupervised methods; the weak supervision signals generated are then used to train
a neural retrieval model. MacAvaney et al. (2017) presented an approach that generates
weak supervision training data for neural IR models and considers negative
training examples. The approach is applied to a news corpus, where article
headlines are extracted as pseudo-queries and article contents as pseudo-documents;
human relevance judgments are replaced by a similarity metric that measures
the interactions between the pseudo-queries and the pseudo-documents.
Applying deep learning to IR tasks is currently one of the hottest topics in the
information retrieval field: more than fifty papers have been published in
top conferences and journals, and several interesting surveys have appeared
(Onal et al. 2017; Mitra and Craswell 2017).

6.5 Evolutionary Computation

Several IR systems have turned to evolutionary algorithms in order to improve
search performance and to reduce the time required to answer complex queries.
Evolutionary algorithms like genetic algorithms (GAs) (Holland 1992), Ant
Colony Optimization (Colorini et al. 1991), Artificial Bee Colony (Karaboga 2005), and
Particle Swarm Optimization (Kennedy and Eberhart 1995) are bio-inspired methods that
have been proposed as a way of finding optimal solutions to complex problems in
a much shorter time than that required by evaluating all possible solutions.
These algorithms are extensively discussed in chapter “Meta-Heuristics and
Artificial Intelligence” of Volume 2.

6.5.1 Evolutionary Algorithms: Genetic Algorithms

Genetic algorithms (GAs), initially introduced by Holland (1992), are stochastic
optimization algorithms inspired by natural selection and genetic mechanisms.
They start with a randomly chosen population of potential solutions. Based on their
relative fitness (performance), a new population of potential solutions is created
using simple evolutionary operators: selection, crossover, and mutation. This process
is repeated until a “satisfactory” solution is reached.
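A toy GA of this kind, applied for instance to optimizing query term weights, can be sketched as follows (illustrative only; the fitness function, operator choices, and parameter values are ours, not those of the cited works):

```python
# Toy genetic algorithm sketch: individuals are weight vectors (e.g. query
# term weights in [0,1]); higher fitness means a better query. Requires
# n_genes >= 2 for the one-point crossover below.
import random

def evolve(fitness, n_genes, pop_size=20, generations=40, mut=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                  # selection (top half)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)             # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_genes)                  # mutation, clamped
            child[i] = min(1.0, max(0.0, child[i] + rng.uniform(-mut, mut)))
            children.append(child)
        pop = parents + children                        # elitist replacement
    return max(pop, key=fitness)
```

In an IR setting, the fitness would typically be a retrieval quality measure (e.g. average precision of the query induced by the weight vector) rather than the toy function used here.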
Genetic algorithms were proposed in several works (Vrajitoru 2013; Chen 1995;
Yang et al. 1993; Gordon 1991) as a solution to IR issues, such as document indexing and
query reformulation. By analogy, the search space of a genetic algorithm within an
IR system involves a set of document descriptors composed of the terms
belonging to each document. The genes are the term weights, and an individual is
represented as the concatenation of all the document’s descriptors. Thus, the main
purpose of the genetic algorithm in this context is to create at least one new individual
whose performance is greater than that of its parents. The work presented in
De Almeida et al. (2007) addressed ranking strategies, from a genetic programming
perspective, that combine several term weighting functions and adapt to each document
collection; the proposed approach proved effective, as it outperforms the
traditional weighting functions and improves retrieval precision.
Tamine et al. (2003) made use of genetic algorithms to develop a query reformulation
(optimization) process involving the niching technique (Goldberg and Corruble
1994), which retrieves, for the same query, relevant documents that have relatively
dissimilar descriptors. In Kim and Zhang (2003), the authors proposed a genetic-based
mining method to determine the significant tags and their weights for document
retrieval. Araujo and Pérez-Iglesias (2010) developed a query expansion approach
using a genetic algorithm with a fitness function based on the user’s relevance judgments;
they believe that using a genetic algorithm to select the terms maximizing
the average precision, for each query, can enhance the retrieval process. Likewise, in
Sathya and Simon (2010), the authors use a GA to obtain the best combination of terms
from a set of keywords extracted by a document crawler; the output generated by
the GA is then applied to an IR system.
The work presented in Al-Khateeb et al. (2017) also used a GA for query
reformulation and expansion. Unlike traditional IR systems, instead of using a single
query, the authors use WordNet to extract synonyms of the query’s keywords, and thus
put forward a population of queries generated from the original query; they consider
that such an approach allows expanding the search space.

6.5.2 Metaheuristic and Swarm Intelligence

Metaheuristics based on swarm intelligence are founded on the collective and social
behavior of some species, like ants and bees, giving rise to swarm intelligence algorithms
such as Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and
Artificial Bee Colony (ABC). When dealing with IR issues, these algorithms have proved
their worth and have empirically demonstrated their effectiveness.
Particle Swarm Optimization (Kennedy and Eberhart 1995) is an evolutionary
technique that uses a population of candidate solutions to develop an optimal solution
to the problem. The members of the population, called particles, are randomly
distributed in the search space, each with a random velocity. In Bindal and Sanyal
(2012), the authors proposed a PSO-based approach for query optimization that learns
the significance of the query terms from document contexts. It determines the optimal
query vector that improves
the IR system effectiveness. Therefore, a particle stands for a query vector and the
fitness function is represented as the cosine similarity between a query and the top-k
documents retrieved for the original query. An enhanced PSO algorithm was also
introduced by Khennak and Drias (2017) for query expansion, aiming to determine
the most suitable expanded query rather than extracting the best expanded keywords
(Sathya and Simon 2010). To cope with the huge number of expanded query
candidates, the authors turned to an accelerated version of the PSO algorithm,
called APSO, which deals with this issue as a combinatorial optimization problem.
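The particle-as-query-vector idea can be sketched as follows; the names and the simple pseudo-relevance-feedback fitness are our own illustrative assumptions, not the exact procedure of Bindal and Sanyal (2012).

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def pso_query_optimization(dim, feedback_docs, n_particles=15, iters=40,
                           w=0.7, c1=1.5, c2=1.5):
    """Each particle is a query term-weight vector; fitness is its mean
    cosine similarity to the top-k (pseudo-relevant) document vectors."""
    def fitness(q):
        return sum(cosine(q, d) for d in feedback_docs) / len(feedback_docs)

    particles = [[random.random() for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in particles]              # per-particle best
    gbest = max(particles, key=fitness)[:]         # swarm-wide best

    for _ in range(iters):
        for i, p in enumerate(particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Standard PSO velocity update: inertia + cognitive + social.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - p[d])
                                    + c2 * r2 * (gbest[d] - p[d]))
                p[d] += velocities[i][d]
            if fitness(p) > fitness(pbest[i]):
                pbest[i] = p[:]
            if fitness(p) > fitness(gbest):
                gbest = p[:]
    return gbest
```

The returned vector plays the role of the optimized query; in a real system its fitness would be measured against retrieval effectiveness rather than raw cosine similarity.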
The Ant Colony Optimization (ACO) algorithm (Colorni et al. 1991) is based on
the behavior of ants seeking a path between their colony and a food source using
pheromone trails. The original idea has since diversified to solve a broader class of
problems and several algorithms have emerged, drawing on various aspects of ant
behavior. Chawla (2013) proposed an ACO based approach for personalized Web
search, in which the ant pheromone corresponds to the information scent and users
play the role of ants. The pages clicked by other users for a given query serve as
the information scent, i.e. pheromone, which helps enrich the search space
of a given user for the same query. The ACO algorithm was also applied to Web page
ranking (Chawla 2017), addressing the optimal ranking of clicked URLs
from an optimization perspective.
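The pheromone metaphor used in these works can be illustrated with a deliberately simplified sketch (our own naming and update rule, not the cited algorithms): user click sessions act as ants, clicked URLs receive pheromone deposits, evaporation fades old evidence, and URLs are ranked by accumulated pheromone.

```python
def pheromone_ranking(click_sessions, evaporation=0.1, deposit=1.0):
    """Treat each user session as an 'ant': clicked URLs receive a
    pheromone deposit, and all trails evaporate between sessions, so
    URLs clicked often and recently accumulate the strongest trails."""
    pheromone = {}
    for session in click_sessions:      # sessions in chronological order
        # Evaporation: older evidence gradually fades.
        for url in pheromone:
            pheromone[url] *= (1.0 - evaporation)
        # Reinforcement: each clicked URL gets a deposit.
        for url in session:
            pheromone[url] = pheromone.get(url, 0.0) + deposit
    # Rank URLs by accumulated pheromone (the "information scent").
    return sorted(pheromone, key=pheromone.get, reverse=True)
```

For example, with sessions `[["a", "b"], ["a"], ["a", "c"]]` the URL `a` is reinforced three times and ranks first, while the older click on `b` decays below the more recent click on `c`.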
The Artificial Bee Colony (ABC) algorithm (Karaboga 2005; Karaboga and Basturk
2008) is a population-based, nature-inspired algorithm modeled on the foraging
behavior of bees. In Abdullah and Hadi (2014), the authors put to the test an ABC
based approach for Web IR and showed that such an approach helps to cope with the
huge volume of information, as it prunes the search space by exclusion and thus
improves query processing and response time. Hassan and Hadi (2016) opted for
ABC to address Word Sense Disambiguation (WSD) in IR, using the simplified Lesk
algorithm. They showed that an ABC based approach yields a considerably better
response time and a more accurate relevance than the traditional algorithms.
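For completeness, a generic ABC skeleton is sketched below. It is a textbook-style illustration under our own assumptions (a positive fitness to be maximized over a box-bounded search space, e.g. query term weights), not the exact procedure of the cited works.

```python
import random

def abc_optimize(fitness, dim, bounds, n_sources=10, iters=60, limit=8):
    """Minimal Artificial Bee Colony sketch: food sources are candidate
    solutions (e.g., query term-weight vectors). Employed and onlooker
    bees perturb sources; scouts abandon exhausted ones."""
    lo, hi = bounds

    def rand_source():
        return [random.uniform(lo, hi) for _ in range(dim)]

    def neighbor(src, other):
        # Perturb one random dimension towards/away from another source.
        cand = src[:]
        j = random.randrange(dim)
        phi = random.uniform(-1.0, 1.0)
        cand[j] = min(hi, max(lo, src[j] + phi * (src[j] - other[j])))
        return cand

    sources = [rand_source() for _ in range(n_sources)]
    trials = [0] * n_sources
    best = max(sources, key=fitness)[:]

    for _ in range(iters):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            cand = neighbor(sources[i], random.choice(sources))
            if fitness(cand) > fitness(sources[i]):
                sources[i], trials[i] = cand, 0
            else:
                trials[i] += 1
        # Onlooker bees: fitness-proportional (roulette-wheel) choice.
        total = sum(fitness(s) for s in sources)
        for _ in range(n_sources):
            r, acc = random.uniform(0.0, total), 0.0
            for i, s in enumerate(sources):
                acc += fitness(s)
                if acc >= r:
                    break
            cand = neighbor(sources[i], random.choice(sources))
            if fitness(cand) > fitness(sources[i]):
                sources[i], trials[i] = cand, 0
        # Scout bees: replace sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i], trials[i] = rand_source(), 0
        # Keep a memory of the best solution found so far.
        cur = max(sources, key=fitness)
        if fitness(cur) > fitness(best):
            best = cur[:]
    return best
```

The search-space pruning described in Abdullah and Hadi (2014) corresponds to the scout phase: sources that no longer improve are excluded and replaced.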

7 IR Approaches Based on Other AI Frameworks

IR issues have been extensively addressed within other theoretical frameworks such as
Multi-agent systems, Game theory and Decision making, in order to provide a better search
experience for the user. In the following, we put forth some of the approaches that
were proposed to tackle IR tasks within these frameworks.
Multi-agent systems (MASs) are considered an important alternative to the traditional
IR models, as they proved to yield better results by providing scalability and
load balancing, using agents for the different IR tasks (search, filtering, ranking, etc.).
Enembreck et al. (2004) proposed to distribute the retrieving process over several
agents: a Personal Agent that manages the user’s favorite websites, a Library Agent
for document indexing and a Filter Agent which retrieves the required information
from the Search Agents and filters it based on the user's profile. They proved that
such a distributed process helps improve the search performance. The work proposed
in Trifa et al. (2017), addressed personalization within IR from a MAS perspec-
tive, integrating a Web scraping agent that tracks the user’s activities on the Web
and a crawling agent which collects information from social networks. These two
agents are used to predict the user’s search intentions.
Game theory, a branch of mathematics used in several scientific domains (see
chapter "Games in Artificial Intelligence" of Volume 2), is perceived as an
analytic method used to model the behavior of rational players who defend
their interests in well-defined situations. It consists in identifying the actors and
the strategies they undertake. Several works have addressed IR tasks from
a game theoretic point of view. Raifer et al. (2017) used game theory to
analyze publishers' behavior regarding the ranking of their documents on the Web. They
described this as a "ranking competition between documents' authors (publishers)
for certain queries". They believe that modelling the publishers' behavior
from a game theoretic perspective helps address the post-ranking process in
retrieval models. The work presented in Zhai (2016) proposed a game-theoretic
formulation to optimize the search engine performance on a search session and
not just for an individual query. The retrieval process consists of a search engine
and a user, each standing for a player in a cooperative game, with the aim of helping
the user satisfy her/his information need with minimum user effort and operation
cost. Hubert et al. (2018) proposed an unsupervised ranking model inspired by
real-life game and sport competition principles. Documents compete against each
other in tournaments, using features as evidence of relevance. Tournaments are
modeled as a sequence of matches involving pairs of documents. Once
a tournament ends, documents are ranked according to the number of matches
they won during the tournament.
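The tournament principle can be sketched as follows; the round-robin scheduling and the majority-vote match function are our own simplifying assumptions, not the exact TournaRank procedure of Hubert et al. (2018).

```python
from itertools import combinations

def tournament_rank(doc_features, compare):
    """Round-robin tournament: every pair of documents plays a match
    decided by `compare`; documents are ranked by number of wins."""
    wins = {doc: 0 for doc in doc_features}
    for a, b in combinations(doc_features, 2):
        winner = compare(doc_features[a], doc_features[b])
        if winner == 0:
            wins[a] += 1
        elif winner == 1:
            wins[b] += 1            # ties (-1) award no points
    return sorted(wins, key=wins.get, reverse=True)

def majority_features(fa, fb):
    """A document wins a match if it beats the other on more relevance
    features (returns 0, 1, or -1 for a tie)."""
    a_pts = sum(x > y for x, y in zip(fa, fb))
    b_pts = sum(y > x for x, y in zip(fa, fb))
    return 0 if a_pts > b_pts else 1 if b_pts > a_pts else -1
```

Each feature vector here stands for per-document relevance evidence (e.g. BM25 score, freshness, popularity); any pairwise decision rule can be substituted for the majority vote.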

Decision Making is relatively close to the IR domain, as it intervenes within several
tasks, from the choice of the query terms to the display of information.
Indeed, the fact that the user has to choose the query terms in order to obtain
the desired information can be considered as a decision making problem, and
several works have been proposed to assist the user with query reformulation
(Phillips-Wren and Forgionne 2004). Hosanagar (2011) addresses several IR issues from an
optimal operational decision making perspective for distributed IR by considering
the user’s preferences and the performance history of the distributed sources. He
proposed a utility-theoretic framework associating the waiting time cost, user’s
decision strategies and the information value. Moulahi et al. (2014) proposed
iAggregator, a fuzzy-based operator for multidimensional relevance aggregation
inspired by the Choquet integral operator (Choquet 1954). The latter has been
extensively used in multicriteria decision-making problems. The authors adapted
this operator, not widely used in IR, to evaluate multicriteria relevance aggregation
on a tweet search task. The criteria considered were topicality, recency and authority
(chapters "Multicriteria Decision Making", "Decision under Uncertainty" and
"Collective Decision Making" of Volume 1 discuss in extensive detail the different
aspects of decision making).
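To illustrate how a Choquet integral aggregates multidimensional relevance, the sketch below implements the standard discrete Choquet integral. The criterion names and weights are hypothetical, and the capacity shown is the additive special case (where the integral reduces to a weighted mean) rather than the interacting capacities used in iAggregator.

```python
def choquet(scores, capacity):
    """Discrete Choquet integral of criterion scores with respect to a
    fuzzy measure `capacity`, which maps subsets of criteria to [0, 1]
    (monotone, with capacity({}) = 0 and capacity(all criteria) = 1)."""
    items = sorted(scores.items(), key=lambda kv: kv[1])   # ascending
    remaining = set(scores)   # criteria whose score is >= current level
    total, prev = 0.0, 0.0
    for criterion, x in items:
        total += (x - prev) * capacity(frozenset(remaining))
        prev = x
        remaining.discard(criterion)
    return total

# Hypothetical criterion weights; with an additive capacity the Choquet
# integral reduces to a plain weighted mean.
weights = {"topicality": 0.5, "recency": 0.3, "authority": 0.2}

def additive_capacity(subset):
    return sum(weights[c] for c in subset)

score = choquet({"topicality": 0.8, "recency": 0.4, "authority": 0.6},
                additive_capacity)
```

The interest of the operator comes from non-additive capacities, which can express synergy or redundancy between relevance criteria (e.g. topicality and recency reinforcing each other), something a weighted mean cannot capture.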

8 Conclusion

The role of AI in IR has been discussed by several authors for decades. They all
assumed that the impact of AI remains limited, especially for ad hoc IR. In this chapter,
we attempted to give an overview of the AI topics that have been most used in IR. We
first addressed NLP, with its ability to improve document representation
through accurate text analysis models. Then came the logical wave in the late 1980s
and early 1990s, which basically involves inference systems and tools well adapted
to knowledge representation.
Fuzzy models were introduced in the 1970s but were mainly developed in the 1990s.
These models allowed a flexible formulation of queries. Meanwhile, evolutionary
approaches and simplistic neural network models, not involving learning
techniques, were proposed. However, despite the considerable number of works
based on these models, the improvements obtained in terms of performance were
not significant enough to validate these models outside academic frameworks.
It was not until the 2000s that Machine Learning techniques became involved in IR
tasks, particularly the ranking process ("Learning to Rank"). The ability of these
models to handle hundreds of features drew real interest to them, especially for Web
search engines. However, when applied to other ad hoc tasks, these models could not
compete in terms of performance with the traditional models, since the latter do not
require a learning phase.
Recently, we have been experiencing the Deep Learning trend, especially for
document representation. Here again, the reported results do not clearly show the
impact of these models, unlike what has been observed in the field of Image
Retrieval, where these neural models have brought considerable advances in
performance (Zhao et al. 2017).
To sum up, while IR models have proved effective for a particular information
management task, namely document retrieval, the impact of AI on this type of task
is still limited compared to statistical methods. One of the reasons observed by
Lewis and Sparck-Jones (1996) is that "Statistical IR has picked some of
the fruits of the tree, and what is left is much harder". Another explanation, pointed out
by Karen Sparck-Jones (1999), is "that they work because, in situations where
information demand, and hence supply, is underspecified, the right strategy is to be broadly
indicative, rather than aggressively analytic (as in decision trees)". Therefore, AI
is deemed especially useful for tasks that require fine-grained text analysis, such as
Opinion Mining, Question Answering, Entity Retrieval and Relation Retrieval.
Likewise, complex IR tasks, involving more than the usual retrieval modelling task,
will necessarily require AI tools (Yang et al. 2016). Tasks like conversational search
demand natural language understanding. IR systems are no longer limited to
the typical search process: they have to be able to explain the answers to queries and
handle users' questions. Information credibility is an important aspect, ensuring that
the retrieved information is trustworthy. Reasoning models can be the best resort
for such issues.

References

Abdulahhad K (2014) Information retrieval (IR) modeling by logic and lattice. Application to
conceptual IR. PhD thesis, Université de Grenoble
Abdulahhad K, Chevallet JP, Berrut C (2017) Logics, lattices and probability: the missing links to
information retrieval. Comput J 60(7):995–1018
Abdullah HS, Hadi MJ (2014) Artificial bee colony based approach for web information retrieval.
Eng Technol J 32(5):899–909
Ai Q, Yang L, Guo J, Croft WB (2016) Analysis of the paragraph vector model for information
retrieval. In: Proceedings of the 2016 ACM international conference on the theory of information
retrieval, ACM, pp 133–142
Al-Khateeb B, Al-Kubaisi AJ, Al-Janabi ST (2017) Query reformulation using wordnet and genetic
algorithm. In: 2017 Annual conference on new trends in information and communications
technology applications (NTICT), IEEE, pp 91–96
De Almeida HM, Gonçalves MA, Cristo M, Calado P (2007) A combined component approach for
finding collection-adapted ranking functions based on genetic programming. In: Proceedings of
the 30th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 399–406
Amati G, Ounis I (2000) Conceptual graphs and first order logic. Comput J 43(1):1–12
Amati G, Van Rijsbergen CJ (2002) Probabilistic models of information retrieval based on measuring
the divergence from randomness. ACM Trans Inf Syst (TOIS) 20(4):357–389
Araujo L, Pérez-Iglesias J (2010) Training a classifier for the selection of good query expansion
terms with a genetic algorithm. In: IEEE congress on evolutionary computation (CEC), IEEE, pp
1–8
Baziz M, Boughanem M, Aussenac-Gilles N (2005) Conceptual indexing based on document con-
tent representation, vol CoLIS’05. Springer, Berlin, Heidelberg, pp 171–186
Baziz M, Boughanem M, Prade H, Pasi G (2006) A fuzzy logic approach to information retrieval
using an ontology-based representation of documents. In: Sanchez E (ed) Fuzzy logic and the
semantic web, capturing intelligence, vol 1. Elsevier, Amsterdam, pp 363–377
Baziz M, Boughanem M, Pasi G, Prade H (2007) An information retrieval driven by ontology
from query to document expansion. In: Large Scale Semantic Access to Content (Text, Image,
Video, and Sound), Le Centre des Hautes Etudes Internationales D’Informatique Documentaire,
pp 301–313
Bebis G, Georgiopoulos M (1994) Feed-forward neural networks. IEEE Potentials 13(4):27–31
Belew RK (1987) A connectionist approach to conceptual information retrieval. In: Proceedings of
the 1st international conference on artificial intelligence and law, ACM, New York, NY, USA,
ICAIL ’87, pp 116–126
Belkin NJ, Marchetti PG (1989) Determining the functionality features of an intelligent interface
to an information retrieval system. In: Proceedings of the 13th annual international ACM SIGIR
conference on research and development in information retrieval, ACM, pp 151–177
Belkin NJ, Brooks HM, Daniels PJ (1987) Knowledge elicitation using discourse analysis. Int J
Man-Mach Stud 27(2):127–144
Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach
Learn Res 3(Feb):1137–1155
Bindal AK, Sanyal S (2012) Query optimization in context of pseudo relevant documents. In: 3rd
Italian Information Retrieval (IIR) workshop, EPFL-CONF-174006
Bookstein A (1980) Fuzzy requests: an approach to weighted boolean searches. J Assoc Inf Sci
Technol 31(4):240–247
Bordogna G, Pasi G (1993) A fuzzy linguistic approach generalizing boolean information retrieval:
a model and its evaluation. J Am Soc Inf Sci 44(2):70
Bordogna G, Pasi G (1995) Linguistic aggregation operators of selection criteria in fuzzy informa-
tion retrieval. Int J Intell Syst 10(2):233–248
Bordogna G, Pasi G (2001) An ordinal information retrieval model. Int J Uncertain Fuzziness
Knowl-Based Syst 09(supp01):63–75
Bordogna G, Pasi G (2005) Personalised indexing and retrieval of heterogeneous structured docu-
ments. Inf Retr 8(2):301–318
Bordogna G, Carrara P, Pasi G (1991) Query term weights as constraints in fuzzy information
retrieval. Inf Process Manage 27(1):15–26
Bordogna G, Carrara P, Pasi G (1992) Extending boolean information retrieval: a fuzzy model
based on linguistic variables. In: 1992 IEEE international conference on fuzzy systems, IEEE,
pp 769–776
Boughanem M, Loiseau Y, Prade H (2007) Refining aggregation functions for improving document
ranking in information retrieval. In: International conference on scalable uncertainty management,
Springer, pp 255–267
Boughanem M, Brini A, Dubois D (2009) Possibilistic networks for information retrieval. Int J
Approx Reason 50(7):957–968
Boughanem M, Soulé-Dupuy C (1992) A connexionist model for information retrieval. In:
Proceedings of DEXA, pp 260–265
Bruza PD, van der Gaag LC (1994) Index expression belief networks for information disclosure.
Int J Expert Syst 7(2):107–138
Buell DA (1985) A problem in information retrieval with fuzzy sets. J Assoc Inf Sci Technol
36(6):398–401
Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G (2005) Learning
to rank using gradient descent. In: Proceedings of the 22nd international conference on machine
learning, ACM, pp 89–96
Cao G, Nie JY, Bai J (2005) Integrating word relationships into language models. In: Proceedings of
the 28th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 298–305
Chawla S (2013) Personalised web search using aco with information scent. Int J Knowl Web Intell
4(2–3):238–259
Chawla S (2017) Web page ranking using ant colony optimisation and genetic algorithm for effective
information retrieval. Int J Swarm Intell 3(1):58–76
Chen H (1995) Machine learning for information retrieval: neural networks, symbolic learning, and
genetic algorithms. J Assoc Inf Sci Technol 46(3):194–216
Chen H, Dhar V (1989) Online query refinement on information retrieval systems: a process model
of searcher/system interactions. In: Proceedings of the 13th annual international ACM SIGIR
conference on Research and development in information retrieval, ACM, pp 115–133
Chevallet JP, Chiaramella Y (1998) Experiences in information retrieval modelling using structured
formalisms and modal logic. Information retrieval: uncertainty and logics. Springer, Berlin, pp
39–72
Chiaramella Y, Chevallet JP (1992) About retrieval models and logic. Comput J 35(3):233–242
Choquet G (1954) Theory of capacities. Annales de l’institut Fourier 5:131–295
Chowdhury A, McCabe MC (1998) Improving information retrieval systems using part of speech
tagging. Tech rep
Cleverdon CW, Keen M (1966) Factors determining the performance of indexing systems; volume
2, test results, aslib cranfield research project. Tech rep
Clinchant S, Gaussier E (2010) Information-based models for ad hoc ir. In: Proceedings of the
33rd international ACM SIGIR conference on research and development in information retrieval,
ACM, pp 234–241
Cole C (1998) Intelligent information retrieval: diagnosing information need. Part i. the theoretical
framework for developing an intelligent ir tool. Inf Process Manag 34(6):709–720
Colorni A, Dorigo M, Maniezzo V (1991) Distributed optimization by ant colonies. In: The 1st
European conference on artificial life, Paris, France
Cooper WS (1971) A definition of relevance for information retrieval. Inf Storage Retr 7(1):19–37
Crammer K, Singer Y (2002) Pranking with ranking. In: Advances in neural information processing
systems, pp 641–647
Crestani F (1998) Logical imaging and probabilistic information retrieval. In: Information Retrieval:
uncertainty and logics. Springer, Berlin, pp 247–279
Crestani F, van Rijsbergen CJ (1995) Information retrieval by logical imaging. J Doc 51(1):3–17
Crestani F, De Campos LM, Fernández-Luna JM, Huete JF (2003) A multi-layered bayesian network
model for structured document retrieval. In: European conference on symbolic and quantitative
approaches to reasoning and uncertainty. Springer, pp 74–86
Croft WB (1987) Approaches to intelligent information retrieval. Inf Process Manage 23(4):249–
254
Damiani E, Marrara S, Pasi G (2007) Fuzzyxpath: using fuzzy logic an ir features to approximately
query xml documents. In: International fuzzy systems association world congress. Springer, pp
199–208
De Baets B, Fodor J (1997) On the structure of uninorms and their residual implicators. In: 18th
Proceedings Linz seminar on fuzzy set theory, pp 81–87
De Campos LM, Fernández-Luna JM, Huete JF (2002) A layered bayesian network model for
document retrieval. In: European conference on information retrieval. Springer, pp 169–182
De Campos LM, Fernández-Luna JM, Huete JF (2003) Improving the efficiency of the bayesian
network retrieval model by reducing relationships between terms. Int J Uncertain Fuzziness
Knowl-Based Syst 11(supp01):101–116
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent
semantic analysis. J Am Soc Inf Sci 41(6):391
Dehghani M, Zamani H, Severyn A, Kamps J, Croft WB (2017) Neural ranking models with weak
supervision. In: Proceedings of the 40th international ACM SIGIR conference on research and
development in information retrieval, ACM, pp 65–74
Denoyer L, Gallinari P (2003) A belief networks-based generative model for structured documents.
an application to the xml categorization. In: International workshop on machine learning and data
mining in pattern recognition. Springer, pp 328–342
Diaz F, Mitra B, Craswell N (2016) Query expansion with locally-trained word embeddings. In:
Proceedings of the 54th annual meeting of the association for computational linguistics (volume
1: long papers), Association for computational linguistics, pp 367–377
Ding Y (2001) Ir and ai: the role of ontology. In: International conference of Asian digital libraries,
India
Dinh D, Tamine L, Boubekeur F (2013) Factors affecting the effectiveness of biomedical document
indexing and retrieval based on terminologies. Artif Intell Med 57(2):155–167
Dubois D, Prade H (1985) A review of fuzzy set aggregation connectives. Inf Sci 36(1–2):85–121
Dubois D, Prade H, Testemale C (1988) Weighted fuzzy pattern matching. Fuzzy Sets Syst
28(3):313–331
Dubois D, Fargier H, Prade H (1997) Beyond min aggregation in multicriteria decision:(ordered)
weighted min, discri-min, leximin. The ordered weighted averaging operators. Springer, Berlin,
pp 181–192
Enembreck F, Barthès JP, Ávila BC (2004) Personalizing information retrieval with multi-agent
systems. In: International workshop on cooperative Information Agents. Springer, pp 77–91
Evans DA, Zhai C (1996) Noun-phrase analysis in unrestricted text for information retrieval. In:
Proceedings of the 34th annual meeting on association for computational linguistics, Association
for computational linguistics, pp 17–24
Fagan J (1987a) Automatic phrase indexing for document retrieval. In: Proceedings of the 10th
annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 91–101
Fagan JL (1987b) Experiments in automatic phrase indexing for document retrieval: a comparison
of syntactic and non-syntactic methods. PhD thesis
Fang H (2008) A re-examination of query expansion using lexical resources. In: Proceedings of
ACL-08: HLT, pp 139–147
Firth JR (1957) A synopsis of linguistic theory, 1930-1955. Studies in linguistic analysis
Fox EA, Sharan S (1986) A comparison of two methods for soft boolean operator interpretation
in information retrieval. Tech. rep., Department of Computer Science, Virginia Tech,
Blacksburg, VA
Freund Y, Iyer R, Schapire RE, Singer Y (2003) An efficient boosting algorithm for combining
preferences. J Mach Learn Res 4(Nov):933–969
Fuhr N (1995) Probabilistic datalog: a logic for powerful retrieval methods. In: Proceedings of the
18th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 282–290
Fung R, Del Favero B (1995) Applying bayesian networks to information retrieval. Commun ACM
38(3):42–50
Ganguly D, Roy D, Mitra M, Jones GJ (2015) Word embedding based generalized language model
for information retrieval. In: Proceedings of the 38th international ACM SIGIR conference on
research and development in information retrieval, ACM, pp 795–798
Goldberg DE, Corruble V (1994) Algorithmes génétiques: exploration, optimisation et apprentis-
sage automatique. Ed. Addison-Wesley, France
Gonzalo J, Verdejo F, Chugur I, Cigarrin J (1998) Indexing with wordnet synsets can improve text
retrieval. In: Workshop on usage Of WordNet in natural language processing systems, pp 38–44
Gonzalo J, Li H, Moschitti A, Xu J (2014) Semantic matching in information retrieval. In: Proceed-
ings of the 37th international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 1296–1296
Gordon MD (1991) User-based document clustering by redescribing subject descriptions with a
genetic algorithm. J Am Soc Inf Sci 42(5):311
Guo J, Fan Y, Ai Q, Croft WB (2016) A deep relevance matching model for ad-hoc retrieval.
In: Proceedings of the 25th ACM international on conference on information and knowledge
management, ACM, pp 55–64
Hammache A, Boughanem M, Ahmed-Ouamer R (2014) Combining compound and single terms
under language model framework. Knowl Inf Syst 39(2):329–349
Harman D (1991) How effective is suffixing? J Am Soc Inf Sci 42(1):7
Hassan A, Hadi M (2016) Sense-based information retrieval using artificial bee colony approach.
Int J Appl Eng Res 11(15):8708–8713
Hayashi I, Nomura H, Yamasaki H, Wakami N (1992) Construction of fuzzy inference rules by ndf
and ndfl. Int J Approx Reason 6(2):241–266
Herbrich R (2000) Large margin rank boundaries for ordinal regression. Advances in large margin
classifiers. MIT Press, Oxford, pp 115–132
Herrera-Viedma E (2001) Modeling the retrieval process for an information retrieval system using
an ordinal fuzzy linguistic approach. J Assoc Inf Sci Technol 52(6):460–475
Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with appli-
cations to biology, control, and artificial intelligence. MIT press, Oxford
Hollink V, Kamps J, Monz C, De Rijke M (2004) Monolingual document retrieval for european
languages. Inf Retr 7(1–2):33–52
Hosanagar K (2011) Usercentric operational decision making in distributed information retrieval.
Inf Syst Res 22(4):739–755
Hu B, Lu Z, Li H, Chen Q (2014) Convolutional neural network architectures for matching natural
language sentences. In: Advances in neural information processing systems, pp 2042–2050
Huang PS, He X, Gao J, Deng L, Acero A, Heck L (2013) Learning deep structured semantic
models for web search using clickthrough data. In: Proceedings of the 22nd ACM international
conference on Conference on information and knowledge management, ACM, pp 2333–2338
Hubert G, Pitarch Y, Pinel-Sauvagnat K, Tournier R, Laporte L (2018) Tournarank: when retrieval
becomes document competition. Inf Process Manag 54(2):252–272
Huibers TWC (1994) Situations, a general framework for studying information retrieval, vol 1994.
Unknown Publisher
Hunter A (1997) Using default logic for lexical knowledge. Qualitative and quantitative practical
reasoning. Springer, Berlin, pp 322–335
Indrawan M, Ghazfan D, Srinivasan B (1996) Using bayesian networks as retrieval engines. In:
Proceedings of the text retrieval conference TREC’96
Jabeur LB, Tamine L, Boughanem M (2012) Active microbloggers: Identifying influencers, leaders
and discussers in microblogging networks. In: International symposium on string processing and
information retrieval. Springer, pp 111–117
Jones KS (1983) Intelligent retrieval. Proceedings of Informatics
Jones KS (1991) The role of artificial intelligence in information retrieval. J Am Soc Inf Sci 42(8):558
Jones KS (1999) Information retrieval and artificial intelligence. Artif Intell 114(1–2):257–281
Jozefowicz R, Vinyals O, Schuster M, Shazeer N, Wu Y (2016) Exploring the limits of language
modeling. arXiv:1602.02410
Karaboga D (2005) An idea based on honey bee swarm for numerical optimization. Tech. rep.,
Technical report-tr06, Erciyes university, engineering faculty, computer engineering department
Karaboga D, Basturk B (2008) On the performance of artificial bee colony (abc) algorithm. Appl
Soft Comput 8(1):687–697
Kennedy J, Eberhart R (1995) Particle swarm optimization. In: 1995 Proceedings of IEEE interna-
tional conference on neural networks, vol 4, pp 1942–1948
Kenter T, De Rijke M (2015) Short text similarity with word embeddings. In: Proceedings of the
24th ACM international on conference on information and knowledge management, ACM, pp
1411–1420
Khennak I, Drias H (2017) An accelerated pso for query expansion in web information retrieval:
application to medical dataset. Appl Intell 1–16
Kim S, Zhang BT (2003) Genetic mining of html structures for effective web-document retrieval.
Appl Intell 18(3):243–256
Kim Y, Jernite Y, Sontag D, Rush AM (2015) Character-aware neural language models. In: AAAI,
pp 2741–2749
Kiros R, Zhu Y, Salakhutdinov RR, Zemel R, Urtasun R, Torralba A, Fidler S (2015) Skip-thought
vectors. In: Advances in neural information processing systems, pp 3294–3302
Kraaij W, Pohlmann R (1996) Viewing stemming as recall enhancement. In: Proceedings of the
19th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 40–48
Kraft DH, Buell DA (1983) Fuzzy sets and generalized boolean retrieval systems. Int J Man-Mach
Stud 19(1):45–56
Kraft DH, Colvin E (2017) Fuzzy information retrieval. Synthesis lectures on information concepts,
retrieval, and services. Morgan & Claypool Publishers, San Rafael
Kraft DH, Bordogna G, Pasi G (1999) Fuzzy set techniques in information retrieval. Fuzzy sets in
approximate reasoning and information systems. Springer, Berlin, pp 469–510
Kraft DH, Colvin E, Bordogna G, Pasi G (2015) Fuzzy information retrieval systems: a historical
perspective. Fifty years of fuzzy logic and its applications. Springer, Berlin, pp 267–296
Kripke SA (1963) Semantic analysis of modal logic I: Normal modal and propositional calculi.
Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 9:67–96
Krovetz R (1993) Viewing morphology as an inference process. In: Proceedings of the 16th annual
international ACM SIGIR conference on research and development in information retrieval,
ACM, pp 191–202
Krovetz R (1997) Homonymy and polysemy in information retrieval. In: Proceedings of the 35th
annual meeting of the association for computational linguistics and eighth conference of the
European chapter of the association for computational linguistics, Association for computational
linguistics, Stroudsburg, PA, USA, ACL ’98, pp 72–79
Krovetz R, Croft WB (1992) Lexical ambiguity and information retrieval. ACM Trans Inf Syst
10(2):115–141
Kwok KL (1989) A neural network for probabilistic information retrieval. In: Proceedings of the
12th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, New York, NY, USA, SIGIR ’89, pp 21–30
Lalmas M (1998) Logical models in information retrieval: introduction and overview. Inf Process
Manag 34(1):19–33
Lalmas M, van Rijsbergen K (1993) A logical model of information retrieval based on situation
theory. In: 14th information retrieval colloquium. Springer, pp 1–13
Lavrenko V, Croft WB (2017) Relevance-based language models. In: ACM SIGIR Forum, ACM,
vol 51, pp 260–267
Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International
conference on machine learning, pp 1188–1196
Lee H, Grosse R, Ranganath R, Ng AY (2009) Convolutional deep belief networks for scalable unsu-
pervised learning of hierarchical representations. In: Proceedings of the 26th annual international
conference on machine learning, ACM, pp 609–616
Lewis DD (1992) Representation and learning in information retrieval. PhD thesis, University of
Massachusetts at Amherst
Lewis DD, Jones KS (1996) Natural language processing for information retrieval. Commun ACM
39(1):92–101
Li H (2011) Learning to rank for information retrieval and natural language processing. Synth Lect
Hum Lang Technol 4(1):1–113
Li H, Xu J et al (2014) Semantic matching in search. Found Trends® Inf Retr 7(5):343–469
Li P, Wu Q, Burges CJ (2008) Mcrank: learning to rank using multiple classification and gradient
boosting. In: Advances in neural information processing systems, pp 897–904
Lioma C, Blanco R (2009) Part of speech based term weighting for information retrieval. In:
European conference on information retrieval. Springer, pp 412–423
Liu S, Liu F, Yu C, Meng W (2004) An effective approach to document retrieval via utilizing
wordnet and recognizing phrases. In: Proceedings of the 27th annual international ACM SIGIR
conference on research and development in information retrieval, ACM, pp 266–272
Liu S, Yu C, Meng W (2005) Word sense disambiguation in queries. In: Proceedings of the 14th
ACM international conference on information and knowledge management, ACM, pp 525–532
Liu TY et al (2009) Learning to rank for information retrieval. Found Trends® Inf Retr 3(3):225–331
Loiseau Y, Prade H, Boughanem M (2004) Qualitative pattern matching with linguistic terms. AI
Commun 17(1):25–34
Losada DE, Barreiro A (2001) A logical model for information retrieval based on propositional
logic and belief revision. Comput J 44(5):410–424
Losada DE, Barreiro A (2003) Propositional logic representations for documents and queries: a
large-scale evaluation. In: European conference on information retrieval, Springer, pp 219–234
Lu Z, Li H (2013) A deep architecture for matching short texts. In: Advances in neural information
processing systems, pp 1367–1375
Luhn HP (1957) A statistical approach to mechanized encoding and searching of literary informa-
tion. IBM J Res Dev 1(4):309–317
Lv Y, Zhai C (2009) Positional language models for information retrieval. In: Proceedings of the
32nd international ACM SIGIR conference on Research and development in information retrieval,
ACM, pp 299–306
MacAvaney S, Hui K, Yates A (2017) An approach for weakly-supervised deep information retrieval.
In: Workshop on neural information retrieval (Neu-IR ’17) at SIGIR 2017
Maes P et al (1994) Agents that reduce work and information overload. Commun ACM 37(7):30–40
Mandl T (2009) Artificial intelligence for information retrieval. In: Encyclopedia of artificial intel-
ligence, IGI Global, pp 151–156
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, Cambridge, UK
Marrara S, Pasi G, Viviani M (2017) Aggregation operators in information retrieval. Fuzzy Sets
Syst 324:3–19
Mauldin ML (1991) Retrieval performance in ferret a conceptual information retrieval system. In:
Proceedings of the 14th annual international ACM SIGIR conference on research and develop-
ment in information retrieval, ACM, pp 347–355
Meghini C, Straccia U (1996) A relevance terminological logic for information retrieval. In: Proceedings of the 19th annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 197–205
Meghini C, Sebastiani F, Straccia U, Thanos C (1993) A model of information retrieval based on
a terminological logic. In: Proceedings of the 16th annual international ACM SIGIR conference
on research and development in information retrieval, ACM, pp 298–307
Metzler D, Croft WB (2005) A markov random field model for term dependencies. In: Proceedings of
the 28th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 472–479
Mihalcea R, Moldovan D (2000) Semantic indexing using wordnet senses. In: Proceedings of the
ACL-2000 workshop on recent advances in natural language processing and information retrieval:
held in conjunction with the 38th annual meeting of the association for computational linguistics
- volume 11, Association for computational linguistics, Stroudsburg, PA, USA, RANLPIR ’00,
pp 35–45
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. CoRR abs/1301.3781
Mitra B, Craswell N (2017) An introduction to neural information retrieval. Found Trends® Inf Retr, pp 1–120
Mitra B, Nalisnick ET, Craswell N, Caruana R (2016) A dual embedding space model for document
ranking. CoRR abs/1602.01137
Mitra B, Diaz F, Craswell N (2017) Learning to match using local and distributed representations
of text for web search. In: Proceedings of the 26th international conference on world wide web,
international world wide web conferences steering committee, pp 1291–1299
Miyamoto S (1990) Information retrieval based on fuzzy associations. Fuzzy Sets Syst 38(2):191–
205
Molinari A, Pasi G (1996) A fuzzy representation of HTML documents for information retrieval systems. In: Proceedings of the fifth IEEE international conference on fuzzy systems, IEEE, vol 1, pp 107–112
Moon J, Shon T, Seo J, Kim J, Seo J (2004) An approach for spam e-mail detection with support
vector machine and n-gram indexing. In: International symposium on computer and information
sciences, Springer, pp 351–362
Moulahi B, Tamine L, Yahia SB (2014) iaggregator: multidimensional relevance aggregation based
on a fuzzy operator. J Assoc Inf Sci Technol 65(10):2062–2083
Nalisnick E, Mitra B, Craswell N, Caruana R (2016) Improving document ranking with dual word
embeddings. In: Proceedings of the 25th international conference companion on world wide web,
international world wide web conferences steering committee, pp 83–84
Nallapati R (2004) Discriminative models for information retrieval. In: Proceedings of the 27th
annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 64–71
Nie J (1988) An outline of a general model for information retrieval systems. In: Proceedings of the
11th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 495–506
Nie J (1989) An information retrieval model based on modal logic. Inf Process Manag 25(5):477–
491
Nie JY, Brisebois M, Lepage F (1995) Information retrieval as counterfactual. Comput J 38(8):643–
657
Onal KD, Zhang Y, Altingovde IS, Rahman MM, Karagoz P, Braylan A, Dang B, Chang HL, Kim
H, McNamara Q et al (2017) Neural information retrieval: at the end of the early years. Inf Retr
J 1–72
Pang L, Lan Y, Guo J, Xu J, Cheng X (2016) A study of matchpyramid models on ad-hoc retrieval.
In: ACM SIGIR workshop on neural information retrieval (Neu-IR)
Pasi G (2009) Fuzzy models. Encyclopedia of database systems. Springer, Berlin, pp 1205–1209
Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation. In: Pro-
ceedings of the 2014 conference on empirical methods in natural language processing (EMNLP),
pp 1532–1543
Phillips-Wren GE, Forgionne GA (2004) Intelligent decision making in information retrieval. In:
International conference on intelligent information and engineering systems. Springer, pp 103–
109
Picard J (1999) Logic as a tool in a term matching information retrieval system. In: Proceedings of
the workshop on logical and uncertainty models for information systems, pp 77–90
Picard J, Savoy J (2000) A logical information retrieval model based on a combination of propo-
sitional logic and probability theory. In: Soft computing in information retrieval, Springer, pp
225–258
Ponte JM, Croft WB (1998) A language modeling approach to information retrieval. In: Proceed-
ings of the 21st annual international ACM SIGIR conference on research and development in
information retrieval, ACM, pp 275–281
Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137
Qi G, Pan JZ (2008) A tableau algorithm for possibilistic description logic. In: Asian semantic web conference, Springer, pp 61–75
Raifer N, Raiber F, Tennenholtz M, Kurland O (2017) Information retrieval meets game theory: the ranking competition between documents’ authors. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, ACM, pp 465–474
Rekabsaz N (2016) Enhancing information retrieval with adapted word embedding. In: Proceedings
of the 39th international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 1169–1169
Ribeiro BA, Muntz R (1996) A belief network model for IR. In: Proceedings of the 19th annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 253–260
van Rijsbergen CJ (1989) Towards an information logic. In: ACM SIGIR Forum, ACM, vol 23, pp 77–86
Robertson SE, Walker S (1994) Some simple effective approximations to the 2-poisson model for
probabilistic weighted retrieval. In: Proceedings of the 17th annual international ACM SIGIR
conference on research and development in information retrieval, Springer-Verlag New York,
Inc., pp 232–241
Rölleke T, Fuhr N (1996) Retrieval of complex objects using a four-valued logic. In: Proceedings of
the 19th annual international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 206–214
Roy D, Paul D, Mitra M, Garain U (2016) Using word embeddings for automatic query expansion.
In: ACM SIGIR workshop on neural information retrieval
Salton G (1991) Developments in automatic text retrieval. Science 253(5023):974–980
Salton G, McGill MJ (1986) Introduction to modern information retrieval. McGraw-Hill Inc, New
York, NY, USA
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM
18(11):613–620
Sanchez E (1989) Importance in knowledge systems. Inf Syst 14(6):455–464
Sanderson M (1994) Word sense disambiguation and information retrieval. In: Proceedings of the
17th annual international ACM SIGIR conference on research and development in information
retrieval, Springer-Verlag New York, Inc., pp 142–151
Sanderson M (2000) Retrieving with good sense. Inf Retr 2(1):49–69
Sathya SS, Simon P (2010) A document retrieval system with combination terms using genetic
algorithm. Int J Comput Electr Eng 2(1):1
Schütze H, Pedersen JO (1995) Information retrieval based on word senses. In: Proceedings of the fourth annual symposium on document analysis and information retrieval, pp 161–175
Sebastiani F (1994) A probabilistic terminological logic for modelling information retrieval. In:
SIGIR94, Springer, pp 122–130
Sebastiani F (1998) On the role of logic in information retrieval. Inf Process Manag 34(1):1–18
Shashua A, Levin A (2003) Ranking with large margin principle: two approaches. In: Advances in
neural information processing systems, pp 961–968
Shen Y, He X, Gao J, Deng L, Mesnil G (2014) Learning semantic representations using convolu-
tional neural networks for web search. In: Proceedings of the 23rd international conference on
world wide web, ACM, pp 373–374
Sheridan P, Smeaton AF (1992) The application of morpho-syntactic language processing to effec-
tive phrase matching. Inf Process Manag 28(3):349–369
Shi L, Nie JY (2009) Integrating phrase inseparability in phrase-based model. In: Proceedings of the
32nd international ACM SIGIR conference on research and development in information retrieval,
ACM, New York, NY, USA, SIGIR ’09, pp 708–709
Silva I, Ribeiro-Neto B, Calado P, Moura E, Ziviani N (2000) Link-based and content-based evidential information in a belief network model. In: Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 96–103
Sowa JF (1983) Conceptual structures: information processing in mind and machine. Addison-Wesley, Reading, MA
Sparck Jones K (1972) A statistical interpretation of term specificity and its application in retrieval.
J Doc 28(1):11–21
Stokoe C, Oakes MP, Tait J (2003) Word sense disambiguation in information retrieval revisited. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 159–166
Strzalkowski T (1995) Natural language information retrieval. Inf Process Manag 31(3):397–417
Tamine L, Chrisment C, Boughanem M (2003) Multiple query evaluation based on an enhanced genetic algorithm. Inf Process Manag 39(2):215–231
Tamir DE, Rishe ND, Kandel A (2015) Fifty years of fuzzy logic and its applications, vol 326.
Springer, Berlin
Tao T, Zhai C (2007) An exploration of proximity measures in information retrieval. In: Proceed-
ings of the 30th annual international ACM SIGIR conference on research and development in
information retrieval, ACM, pp 295–302
Thiel U, Müller A (1996) Why was this item retrieved? New ways to explore retrieval results. In: Information retrieval and hypertext. Springer, Berlin, pp 181–201
Tong X, Zhai C, Milic-Frayling N, Evans DA (1997) Evaluation of syntactic phrase indexing. In:
The fifth text retrieval conference (TREC-5)
Trifa A, Sbaï AH, Chaari WL (2017) Evaluate a personalized multi agent system through social
networks: web scraping. In: 2017 IEEE 26th international conference on enabling technologies:
infrastructure for collaborative enterprises (WETICE), IEEE, pp 18–20
Tsai MF, Liu TY, Qin T, Chen HH, Ma WY (2007) Frank: a ranking method with fidelity loss. In:
Proceedings of the 30th annual international ACM SIGIR conference on research and develop-
ment in information retrieval, ACM, pp 383–390
Turtle H, Croft WB (2017) Inference networks for document retrieval. In: ACM SIGIR Forum, ACM, vol 51, pp 124–147
van Rijsbergen CJ (1986) A non-classical logic for information retrieval. Comput J 29(6):481–485
Vickery A, Brooks HM (1987) Plexus-the expert system for referral. Inf Process Manag 23(2):99–
117
Voorhees EM (1993) Using wordnet to disambiguate word senses for text retrieval. In: Proceed-
ings of the 16th annual international ACM SIGIR conference on research and development in
information retrieval, ACM, pp 171–180
Voorhees EM (1994) Query expansion using lexical-semantic relations. In: Proceedings of the
17th annual international ACM SIGIR conference on research and development in information
retrieval, Springer-Verlag New York, Inc., pp 61–69
Vrajitoru D (2013) Large population or many generations for genetic algorithms? In: Soft computing in information retrieval: techniques and applications, vol 50, p 199
Vulić I, Moens MF (2015) Monolingual and cross-lingual information retrieval models based on
(bilingual) word embeddings. In: Proceedings of the 38th international ACM SIGIR conference
on research and development in information retrieval, ACM, pp 363–372
Wan S, Lan Y, Guo J, Xu J, Pang L, Cheng X (2016) A deep architecture for semantic matching
with multiple positional sentence representations. In: AAAI, pp 2835–2841
Wang C, Akella R (2015) Concept-based relevance models for medical and semantic informa-
tion retrieval. In: Proceedings of the 24th ACM international on conference on information and
knowledge management, ACM, New York, NY, USA, CIKM ’15, pp 173–182
Wu Q, Burges CJ, Svore KM, Gao J (2010) Adapting boosting for information retrieval measures.
Inf Retr 13(3):254–270
Xia F, Liu TY, Wang J, Zhang W, Li H (2008) Listwise approach to learning to rank: theory and
algorithm. In: Proceedings of the 25th international conference on machine learning, ACM, pp
1192–1199
Yager RR (1988) On ordered weighted averaging aggregation operators in multicriteria decision-
making. IEEE Trans Syst Man Cybern 18(1):183–190
Yang GH, Sloan M, Wang J (2016) Dynamic information retrieval modeling. Synth Lect Inf Concepts Retr Serv 8(3):1–144
Yang J, Korfhage RR, Rasmussen E (1993) Query improvement in information retrieval using
genetic algorithms: a report on the experiments of the trec project. In: Proceedings of the 1st text
retrieval conference, pp 31–58
Yin W, Schütze H, Xiang B, Zhou B (2015) ABCNN: attention-based convolutional neural network for modeling sentence pairs. arXiv preprint arXiv:1512.05193
Yue Y, Finley T, Radlinski F, Joachims T (2007) A support vector method for optimizing average precision. In: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 271–278
Zakos J (2005) A novel concept and context-based approach for web information retrieval. PhD
thesis
Zamani H, Croft WB (2016a) Embedding-based query language models. In: Proceedings of the
2016 ACM international conference on the theory of information retrieval, ACM, pp 147–156
Zamani H, Croft WB (2016b) Estimating embedding vectors for queries. In: Proceedings of the
2016 ACM international conference on the theory of information retrieval, ACM, pp 123–132
Zamani H, Mitra B, Song X, Craswell N, Tiwary S (2018) Neural ranking models with multiple
document fields. In: Proceedings of the eleventh ACM international conference on web search
and data mining, ACM, pp 700–708
Zhai C (1997) Fast statistical parsing of noun phrases for document indexing. In: Proceedings
of the fifth conference on applied natural language processing, Association for computational
linguistics, pp 312–319
Zhai C (2008) Statistical language models for information retrieval. Synthesis lectures on human
language technologies, vol 1(1), pp 1–141
Zhai C (2016) Towards a game-theoretic framework for text data retrieval. IEEE Data Eng Bull
39(3):51–62
Zhao B, Feng J, Wu X, Yan S (2017) A survey on deep learning-based fine-grained object classifi-
cation and semantic segmentation. Int J Autom Comput 14(2):119–135
Zhao J, Yun Y (2009) A proximity language model for information retrieval. In: Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval, ACM, pp 291–298
Zheng G, Callan J (2015) Learning to reweight terms with distributed representations. In: Proceed-
ings of the 38th international ACM SIGIR conference on research and development in information
retrieval, ACM, pp 575–584
Zuccon G, Azzopardi L, van Rijsbergen CJ (2009) Revisiting logical imaging for information
retrieval. In: Proceedings of the 32nd international ACM SIGIR conference on Research and
development in information retrieval, ACM, New York, NY, USA, SIGIR ’09, pp 766–767
Zuccon G, Koopman B, Bruza P, Azzopardi L (2015) Integrating and evaluating neural word embed-
dings in information retrieval. In: Proceedings of the 20th Australasian document computing
symposium, ACM, p 12