Most of the time, queries to search engines are polysemous with respect to the data: different documents can match the query while interpreting it with different meanings. Instead of combining all the responses in a single flat list of results, CELLO exposes to the user the "communities of meanings" present in the result set, i.e. the different points of view from which the query can be interpreted. This allows the user to target the query more efficiently by refining it.
In this paper, we compare the topological structure of lexical networks using a method based on random walks. Instead of characterising pairs of vertices according only to whether they are connected or not, we measure their structural proximity by evaluating the relative probability of reaching one vertex from the other via a short random walk. This proximity between vertices is the basis on which we can compare the topological structure of lexical networks, because it outlines the similar dense zones of the graphs.
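The proximity described in the abstract above can be illustrated with a minimal sketch. It assumes an undirected graph stored as an adjacency dictionary and a uniform random walk. The function names, the toy graph, and the choice of three steps are all hypothetical, and the sketch returns the raw arrival probability, since the abstract does not say how the "relative" probability is normalised.

```python
def walk_distribution(graph, start, steps):
    """Distribution over vertices after `steps` uniform random-walk steps from `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = {}
        for vertex, prob in dist.items():
            neighbours = graph[vertex]
            if not neighbours:
                # Assumption: a walker on an isolated vertex stays where it is.
                nxt[vertex] = nxt.get(vertex, 0.0) + prob
                continue
            share = prob / len(neighbours)
            for n in neighbours:
                nxt[n] = nxt.get(n, 0.0) + share
        dist = nxt
    return dist


def proximity(graph, a, b, steps=3):
    """Probability of standing on `b` after a short walk started at `a`."""
    return walk_distribution(graph, a, steps).get(b, 0.0)


# Toy graph (invented): two triangles, a-b-c and d-e-f, joined by the edge c-d.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

same_zone = proximity(graph, "a", "b")   # both vertices in the first triangle
other_zone = proximity(graph, "a", "e")  # vertices in different triangles
print(same_zone > other_zone)
```

In this toy graph, a short walk from `a` is more likely to end on `b` than on `e`, which is the sense in which such a measure outlines dense zones rather than only recording whether two vertices are adjacent.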
In this paper, we propose word sense disambiguation methods based on lexical substitution, used for Task 1 of the SemDis2014 workshop. These methods run short random walks on unipartite or bipartite networks. Some of these methods use only graphs automatically built from corpora (unsupervised methods); others also use graphs built from handcrafted resources filled in by lexicographers or by crowds (supervised methods).
We propose a model to compute two measurements of semantic efficiency of verbs as action labels. It is based on the exploration of the specific structure of synonymy networks of verbs. We use these measurements to analyse and compare the semantic efficiency of [Children/Adults] productions in action labelling tasks, in French and Mandarin. The combination of these two measurements leads to a generic score of semantic efficiency, Skillex. Assigned to participants of the Approx protocol experiment, this score enables us to accurately classify them into Children and Adults categories, be they French or Mandarin native speakers.
This article presents SLAM, an Automatic Solver for Lexical Metaphors like “déshabiller* une pomme” (to undress* an apple). SLAM calculates a conventional solution for these productions. To do so, SLAM has to intersect the paradigmatic axis of the metaphorical verb “déshabiller*”, on which “peler” (“to peel”) lies close by, with a syntagmatic axis that comes from a corpus where “peler une pomme” (to peel an apple) is semantically and syntactically regular. We test this model on DicoSyn, which is a “small world” network of synonyms, to compute the paradigmatic axis and on Frantext.20, a French corpus, to compute the syntagmatic axis. Further, we evaluate the model with a sample of an experimental corpus of the database of Flexsem.
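The intersection of the two axes described above can be sketched as follows. This is only an illustration under stated assumptions, not SLAM itself: the paradigmatic axis is modelled as a dictionary of verbs with invented proximity scores, the syntagmatic axis as a set of verbs assumed to be attested with the noun in a corpus, and the function name `solve_metaphor` is hypothetical.

```python
def solve_metaphor(verb, noun, paradigmatic, syntagmatic):
    """Rank verbs close to `verb` and keep only those attested with `noun`."""
    attested = syntagmatic.get(noun, set())
    # Sort paradigmatic neighbours from closest to farthest.
    ranked = sorted(paradigmatic.get(verb, {}).items(), key=lambda kv: -kv[1])
    return [candidate for candidate, _ in ranked if candidate in attested]


# Hypothetical paradigmatic axis: neighbours of the metaphorical verb with
# invented proximity scores (not taken from DicoSyn).
paradigmatic = {
    "déshabiller": {
        "dévêtir": 0.9,
        "peler": 0.6,
        "dépouiller": 0.5,
        "éplucher": 0.4,
    }
}

# Hypothetical syntagmatic axis: verbs assumed to occur with the noun as
# object in a corpus (not taken from Frantext).
syntagmatic = {"pomme": {"peler", "éplucher", "manger", "couper"}}

candidates = solve_metaphor("déshabiller", "pomme", paradigmatic, syntagmatic)
print(candidates)
```

With this toy data, "dévêtir" is the closest neighbour of "déshabiller" but is filtered out because it is not attested with "pomme", so "peler" comes first among the remaining candidates.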
We compare a psycholinguistic approach to mental lexicon organization with a computational approach to implicit lexical organization as found in dictionaries. In this work, we associate dictionaries with ‘small world’ graphs. This multidisciplinary approach aims at showing that the implicit structure of dictionaries, mathematically identified, fits the way young children categorize. These dictionary graphs might therefore be considered as ‘cognitive artifacts’. This shows the importance of semantic proximity in both the cognitive and the computational organization of the verb lexicon.
In this methodological investigation, we examined the influence of cultural background on viewers’ interpretations of visual stimuli and verbs elicited by these materials. French and Mandarin native speakers’ interpretations of seventeen short movies, produced by French speakers and depicting various state-changing actions, were collected with a 25-item cultural protocol. A slight difference in the familiarity rating of movies was found between French and Mandarin participants. We also found that Mandarin speakers used more general verbs when describing actions depicted by movies with a low familiarity rating, and that children used more conventional forms with movies of higher familiarity. Hierarchical cluster analyses were conducted to select movies that were matched in action interpretations by both language groups.
Going cross-linguistic is an important but challenging track for validating a computational model of lexical organization. Our starting point is a computational model that has been established and validated on French, and we attempted to apply it to Mandarin. The main ingredients of this model are computational lexical resources and a psycho-linguistic protocol involving extra-linguistic material (video clips). At this stage, all the psycho-linguistic experiments have been run and most of the resources have been built, but some comparative analyses are not fully completed. Still, the project is advanced enough to report on the issues we had to address while performing this cross-linguistic move, concerning the resources, the analysis of the data and the data alignment across languages.
Papers by yann desalle
Talks by yann desalle