Interested in knowledge acquisition for artificial intelligence and in the role of knowledge and emotion understanding in ethical machines.
Phone: +81-11-706-6535
Address: Language Media Laboratory
Research Group of Information Media Science and Technology
Division of Media and Network Technologies
Graduate School of Information Science and Technology
Hokkaido University,
Kita-ku Kita 14 Nishi 9,
060-0814 Sapporo, Japan
Social media has become an essential part of our lives. Pictograms (emoticons/emojis) are widely used in social media as a medium for visually expressing emotions. In this paper, we propose an emoji-aware attention-based GRU network model for sentiment analysis of Weibo, the most popular Chinese social media platform. First, we analyzed the usage of 67 emojis with facial expressions. By performing a polarity annotation with a new “humorous type” added, we confirmed that 23 emojis can be considered more humorous than positive or negative. On this basis, we applied the emoji polarity to an attention-based GRU network model for sentiment analysis of undersized labelled data. Our experimental results show that the proposed method can significantly improve the performance of predicting sentiment polarity on social media.
Advances in Intelligent Systems and Computing, 2019
This paper is an attempt at analyzing how much religious vocabulary (in this case, Buddhist vocabulary taken from a large-scale dictionary of Buddhist terms available online) is present in everyday Japanese social space (in this case, in a repository of blog entries from the Ameba blog service) and thus in the consciousness of people. We also investigate what associations (positive or negative) it generates, thus indicating the connotations associated with several Buddhist terms – whether expressions containing Buddhist vocabulary are considered proper or not from a moral point of view – as well as the emotional response of Internet users to Buddhist terminology.
We begin this paper by revisiting the differences between the descriptive and normative approaches to ethics and challenge the usefulness of the latter for the field of machine ethics. We continue this reasoning, present our insights on previous trends in this field, and highlight the need for a change in approach. We argue for an experimental approach to machine ethics by introducing a moral reasoning system based on the Aristotelian identification of civic rhetoric with a common-sense base, and present it as a step forward in machine ethics research that bypasses theoretical disputes between philosophers. We finish the paper with an introduction to the CAMILLA project for a web-crawling algorithm as the first step towards creating an Aristotelian explicit moral agent.
Internet slang is an informal language used in everyday online communication which is quickly adopted or discarded by new generations. Similarly, pictograms (emoticons/emojis) have been widely used in social media as a means of graphical expression of emotions. People can convey delicate nuances through textual information when it is supported with emoticons. Furthermore, we noticed that when people use new words and pictograms, they tend to express a kind of humorous emotion which is difficult to classify clearly as positive or negative. Therefore, it is important to fully understand the influence of Internet slang and emoticons on social media. In this paper, we propose a machine learning method considering Internet slang and emoticons for sentiment analysis of Weibo, the most popular Chinese social media platform. In the first step, we collected 448 frequent Internet slang expressions as a slang lexicon, then we converted the 109 Weibo emoticons into textual features creating...
This paper presents Cockney rhyming slang recognition and conversion modules of a cyberbullying detection system. First, we introduce the concept of rhyming slang, analyze its phrasal constructions and discuss the usefulness of features of rhyming slang, such as its resemblance to code-mixing. Second, we describe the corpus and phrasal rhyming lexicon created for the purpose of the research and present the results of experiments on recognizing rhyming slang constructions and transforming them into casual English sentences. Finally, we process the obtained output and verify the usability of the modules in a cyberbullying and crime detection system.
Although we are still quite far from constructing a human-like conversational system, researchers all over the world keep investigating numerous factors that make up conversations between humans. In this work we focus on two such factors: humor and metaphors. Numerous research projects exist in the area of metaphor understanding and generation. We propose a unique approach to this subject, based on the observation that humans can not only properly understand and generate metaphors, but also make fun of their misunderstandings. For instance, the utterance “you have legs like a deer” can be understood as a compliment (“long and graceful”) as well as an insult (“very hairy”). If used properly, such a misunderstanding can serve as a source of humor in human-computer conversations. In this paper we first briefly describe our previous research on humor-equipped conversational systems. We then summarize the state of the art in metaphor processing research, and mention works showing that the sa...
In this position paper we introduce our approach to positive computing by developing and integrating methods for future assistant and companion agents which could help us a) avoid making mistakes due to biases caused by insufficient knowledge, b) be more empathic and righteous, c) be more sensitive and thoughtful. We present text processing techniques for automatic discovery of possible reasoning errors and provide hints to make users doubt their beliefs when there is a possibility of harm. We present existing sources and methods, discuss how natural language processing technologies could contribute to various aspects of well-being by giving examples of systems we develop, and describe the strengths and weaknesses of our approach.
One of the essential parts of a second language curriculum is teaching vocabulary. Many existing techniques have tried to facilitate word acquisition, but one method which has received less attention is code-switching. In this paper, we present an experimental system for computer-assisted vocabulary learning in context using a code-switching based method, focusing on teaching Japanese vocabulary to foreign language learners. First, we briefly introduce our Co-MIX method for vocabulary teaching systems, which uses the code-switching phenomenon to support vocabulary acquisition. Next, we show how we utilize an incidental learning technique with graded readers to facilitate vocabulary learning. We present the system’s architecture, underlying technologies and the initial evaluation of the system’s performance using a semantic differential scale. Finally, we discuss the evaluation results and compare them with our English vocabulary teaching system.
This paper presents a prototype of the Radiobot system which is characterized by artificial radio personalities able to automatically generate a vocal conversation and alter its content depending on comments from listeners. Preliminary experiments show that our proposed interaction model has promising features when compared to popular one-to-one assistant agents and conventional radio broadcasting.
In this study, we focus on ethical education as a means to improve an artificial companion’s conceptualization of the moral decision-making process in human users. In particular, we focus on automatically determining whether changes in ethical education influenced core moral values in humans throughout the century. We analyze ethics as taught in Japan before WWII and today to verify how much the pre-WWII moral attitudes have in common with those of contemporary Japanese, to what degree what is taught as ethics in school overlaps with the general population’s understanding of ethics, and whether a major reform of the guidelines for teaching the school subject of “ethics” after 1946 has changed the way common people approach core moral questions (such as those concerning the sacredness of human life). We selected textbooks used in teaching ethics at school between 1935 and 1937, and those used in junior high schools today (2019), and analyzed what emotional and moral associations such contents generated. The analysis was performed with an automatic moral and emotional reasoning agent and based on the largest available text corpus in Japanese as well as on the resources of a Japanese digital library. As a result, we found that, despite changes in the stereotypical view of Japan’s moral sentiments, especially due to historical events, past and contemporary Japanese share a similar moral evaluation of certain basic moral concepts, although there is a large discrepancy between how they perceive some actions to be beneficial to society as a whole while at the same time being inconclusive when it comes to assessing the same action’s outcome for the individual performing it and in terms of emotional consequences.
Some ethical categories that were assessed positively before the war, while being associated with a nationalistic trend in education, have also disappeared from the scope of interest of post-war society. The findings of this study support suggestions proposed by others that the development of personal AI systems requires supplementation with moral reasoning. Moreover, the paper builds upon this idea and further suggests that AI systems need to be aware of ethics not as a constant, but as a function with a correction for historical and cultural changes in moral reasoning.
Human Language Technology. Challenges for Computer Science and Linguistics, 2018
The problem of humiliating and slandering people through the Internet, generally defined as cyberbullying (later: CB), has recently been recognized as a serious social problem disturbing the mental health of Internet users. In Japan, to deal with the problem, members of the Parent-Teacher Association (PTA) perform Internet Patrol – voluntary work that involves reading through the whole Web content to spot cyberbullying entries. To help PTA members we propose a novel method for automatic detection of malicious content on the Internet. The method is based on a combinatorial approach to language modeling inspired by the brute force search algorithm. The method automatically extracts sophisticated sentence patterns and uses them in classification. We tested the method on actual data containing cyberbullying provided by the Human Rights Center. The results show that our method outperformed previous methods. It is also more efficient as it requires minimal human effort.
2018 9th International Conference on Awareness Science and Technology (iCAST), 2018
Pictograms (emoticons/emojis) have been widely used in social media as a means of graphical expression of emotions. People can express delicate nuances through textual information when it is supported with emoticons, and the effectiveness of computer-mediated communication (CMC) is also improved. Therefore it is important to fully understand the influence of emoticons on CMC. In this paper, we propose an emoticon polarity-aware recurrent neural network method for sentiment analysis of Weibo, a Chinese social media platform. In the first step, we analyzed the usage of 67 emoticons with facial expressions used on Weibo. By performing a polarity annotation with a new “humorous type” added, we confirmed that 23 emoticons can be considered more humorous than positive or negative. On this basis, we applied the emoticon polarity in a Long Short-Term Memory (LSTM) recurrent neural network for sentiment analysis of undersized labelled data. Our experimental results show that the proposed method can significantly improve the precision of predicting sentiment polarity on Weibo.
Since the beginning of the computer era over half a century ago, humanity has been fascinated by the idea of creating a machine substituting for its mental capabilities. This New Age version of Mary Shelley's Frankenstein gave birth to science-fiction literature and was one of the motors for the development of our civilisation. The first mental functions to be digitalized were the fast processing of large numbers or sophisticated formulas for specialized fields like mathematics or physics. These functions were the most troublesome for humans, but the easiest to process mechanically. Ironically, the mental functions said to be the most human-like, and thought of as the ones which make up a grown, well-socialized man, such as a sense of humour or understanding the emotions of others, were neglected in Computer Science for a long time as too subjective and therefore unscientific...
Human Language Technology. Challenges for Computer Science and Linguistics. 7th Language and Technology Conference, LTC 2015, Poznań, Poland, November 27-29, 2015, Revised Selected Papers. Lecture Notes in Artificial Intelligence (LNAI), Springer, 2018
One of the essential parts of a second language curriculum is teaching vocabulary. Many existing techniques have tried to facilitate word acquisition, but one method which has received less attention is code-switching. In this paper, we present an experimental system for computer-assisted vocabulary learning in context using a code-switching based method, focusing on teaching Japanese vocabulary to foreign language learners. First, we briefly introduce our Co-MIX method for vocabulary teaching systems, which uses the code-switching phenomenon to support vocabulary acquisition. Next, we show how we utilize an incidental learning technique with graded readers to facilitate vocabulary learning. We present the system's architecture, underlying technologies and the initial evaluation of the system's performance using a semantic differential scale. Finally, we discuss the evaluation results and compare them with our English vocabulary teaching system.
Papers by Rafal Rzepka