Every culture has its own topics of interest and its own hot topics. In this paper we present a system that supports a better understanding of different cultures, starting from the topics debated among their members. To do that, we recorded and analyzed the content of messages sent by citizens of different countries on Twitter (a worldwide conversation system), hoping to capture the topics of interest for each culture and to predict their hot topics. We analyzed tweets written in English, since English has become a global language, used even by internet users from non-English-speaking countries when they want to share their thoughts and be understood by a global readership. Our study tries to capture the topic model for the tweets and for the URLs shared in those tweets separately, and then to compare the distribution of topics across different countries for both the tweets an...
Intent classification is a central component of a Natural Language Understanding (NLU) pipeline for conversational agents. The quality of such a component depends on the quality of the training data; however, for many conversational scenarios the data might be scarce, and in these scenarios data augmentation techniques are used. Having general data augmentation methods that can generalize to many datasets is highly desirable. The work presented in this paper is centered around two main components. First, we explore the influence of various feature vectors on the task of intent classification using RASA’s text classification capabilities. The second part of this work consists of a generic method for efficiently augmenting textual corpora using large datasets of unlabeled data. The proposed method is able to efficiently mine for examples similar to the ones that are already present in standard, natural language corpora. The experimental results show that using our corpus augmentation me...
In the context of constantly evolving portable technology (namely smartphones and tablets) and its increasing accessibility, the need for reliable means of authentication is growing. The current study offers an alternative solution that continuously verifies user legitimacy without interrupting the user. A continuously collected dataset of 41 users’ touch-screen inputs provides a good starting point for modeling each user’s behavior and later differentiating among users. We introduce a system that processes features based on raw data extracted from user-screen interactions and attempts to assign each gesture to its originator. Achieving an accuracy of over 83%, we show that this type of authentication system is feasible and that it can be integrated as a continuous way of detecting intruders within given mobile applications.
Water resource management represents a fundamental aspect of a modern society. Urban areas present multiple challenges requiring complex solutions, which include multidomain approaches related to the integration of advanced technologies. Water consumption monitoring applications play a significant role in increasing awareness, while machine learning has been proven for the design of intelligent solutions in this field. This paper presents an approach for monitoring and predicting water consumption from the most important water outlets in a household based on a proposed IoT solution. Data processing pipelines were defined, including K-means clustering and evaluation metrics, extracting consumption events, and training classification methods for predicting consumption sources. Continuous water consumption monitoring offers multiple benefits toward improving decision support by combining modern processing techniques, algorithms, and methods.
Water supply systems are essential for a modern society. This article presents an overview of the latest research related to information and communication technology systems for water resource monitoring, control and management. The main objective of our review is to show how emerging technologies offer support for smart administration of water infrastructures. The paper covers research results related to smart cities, smart water monitoring, big data, data analysis and decision support. Our evaluation reveals that there are many possible solutions generated through combinations of advanced methods. Emerging technologies open new possibilities for including new functionalities such as social involvement in water resource management. This review offers support for researchers in the area of water monitoring and management to identify useful models and technologies for designing better solutions.
In this paper, we present an approach to forecasting the number of paintings that will be sold daily by Vivre Deco S.A. Vivre is an online retailer for Home and Lifestyle in Central and Eastern Europe. One of its concerns is related to the stocks that it needs to make at its own warehouse (considering its limited available space) to ensure a good product flow that would maximize both the company profit and the users’ satisfaction. Since stocks are directly connected to sales, the purpose is to predict the amount of sales from each category of products, given the selling history of these products. Thus, we have chosen a category of products (paintings) and used ARIMA for obtaining the predictions. We present different considerations regarding how we chose the model, along with the solver and the optimization method for fitting ARIMA. We also discuss the influence of the differencing on the obtained results, along with information about the runtime of different models.
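The differencing mentioned above is the "d" in ARIMA(p, d, q): the series is replaced by the differences between consecutive observations, repeated d times, to remove trend before the autoregressive and moving-average parts are fitted. The following is a minimal sketch of that step only, not the authors' pipeline; the function name `difference` and the sales figures are made up for illustration.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times (the 'd' of ARIMA)."""
    result = list(series)
    for _ in range(order):
        # Each pass replaces the series by consecutive differences,
        # shortening it by one element.
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


# Hypothetical daily sales counts with an accelerating upward trend.
daily_sales = [10, 12, 15, 19, 24]

first = difference(daily_sales, order=1)   # removes a linear trend
second = difference(daily_sales, order=2)  # removes a quadratic trend
print(first, second)
```

In this toy series the second difference is constant, which is the kind of evidence that would suggest d = 2; on real sales data the choice is usually less clear-cut.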
This paper presents the current state of the gaming industry, which provides an important background for an effective serious game implementation in mobile crowdsensing. An overview of existing solutions, scientific studies and market research highlights the current trends and the potential applications for citizen-centric platforms in the context of Cyber–Physical–Social systems. The proposed solution focuses on serious games applied in urban water management from the perspective of mobile crowdsensing, with a reward-driven mechanism defined for the crowdsensing tasks. The serious game is designed to provide entertainment value by means of gamified interaction with the environment, while the crowdsensing component involves a set of roles for finding, solving and validating water-related issues. The mathematical model of distance-constrained multi-depot vehicle routing problem with heterogeneous fleet capacity is evaluated in the context of the proposed scenario, with random initial...
In this paper, we present an application for identifying English words whose use is cyclic or varies regularly over time. The purpose of the application was to build a cross-platform system for indexing and analyzing graphs of word usage over time. For word indexing, we used the data provided by the Google Books N-grams Corpus, which was afterwards filtered using the WordNet lexical database. To identify the cyclic or regularly varying words, we used two different algorithms: autocorrelation and dynamic time warping. The results of the analysis can be visualized through a web interface. The application also makes it possible to view how the usage frequency of different words evolves over time.
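To illustrate how autocorrelation can expose a cyclic usage pattern, here is a small sketch. It is not the application's code: the helper names `autocorrelation` and `dominant_period`, the 0.5 threshold, and the synthetic sine-shaped series are all assumptions made for the example. It also assumes the series has no trend; a real frequency curve would probably need detrending first, since a steady rise alone produces high autocorrelation at small lags.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0 or lag >= n:
        return 0.0
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def dominant_period(series, max_lag, threshold=0.5):
    """Return the lag in [2, max_lag] with the highest autocorrelation
    above `threshold`, or None if no lag qualifies."""
    best_lag, best_value = None, threshold
    for lag in range(2, max_lag + 1):
        value = autocorrelation(series, lag)
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


# Synthetic usage curve that repeats every 12 time steps (illustrative only).
usage = [math.sin(2 * math.pi * t / 12) for t in range(120)]
period = dominant_period(usage, max_lag=30)
print(period)
```

A word whose usage curve yields no lag above the threshold would be treated as non-cyclic under this simplified criterion.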
Deliverable nr D 5.3 – Support and feedback services version 1.5 Work Package 5
In this paper, we present Streamer, a search application running over streams of Twitter messages. As opposed to most services that only do simple text search over conversations, Streamer aims to cluster messages together in order to simplify analyzing a large number of messages on similar topics. The novelty of Streamer is that, unlike most applications that use a fixed corpus or fixed categories when clustering, it works with streaming data that may cover any number of topics: Twitter messages are continuously retrieved and the clusters are updated as more data comes in. The running time of the application was evaluated, along with its clustering quality, which was measured using purity and the Silhouette coefficient.
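Purity, one of the two quality measures named above, can be sketched as follows: each cluster is credited with the size of its largest reference class, and the credits are summed and divided by the total number of items. This is a generic illustration rather than Streamer's evaluation code; it assumes reference topic labels are available for the messages, and the cluster assignments and topic names below are invented.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of items that belong to the majority label of their cluster."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # For each cluster, count how many items carry its most common label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


# Hypothetical cluster assignments and reference topics for eight messages.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
topics = ["sports", "sports", "politics",
          "politics", "politics", "music",
          "music", "music"]

score = purity(clusters, topics)
print(score)
```

Purity rewards many small clusters (one message per cluster scores 1.0), which is one reason it is commonly paired with a label-free measure such as the Silhouette coefficient.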
The purpose of this paper is to identify, with the help of machine learning techniques, a connection (if such a connection exists) between the sequence of sounds and the lyrics of a melody and its popularity. The popularity of a melody will be quantified as the number of views and the number of “like” votes on the YouTube platform, where users can upload, view and vote on videos. This analysis will reveal whether the two indicators from the YouTube platform are more influenced by the words or by the sounds of the songs. This work may help producers in the music industry, since the popularity of a melody (determined by analyzing a large set of songs) might be a very important aspect to consider when deciding whether or not to make and launch a musical product.
As technology evolves at a rapid pace, the desire and need to explore outer space become more and more prevalent. Human exploration and colonization of the Moon or Mars is a strategic goal of international space agencies for the coming years. This implies long-term exposure of astronauts to an extreme environment characterized by microgravity, isolation, confinement and other stressors which damage both their bodies and minds. The longer the exposure to the space environment, the greater the chance that an astronaut will face psychological problems. In this regard, this paper is interdisciplinary, oriented towards the analysis of speech and language disorders, especially oral language disorders, in order to shape a tool for behavioral psychology. Considering the well-known connection between speech and language disorders and interpersonal and intrapersonal psychological problems, this paper presents specific semantic assessments based on natural language processing techniques in order to gain understanding and shape a speech and language tool able to detect psychological impairments, especially, but not only, for astronauts.
In this paper, we propose a method of learning representation layers with squashing activation functions within a deep artificial neural network which directly addresses the vanishing gradients problem. The proposed solution is derived from solving the maximum likelihood estimator for components of the posterior representation, which are approximately Beta-distributed, formulated in the context of variational inference. This approach not only improves the performance of deep neural networks with squashing activation functions on some of the hidden layers - including in discriminative learning - but can be employed towards producing sparse codes.
This report concerns how to deliver relevant feedback on students’ written production, either for free texts (e.g., essays, syntheses, notes) or chat conversations, so that the students can build knowledge. First, the report presents an overview and a selection of existing models, methods and resources for: 1) the automatic analysis of learner interactions using language technologies or social network analysis (Task 5.1) and 2) the automatic analysis of learner text (Task 5.2). Secondly, a proposal for the tools to be developed for the project is presented, with their possible usage scenarios.
Web documents contain vast amounts of information that can be extracted and processed to enhance the understanding of online data. Often, the structure of the document can be exploited in order to identify useful information within it. Pairs of attributes and their corresponding values are one such example of information frequently found in many online retail websites. These concentrated bits of information are often enclosed in specific tags of the web document, or highlighted with certain markers which can be automatically discovered and identified. This way, different methods can be employed to extract new pairs from other, more or less similar, documents. The method presented in this paper relies on the DOM (Document Object Model) structure and the text within web pages in order to extract patterns consisting of tags and pieces of text and then to classify them. Several classifiers have been compared and the best results have been obtained with a C4.5 decision tree classifier.
This report presents Version 1.5 of the Learning support and feedback services (delivering recommendations based on interaction analysis and on students’ textual production) that can be integrated within an e-learning environment.
Game development often requires more than technical skills to create outstanding experiences in digital entertainment. In the mobile world, gamification has been used to improve the user experience, combining simplicity of use with attractive game design elements into a unique package for real-world applications. In this paper, a common framework for mobile games with a real-world component is proposed. The game engine, based on Google Maps and Unity, can be easily adapted for a particular scenario, such as GPS navigation, fitness trackers and even courier services, requiring minimal knowledge about traditional game engines. The prototypes presented are promising towards a more accessible framework for mobile developers to create the next generation mobile applications with a game-like design.
Profiling consumers in a water distribution system is essential for achieving sustainability in terms of resource management and urban development. Unsupervised learning can provide data-driven decision support for evaluating the water demand patterns in a large network, while various pre-processing methods can be added to expand the level of detail in terms of consumer behavior. The K-Means clustering method is used on a dataset based on publicly available data collected from multiple households with an emphasis on the data processing pipeline and its influence on the resulting clusters. Seasonal decomposition is used to evaluate the weekly trends in the dataset, while data normalization provides an in-depth analysis of the patterns and relative variation in terms of consumer demand. The results show different perspectives on the consumer demand patterns which can provide additional details in terms of consumption (volume, pattern, variation).
In this paper we address the problem of capturing, processing and analyzing images from the video stream of the Hearthstone game in order to obtain relevant information about how matches in this game unfold. Since the information needs to be presented to the user in real time, we needed to find the most suitable methods of extracting it. Therefore, techniques such as background subtraction, histogram comparison, keypoint matching and optical character recognition were investigated. Driven by the required processing speed, we ended up using optical character recognition on limited areas of interest in the captured image. After developing the application, we tested it in a real-world context, while real games were being played, and we present the obtained results. Finally, we provide two examples where the application would prove useful for better decision making during the game.
The wider acceptance and usage of instant messaging (chat) represents one of the consequences of undertaking Computer-Supported Collaborative Learning (CSCL) practices in formal education settings. However, the difficulty of analyzing these textual artifacts of learners in order to offer them feedback represents a serious problem in further extending the usage of chat conversations. PolyCAFe is a system that was designed to support the tutors and to provide automatic feedback for the learners engaged in ...
The main objective of this paper is to compare the sentiments that prevailed before and after the presidential elections held in both the US and France in 2012. To achieve this objective, we extracted content from a social medium, Twitter, and used the tweets from electoral candidates and from public users (voters), collected by crawling during the course of the elections. In order to gain useful insights about the US elections, we scored the sentiment of each tweet using different metrics and performed a time series analysis for candidates and for different topics (identified by specific keywords). In addition, we compared some of the insights obtained from the US election with what we observed for the French election. This in-depth analysis was done in order to understand the inherent nature of elections and to bring out the influence of social media on elections.
This report is about the way to deliver relevant feedback to students' written production, either for free texts (eg, essays, syntheses, notes) or chat conversation, in order for the students to build knowledge. This report presents an overview and a selection of existing models, methods and resources for: 1) the automatic analysis of learner interactions using language technologies or social network analysis (Task 5.1) and 2) the automatic analysis of learner text (Task 5.2). Secondly, a proposition of the tools to be developed for the project ...
This report presents Version 1 of the support and feedback services (delivering recommendations based on interaction analysis and on students' textual production) that can be integrated within an e-learning environment. Further steps toward the implementation of Version 2 of these services and their future integration with all the LTfLL services are also suggested.
Trausan-Matu, S., Dessus, P., Rebedea, T., Loiseau, M., Dascalu, M., Mihaila, D., Braidman, I., Armitt, G., Smithies, A., Regan, M., Lemaire, B., Stahl, J., Villiot-Leclercq, E., Zampa, V., Chiru, C., Pasov, I., & Dulceanu, A. (2010). D5.3 Support and feedback services version 1.5. LTfLL-project.
In this paper, we present a model that was intended to discriminate creative from non-creative news articles. In order to build the classifier, we have combined nine different measures using a stepwise logistic regression model. The obtained model was tested in two experiments: the first one tried to discriminate between news articles about the US 2012 Elections from different newspapers versus articles taken from The Onion (a website providing satiric news) on the same subject, while the second one evaluated the capacity of the model to generalize over different topics and text genres. The experiments showed that the system achieves 80% accuracy, but the lack of true positives from the second experiment raised the question of whether we really identified creativity or in fact we detected satire (as the assumption for the training corpus was that the satiric news from The Onion were also creative). Keywords-Creativity; Satire; Natural Language Processing; Metrics for Creativity Dete...
Word sense disambiguation is an essential, yet very difficult, task in natural language processing. While several other NLP tasks, such as POS tagging, can provide very good results (highly accurate, with almost 100% of words successfully labeled), disambiguation is far from achieving such performance. However, we will demonstrate the need for word sense disambiguation when computing lexical chains on a special kind of text (chats) using a WordNet-based approach. In addition, we will try to identify the bottlenecks (mostly with respect to accuracy) in such an approach and propose possible improvements.
Public data can be considered a large and important source of data that can be used for different purposes. In this paper we present a method for collecting and analyzing data within urban settlements. To focus the analysis and gather a large amount of data, we considered a case study of Bucharest. The main purpose of this analysis is to pick up important information about different streets, points of interest, details about urban planning, etc., with the goal of facilitating a quick and correct evaluation of specific areas and identifying suitable locations for adding new points of interest. The prediction of suitable locations involves using heuristics and data mining techniques such as clustering algorithms and association rules.
