2012 International Conference on Computer & Information Science (ICCIS)
Abstract. In this paper, we propose a norms mining technique for a visitor agent to detect the norms of a community of local agents in order to comply with the community's normative protocol. In this technique, the visitor agent is equipped with an algorithm that detects potential norms through the system's log file, interactions with the local agents, and observation of the local agents in action. The visitor agent detects the norms from these sources depending on their availability. Due to security issues, access is prevented to one or more of these ...
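The abstract names three norm sources (the system's log file, interactions with local agents, and observation) but does not reproduce the detection algorithm itself. The following is a minimal sketch under an assumption of frequency-based detection: a candidate action counts as a potential norm when its relative frequency among the gathered events crosses a threshold. The function name, the threshold value, and the event strings are illustrative, not taken from the paper.

```python
from collections import Counter

def mine_potential_norms(observed_actions, threshold=0.4):
    """Flag actions as potential norms when their relative frequency
    among all observed actions meets or exceeds the threshold."""
    counts = Counter(observed_actions)
    total = sum(counts.values())
    return {action: count / total
            for action, count in counts.items()
            if count / total >= threshold}

# Example: events gathered from a log file or from observing local agents
log_events = ["greet", "queue", "greet", "queue", "pay", "queue", "greet", "queue"]
print(mine_potential_norms(log_events))  # {'queue': 0.5}
```

Under this reading, the same routine could be fed events harvested from whichever of the three sources happens to be accessible to the visitor agent.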
In this paper, we present a framework for resolving conflicts between personal and normative goals in normative agent systems. The conflicts occur in the decision-making process for time-constrained tasks associated with those goals. The agents observe the environment and generate tasks based on their obligations to an authority, their desires, and their intentions. They select and execute tasks from a set of pre-compiled tasks based on their beliefs about the reward and penalty associated with the selected tasks. To resolve the conflicts within the constraint of the tasks' duration, we supplement the agents' normative capacity with two essential functions: Sacrifice and Diligence. The Sacrifice function enables an agent to reason about and discard tasks that have lower priorities to make way for the accomplishment of the normative goal. The Diligence function enables an agent to increase its effort in accomplishing the normative goal in time-constrained situations. We simulate these situations and present the results.
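The Sacrifice and Diligence functions are only described in words here, so the sketch below is a hedged illustration rather than the paper's implementation: it assumes tasks carry a priority, an estimated duration, and a normative flag, that Sacrifice drops lower-priority personal tasks until the schedule fits a time budget, and that Diligence simply shortens a normative task's expected duration. All names and numbers are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    priority: int      # higher value = higher priority
    duration: float    # estimated time units to complete
    normative: bool    # True if the task serves the normative goal

def sacrifice(tasks, time_budget):
    """Keep normative tasks, then admit personal tasks in priority order
    only while they still fit within the time budget."""
    ordered = sorted(tasks, key=lambda t: (not t.normative, -t.priority))
    kept, used = [], 0.0
    for task in ordered:
        if task.normative or used + task.duration <= time_budget:
            kept.append(task)
            used += task.duration
    return kept

def diligence(task, effort_factor=1.5):
    """Increase effort on a normative task, shortening its expected duration."""
    if task.normative:
        task.duration /= effort_factor
    return task

tasks = [Task("reply to authority", 5, 2.0, True),
         Task("personal errand", 3, 1.0, False),
         Task("browse news", 1, 2.0, False)]
print([t.name for t in sacrifice(tasks, time_budget=3.0)])
# ['reply to authority', 'personal errand']  -- the lowest-priority task is dropped
```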
Abstract. This article aims to automate the extraction of information from semi-structured web documents by minimizing the amount of hand coding. Extraction of information from the WWW can be used to structure the huge amount of data buried in web documents, so that data ...
Nowadays, activities and decision making in an organization are based on data and information obtained from data analysis, which provides various services for constructing reliable and accurate processes. As data are significant resources in all organizations, the quality of data is critical for managers and operating processes to identify related performance issues. Moreover, high-quality data can increase the opportunity for achieving top services in an organization. However, identifying the various aspects of data quality, from definitions, dimensions, and types to strategies and techniques, is essential to equip methods and processes for improving data. This paper focuses on a systematic review of data quality dimensions for use in a proposed framework that combines data mining and statistical techniques to measure dependencies among dimensions and illustrates how extracting knowledge can increase process quality.
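The abstract mentions combining data mining and statistical techniques to measure dependencies among data quality dimensions without spelling out the computation. One simple statistical view of such dependencies, assuming per-record scores for each dimension are available as columns, is a pairwise correlation matrix; the column names and values below are invented for illustration and are not from the paper.

```python
import pandas as pd

# Hypothetical per-record scores for several data quality dimensions
scores = pd.DataFrame({
    "completeness": [0.90, 0.70, 0.80, 0.95, 0.60],
    "accuracy":     [0.85, 0.65, 0.75, 0.90, 0.55],
    "timeliness":   [0.40, 0.80, 0.50, 0.45, 0.90],
})

# Pairwise Pearson correlation as a first estimate of dependencies
# among the dimensions.
print(scores.corr(method="pearson"))
```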
Nowadays, many users use web search engines to find and gather information. Users face an increasing number of diverse semi-structured information sources, so the issue of correlating, integrating, and presenting related information to users becomes important. When a user uses a search engine such as Yahoo or Google to seek specific information, the results include not only pages containing the desired information but also pages that merely mention it, and the number of returned pages is enormous. Therefore, the performance capabilities, the overlap among results for the same queries, and the limitations of web search engines form an important and large area of research. Extracting information from web data sources also becomes very important because the massive and growing amount of diverse semi-structured information available on the Internet, together with the variety of web pages, makes information extraction from the web a challenging problem. This paper proposes a framework for extracting, classifying, and browsing semi-structured web data sources. The framework is able to extract relevant information from different web data sources and classify the extracted information based on the standard classification of Nokia products.
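The abstract describes extraction and classification of semi-structured web data against the standard Nokia product classification, but the framework's internals are not given here. The sketch below is a minimal, assumption-based illustration: it pulls text snippets out of list and table elements with BeautifulSoup and assigns each to a category by keyword matching. The category names and keywords are placeholders, not the paper's classification scheme.

```python
from bs4 import BeautifulSoup

# Illustrative category keywords; the paper follows the standard Nokia
# product classification, which is not reproduced here.
CATEGORIES = {
    "smartphone": ["n8", "lumia", "symbian"],
    "basic phone": ["1100", "3310", "keypad"],
}

def extract_and_classify(html):
    """Extract snippets from semi-structured markup and assign each to a
    category by simple keyword matching."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for node in soup.find_all(["li", "td"]):
        text = node.get_text(" ", strip=True).lower()
        for category, keywords in CATEGORIES.items():
            if any(keyword in text for keyword in keywords):
                results.append((category, text))
                break
    return results

page = "<ul><li>Nokia Lumia 800</li><li>Nokia 3310 keypad phone</li></ul>"
print(extract_and_classify(page))
```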
Abstract. This paper aims to identify lexical criminal elements in a chat corpus consisting of suspect and victim conversation utterances. Lexical criminal identification requires three processes. The first is tokenization, to automatically assign each lexical item a corresponding serial number in every suspect and victim utterance. The second is tagging the lexical items with parts of speech to identify verbs and nouns in the utterances. The third is to identify and analyze the interrogative criminal construct to obtain the criminal evidence. The ...
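The first two processes (serial numbering of tokens and part-of-speech tagging to isolate verbs and nouns) lend themselves to a short pipeline sketch. The paper's actual tools and tag set are not stated, so the use of NLTK and the VB/NN filter below are assumptions, and the sample utterance is invented.

```python
import nltk

# May require: nltk.download("punkt") and the NLTK POS tagger data.

def identify_lexical_elements(utterance):
    """Tokenize an utterance, number each lexical item, and keep only
    verbs and nouns as candidate lexical elements."""
    tokens = nltk.word_tokenize(utterance)
    numbered = list(enumerate(tokens, start=1))           # process 1: serial numbers
    tagged = nltk.pos_tag(tokens)                         # process 2: POS tagging
    verbs_nouns = [(word, tag) for word, tag in tagged
                   if tag.startswith(("VB", "NN"))]
    return numbered, verbs_nouns

print(identify_lexical_elements("Where did you meet the victim last night?"))
```

The third process, analyzing the interrogative criminal construct, depends on the paper's own constructs and is not sketched here.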
The daily growth of data and the increasing complexity of data warehouses, together with advances in information technology, have created new challenges for information users. The demand for quality data has increased awareness of the quality, reliability, and accuracy of information needed for fast and reliable decision making. Nowadays, many organizations depend on the resources in their data warehouses, so the quality of the data warehouse is a great concern. Poor and erroneous data cause further problems in the data warehouse, since the data are accessed from the same resources by the users. Here we present a systematic review and comparative model to determine the data quality model for further research in our studies.
Abstract. In corpus-based response generation, dialogue utterances and strategies are constructed in the form of dialogue models. Previous approaches to dialogue modeling are based on speech acts at different levels of abstraction, but are often constructed as grammar-based. This paper presents the modeling of dialogue utterances and strategies based on Conversational Acts Theory. In this approach, learning is more intuitive in the sense that the response utterance should robustly satisfy the intention of the input utterance. At the same ...
The overgeneration-and-ranking statistical approach to natural language generation suffers from expensive overgeneration. This article reports the findings of a response classification experiment in the new approach of intention-based classification-and-ranking. Possible responses are deliberately chosen from a dialogue corpus rather than wholly generated, so the approach allows short, ungrammatical utterances as long as they satisfy the intended meaning of the input utterance. We hypothesize that a response is relevant ...
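Classification-and-ranking over corpus responses is described only at a high level here, so the sketch below is a hedged stand-in rather than the paper's method: it assumes some intention classifier is available (passed in as `intent_of`) and ranks candidate corpus responses first by a placeholder intention-match test and then by word overlap with the input. The scoring rule and function names are illustrative.

```python
def rank_candidate_responses(input_utterance, corpus_responses, intent_of):
    """Rank corpus responses: candidates whose predicted intention matches
    the input's intention come first, ties broken by word overlap with the
    input utterance. Both criteria are placeholders for the paper's
    intention-based relevance model."""
    input_intent = intent_of(input_utterance)
    input_words = set(input_utterance.lower().split())

    def score(response):
        intent_match = 1 if intent_of(response) == input_intent else 0
        overlap = len(input_words & set(response.lower().split()))
        return (intent_match, overlap)

    return sorted(corpus_responses, key=score, reverse=True)
```

In the approach described by the abstract, the intention classifier and the relevance judgment would be learned from the dialogue corpus rather than hand-written as above.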
The accuracy metric has been widely used for discriminating and selecting an optimal solution in constructing an optimized classifier. However, the use of the accuracy metric leads the search process to sub-optimal solutions due to its limited capability to discriminate among values. In this study, we propose a hybrid evaluation metric that combines the accuracy metric with the precision and recall metrics. We call this new performance metric Optimized Accuracy with Recall-Precision (OARP). This paper demonstrates that the OARP metric is more discriminating than the accuracy metric using two counter-examples. To verify this advantage, we conduct an empirical verification using statistical discriminative analysis to prove that OARP is statistically more discriminating than the accuracy metric. We also empirically demonstrate that a naive stochastic classification algorithm trained with the OARP metric obtains better predictive results than one trained with the conventional accuracy metric. The experiments show that the OARP metric is a better evaluator and optimizer for constructing optimized classifiers.
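The exact OARP formula is not given in this abstract, so the sketch below is explicitly not the paper's metric; it only illustrates the general idea that a hybrid of accuracy, precision, and recall can separate classifiers that plain accuracy cannot. The weights and counts are invented for the example.

```python
def hybrid_metric(tp, fp, tn, fn, w_acc=0.5, w_prec=0.25, w_rec=0.25):
    """Illustrative hybrid evaluation metric combining accuracy with
    precision and recall. NOT the OARP formula from the paper; it only
    shows how the extra terms yield finer-grained values than accuracy."""
    accuracy  = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall    = tp / (tp + fn) if (tp + fn) else 0.0
    return w_acc * accuracy + w_prec * precision + w_rec * recall

# Two classifiers with identical accuracy (0.80) but different
# precision/recall receive different hybrid scores:
print(hybrid_metric(tp=40, fp=10, tn=40, fn=10))  # 0.8000
print(hybrid_metric(tp=45, fp=15, tn=35, fn=5))   # 0.8125
```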