Giuseppe Polese

    With the advent of Big Data there is an increasing necessity to incrementally mine information from data originating from sensors and other dynamic sources. Thus, it is necessary to devise algorithms capable of mining useful information upon possible evolutions of databases. Among these is data profiling information, such as functional dependencies (fds for short), which are particularly useful for data integration and for assessing the quality of data. The incremental scenario requires the definition of search strategies and validation methods able to analyze only the portion of the dataset affected by the last changes. In this paper, we propose a new validation method, which exploits regular expressions and compressed data structures to efficiently verify whether a candidate fd holds on an updated version of the dataset. Experimental results demonstrate the effectiveness of the proposed method on real-world datasets adapted for incremental scenarios, also in comparison with a baseline incremental fd discovery algorithm.
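    A minimal sketch may help fix ideas. The following Python fragment is a plain partition-based check, not the paper's regular-expression and compressed-structure method: it retains the mapping from lhs values to rhs values seen so far, so that after an insertion only the newly added tuples need to be inspected. All names and data are illustrative.

        def check_new_tuples(seen, new_rows, lhs, rhs):
            """Incrementally re-validate the candidate FD lhs -> rhs:
            only the inserted tuples are checked against the partition
            ('seen') accumulated over the previous dataset version."""
            for row in new_rows:
                key = tuple(row[a] for a in lhs)
                val = tuple(row[a] for a in rhs)
                if seen.setdefault(key, val) != val:
                    return False   # two tuples agree on lhs but differ on rhs
            return True

        seen = {}
        print(check_new_tuples(seen, [{"zip": "84084", "city": "Fisciano"}],
                               ["zip"], ["city"]))   # True
        print(check_new_tuples(seen, [{"zip": "84084", "city": "Naples"}],
                               ["zip"], ["city"]))   # False: zip -> city violated
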
    Several experiments on learners' behavior during structured exam tests based on multiple-choice questions have been carried out so far. Most of these experiments were performed by exploiting the think-aloud method: the learners were informed of the experiment and had to modify their behavior in order to allow the experimenters to record information about their habits during the test. In this paper we describe a system that logs the interactions of the learner with the e-testing system interface during the test. Our system allows us to record information about learners' habits during online tests without informing them of the experiment and, consequently, without asking them to modify their behavior, thus obtaining more realistic data. To demonstrate the effectiveness of our system, we describe how it can be used to replicate online several experiments originally performed for traditional paper-based testing.
    Relaxed functional dependencies (rfds) are properties expressing important relationships among data. Thanks to the introduction of approximations in data comparison and/or validity, they can capture constraints useful for several purposes, such as the identification of data inconsistencies or patterns of semantically related data. Nevertheless, rfds can provide benefits only if they can be automatically discovered from data. In this paper we present an rfd discovery algorithm relying on a lattice-structured search space, previously used for fd discovery, new pruning strategies, and a new candidate rfd validation method. An experimental evaluation demonstrates the discovery performance of the proposed algorithm on real datasets, also providing a comparison with other algorithms.
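    To make the relaxation concrete, the sketch below checks one candidate rfd pairwise: whenever two tuples are similar on every lhs attribute (above a threshold), they must also be similar on every rhs attribute. It is an O(n^2) brute-force validation, unlike the pruned lattice search of the paper; the similarity function is an illustrative choice.

        from difflib import SequenceMatcher

        def sim(a, b):
            """String similarity in [0, 1] (illustrative choice)."""
            return SequenceMatcher(None, str(a), str(b)).ratio()

        def rfd_holds(rows, lhs, rhs, lhs_thr=0.8, rhs_thr=0.8):
            """Pairwise validation of a relaxed FD lhs -> rhs under
            similarity thresholds on both sides."""
            for i in range(len(rows)):
                for j in range(i + 1, len(rows)):
                    if all(sim(rows[i][a], rows[j][a]) >= lhs_thr for a in lhs):
                        if any(sim(rows[i][a], rows[j][a]) < rhs_thr for a in rhs):
                            return False
            return True

        rows = [{"name": "Jon Smith", "city": "Salerno"},
                {"name": "John Smith", "city": "Salerno"}]
        print(rfd_holds(rows, ["name"], ["city"]))   # True despite the typo
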
    ... Fisciano (SA), Italy, gpolese@unisa.it; Mario Vacca, Dipartimento di Matematica e Informatica, University of Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy, mvacca@unisa.it. ABSTRACT: The query synchronization is one ...
    Data integration is an extremely complex process, especially when it involves big data sources. For this reason, several high-level approaches have been proposed to master the inherent complexity of this process. These exploit natural language processing, ontologies to perform the data integration process at the requirement level, or visual languages with icon operators capable of specifying reconciliation operations on data source conceptual schemas, such as the visual language CoDIL (Conceptual Data Integration Language) that we have recently proposed. In order to provide automated support suggesting which CoDIL icon operators can be applied, in this paper we propose a Description Logic-based schema matcher, which extends current attribute matching approaches to more complex constructs of conceptual schemas. We also sketch a possible evolution of the system architecture of the tool CoDIT (Conceptual Data Integration Tool), supporting the data integration process based on CoDIL.
    In this paper we propose an architecture for e-learning systems characterized by the use of Web services and a suitable middleware component. These technical infrastructures allow us to extend the system with new services as well as to integrate and reuse heterogeneous e-learning software components. Moreover, they let us better support the "anytime and anywhere" learning paradigm. As a matter of fact, the proposal provides an implementation of the run-time environment suggested in the Sharable Content Object Reference Model (SCORM) to trace learning processes, which is also suitable for mobile learning.
    The detection of critical patients in Emergency Departments is often a challenging task, especially in situations in which the number of patients to be monitored is high with respect to the available medical personnel. To this end, IoT data analytics can provide useful support in automatically monitoring the status of patients and detecting the most critical ones. This paper presents a knowledge representation framework enabling the intelligent video surveillance of patients, which can be used in combination with IoT-based systems to enhance the detection of critical patients in emergency departments and alert medical personnel. We also describe a clinical scenario related to the early treatment of sepsis in the emergency department, and show how the proposed framework can enhance the detection of this critical disease.
    Social networks are a vast source of information, and they have an increasing impact on people's daily lives. They permit us to share emotions, passions, and interactions with other people around the world. While enabling people to exhibit their lives, social networks must also guarantee their privacy. The definition of privacy requirements and of default policies for safeguarding people's data is among the most difficult challenges that social networks have to deal with. In this work, we have collected data concerning people who have different social network profiles, aiming to analyse the privacy safeguards offered by social networks. In particular, we have built a tool exploiting image-recognition techniques to recognise a user from his/her picture, aiming to collect his/her personal data accessible through the social networks where s/he has a profile. We have composed a dataset of 5000 users by combining data available from several social networks; we compared social network data mandatory in the re...
    We present an approach to integrate a visual authorization policy management system based on RBAC and XACML into the ADAMS (ADvanced Artifact Management System) Process Support System. ADAMS is a Web-based system that integrates project management features, such as resource allocation and process control, and artifact management features, such as coordination of cooperative workers and artifact versioning, as well as context-awareness. We propose a hierarchy of visual languages aiming to support project managers and security administrators in modeling RBAC-based access policies in ADAMS. The visual sentences are translated into XACML and stored into a Policy Repository. In this way the Policy Management System is able to process XACML requests and compare them against the defined access policies.
    We discuss our recent approaches to enable the specification of a conceptual reconciled schema by directly manipulating source conceptual schemas. In particular, we use reverse engineering tools to reconstruct conceptual schemas when these are not available, and use iconic operators to specify how to map source schemas to the reconciled one. Moreover, we describe the mapping of the conceptual reconciled schema to a logical data model, including mechanisms to extract data from the sources and load them into the reconciled database.
    Functional dependencies (fds) were conceived in the early '70s, and were mainly used to verify database design and assess data quality. Nowadays they are automatically discovered from data, since they can be exploited for many different purposes, such as query relaxation, data cleansing, and record matching. In the context of big data, the speed at which new data is being created demands new, efficient algorithms for fd discovery. In this paper, we propose an incremental fd discovery approach, which is able to update the set of holding fds upon insertion of new tuples into the data instance, without having to restart the discovery process from scratch. It exploits a bit-vector representation of fds, and an upward/downward search strategy aiming to reduce the overall search space. Experimental results show that the algorithm achieves substantially better time performance than re-executing the discovery algorithm from scratch.
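    The bit-vector idea can be sketched in a few lines: attribute sets become integer bitmasks, so the subset tests that drive lattice pruning reduce to single bitwise operations. This illustrates only the encoding, not the paper's full search strategy; the schema and candidates are made up.

        def to_mask(attrs, schema):
            """Encode an attribute set as a bit-vector over the schema."""
            return sum(1 << schema.index(a) for a in attrs)

        def is_superset(mask_a, mask_b):
            """True if attribute set mask_a contains mask_b: all bits of
            mask_b are set in mask_a (one AND, no set objects needed)."""
            return mask_a & mask_b == mask_b

        schema = ["A", "B", "C", "D"]
        ab, abc = to_mask("AB", schema), to_mask("ABC", schema)
        # If AB -> D already holds, the more specific candidate ABC -> D
        # can be pruned during the upward/downward lattice traversal.
        print(is_superset(abc, ab))   # True
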
    In this paper we propose a visual-language-based framework to effectively tackle the problem of software-based structural analysis in different application domains. In particular, the framework includes grammar-based parser generation modules to easily adapt structural analysis software packages to the evolving standards of specific application domains. Moreover, it includes visual analytics paradigms to enhance software-based structural analysis processes. To demonstrate the feasibility of the proposed framework, we have implemented some of its modules and instantiated them in the context of the evaluation of earthquake-resistant masonry buildings.
    (Note: An asterisk * following a name indicates a program committee member) ... Allen Ambler* Laura Beckwith Alan Blackwell* Paolo Bottoni* Ruven Brooks* Jean-Marie Burkhardt* Margaret Burnett* Shi-Kuo Chang* Cynthia Corritore* Gennaro Costagliola* Philip Cox* Isabel Cruz* Allen Cypher Vincenzo Deufemia Francoise Detienne* Gregor Engels* Martin Erwig* Irene Finocchi Carmine Gravino Reiko Heckel Masahito Hirakawa* Stefan Hoermann John Hosking* Ken Kahn* Stuart Kent* Laura Leventhal* Henry Lieberman* Katharina ...
    The discovery of functional dependencies (FDs) from data is facing novel challenges, also due to the necessity of monitoring datasets that evolve over time. In these scenarios, incremental FD discovery algorithms have to efficiently verify which of the previously discovered FDs still hold on the updated dataset, and also infer new valid FDs. This requires the definition of search strategies and validation methods able to analyze only the portion of the dataset affected by new changes. In this paper we propose a new validation method, which can be used in combination with different search strategies, and which exploits regular expressions and compressed data structures to efficiently verify whether a candidate FD holds on an updated version of the input dataset. Experimental results demonstrate the effectiveness of the proposed method on real-world datasets adapted for incremental scenarios, also in comparison with a baseline incremental FD discovery algorithm.
    A missing value represents a piece of incomplete information that might appear in database instances. Data imputation is the problem of filling missing values with data that are consistent with respect to the semantics of the entire database instance they belong to. To overcome the complexity of considering all possible candidates for each missing value, heuristic methods have become popular to enhance execution times while keeping accuracy high. This paper presents RENUVER, a new data imputation algorithm relying on relaxed functional dependencies (rfds) for identifying the value candidates best guaranteeing the integrity of data. More specifically, the RENUVER imputation process focuses on the rfds involving the attribute whose value is missing. In particular, they are used to guide the selection of the best candidate tuples from which to take values for imputing a missing value, and to evaluate the semantic consistency of the imputed missing values. Experimental results on real-world datas...
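    A bare-bones reading of the candidate-selection idea, not RENUVER itself: given an rfd lhs -> missing_attribute, donor tuples similar to the incomplete tuple on the lhs supply candidate values, and the most frequent one wins. The similarity function and threshold are illustrative assumptions.

        from collections import Counter
        from difflib import SequenceMatcher

        def sim(a, b):
            return SequenceMatcher(None, str(a), str(b)).ratio()

        def impute(target, rows, lhs, missing, thr=0.8):
            """Fill target[missing] from donor tuples similar to target on
            the lhs of an rfd lhs -> missing."""
            donors = [r[missing] for r in rows
                      if r.get(missing) is not None
                      and all(sim(r[a], target[a]) >= thr for a in lhs)]
            return Counter(donors).most_common(1)[0][0] if donors else None

        rows = [{"street": "Via Ponte don Melillo", "city": "Fisciano"},
                {"street": "Via Ponte Don Melillo", "city": "Fisciano"}]
        print(impute({"street": "Via Ponte don Melillo", "city": None},
                     rows, ["street"], "city"))   # Fisciano
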
    The construction of spatial databases often requires considerable computing and storage resources, due to the inherent complexity of spatial data and their manipulation. Thus, it would be desirable to devise methods enabling a designer to estimate the performance of a spatial database from its early design stages. We present a method for estimating both the size of data and the cost of operations based on the conceptual schema of the spatial database. We also show the application of the method to the design of a spatial database concerning botanic data.
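    The flavor of such a size estimate can be shown with a toy formula: per-tuple alphanumeric bytes plus, for spatial entities, the stored geometry, scaled by the expected cardinality. The parameters below are illustrative placeholders, not the paper's calibrated ones.

        def entity_size(n_instances, attr_bytes, avg_vertices=0, bytes_per_vertex=16):
            """Rough disk-occupancy estimate for one entity of a conceptual
            schema: attribute bytes plus geometry (two 8-byte coordinates
            per vertex), times the number of instances."""
            return n_instances * (sum(attr_bytes) + avg_vertices * bytes_per_vertex)

        # e.g. 10,000 botanic plots with three 20-byte attributes and
        # boundary polygons of ~50 vertices each
        print(entity_size(10_000, [20, 20, 20], avg_vertices=50))  # 8,600,000 bytes
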
    Mashup editors enable end-users to mix the functionalities of several applications to derive a new one. However, when the end-user faces the development of a new mashup application, s/he has to cope with the abundance of services and information sources available on the Web, and with complex operations like filtering and joining. Thus, even a simple-to-use mashup editor cannot provide adequate support unless it embeds intelligent methods to process the semantics of available mashups and rank them based on how well they meet user needs. Most existing mashup editors process either semantic or statistical information to derive recommendations for the mashups considered suitable to user needs. However, none of them uses both strategies in a synergistic way. In this paper we present a new mashup advisory approach and a system that combines the statistical and semantic-based approaches, by using collaborative filtering techniques and semantic tagging, in order to rank mashups...
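    The combination can be pictured as a simple score blend: a collaborative-filtering score and a semantic-tag score merged into one ranking. The field names and the linear blend are illustrative assumptions, not the paper's actual model.

        def rank_mashups(mashups, alpha=0.5):
            """Blend a collaborative-filtering score with a semantic-tag
            score into a single ranking."""
            key = lambda m: alpha * m["cf_score"] + (1 - alpha) * m["tag_score"]
            return sorted(mashups, key=key, reverse=True)

        mashups = [{"name": "weather+map", "cf_score": 0.9, "tag_score": 0.5},
                   {"name": "news+translate", "cf_score": 0.5, "tag_score": 0.8}]
        print([m["name"] for m in rank_mashups(mashups)])
        # ['weather+map', 'news+translate']
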
    We discuss the results of experiments on spatial databases, aiming to empirically derive parameters for estimating disk occupancy and performance from the conceptual stages of the design process. This opens the way to the definition of an estimation methodology, which should let a designer evaluate the quality of alternative design choices based on their expected performance.
    Query/view synchronization upon the evolution of a database schema is a critical problem that has drawn the attention of many researchers in the database community. It entails rewriting queries and views to make them continue working on the new schema version. Although several techniques have been proposed for this problem, many issues are yet to be tackled for evolutions entailing the deletion of schema constructs, hence yielding loss of information. In this paper, we propose a new methodology to rewrite queries and views whose definitions are based on information that has been lost during the schema evolution process. The methodology exploits (relaxed) functional dependencies to automatically rewrite queries and views, trying to preserve their semantics.
    Relaxed functional dependencies (rfds) are properties expressing important relationships among data. Thanks to the introduction of approximations in data comparison and/or validity, they can capture constraints useful for several purposes, such as the identification of data inconsistencies or patterns of semantically related data. Nevertheless, rfds can provide benefits only if they can be automatically discovered from data. In this discussion paper we present an rfd discovery algorithm relying on a lattice-structured search space, and a new candidate rfd validation method. An experimental evaluation demonstrates the discovery performance of the proposed algorithm on real datasets.
    With the advent of e-commerce and social networks, people often unconsciously disseminate their sensitive data through different platforms such as Amazon, eBay, Facebook, Twitter, Instagram, and so on. In this scenario, it would be useful to support users with tools providing awareness of how their sensitive data are exchanged. In this paper, we present a visual analytics tool that allows users to understand how their sensitive data are exchanged or shared among different network services. In particular, the proposed tool visualizes the communication flow generated during Web browsing activities, highlighting the providers tracking their data. The tool provides a real-time summary graph showing the information acquired from the network. A user study is presented to highlight how the proposed tool improves the user's perception of privacy issues.
    Updating a schema is a very important activity that occurs naturally during the life cycle of database systems, due to different causes. A challenging problem arising when a schema evolves is the change propagation problem, i.e., updating the database ground instances to make them consistent with the evolved schema. Spatial datasets, stored representations of geographical areas, are very large databases, so the change propagation process, involving an enormous mass of data among geographically distributed nodes, is very expensive and calls for efficient processing. Moreover, the problem of designing languages and tools for spatial dataset change propagation is relevant, given the shortage of tools for schema evolution and, in particular, the limitations of those for spatial datasets. In this paper, we take into account both efficiency and these limitations, and we propose an instance update language, based on the efficient and popular Google MapReduce programming paradigm, which allows to pe...
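    Why MapReduce fits change propagation can be sketched in a few lines: each ground tuple is rewritten independently by a map task, so the work shards naturally over distributed nodes. The schema changes below (a column rename and a column drop) are illustrative, not the paper's language.

        def mapper(row):
            """Map task: rewrite one ground tuple to conform to the evolved
            schema, here renaming 'area' to 'region' and dropping a deleted
            column. Tuples are independent, hence trivially parallel."""
            return {("region" if k == "area" else k): v
                    for k, v in row.items() if k != "old_code"}

        def reducer(mapped):
            """Identity reduce: collect the rewritten tuples."""
            return list(mapped)

        old_instance = [{"area": "Campania", "old_code": 7, "geom": "POLYGON(...)"}]
        print(reducer(map(mapper, old_instance)))
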
    Failing queries are database queries returning few or no results. It might be useful to reformulate them in order to retrieve results that are close to those intended by the original queries. In this paper, we introduce an approach for rewriting failing queries that are in disjunctive normal form. In particular, the approach prescribes replacing some of the attributes of the failing queries with attributes semantically related to them by means of relaxed functional dependencies (rfds), which can be automatically discovered from data. The semantics of automatically discovered rfds allows us to rank them in a way that provides an application order during the query rewriting process. Experiments show that this application order of rfds yields a ranking of the approximate query answers meeting the expectations of the user.
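    A toy version of the rewriting step: each ranked rfd that relates a queried attribute to a semantically related one yields a candidate reformulation. How the constant is mapped across attributes is glossed over here; names and data are hypothetical.

        def rewrite_failing_query(predicates, ranked_rfds):
            """Rewrite a failing conjunctive query by swapping an attribute
            for one related to it through a discovered rfd; ranked_rfds is
            assumed already ordered, giving the application order."""
            rewrites = []
            for src, dst in ranked_rfds:           # rfd relates src to dst
                if any(attr == src for attr, _ in predicates):
                    rewrites.append([(dst if attr == src else attr, value)
                                     for attr, value in predicates])
            return rewrites

        print(rewrite_failing_query([("director", "S. Kubrick")],
                                    [("director", "screenwriter")]))
        # [[('screenwriter', 'S. Kubrick')]]
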
    The quality of search engine results often does not meet users' expectations. In this paper we propose to implicitly infer visitors' feedback from the actions they perform while reading a web document. In particular, we propose a new model to interpret mouse cursor actions, such as scrolling, movement, and text selection, performed while reading web documents, aiming to infer a relevance value indicating how useful the user found the document for his/her search purposes. We have implemented the proposed model through lightweight components, which can be easily installed within major web browsers as a plug-in. The components capture mouse cursor actions without spoiling user browsing activities, which enabled us to easily collect experimental data to validate the proposed model. The experimental results demonstrate that the proposed model is able to predict user feedback with an acceptable level of accuracy.
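    In spirit, such a model turns logged cursor events into a score. The sketch below uses the event types named above with a linear weighting; the weights and normalization constant are assumptions, not the fitted model.

        def relevance(events, weights=None):
            """Turn logged cursor events into an implicit relevance score
            in [0, 1]."""
            weights = weights or {"scroll": 0.2, "move": 0.05,
                                  "select": 0.5, "dwell_s": 0.1}
            raw = sum(weights.get(kind, 0.0) * count
                      for kind, count in events.items())
            return min(1.0, raw / 10.0)   # clamp to [0, 1]

        print(relevance({"scroll": 8, "move": 30, "select": 2, "dwell_s": 45}))
        # 0.86
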
    Functional dependencies (fds), and their extensions, relaxed functional dependencies (rfds), represent an important semantic property of data. They have been widely used over the years for several advanced database operations. Thanks to the availability of discovery algorithms for inferring them from data, in recent years (relaxed) fds have been exploited in many new application contexts, including data cleansing and query relaxation. One of the main problems in this context is the possibly "big" number of rfds that might hold on a given dataset, which might make it difficult for a user to get insights from them. On the other hand, one of the main challenges that has recently arisen is the possibility of monitoring how dependencies change during discovery processes run over data streams, also known as continuous discovery processes. To this end, in this paper we present a tool for visualizing the evolution of discovered rfds during continuous discovery processes. It permits to ana...
    Changes to the schema of databases naturally and frequently occur during the life cycle of information systems; supporting their management, in the context of distributed databases, requires tools to perform changes easily and to propagate them efficiently to the database instances. In this paper we illustrate ENVISION, a project aiming to develop a Visual Tool for Schema Evolution in Distributed Databases, supporting the database administrator during the schema evolution process. The first stage of this project concerned the design of an instance update language allowing schema changes to be performed in a parallel way [14]; in this paper we deal with further steps toward the complete realization of the project: the choice of a declarative schema update language and the realization of a mechanism for the automatic generation of instance update routines. The architecture of the system, which is being implemented, is also presented.
    Many modern application contexts, especially those related to the semantic Web, call for automatic techniques capable of extracting relationships between semi-structured data for several purposes, such as the identification of inconsistencies or patterns of semantically related data, query rewriting, and so forth. One way to represent such relationships is to use relaxed functional dependencies (rfds), since they can embed approximate matching paradigms to compare unstructured data, and admit the possibility of exceptions. To this end, thresholds might need to be specified in order to limit the similarity degree in approximate comparisons or the occurrence of exceptions. Thanks to the availability of huge amounts of data, including unstructured data available on the Web, nowadays it is possible to automatically discover rfds from data. However, due to the many different combinations of similarity and exception thresholds, the discovery process has an exponential complexity. Thus, it is vital to devise proper optimization strategies in order to make the discovery process feasible. To this end, in this paper we propose a genetic algorithm to discover rfds from data, also providing an empirical evaluation demonstrating its effectiveness.
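    A generic genetic-algorithm skeleton over threshold vectors conveys the idea: individuals are candidate similarity thresholds, evolved by selection, crossover, and mutation. The fitness function here is a stand-in (in the real setting it would measure how well, and how strictly, the corresponding rfd holds on the data); everything else is a textbook GA, not the paper's algorithm.

        import random

        def fitness(thresholds):
            """Stand-in fitness: favor high thresholds whose sum stays
            below an arbitrary feasibility budget (purely illustrative)."""
            return sum(thresholds) if sum(thresholds) <= 2.5 else 0.0

        def evolve(n_attrs=3, pop_size=20, generations=50):
            pop = [[random.random() for _ in range(n_attrs)]
                   for _ in range(pop_size)]
            for _ in range(generations):
                pop.sort(key=fitness, reverse=True)
                survivors = pop[: pop_size // 2]       # selection
                children = []
                while len(children) < pop_size - len(survivors):
                    a, b = random.sample(survivors, 2)
                    cut = random.randrange(1, n_attrs)  # one-point crossover
                    child = a[:cut] + b[cut:]
                    if random.random() < 0.1:           # mutation
                        child[random.randrange(n_attrs)] = random.random()
                    children.append(child)
                pop = survivors + children
            return max(pop, key=fitness)

        print(evolve())   # threshold vector with sum approaching 2.5
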
    Data stream profiling concerns the automatic extraction of metadata from a data stream, without having the possibility to store it. Among the metadata of interest, functional dependencies (fds), and their extensions, relaxed functional dependencies (rfds), represent an important semantic property of data. Nowadays, there are many algorithms for automatically discovering them from static datasets, and some are being proposed for data streams. However, one of the main problems is that the stream nature of data requires a different monitoring paradigm, since the "big" number of (r)fds that might hold on a given dataset continuously changes as new data are read from the stream. In this paper, we present a tool for visualizing rfds discovered from a data stream. The tool permits exploring results for different types of rfds, and uses quantitative measures to monitor how discovery results evolve. Moreover, the tool enables the comparison of rfds discovered across several executions, also providing visual manipulation operators to dynamically compose and filter results. A user study has been conducted to assess the effectiveness of the proposed visualization tool.
    Cardiac arrhythmia is an alteration of the heart rhythm that makes the heartbeat irregular. Based on the severity of this condition, an arrhythmia could represent a serious danger for a patient. An ECG is a graphic representation of a heart rhythm, which provides an overview of the heart's condition over a specific time interval. ECG signal analysis is entrusted to trained clinicians, although complex and frantic environments, such as emergency settings, can make it hard to delegate continuous monitoring to the medical personnel. In such scenarios, an automatic detection methodology could provide crucial support in promptly alerting clinicians about a potential degeneration of a patient's conditions. To this end, we propose a heartbeat classification module capable of capturing the semantics of the visual information of ECG signals provided by video frames. The module relies on feature extraction techniques applied to images projected from video frames, yielding ECG data, which are then classified by means of deep-learning models. It can be used to support the early detection of some arrhythmias in critical contexts, such as emergency rooms. We show how the proposed module can be used to support clinicians in this context, and discuss an experimental evaluation performed over ground-truth datasets.
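    A first step of such a pipeline can be sketched under a strong simplifying assumption: in a grayscale frame, the plotted ECG trace is the darkest pixel in each image column, so a 1-D signal can be recovered column-wise and summarized into features. The paper's projection-based features and deep-learning classifiers are far richer; the synthetic frame below is for illustration only.

        import numpy as np

        def trace_from_frame(frame):
            """Recover a 1-D signal from a grayscale frame: per column,
            take the row index of the darkest pixel (the plotted trace)."""
            return frame.argmin(axis=0).astype(float)

        def beat_features(signal):
            """A few illustrative per-beat summary features."""
            return np.array([signal.mean(), signal.std(),
                             signal.max() - signal.min()])

        # Synthetic frame: white background with a dark sinusoidal trace.
        frame = np.full((64, 128), 255.0)
        cols = np.arange(128)
        frame[32 + (5 * np.sin(cols / 6)).astype(int), cols] = 0.0
        print(beat_features(trace_from_frame(frame)))
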

    And 84 more