Automated and explainable data interpretation hinges on two critical steps: (i) identifying emerging properties from data and representing them as abstract concepts, and (ii) translating such concepts into natural language. While Large Language Models have recently demonstrated impressive capabilities in generating natural language, their trustworthiness remains difficult to ascertain. The deployment of an explainable pipeline enables its application in high-risk activities, such as decision making. Addressing this demanding requirement is facilitated by the fertile ground of knowledge representation and automated reasoning research. Building upon previous work that explored the first step, we focus on the second step, named Concept2Text. The design of an explainable translation naturally lends itself to a logic-based model, once again highlighting the contribution of declarative programming to achieving explainability in AI. This paper explores a Prolog/CLP-based rewriting system designed to interpret concepts expressed in terms of classes and relations derived from a generic ontology, generating text in natural language. Its key features encompass hierarchical tree rewritings, modular multilingual generation, support for equivalent variants across semantic, grammar, and lexical levels, and a transparent rule-based system. We present the architecture and illustrate a simple working example that allows the generation of hundreds of different and equivalent rewritings relative to the input concept.
The concept of luxury, considered a rare and exclusive attribute, is evolving due to technological advances and the increasing influence of consumers in the market. Luxury cars have always symbolized wealth, social status, and sophistication. As technology progresses, the ability and interest to gather, store, and analyze data from these vehicles have also increased. In recent years, the analysis of luxury car data has emerged as a significant area of research, with researchers exploring various aspects that may differentiate luxury cars from ordinary ones. For instance, researchers study factors such as economic impact, technological advancements, customer preferences and demographics, environmental implications, brand reputation, security, and performance. Although the percentage of individuals purchasing luxury cars is lower than that of ordinary cars, the significance of analyzing luxury car data lies in its impact on various aspects of the automotive industry and society. This literature review aims to provide an overview of the current state of the art in luxury car data analysis.
Summary. Background: The long‐term results of web‐based behavioural intervention in non‐alcoholic fatty liver disease (NAFLD) have not been described in patients followed in specialised centres. Aims: To analyse the long‐term effectiveness of web education compared with the results achieved by a group‐based behavioural intervention in the same years 2012–2014. Methods: We followed 679 patients with NAFLD (web‐based, n = 290; group‐based, n = 389) for 5 years. Weight loss ≥10% was the primary outcome; secondary outcomes were attrition, changes in liver enzymes and in biomarkers of steatosis (Fatty Liver Index) and fibrosis (Fibrosis‐4 index). Results: The cohorts differed in age, education, working status and presence of diabetes. Attrition was higher in the web‐based cohort (hazard ratio: 1.53; 95% CI: 1.24–1.88), but not different after adjustment for confounders. Among patients in active follow‐up, 50% lost ≥5% of initial body weight and 19% lost ≥10%, without difference between cohorts. Alani...
Forensic Science International: Digital Investigation, Sep 1, 2021
Sharing images on Social Network (SN) platforms is one of the most widespread behaviors, and it may cause privacy-intrusive and illegal content to be widely distributed. Clustering the images shared through SN platforms according to the acquisition cameras embedded in smartphones is regarded as a significant task in forensic investigations of cybercrimes. The Sensor Pattern Noise (SPN) caused by camera sensor imperfections due to the manufacturing process has been proved to be an effective and robust camera fingerprint that can be used for several tasks, such as digital evidence analysis, smartphone fingerprinting and user profile linking. Clustering the images uploaded by users on their profiles is a way of fingerprinting the camera sources, and it is considered a challenging task since users may upload different types of images, i.e., the images taken by users’ smartphones (taken images) and single images from different sources, cropped images, or generic images from the Web (shared images). The shared images perturb the clustering task, as they do not usually present sufficient characteristics of the SPN of their related sources. Moreover, they are not directly referable to the user’s device, so they have to be detected and removed from the clustering process. In this paper, we propose a method for clustering the images of user profiles without prior knowledge about the type and number of the camera sources. The hierarchical graph-based method clusters both types of images, taken images and shared images. The strengths of our method include coping with large-scale image datasets, the presence of shared images that perturb the clustering process, and the loss of image details caused by content compression on SN platforms.
The method is evaluated on the VISION dataset, which is a public benchmark including images from 35 smartphones. The dataset is perturbed by 3000 images, simulating the shared images from different sources except for users’ smartphones. Experimental results confirm the robustness of the proposed method against perturbed datasets and its effectiveness in the image clustering
A fundamental problem in Social Network Analysis is how to move from single-layer to multi-layer networks, which provide a holistic view. User profile resolution has received considerable attention since it allows matching users on different online social networks (OSNs). However, to the best of our knowledge, no study has focused on the nesting operation for merging OSN graphs. This work is a first step in the direction of defining the data model and the algorithm to perform approximate nesting of multiple OSN graphs, based on user features. We provide initial experimental evidence based on synthetic data.
While a plethora of digital content is generated and shared online every day, authorship verification has become an imperative task. In comparison to other media watermarking techniques, text watermarking is a more challenging task: changes in the text strongly affect its visual form and meaning, the text might be very short (e.g., social media posts), and it cannot always be converted into an image. In this paper we propose a novel text watermarking method for authorship verification based on Unicode confusable substitution. The proposed method substitutes Latin symbols with homoglyph characters. It ensures length preservation and visual indistinguishability between the original text and the watermarked one. We successfully evaluate our approach using a real dataset of 1.8 million New York Times articles. The results show the effectiveness of our method: an average length of 101 characters is needed to embed a 64-bit password-based watermark.
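The substitution idea in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the six-letter `HOMOGLYPHS` table, the one-bit-per-carrier encoding, and the `embed`/`extract` helpers are hypothetical, and the password-based derivation of the 64-bit payload is not modelled. The sketch also assumes the cover text contains none of the substitute characters beforehand.

```python
# Hypothetical table: Latin letter -> visually similar Cyrillic letter.
# Illustrative only; the confusable set used in the paper is not given here.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}


def embed(text: str, bits: str) -> str:
    """Carry one bit per substitutable letter: '1' = homoglyph, '0' = original."""
    out = []
    used = 0
    for ch in text:
        if ch in HOMOGLYPHS and used < len(bits):
            out.append(HOMOGLYPHS[ch] if bits[used] == "1" else ch)
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("text has too few substitutable letters for the payload")
    return "".join(out)


def extract(text: str, n_bits: int) -> str:
    """Read back the first n_bits carried by substitutable letters."""
    bits = []
    for ch in text:
        if len(bits) == n_bits:
            break
        if ch in REVERSE:
            bits.append("1")
        elif ch in HOMOGLYPHS:
            bits.append("0")
    return "".join(bits)


cover = "a copy of the paper explores text"
marked = embed(cover, "1011")
recovered = extract(marked, 4)
```

Because each substitution replaces exactly one code point with another, the marked string has the same number of characters as the cover text, which is the length-preservation property the abstract mentions.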
In the last decades, Social Networks (SNs) have deeply changed the interactions and habits of users, who are also prone to create more than one profile on the same SN. On the flip side, fake profiles (i.e., impersonating profiles) have become a considerable problem in digital investigations. In this paper, we propose a method for user profile resolution through a cluster-based approach applied to the smartphone fingerprints extracted from the images posted on SNs. The proposed method is thus able to detect fake profiles. To evaluate our approach, we use a real dataset of 1,500 images from 10 different smartphone devices and from the Facebook and WhatsApp platforms. The results show that the average of sensitivity and specificity for user profile resolution is about 98%.
Digital watermarking has become crucially important in the authentication and copyright protection of digital content, since more and more data are generated and shared online every day through digital archives, blogs and social networks. Text watermarking is a more difficult task than watermarking other media. Text cannot always be converted into an image, it accounts for a far smaller amount of data (e.g., social network posts), and changes in short texts strongly affect the meaning or the overall visual form. In this paper we propose a text watermarking technique based on homoglyph character substitution for Latin symbols. The proposed method is able to efficiently embed a password-based watermark in short texts while strictly preserving the content. In particular, it uses alternative Unicode symbols to ensure visual indistinguishability and length preservation, namely content preservation. To evaluate our method, we use a real dataset of 1.8 million New York articles. The results show the effectiveness of our approach: an average length of 101 characters is needed to embed a 64-bit password-based watermark.
2023 IEEE Symposium on Computers and Communications (ISCC)
As a result of an increasing elderly population, the number of people with age-related diseases is increasing worldwide. Alzheimer's disease is thus becoming an emergency health and social problem. Neuropsychological evaluation and biomarker identification represent the two main approaches to identifying subjects with Alzheimer's. In this paper, we propose a web application designed to be sensitive to the cognitive changes distinctive of early Mild Cognitive Impairment, a condition in which someone experiences minor cognitive problems, and of the preclinical phase of Alzheimer's disease. The application is conceived to be self-administered in a comfortable and nonstressful environment. It was designed to be quick to administer, automatic to score, and able to preserve privacy because of the highly sensitive data collected. The preliminary evaluation of the application was done by enrolling 518 subjects characterised by several risk factors and the presence of a family history, who underwent standard neuropsychological screening.
While a plethora of digital content is generated and shared online every day, authorship verification has become an imperative task. In comparison to other media watermarking techniques, text watermarking is a more challenging task: changes in the text strongly affect its visual form and meaning, the text might be very short (e.g., social media posts), and it cannot always be converted into an image. In this paper we propose a novel text watermarking method for authorship verification based on Unicode confusable substitution. The proposed method substitutes Latin symbols with homoglyph characters. It ensures length preservation and visual indistinguishability between the original text and the watermarked one. We successfully evaluate our approach using a real dataset of 1.8 million New York Times articles. The results show the effectiveness of our method: an average length of 101 characters is needed to embed a 64-bit password-based watermark. Keywords: Authorship Analysis, Copyr...
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Copyright © by Società editrice il Mulino, Bologna. All rights reserved. For further information see https://www.rivisteweb.it. Licence of use: the article is made available to the user under licence for exclusively private and personal use, not for profit and without directly or indirectly commercial purposes. Except as expressly provided by the Rivisteweb licence of use, it is forbidden to reproduce, transmit, distribute or otherwise use the article, for any purpose or end. All rights reserved.
Contents: 1. Data as an essential resource for the development of AI in healthcare. 2. Data quality. 3. Data and the evolutionary process of AI.
Social networks have become an indispensable part of everyday life by providing users with different types of interaction. However, sharing different types of data, such as text, images and video, on social networks gives rise to privacy concerns and risks of which the user is often not aware. In this chapter, we show how the images shared by users can be used to fingerprint the acquisition devices and link user profiles on social networks.
Papers by Flavio Bertini