Importance of Characteristic Features and Their Form for Data Exploration

by Urszula Stańczyk, Beata Zielosko, Grzegorz Baron

Published in Entropy by MDPI AG.

2024, Volume 26, Issue 5, p. 404

Abstract

The nature of the input features is one of the key factors indicating what kind of tools, methods, or approaches can be used in a knowledge discovery process. Depending on the characteristics of the available attributes, some techniques may perform unsatisfactorily or may not run at all without additional preprocessing steps. The types of variables and their domains affect performance, and any change to their form can influence it as well, or even enable some learners. The relevance of features for a task is another element with a noticeable impact on data exploration. The importance of attributes can be estimated with mechanisms from the feature selection and reduction area, such as rankings. In the described research framework, the data form was conditioned on relevance by the proposed procedure of gradual discretisation controlled by a ranking of attributes. Supervised and unsupervised discretisation methods were applied to datasets from the stylometric domain for the task of binary authorship attribution. For the selected classifiers, extensive tests were performed, and they indicated many cases of enhanced prediction for partially discretised datasets.
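The idea of gradual discretisation controlled by a ranking can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning, the correlation-based ranking, the toy data, and all function names below are assumptions chosen only to show how the top-k ranked attributes could be discretised while the rest stay continuous.

```python
from typing import Iterator, List, Sequence, Tuple


def equal_width_bins(values: Sequence[float], n_bins: int = 3) -> List[int]:
    """Unsupervised equal-width discretisation (an assumed stand-in method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def rank_by_abs_correlation(columns: Sequence[Sequence[float]],
                            labels: Sequence[int]) -> List[int]:
    """Order attribute indices from most to least relevant.

    Absolute Pearson correlation with the class label is a hypothetical
    ranking criterion used here only for illustration.
    """
    def abs_corr(xs: Sequence[float], ys: Sequence[float]) -> float:
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        if vx == 0 or vy == 0:
            return 0.0
        return abs(cov) / (vx * vy) ** 0.5

    scores = [abs_corr(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


def gradual_discretisation(columns: Sequence[Sequence[float]],
                           ranking: Sequence[int],
                           n_bins: int = 3) -> Iterator[Tuple[int, List[list]]]:
    """Yield (k, variant), where the k top-ranked attributes are discretised."""
    for k in range(len(ranking) + 1):
        chosen = set(ranking[:k])
        yield k, [
            equal_width_bins(col, n_bins) if i in chosen else list(col)
            for i, col in enumerate(columns)
        ]


# Toy data: two attributes (stored column-wise) and a binary class label.
columns = [[0.1, 0.2, 0.8, 0.9], [5.0, 1.0, 4.0, 2.0]]
labels = [0, 0, 1, 1]

ranking = rank_by_abs_correlation(columns, labels)
variants = dict(gradual_discretisation(columns, ranking, n_bins=3))
```

Each entry of `variants` is one partially discretised version of the data; in an experiment of the kind the abstract describes, every such version would be passed to a classifier and the prediction quality compared across values of k.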

Archived Files and Locations

application/pdf  2.3 MB
file_uwodcz6xf5dyhkncnuktc7gcuy
mdpi-res.com (publisher)
web.archive.org (webarchive)
Type: article-journal
Stage: published
Date: 2024-05-06
Language: en
DOI: 10.3390/e26050404
PubMed: 38785653
PMC: PMC11119179
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  1099-4300