Fault prediction is a vital task for decreasing the costs of equipment maintenance and repair, as well as for improving product quality and production efficiency. Steel plate fault prediction is a significant materials science problem that helps prevent the progression of abnormal events. The goal of this study is to precisely classify the surface defects in stainless steel plates during industrial production. In this paper, a new machine learning approach, entitled logistic model tree (LMT) forest, is proposed, since ensembles of classifiers generally perform better than a single classifier. The proposed method uses the edited nearest neighbor (ENN) technique, since the target class distribution in fault prediction problems is typically imbalanced and the dataset may contain noise. In an experiment conducted on a real-world dataset, the LMT forest method demonstrated its superiority over the random forest method in terms of accuracy. Additionally, t...
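The ENN editing step described above can be sketched in a few lines. This is a minimal, from-scratch illustration (not the paper's implementation): a sample is kept only when the majority of its k nearest neighbours share its label, which removes noisy points near class boundaries.

```python
from collections import Counter
import math

def edited_nearest_neighbours(X, y, k=3):
    """Keep a sample only if the majority of its k nearest
    neighbours (excluding itself) share its class label."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # distances to every other sample, nearest k first
        neighbours = sorted(
            (math.dist(xi, xj), yj)
            for j, (xj, yj) in enumerate(zip(X, y)) if j != i
        )[:k]
        majority = Counter(label for _, label in neighbours).most_common(1)[0][0]
        if majority == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: one noisy minority point (label 1) sits inside the majority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
Xc, yc = edited_nearest_neighbours(X, y, k=3)  # the noisy point is edited out
```

The noisy point (0.5, 0.5) is removed because all three of its nearest neighbours carry the other label; every other sample survives.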
Support vector machine (SVM) algorithms have been widely used for classification in many different areas. However, the use of a single SVM classifier is limited by the strengths and weaknesses of the algorithm. This paper proposes a novel method, called support vector machine chains (SVMC), which involves chaining together multiple SVM classifiers in a special structure, such that each learner is constructed by removing one feature at each stage. This paper also proposes a new voting mechanism, called tournament voting, in which the outputs of classifiers compete in groups, the most common result in each group moves to the next round, and, in the last round, the winning class label is assigned as the final prediction. Experiments were conducted on 14 real-world benchmark datasets. The experimental results showed that SVMC (88.11%) achieved higher accuracy than SVM (86.71%) on average, thanks to the feature selection, sampling, and chain structure combined with multiple m...
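The tournament voting mechanism can be illustrated as follows. This is a hedged sketch based only on the description above: the group size of 3 and tie-breaking by first occurrence are assumptions, not details from the paper.

```python
from collections import Counter

def tournament_voting(predictions, group_size=3):
    """Round-based voting: split the predictions into groups, advance each
    group's most common label, and repeat until one label remains."""
    current = list(predictions)
    while len(current) > 1:
        winners = []
        for start in range(0, len(current), group_size):
            group = current[start:start + group_size]
            winners.append(Counter(group).most_common(1)[0][0])
        if winners == current:  # no further reduction possible
            return Counter(current).most_common(1)[0][0]
        current = winners
    return current[0]

# Nine classifier outputs compete in groups of three.
votes = ["A", "B", "A", "A", "C", "A", "B", "B", "A"]
winner = tournament_voting(votes)  # round 1: [A, A, B] -> round 2: A
```

Compared with flat majority voting, the grouped rounds let a label that wins several local contests survive even when the global vote counts are close.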
2021 Innovations in Intelligent Systems and Applications Conference (ASYU), 2021
The inclusion of artificial intelligence in both our daily and business lives has created a new alternative for meeting many different needs. In this study, the production time of a textile product was estimated using data generated by a textile company operating in the field of ready-made clothing. After determining the factors affecting production time within the company, this paper aims to develop a decision support system using the model that gives the best results among support vector regression and three further methods (Random Forest, Decision Tree Regressor, and Bagging), with the end goal of predicting production time. The results show that bagging and random forest yielded the highest R² (≥ 0.84) with minimal predictive error compared with the other approaches. A demo of the decision support system was prepared, with a user interface through which the production time of new orders can be predicted using the random forest model.
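As a reminder of how the reported R² metric compares regressors, here is a minimal implementation of the coefficient of determination; the production times and predictions below are hypothetical, not the company's data.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)          # total variance
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))  # residuals
    return 1 - ss_res / ss_tot

# Hypothetical production times (hours) and one model's predictions.
actual    = [10.0, 12.0, 9.0, 11.0, 13.0]
predicted = [10.5, 11.5, 9.2, 11.3, 12.6]
r2 = r2_score(actual, predicted)  # about 0.92, i.e. above the 0.84 threshold
```

A model would clear the paper's R² ≥ 0.84 bar exactly when its residual sum of squares is at most 16% of the total variance of the targets.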
Abstract: The computerized education system introduced in this paper is called AURIS. AURIS is developed to improve the verbal communication skills of hearing-impaired children. AURIS, a software system combining both visual and audio technology, has ...
Advances in data collection and processing have made data mining a popular tool among organizations in recent decades. Sharing information between companies could make this tool more beneficial for each party. However, there is a risk of sensitive knowledge disclosure. Shared data should be modified in such a way that sensitive relationships are hidden. Since the discovery of frequent itemsets is one of the most effective data mining tools that firms use, privacy-preserving techniques are necessary for frequent itemset mining to continue. Algorithmic approaches are of two types: heuristic and exact. This paper presents an exact itemset hiding approach, which uses constraints for a better solution in terms of side effects and minimum distortion on the database. This distortion creates an asymmetric relation between the original and the sanitized database. To lessen the side effects of itemset hiding, we introduce the sibling itemset concept, which is used for generat...
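Itemset support, and the distortion that hiding introduces, can be illustrated on a toy transaction database. Note that the paper proposes an exact, constraint-based formulation; the `hide_itemset` function below is only a naive heuristic stand-in, included to show how lowering a sensitive itemset's support modifies the database and causes side effects on other itemsets.

```python
def support(db, itemset):
    """Number of transactions containing every item of the itemset."""
    s = set(itemset)
    return sum(1 for t in db if s <= t)

def hide_itemset(db, sensitive, min_sup):
    """Naive heuristic (not the paper's exact method): delete one item of
    the sensitive itemset from supporting transactions until its support
    falls below min_sup. Returns a sanitized copy."""
    db = [set(t) for t in db]
    target = set(sensitive)
    victim = sensitive[0]  # item chosen for deletion
    for t in db:
        if support(db, target) < min_sup:
            break
        if target <= t:
            t.discard(victim)
    return db

db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
clean = hide_itemset(db, ("a", "b"), min_sup=2)  # {a,b} no longer frequent
```

After sanitization the sensitive itemset {a, b} drops below the threshold, but the non-sensitive itemset {a, c} also loses support, which is exactly the kind of side effect the exact, constraint-based approach is designed to minimize.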
Large numbers of job postings with complex content can be found on the Internet at present. Therefore, analysis through natural language processing and machine learning techniques plays an important role in the evaluation of job postings. In this study, we propose a novel data structure and a novel algorithm aimed at the effective storage and analysis, in data warehouses, of big and complex data such as job postings. State-of-the-art approaches in the literature, such as database queries, semantic networking, and clustering algorithms, were tested in this study to compare their results with those of the proposed approach, using 100,000 Kariyer.net job postings in Turkish, an agglutinative language with a grammatical structure differing from that of many other languages. The algorithm proposed in this study also utilizes stream logic. Considering the growth potential of job postings, this study aimed to recommend new sub-qualifications to advertisers for new...
2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA)
In order to measure air pollution, to control air quality in dangerous regions, and to maintain stability in other regions, many developed countries have established air quality measurement stations. Each of these sites has meta-data describing the type of station, such as urban, rural, or industrial, according to the station's location or the characteristics of the surrounding region. Classifying these stations into categories is an important process: if the type of a station is known, the institutions responsible for environmental auditing can allocate appropriate resources to these regions and provide adequate air quality control for them. For this purpose, this study aims to determine the class to which a new station should be assigned, taking into account past pollutant concentrations and meteorological factors. In the experimental studies, different classification algorithms and their ensemble models are compared with our ensemble learning model, Enhanced Bagging (eBagging), to classify 21 sites in the air quality monitoring network of Turkey. As a consequence, the eBagging ensemble learning algorithm combined with C4.5 significantly outperforms single classification models and their ensembles by better classifying the monitoring stations in terms of air pollutant concentrations and meteorological data.
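The core of any bagging-style ensemble such as eBagging is bootstrap resampling plus majority voting. The sketch below uses a 1-nearest-neighbour base learner and made-up two-class data ("urban" vs. "rural" stations); it illustrates plain bagging, not the eBagging enhancements or the C4.5 base learner from the paper.

```python
import random
from collections import Counter

def one_nn(train, query):
    """1-nearest-neighbour base learner over (features, label) pairs."""
    return min(train,
               key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)))[1]

def bagging_predict(data, query, n_estimators=15, seed=42):
    """Train each learner on a bootstrap sample; majority-vote the outputs."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
        votes.append(one_nn(bootstrap, query))
    return Counter(votes).most_common(1)[0][0]

# Hypothetical two-class toy data standing in for station feature vectors.
data = [((0, 0), "rural"), ((0, 1), "rural"), ((1, 0), "rural"),
        ((5, 5), "urban"), ((5, 6), "urban"), ((6, 5), "urban")]
label = bagging_predict(data, (0.2, 0.4))  # votes overwhelmingly "rural"
```

Each bootstrap sample sees a slightly different subset of the data, so the vote averages out the variance of the individual learners.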
2019 Medical Technologies Congress (TIPTEKNO), 2019
It is well known that early diagnosis is very important for cancer patients. One of the imaging techniques used in the diagnosis of breast cancer is ultrasonography. In this study, a system that helps the doctor detect lesions in the breast is proposed. We used the K-Means clustering algorithm to detect the lesions in the images. The effects of three different filters (Median, Laplace, Sobel) were examined in the study, and different partitioning settings were also considered. According to the accuracy rates obtained, accuracy increased as the number of partitions increased. In addition, the Median filter performed best among the three filters.
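In its simplest form, intensity-based K-Means segmentation reduces to clustering scalar pixel values. Below is a minimal 1-D sketch of Lloyd's algorithm; the pixel values are invented, and real ultrasound segmentation of course operates on 2-D images with filtering applied first.

```python
def kmeans_1d(values, k=2, iters=20):
    """Lloyd's algorithm on scalar intensities; returns sorted final centroids."""
    # Seed centroids spread across the sorted value range.
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)  # assign to nearest centroid
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]  # recompute means
    return sorted(centroids)

# Hypothetical intensities: dark background pixels vs. a bright lesion region.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
centroids = kmeans_1d(pixels, k=2)  # one dark centroid, one bright centroid
```

Thresholding at the midpoint of the two centroids then separates the lesion pixels from the background; increasing k corresponds to the finer partitioning the study examines.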
In this paper, the Energy Effective-Accuracy Routing (EEAR) protocol is proposed for wireless sensor networks, aiming to save energy during communication between sensor nodes across the whole network. EEAR conserves energy while maintaining communication and routes leading to the sink, using a data-centric gradient-diffusion routing protocol. This is realized by detecting redundant sensor nodes and switching their radio frequency and other components on or off. EEAR, inspired by combining Gradient-Based Routing (GBR) route finding with the Naps topology management protocol and exploiting the advantages of both, maintains a nearly constant level of routing accuracy without requiring geographic location information. After establishing communication layers toward the sink while preserving inter-layer communication, the protocol puts redundant nodes into a sleep state. In each layer, a node can go to sleep after detecting other nodes that can carry out its communication duty on its behalf. Despite c...
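The layered structure toward the sink can be sketched as a hop-count assignment via breadth-first search, with a simple redundancy test deciding which nodes may sleep. The `can_sleep` rule below is an illustrative assumption, not the exact criterion from the paper.

```python
from collections import deque

def assign_layers(links, sink):
    """BFS from the sink: a node's layer is its hop distance to the sink."""
    layer = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in links.get(node, []):
            if nb not in layer:
                layer[nb] = layer[node] + 1
                queue.append(nb)
    return layer

def can_sleep(node, links, layer):
    """Assumed rule: a node may sleep if every neighbour one layer farther
    from the sink still has another same-layer relay to route through."""
    for nb in links[node]:
        if layer[nb] == layer[node] + 1:
            alternates = [m for m in links[nb]
                          if m != node and layer[m] == layer[node]]
            if not alternates:
                return False
    return True

# Toy topology: sink S; A and B in layer 1; C reaches both, D only reaches A.
links = {"S": ["A", "B"], "A": ["S", "C", "D"],
         "B": ["S", "C"], "C": ["A", "B"], "D": ["A"]}
layer = assign_layers(links, "S")  # B may sleep (C can relay via A); A may not
```

Node B can sleep because C still has A as a relay, while A must stay awake because D has no other path toward the sink.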
Proceedings of the 2008 Euro American Conference on Telematics and Information Systems, 2008
Authors: Arben Hajra, South Eastern European University, Tetovo, Republic of Macedonia; Derya Birant, Dokuz Eylul University, Izmir, Turkey; Alp Kut, Dokuz Eylul University, Izmir, Turkey.
GML (Generalized Markup Language), regarded as the first formal markup language, emerged in 1969 as the result of research and development work aimed at enabling legal documents to be shared and transported easily. In 1978, a group formed by ANSI (American National Standards Institute) began further developing GML; in 1986 it was renamed SGML (Standard Generalized Markup Language) and was made an international standard (ISO 8879) by the ISO (International Organization for Standardization). SGML's excessive complexity for applications on the Internet and HTML's inadequacy became a major problem, and to address these issues the W3C convened in 1996 to design the XML language. Work on XML's simple design began in the last days of August 1996 and, after intensive effort, XML 1.0 was published as a standard by the W3C in February 1998. XML (eXtensible Marku...
Proceedings of the Seventh Euromicro Workshop on Parallel and Distributed Processing. PDP'99, 1999
Information exchange between sites within a network is currently based on automated data exchange at the protocol level and, unfortunately, only on supervised information exchange at the user level. Automating information exchange is difficult, as there is still no domain-independent ...
A genetic algorithm is a programming technique that mimics biological evolution as a problem-solving strategy and is being applied to a broad range of subjects. In this study, a hybrid genetic algorithm is used to optimize the equation for a dataset with corresponding ...
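A hybrid genetic algorithm of the kind described, evolutionary search combined with a local refinement step, can be sketched as follows. All parameter choices here (population size, tournament of 3, blend crossover, Gaussian mutation) are illustrative assumptions, and the objective is a toy function with a known optimum at x = 3.

```python
import random

def hybrid_ga(fitness, bounds, pop_size=30, generations=60, seed=1):
    """GA with tournament selection, blend crossover, Gaussian mutation,
    and a small hill-climbing step on the best individual (the "hybrid" part)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            return max(rng.sample(pop, 3), key=fitness)  # tournament of 3
        nxt = []
        for _ in range(pop_size):
            child = (pick() + pick()) / 2                # blend crossover
            child += rng.gauss(0, 0.1 * (hi - lo))       # Gaussian mutation
            nxt.append(min(hi, max(lo, child)))          # clamp to bounds
        pop = nxt
        best = max(pop, key=fitness)
        for step in (0.1, -0.1):                         # local refinement
            trial = min(hi, max(lo, best + step))
            if fitness(trial) > fitness(best):
                pop[0] = trial
    return max(pop, key=fitness)

# Maximise a toy objective whose optimum is known to be x = 3.
best = hybrid_ga(lambda x: -(x - 3) ** 2, bounds=(0, 10))
```

The hybridization, here a crude ±0.1 hill-climb on the incumbent, is what distinguishes a hybrid GA from a plain one: the population explores globally while the local step polishes the best candidate.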