2015 Latin America Congress on Computational Intelligence (LA-CCI), 2015
The identification of topics in social networks has become an important research task when dealing with event detection, particularly when global communities are affected. Text processing techniques and machine learning algorithms have been used extensively to solve this problem. In this paper we compare three clustering algorithms, k-means, k-medoids and NMF (Non-negative Matrix Factorization), for detecting topics in textual messages obtained from Twitter. The algorithms were applied to a database of tweets whose initial context was a set of hashtags related to the recent corruption scandal involving FIFA (International Federation of Association Football). The results suggest that NMF performs better, since it produces clusters that are easier to interpret.
The identification of bird species from their recorded songs is nowadays used in several important applications, such as monitoring the quality of the environment and preventing bird-plane collisions near airports. The complete identification cycle involves the use of: (a) recording devices to acquire the songs, (b) audio processing techniques to remove noise and select the most representative elements of the signal, (c) feature extraction procedures to obtain relevant characteristics, and (d) decision procedures to make the identification. The decision procedures can be obtained with Machine Learning (ML) algorithms by treating the problem as a standard classification task. One key element in this cycle is the selection of the most relevant segments of the audio for identification purposes. In this paper we show that the use of short audio segments with high amplitude, called pulses in our work, outperforms the use of the complete audio records in the species identification task. We also show how these pulses can be obtained automatically, based on measurements performed directly on the audio signal. The classifiers are trained on a previously labeled database of bird songs, which contains recordings from 75 species found on the Southern Atlantic Coast of South America. The results show that the combination of automatically obtained pulses and an SVM classifier produces the best results. All the necessary procedures can be installed on dedicated hardware, allowing the construction of a purpose-built bird identification device.
One of the challenges facing educational institutions is reducing course dropout. A very promising way to achieve this goal is the use of educational data mining to identify patterns that help educational managers make decisions. This work details an attribute selection method applied to the prediction of student dropout, using the creation and selection of attributes drawn from educational databases. The experiments were carried out with undergraduate students at a public higher education institution. The experimental results present the most relevant attributes for predicting dropout and indicate the contribution of attribute creation to the data mining task. The approach is generic and can be applied to a large number of educational institutions.
Papers by Celso Kaestner