Social media have become a discussion platform for individuals and groups, so users belonging to different groups can communicate with one another. Positive and negative messages, as well as media, circulate among these users. Users can form special groups with people they already know in real life or whom they meet through social networking after being suggested by the system. In this article, we propose a framework for recommending communities to users based on their preferences; for example, a community for people who are interested in certain sports, art, hobbies, diseases, age, case, and so on. The framework is based on a feature extraction algorithm that utilizes user profiling and combines the cosine similarity measure with term frequency to recommend groups or communities. Once data is received from the user, the system tracks their behavior, identifies relationships, and then recommends one or more communities based on their preferences. Finally, experimental studies are conducted using a prototype developed to test the proposed framework, and the results show the value of our framework in recommending people to communities.
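The core matching idea the abstract describes, combining term frequency with cosine similarity between a user profile and community descriptions, can be sketched as follows. The community names, description texts, and user profile below are invented stand-ins, not the paper's actual data or algorithm:

```python
# Sketch: match a user's interest profile to community descriptions
# using term-frequency vectors and cosine similarity.
from collections import Counter
import math

def tf_vector(text):
    # Term-frequency vector: raw counts of lowercase whitespace tokens.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical communities, each described by a few keywords.
communities = {
    "running club": "running marathon training sports fitness",
    "art circle": "painting drawing sketch gallery art",
}

user_profile = "I enjoy marathon running and general fitness training"
user_vec = tf_vector(user_profile)
best = max(communities, key=lambda c: cosine(user_vec, tf_vector(communities[c])))
print(best)  # → running club
```

A real system would build the user vector from tracked behavior rather than a single sentence, but the ranking step stays the same.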
6LoWPAN was introduced by the IETF as a standard protocol to interconnect tiny, constrained devices across IPv6 clouds. 6LoWPAN supports a QoS feature based on two priority bits. So far, little interest has been paid to this QoS feature, and there are no implementations of it in real networks. In this paper, we evaluate the capacity of these priority bits to provide QoS in various scenarios. We show that under very heavy or very low network load, these bits have a limited effect on the delay.
The story of the Nahal holding in the Old City: the beginning of settlement in the Jewish Quarter after the Six-Day War.
The story of the 'Moriah' Nahal holding, the first settlement in the Jewish Quarter of Jerusalem after the Six-Day War. The Old City of Jerusalem, 1967-1972.
Spam on social networks has become a major issue globally, and in recent years spam detection on social networks has drawn increasing attention. Spammers have realized that social networks are vulnerable to attack and exploit them for malicious purposes. The influx of spam poses a serious threat to individuals, organizations, governments, and institutions; left unchecked, spam threatens to undermine resource sharing, interactivity, and openness. The ubiquitous use of social networks generates huge amounts of social data, which gives spammers the leverage to perform various forms of malicious attacks and spam activities. This paper surveys three computational categorization issues in social networks, namely size, noise, and dynamism, which make social network data complex to analyse. The paper discusses the various data mining techniques used over the decades to mine diverse aspects of social network sites, from the earliest to the most recent models, including novel algorithms such as the Porter Stemmer algorithm and TF-IDF. General Terms: Porter Stemmer Algorithm (PSA), TF-IDF Algorithm.
The paper attempts to analyze whether the sentiment stability of financial 10-K reports over time can determine a company's future mean returns. A diverse portfolio of stocks was selected to test this hypothesis. The proposed framework downloads companies' 10-K reports from the SEC's EDGAR database and passes them through a preprocessing pipeline to extract critical sections of the filings for NLP analysis. Using the Loughran and McDonald sentiment word list, the framework generates sentiment TF-IDF vectors from the 10-K documents, calculates the cosine similarity between two consecutive 10-K reports, and proposes to leverage this cosine similarity as the alpha factor. To analyze the effectiveness of this alpha factor at predicting future returns, the framework uses the alphalens library to perform factor return analysis and turnover analysis, and to compare the Sharpe ratios of potential alpha factors. The results show a strong correlation between the sentiment stability of the portfolio's 10-K statements and its future mean returns. For the benefit of the research community, the code and Jupyter notebooks related to this paper have been open-sourced on GitHub.
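The alpha-factor computation described above, sentiment-restricted TF-IDF of two consecutive filings followed by cosine similarity, can be sketched as below. The short word list and the two "filing" snippets are toy stand-ins, not the Loughran-McDonald list or real EDGAR filings:

```python
# Sketch: cosine similarity between sentiment-restricted TF-IDF vectors
# of two consecutive 10-K texts, used as a "sentiment stability" score.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy sentiment vocabulary (assumption; the paper uses Loughran-McDonald).
sentiment_words = ["adverse", "litigation", "decline", "gain", "strong", "loss"]

filing_y1 = "strong gain this year despite minor litigation exposure"
filing_y2 = "strong gain continued despite litigation exposure and a small decline"

# Restrict TF-IDF to the sentiment vocabulary only.
vec = TfidfVectorizer(vocabulary=sentiment_words)
X = vec.fit_transform([filing_y1, filing_y2])

# Stability of sentiment tone between the two consecutive filings.
alpha = cosine_similarity(X[0], X[1])[0, 0]
print(round(alpha, 3))
```

A value near 1 indicates stable sentiment tone year over year; lower values flag a shift in tone.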
Data mining is the process of analyzing data to discover knowledge in databases. One of the techniques of data mining is classification, and the neural network has emerged as a popular classification algorithm. In this research, the backpropagation neural network algorithm is adapted for text mining to classify advisor lecturers based on students' final projects. Generally, small neural network structures are faster when deployed. The use of SVD and weight initialisation for optimizing the neural network structure is proposed: SVD is used to identify and eliminate redundant hidden nodes. Moreover, the optimal neural network size is highly dependent on the weight initialisation. The method starts from a fairly large network and dynamically removes unimportant connections. The experiment was run 5 times for each testing scenario. The results showed that the neural network algorithm with the pruning method and a large amount of training data produces better results, with an accuracy of 85%, a precision of 90.63%, a recall of 85%, and an F-measure of 87.72%. 1. Introduction 1.1 Background The final project is a student's scientific work and one of the requirements for obtaining a bachelor's degree. In preparing a final project, students need a supervising lecturer for consultation while completing it. The supervisor should be someone who masters the field matching the student's final project so that the supervision process can run well. At the Department of Informatics Engineering of UMM, the process of assigning supervisors is still done manually, relying on personal knowledge of the required lecturer expertise. Therefore, an analysis of lecturer expertise matching students' final project topics is needed.
In this final project research, the researcher applies data mining based on the experience of lecturers who have supervised students, using the topic, title, and abstract keywords of final projects as parameter variables. By recognizing patterns in the variables that describe the final projects a lecturer has supervised, an application can be built to assign final project supervisors using classification techniques. Classification recognizes patterns describing groups of objects that have already been classified and draws conclusions
Increasing progress in numerous research fields and information technologies has led to an increase in the publication of research papers. As a result, researchers spend a lot of time finding interesting research papers close to their field of specialization. Consequently, in this paper we propose a document classification approach that can cluster the text documents of research papers into meaningful categories that share a similar scientific field. The presented approach is based on the essential focus and scope of the target categories, where each category includes many topics. Accordingly, we extract word tokens from the topics that relate to each specific category separately. The frequency of word tokens in documents affects the weight of a document, which is calculated using the numerical statistic term frequency-inverse document frequency (TF-IDF). The proposed approach uses the title, abstract, and keywords of the paper, in addition to the category topics, to perform the classification. Subsequently, documents are classified and clustered into the primary categories based on the highest cosine similarity between category weights and document weights.
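The category-assignment step just described can be sketched as follows: each category is represented by its topic word tokens, both sides are TF-IDF weighted, and the paper goes to the category with the highest cosine similarity. The category vocabularies and the paper text below are invented for illustration:

```python
# Sketch: assign a paper to the category whose topic tokens have the
# highest TF-IDF cosine similarity with the paper's title/abstract text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical categories, each described by its topic word tokens.
categories = {
    "networks": "routing protocol wireless packet latency bandwidth",
    "machine learning": "classifier training features model accuracy dataset",
}

paper = ("a lightweight routing protocol reducing packet latency "
         "in wireless sensor networks")

names = list(categories)
corpus = [categories[n] for n in names] + [paper]
X = TfidfVectorizer().fit_transform(corpus)

cat_vecs = X[: len(names)]   # one row per category
doc_vec = X[len(names):]     # the paper itself
sims = cosine_similarity(doc_vec, cat_vecs).ravel()
print(names[int(sims.argmax())])  # → networks
```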
With the ever-growing amount of text and information in digital space, it is nearly impossible to extract summaries manually. Hence, there is demand for an automatic system that can comprehend such data and deliver relevant information efficiently in a short time. In this project, we have developed an unsupervised extractive text summarizer that pulls out the most important and relevant information from text to form a concise and accurate summary. The system is designed to generate summaries for both categories of dataset, i.e. single documents and multiple documents. Various extractive summarization algorithms such as TextRank, TF-IDF, and Luhn's algorithm are used for experimenting and building the model.
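One of the simpler extractive strategies mentioned, TF-IDF sentence scoring, can be sketched as below: rank sentences by the mean TF-IDF weight of their terms and keep the top-scoring ones. The three example sentences are invented, and this is only one of the algorithms the project combines:

```python
# Sketch of TF-IDF-based extractive summarization: score each sentence
# by the mean TF-IDF weight of its terms and pick the highest scorer.
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

sentences = [
    "The new reactor design improves thermal efficiency.",
    "Lunch was served at noon.",
    "Thermal efficiency gains come from the improved reactor coolant loop.",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(sentences)             # one TF-IDF row per sentence
scores = np.asarray(X.mean(axis=1)).ravel()  # mean TF-IDF weight per sentence
summary = sentences[int(np.argmax(scores))]  # one-sentence extractive summary
print(summary)
```

A multi-sentence summary would keep the top-k sentences in their original order instead of just the single best one.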
When managers fail to attend a business meeting, they have to read the transcript from the meeting they missed and get informed about the decisions that have been taken. Text mining may fully automate this process. Support tools which can automatically detect decisions in business meeting transcripts would be a benefit for companies in terms of efficiency and productivity.
This report examines whether Machine Learning for Text Classification can be used to identify useful information in textual data. Specifically, Naive Bayes (NB) and Support Vector Machines (SVM), two popular machine learning algorithms for text classification tasks, are used to explore whether it is possible to recognize decisions (a kind of valuable information for the purpose of this study) in business meeting transcripts.
An imbalanced dataset containing decisions and non-decisions was built from transcripts of the United States Chemical Safety and Hazard Investigation Board (CSB) business meetings. An experiment was conducted comparing the two classifiers, NB and SVM. The results showed that SVM can identify decisions more successfully than NB, achieving 0.92 precision and 0.50 recall, which could be significantly improved with a balanced dataset.
Growth of research article publication across various streams of research is exponential. Searching for a particular article in a research repository is considered a tremendous and time-consuming task. Classifying research articles by their respective domains plays an important role in letting researchers retrieve articles quickly. Hence a popular search mechanism, namely keyword search, has been applied to retrieve appropriate articles, documents, texts, graphs, and even relational databases. When documents from new domains are added to the repository, keywords have to be identified and added to the corresponding domains for proper classification. The numerical statistic TF-IDF is proposed to determine the relevance of a word to a document corpus. Clustering algorithms, namely hierarchical, K-Means, and Fuzzy C-Means, are used to cluster articles based on the TF-IDF relevance factor. The strength of the Fuzzy C-Means clustering is validated using the silhouette cluster validation technique. Finally, performance is evaluated using precision, recall, and F-measure, demonstrating that Fuzzy C-Means clustering achieves better accuracy than K-Means and hierarchical clustering.
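The clustering stage can be sketched as below with plain K-Means standing in for the three algorithms the paper compares (scikit-learn does not ship Fuzzy C-Means). The four article titles are invented:

```python
# Sketch: cluster article texts by their TF-IDF vectors with K-Means.
# K-Means stands in here for the hierarchical / K-Means / Fuzzy C-Means
# comparison described in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "deep neural networks for image recognition",
    "convolutional networks improve image classification",
    "soil nutrients and crop rotation in agriculture",
    "crop yield response to soil fertilizer treatments",
]

X = TfidfVectorizer().fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # docs 0-1 share one cluster, docs 2-3 the other
```

Silhouette validation would then score how well-separated the resulting clusters are (e.g. with `sklearn.metrics.silhouette_score`).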
A recurring problem among students of Universitas Atma Jaya Yogyakarta is the difficulty of obtaining lecture information quickly and accurately. In the near future, Universitas Atma Jaya Yogyakarta will adopt a service-based technology, the Application Program Interface (API). To obtain academic data through the UAJY API, students need an application that serves them through a question-and-answer approach, namely a ChatBot application. The UAJY ChatBot application provides specific information and directs students to lecture topics. The UAJY ChatBot's data source is the data warehouse of Universitas Atma Jaya Yogyakarta. With this service automation, the UAJY ChatBot is expected to become the best solution for obtaining lecture information at Universitas Atma Jaya Yogyakarta.
This presentation refers to the project done by Ms. Sidra Mehtab as part of her MSc (Data Science & Analytics) minor projects series. The project has two parts. In Part I of the project, we carried out a sentiment analysis on Twitter data based on reviews written by customers of six US airlines. The tweets are already classified into three categories: "positive", "negative", and "neutral". Using a supervised classification approach, we trained a Random Forest classifier model on the tweet data, tested it on the test data, and evaluated it on various metrics such as precision, recall, and F1-score. In the second part of the project, we carried out another important text mining task known as topic modeling, using the Scikit-Learn library of Python. We used a food review dataset consisting of 50K text reviews on various food items and categorized the reviews into various topics using a method called Latent Dirichlet Allocation (LDA).
Feature selection and extraction are frequently used approaches for reducing the computational burden in text classification problems. We introduce an extraction method that, for each class, summarizes the characteristics of the sample documents, so that the new features aggregate information on the amount of evidence contained in a document. To construct the abstract features of a new feature space with dimensionality equal to the number of classes, the high-dimensional properties of documents are projected. This paper aims to explore how various feature extraction methods for text data influence text classification results. Two different Bag-of-Words extraction methods are studied, specifically the Count Vector and TF-IDF approaches. An embedding method, the GloVe extraction process, is also investigated. The effectiveness of and improvements to classifiers on standard text classification test sets are compared. The findings show that the choice of extraction method has a substantial effect on the resulting classifications, but that no approach consistently outperforms the others. Instead, the findings indicate the best output on the retrieval measures with GloVe and the best output on the precision measurements with the Bag-of-Words approach. While the main emphasis is on TF-IDF and word embedding methods, various other feature extraction methods are also discussed.
The key to immortality and eternal youth lies in the correct answer to the main question: how can we naively discover new essential, but still hidden, features required for properly training novel adaptive supervised machine learning algorithms (SMLA)?
Automatically assigning documents to predefined categories, one of the several benefits of text classification, is one of the primary steps toward knowledge extraction from raw textual data. In such tasks, words are treated as a set of features. Due to the high dimensionality and sparseness of the feature vectors resulting from traditional feature selection methods, most text classification methods proposed for this purpose lack performance and accuracy. Many algorithms have been applied to the problem of automatic text categorization, which is why we tried newer methods drawing on information extraction, natural language processing, and machine learning. This paper proposes an innovative approach to improve the classification performance of Persian text. Naive Bayes classifiers, which are widely used for text classification in machine learning, are based on conditional probability. We compared the Gaussian, Multinomial, and Bernoulli variants of the naive Bayes algorithm with the SVM algorithm. For statistical text representation, TF, TF-IDF, and character-level 3-grams [6,9] were used. Finally, experimental results are reported on 10 newsgroups.
In the present day, the development of the internet has resulted in a significant rise in the number of electronic documents in several regional languages. As Tamil text data in digital format, in both online and offline modes, is growing significantly, management and retrieval of these documents is a tedious process. Automatic text classification aims to allocate fixed class labels to unclassified text documents. Many natural language processing (NLP) techniques are extremely dependent on the automatic classification of Tamil text documents. Recent developments in machine learning (ML) algorithms help to attain effective Tamil document classification. In this view, this paper introduces an automated Tamil document classification technique using ML models. The presented model involves different processes such as preprocessing, feature extraction, feature selection, and classification. The proposed model uses the term frequency-inverse document frequency (TF-IDF) approach for feature extraction. Besides, the chi-square test is employed to select an optimal set of features. At last, three ML models, random forest (RF), decision tree (DT), and gradient boosting tree (GBT), are applied to determine the class labels of the Tamil documents. To assess the performance of the presented model, a set of simulations was run on a Tamil document dataset collected on our own. The experimental values confirmed the effective classification results of the presented model over the compared methods, showing that the GBT model reached an effective classification outcome with a maximum accuracy of 85.10%, precision of 87.01%, recall of 85.10%, and F1-score of 85.52%.
The sentiment analysis approach determines the sentiment in text content using a keyword intensity or term frequency based approach. Keyword extraction models determine which words in the text data carry sentiment and eliminate the remaining content based on the selection or design of the feature extraction model. The keyword-based features are then transformed into numeric form using ratio, weight, or appearance based descriptions, and further classified using a supervised classification model to identify their orientation. In this paper, the supervised machine learning approach combines count vectorization and TF-IDF based features with chi-square based feature selection for sentiment analysis on the IMDB review database. The proposed feature description model combines various N-gram features, namely unigrams, bigrams, and trigrams, which capture different aspects of the sentiment contained in the text data. The proposed model outperforms the existing layered model based on a count method with TF-IDF. After evaluating the results, the Support Vector Machine (SVM) classification method is found to be the best method with the proposed feature descriptor.
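The pipeline described, unigram-to-trigram TF-IDF features, chi-square feature selection, and an SVM classifier, can be sketched as follows. The four labelled reviews are invented stand-ins for the IMDB data, and `k=10` is an arbitrary illustrative choice:

```python
# Sketch: n-gram TF-IDF features + chi-square selection + linear SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts = [
    "a wonderful and moving film", "truly great acting and story",
    "a dull and boring movie", "terrible plot and bad acting",
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),  # unigrams, bigrams, trigrams
    SelectKBest(chi2, k=10),              # keep the 10 strongest features
    LinearSVC(),
)
clf.fit(texts, labels)
print(clf.predict(["a dull and boring story"]))
```

On a real corpus, `k` would be tuned on held-out data rather than fixed.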
Near-miss incidents can be treated as events that signal weaknesses in the safety management system (SMS) at a workplace. Analyzing near-misses will reveal relevant root causes behind such incidents so that effective safety-related interventions can be developed beforehand. Despite a huge potential for workplace safety improvements, analysis of near-misses is scant in the literature owing to the fact that near-misses are often reported as text narratives. The aim of this study is therefore to explore text mining for extracting root causes of near-misses from narrative text descriptions of such incidents and to measure their relationships probabilistically. Root causes were extracted with a word cloud technique, and a causal model was constructed using a Bayesian network (BN). Finally, using the BN's inference mechanism, scenarios were evaluated and root causes were listed in prioritized order. A case study in a steel plant validated the approach and raised concerns for a variety of circumstances, such as incidents related to collisions, slips-trips-falls, and working at height.
In this paper we present and validate a novel approach for single-label multi-class document categorization. The proposed categorization approach relies on the statistical property of Principal Component Analysis (PCA), which minimizes the reconstruction error of the training documents used to compute a low-rank category transformation matrix. This matrix allows projecting the original training documents from a given category to a new low-rank space and then optimally reconstructs them to the original space with a minimum loss of information. The proposed method, called Minimum Loss of Reconstruction Information (mLRI) classifier, uses this property, extends and applies it to unseen documents. Several experiments on three well-known multi-class datasets for text categorization are conducted in order to highlight the stable and generally better performance of the proposed approach in comparison with other popular categorization methods.
The growth of interest in epistolary texts over the last few decades has led to a flourishing of international research projects devoted to cataloguing, editing, and studying modern letters, in a collective and coordinated effort to better understand these materials. In this seminar, Dr Gianluca Valenti will introduce epistolarITA, a project dedicated to the edition and analysis of epistolary texts written in Italian between the 15th and 17th centuries and sent from the former Low Countries.
Emotion is the human feeling when communicating with other humans or reacting to everyday events. Emotion classification is needed to recognize human emotions from text. This study compares the performance of the TF-IDF and Word2Vec models for representing features in emotional text classification. We use the support vector machine (SVM) and Multinomial Naïve Bayes (MNB) methods for classifying emotional text in commuter line and Transjakarta tweet data. The emotion classification in this study has two steps: the first step classifies data as containing emotion or no emotion, and the second step classifies emotional data into five types of emotion, i.e. happy, angry, sad, scared, and surprised. This study used three scenarios, namely SVM with TF-IDF, SVM with Word2Vec, and MNB with TF-IDF. The SVM with TF-IDF method generates the highest accuracy in both the first and second classification steps, followed by MNB with TF-IDF, and lastly SVM with Word2Vec. Evaluation using precision, recall, and F1-measure shows that SVM with TF-IDF is the best method overall. This study shows that TF-IDF modeling performs better than Word2Vec modeling and improves classification performance compared to previous studies.
There have been many notable works on plagiarism detection techniques in the English language but very few in the Nepali language, mostly due to the challenges involved in preprocessing the Devanagari script (the script for the Nepali language). The complicated grammatical rules and structure of the Nepali language compared to English, and the lack of available datasets, give rise to these difficulties. So, in this paper, we build a rule-based recursive stemming algorithm for preprocessing Nepali texts to develop a Nepali Plagiarism Detection System using TF-IDF feature vector construction with the cosine similarity measure.
User reviews provide a rich source of information regarding user interests. Many web platforms allow or even encourage their visitors to leave feedback on the products and services they have consumed. Term Frequency (TF) and Inverse Document Frequency (IDF) are two factors that have been used extensively in capturing users' preferences. This paper collects users' reviews from e-tourism web platforms, calculates the TF and the IDF for each user, and adopts a multi-criteria approach to quantify users' preferences and dynamically adapt the websites' design accordingly. It utilizes AHP and similarity methods to determine the relative importance of terms and web pages and then rearranges them into a new web site structure.
A spam filter is a program used to identify unwanted emails and prevent those messages from reaching a user's mailbox. The study focused on how the algorithms can be applied to a collection of e-mails consisting of both ham and spam. First, the working principle and implementation steps of stop-word removal, TF-IDF, and a stemming algorithm on NVIDIA's Tesla P100 GPU are discussed, and the findings are verified by executing the Naïve Bayes algorithm. After fully training and testing on the spam e-mail dataset taken from Kaggle using the proposed method, we obtained a high training accuracy of 99.67% and a testing accuracy of about 99.03% on the multicore GPU, which boosted the speed of execution of the training and testing periods. This improved the training and testing accuracy by around 0.22% and 0.18% respectively compared to applying only Naïve Bayes, i.e. the conventional method, to the same dataset, where training and testing accuracy were 99.45% and 98.85% respectively. Also, the training time on the GPU was 1.361 seconds, about 1.49X faster than the 2.029 seconds on the CPU, and the testing time on the GPU was 1.978 seconds, about 1.15X faster than the 2.280 seconds on the CPU.
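A CPU-only sketch of the filtering pipeline the abstract describes, stop-word removal and TF-IDF weighting feeding a Naïve Bayes classifier, is below. The GPU acceleration and the Kaggle dataset are outside this snippet, and the six messages are invented:

```python
# Sketch: spam filtering with stop-word removal, TF-IDF, and Naive Bayes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "win a free prize claim your reward now",
    "cheap loans act now limited offer",
    "free reward click to claim instantly",
    "meeting moved to tuesday afternoon",
    "please review the attached project report",
    "lunch with the team on friday",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = ham

model = make_pipeline(
    TfidfVectorizer(stop_words="english"),  # drop stop words, weight terms
    MultinomialNB(),
)
model.fit(emails, labels)
print(model.predict(["claim your free prize now"]))  # → [1]
```

A stemming step (e.g. a Porter stemmer applied in a custom tokenizer) would slot in before vectorization in a fuller implementation.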
Fault Tree Analysis (FTA) is a proven technique for finding the root cause of a problem and simplifying it systematically and logically. In auto parts manufacturing companies, line stoppage is a major problem, and thus bottleneck machines are identified. In this case a honing machine, used for honing brake drums, was identified as the bottleneck machine. The problem was the seat check alarm, which halted the machine; only after cleaning the brake drum surface and holes would the machine restart. This was not only time consuming but also delayed the production of parts with respect to the fixed takt time. The burrs of the holes on the fixture seating area also affected the proper seating of the next part on the fixture surface, causing further delay in production. This could have been avoided if a chamfer operation had been added to the rear face of the drum holes in the initial design and process, but that would have meant an additional operation and another machine. The proposed approach solves the problem by changing the fixture plate so that the holes do not fall in the seating area and the burr area is relieved. This requires a new fixture plate design with proper repositioning of the seat check air hole while keeping the clamping area the same. The functioning of the machine was studied for a month after mounting the newly designed fixture plate, and the seat check alarm was not triggered; thus the proposed technique successfully eliminated the stoppage issue, thereby improving production efficiency.
Sentiment analysis is an interdisciplinary field spanning natural language processing, artificial intelligence, and text mining. The key element of sentiment analysis is polarity, i.e., whether the sentiment is positive or negative (Chen, 2012). This study applies the support vector machine classification method to 648 consumer reviews. The data were obtained from consumer reviews on a marketplace where the products sold are mobile phones. The study identifies three aspects of marketplace sentiment: service, delivery, and product. The slang dictionary used for the normalization process contains 552 slang words. The study compares feature-extraction approaches to obtain the best classification result, because classification accuracy is influenced by the feature-extraction process. Comparing n-gram and TF-IDF features using the support vector machine method, unigram features achieved the highest accuracy, at 80.87%. The results show that, for aspect-level sentiment analysis with this feature comparison, the combination of unigram features and a support vector machine classifier is the best model.
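Two preprocessing steps the study relies on, slang normalization against a dictionary and unigram feature extraction, can be sketched as follows. The slang entries and the sample review are invented illustrations, not the paper's 552-word dictionary or its data:

```python
# Hypothetical slang dictionary mapping informal tokens to normal forms.
SLANG = {"gr8": "great", "svc": "service", "thx": "thanks"}

def normalize(review):
    """Replace slang tokens with their dictionary form."""
    return " ".join(SLANG.get(tok, tok) for tok in review.lower().split())

def unigrams(review):
    """Unigram features: the set of individual normalized tokens."""
    return set(normalize(review).split())

print(unigrams("Gr8 svc and fast delivery"))
```

Normalizing before feature extraction matters because "gr8" and "great" would otherwise count as unrelated features.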
The 2020 simultaneous regional elections (Pilkada) in the midst of the COVID-19 pandemic have generated lively discussion both in the real world and in cyberspace, especially on the social medium Twitter. Twitter has been widely used by various communities in recent years and is one of the media that represents public responses to public issues. Ahead of a general election (PEMILU), several parties usually want to know the results of public sentiment on an issue: academics, intellectuals, or even political opponents. Since holding the regional elections has been highly controversial in society, this study analyzes tweets discussing a public issue, namely the 2020 regional elections during the COVID-19 pandemic. Such analysis typically classifies tweets containing public sentiment about the issue. The classification methods used in this research are the Naive Bayes Classifier (NBC) and the Support Vector Machine (SVM). The Naive Bayes Classifier is combined with features that weight terms using probabilities. Tweets in this study are classified by combining two label sets, a sentiment class and a category class. The sentiment classification consists of positive and negative. Test results on the built application show that Naive Bayes delivers better accuracy than the Support Vector Machine; overall, the Naive Bayes method performs well in classifying tweets, with an accuracy rate of 92.2%. Keywords: sentiment analysis, classification, Naive Bayes Classifier.
Social media enable governments to discover events in real time and to forecast public opinion. This study presents a system prototype for measuring public opinion from news channels, Bulletin Board Systems (BBS), and social networking sites, including Facebook. The proposed system aims to improve communication between government officials and ordinary citizens about service delivery. It applies event-driven simulation to accelerate processing speed, and thus provides a better solution for measuring public opinion.
It began in 2004 with a simple idea. By organizing Walks on World Diabetes Day, organisations and individuals could raise awareness about diabetes, and how to prevent it. These Walks would be low-cost, educational, and fun. WDF would help by providing banners, tools, and guidance. Since then, 5 million participants of Global Diabetes Walks worldwide have raised awareness, galvanised communities - and, in some cases, even changed public policy.
Objective. This paper describes the application of a tool for the semantic analysis of a document collection based on term frequency–inverse document frequency (TF-IDF). Methodology. A system based on PHP and MySQL databases is developed for managing a thesaurus, computing TF-IDF (as an indicator of semantic weight), and building a relevance tree (made up of the most relevant concepts of the topic analyzed). The tool was evaluated on the semantic analysis of a document collection in Psychology. Results. The system identified the level of presence of the topic of professional ethics in a collection of documents from the Psychology program. Conclusions. The experience described confirms the viability of the tool for the semantic analysis of a document collection. It underlines the relevance and the capabilities of information professionals in developing tools for information processing. The authors suggest a particular technical approach based on the use of scripts and information flows.
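The TF-IDF weight used above as a semantic-weight indicator can be computed in a few lines. This is a minimal sketch using the plain tf(t,d) · log(N/df(t)) formulation on invented example documents, not the PHP/MySQL system the paper describes:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document TF-IDF weights: tf(t, d) * log(N / df(t))."""
    N = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                       # document frequency of each term
    for toks in tokenized:
        df.update(set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)               # raw term frequency in this doc
        weights.append({t: tf[t] * math.log(N / df[t]) for t in tf})
    return weights

docs = ["professional ethics in psychology",
        "psychology research methods",
        "ethics codes for professional practice"]
w = tf_idf(docs)
```

A term appearing in every document gets weight 0, while rarer terms dominate, which is what makes TF-IDF usable as a relevance signal.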
User reviews provide a rich source of information regarding user interests. Many Web platforms allow or even encourage their visitors to leave feedback on the products and services they have consumed. Term Frequency (TF) and Inverse Document Frequency (IDF) are two factors that have been used extensively to capture users' preferences. This paper collects users' reviews from e-tourism Web platforms, calculates TF and IDF for each user, and adopts a multi-criteria approach in order to quantify users' preferences and dynamically adapt the website's design accordingly. It utilizes the Analytic Hierarchy Process (AHP) and similarity methods to determine the relative importance of terms and Web pages, and then rearranges them in a new website structure. Keywords: Web Adaptation; TF-IDF; AHP; Multi-Criteria Analysis.
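The AHP step mentioned above derives relative importances from a pairwise-comparison matrix. A common approximation of the principal-eigenvector priorities is to normalize each column and average the rows; the 3×3 matrix comparing three hypothetical terms below is an invented example, not data from the paper:

```python
def ahp_priorities(matrix):
    """Approximate AHP priority vector: normalize columns, average rows."""
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return [sum(normalized[i]) / n for i in range(n)]

# pairwise[i][j] = how many times more important item i is than item j;
# the matrix is reciprocal: pairwise[j][i] = 1 / pairwise[i][j].
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_priorities(pairwise)
print(weights)  # priorities sum to 1; the first item ranks highest
```

The resulting vector can then order terms or Web pages by importance before the similarity-based rearrangement.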
Given the ever-growing volume of information on the internet, continuous improvement of the various information-retrieval techniques is sought in order to achieve more efficient and effective results, finding documents that are increasingly relevant to a given query, and in a timely manner. Although many sites offer search mechanisms for publications, an enormous amount of information is lost, since these searches try to match the exact words and do not take into account the semantic space of a similar context. This work therefore aims to build a closed collection of psychology documents collected from Google Scholar in order to apply the term frequency–inverse document frequency (TF-IDF) weighting function and a computational information-retrieval technique called latent semantic analysis (LSA), drawing on the current literature for the creation, application, and comparison of the results obtained through an algorithm developed in Python by the author.
SMS classification technology is important for helping people deal with SMS messages. Although people can classify SMS messages with little or no effort, it remains difficult for computers. Machine learning offers a promising approach to designing algorithms that train computer programs to classify short text messages efficiently and accurately. In this paper we introduce a weighting method based on statistical estimation of the importance of a word for an SMS categorization problem, which classifies mobile SMS messages into predefined classes such as occasions, friendship, sales, etc. All SMS messages are converted into text documents. After preprocessing, a vector space model is prepared and a weight is assigned to each term. The experiments reported in the paper show that this weighting method significantly improves classification accuracy as measured on many categorization tasks.
The Semantic Web opens up new opportunities for data mining research. Identifying a user's current interests from short-term navigational patterns, rather than from explicit user information, has proved to be a potential source for predicting pages that may interest the user. This helps organizations in various analyses, such as web-site improvement. Various techniques are employed to achieve personalized recommendation. This research employs web usage mining techniques for determining the interests of "similar" users, together with a technique for classifying and matching an online user based on his browsing interests. A novel approach for predicting unvisited pages is employed. The complete next-page prediction process represented in the architecture broadly consists of two components: an offline component and an online component.
The Bangla blogosphere is growing rapidly in the information era, and consequently blogs have diverse layouts and categorizations. In this context, automated blog-post classification is a comparatively more efficient way to organize Bangla blog posts in a standard form so that users can easily find the articles they are interested in. In this research, nine supervised learning models, namely support vector machine (SVM), multinomial naïve Bayes (MNB), multi-layer perceptron (MLP), k-nearest neighbours (k-NN), stochastic gradient descent (SGD), decision tree, perceptron, ridge classifier, and random forest, are utilized and compared for classifying Bangla blog posts. Moreover, to evaluate performance in predicting blog posts across eight categories, three feature extraction techniques are applied: unigram TF-IDF (term frequency-inverse document frequency), bigram TF-IDF, and trigram TF-IDF. The majority of the classifiers achieve above 80% accuracy, and other performance evaluation metrics also show good results when comparing the selected classifiers.
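The unigram, bigram, and trigram features that feed the TF-IDF representations above differ only in the window size over the token sequence. A minimal sketch, with an invented English sample sentence standing in for a Bangla post:

```python
def ngrams(text, n):
    """Contiguous word n-grams of a text, used as TF-IDF features."""
    toks = text.lower().split()
    return [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]

post = "bangla blog post classification"
print(ngrams(post, 1))  # ['bangla', 'blog', 'post', 'classification']
print(ngrams(post, 2))  # ['bangla blog', 'blog post', 'post classification']
print(ngrams(post, 3))  # ['bangla blog post', 'blog post classification']
```

Higher-order n-grams capture word order and short phrases at the cost of a much larger, sparser feature space.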
In recent times, the exponential growth of the Internet has resulted in an enormous number of electronic documents in several regional languages apart from English. Since numerous documents in the Tamil language are being generated from news, blogs, eBooks, and entertainment, automated classification of Tamil documents is needed. As automated Tamil document classification has not been explored proficiently, this study focuses on the development of deep learning (DL) models for Tamil document classification. This paper introduces an ensemble of feature selection with DL-based classification models for Tamil documents. The presented model primarily involves preprocessing to remove unwanted data and improve data quality to a certain extent. Besides, the term frequency-inverse document frequency (TF-IDF) approach is used to extract features from the Tamil documents. In addition, two feature selection (FS) techniques, namely Chi-Squared (CS) and Extra Tree (ET) classifier models, are employed. The proposed method also uses deep neural network (DNN) and convolutional neural network (CNN) models for classification purposes. A detailed experimental analysis is carried out using a Tamil document dataset that we gathered ourselves. The experimental values showed that the ETFS-CNN model obtained effective classification outcomes, with a maximum accuracy of 90%, precision of 90.57%, recall of 90%, and F-score of 89.89%.
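The chi-squared feature-selection step above scores each term by how strongly its presence depends on the class. A minimal sketch of the standard 2×2 contingency-table statistic, with hypothetical document counts rather than figures from the Tamil dataset:

```python
def chi2(n11, n10, n01, n00):
    """Chi-squared statistic for a 2x2 term/class contingency table.
    n11: in-class docs containing the term, n10: in-class docs without it,
    n01: out-of-class docs containing it, n00: out-of-class docs without it.
    """
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    return num / den

# A term seen in 40 of 50 in-class docs but only 5 of 50 out-of-class
# docs scores high (kept); a term spread evenly scores zero (dropped).
print(chi2(40, 10, 5, 45))   # strongly class-dependent term
print(chi2(25, 25, 25, 25))  # → 0.0 (class-independent term)
```

Keeping only the top-scoring terms shrinks the TF-IDF feature space before it is fed to the DNN/CNN classifiers.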