Text Segmentation
Recent papers in Text Segmentation
Reliable automatic semantic annotation systems do not exist for many languages. Their creation depends in many respects on the construction of gold-standard corpora. In this paper we present a system for supporting the semi-automatic...
Current methods of coding recall, summarization, talk-aloud, and question-answering data are inherently unreliable and not effectively documented. If the process of coding protocol data could be even partially automated, this would be an...
In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems found in the literature that have been used to perform the task of...
CAPTCHA is a test that can tell humans and computer programs apart automatically. The aim is to allow the server to identify whether the visitor is a human or a computer, and to provide services only to humans. It can improve the current server...
Translators play an important role in conveying ideas and thoughts from one language to another. A context-based approach to translating English words in images into the equivalent Hindi words is described in this paper. Text...
The article discusses criteria for identifying non-canonical subjects with features of null pronouns, and the correlation between text segmentation and the grammatical parameters that license the use of different types of null subjects in...
Publication info:
Current Trends in Scripture Translation (UBS Bulletin 198/199, 2005)
In this paper, we exploit natural language processing techniques to build a system that automatically segments Hadith into its two main components, Isnad and Matn. We evaluate the previous attempts to segment Hadith and identify the...
Handwritten text line segmentation is an important task in optical character recognition. The proposal discusses a novel technique for segmenting lines in handwritten text documents written in the Kannada language. The algorithm...
Recognition of dimensioning text in engineering drawings is an essential part of the drawing understanding process, as this text provides the exact dimensions and tolerances of the object described in the drawing. We consider engineering...
This article reviews the role of relevance in text processing. It argues that relevance instructions provided by instructors and texts help readers identify text segments that are germane to a reading goal. A taxonomy of...
This paper addresses two problems: deriving a taxonomy from a text and concept-oriented text segmentation. The Formal Concept Analysis (FCA) method is applied to both of these linguistic problems. The proposed segmentation...
A reading-based CAPTCHA, called ‘ScatterType,’ designed to resist character-segmentation attacks, is described. Its challenges are pseudorandomly synthesized images of text strings rendered in machine-print typefaces: within each image,...
An optical character recognition (OCR) system may be the solution to data entry problems, since it saves printed documents as soft copies. Therefore, OCR systems are being developed for all languages, and Kurdish is no exception....
In this paper the Naive Bayes Classifier (NBC) is introduced for text segmentation. A set of training data is generated from a wide range of document images for training the NBC. The images used for generating the training data include...
Over the past several years, researchers have applied different methods of text segmentation. Text segmentation is defined as a method of splitting a document into smaller segments, each assumed to have its own relevant meaning. Those segments...
When presented with a retrieved document, users of a search engine are usually left with the task of pinning down the relevant information inside the document. Often this is done by a time-consuming combination of skimming, scrolling and...