Jihad El-Sana
  • Israel
We present a fully automated, learning-free method for line detection in manuscripts. We begin by separating components that span multiple lines, then remove noise and small connected components such as diacritics. We apply a distance transform to the image to create the image skeleton. The skeleton is pruned and its vertices and edges are detected in order to generate the initial document graph. We calculate each vertex's v-score from its t-score and l-score, quantifying its distance from being an absolute link in a line. In a greedy manner, we classify each edge in the graph as either a link, a bridge, or a conflict edge. We first merge every two edges classified as links, then merge the conflict edges. Finally, we remove the bridge edges from the graph, generating its final form. Each edge in the final graph corresponds to one extracted line. We applied the method to both the public and private sections of the DIVA-hisDB dataset. The public section was used in the recently conducted Layout Analysis for Challenging Medieval Manuscripts Competition, and our results surpass those of the vast majority of the participating systems.
Text lines are important parts of handwritten document images and are easier for downstream applications to analyze. Despite recent progress in text line detection, text line extraction from a handwritten document remains an unsolved task. This paper proposes to use a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines. These blob lines assist an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. Furthermore, the extraction stage is capable of identifying the pixels of text lines with various heights and interline proximity, independent of their orientations. Besides, it can finely split touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on the VML-AHTE, VML-MOC and Diva-HisDB datasets. The first contains overlapping, touching and close text lines wi...
Arabic script is naturally cursive and unconstrained and, as a result, automatic recognition of its handwriting is a challenging problem. The analysis of Arabic script is further complicated in comparison to Latin script by the obligatory dots/strokes that are placed above or below most letters. In this paper, we introduce a new approach that performs online Arabic word recognition on a continuous word-part level, while performing training on the letter level. In addition, we appropriately handle delayed strokes by first detecting them and then integrating them into the word-part body. Our current implementation is based on Hidden Markov Models (HMM) and correctly handles most of the Arabic script recognition difficulties. We have tested our implementation using various dictionaries and multiple writers and have achieved encouraging results for both writer-dependent and writer-independent recognition.
Arabic script is naturally cursive and unconstrained. As a result, automatic recognition of its handwriting is a challenging problem. In comparison to Latin script, analysis of Arabic script is further complicated by both cursiveness and the obligatory dots or strokes that are placed above or below most letters. The inherent cursiveness and the large number and varied positions of additional strokes discourage the segmentation-free approach to analysis because of the anticipated huge number of combinations needed to produce different words. At the same time, segmentation of Arabic script into individual characters is almost impossible and is often responsible for many misclassified items in Arabic script recognizers. This paper presents statistics on the Arabic language using a very large corpus of Arabic words. These statistical results, which could be used to improve the efficiency and accuracy of Arabic script recognizers, also indicate that a holistic approach is computationally a...
This course will focus on describing techniques for handling datasets larger than main memory in scientific visualization and computer graphics. Recently, several external memory techniques have been developed for a wide variety of graphics and visualization problems, including surface simplification, volume rendering, isosurface generation, ray tracing, surface reconstruction, and so on. This work has had significant impact given that in recent years there has been a rapid increase in the raw size of datasets. Several ...
Paleography is the study of ancient and medieval handwriting. It is essential for understanding, authenticating, and dating historical texts. Across many archives and libraries, many handwritten manuscripts are yet to be classified. Human experts can process only a limited number of manuscripts; therefore, there is a need for an automatic tool for script type classification. In this study, we utilize a deep-learning methodology to classify medieval Hebrew manuscripts into 14 classes based on their script style and mode. Hebrew paleography recognizes six regional styles and three graphical modes of scripts. We experiment with several input image representations and network architectures to determine the appropriate ones and explore several approaches for script classification. We obtained the highest accuracy using a hierarchical classification approach. At the first level, the regional style of the script is classified. Then, the patch is passed to the corresponding model at the second lev...
We present an unsupervised text line segmentation method that is inspired by the relative variance between text lines and spaces among text lines. Handwritten text line segmentation is important for the efficiency of further processing. A common method is to train a deep learning network for embedding the document image into an image of blob lines that are tracing the text lines. Previous methods learned such embedding in a supervised manner, requiring the annotation of many document images. This paper presents an unsupervised embedding of document image patches without a need for annotations. The main idea is that the number of foreground pixels over the text lines is relatively different from the number of foreground pixels over the spaces among text lines. Generating similar and different pairs relying on this principle definitely leads to outliers. However, as the results show, the outliers do not harm the convergence and the network learns to discriminate the text lines from th...
In this paper, we present a sub-word recognition method for historical Arabic manuscripts, using convolutional neural networks. We investigate the benefit of extending the training set with synthetically created samples in comparison to augmentation. We show that annotating around ten pages of a manuscript and extending this set is sufficient for successful sub-word recognition in the whole manuscript. In addition, we show the contribution of using different combinations of training sets and compare their sub-word recognition performance in the whole manuscript.
We present a novel framework for automatic and efficient synthesis of historical handwritten Arabic text. The main purpose of this framework is to assist word spotting and keyword searching in handwritten historical documents. The proposed framework consists of two main procedures: building a letter connectivity map and synthesizing words. A letter connectivity map includes multiple instances of the various shapes of each letter, since a letter in Arabic usually has multiple shapes depending on its position in the word. Each map represents one writer and encodes the specific handwriting style. The letter connectivity map is used to guide the synthesis of any Arabic continuous subword, word, or sentence. The proposed framework automatically generates the letter connectivity map annotation from several previously annotated historical pages. Once the letter connectivity map is available, our framework can synthesize the pictorial representation of any Arabic word or sentence from its text representation. The writing style of the synthesized text resembles the writing style of the input pages. The synthesized words can be used in word spotting and many other historical document processing applications. The proposed approach provides an intuitive and easy-to-use framework to search for a keyword in the rest of the manuscript. Our experimental study shows that our approach enables accurate results in word spotting algorithms.
Non-rigid 3D shape retrieval has become a research hotspot in the computer graphics, computer vision, and pattern recognition communities, among others. In this paper, we present the results of the SHREC'15 Track: Non-rigid 3D Shape Retrieval. The aim of this track is to provide a fair and effective platform to evaluate and compare the performance of current non-rigid 3D shape retrieval methods developed by different research groups around the world. The database utilized in this track consists of 1200 3D watertight triangle meshes, which are equally classified into 50 categories. All models in the same category are generated from an original 3D mesh by applying various pose transformations. The retrieval performance of a method is evaluated using 6 commonly used measures (i.e., PR-plot, NN, FT, ST, E-measure and DCG). In total, 11 groups took part in this track with 37 submissions. The evaluation results and comparative analyses described in this paper not only show the bright future of research on non-rigid 3D shape retrieval but also point out several promising research directions in this topic.
Trees, bushes, and other plants are ubiquitous in urban environments, and realistic models of trees can add a great deal of realism to a digital urban scene. There has been much research on modeling tree structures, but limited work on reconstructing the geometry of real-world trees -- even then, most works have focused on reconstruction from photographs aided by significant user interaction. In this paper, we perform active laser scanning of real-world vegetation and present an automatic approach that robustly reconstructs skeletal structures of trees, from which full geometry can be generated. The core of our method is a series of global optimizations that fit skeletal structures to the often sparse, incomplete, and noisy point data. A significant benefit of our approach is its ability to reconstruct multiple overlapping trees simultaneously without segmentation. We demonstrate the effectiveness and robustness of our approach on many raw scans of different tree varieties.
In this paper we present a system for searching keywords in Arabic handwritten and historical documents using two algorithms, Dynamic Time Warping (DTW) and Hidden Markov Models (HMM). The HMM-based system provides satisfying results when it is possible to provide adequate training samples (which is not always possible in historical documents). The DTW algorithm with a slight modification provides better results even with a small set of training samples. The observation sequences for the matching algorithms are generated by extracting a set of geometric features that have already been shown to obtain good recognition rates for on-line Arabic handwriting. We have adopted the segmentation-free approach, i.e., continuous word-parts are used as the basic alphabet, instead of the usual alphabet letters. The contours of the complete word-parts are used to represent the shapes of the compared word-parts. Additional strokes, such as dots and detached short segments, which are very common in Arabic sc...
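As a rough illustration of the matching step, the following is a minimal sketch of textbook Dynamic Time Warping between two one-dimensional feature sequences. It is not the modified DTW variant the abstract mentions, whose details are not given here; the function name `dtw_distance`, the default absolute-difference cost, and the scalar features are all assumptions made for the example.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Textbook DTW alignment cost between sequences a and b (illustrative only)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical feature sequences: the second repeats one sample,
# which DTW absorbs at no cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a word-spotting setting, each sequence element would more plausibly be a feature vector extracted along a word-part contour, with `dist` replaced by a suitable vector distance.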
Recently, many big libraries all over the world have been scanning their collections to make them publicly available and to preserve historical documents. We present a modular software system which can be used as a tool for semi-automatic processing of historical handwritten Arabic documents. The development of this system is part of the HADARA project, which aims at historical document analysis of Arabic manuscripts and whose team includes engineers and computer scientists as well as users such as linguists and historians. The HADARA system is designed to support script and content analysis, identification, and classification of historical Arabic documents. The system has been created following an iterative development approach, and the current version assists the user interactively and, in part, already automatically. In this paper, a system overview is given and the first modules are presented which support the annotation of a scanned manuscript in a...
Historical manuscript alignment is a widely known problem in document analysis. Finding the differences between manuscript editions is mostly done manually. In this paper, we present a writer-independent deep learning model which is trained on several writing styles and is able to achieve high detection accuracy when tested on writing styles not present in the training data. We test our model using cross-validation: each time, we train the model on five manuscripts and test it on the other two, which are never seen during training. Applying cross-validation to seven manuscripts yields 21 different tests, with an average accuracy of 92.17%. We also present a new alignment algorithm based on a dynamically sized sliding window, which is able to successfully handle complex cases.
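The count of 21 tests follows from choosing 2 held-out manuscripts out of 7, since C(7, 2) = 21. A minimal sketch of enumerating such splits is shown below; the manuscript labels `M1` to `M7` and the helper name `leave_two_out_splits` are hypothetical and stand in for whatever identifiers the actual experiments use.

```python
from itertools import combinations

# Hypothetical placeholder names for the seven manuscripts.
manuscripts = [f"M{i}" for i in range(1, 8)]


def leave_two_out_splits(items):
    """Yield (train, test) pairs where test holds exactly two items."""
    for test in combinations(items, 2):
        train = [m for m in items if m not in test]
        yield train, list(test)


splits = list(leave_two_out_splits(manuscripts))
print(len(splits))  # number of train/test configurations
```

Each split trains on five manuscripts and tests on the remaining two, so no test manuscript ever appears in its own training set.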
In this paper we present Tangible Stickers, a tangible interface framework based on small devices that include Inertial Measurement Unit (IMU) sensors, such as gyroscopes and accelerometers. These Tangible Input Devices (TID) are attached to physical objects, turning them into input devices which transmit the sensed data wirelessly to a paired server. The server maintains the states of its paired devices and exposes these devices with their state to interactive applications connected to the server. These applications interact with the paired devices and augment their attached physical objects, creating a tangible user interface. Our framework enables application developers to easily incorporate a tangible interface into their applications, which communicate with the server to receive the state of these devices and update the state of their digital counterparts. We have implemented the proposed framework, tested our implementation on various scenar...
Recently, several external memory techniques have been developed for a wide variety of graphics and visualization problems, including surface simplification, volume rendering, isosurface generation, ray tracing, surface reconstruction, and so on. This work has had significant impact given that in recent years there has been a rapid increase in the raw size of datasets. Several technological trends are contributing to this, such as the development of high-resolution 3D scanners, and the need to visualize ASCI-size (Accelerated Strategic Computing Initiative) datasets. Another important push for this kind of technology is the growing speed gap between main memory and caches; such a gap penalizes algorithms which do not optimize for coherence of access. For these reasons, much research in computer graphics focuses on developing out-of-core (and often cache-friendly) techniques. This paper surveys fundamental issues, current problems, and unresolved solutions, and aims to provide...
This paper publishes a natural and very complicated dataset of handwritten documents with multiply oriented and curved text lines, the VML-MOC dataset. These text lines were written as remarks on the page margins by different writers over the years. They appear at different locations, with orientations that range between 0° and 180°, or in curvilinear forms. We evaluate a multi-oriented Gaussian-based method to segment these handwritten text lines, which are skewed or curved in any orientation. It achieves a mean pixel Intersection over Union score of 80.96% on the test documents. The results are compared with those of a single-oriented Gaussian-based text line segmentation method.
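For readers unfamiliar with the metric, here is a minimal sketch of pixel Intersection over Union between a predicted and a ground-truth binary mask, averaged over several pairs. It is a generic formulation and may differ from the exact evaluation protocol used for VML-MOC (for example, in how predicted lines are matched to ground-truth lines); the function names and the list-of-rows mask format are assumptions for the example.

```python
def pixel_iou(pred, truth):
    """IoU of two binary masks given as equally sized lists of rows of 0/1."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0   # pixel set in both masks
            union += 1 if (p or t) else 0    # pixel set in at least one mask
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average pixel IoU over (prediction, ground truth) mask pairs."""
    scores = [pixel_iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


# Toy masks: one pixel overlaps out of three covered pixels.
print(pixel_iou([[1, 1, 0]], [[0, 1, 1]]))
```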
Online handwriting recognition of Arabic script is a difficult problem since it is naturally both cursive and unconstrained. The analysis of Arabic script is further complicated in comparison to Latin script by the obligatory dots/strokes that are placed above or below most letters. This paper introduces a Hidden Markov Model (HMM) based system to provide solutions for most of the difficulties inherent in recognizing Arabic script, including letter connectivity, position-dependent letter shaping, and delayed strokes. This is the first HMM-based solution to online Arabic handwriting recognition. We report successful results for writer-dependent and writer-independent word recognition.
In historical document image processing, datasets account for a significant part of any research and are crucial for the diversity and abundance of experimental results, which contribute to the development of new algorithms to meet new challenges. Moreover, they are very important for benchmarking processing algorithms. Numerous publicly available document image datasets in different languages have emerged. However, current segmentation and recognition performances are nearly saturated with respect to the present publicly available datasets. As such, collecting and labelling historical document images is a burden on historical document image processing researchers. This paper introduces a public historical document image dataset, the Pinkas dataset, with new challenges to open room for improvement and identify strengths and weaknesses of available processing algorithms. It is the first dataset in medieval handwritten Hebrew and is fully labeled at word, line and page level by an e...
This paper proposes a novel Convolutional Neural Network model for contour data analysis (ContourCNN) and shape classification. A contour is a circular sequence of points representing a closed shape. To handle the cyclical property of the contour representation, we employ circular convolution layers. Contours are often represented sparsely. To address information sparsity, we introduce priority pooling layers that select features based on their magnitudes. Priority pooling layers pool features with low magnitudes while leaving the rest unchanged. We evaluated the proposed model using letter and digit shapes extracted from the EMNIST dataset and obtained a high classification accuracy.
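To illustrate what a circular convolution does on a closed contour, here is a minimal single-channel sketch in which the kernel wraps around the ends of the sequence, so the first and last points are treated as neighbors. It is written in the cross-correlation form commonly used by CNN layers and is not the ContourCNN implementation; the function name `circular_conv1d` and the scalar per-point features are assumptions for the example.

```python
def circular_conv1d(signal, kernel):
    """Slide `kernel` over `signal` with wrap-around (circular) padding."""
    n, k = len(signal), len(kernel)
    half = k // 2  # center the kernel on each position
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            # The modulo index makes the sequence cyclic.
            acc += kernel[j] * signal[(i + j - half) % n]
        out.append(acc)
    return out


# Hypothetical per-point feature values along a closed contour.
print(circular_conv1d([1, 2, 3, 4], [1, 1, 1]))
```

Because of the wrap-around indexing, rotating the starting point of the contour simply rotates the output by the same amount, which is the property that makes this kind of layer a natural fit for cyclic data.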
Our research project is part of the Visual Media Lab, headed by Professor Jihad El-Sana, at the Department of Computer Science at Ben-Gurion University of the Negev, Israel. In this interdisciplinary project we apply deep learning models to classify script types and sub-types in medieval Hebrew manuscripts. The model incorporates the techniques and databases of Hebrew paleography and (with reservations) Hebrew codicology. The main theoretical base of our project is the SfarData dataset, which includes the full codicological descriptions and paleographical definitions of all dated medieval Hebrew manuscripts up to the year 1540. In some exceptional cases, we go beyond this dataset framework. The major source of the data, in terms of high-definition photos of manuscripts, is the Institute of Microfilmed Hebrew Manuscripts at the National Library of Israel, which has undertaken the mission of collecting copies of all extant Hebrew manuscripts from all over the world. We mostly use manuscripts fro...