- abstract, September 2016
Table Modelling, Extraction and Processing
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 1–2. https://doi.org/10.1145/2960811.2967173
This tutorial is targeted at academics and practitioners, both within and outside of the Document Engineering community, who are confronted with table processing tasks such as information extraction and conversion, or have an interest in the topic, and ...
- abstract, September 2016
Document Changes: Modeling, Detection, Storage and Visualization (DChanges 2016)
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 5–6. https://doi.org/10.1145/2960811.2967169
The DChanges series of workshops focuses on changes in all their aspects and applications: algorithms to detect changes, models to describe them and techniques to present them to the final users are only some of the topics we investigate. The workshop ...
- short-paper, September 2016
Rendering Mathematics for the Web using Madoko
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 111–114. https://doi.org/10.1145/2960811.2967168
Madoko [6-8] is a novel authoring system for writing complex documents. It is especially well suited for complex academic or industrial documents, like scientific articles, reference manuals, or math-heavy presentations. One particular important aspect ...
- short-paper, September 2016
NCM 3.1: A Conceptual Model for Hyperknowledge Document Engineering
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 55–58. https://doi.org/10.1145/2960811.2967167
Most of multimedia documents available today are agnostic to data semantics and their specification language offer little to ease authoring and mechanisms to their players so they can retrieve and present meaningful content to improve user experience. ...
- short-paper, September 2016
Mass Serialization Method for Document Encryption Policy Enforcement
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 193–196. https://doi.org/10.1145/2960811.2967166
Analytics obtained during the creation of a database of mass serialized codes can also be used to help enforcement of encryption policy on documents. In this paper, we introduce a set of metrics which complement traditional NIST cryptography methods -- ...
- short-paper, September 2016
Bayesian Mixture Models on Connected Components for Newspaper Article Segmentation
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 143–146. https://doi.org/10.1145/2960811.2967165
In this paper we propose a new method for automated segmentation of scanned newspaper pages into articles. Article regions are produced as a result of merging sub-article level content and title regions. We use a Bayesian Gaussian mixture model to model ...
- short-paper, September 2016
A PDF Wrapper for Table Processing
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 115–118. https://doi.org/10.1145/2960811.2967162
We propose a PDF document wrapper system that is specifically targeted at table processing applications. We (i) review the PDF specifications and identify particular challenges from the table processing point of view, (ii) specify a table-oriented ...
- short-paper, September 2016
Frequent Multi-Byte Character Subtring Extraction using a Succinct Data Structure
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 103–106. https://doi.org/10.1145/2960811.2967161
Frequent string mining is widely used in text processing to extract text features. Most researchers have focused on text using single-byte characters. Consequently, their applications have problems when applied to text represented with multibyte ...
- short-paper, September 2016
Assessing Concept Weighting in Integer Linear Programming based Single-document Summarization
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 205–208. https://doi.org/10.1145/2960811.2967160
Some of the recent state-of-the-art systems for Automatic Text Summarization rely on the concept-based approach using Integer Linear Programming (ILP), mainly for multi-document summarization. A study on the suitability of such an approach to single-...
- short-paper, September 2016
Towards Cohesive Extractive Summarization through Anaphoric Expression Resolution
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 201–204. https://doi.org/10.1145/2960811.2967159
This paper presents a new method for improving the cohesiveness of summaries generated by extractive summarization systems. The solution presented attempts to improve the legibility and cohesion of the generated summaries through coreference resolution. ...
- short-paper, September 2016
Mobile Summarizer and News Summary Navigator: Two Multilingual News Article Summarization Tools for Mobile Devices
- Luciano Cabral,
- Manoel Neto,
- Artur Borges,
- Rafael Lins,
- Rinaldo Lima,
- Rafael Mello,
- Marcelo Riss,
- Steven J. Simske
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 107–110. https://doi.org/10.1145/2960811.2967156
Mobile devices such as smart phones and tablets are omnipresent in modern societies. Such devices allow browsing the Internet. This paper briefly describes two tools for news article summarization in mobile devices that attempts to automatically collect ...
- short-paper, September 2016
METIS: A Multi-faceted Hybrid Book Learning Platform
- Lei Liu,
- Rares Vernica,
- Tamir Hassan,
- Niranjan Damera Venkata,
- Yang Lei,
- Jian Fan,
- Jerry Liu,
- Steven J. Simske,
- Shanchan Wu
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 31–34. https://doi.org/10.1145/2960811.2967155
Today, students are offered a wide variety of alternatives to printed material for the consumption of educational content. Previous research suggests that, while digital content has its advantages, printed content still offers benefits that cannot be ...
- short-paper, September 2016
Automated Intrinsic Text Classification for Component Content Management Applications in Technical Communication
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 95–98. https://doi.org/10.1145/2960811.2967153
Classification models are used in component content management to identify content components for retrieval, reuse and distribution. Intrinsic metadata, such as the assigned information class, play an important role in these tasks. With the increasing ...
- short-paper, September 2016
Combining Taxonomies using Word2vec
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 131–134. https://doi.org/10.1145/2960811.2967151
Taxonomies have gained a broad usage in a variety of fields due to their extensibility, as well as their use for classification and knowledge organization. Of particular interest is the digital document management domain in which their hierarchical ...
- short-paper, September 2016
Centroid Terms as Text Representatives
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 99–102. https://doi.org/10.1145/2960811.2967150
Algorithms to topically cluster and classify texts rely on information about their semantic distances and similarities. Standard methods based on the bag-of-words model to determine this information return only rough estimations regarding the ...
- short-paper, September 2016
An Exploratory Study on Managing and Searching for Documents in Software Engineering Environments
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 189–192. https://doi.org/10.1145/2960811.2967149
A large number of documents are usually produced in the software industry. In this work, we conduct a qualitative study to explore the main practices and challenges related to managing these documents. The results of this study are based on interviews ...
- short-paper, September 2016
Extending Data Models by Declaratively Specifying Contextual Knowledge
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 123–126. https://doi.org/10.1145/2960811.2967147
The research data landscape of the arts and humanities is characterized by a high degree of heterogeneity. To improve interoperability, recent initiatives and research infrastructures are encouraging the use of standards and best practices. However, ...
- invited-talk, September 2016
Design is Not What You Think It Is
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Page 83. https://doi.org/10.1145/2960811.2967145
In this talk, Peter Bil'ak will examine the ways that current publishing practices are rooted in the 19th century, and how in order to move forward, we may have to go back to the roots and reconnect with readers. He will also talk about his recent ...
- research-article, September 2016
SEL: A Unified Algorithm for Entity Linking and Saliency Detection
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 85–94. https://doi.org/10.1145/2960811.2960819
The Entity Linking task consists in automatically identifying and linking the entities mentioned in a text to their URIs in a given Knowledge Base, e.g., Wikipedia. Entity Linking has a large impact in several text analysis and information retrieval ...
- research-article, September 2016
Digital Preservation Based on Contextualized Dependencies
DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, September 2016, Pages 35–44. https://doi.org/10.1145/2960811.2960818
Most of existing efforts in digital preservation have focused on extending the life of documents beyond their period of creation, without taking into account intentions and assumptions made. However, in a continuously evolving setting, knowledge about ...