Cited By
- Jiménez P, Corchuelo R (2022). On validating web information extraction proposals. Expert Systems with Applications: An International Journal, 199:C. DOI: 10.1016/j.eswa.2022.116700. Online publication date: 23-May-2022
HTML tables have become pervasive on the Web. Extracting their data automatically is difficult because finding the relationships between their cells is not trivial due to the many different layouts, encodings, and formats available. In ...
The World Wide Web has become a primary source of information. Therefore, extracting data from Web sources has become a key technology. In this paper, we introduce Ducky, a semi-automatic system that includes a Web wrapper which extracts data from Web ...
The web is overflowing with implicitly structured data, spread over hundreds of thousands of sites, hidden deep behind search forms, or siloed in marketplaces, only accessible as HTML. Automatic extraction of structured data at the scale of thousands of ...
Elsevier Science Inc.
United States