We report our experience using GML to integrate heterogeneous geospatial datasets in building an experimental geographic information system (GIS). A GML application schema is developed to serve as an application profile that covers 1:1000-scale topographic maps, urban land-use zoning maps, and digital terrain models (DTM). In addition to using and modifying open-source data conversion tools to convert dgn-, shp-, and grd-formatted datasets into GML-coded documents, we have developed our own query tools to retrieve collections of geospatial features from large GML documents. The retrieved collection of GML-coded geospatial features is rendered to SVG on the server side and sent to the client side for visual presentation and navigation.
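Place names are often used in digital archives to refer to geospatial locations, but traditional gazetteers manage place names in relational databases, and such place-name services cannot help digital archives clearly distinguish place names or disambiguate them. Linked data concerns the use of Semantic Web technologies to connect related data that were not previously interlinked, and it also uses these technologies to lower the barriers to interlinking data. This study describes our practical experience in using Semantic Web technologies to process Taiwan place-name data. To clarify the concept of a place name, we built an ontology that represents the spatio-temporal knowledge carried by place names. Based on this ontology, place-name data can be converted to RDF, and a linked place-name data service can be built with D2R. This linked place-name service demonstrates the disambiguation of place names.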
This paper describes our experience in transforming Traceable Agriculture Product (TAP) records into Linked Open Government Data. Using existing ontologies and vocabularies, a TAP Ontology is developed to clarify the semantics of TAP. To increase the reusability of TAP, the crops and operational processing details of TAP are mapped to the Common Agricultural Vocabulary (CAVOC). Four SPARQL endpoints are developed to support queries over TAP. To demonstrate the reuse of Linked TAP, we develop a Chrome extension, LinkedFood, that offers TAP information by reading ingredients on recipe websites.
Reptile Road Mortality (in Chinese, 路殺社) is a citizen science project that uses Facebook to collect reports of dead animals that have been struck and/or killed by motor vehicles. The use of Facebook makes it easy for citizens to provide their observations. However, the crowdsourced information contributed by citizens through social media is often in unstructured formats such as text and images. Processing unstructured data collections for scientific purposes is a challenge. To engage social media with citizen science, we developed novel methods to transform unstructured data into structured data for scientific purposes.
Positioning technologies such as GPS are directly geo-referenced with geographic coordinates; however, using geographic names to indicate locations is instinctive. Geographic names serve not only as geographic references but also represent the cultural and social landscape. Thus, the concepts behind geographic names are often complicated, diversified, ambiguous, and multi-scaled geospatial objects. There is a need to express place names as canonical and interchangeable geospatial knowledge. Linked data is a new research area that studies how to make data available on the Web and how to interconnect data with the aim of increasing its value for users. Each entry representing a fact in LOD datasets has a Uniform Resource Identifier (URI) that is referenceable and linkable on the Web. The high interconnectivity between entries potentially increases the discoverability, reusability, and utility of information. Therefore, if geographic names can be identified and conne...
Proceedings of the 1st ACM SIGSPATIAL International Workshop on Privacy in Geographic Information Collection and Analysis - GeoPrivacy '14, 2014
Location-based systems (LBS) represent an emerging genre of applications that exploit positioning technologies and facilitate a wide range of location-based services. Unlike conventional information systems, LBS data management is challenging because LBS data is high-dimensional and spatio-temporal in nature, and information leakage may result in location-related privacy crises. The issue has become even more complicated because database outsourcing has become inevitable in view of the emerging popularity of LBS deployment. In this paper, we tackle this research challenge and propose a SecUre Database Outsourcing system, called SUDO. By combining the techniques of Hilbert space-filling curves, different invertible encryption algorithms, and genuine mixed data, we show that SUDO is capable of preserving location privacy for LBS against different attacks. Moreover, the proposed solution is simple, effective, and scalable, and it shows promise in supporting LBS data management with outsourced databases.
International Conference on Earth Observation Data Processing and Analysis (ICEODPA), 2008
Land use and land cover (LULC) data is essential to environmental and ecological research. However, semantic heterogeneity in land use and land cover classifications often results from different data sources, different cultural contexts, and different uses. Therefore, ...
The purposes of this study are to extract the names of species and places for a citizen-science monitoring program, to obtain crowd-sourced data of acceptable quality, and to assess the quality and the uncertainty of predictions based on crowd-sourced data and professional data. We used Natural Language Processing to extract names of species and places from text messages in a citizen science project. Bootstrap and Maximum Entropy methods were used to assess the uncertainty in the model predictions based on crowd-sourced data from the EnjoyMoths project in Taiwan. We compared the uncertainty in the predictions obtained from the project and from the Global Biodiversity Information Facility (GBIF) field data for seven focal species of moth. The proximity to locations of easy access and the Ripley K method were used to test the level of spatial bias and randomness of the crowd-sourced data against GBIF data. Our results show that extracting the names of species and their locations from crowd-sourced data performed well. The results of the spatial bias and randomness tests revealed that the crowd-sourced data and GBIF data did not differ significantly with respect to either spatial bias or clustering. The prediction models developed using the crowd-sourced dataset were the most effective, followed by those developed using the combined dataset. Those that performed least well were based on the small-sample GBIF dataset. Our method demonstrates the potential for using data collected by citizen scientists and for extracting information from vast social networks. Our analysis also shows the value of citizen science data for improving biodiversity information in combination with data collected by professionals.
Papers by Dongpo Deng