Frank S.C. Tseng


    In heterogeneous database systems, partial values can be used to resolve interoperability problems, including domain mismatch, inconsistent data, and missing data. Performing operations on partial values may produce maybe tuples in the query result, and these cannot be compared with one another. Users therefore have no way to tell which maybe tuple is the most likely answer. The concept of partial values is
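    A minimal sketch of how a selection over partial values yields maybe tuples, assuming the usual reading in which a partial value is a set of candidate values exactly one of which is true. The names `match` and `select` and the sample rows are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a partial value is a non-empty set of candidates, exactly one of
# which is the true value. A definite value is a singleton set.
PartialValue = FrozenSet[str]
Row = Tuple[str, PartialValue]  # (tuple id, attribute value)


def match(pv: PartialValue, constant: str) -> str:
    """Classify the comparison `pv = constant` as 'true', 'maybe', or 'false'."""
    if pv == frozenset({constant}):
        return "true"   # the only candidate equals the constant
    if constant in pv:
        return "maybe"  # the constant is one of several candidates
    return "false"      # the constant is not a candidate at all


def select(rows: List[Row], constant: str) -> Tuple[List[str], List[str]]:
    """Return (ids of true tuples, ids of maybe tuples) for `attr = constant`."""
    true_ids: List[str] = []
    maybe_ids: List[str] = []
    for row_id, pv in rows:
        outcome = match(pv, constant)
        if outcome == "true":
            true_ids.append(row_id)
        elif outcome == "maybe":
            maybe_ids.append(row_id)
    return true_ids, maybe_ids


# Hypothetical sample data: department of each employee.
rows = [
    ("e1", frozenset({"Sales"})),
    ("e2", frozenset({"Sales", "HR"})),
    ("e3", frozenset({"Sales", "HR", "IT"})),
    ("e4", frozenset({"HR"})),
]
true_ids, maybe_ids = select(rows, "Sales")
```

    In this sketch `e2` and `e3` both land in the maybe result with nothing to rank one above the other, which is the limitation the abstract describes.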
    ABSTRACT Entity-relationship (E-R) concepts are extended to capture natural language semantics and develop a logical form to represent natural language queries. The logical form can be efficiently transformed into relational algebra for query execution. The whole process provides a clear and natural framework for processing natural language queries to retrieve data from database systems
    ... However, for sum and average, we point out that in general it takes exponential time complexity to do the computations. ... Also, A.L.P. Chen and J.-S. Chiu are with the Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 300, Republic of China. ...
    ABSTRACT Uncertain data in databases were originally denoted as null values, which represent the meaning of "values unknown at present." Null values were generalized into partial values, which correspond to a set of possible values, to provide a more powerful notion. In this paper, we derive some properties to refine partial values into more informative ones. In some cases, they can even be refined into definite values. Such a refinement is possible when there exist range constraints on attribute domains, or referential integrity constraints, functional dependencies, or multivalued dependencies among attributes. Our work in effect eliminates redundant elements in a partial value. By this process, we not only provide a more concise and informative answer to users, but also speed up the computation of queries issued afterward. It also reduces the communication cost when imprecise data must be transmitted from one site to another in a distributed environment.
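    As an illustration of the simplest case mentioned above, a range constraint on the attribute domain, the following sketch drops candidates that violate the constraint. The function names and the age example are hypothetical, and the paper's refinements based on dependencies are not covered here.

```python
from typing import FrozenSet, Iterable


def refine(partial: Iterable[int], allowed: range) -> FrozenSet[int]:
    """Remove candidates that fall outside the allowed range of the domain."""
    refined = frozenset(v for v in partial if v in allowed)
    if not refined:
        # No candidate satisfies the constraint: the data is inconsistent.
        raise ValueError("no candidate satisfies the range constraint")
    return refined


def is_definite(partial: FrozenSet[int]) -> bool:
    """A partial value with a single remaining candidate is a definite value."""
    return len(partial) == 1


# Hypothetical example: an employee's age is recorded as one of {16, 30, 70},
# and the domain constraint says ages must lie in 18..65 inclusive.
age = refine({16, 30, 70}, range(18, 66))
```

    Here two of the three candidates are redundant under the constraint, so the partial value collapses to the definite value 30.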
    Thanks to the rapid proliferation of the Internet, e-learning has been recognized as an effective medium for various kinds of aggressive learners. However, because current learning platforms lack adequate tutoring and guidance functions, casual learners may become frustrated and drift from the original course direction when they confront inflexible course materials and fixed learning models. In the post-COVID-19 era, we believe that the most important function for a personal learning environment (PLE) to offer is a course recommendation process that adaptively provides a versatile course combination scheme for different learners from different perspectives. In this paper, we propose a flexible framework for users to customize their e-learning environment, based on a two-stage Analytic Hierarchy Process (AHP) structure for building adaptive course portfolios. The main objective of our ...
    Integrating heterogeneous data warehouses using XML technologies
    In this paper, we propose a spatiotemporal multi-dimensional modeling framework for data warehouses that trace events in various applications, such as digital contact tracing for COVID-19-related applications. Such a framework provides a progressive evolution from traditional static data management to modern dynamic data analysis with spatiotemporal tracking capabilities for future IoT applications, so that entity-centered or user-centered resource integration and business intelligence applications can be rigorously developed, managed, and properly tracked.
    Graph databases have been widely employed for representing connected pieces of information in different kinds of domains. The data model embraces relationships as a core aspect for connecting objects, and organizes everything into a network for efficient query processing in versatile applications, e.g., on-line social networking, metropolitan traffic modeling, marketing channel simulations, or even counterterrorism analysis. As an emerging technology for encoding network structures, graph databases are also widely used as an infrastructure for social network analytics, which helps us understand phenomena or hidden knowledge in buzz marketing, technology trends, or public issues regarding social behaviors. Although many graph database management systems have been developed, there are still no formal definitions for theoretical graph database modeling. In this paper, we will present a formal definition for graph database model, extend the concept of data warehouse into graph warehouse...
    Uncertain data in databases were originally denoted as null values, which were later generalized to partial values. Based on the concept of partial values, we have further generalized the notion to probabilistic partial values. In this paper, an important operation, division, is fully studied to handle partial values and probabilistic partial values. Due to the uncertainty of partial values and probabilistic partial values, the corresponding extended division may produce maybe tuples and maybe tuples with degrees of uncertainty, respectively. To process this extended division, we decompose a relation consisting of partial values or probabilistic partial values into a set of relations containing only definite values. Bipartite graph matching techniques are then applied to develop efficient algorithms for the extended division that handles partial values. The refinement on the maybe result is also discussed. Finally, we study the extended division that handles probabilistic partial values. Keywords: partial values, probabilistic partial values, graph matching, uncertain data.
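    In today's business environment, it is fair to say that this is an era in which information technology drives changes in business strategy. The growing attention that enterprises pay to supply chain management, enterprise resource planning, customer relationship management, and knowledge management is sufficient proof that information technology has become the key means for enterprises to strengthen their core competitiveness. However, integrating an e-business effectively depends first on Enterprise Application Integration (EAI), especially the integration of heterogeneous systems across organizations. Its main function is to interpret and convert data from different applications on heterogeneous systems and bring them into a unified workflow, thereby integrating the collaborative operation of these applications and providing data format conversion and synchronous or asynchronous automated process handling. In this study, we use the self-describing nature and cross-platform characteristics of XML as the information carrier for electronic data interchange, and propose an XML-based architecture for workflow system integration. When the system generates a required XML document, it triggers the corresponding workflow procedure, achieving information transfer and integration across organizations or heterogeneous systems. Using XML as the main information-bearing medium for documents gives the whole process broad and diverse expressive power. The content that can be represented includes related official document information, process descriptions, and process verification (write-off) records. Combined with system-defined mapping rules for document transformation, the architecture also achieves horizontal integration of heterogeneous databases. Finally, we use an application workflow spanning the Customs of the Republic of China, the Export Processing Zone Administration of the Ministry of Economic Affairs, and manufacturers within the export processing zones as an example to demonstrate the feasibility of the overall system architecture.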
    In the past decade, research in heterogeneous database integration has established a solid framework to ease this task. However, work remains to be done before these achievements can be easily implemented and integrated into Internet applications. In this paper, by employing the metadata of participating sites, we propose using XML, together with XSLT, as a general platform to achieve this task. We first formally define the problems of semantic conflicts among heterogeneous databases and present their solutions. Then, some illustrative examples are presented to show that, by requesting local sites to transform the data into XML format and preparing the corresponding XSLT files on the global site, various kinds of schema integration problems can be unified and integrated into a global view seamlessly. The proposed methodology is not only suitable for heterogeneous database integration, but is also suitable for data warehouse cre...
    Document warehouses, unlike traditional document management systems, contain extensive semantic information about documents, cross-document feature relations, and document grouping or clustering, thus providing accurate and efficient access to business intelligence information. Since documents are multi-dimensional in nature, we claim that traditional indexing methods are not really suitable for document warehousing. In this paper, we propose an indexing structure, called the D-tree, which can facilitate the construction of document cubes. We formally present the related definitions, the design of its storage structure, and related algorithms for D-trees. The above are essential for establishing an infrastructure for combining text processing methods with numeric OLAP processing technologies. Hopefully, the proposed combination of data warehousing and document warehousing will be an important kernel for knowledge management and customer relationship management applications.
    XML (eXtensible Markup Language), a simplified version of SGML (Standard Generalized Markup Language), is designed to enable electronic text interchange via the Internet. Most current approaches store XML documents in file systems or in relational database systems. However, neither the nature nor the design of file systems or relational database schemas fits the XML document structure very well. In
    Abstract In this paper, we will present an effective Fuzzy Frequent Itemset-Based Hierarchical Clustering (F²IHC) approach, which uses fuzzy frequent itemsets discovered by fuzzy association rule mining to improve the clustering accuracy of FIHC (Frequent ...
    Thanks to the rapid proliferation of the Internet, e-learning has been recognized as an effective medium for various kinds of learners. However, the sheer volume of course materials on the Internet may leave learners confused when choosing materials that suit them. In this paper, we propose an approach to construct an adaptive curriculum portfolio recommendation system. It offers tailored course materials for
    Research on accessing databases using natural language usually utilizes an intermediate logical form for the mapping process from natural languages to database query languages. However, work remains to be done to bridge the gap between natural language constructs and database schemas. In this paper, we present a translation scheme for transforming natural language queries into relational algebra through ORM (Object Role Modeling) representations. This approach employs a logical form to represent the natural language queries. The logical form has the merit that it can be mapped from natural language constructs by referring to the conceptual schema modeled by ORM
    Extensible markup language (XML), a simplified version of standard generalized markup language (SGML), is designed to enable electronic text interchange over the Internet. XML documents have a rigorously described structure that may be analyzed by computers and easily understood by humans. Most current approaches store XML documents in file systems or in relational database systems. However, the nature and the
    Data warehousing has been widely adopted by contemporary enterprises. For inter-organizational information sharing, the need for research on integrating heterogeneous data warehouses cannot be over-emphasized. That makes it urgent to establish a systematic methodology for integrating heterogeneous data warehouses via the Internet or proprietary extranets. Traditionally, researchers have employed a canonical format as the integration medium for logical data integration among heterogeneous systems. In this paper, to fully utilize the power of the Internet, we propose a framework and develop a prototype to integrate heterogeneous data warehouses using XML technologies. We first formally define the elements in data warehousing and discuss various semantic conflicts occurring among heterogeneous data cubes. Then, we propose the system architecture and related resolution procedures for all kinds of semantic conflict...
    ABSTRACT Existing parallel algorithms for association rule mining have a large inter-site communication cost or require a large amount of space to maintain the local support counts of a large number of candidate sets. This study proposes a de-clustering approach for distributed architectures, which eliminates the inter-site communication cost, for most of the influential association rule mining algorithms. To de-cluster the database into similar partitions, an efficient algorithm is developed to approximate the shortest spanning path (SSP) that links the transaction data together. The SSP obtained is then used to evenly de-cluster the transaction data into subgroups. The proposed approach guarantees that all subgroups are similar to each other and to the original group. Experimental results show that data size and the number of items are the only two factors that determine the performance of de-clustering. Additionally, based on the approach, most of the influential association rule mining algorithms can be implemented in a distributed architecture to obtain a drastic increase in speed without losing any frequent itemsets. Furthermore, the data distribution in each de-clustered participant is almost the same as that of a single site, which implies that the proposed approach can be regarded as a sampling method for distributed association rule mining. Finally, the experimental results show that the original inadequate mining results can be improved to an almost perfect level.
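    The de-clustering idea can be illustrated roughly as follows. This is only a sketch under stated assumptions: the abstract does not give the paper's SSP approximation, so a greedy nearest-neighbor path over Jaccard distance stands in for it here, and the round-robin split along the path is likewise an assumed way to spread similar transactions evenly. All function names and the sample data are hypothetical.

```python
from typing import FrozenSet, List, Sequence

Transaction = FrozenSet[str]


def jaccard_distance(a: Transaction, b: Transaction) -> float:
    """Distance between two itemsets: 0.0 for identical, 1.0 for disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def greedy_spanning_path(transactions: Sequence[Transaction]) -> List[int]:
    """Assumed stand-in for the SSP approximation: start at transaction 0 and
    repeatedly append the nearest unvisited transaction."""
    if not transactions:
        return []
    remaining = list(range(1, len(transactions)))
    path = [0]
    while remaining:
        last = transactions[path[-1]]
        nearest = min(remaining, key=lambda i: jaccard_distance(last, transactions[i]))
        remaining.remove(nearest)
        path.append(nearest)
    return path


def decluster(transactions: Sequence[Transaction], k: int) -> List[List[int]]:
    """Deal transactions out to k partitions in path order, so that neighbors
    on the path (similar transactions) land in different partitions."""
    partitions: List[List[int]] = [[] for _ in range(k)]
    for position, index in enumerate(greedy_spanning_path(transactions)):
        partitions[position % k].append(index)
    return partitions


# Hypothetical sample data: two "a/b"-style and two "x/y"-style transactions.
sample = [
    frozenset({"a", "b"}),
    frozenset({"x", "y"}),
    frozenset({"a", "b", "c"}),
    frozenset({"x", "y", "z"}),
]
path = greedy_spanning_path(sample)
partitions = decluster(sample, 2)
```

    With this sample, each of the two partitions receives one transaction of each style, which mirrors the claim that the subgroups resemble each other and the original data.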
    During the past decade, data warehousing has been widely adopted in the business community. It provides multi-dimensional analyses of accumulated historical business data to support contemporary administrative decision-making. However, many data warehousing query languages at present provide on-line analytical processing (OLAP) only for numeric data. For example, MDX (Multi-Dimensional eXpressions) has been proposed as a query language to allow describing
    Research on accessing databases using natural languages usually utilizes an intermediate logical form for the mapping process from natural languages to database query languages. However, there is still much that needs to be accomplished to bridge the gap between ...