A mounting wave of data-intensive and knowledge-based applications, such as data mining, data warehousing, and Online Analytical Processing (OLAP), has created a strong demand for more powerful database languages and systems. Several data model extensions (e.g., Object-Relational models), new language constructs (e.g., recursion and OLAP constructs), and various database extenders based on user-defined functions have been proposed to enhance current Database Management Systems (DBMSs). However, state-of-the-art DBMSs are not powerful and general enough for many advanced database applications, and in particular for data mining.
In this thesis, we claim that User-defined Aggregates (UDAs) provide a versatile mechanism for extending the power and applicability of Object-Relational Databases (O-R DBs). We first define the formal semantics of UDAs in logic and then we apply them to SQL DBMSs. After building a series of language prototypes, we designed and implemented AXL. AXL is easy to learn and use for database programmers because it preserves the constructs, programming paradigm and data types of SQL (whereas there is an ‘impedance mismatch’ between SQL and the procedural languages of user-defined functions currently used in O-R DBs). Data independence and parallelizability represent two additional qualities that AXL inherits from database systems. In this thesis, we show that, while adding only minimal extensions to SQL, AXL is very powerful and capable of expressing complex algorithms efficiently. We demonstrate this by coding data mining functions and other advanced applications that, previously, had been a major problem for SQL databases.
Due to its flexibility, SQL-compatibility and ease of use, the AXL approach offers better extensibility mechanisms, in several application domains, than the function libraries now offered by commercial O-R DBs under names such as Datablades or DB-Extenders.
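To make the notion of a user-defined aggregate concrete, the following is a minimal sketch and not AXL code. It uses Python's standard `sqlite3` module, whose `create_aggregate` call registers a class with `step` and `finalize` methods as an aggregate callable from SQL. The `Median` class, the `sales` table, and the `median_by_group` helper are hypothetical names chosen for illustration. Note that this is the host-language style of extension that the abstract contrasts with AXL, where the aggregate itself would be written in SQL.

```python
import sqlite3


class Median:
    """Hypothetical user-defined aggregate: median of the non-NULL inputs."""

    def __init__(self):
        # State kept across the rows of one group.
        self.values = []

    def step(self, value):
        # Called once per input row; NULLs are skipped, as in built-in aggregates.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once per group after the last row; returns the aggregate value.
        if not self.values:
            return None
        ordered = sorted(self.values)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2


def median_by_group(rows):
    """Load (region, amount) rows into an in-memory table and aggregate them."""
    conn = sqlite3.connect(":memory:")
    # Register the class under the SQL name "median", taking one argument.
    conn.create_aggregate("median", 1, Median)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT region, median(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("east", 1.0), ("east", 3.0), ("east", 2.0),
              ("west", 10.0), ("west", 20.0)]
    print(median_by_group(sample))
```

Once registered, the aggregate is used in a query exactly like a built-in such as `SUM` or `AVG`, but its body lives in a procedural language outside SQL, which is the mismatch the abstract refers to.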