Relational Database Management Systems (RDBMS) Testing
Relational database management systems (RDBMSs) often persist mission-critical data which is updated by many applications and potentially thousands if not millions of end users. Furthermore, they implement important functionality in the form of database methods (stored procedures, stored functions, and/or triggers) and database objects (e.g. Java or C# instances). The best way to ensure the continuing quality of these assets, at least from a technical point of view, is to have a full regression test suite which you can run on a regular basis. In this article I argue for a fully automated, continuous regression testing based approach to database testing. Just as agile software developers take this approach to their application code, we should do the same for our databases.
Table of Contents
1. Why test an RDBMS?
2. What should we test?
3. When should we test?
4. How should we test?
   o Database sandboxes
   o Writing database tests
   o Setting up database tests
   o Database testing tools
5. Who should test?
6. Introducing database testing into your organization
7. Best practices
4. Support for evolutionary development. Many evolutionary development techniques, in particular database refactoring, are predicated upon the idea that it must be possible to determine if something in the database has been broken when a change has been made. The easiest way to do that is to simply run your regression test suite.
Table 1. What to test in an RDBMS.
Interface Testing:
   o O/R mappings (including the meta data)
   o Incoming data values
   o Outgoing data values (from queries, ...)
   o Outgoing data values from views, stored procs, ...
Internal Testing:
   o Scaffolding code (e.g. triggers or updateable views) which support refactorings
   o Typical unit tests for your stored procedures, functions, and triggers
   o Existence tests for database schema elements (tables, procedures, ...)
   o View definitions
   o Referential integrity (RI) rules
   o Default values for a column
   o Data invariants for a single column
   o Data invariants involving several columns
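To make a few of the internal tests from Table 1 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` table, its columns, and the helper function names are all hypothetical, invented purely for illustration. A real suite would run against your own schema and your own database server.

```python
import sqlite3

# Hypothetical schema, used only to illustrate the kinds of tests in Table 1.
SCHEMA = """
CREATE TABLE customer (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'active',
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
"""

def make_db():
    """Build a throwaway in-memory database containing the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def table_exists(conn, name):
    """Existence test for a schema element (here, a table)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None

def default_status(conn):
    """Default value test: insert without a status and read it back."""
    conn.execute("INSERT INTO customer (id, name, balance) VALUES (1, 'Ann', 10)")
    return conn.execute("SELECT status FROM customer WHERE id = 1").fetchone()[0]

def rejects_negative_balance(conn):
    """Single-column invariant test: a negative balance must be refused."""
    try:
        conn.execute("INSERT INTO customer (id, name, balance) VALUES (2, 'Bob', -5)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Each helper maps to one row of the table: an existence test, a default value test, and a single-column data invariant test. Referential integrity and multi-column invariants can be checked the same way, by attempting an invalid change and confirming that the database rejects it.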
Test-driven development (TDD) is an evolutionary approach to development which combines test-first development (TFD) and refactoring. When an agile software developer goes to implement a new feature, the first question they ask themselves is "Is this the best design possible which enables me to add this feature?" If the answer is yes, then they do the work to add the feature. If the answer is no, then they refactor the design to make it the best possible and then continue with a TFD approach. This strategy is applicable to developing both your application code and your database schema, two things that you would work on in parallel.
When you first start following a TDD approach to development you quickly discover that to make it successful you need to automate as much of the process as possible. Do you really want to manually run the same build script(s) and the same testing script(s) over and over again? Of course not. So, agile developers have created OSS tools such as ANT, Maven, and Cruise Control (to name a few) which enable them to automate these tasks. More importantly, these tools enable them to integrate their database testing scripts into the build procedure itself.
Agile developers realize that testing is so important to their success that it is something they do every day, not just at the end of the lifecycle. They test as often and early as possible, and better yet they test first. As you can see with the agile system development lifecycle (SDLC) of Figure 3, testing is in fact something that occurs during the development and release cycles, not just during release. Furthermore, many agile software developers realize that you can test more than just your code; you can in fact validate every work product created on a software development project if you choose to. This philosophy is exemplified by the Full Lifecycle Object-Oriented Testing (FLOOT) Methodology.
4. How to Test
Although you want to keep your database testing efforts as simple as possible, at first you will discover that you have a fair bit of both learning and set up to do. In this section I discuss the need for various database sandboxes in which people will test: in short, if you want to do database testing then you're going to need test databases (sandboxes) to work in. I then overview how to write a database test and more importantly describe setup strategies for database tests. Finally, I overview several database testing tools which you may want to consider.
In your development sandbox you'll experiment, implement new functionality, refactor existing functionality, validate your changes through testing, and then, once you're happy with your work, promote it to the project integration sandbox. In this sandbox you will rebuild your system and then run all the tests to ensure you haven't broken anything (if you have, then it's back to the development sandbox). Occasionally, at least once an iteration/cycle, you'll deploy your work to the next level (demo and pre-production testing), and every so often (perhaps once every six to twelve months) into production. The primary advantage of sandboxes is that they help to reduce the risk of technical errors adversely affecting a larger group of people than is absolutely necessary at the time.
Figure 4. Sandboxes.
1. Set up the test. You need to put your database into a known state before running tests against it. There are several strategies for doing so.
2. Run the test. Using a database regression testing tool, run your database tests just like you would run your application tests.
3. Check the results. You'll need to be able to do "table dumps" to obtain the current values in the database so that you can compare them against the results which you expected.
Where does test data come from? For unit testing, I prefer to create sample data with known values. This way I can predict the actual results for the tests that I do write and I know I have the appropriate data values for those tests. For other forms of testing, particularly load/stress, system integration, and function testing, I will use live data so as to better simulate real-world conditions.
Beware Coupling: One danger with database regression testing, and with regression testing in general, is coupling between tests. If you put the database into a known state, then run several tests against that known state before resetting it, then those tests are potentially coupled to one another. Coupling between tests occurs when one test counts on another one to successfully run so as to put the database into a known state for it. Self-contained test cases do not suffer from this problem, although they may be slower as a result of the additional initialization steps they need.
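One way to keep test cases self-contained is to rebuild the known state before every single test, so that no test depends on what an earlier one left behind. The sketch below assumes Python's built-in unittest and sqlite3 modules. The `account` table and the class and method names are hypothetical.

```python
import sqlite3
import unittest

class AccountTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database with known sample data for EACH test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript("""
            CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
            INSERT INTO account (id, balance) VALUES (1, 100);
            INSERT INTO account (id, balance) VALUES (2, 200);
        """)

    def tearDown(self):
        self.conn.close()

    def row_count(self):
        return self.conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]

    def test_delete_removes_row(self):
        self.conn.execute("DELETE FROM account WHERE id = 1")
        self.assertEqual(self.row_count(), 1)

    def test_initial_row_count(self):
        # Would fail after test_delete_removes_row if the two shared state.
        self.assertEqual(self.row_count(), 2)
```

Because `setUp` runs before each test method, the two tests pass in either order. The cost is the extra initialization per test, which is the trade-off described above. The class can be run with `python -m unittest`.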
Testing tools for load testing. Tools simulate high usage loads on your database, enabling you to determine whether your system's architecture will stand up to your true production needs. Examples: Empirix, Mercury Interactive, RadView, Rational Suite Test Studio, Web Performance.
Test Data Generator. Developers need test data against which to validate their systems. Test data generators can be particularly useful when you need large amounts of data, perhaps for stress and load testing.
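To illustrate what a test data generator does, here is a small hand-rolled sketch. It is not a stand-in for any of the commercial tools, and the `customer` table and function names are hypothetical. It assumes Python's built-in random and sqlite3 modules and uses a fixed seed so that every run produces the same data.

```python
import random
import sqlite3

def generate_customers(n, seed=42):
    """Yield n synthetic customer rows; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    for i in range(1, n + 1):
        yield (i, f"customer_{i}", rng.randint(0, 10_000))

def load_customers(conn, n):
    """Create the hypothetical customer table and bulk-load n generated rows."""
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance INTEGER)"
    )
    with conn:  # one transaction for the whole bulk insert
        conn.executemany(
            "INSERT INTO customer (id, name, balance) VALUES (?, ?, ?)",
            generate_customers(n),
        )
```

Seeding the generator keeps the data volume large while leaving the values predictable, which matters if the same data set also feeds regression tests.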
1. Insufficient testing skills. This problem can be overcome through training, through pairing with someone with good testing skills (pairing a DBA without testing skills and a tester without DBA skills still works), or simply through trial and error. The important thing is that you recognize that you need to pick up these skills.
2. Insufficient unit tests for existing databases. Few organizations have adopted the practice of database testing yet, so it is likely that you will not have a sufficient test suite for your existing database(s). Although this is unfortunate, there is no better time than the present to start writing your test suite.
3. Insufficient database testing tools. As I said earlier, we still have a way to go with respect to tools.
4. Reticent DM groups. My experience is that some data management (DM) groups invest more effort playing politics and building their little organizational empires than doing their actual jobs. They may see the introduction of database regression testing, and agile techniques such as test-first development (TFD) and refactoring, as a threat. Instead of supporting your efforts to improve data quality within your organization, highly political DM groups try to thwart your efforts because they're not the ones leading the effort (and because it's such an obvious process improvement that people might start to question why they weren't doing this years ago).
In general, I highly suggest that you read my article Adopting Evolutionary/Agile Database Techniques and consider buying the book Fearless Change, which describes a pattern language for successfully implementing change within organizations.
7. Best Practices
I'd like to conclude this article by sharing a few database testing "best practices" with you:
1. Use an in-memory database for regression testing. You can dramatically speed up your database tests by running them, or at least portions of them, against an in-memory database such as HSQLDB. The challenge with this approach is that, because database methods are implemented differently across database vendors, any method tests will still need to run against the actual database server.
2. Start fresh each major test run. To ensure a clean database, a common strategy is that at the beginning of each test run you drop the database, then rebuild it from scratch taking into account all database refactorings and transformations to that point, then reload the test data, and then run your tests. Of course, you wouldn't do this to your production database. ;-)
3. Take a continuous approach to regression testing. I can't say this enough: a TDD approach to development is an incredibly effective way to work.
4. Train people in testing. Many developers and DBAs have not been trained in testing skills, and they almost certainly haven't been trained in database testing skills. Invest in your people, and give them the training and education they need to do their jobs.
5. Pair novices with people that have database testing experience. One of the easiest ways to gain database testing skills is to pair program with someone who already has them.
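The "start fresh" practice (drop, rebuild from scratch, reload test data) can be sketched as below. This is a minimal illustration, assuming Python's built-in sqlite3 module with an in-memory database standing in for the test database. The `MIGRATIONS` list, `TEST_DATA`, and `rebuild_database` are hypothetical names, and the ordered list of statements merely stands in for whatever change scripts your project actually keeps.

```python
import sqlite3

# Hypothetical ordered history of schema changes (refactorings and
# transformations), applied from first to last on every rebuild.
MIGRATIONS = [
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)",
    "ALTER TABLE customer ADD COLUMN status TEXT NOT NULL DEFAULT 'active'",
    "CREATE INDEX idx_customer_status ON customer (status)",
]

# Hypothetical sample data with known values.
TEST_DATA = [(1, "Ann"), (2, "Bob")]

def rebuild_database():
    """Return a brand-new database with all changes applied and data loaded."""
    # A new in-memory connection plays the role of "drop the database".
    conn = sqlite3.connect(":memory:")
    for statement in MIGRATIONS:
        conn.execute(statement)
    with conn:  # load the test data in one transaction
        conn.executemany(
            "INSERT INTO customer (id, first_name) VALUES (?, ?)", TEST_DATA
        )
    return conn
```

Calling `rebuild_database()` at the start of a test run guarantees that nothing left over from a previous run can influence the results. Against a real database server the same idea applies, with an explicit drop and create step in place of the new in-memory connection.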