Regular readers know that I periodically discuss database performance issues and that I always like to start with a definition of database performance, that I will reiterate here:
Database performance is the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.
If you need clarification of any of these concepts, I direct you to previous columns. (What Do You Mean by Database Performance?, Database Trends and Applications, April 29, 2021.)
This month, I want to take a higher-level view of database performance concepts and introduce a few themes that will likely follow you throughout your career working with database systems and applications. These themes should be viewed as guiding principles to keep in mind when confronted with database performance issues.
The first basic guideline is to understand the 80/20 rule, also known as the Pareto Principle. The 80/20 rule asserts that 80% of outcomes result from 20% of all causes (or inputs) for any given event. This rule is broadly applicable to most areas of business, including computer software and data. Regarding database performance, it can be stated in several different ways. Perhaps the most common one is that 80% of your performance tuning results will come from 20% of your effort. Or stated another way, 20% of your database applications will cause 80% of your problems. Don’t get bogged down in forcing an exact 80% and 20% when applying this rule. The general idea here is to avoid wasting time and effort. By focusing your tuning efforts on the applications and processes that will provide the biggest return on investment you will be following the spirit of the 80/20 rule, as well as avoiding unproductive work.
The second basic guideline is to approach tuning as an iterative process. In other words, tune one thing at a time, instead of attacking multiple issues at once. And, after each tuning step measure the success or failure of the attempt. If you do not approach tuning in this manner, then you may be introducing sub-optimal settings and code because you failed to measure each change. Consider what would happen if you were confronted with a slow performing SQL query. In response, you add an index, collect updated statistics, and change some buffer pool settings, then you rerun the query. If performance is better, which of those changes helped? Was it only one of them, or a combination? And what if performance is now worse? Which change(s) caused that? To be able to answer these questions and tune effectively, it is imperative that you always change only one thing at a time and gauge the results before doing anything else.
Thirdly, you should try to avoid using the words “always” and “never.” There are rarely any rules that always apply. And likewise, it is just as rare that there are things that should never be done. Keep an open mind and consider every option, even if they might be ones you have never tried before (or have been told to avoid). Things change and standards and recommendations that once made sense may become obsolete. So, a wise DBA will avoid saying “always” and “never.” A corollary to this rule is to avoid saying “none” and “every” for similar reasons.
Finally, it is better to design performance into your application programs and system from the start, instead of trying to retrofit performance tweaks at a later time. The goal should be to train your developers to write code that does not require remedial after-the-fact modifications by performance analysts. The first step to becoming a successful database programmer who codes with performance in mind is to develop a relational mindset. A relational database system differs from other traditional types of data storage and programmers need to understand this to write efficient database code. Furthermore, relational systems rely on built-in optimization technology for determining the best way to access data. Programmers should avoid trying to force the DBMS to access data in specific ways and instead rely on the database system’s optimizer.
There are, of course, many additional considerations for coding applications for performance from the very beginning. For example, knowledge of indexes and data clustering is extremely important. Knowing and using the built-in functions that are available to you is also important. And let’s not forget trying to minimize the number of times the same data must be accessed in a program: the more accesses, the less efficient the program.
Other considerations include coding for concurrency, testing multiple variations of SQL, avoiding data type conversions, and more. DBAs should work with their developers to help them understand these things, because it will minimize the needed DBA effort when the application becomes operational.
The Bottom Line
The bottom line is that you should strive to build performance into your application programs from the very beginning. The further into the development process you are, the more painful it becomes to make changes.