SqlForDevs.com: Next Level Database Techniques for Developers (pages 37-40)

This document discusses techniques for optimizing database performance, including pre-sorting tables, pre-aggregating values, and creating indexes on functions and expressions. Pre-sorting tables by common query criteria ensures that rows are stored physically in the optimal order for faster retrieval. Pre-aggregating values avoids having to aggregate large amounts of data during analytical queries. Indexes can be created on functions and expressions to utilize indexes even when columns are transformed in queries.


Pre-Sorted Tables For Faster Access

SELECT *
FROM product_comments
WHERE product_id = 2
ORDER BY comment_id ASC
LIMIT 10;

-- MySQL
CREATE TABLE product_comments (
    product_id bigint,
    comment_id bigint auto_increment UNIQUE KEY,
    message text,
    PRIMARY KEY (product_id, comment_id)
);

-- PostgreSQL
CREATE TABLE product_comments (
    product_id bigint,
    comment_id bigint GENERATED ALWAYS AS IDENTITY,
    message text,
    PRIMARY KEY (product_id, comment_id)
);
CLUSTER product_comments USING product_comments_pkey;

Every row you insert into a table is physically stored in the database's files so that
common tasks like selecting and updating rows are efficient. But the database can't
know how you will use those rows. When building an e-commerce application, you want to
fetch some comments for a product. Typically, these comments are keyed by an incrementing
primary key and are distributed through the table in their insertion order. But you can
enforce that these rows are stored physically in ascending order by product (product_id)
and comment creation order (comment_id). The database can then efficiently find the first
comment and the next nine in the following rows instead of collecting them from 10 different
locations. Whether you use an SSD or an HDD, random access to multiple locations will always
be slower than a single operation fetching multiple consecutive bytes.

This is the most exciting and most complex performance optimization I can teach you for big
data. The article below shares much more information, including implementation obstacles.

Notice: I have already written a more extensive text about this topic on
SqlForDevs.com: Sorted Tables for Faster Range-Scans

Pre-Aggregation of Values for Faster Queries

SELECT SUM(likes_count)
FROM articles
WHERE user_id = 1 AND publish_year = 2022;

Even if your schema is very well-designed and your queries all use a perfect index, they may
still be slow. When analytical queries, e.g. for a dashboard, have to aggregate tens or
hundreds of thousands of rows, performance will suffer drastically. Such queries are
constrained by computational limits: how fast the data can be loaded and the time required
to extract the information from the rows or indexes and aggregate it. This operation
is very fast for small amounts of data, but the bigger it gets, the more you should look into
storing pre-aggregated values. No intelligent indexing will beat the performance
improvement of not having to aggregate tens of thousands of values.
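As a minimal sketch of this idea for the query above: maintain a counter table keyed by the dashboard's filter columns and update it on every write. The article_likes_yearly table and its columns are illustrative names, not from the book; the upsert uses MySQL's ON DUPLICATE KEY UPDATE syntax.

-- Pre-aggregated counters, keyed by the dashboard's filter columns
CREATE TABLE article_likes_yearly (
    user_id bigint,
    publish_year int,
    likes_count bigint NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, publish_year)
);

-- Increment the counter whenever an article gains a like (MySQL upsert)
INSERT INTO article_likes_yearly (user_id, publish_year, likes_count)
VALUES (1, 2022, 1)
ON DUPLICATE KEY UPDATE likes_count = likes_count + 1;

-- The dashboard query becomes a single primary-key lookup
SELECT likes_count
FROM article_likes_yearly
WHERE user_id = 1 AND publish_year = 2022;

The trade-off is a small amount of extra work on every write in exchange for reads that no longer touch the underlying rows at all.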

Indexes

Without indexes, your application would be slow, as every operation would have to scan the
whole table. Consequently, indexes are the most interesting topic for developers but also
the most complicated one. A lot of content has been written about database indexing that I
don't want to repeat. Therefore, I am only sharing more extraordinary approaches and
features you may not have seen before.

The indexing chapter will show you a lot of exceptional indexing approaches like uniqueness
constraints for soft-deleted tables, simple rules for multi-column indexing, ways to find and
delete unused indexes and much more.

Indexes On Functions And Expressions

SELECT * FROM users WHERE lower(email) = 'test@example.com';

-- MySQL
CREATE INDEX users_email_lower ON users ((lower(email)));

-- PostgreSQL
CREATE INDEX users_email_lower ON users (lower(email));

Most developers are puzzled that their index on a column is not used when the column is
transformed by a function or expression. A Google search turns up countless StackOverflow
answers stating that you can't use an index in these cases, but this information is wrong! You
can create specialized indexes on a function or expression that are used whenever the exact
same transformation is applied in your WHERE clause.
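To illustrate that exact-match requirement with the users_email_lower index created above (a sketch; the second query is an assumed counter-example, not from the book):

-- Can use users_email_lower: identical expression to the index definition
SELECT * FROM users WHERE lower(email) = 'test@example.com';

-- Cannot use it: upper(email) is a different expression, so the
-- database falls back to scanning the table
SELECT * FROM users WHERE upper(email) = 'TEST@EXAMPLE.COM';

If you need both transformations indexed, you must create a separate expression index for each.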

Notice: I have written a more extensive text about this topic on my database
focused website SqlForDevs.com: Function-Based Indexes

