This presentation covers all aspects of PostgreSQL administration, including installation, security, file structure, configuration, reporting, backup, daily maintenance, monitoring activity, disk space computations, and disaster recovery. It shows how to control host connectivity, configure the server, find the query being run by each session, and find the disk space used by each database.
Best Practices for Becoming an Exceptional Postgres DBA (EDB)
Drawing from our teams who support hundreds of Postgres instances and production database systems for customers worldwide, this presentation provides real best practices from the nation's top DBAs. Learn top-notch monitoring and maintenance practices, get resource planning advice that can help prevent, resolve, or eliminate common issues, learn top database tuning tricks for increasing system performance, and ultimately gain greater insight into how to improve your effectiveness as a DBA.
The paperback version is available on lulu.com: http://goo.gl/fraa8o
This is the first volume of the PostgreSQL database administration book. It covers the steps for installing, configuring, and administering PostgreSQL 9.3 on Debian Linux, and addresses both the logical and the physical aspects of PostgreSQL. Two chapters are dedicated to backup and restore.
The document discusses PostgreSQL query planning and tuning. It covers the key stages of query execution including syntax validation, query tree generation, plan estimation, and execution. It describes different plan nodes like sequential scans, index scans, joins, and sorts. It emphasizes using EXPLAIN to view and analyze the execution plan for a query, which can help identify performance issues and opportunities for optimization. EXPLAIN shows the estimated plan while EXPLAIN ANALYZE shows the actual plan after executing the query.
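To see that difference in practice, here is a minimal sketch (the database and table names are placeholders):

    # Estimated plan only - the query is not executed
    psql -d mydb -c "EXPLAIN SELECT * FROM orders WHERE customer_id = 42;"

    # Actual plan with real timings - the query IS executed,
    # so wrap data-modifying statements in a transaction you roll back
    psql -d mydb -c "BEGIN; EXPLAIN ANALYZE DELETE FROM orders WHERE id = 1; ROLLBACK;"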
This document discusses PostgreSQL statistics and how to use them effectively. It provides an overview of various PostgreSQL statistics sources like views, functions and third-party tools. It then demonstrates how to analyze specific statistics like those for databases, tables, indexes, replication and query activity to identify anomalies, optimize performance and troubleshoot issues.
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015 (PostgreSQL-Consulting)
This document discusses how PostgreSQL works with disks and provides recommendations for disk subsystem monitoring, hardware selection, and configuration tuning to optimize performance. It explains that PostgreSQL relies on disk I/O for reading pages, writing the write-ahead log (WAL), and checkpointing. It recommends monitoring disk utilization, IOPS, latency, and I/O wait. The document also provides tips for choosing hardware like SSDs or RAID configurations and configuring the operating system, file systems, and PostgreSQL to improve performance.
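For the monitoring side, a minimal sketch using the standard sysstat tools (intervals and columns of interest are examples):

    # Per-device utilization, IOPS (r/s, w/s) and latency (await), refreshed every second
    iostat -x 1

    # I/O wait as seen by the CPU - watch the 'wa' column
    vmstat 1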
PostgreSQL Replication High Availability Methods (Mydbops)
These slides illustrate the need for replication in PostgreSQL, why you need a replicated DB topology, terminology, replication nodes, and more.
The document provides an overview of PostgreSQL performance tuning. It discusses caching, query processing internals, and optimization of storage and memory usage. Specific topics covered include the PostgreSQL configuration parameters for tuning shared buffers, work memory, and free space map settings.
A look at what HA is and what PostgreSQL has to offer for building an open source HA solution. Covers various aspects in terms of Recovery Point Objective and Recovery Time Objective. Includes backup and restore, PITR (point in time recovery) and streaming replication concepts.
Creating a complete disaster recovery strategy (MariaDB plc)
Jens Bollmann, Principal Consultant at MariaDB, discusses all of the disaster recovery features and tools available in MariaDB, including MariaDB Flashback for point-in-time rollback, MariaDB Backup for incremental backup/restore, delayed replication and dedicated/tiered databases for backups.
What's the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Patroni (ScaleGrid.io)
Compare top PostgreSQL high availability frameworks - PostgreSQL Automatic Failover (PAF), Replication Manager (repmgr) and Patroni to improve your app uptime. ScaleGrid blog - https://scalegrid.io/blog/whats-the-best-postgresql-high-availability-framework-paf-vs-repmgr-vs-patroni-infographic/
MariaDB migration explored through real-world case studies
We keep striving to build modern IT environments and applications. As open-source databases have recently been adopted and proven in many workloads, the move from heavyweight commercial databases to lightweight open-source databases is gradually spreading, even to mission-critical workloads at large enterprises. This movement also aligns with the spread of cloud environments and the microservices concept.
Through real cases of migrating a commercial DB to MariaDB, you can examine the migration process and its benefits.
If you have vaguely assumed that migrating to MariaDB is difficult, these real-world cases show that migrating a heterogeneous database to MariaDB can be carried out without much difficulty.
Webinar video:
https://www.youtube.com/watch?v=xRsETZ5cKz8&t=52s
This document discusses PostgreSQL replication. It provides an overview of replication, including its history and features. Replication allows data to be copied from a primary database to one or more standby databases. This allows for high availability, load balancing, and read scaling. The document describes asynchronous and synchronous replication modes.
PostgreSQL is a very popular and feature-rich DBMS. At the same time, PostgreSQL has a set of annoying wicked problems which haven't been resolved in decades. Miraculously, with just a small patch to the PostgreSQL core extending this API, it appears possible to solve these wicked problems in a new engine implemented as an extension.
This document provides an agenda and background information for a presentation on PostgreSQL. The agenda includes topics such as practical use of PostgreSQL, features, replication, and how to get started. The background section discusses the history and development of PostgreSQL, including its origins from INGRES and POSTGRES projects. It also introduces the PostgreSQL Global Development Team.
This document provides an overview of five steps to improve PostgreSQL performance: 1) hardware optimization, 2) operating system and filesystem tuning, 3) configuration of postgresql.conf parameters, 4) application design considerations, and 5) query tuning. The document discusses various techniques for each step such as selecting appropriate hardware components, spreading database files across multiple disks or arrays, adjusting memory and disk configuration parameters, designing schemas and queries efficiently, and leveraging caching strategies.
This document provides an overview of the VACUUM command in PostgreSQL. It discusses what VACUUM does, the evolution of VACUUM features over time, visibility maps, freezing tuples, and transaction ID wraparound. It also covers the syntax of VACUUM, improvements to anti-wraparound VACUUM, and new features like progress reporting and the freeze map.
Wars of MySQL Cluster: InnoDB Cluster vs. Galera (Mydbops)
MySQL clustering over the InnoDB engine has grown a lot over the last decade. Galera began working with InnoDB early, and Group Replication came to the environment later; the features of both are now rich and robust. This presentation offers a technical comparison of the two.
Devrim Gunduz gives a presentation on Write-Ahead Logging (WAL) in PostgreSQL. WAL logs all transactions to files called write-ahead logs (WAL files) before changes are written to data files. This allows for crash recovery by replaying WAL files. WAL files are used for replication, backup, and point-in-time recovery (PITR) by replaying WAL files to restore the database to a previous state. Checkpoints write all dirty shared buffers to disk and update the pg_control file with the checkpoint location.
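As a concrete illustration, a minimal sketch of the settings PITR relies on (values are illustrative, and the archive_command shown is a deliberately simplistic example; production setups use safer commands):

    # postgresql.conf - retain enough WAL detail and archive it for PITR
    #   wal_level = replica
    #   archive_mode = on
    #   archive_command = 'cp %p /archive/%f'

    # Take a base backup that archived WAL can later be replayed on top of
    pg_basebackup -D /backups/base -X stream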
A Thorough Comparison of Delta Lake, Iceberg and Hudi (Databricks)
Recently, a set of modern table formats such as Delta Lake, Hudi, and Iceberg have sprung up. Along with the Hive Metastore, these table formats are trying to solve long-standing problems in traditional data lakes with declared features like ACID transactions, schema evolution, upsert, time travel, incremental consumption, etc.
EXPLAIN ANALYZE is a new query profiling tool first released in MySQL 8.0.18. This presentation covers how this new feature works, both on the surface and on the inside, and how you can use it to better understand your queries, to improve them and make them go faster.
This presentation is for everyone who has ever had to understand why a query is executed slower than anticipated, and for everyone who wants to learn more about query plans and query execution in MySQL.
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022 (Jesmar Cannao')
ProxySQL is a MySQL protocol proxy that provides high availability, scalability, and security for MySQL database systems. It allows clients to connect to ProxySQL, which then evaluates requests and performs actions like routing queries to backend databases, caching reads, connection pooling, and load balancing across servers. ProxySQL's main features include query routing, firewalling, real-time statistics, monitoring, and management of large numbers of backend servers. The presentation discusses using ProxySQL's query routing and rewriting capabilities to mask sensitive data when replicating databases for development environments. It also covers using the REST API and Prometheus integration to configure ProxySQL and monitor metrics without direct SQL access.
Redefining tables online without surprises (Nelson Calero)
The Oracle database includes several features for moving data online, i.e., without preventing users from accessing it while it is being moved (DML operations are not blocked).
One of those features is changing a table definition using the DBMS_REDEFINITION package.
While moving a table has been an online operation since version 12.2, redefinition is still needed for some changes; it is also needed in older versions.
This session presents best practices based on experience using it with big tablespaces, with examples covering all the steps needed to use DBMS_REDEFINITION in different scenarios, including the problems you can encounter, how to resolve them, and how the process differs between versions 11.2 and 12.
In 40 minutes the audience will learn a variety of ways to make a PostgreSQL database suddenly run out of memory on a box with half a terabyte of RAM.
Developers' and DBAs' best practices for preventing this will also be discussed, as well as a bit of Postgres and Linux memory management internals.
The presentation covers improvements made to the redo logs in MySQL 8.0 and their impact on MySQL performance and operations, covering MySQL versions up to 8.0.30.
This document discusses common mistakes made when implementing Oracle Exadata systems. It describes improperly sized SGAs which can hurt performance on data warehouses. It also discusses issues like not using huge pages, over or under use of indexing, too much parallelization, selecting the wrong disk types, failing to patch systems, and not implementing tools like Automatic Service Request and exachk. The document provides guidance on optimizing these areas to get the best performance from Exadata.
PGConf.ASIA 2019 Bali - Tune Your Linux Box, Not Just PostgreSQL - Ibrar Ahmed (Equnix Business Solutions)
This document discusses tuning Linux and PostgreSQL for performance. It recommends:
- Tuning Linux kernel parameters like huge pages, swappiness, and overcommit memory. Huge pages can improve TLB performance.
- Tuning PostgreSQL parameters like shared_buffers, work_mem, and checkpoint_timeout. Shared_buffers stores the most frequently accessed data.
- Other tips include choosing proper hardware, OS, and database based on workload. Tuning queries and applications can also boost performance.
In-memory Caching in HDFS: Lower Latency, Same Great Taste (DataWorks Summit)
This document discusses in-memory caching in HDFS to improve query latency. The implementation caches important datasets in the DataNode memory and allows clients to directly access cached blocks via zero-copy reads without checksum verification. Evaluation shows the zero-copy reads approach provides significant performance gains over short-circuit and TCP reads for both microbenchmarks and Impala queries, with speedups of up to 7x when the working set fits in memory. MapReduce jobs see more modest gains as they are often not I/O bound.
Tuning Linux for your database - FLOSSUK 2016 (Colin Charles)
Some best practices about tuning Linux for your database workloads. The focus is not just on MySQL or MariaDB Server but also on understanding the OS from hardware/cloud, I/O, filesystems, memory, CPU, network, and resources.
In-memory Data Management Trends & Techniques (Hazelcast)
- Hardware trends like increasing cores/CPU and RAM sizes enable in-memory data management techniques. Commodity servers can now support terabytes of memory.
- Different levels of data storage have vastly different access times, from registers (<1ns) to disk (4-7ms). Caching data in faster levels of storage improves performance.
- Techniques to exploit data locality, cache hierarchies, tiered storage, parallelism and in-situ processing can help overcome hardware limitations and achieve fast, real-time processing. Emerging in-memory databases use these techniques to enable new types of operational analytics.
Responding rapidly when you have 100+ GB data sets in Java (Peter Lawrey)
One way to speed up your application is to bring more of your data into memory. But how do you handle hundreds of GB of data in a JVM, and what tools can help you?
Mentions: Speedment, Azul, Terracotta, Hazelcast and Chronicle.
This document discusses Linux huge pages, including:
- What huge pages are and how they can reduce memory management overhead by allocating larger blocks of memory
- How to configure huge pages on Linux, including installing required packages, mounting the huge page filesystem, and setting kernel parameters
- When huge pages should be configured, such as for data-intensive or latency-sensitive applications like databases, but that testing is required due to disadvantages like reduced swappability
Scaling with sync_replication using Galera and EC2 (Marco Tusa)
Challenging architecture design and a proof of concept based on a real case study using a synchronous replication solution.
A customer asked me to investigate and design a MySQL architecture to support their application serving shops around the globe, scaling out and in according to sales seasons.
Accelerating HBase with NVMe and Bucket Cache (Nicolas Poggi)
The Non-Volatile Memory Express (NVMe) standard promises an order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM in TB/$. This talk evaluates the use cases and benefits of NVMe drives for use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
Current HDFS Namenode stores all of its metadata in RAM. This has allowed Hadoop clusters to scale to 100K concurrent tasks. However, the memory limits the total number of files that a single NameNode can store. While Federation allows one to create multiple volumes with additional Namenodes, there is a need to scale a single namespace and also to store multiple namespaces in a single Namenode.
This talk describes a project that removes the space limits while maintaining similar performance by caching only the working set or hot metadata in Namenode memory. We believe this approach will be very effective because the subset of files that is frequently accessed is much smaller than the full set of files stored in HDFS.
In this talk we will describe our overall approach and give details of our implementation along with some early performance numbers.
Speaker: Lin Xiao, PhD student at Carnegie Mellon University, intern at Hortonworks
OSDC 2016 - Tuning Linux for your Database by Colin Charles (NETWAYS)
Many operations folk know that performance varies depending on which of the many Linux filesystems, like EXT4 or XFS, you use. They also know of the available schedulers, they see the OOM killer coming, and more. However, appropriate configuration is necessary when you're running your databases at scale.
Learn best practices for Linux performance tuning for MariaDB/MySQL (where MyISAM uses the operating system cache, and InnoDB maintains its own aggressive buffer pool), as well as PostgreSQL and MongoDB (more dependent on the operating system). Topics that will be covered include: filesystems, swap and memory management, I/O scheduler settings, using and understanding the tools available (like iostat/vmstat/etc), practical kernel configuration, profiling your database, and using RAID and LVM.
There is a focus on bare metal as well as on configuring your cloud instances.
Learn from practical examples from the trenches.
Accelerating HBase with NVMe and bucket cache (David Grier)
This set of slides describes some initial experiments we designed to discover performance improvements in Hadoop technologies using NVMe technology.
This document discusses strategies for maintaining very large MySQL tables that have grown too big. It recommends creating a new database server with different configuration settings like InnoDB file-per-table to reduce size, using tools like MySQLTuner and tuning-primer to analyze settings, archiving old historical data with pt-archiver to reduce table sizes, and considering partitioning or changing the MySQL version. Monitoring tools like InnoDB status, global status, Cacti, and innotop are recommended to analyze server performance.
The document discusses best practices for running MySQL on Linux, covering choices for Linux distributions, hardware recommendations including using solid state drives, OS configuration such as tuning the filesystem and IO scheduler, and MySQL installation and configuration options. It provides guidance on topics like virtualization, networking, and MySQL variants to help ensure successful and high performance deployment of MySQL on Linux.
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration between Presto & Alluxio (Alluxio, Inc.)
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration between Presto & Alluxio
Ke Wang, Software Engineer (Facebook)
Bin Fan, Founding Engineer, VP Of Open Source (Alluxio)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
SQream DB - Bigger Data On GPUs: Approaches, Challenges, Successes (Arnon Shimoni)
This talk will present SQream’s journey to building an analytics data warehouse powered by GPUs. SQream DB is an SQL data warehouse designed for larger than main-memory datasets (up to petabytes). It’s an on-disk database that combines novel ideas and algorithms to rapidly analyze trillions of rows with the help of high-throughput GPUs. We will explore some of SQream’s ideas and approaches to developing its analytics database – from simple prototype and tech demos, to a fully functional data warehouse product containing the most important features for enterprise deployment. We will also describe the challenges of working with exotic hardware like GPUs, and what choices had to be made in order to combine the CPU and GPU capabilities to achieve industry-leading performance – complete with real world use case comparisons.
As part of this discussion, we will also share some of the real issues that were discovered, and the engineering decisions that led to the creation of SQream DB’s high-speed columnar storage engine, designed specifically to take advantage of streaming architectures like GPUs.
The document summarizes a presentation on optimizing Linux, Windows, and Firebird for heavy workloads. It describes two customer implementations using Firebird - a medical company with 17 departments and over 700 daily users, and a repair services company with over 500 daily users. It discusses tuning the operating system, hardware, CPU, RAM, I/O, network, and Firebird configuration to improve performance under heavy loads. Specific recommendations are provided for Linux and Windows configuration.
MariaDB Server Performance Tuning & Optimization (MariaDB plc)
This document discusses various techniques for optimizing MariaDB server performance, including:
- Tuning configuration settings like the buffer pool size, query cache size, and thread pool settings.
- Monitoring server metrics like CPU usage, memory usage, disk I/O, and MariaDB-specific metrics.
- Analyzing slow queries with the slow query log and EXPLAIN statements to identify optimization opportunities like adding indexes.
Taking Splunk to the Next Level - Architecture Breakout Session (Splunk)
This document provides an overview and agenda for taking a Splunk deployment to the next level by addressing scaling needs and high availability requirements. It discusses growing use cases and data volumes, making Splunk mission critical through clustering, and supporting global deployments. The agenda covers scaling strategies like indexer clustering, search head clustering, and hybrid cloud deployments. It also promotes justifying increased spending by mapping dependencies and costs of failures across an organization's systems.
Optimizing Elasticsearch on Google Compute Engine (Bhuvaneshwaran R)
If you are running Elasticsearch clusters on GCE, you need to look at capacity planning as well as OS-level and Elasticsearch-level optimization. I presented this at GDG Delhi on Feb 22, 2020.
Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installation (PostgreSQL-Consulting)
Even an experienced PostgreSQL DBA cannot always say that upgrading between major versions of Postgres is an easy task, especially if there are special requirements such as downtime limitations, or if something goes wrong. For less experienced DBAs, anything more complex than dump/restore can be frustrating.
In this talk I will describe why we need a special procedure to upgrade between major versions, how that can be achieved, and what sort of problems can occur. I will review all possible ways to upgrade your cluster, from classical pg_upgrade to old-school Slony or modern methods like logical replication. For each approach, I will give a brief explanation of how it works (limited by the scope of this talk, of course), examples of how to perform the upgrade, and some advice on potentially problematic steps. Besides that, I will touch upon topics such as the integration of upgrade tools and procedures with other software: connection brokers, operating system package managers, automation tools, etc. This talk would not be complete if I did not cover cases when something goes wrong and how to deal with them.
Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2017) (PostgreSQL-Consulting)
Input-output performance problems are on every day agenda for DBAs since the databases exist. Volume of data grows rapidly and you need to get your data fast from the disk and moreover - fast to the disk. For most databases there is a more or less easy to find checklist of recommended Linux settings to maximize IO throughput. In most cases that checklist is good enough. But it is always better to understand how it works, especially if you run into some corner-cases. This talk is about how IO in Linux works, how database pages travel from disk level to database own shared memory and back and what kind of mechanisms exist to control this. We will discuss memory structures, swap and page-out daemons, filesystems, schedulers and IO methods. Some fundamental differences in IO approaches between PostgreSQL, Oracle and MySQL will be covered
PostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky (PostgreSQL-Consulting)
This talk is prepared as a bunch of slides, where each slide describes a really bad way people can screw up their PostgreSQL database and provides a weight - how frequently I saw that kind of problem. Right before the talk I will reshuffle the deck to draw twenty random slides and explain you why such practices are bad and how to avoid running into them.
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky (PostgreSQL-Consulting)
This talk is prepared as a bunch of slides, where each slide describes a really bad way people can screw up their PostgreSQL database and provides a weight - how frequently I saw that kind of problem. Right before the talk I will reshuffle the deck to draw ten random slides and explain you why such practices are bad and how to avoid running into them.
Linux internals for Database administrators at Linux Piter 2016 (PostgreSQL-Consulting)
Input-output performance problems are on every day agenda for DBAs since the databases exist. Volume of data grows rapidly and you need to get your data fast from the disk and moreover - fast to the disk. For most databases there is a more or less easy to find checklist of recommended Linux settings to maximize IO throughput. In most cases that checklist is good enough. But it is always better to understand how it works, especially if you run into some corner-cases. This talk is about how IO in Linux works, how database pages travel from disk level to database own shared memory and back and what kind of mechanisms exist to control this. We will discuss memory structures, swap and page-out daemons, filesystems, schedulers and IO methods. Some fundamental differences in IO approaches between PostgreSQL, Oracle and MySQL will be covered.
10 things an Oracle DBA should care about when moving to PostgreSQL (PostgreSQL-Consulting)
PostgreSQL can handle many of the same workloads as Oracle and provides alternatives to common Oracle features and practices. Some key differences for DBAs moving from Oracle to PostgreSQL include: using shared_buffers instead of the SGA, with a recommended 25-75% of RAM; using pgbouncer instead of a listener; performing backups with pg_basebackup and WAL archiving instead of RMAN; managing undo data in datafiles instead of undo segments; using streaming replication for high availability instead of RAC; and needing to tune autovacuum instead of manually managing redo and undo logs. PostgreSQL is very capable, but may not be suited for some extremely high update workloads of 200K+ transactions per second on a single server.
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna (PostgreSQL-Consulting)
Autovacuum is PostgreSQL's automatic vacuum process that helps manage bloat and garbage collection. It is critical for performance but is often improperly configured by default settings. Autovacuum works table-by-table to remove expired rows in small portions to avoid long blocking operations. Its settings like scale factors, thresholds, and costs can be tuned more aggressively for OLTP workloads to better control bloat and avoid long autovacuum operations.
PostgreSQL autovacuum is important for garbage collection and preventing fragmentation. It works table-by-table to remove old tuples and collect statistics. While autovacuum settings are often left as defaults, it's best to configure it aggressively for OLTP workloads so it can work quickly in small portions. Autovacuum must be properly configured for replication as well to avoid conflicts. Tools exist to help remove existing bloat without needing to dump/restore the entire database.
How PostgreSQL interacts with disks, which performance problems arise in the process, and how to solve them by choosing suitable hardware and by tuning the operating system and PostgreSQL settings.
Lessons learned when managing MySQL in the Cloud (Igor Donchovski)
Managing MySQL in the cloud introduces a new set of challenges compared to traditional on-premises setups, from ensuring optimal performance to handling unexpected outages. We delve into topics such as performance tuning, cost-effective scalability, and maintaining high availability. We also explore the importance of monitoring, automation, and best practices for disaster recovery to minimize downtime.
Linux tuning to improve PostgreSQL performance
1. Linux tuning to improve PostgreSQL performance
Ilya Kosmodemiansky
ik@postgresql-consulting.com
2. The modern Linux kernel
• About 1000 sysctl parameters (plus non-sysctl settings, such as mount options)
• It is not possible to benefit from the modern kernel's advantages without wise tuning
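A quick way to get a feel for the scale (a minimal sketch; the exact count varies by kernel version and loaded modules, and vm.swappiness=1 is just an example value):

    # Count the sysctl parameters exposed by the running kernel
    sysctl -a 2>/dev/null | wc -l

    # Inspect one parameter and change it at runtime (root required);
    # persist changes in /etc/sysctl.conf or /etc/sysctl.d/
    sysctl vm.swappiness
    sysctl -w vm.swappiness=1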
4. PostgreSQL specifics
• Hungry for resources (like any other database)
• Tuning a single target can have a very small effect
• We need to maximize throughput
6. How to make pages travel faster from disk to memory
• More effective work with memory
• More effective flushing pages to disk
• Proper hardware, of course
9. NUMA
What goes on
• Non-Uniform Memory Access
• CPUs have their own memory: CPU + memory nodes connected via a NUMA interconnect
• A CPU uses its own memory first, then accesses the remaining memory over the interconnect
• If node interleaving is disabled, each CPU tries to use its local memory (for the page cache, for example ;-))
10. NUMA
Which NUMA configuration is better for PostgreSQL
• Enable memory interleaving in BIOS
• numa=off, or vm.zone_reclaim_mode = 0
• It may be even better to use numactl --interleave=all /etc/init.d/postgresql start
• kernel.numa_balancing = 0
Blog post by Robert Haas:
http://rhaas.blogspot.co.at/2014/06/linux-disables-vmzonereclaimmode-by.html
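A minimal sketch of applying these settings (the init script path follows the slide; on systemd distributions, adapt the service invocation):

    # Disable zone reclaim and automatic NUMA balancing
    # (persist both in /etc/sysctl.conf)
    sysctl -w vm.zone_reclaim_mode=0
    sysctl -w kernel.numa_balancing=0

    # Inspect the NUMA layout first
    numactl --hardware

    # Optionally start PostgreSQL with memory interleaved across all nodes
    numactl --interleave=all /etc/init.d/postgresql start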
11. Huge pages
Symptoms that something is going wrong
• You have a lot of RAM and your shared_buffers setting is 32GB/64GB or more
• That means you definitely have overhead if you are not using huge pages
12. Huge pages
What goes on
• By default the OS allocates memory in 4kB chunks
• The OS translates physical addresses into virtual addresses and caches the result in the Translation Lookaside Buffer (TLB)
• 1GB / 4kB = 262144 pages per gigabyte: huge TLB overhead and cache misses
• It is better to allocate memory in larger chunks
13. Huge pages
How can PostgreSQL benefit from huge pages?
• Enable huge pages in the kernel
• vm.nr_hugepages = 3170 via sysctl
• Before 9.2: the libhugetlbfs library
• 9.3: no way
• 9.4+: huge_pages = try|on|off (postgresql.conf)
• Works on Linux
• Disable Transparent Huge Pages: PostgreSQL cannot benefit from them
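A minimal sketch of both sides of the configuration (the page count follows the slide's example and assumes 2MB huge pages; size it from your shared_buffers plus some slack):

    # Reserve huge pages in the kernel (persist in /etc/sysctl.conf)
    sysctl -w vm.nr_hugepages=3170

    # Verify the reservation
    grep Huge /proc/meminfo

    # postgresql.conf (9.4+)
    #   huge_pages = try    # falls back to normal pages if allocation fails

    # Disable Transparent Huge Pages (path may vary by distribution)
    echo never > /sys/kernel/mm/transparent_hugepage/enabled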
19. More effective flushing pages to disk
What goes on
• By default vm.dirty_ratio = 20 and vm.dirty_background_ratio = 10
• Nothing happens until the kernel buffer is 10% full of dirty pages
• From 10% to 20%: background flushing
• From 20% on, IO effectively stops until pdflush/flushd/kdflush finishes its job
• This is almost crazy if your shared_buffers setting is 32GB/64GB or more, with any cache on a RAID controller or an SSD
20. More effective flushing pages to disk
What is better for PostgreSQL?
• vm.dirty_background_bytes = 67108864 and vm.dirty_bytes = 536870912 (for a RAID controller with 512MB of cache on board) look more reasonable
• Hardware settings and the checkpoint settings in postgresql.conf must be appropriate
• See my talk on PostgreSQL disk performance for details (https://www.youtube.com/watch?v=Lbx-JVcGIFo)
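A minimal sketch of the byte-based thresholds suggested above; note that setting the *_bytes variants automatically zeroes their *_ratio counterparts:

    # Start background writeback at 64MB of dirty pages,
    # block writers at 512MB (sized for a 512MB RAID controller cache)
    sysctl -w vm.dirty_background_bytes=67108864
    sysctl -w vm.dirty_bytes=536870912

    # Watch dirty-page behaviour while the database is under load
    grep -E 'Dirty|Writeback' /proc/meminfo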
22. Scheduler tuning
• sysctl kernel.sched_migration_cost_ns is supposed to be reasonably high
• sysctl kernel.sched_autogroup_enabled = 0
• A good explanation: http://www.postgresql.org/message-id/50E4AAB1.9040902@optionshouse.com
• You need a relatively new kernel
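A minimal sketch using the values from the benchmark on the next slide (treat them as starting points, not universal constants; on recent kernels these knobs may have moved to debugfs or been removed):

    # Make inter-core task migration more "expensive" so backends move less
    sysctl -w kernel.sched_migration_cost_ns=500000

    # Disable autogrouping, which is tuned for desktop interactivity
    sysctl -w kernel.sched_autogroup_enabled=0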
23. Example
$ pgbench -S -c 8 -T 30 -U postgres pgbench
transaction type: SELECT only
scaling factor: 30
duration: 30 s
number of clients: 8
number of threads: 1
sched_migration_cost_ns = 50000, sched_autogroup_enabled = 1 - tps: 22621, 22692, 22502
sched_migration_cost_ns = 500000, sched_autogroup_enabled = 0 - tps: 23689, 23930, 23657
Tests by Alexey Lesovsky
24. Power saving policy
• acpi_cpufreq and intel_pstate drivers
• scaling_governor: performance, ondemand, conservative, powersave, userspace
• acpi_cpufreq + performance can be dramatically faster than acpi_cpufreq + ondemand
• intel_pstate + powersave
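A minimal sketch for checking and switching the governor (cpupower ships with the kernel tools package; the sysfs path works without it):

    # Show the current governor for each core
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

    # Switch all cores to the performance governor
    cpupower frequency-set -g performance

    # Or directly via sysfs
    for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$g"
    done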
25. Thanks
to my colleagues Alexey Lesovsky and Max Boguk for a lot of research on this topic