Direct losses from one minute of downtime: $5-10 thousand. Reputation is priceless.
In this talk, we will look at the architectural strategies needed to build high-load fintech solutions. We will focus on using queues and streaming to process and manage large volumes of data in real time while minimizing latency.
We will pay special attention to the architectural patterns used in fintech system design, in particular microservices and event-driven architecture, which support the scalability, fault tolerance, and consistency of the whole system.
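As a rough illustration of the queue-based decoupling mentioned above, here is a minimal Python sketch. It is not taken from the talk: the function name `run_pipeline`, the payment dictionaries, and the `"settled"` status are hypothetical, and an in-process `queue.Queue` stands in for a real message broker. The bounded queue makes the producer block when consumers fall behind, which is one simple form of backpressure.

```python
import queue
import threading


def run_pipeline(payments, workers=2, capacity=4):
    """Push payments through a bounded queue to a pool of consumer threads."""
    inbox = queue.Queue(maxsize=capacity)  # put() blocks when the queue is full
    processed = []
    lock = threading.Lock()

    def consumer():
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work for this consumer
                return
            # Hypothetical "processing" step: mark the payment as settled.
            with lock:
                processed.append({"id": item["id"], "status": "settled"})

    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for t in threads:
        t.start()
    for payment in payments:
        inbox.put(payment)  # producer slows down if consumers lag
    for _ in threads:
        inbox.put(None)  # one sentinel per consumer
    for t in threads:
        t.join()
    return processed


result = run_pipeline([{"id": i} for i in range(10)])
```

The order of `result` depends on thread scheduling, so downstream code should not rely on it unless ordering is enforced separately, for example with a single consumer per key.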
The computer science behind a modern distributed data store | J On The Beach
What we see in the modern data store world is a race between different approaches to achieve distributed and resilient storage of data. Every application needs a stateful layer which holds the data. At least three components are necessary, and they are anything but trivial to combine; doing so with acceptable performance is, of course, even more challenging.
Over the past years there has been significant progress in both the science and practical implementations of such data stores. In his talk Max Neunhoeffer will introduce the audience to some of the needed ingredients, address the difficulties of their interplay and show four modern approaches of distributed open-source data stores (ArangoDB, Cassandra, Cockroach and RethinkDB).
The Computer Science Behind a modern Distributed Database | ArangoDB Database
What we see in the modern data store world is a race between different approaches to achieve a distributed and resilient storage of data. Every application needs a stateful layer which holds the data. There are several different necessary components which are anything but trivial to combine, and, of course, even more challenging when attempting to optimize for performance. Over the past years there has been significant progress in both the science and practical implementations of such data stores. In this talk Dan Larkin-York will introduce the audience to some of the challenges, address the difficulties of their interplay, and cover key approaches taken by some of the industry’s leaders (ArangoDB, Cassandra, CockroachDB, MarkLogic, and more).
Docker and containers allow for much higher density and efficiency compared to virtual machines. Containers start in milliseconds versus minutes for VMs, and allow hundreds of containers to run on a single physical machine versus 16 VMs. This leads to significant cost savings through reduced infrastructure needs as well as increased developer productivity from faster deployment and testing. It also enables rapid experimentation to drive more innovation and revenue growth through features. StackEngine helps manage containers at scale in production environments.
OSDC 2018 | The Computer science behind a modern distributed data store by Ma... | NETWAYS
What we see in the modern data store world is a race between different approaches to achieve distributed and resilient storage of data. Most applications need a stateful layer which holds the data. At least three ingredients are necessary, and they are anything but trivial to combine; doing so with acceptable performance is, of course, even more challenging. Over the past years there has been significant progress in both the science and practical implementations of such data stores. In his talk Max Neunhoeffer will introduce the audience to some of the needed ingredients, address the difficulties of their interplay and show four modern approaches of distributed open-source data stores.
Topics are:
– Challenges in developing a distributed, resilient data store
– Consensus, distributed transactions, distributed query optimization and execution
– The inner workings of ArangoDB, Cassandra, Cockroach and RethinkDB
The talk will touch on complex and difficult computer science, but will at the same time be accessible to and enjoyable for a wide range of developers.
Test driven infrastructure development (2 - puppetconf 2013 edition) | Tomas Doran
The document discusses test driven infrastructure development. It describes issues with the current state where infrastructure changes are not repeatable and difficult to test. The speaker proposes modeling infrastructure as code where environments are defined programmatically and configuration is generated externally rather than defined directly in puppet code. This allows for entire environments to be provisioned on demand and tested in an automated and repeatable way. Key benefits include high availability, ability to test all infrastructure changes, fully repeatable environments, high confidence in changes, and continuous integration/deployment of infrastructure.
Reactive Systems by Dave Farley at #AgileIndia2019 | Agile India
21st century problems cannot be solved with 20th century software architectures. So why do so many projects start from the assumption of a simplistic, monolithic, three-layer architecture sitting on top of an RDBMS? Hardware has progressed. It has changed many of the assumptions that such architectures were built upon. Modern systems are distributed and deal with massive throughput of data and transactions. Users expect 24/7 service.
The Reactive Manifesto describes what it takes to build systems that meet these demands. Such systems are Responsive, Resilient, Elastic and Message Driven. What does this mean in terms of software architecture and design? This presentation will introduce these ideas and describe how systems built on these principles work.
More details:
https://confengine.com/agile-india-2019/proposal/8536/reactive-systems
Conference link: https://2019.agileindia.org
In 1971, David Parnas wrote the great paper "On the Criteria To Be Used in Decomposing Systems into Modules," and yet the problem of breaking down big projects into small parts that work well together remains a struggle in the industry. The ability to decompose a problem space and, in turn, compose a solution is essential to our work.
Things have gotten worse since 1971. With microservices, big data, and streaming systems, we're all going to be distributed systems engineers sooner or later. In distributed systems, effective decomposition has an even greater impact on the reliability, performance, and availability of our systems as it determines the frequency and weight of communication in the system.
This talk speaks to the essential considerations for defining and evaluating boundaries and behaviors in large-scale distributed systems. It will touch on topics such as bulkhead design and architectural evolution.
The Ember.js Framework - Everything You Need To Know | All Things Open
All Things Open 2014 - Day 2
Thursday, October 23rd, 2014
Yehuda Katz
Founder of Tilde
Front Dev 1
The Ember.js Framework - Everything You Need To Know
This document discusses challenges faced in implementing Presto, an open source distributed SQL query engine, for targeted audience delivery at TiVo. It describes choosing appropriate instance types for Presto worker nodes based on memory needs. It also addresses scaling the Presto cluster elastically to handle query concurrency and maturity issues with the Presto software. The document provides insights on testing Presto using Docker containers and connecting to mocked tables.
Nagios Conference 2014 - David Josephsen - Alert on What You Draw | Nagios
This document discusses various topics related to cloud computing including:
- The challenges of maintaining virtual, multi-tenant systems running on massive infrastructure and the need for compulsory maintenance.
- How cloud systems are designed to be resilient by running services atop unreliable systems and allowing them to be scaled and changed dynamically.
- The importance of monitoring metrics from within services to understand latency, queues, workers and ensure any engineer can access performance data and create new metrics.
- Examples of open source tools like Heka and Riemann that can be used to collect and analyze metrics.
Matt Franklin - Apache Software (Geekfest) | W2O Group
The document discusses the potential benefits of container technologies like Docker. It notes that containers offer significantly higher density than virtual machines by avoiding hypervisor overhead. This density improvement can lead to major cost reductions by reducing infrastructure needs. Containers also improve developer efficiency by making development environments portable and disposable. This allows more rapid experimentation and innovation, potentially translating to increased revenue. Technologies like Amazon Lambda take the on-demand aspects of containers even further by abstracting compute resources. The document promotes StackEngine as a solution for managing containers at scale in production environments.
Go Reactive: Building Responsive, Resilient, Elastic & Message-Driven Systems | Jonas Bonér
Abstract:
The demands and expectations for applications have changed dramatically in recent years. Applications today are deployed on a wide range of infrastructure, from mobile devices up to thousands of nodes running in the cloud, all powered by multi-core processors. They need to be rich and collaborative, have a real-time feel with millisecond response time and should never stop running. Additionally, modern applications are a mashup of external services that need to be consumed and composed to provide the features at hand.
We are seeing a new type of application emerging to address these new challenges: Reactive Applications. In this talk we will discuss four key traits of Reactive (Responsive, Resilient, Elastic and Message-Driven): how they impact application design, how they interact, their supporting technologies and techniques, and how to think when designing and building them, all to make it easier for you and your team to Go Reactive.
Intended Audience:
Programmers, architects, CIO/CTOs and everyone with a desire to challenge the status quo and expand their horizons on how to tackle the current and future challenges in the computing industry.
An idea for a log and backup policy that reduces the possibility of and potential damage from insider threats. Presented at Information Warfare Summit 2013.
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx | Milen Dyankov
This slide deck will be removed from here in the future. It has been moved to: https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
Applications of every size, business domain and criticality suffer from a huge set of issues, be it boring enterprise software, a “Highly-Loaded” social network or a cozy startup. In this talk Eduards will cover the Software Architecture issues that he finds most prevalent nowadays and what you can do about them. Think big!
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi... | Databricks
PyWren is a serverless framework that allows data scientists to easily scale Python code across AWS Lambda. It uses Lambda to parallelize work by mapping Python functions to a large dataset. The functions and data are serialized and uploaded to S3, which then triggers Lambda. Results are stored in S3. This allows data science problems that take minutes or hours to be solved to complete in seconds by parallelizing across thousands of Lambda instances. PyWren aims to abstract away the complexity of serverless infrastructure so data scientists can focus on their code instead of operations.
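The map-style parallelism described above can be sketched locally. The snippet below deliberately does not use PyWren or AWS Lambda: it is a stand-in built on Python's standard `concurrent.futures` thread pool, and the names `parallel_map` and `square` are hypothetical. It only illustrates the shape of the idea, applying one function to many inputs concurrently and collecting the results in input order.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, max_workers=8):
    """Apply func to every item concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of the inputs in its results.
        return list(pool.map(func, items))


def square(x):
    """Hypothetical per-item workload."""
    return x * x


results = parallel_map(square, range(5))
```

In a serverless setting each call would run in its own remote function instance rather than a local thread, so serialization and network overhead would also have to be considered.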
Scaling a Rails Application from the Bottom Up | Abhishek Singh
The document outlines Jason Hoffman's presentation on scaling a Rails application from the bottom up. It discusses fundamental limits like money, time, and hardware resources. It provides examples of logical server roles needed for a scalable architecture including provisioning, monitoring, logging etc. It also discusses hardware considerations like power, space, and networking. The presentation emphasizes standardization, virtualization, and keeping infrastructure costs below 10% of revenue.
In this presentation, Gil Tene, CTO of Azul Systems, discusses the benefits of using in-memory, in-process, and in-heap index representations with heap sizes that can now make full use of current commodity server scale. He also compares throughput and latency characteristics of the various architectural alternatives, using some of the configurable choices available in Apache Lucene™ 4.0 for specific examples.
The document discusses Java 8 streams and stream performance. It provides background on streams and why they were introduced in Java 8. It discusses sequential and parallel streams, how to visualize them, and practical benefits. It covers microbenchmarking and a case study comparing a sequential grep implementation to a parallelized version. Key points are that streams can improve readability but performance must be tested, parallelism helps if the workload is large enough to outweigh overhead, and stream sources need to be splittable for parallelism.
Parallel Ruby: Managing the Memory Monster | Kevin Miller
The document discusses strategies for managing memory usage in Ruby applications. It begins by describing Flexport's migration from many small Ruby processes to fewer, larger processes with more threads. This led to performance issues due to Ruby's memory management. The document then provides details on Ruby's memory allocation and discusses how memory is fragmented across threads. It evaluates three options for improving memory usage: using the default allocator, reducing OS heap counts, or using the jemalloc allocator. It recommends using jemalloc if possible due to significant memory reductions and potential speed increases seen in other Ruby applications.
This document discusses cloud native transformation using Sadeem Cloud Platform Services. It addresses the challenges of running applications as single Docker containers, which often fail and then require manual intervention to restart virtual machines and containers. The proposed solution is to orchestrate containers with Kubernetes to automatically restart failed containers and scale container instances as needed to avoid overload. An example is provided of running an OWASP hackathon capture-the-flag competition on a Heroku PaaS deployment configured with a single YAML file for autoscaling.
This document summarizes some of the challenges of hosting MongoDB in the cloud on platforms like Amazon AWS, Rackspace, and Terremark. It discusses issues like choosing optimal instance types and disk configurations, handling growth and expansion limitations, implementing effective disaster recovery strategies, and the benefits of using a managed service like Mongo Machine to avoid these headaches. Key points covered include the tradeoffs of different instance properties, using LVM and RAID configurations to allow for resizing while maintaining performance, and practicing recovery scenarios to prepare for failures.
Expecto Performa! The Magic and Reality of Performance Tuning | Atlassian
In the enterprise there are rarely simple solutions to highly nuanced problems that satisfy all needs. Several customers might each ask "How do I make Jira/Confluence faster?" and each require a different answer. Using this example, this talk will pick apart the inputs, outputs, concerns, and realities of answering a short question with a long answer. We'll then discuss real-world examples from our own internal instances, to give you a taste of the process we've gone through to solve our own performance problems, and to show why there is no simple playbook; "it depends" on a lot! The key takeaways are:
* The importance of having a shared definition of performance
* The importance of having agreed-upon priorities, including what isn't important
* The importance of measuring (allthethings) and understanding them
* The thing you think is the problem might not be the problem, and vice versa.
* The real world and the ideal world tend to look nothing alike!
The document introduces Akka, an open-source toolkit for building distributed, concurrent applications on the JVM. It provides a programming model called the actor model that makes it easier to build scalable and fault-tolerant systems. Actors process messages asynchronously and avoid shared state, providing a simpler approach to concurrency than traditional threads and locks. Akka allows actors to be distributed across a network, enabling applications to scale out elastically.
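To make the actor idea above concrete, here is a toy sketch in Python. It is not Akka and does not mirror Akka's API: the class `CounterActor`, its `tell` method, and the message names `"increment"`, `"get"`, and `"stop"` are all hypothetical. The point it illustrates is that the actor's state is touched only by the actor's own thread, and all other threads communicate with it by sending messages to its mailbox.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread that drains it."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only ever modified by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def tell(self, message, reply_to=None):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put((message, reply_to))

    def stop(self):
        self._mailbox.put(("stop", None))
        self._thread.join()

    def _run(self):
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                return
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)


def count_with_actor(n_senders=4, per_sender=100):
    """Have several threads send increments, then ask the actor for the total."""
    actor = CounterActor()

    def sender():
        for _ in range(per_sender):
            actor.tell("increment")

    senders = [threading.Thread(target=sender) for _ in range(n_senders)]
    for s in senders:
        s.start()
    for s in senders:
        s.join()
    reply = queue.Queue()
    actor.tell("get", reply)  # queued after all increments, so it sees them all
    total = reply.get()
    actor.stop()
    return total
```

No lock is needed around `_count` because messages are processed one at a time by a single thread, which is the property that lets actors avoid shared mutable state.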
Containers and Developer Defined Data Centers - Evan Powell - Keynote in Bang... | CodeOps Technologies LLP
DevOps and containers go hand in hand. The DevOps industry is expected to benefit significantly from the container ecosystem and technology. This keynote talks about the challenges and opportunities around deploying containers into production use cases.
Go Reactive: Event-Driven, Scalable, Resilient & Responsive Systems | Jonas Bonér
The document discusses the need for new tools and approaches for building event-driven, scalable, resilient, and responsive systems. It notes that the demands on applications have changed with the rise of mobile devices, multi-core architectures, and cloud computing. Systems now need to be interactive, responsive, and collaborative. The document advocates building systems that react to events, load, failure, and users using asynchronous messaging and avoiding shared mutable state. It discusses various reactive programming approaches like actors, agents, futures, and reactive extensions that enable building such systems.
"What does it really mean for your system to be available, or how to define w... | Fwdays
We will talk about system monitoring from a few different angles. We will start by covering the basics, then discuss SLOs, how to define them, and why understanding the business well is crucial for success in this exercise.
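One common way to turn an availability SLO into something measurable is an error budget: the amount of downtime the target still permits over a window. The sketch below is an illustration under stated assumptions, not material from the talk; the function name `allowed_downtime_minutes` and the 30-day window are hypothetical choices.

```python
def allowed_downtime_minutes(slo, window_days=30):
    """Return the downtime (in minutes) that an availability SLO permits.

    slo is a fraction such as 0.999 for "99.9% available".
    """
    if not 0.0 <= slo <= 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1.0 - slo)


budget = allowed_downtime_minutes(0.999)  # roughly 43.2 minutes per 30 days
```

Multiplying such a budget by an estimated cost per minute of downtime gives a first approximation of what a given SLO is worth to the business.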
"Microservices and multitenancy - how to serve thousands of databases in one ... | Fwdays
Imagine you are designing a B2B service that will serve millions of businesses. This service will have dozens of different microservices with their own data, which can contain millions of records. How do you design such a database? Why is sharding not always the answer? What other options are there for such an architectural solution?
I'll tell you how we at Uspacy came to serve thousands of small databases instead of a few large ones, what we've encountered, and what we expect to face next.
Similar to "$10 thousand per minute of downtime: architecture, queues, streaming and fintech", Max Baginskiy
Containers and Developer Defined Data Centers - Evan Powell - Keynote in Bang...CodeOps Technologies LLP
DevOps and Containers go hand in hand. DevOps industry is expected to benefit significantly benefit from the container eco-system and technology. This keynote talks about the challenges and opportunities around deploying containers into production use cases.
Go Reactive: Event-Driven, Scalable, Resilient & Responsive SystemsJonas Bonér
The document discusses the need for new tools and approaches for building event-driven, scalable, resilient, and responsive systems. It notes that the demands on applications have changed with the rise of mobile devices, multi-core architectures, and cloud computing. Systems now need to be interactive, responsive, and collaborative. The document advocates building systems that react to events, load, failure, and users using asynchronous messaging and avoiding shared mutable state. It discusses various reactive programming approaches like actors, agents, futures, and reactive extensions that enable building such systems.
"$10 thousand per minute of downtime: architecture, queues, streaming and fintech", Max Baginskiy
1. $10 thousand per minute of downtime: architecture, queues, streaming and fintech
Max Baginskiy
Solidgate
2. About me
Head of Engineering
Previously Tech Lead and Platform engineer
10 yrs in Software Engineering
Last 6 years Go, fan of DevOps
Build teams (5 teams, 30+ people hired)
And architecture
3. Agenda
Company intro
Architecture of the system
Queues and Streams to choose from
Low latency streaming using outbox
CDC to our solution - comparison
Questions.
4. About company
7+ years online
70 engineers
50 SW engineers
20 Infra + Data engineers + AQA
PCI DSS Compliant
European Acquirer
7. ALB Traffic
We have 100x less traffic on ALB during high season than Shopify
Stripe served 250 million API calls per day in 2020
8. Kafka Producer
We started integrating Kafka last year
20 rps average
2 mil events per day
9. RabbitMQ Producer
100-120 rps average
10 mil events per day
10. Logs
1.5-2k rps of logs
150 mil events per day. 200-300 GB of logs daily
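The per-day figures on slides 8 to 10 follow from the average request rates. A minimal sketch of that arithmetic, assuming the rates are 24-hour averages and using decimal units for the log volume (the function names are illustrative, not from the talk):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_day(avg_rps: float) -> float:
    """Convert an average request rate into a daily event count."""
    return avg_rps * SECONDS_PER_DAY


def avg_event_size_kb(gb_per_day: float, events: float) -> float:
    """Average event size in KB, using decimal units (1 GB = 1,000,000 KB)."""
    return gb_per_day * 1_000_000 / events


# Figures quoted on the slides: (label, low rps, high rps)
slide_rates = [
    ("Kafka producer", 20, 20),
    ("RabbitMQ producer", 100, 120),
    ("Logs", 1_500, 2_000),
]

for label, low, high in slide_rates:
    low_m = events_per_day(low) / 1e6
    high_m = events_per_day(high) / 1e6
    print(f"{label}: {low_m:.1f}-{high_m:.1f} million events/day")

# 200-300 GB spread over roughly 150 million log events per day
small = avg_event_size_kb(200, 150e6)
large = avg_event_size_kb(300, 150e6)
print(f"Average log event: {small:.2f}-{large:.2f} KB")
```

The computed ranges land close to the rounded numbers on the slides (about 2, 10 and 150 million events per day), which suggests the slide figures are simple averages rather than peak values.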
14. Non functional requirements
Durability out of the box
Queue replay
Single active consumer support
Easy to set up and to maintain
Partitioning
Easy scaling for publisher and consumer
Extensibility: schema registry support, dynamic routing, enrichment
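Partitioning and single active consumer support usually go together: if every event for one entity lands in the same partition, and each partition has exactly one active consumer, per-entity order is preserved while the system still scales out. A minimal sketch of key-based partitioning, assuming events are keyed by a payment ID (the keys, payloads and function names are hypothetical, not taken from the talk):

```python
import hashlib


def partition_for(key: str, partitions: int) -> int:
    """Map a routing key to a partition using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def route(events, partitions):
    """Group (key, payload) events into per-partition lists, keeping arrival order."""
    queues = {p: [] for p in range(partitions)}
    for key, payload in events:
        queues[partition_for(key, partitions)].append((key, payload))
    return queues


events = [
    ("payment-1", "created"),
    ("payment-2", "created"),
    ("payment-1", "authorized"),
    ("payment-2", "declined"),
    ("payment-1", "settled"),
]

queues = route(events, partitions=4)
for partition, items in queues.items():
    print(partition, items)
```

A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings and would route the same key differently after a restart.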
15. NFR - explanation
What if a message is lost between services while processing and we retry the payment?
What if a message is lost between the callback service and the callback processor?
What if a message is lost between the payment and finance systems?
What if … what if … what if …?
16. RabbitMQ dive in
Erlang
Written in Erlang. Erlang was made by Ericsson, which makes telecommunication devices.
Proof of fail-safety
ATM AXD301 example: calculated uptime of 99.9999999%, only one problem in many years.
Mnesia as storage
Mnesia doesn’t support recovery from split brain and other types of failures.
17. RabbitMQ Durability
Mechanisms
Publisher confirms are a MUST have
RabbitMQ can store data on disk and has different autoheal modes
Different types of queues: Quorum, Mirrored
Has streaming in “beta”.
What if publisher confirms are disabled?
Delivery after the exchange might not happen
Persistence might not happen
A few replicas might not acknowledge the message in a quorum queue
An overwhelmed cluster will not accept messages, but the publisher will not know.
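The last point is the easiest to underestimate, so here is a toy model of it. The sketch uses an in-memory stand-in for a broker that drops messages once it is full; it is not the RabbitMQ client API, and the class and function names are invented for illustration. Real publisher confirms are asynchronous, but the consequence is the same: without them the publisher cannot tell which messages were dropped.

```python
class ToyBroker:
    """In-memory stand-in for a broker that drops messages once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.stored = []

    def publish(self, message):
        """Return True (ack) if the message was stored, False (nack) if dropped."""
        if len(self.stored) >= self.capacity:
            return False
        self.stored.append(message)
        return True


def fire_and_forget(broker, messages):
    """Publish without confirms: the result is ignored, so drops go unnoticed."""
    for message in messages:
        broker.publish(message)
    return []  # the publisher believes nothing failed


def publish_with_confirms(broker, messages):
    """Publish with confirms: collect every message the broker did not ack."""
    unconfirmed = []
    for message in messages:
        if not broker.publish(message):
            unconfirmed.append(message)  # candidate for retry or alerting
    return unconfirmed


messages = [f"event-{i}" for i in range(5)]

blind = ToyBroker(capacity=3)
print(fire_and_forget(blind, messages), len(blind.stored))

checked = ToyBroker(capacity=3)
print(publish_with_confirms(checked, messages), len(checked.stored))
```

Both brokers end up storing the same three messages; only the confirming publisher knows that `event-3` and `event-4` still need to be retried.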
18. Quorum queues
+ Pros
Have Consensus built in
Data written to disk, metadata in memory
Can easily handle restarts.
+ Pros
Have Consensus built in
Data written to disk, metadata in memory
Can easily handle restarts.
+ Pros
Have Consensus built in
Data written to disk, metadata in memory
Can easily handle restarts.
− Cons
Doesn’t scale well - millions of messages after
restart can replicate hours.
Doen’t have “replay” mechanism
Consumers doesn’t scale
Doesn’t preserve order of messages.
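“Consensus built in” can be read as: a message counts as safely stored only once a majority of the queue's replicas have written it. A minimal sketch of that majority rule; the function names are illustrative and are not part of any RabbitMQ API.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def is_confirmed(acks: int, replicas: int) -> bool:
    """A write is confirmed once a majority of replicas acknowledged it."""
    return acks >= majority(replicas)


# With 3 replicas one slow or missing replica is tolerated, two are not.
three_node_with_two_acks = is_confirmed(acks=2, replicas=3)
three_node_with_one_ack = is_confirmed(acks=1, replicas=3)
```

The same arithmetic explains why a few replicas failing to acknowledge is acceptable only while the remaining ones still form a majority.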
20. Split brain - autoheal
ignore
Use when network reliability is the highest practically possible and node availability is of topmost importance.
pause_minority
Appropriate when clustering across racks or availability zones in a single region and the probability of losing a majority of nodes (zones) at once is considered to be very low.
autoheal
Appropriate when you are more concerned with continuity of service than with data consistency across nodes.
Summary - no way to guarantee that autoheal will work properly.
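For reference, the three modes are values of a single RabbitMQ setting, cluster_partition_handling. A minimal rabbitmq.conf fragment, with pause_minority chosen here purely as an example and not as the talk's recommendation:

```ini
# rabbitmq.conf
# Accepted values include: ignore, pause_minority, autoheal
cluster_partition_handling = pause_minority
```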
24. RabbitMQ + RabbitMQ streaming
New feature that not a lot of companies use.
Go client is not ready; what about Python or Node.js, I’m afraid to ask.
Hard to support. Requires updates of Erlang and then RabbitMQ.
Streaming is a plugin that requires a specific version of RabbitMQ.
Not made for fintech: lack of proper durability, lack of functionality.
26. Kafka dive in
Java
Written in Java by LinkedIn, then open-sourced and licensed under the Apache license.
Highly available and durable
Has a WAL, works in a cluster, saves data to disk by default.
Blazing fast
Sequential writes, zero copy.
27. Kafka dive in
Kafka relies on sequential writes and zero copy to optimize disk usage.
Has a WAL for replication and durability.
ZooKeeper, as a separate system, tracks the health of the cluster.
Can work even without ZooKeeper.
Chaos engineering shows that Kafka is a highly available and durable solution.
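To make “sequential writes” and “WAL” concrete, here is a toy append-only log. It is a sketch of the general idea only, not Kafka's on-disk format: the class name, the length-prefixed record layout and the in-memory index are all assumptions made for illustration, and zero copy is not demonstrated.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Single-file log: every write goes to the end, reads start at an offset."""

    def __init__(self, path):
        self.path = path
        self.index = []  # record offset -> byte position in the file
        self.file = open(path, "ab")

    def append(self, payload: bytes) -> int:
        """Append one record sequentially and return its offset."""
        position = self.file.tell()
        # 4-byte big-endian length prefix, then the payload itself.
        self.file.write(struct.pack(">I", len(payload)) + payload)
        self.file.flush()
        self.index.append(position)
        return len(self.index) - 1

    def read_from(self, offset: int):
        """Yield every record from the given offset to the end of the log."""
        with open(self.path, "rb") as reader:
            reader.seek(self.index[offset])
            while header := reader.read(4):
                (size,) = struct.unpack(">I", header)
                yield reader.read(size)

    def close(self):
        self.file.close()


log_path = os.path.join(tempfile.mkdtemp(), "segment.log")
log = AppendOnlyLog(log_path)
first = log.append(b"payment-created")
log.append(b"payment-captured")
replayed = list(log.read_from(first))
log.close()
```

Because records are never modified in place, a reader only needs an offset to resume, which is what makes such a log usable for both replication and recovery.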
28. Debezium
Debezium how-to:
Create a replication slot
Run the Debezium Java service in a cluster
Configure it with Groovy
29. Debezium
+ Pros
Uses the WAL directly - doesn’t create additional load on the WAL (no additional data is written).
Production-ready, tested solution.
Low latency.
− Cons
How to replay data? Can you specify a Log Sequence Number? What if you need to stream only a fraction of what is written to the WAL?
Missing Buf (protobuf on steroids).
Low flexibility and hard configurability.
DB isolation.
Groovy, which is not easy to use.
Random disconnects and the need to restart.
30. Transactional outbox
Why use a Transactional Outbox?
Neither Kafka nor CDC can flexibly re-stream data.
Without specific instruments you can’t remove specific events from Kafka.
Replay with Kafka will require setup of additional services.
Consistent state through the use of transactions.
35. Schema registry
“Spec first” approach - speeds up development.
Backward compatibility support - linters.
Reusable “mental model” = simplified migration from API to stream.
Client, server and models are generated for various languages.
Simplified versioning.
36. Taxer v1 option we built
Update payment in Gate (Kotlin).
Transaction: save the payment update and create a record in the Outbox.
Orderstreamer (Go) - reads a batch from the outbox.
Publish data to the stream.
Update the offset in the meta table.
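The steps above can be sketched end to end with an in-memory SQLite database. This is an illustration under assumptions, not the Taxer code: the table and function names are invented, a plain auto-increment integer stands in for the outbox key, and publishing is a callback instead of a real Kafka producer.

```python
import json
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE payments (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL
        );
        CREATE TABLE meta (name TEXT PRIMARY KEY, last_seq INTEGER NOT NULL);
        INSERT INTO meta VALUES ('orderstreamer', 0);
    """)
    return db


def update_payment(db, payment_id, status):
    # One transaction: the state change and its outbox record
    # are committed together or rolled back together.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO payments (id, status) VALUES (?, ?)",
            (payment_id, status),
        )
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"payment_id": payment_id, "status": status}),),
        )


def stream_batch(db, publish, batch_size=100):
    """Read a batch after the stored offset, publish it, then move the offset."""
    (last_seq,) = db.execute(
        "SELECT last_seq FROM meta WHERE name = 'orderstreamer'"
    ).fetchone()
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE seq > ? ORDER BY seq LIMIT ?",
        (last_seq, batch_size),
    ).fetchall()
    for _, payload in rows:
        publish(payload)
    if rows:
        with db:
            db.execute(
                "UPDATE meta SET last_seq = ? WHERE name = 'orderstreamer'",
                (rows[-1][0],),
            )
    return len(rows)


db = open_db()
update_payment(db, "p-1", "authorized")
update_payment(db, "p-1", "captured")
published = []
first_run = stream_batch(db, published.append)
second_run = stream_batch(db, published.append)
```

Publishing before the offset update means a crash between the two steps re-sends the batch, so consumers in this sketch should expect at-least-once delivery.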
40. v1 comparison with typical architecture
+ Pros
We have a full transaction log that can be replayed, reworked, saved, fixed.
Only 1 new tech - Kafka.
Streamer + Leaser = 200 lines of code + 800 lines of tests. It can be used as a library, not a service.
Buf/Go/PostgreSQL - everything reused - maintenance simplified.
− Cons
WAL amplification - 2x. The transactional outbox requires 1 more write for each operation.
High delay - 2 min for events.
More CPU load than just reading from the WAL.
45. Orderstreamer delay: 2 min
ULIDs don’t allow us to understand the commit order of events or to detect missing parts.
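One way to picture the problem, assuming IDs are assigned before commit and the streamer polls for IDs above its last offset (an assumption about the failure mode, not something the slides spell out). The strings "01A" and "01B" are stand-ins for time-ordered ULIDs.

```python
def poll(committed_ids, last_offset):
    """Return committed IDs above the offset, as an offset-based reader would."""
    return sorted(i for i in committed_ids if i > last_offset)


committed = set()
offset = ""
seen = []

# The transaction holding the later ID commits first.
committed.add("01B")
batch = poll(committed, offset)
seen += batch
offset = batch[-1]

# The transaction holding the earlier ID commits afterwards.
committed.add("01A")
seen += poll(committed, offset)  # returns nothing: "01A" sorts below the offset

missed = sorted(committed - set(seen))
```

Nothing in the IDs themselves tells the reader that "01A" was still in flight, so the only defence in this sketch is to wait long enough before reading.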
47. v2 Implementation
Auto increment instead of ULID will help you to report and look for missing IDs. It’s more like “logical time”.
Look for missing IDs, save them in the meta table for 2 mins and restream them when they appear.
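A minimal sketch of the gap-tracking idea, assuming dense auto-increment IDs starting at 1. The class and method names are hypothetical, the in-memory dict stands in for the meta table, and the clock is injectable so the 2 minute window can be exercised without waiting.

```python
import time


class GapTracker:
    """Remembers IDs missing from an auto-increment sequence for a limited time."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.last_id = 0
        self.missing = {}  # id -> time the gap was first noticed

    def observe(self, batch_ids):
        """Record a batch of streamed IDs and return the gaps it reveals."""
        new_gaps = []
        for current in sorted(batch_ids):
            for gap in range(self.last_id + 1, current):
                self.missing[gap] = self.clock()
                new_gaps.append(gap)
            self.last_id = max(self.last_id, current)
        return new_gaps

    def recover(self, committed_ids):
        """Return tracked IDs that have now appeared; forget expired ones."""
        now = self.clock()
        found = sorted(i for i in self.missing if i in committed_ids)
        for i in found:
            del self.missing[i]
        expired = [i for i, noticed in self.missing.items() if now - noticed > self.ttl]
        for i in expired:
            del self.missing[i]  # treated as rolled back or never committed
        return found


now = [0.0]
tracker = GapTracker(ttl_seconds=120.0, clock=lambda: now[0])
gaps = tracker.observe([1, 2, 4, 6])           # IDs 3 and 5 are missing
now[0] = 30.0
restream = tracker.recover({1, 2, 3, 4, 6})    # ID 3 committed late
now[0] = 200.0
late = tracker.recover({1, 2, 3, 4, 6})        # ID 5 expires after 2 minutes
```

Unlike ULIDs, a dense integer sequence makes a hole visible immediately, so the main stream does not have to be delayed; only the missing IDs wait.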