Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build to do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, flame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source tools in the same way to find performance wins in your own environment."
Performance Wins with eBPF: Getting Started (2021) - Brendan Gregg
This document provides an overview of using eBPF (extended Berkeley Packet Filter) to quickly get performance wins as a sysadmin. It recommends installing BCC and bpftrace tools to easily find issues like periodic processes, misconfigurations, unexpected TCP sessions, or slow file system I/O. A case study examines using biosnoop to identify which processes were causing disk latency issues. The document suggests thinking like a sysadmin first by running tools, then like a programmer if a problem requires new tools. It also outlines recommended frontends depending on use cases and provides references to learn more about BPF.
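For illustration only (not from the document itself), here is a minimal bpftrace sketch in the same spirit, counting new outbound TCP sessions by process; the tcp_connect attach point is an assumption and may vary by kernel version:
#!/usr/local/bin/bpftrace
// Count outbound TCP connections by process name: a quick check for the
// "unexpected TCP sessions" issue mentioned above (attach point assumed).
kprobe:tcp_connect
{
    @connects[comm] = count();
}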
Extreme Linux Performance Monitoring and Tuning - Milind Koyande
This document provides an introduction to monitoring Linux system performance. It discusses determining the type of application running and establishing a baseline of typical system usage. Key CPU concepts are then outlined such as hardware interrupts, soft interrupts, real-time threads and kernel/user threads. Context switches between threads and the thread scheduling queue are also introduced. The goal is to understand typical system behavior and identify any bottlenecks.
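As a hedged illustration of baselining context-switch behavior (not from the original slides), a minimal bpftrace sketch using the sched:sched_switch tracepoint:
#!/usr/local/bin/bpftrace
// Count context switches per process name, to compare against a known baseline.
tracepoint:sched:sched_switch
{
    @switches[comm] = count();
}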
USENIX ATC 2017: Visualizing Performance with Flame Graphs - Brendan Gregg
Talk by Brendan Gregg for USENIX ATC 2017.
"Flame graphs are a simple stack trace visualization that helps answer an everyday problem: how is software consuming resources, especially CPUs, and how did this change since the last software version? Flame graphs have been adopted by many languages, products, and companies, including Netflix, and have become a standard tool for performance analysis. They were published in "The Flame Graph" article in the June 2016 issue of Communications of the ACM, by their creator, Brendan Gregg.
This talk describes the background for this work, and the challenges encountered when profiling stack traces and resolving symbols for different languages, including for just-in-time compiler runtimes. Instructions will be included for generating mixed-mode flame graphs on Linux, along with examples from our use at Netflix with Java. Advanced flame graph types will be described, including differential, off-CPU, chain graphs, memory, and TCP events. Finally, future work and unsolved problems in this area will be discussed."
Velocity 2017 Performance analysis superpowers with Linux eBPF - Brendan Gregg
Talk for Velocity 2017 by Brendan Gregg: Performance analysis superpowers with Linux eBPF.
"Advanced performance observability and debugging have arrived built into the Linux 4.x series, thanks to enhancements to Berkeley Packet Filter (BPF, or eBPF) and the repurposing of its sandboxed virtual machine to provide programmatic capabilities to system tracing. Netflix has been investigating its use for new observability tools, monitoring, security uses, and more. This talk will investigate this new technology, which sooner or later will be available to everyone who uses Linux. The talk will dive deep on these new tracing, observability, and debugging capabilities. Whether you’re doing analysis over an ssh session, or via a monitoring GUI, BPF can be used to provide an efficient, custom, and deep level of detail into system and application performance.
This talk will also demonstrate the new open source tools that have been developed, which make use of kernel- and user-level dynamic tracing (kprobes and uprobes), and kernel- and user-level static tracing (tracepoints). These tools provide new insights for file system and storage performance, CPU scheduler performance, TCP performance, and a whole lot more. This is a major turning point for Linux systems engineering, as custom advanced performance instrumentation can be used safely in production environments, powering a new generation of tools and visualizations."
Arrow Flight is a proposed RPC layer for Apache Arrow that allows for efficient transfer of Arrow record batches between systems. It uses GRPC as the foundation to define streams of Arrow data that can be consumed in parallel across locations. Arrow Flight supports custom actions that can be used to build services on top of the generic API. By extending GRPC, Arrow Flight aims to simplify the creation of data applications while enabling high performance data transfer and locality awareness.
This document discusses exactly once semantics in Apache Kafka 0.11. It provides an overview of how Kafka achieved exactly once delivery between producers and consumers. Key points include:
- Kafka 0.11 introduced exactly once semantics with changes to support transactions and deduplication.
- Producers can write in a transactional fashion and receive acknowledgments of committed writes from brokers.
- Brokers store commit markers to track the progress of transactions and ensure no data loss during failures.
- Consumers can read from brokers in a transactional mode and receive data only from committed transactions, guaranteeing no duplication of records.
- This allows reliable message delivery semantics between producers and consumers with Kafka acting as
How I learned to time travel, or, data pipelining and scheduling with Airflow - PyData
This document discusses how the author learned to use Airflow for data pipelining and scheduling tasks. It describes some early tools like Cron and Luigi that were used for scheduling. It then evaluates options like Drake, Pydoit, Pinball, Luigi, and AWS Data Pipeline before settling on Airflow due to its sophistication in handling complex dependencies, built-in scheduling and monitoring, and flexibility. The author also develops a plugin called smart-airflow to add file-based checkpointing capabilities to Airflow to track intermediate data transformations.
This talk discusses Linux profiling using perf_events (also called "perf") based on Netflix's use of it. It covers how to use perf to get CPU profiling working and overcome common issues. The speaker will give a tour of perf_events features and show how Netflix uses it to analyze performance across their massive Amazon EC2 Linux cloud. They rely on tools like perf for customer satisfaction, cost optimization, and developing open source tools like NetflixOSS. Key aspects covered include why profiling is needed, a crash course on perf, CPU profiling workflows, and common "gotchas" to address like missing stacks, symbols, or profiling certain languages and events.
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
This document provides a performance engineer's predictions for computing performance trends in 2021 and beyond. The engineer discusses trends in processors, memory, disks, networking, runtimes, kernels, hypervisors, and observability. For processors, predictions include multi-socket systems becoming less common, the future of simultaneous multithreading being unclear, practical core count limits being reached in the 2030s, and more processor vendors including ARM-based and RISC-V options. Memory predictions focus on many workloads being memory-bound currently.
In the Cloud Native community, eBPF is gaining popularity and is often the best solution for challenges that require deep observability of the system. eBPF is currently being embraced by major players.
Mydbops co-founder Kabilesh P.R (MySQL and Mongo consultant) illustrates debugging Linux issues with eBPF: a brief introduction to BPF and eBPF, BPF internals, and the tools in action for faster resolution.
VictoriaLogs: Open Source Log Management System - Preview - VictoriaMetrics
VictoriaLogs Preview - Aliaksandr Valialkin
* Existing open source log management systems
- ELK (ElasticSearch) stack: Pros & Cons
- Grafana Loki: Pros & Cons
* What is VictoriaLogs
- Open source log management system from VictoriaMetrics
- Easy to set up and operate
- Scales vertically and horizontally
- Optimized for low resource usage (CPU, RAM, disk space)
- Accepts data from Logstash and Fluentbit in Elasticsearch format
- Accepts data from Promtail in Loki format
- Supports stream concept from Loki
- Provides an easy-to-use yet powerful query language - LogsQL
* LogsQL Examples
- Search by time
- Full-text search
- Combining search queries
- Searching arbitrary labels
* Log Streams
- What is a log stream?
- LogsQL examples: querying log streams
- Stream labels vs log labels
* LogsQL: stats over access logs
* VictoriaLogs: CLI Integration
* VictoriaLogs Recap
Talk for AWS re:Invent 2014. Video: https://www.youtube.com/watch?v=7Cyd22kOqWc . Netflix tunes Amazon EC2 instances for maximum performance. In this session, you learn how Netflix configures the fastest possible EC2 instances, while reducing latency outliers. This session explores the various Xen modes (e.g., HVM, PV, etc.) and how they are optimized for different workloads. Hear how Netflix chooses Linux kernel versions based on desired performance characteristics and receive a firsthand look at how they set kernel tunables, including hugepages. You also hear about Netflix’s use of SR-IOV to enable enhanced networking and their approach to observability, which can exonerate EC2 issues and direct attention back to application performance.
This document discusses improving the developer experience through GitOps and ArgoCD. It recommends building developer self-service tools for cloud resources and Kubernetes to reduce frustration. Example GitLab CI/CD pipelines are shown that handle releases, deployments to ECR, and patching apps in an ArgoCD repository to sync changes. The goal is to create faster feedback loops through Git operations and automation to motivate developers.
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc... - Henning Jacobs
Kubernetes has the concept of resource requests and limits. Pods get scheduled on the nodes based on their requests and optionally limited in how much of the resource they can consume. Understanding and optimizing resource requests/limits is crucial both for reducing resource "slack" and ensuring application performance/low-latency. This talk shows our approach to monitoring and optimizing Kubernetes resources for 80+ clusters to achieve cost-efficiency and reducing impact for latency-critical applications. All shown tools are Open Source and can be applied to most Kubernetes deployments.
This document discusses how eBPF (extended Berkeley Packet Filter) can be used for kernel tracing. It provides an overview of BPF and eBPF, how eBPF programs are compiled and run in the kernel, the use of BPF maps, and how eBPF enables new possibilities for dynamic kernel instrumentation through techniques like Kprobes and ftrace.
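A minimal sketch of dynamic kernel instrumentation with a kprobe (illustrative only; not taken from the document):
#!/usr/local/bin/bpftrace
// Dynamic instrumentation: attach to vfs_read() and count calls per process.
// The counts live in a BPF map that the front end reads and prints on exit.
kprobe:vfs_read
{
    @reads[comm] = count();
}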
This document discusses tracing in the Linux kernel. It describes various tracing mechanisms like ftrace, tracepoints, kprobes, perf, and eBPF. Ftrace allows tracing functions via compiler instrumentation or dynamically. Tracepoints define custom trace events that can be inserted at specific points. Kprobes and related probes like jprobes allow tracing kernel functions. Perf provides performance monitoring capabilities. eBPF enables custom tracing programs to be run efficiently in the kernel via just-in-time compilation. Tracing tools like perf, systemtap, and LTTng provide user interfaces.
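For comparison, a minimal sketch of static tracing via a tracepoint (illustrative only):
#!/usr/local/bin/bpftrace
// Static tracing: count system calls per process via the raw_syscalls tracepoint.
tracepoint:raw_syscalls:sys_enter
{
    @syscalls[comm] = count();
}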
Apache Iceberg - A Table Format for Huge Analytic Datasets - Alluxio, Inc.
Data Orchestration Summit
www.alluxio.io/data-orchestration-summit-2019
November 7, 2019
Apache Iceberg - A Table Format for Huge Analytic Datasets
Speaker:
Ryan Blue, Netflix
For more Alluxio events: https://www.alluxio.io/events/
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,... - confluent
RocksDB is the default state store for Kafka Streams. In this talk, we will discuss how to improve single node performance of the state store by tuning RocksDB and how to efficiently identify issues in the setup. We start with a short description of the RocksDB architecture. We discuss how Kafka Streams restores the state stores from Kafka by leveraging RocksDB features for bulk loading of data. We give examples of hand-tuning the RocksDB state stores based on Kafka Streams metrics and RocksDB’s metrics. At the end, we dive into a few RocksDB command line utilities that allow you to debug your setup and dump data from a state store. We illustrate the usage of the utilities with a few real-life use cases. The key takeaway from the session is the ability to understand the internal details of the default state store in Kafka Streams so that engineers can fine-tune their performance for different varieties of workloads and operate the state stores in a more robust manner.
A Deep Dive into Query Execution Engine of Spark SQL - Databricks
Spark SQL enables Spark to perform efficient and fault-tolerant relational query processing with analytics database technologies. The relational queries are compiled to the executable physical plans consisting of transformations and actions on RDDs with the generated Java code. The code is compiled to Java bytecode, executed at runtime by JVM and optimized by JIT to native machine code at runtime. This talk will take a deep dive into Spark SQL execution engine. The talk includes pipelined execution, whole-stage code generation, UDF execution, memory management, vectorized readers, lineage based RDD transformation and action.
How Netflix Tunes EC2 Instances for Performance - Brendan Gregg
CMP325 talk for AWS re:Invent 2017, by Brendan Gregg. "
At Netflix we make the best use of AWS EC2 instance types and features to create a high performance cloud, achieving near bare metal speed for our workloads. This session will summarize the configuration, tuning, and activities for delivering the fastest possible EC2 instances, and will help other EC2 users improve performance, reduce latency outliers, and make better use of EC2 features. We'll show how we choose EC2 instance types, how we choose between EC2 Xen modes: HVM, PV, and PVHVM, and the importance of EC2 features such as SR-IOV for bare-metal performance. SR-IOV is used by EC2 enhanced networking, and recently for the new i3 instance type for enhanced disk performance as well. We'll also cover kernel tuning and observability tools, from basic to advanced. Advanced performance analysis includes the use of Java and Node.js flame graphs, and the new EC2 Performance Monitoring Counter (PMC) feature released this year."
This document summarizes how LINE messaging uses Redis. It discusses:
- How LINE has scaled Redis from 3 nodes in 2011 to over 14,000 nodes today to support over 25 billion messages per day.
- The key ways LINE uses Redis, including for storing sequences, caches, secondary indexes, and local queues.
- Challenges LINE faced in scaling their in-house Redis cluster to over 1,000 nodes and workarounds they developed.
- How LINE monitors Redis for slow commands, bursting operations, and connections to address performance issues.
- Future work LINE is doing with Redis 4 and improving latency and scalability.
Using eBPF for High-Performance Networking in Cilium - ScyllaDB
The Cilium project is a popular networking solution for Kubernetes, based on eBPF. This talk uses eBPF code and demos to explore the basics of how Cilium makes network connections, and manipulates packets so that they can avoid traversing the kernel's built-in networking stack. You'll see how eBPF enables high-performance networking as well as deep network observability and security.
(BDT318) How Netflix Handles Up To 8 Million Events Per Second - Amazon Web Services
In this session, Netflix provides an overview of Keystone, their new data pipeline. The session covers how Netflix migrated from Suro to Keystone, including the reasons behind the transition and the challenges of zero loss while processing over 400 billion events daily. The session covers in detail how they deploy, operate, and scale Kafka, Samza, Docker, and Apache Mesos in AWS to manage 8 million events & 17 GB per second during peak.
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan... - GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
The talk focuses on administering, developing, and monitoring a platform built with Apache Spark, Apache Flink, and Kubeflow, where the monitoring stack is based on Prometheus.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of the best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
The need for gleaning answers from unbounded data streams is moving from a nicety to a necessity. Netflix is a data-driven company that needs to process over 1 trillion events a day, amounting to 3 PB of data, to derive business insights.
To ease extracting insight, we are building a self-serve, scalable, fault-tolerant, multi-tenant "Stream Processing as a Service" platform so the user can focus on data analysis. I'll share our experience using Flink to help build the platform.
Monitoring in Motion: Monitoring Containers and Amazon ECS - Amazon Web Services
Containers and other forms of dynamic infrastructure can prove challenging to monitor. How do you define normal when your infrastructure is intentionally in motion and changes from minute to minute? Join us as we discuss proven strategies for monitoring your containerized infrastructure on AWS and ECS.
Exploring the Final Frontier of Data Center Orchestration: Network Elements -... - Puppet
The document discusses network element automation using Puppet. It provides context on the challenges of manual network configuration including lack of agility, reliability issues from errors, and time spent on basic tasks. Puppet can automate network elements similar to how it automates servers, reducing errors and improving speed/productivity. The Cisco Nexus platform and NXAPI enable programmatic access for automation using Puppet through technologies like onePK and LXC containers running on the switch.
Surge 2014: From Clouds to Roots: root cause performance analysis at Netflix. Brendan Gregg.
At Netflix, high scale and fast deployment rule. The possibilities for failure are endless, and the environment excels at handling this, regularly tested and exercised by the simian army. But, when this environment automatically works around systemic issues that aren’t root-caused, they can grow over time. This talk describes the challenge of not just handling failures of scale on the Netflix cloud, but also new approaches and tools for quickly diagnosing their root cause in an ever changing environment.
Strata Singapore: Gearpump - Real-time DAG-Processing with Akka at Scale - Sean Zhong
Gearpump is an Akka-based real-time streaming engine that uses the Actor model for everything, giving it high performance and flexibility: 18,000,000 messages/second with 8 ms latency on a cluster of 4 machines.
Wire data provides deep insights across IT, security and business use cases by capturing the communications transmitted over the wire between machines and applications in real-time. The Splunk App for Stream enables new operational intelligence by indexing this wire data without needing instrumentation. It provides enhanced visibility, efficient cloud-ready collection, and fast time to value through interface-driven deployment. Key features include protocol decoding, attribute filtering, aggregations, and custom content extraction for analysis in Splunk.
Dissecting Open Source Cloud Evolution: An OpenStack Case Study - Salman Baset
This document discusses methods for understanding the evolution of open source cloud systems like OpenStack. It presents the authors' solution of using tracing techniques to analyze OpenStack's data and message flows for logical operations such as creating and deleting VMs. Key findings from tracing OpenStack releases include significant behavioral changes between releases, hundreds of database queries and AMQP messages required for operations, and the involvement of components like Keystone, Glance, Nova, and Neutron. The authors propose using their techniques to inject faults and build a knowledge base to aid future problem diagnosis.
This document provides an overview of a workshop on cloud native, capacity, performance, and cost optimization tools and techniques. It begins by introducing the difference between a presentation and a workshop. It then covers introducing attendees, presenting on various cloud native topics like migration paths and operations tools, and benchmarking Cassandra performance at scale across AWS regions. The goal is to explore cloud native techniques while discussing specific problems attendees face.
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo... - HostedbyConfluent
The Ohio Department of Transportation has adopted Confluent as the event driven enabler of DriveOhio, a modern Intelligent Transportation System. DriveOhio digitally links sensors, cameras, speed monitoring equipment, and smart highway assets in real time, to dynamically adjust the surface road network to maximize the safety and efficiency for travelers. Over the past 24 months the team has increased the number and types of devices within the DriveOhio environment, while also working to see their vendors adopt Kafka to better participate in data sharing.
Proactive ops for container orchestration environments - Docker, Inc.
This document discusses different approaches to monitoring systems from manual and reactive to proactive monitoring using container orchestration tools. It provides examples of metrics to monitor at the host/hardware, networking, application, and orchestration layers. The document emphasizes applying the principles of observability including structured logging, events and tracing with metadata, and monitoring the monitoring systems themselves. Speakers provide best practices around failure prediction, understanding failure modes, and using chaos engineering to build system resilience.
This document provides an overview and agenda for the Splunk App for Stream, including:
- The architecture of the Stream Forwarder for capturing wire data and routing it to Splunk.
- The architecture of the App for Stream for analyzing wire data in Splunk.
- Examples of deployment architectures for ingesting wire data.
- A customer use case where wire data from the network helped provide visibility that log data could not due to access restrictions.
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly - SolarWinds Loggly
This document summarizes Loggly's transition from their first generation log management infrastructure to their second generation infrastructure built on Apache Kafka, Twitter Storm, and ElasticSearch on AWS. The first generation faced challenges around tightly coupling event ingestion and indexing. The new system uses Kafka as a persistent queue, Storm for real-time event processing, and ElasticSearch for search and storage. This architecture leverages AWS services like auto-scaling and provisioned IOPS for high availability and scale. The new system provides improved elasticity, multi-tenancy, and a pre-production staging environment.
This document discusses end-to-end processing of 3.7 million telemetry events per second using a lambda architecture at Symantec. It provides an overview of Symantec's security data lake infrastructure, the telemetry data processing architecture using Kafka, Storm and HBase, tuning targets for the infrastructure components, and performance benchmarks for Kafka, Storm and Hive.
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ... - Josef Adersberger
Running applications on Kubernetes can provide a lot of benefits: more dev speed, lower ops costs, and a higher elasticity & resiliency in production. Kubernetes is the place to be for cloud native apps. But what to do if you’ve no shiny new cloud native apps but a whole bunch of JEE legacy systems? No chance to leverage the advantages of Kubernetes? Yes you can!
We’re facing the challenge of migrating hundreds of JEE legacy applications of a major German insurance company onto a Kubernetes cluster within one year. We're now close to the finish line and it worked pretty well so far.
The talk will be about the lessons we've learned - the best practices and pitfalls we've discovered along our way. We'll provide our answers to life, the universe and a cloud native journey like:
- What technical constraints of Kubernetes can be obstacles for applications and how to tackle these?
- How to architect a landscape of hundreds of containerized applications with their surrounding infrastructure like DBs, MQs, and IAM, and heavy requirements on security?
- How to industrialize and govern the migration process?
- How to leverage the possibilities of a cloud native platform like Kubernetes without challenging the tight timeline?
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ... - QAware GmbH
CloudNativeCon North America 2017, Austin (Texas, USA): Talk by Josef Adersberger (@adersberger, CTO at QAware)
Abstract:
Running applications on Kubernetes can provide a lot of benefits: more dev speed, lower ops costs, and a higher elasticity & resiliency in production. Kubernetes is the place to be for cloud native apps. But what to do if you’ve no shiny new cloud native apps but a whole bunch of JEE legacy systems? No chance to leverage the advantages of Kubernetes? Yes you can!
We’re facing the challenge of migrating hundreds of JEE legacy applications of a major German insurance company onto a Kubernetes cluster within one year. We're now close to the finish line and it worked pretty well so far.
The talk will be about the lessons we've learned - the best practices and pitfalls we've discovered along our way. We'll provide our answers to life, the universe and a cloud native journey like:
- What technical constraints of Kubernetes can be obstacles for applications and how to tackle these?
- How to architect a landscape of hundreds of containerized applications with their surrounding infrastructure like DBs, MQs, and IAM, and heavy requirements on security?
- How to industrialize and govern the migration process?
- How to leverage the possibilities of a cloud native platform like Kubernetes without challenging the tight timeline?
OSCON Data 2011 -- NoSQL @ Netflix, Part 2 - Sid Anand
The document discusses translating concepts from relational databases to key-value stores. It covers normalizing data to avoid issues like data inconsistencies and loss. While key-value stores don't support relations, transactions, or SQL, the relationships can be composed in the application layer for smaller datasets. Picking the right data for key-value stores involves accessing data primarily by key lookups.
Lessons Learnt from Running Thousands of On-demand Spark Applications - Itai Yaffe
Ada Sharoni (Software Engineering Architect) @ Hunters:
Imagine you had to manage thousands of Spark applications that are automatically spinning up on-demand upon every customer interaction.
Our unique constraints in Hunters have led us to adopt an architecture and concepts that we believe many other companies will find useful.
In this lecture we will share our solutions and insights in running many lightweight, cheap Spark applications on Kubernetes, that can easily survive frequent restarts and smartly share resources on Spot EC2 instances.
The document discusses challenges with processor benchmarking and provides recommendations. It summarizes a case study where a popular CPU benchmark claimed a new processor was 2.6x faster than Intel, but detailed analysis found the benchmark was testing division speed, which accounted for only 0.1% of cycles on Netflix servers. The document advocates for low-level, active benchmarking and profiling over statistical analysis. It also provides a checklist for evaluating benchmarks and cautions that increased processor complexity and cloud environments make accurate benchmarking more difficult.
Talk for Facebook Systems@Scale 2021 by Brendan Gregg: "BPF (eBPF) tracing is the superpower that can analyze everything, helping you find performance wins, troubleshoot software, and more. But with many different front-ends and languages, and years of evolution, finding the right starting point can be hard. This talk will make it easy, showing how to install and run selected BPF tools in the bcc and bpftrace open source projects for some quick wins. Think like a sysadmin, not like a programmer."
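As one example of the kind of quick win described (a sketch, not taken from the talk itself), an execsnoop-style bpftrace program that prints every new process as it is exec'd:
#!/usr/local/bin/bpftrace
// Print the argv of each new process; useful for spotting unexpected
// short-lived processes that periodically consume CPU.
tracepoint:syscalls:sys_enter_execve
{
    join(args->argv);
}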
Computing Performance: On the Horizon (2021) - Brendan Gregg
Talk by Brendan Gregg for USENIX LISA 2021. https://www.youtube.com/watch?v=5nN1wjA_S30 . "The future of computer performance involves clouds with hardware hypervisors and custom processors, servers running a new type of BPF software to allow high-speed applications and kernel customizations, observability of everything in production, new Linux kernel technologies, and more. This talk covers interesting developments in systems and computing performance, their challenges, and where things are headed."
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
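A small sketch (not from the talk) showing the two output paths the talk dissects: per-event records pushed to user space, and map data aggregated in kernel context and fetched by the bpftrace front end:
#!/usr/local/bin/bpftrace
// Aggregated map data: summed in kernel context, read and printed at exit.
tracepoint:syscalls:sys_exit_read
/args->ret > 0/
{
    @read_bytes[comm] = sum(args->ret);
}
// Per-event data: each record is sent to user space via the output buffer.
tracepoint:sched:sched_process_exit
{
    printf("exit: pid %d comm %s\n", pid, comm);
}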
Performance Wins with BPF: Getting Started - Brendan Gregg
Keynote by Brendan Gregg for the eBPF summit, 2020. How to get started finding performance wins using the BPF (eBPF) technology. This short talk covers the quickest and easiest way to find performance wins using BPF observability tools on Linux.
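One such quick and easy starting point (a sketch under the same assumptions, not taken from the keynote) is tracing file opens by process:
#!/usr/local/bin/bpftrace
// Show which files are being opened, and by which process.
tracepoint:syscalls:sys_enter_openat
{
    printf("%-16s %s\n", comm, str(args->filename));
}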
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
re:Invent 2019 BPF Performance Analysis at Netflix - Brendan Gregg
This document provides an overview of Brendan Gregg's presentation on BPF performance analysis at Netflix. It discusses:
- Why BPF is changing the Linux OS model to become more event-based and microkernel-like.
- The internals of BPF including its origins, instruction set, execution model, and how it is integrated into the Linux kernel.
- How BPF enables a new class of custom, efficient, and safe performance analysis tools for analyzing various Linux subsystems like CPUs, memory, disks, networking, applications, and the kernel.
- Examples of specific BPF-based performance analysis tools developed by Netflix, AWS, and others for analyzing tasks, scheduling, page faults
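As a hedged illustration of the kind of BPF-based tool summarized above (not from the talk), a bpftrace sketch that counts page faults by process using a sampled software event:
#!/usr/local/bin/bpftrace
// Count page faults per process, sampling every 100th fault to keep overhead low.
software:page-faults:100
{
    @faults[comm] = count();
}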
UM2019 Extended BPF: A New Type of Software - Brendan Gregg
BPF (Berkeley Packet Filter) has evolved from a limited virtual machine for efficient packet filtering to a new type of software called extended BPF. Extended BPF allows for custom, efficient, and production-safe performance analysis tools and observability programs to be run in the Linux kernel through BPF. It enables new event-based applications running as BPF programs attached to various kernel events like kprobes, uprobes, tracepoints, sockets, and more. Major companies like Facebook, Google, and Netflix are using BPF programs for tasks like intrusion detection, container security, firewalling, and observability with over 150,000 AWS instances running BPF programs. BPF provides a new program model and security features compared
Talk by Brendan Gregg for USENIX LISA 2019: Linux Systems Performance. Abstract: "
Systems performance is an effective discipline for performance analysis and tuning, and can help you find performance wins for your applications and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas of Linux systems performance: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (Ftrace, bcc/BPF, and bpftrace/BPF), and much advice about what is and isn't important to learn. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud."
This document discusses Brendan Gregg's opinions on various tracing tools including sysdig, perf, ftrace, eBPF, bpftrace, and BPF perf tools. It provides a table comparing the scope, capability, and ease of use of these tools. It then gives an example of using BPF perf tools to analyze readahead performance. Finally, it outlines desired additions to tracing capabilities and BPF helpers as well as challenges in areas like function tracing without frame pointers.
Here is a bpftrace program to measure the latency of ICMP echo requests:
#!/usr/local/bin/bpftrace
kprobe:icmp_send
{
    // Timestamp when the ICMP packet is queued for transmit.
    @start[tid] = nsecs;
}
kprobe:__netif_receive_skb_core
/@start[tid]/
{
    // Only measure when a send was seen on this thread, to avoid bogus deltas.
    @diff = hist(nsecs - @start[tid]);
    delete(@start[tid]);
}
END
{
    print(@diff);
    clear(@diff);
    clear(@start);
}
This traces the time between the icmp_send kernel function (when the packet is queued for transmit) and the __netif_receive_skb_core function (when the response packet is received).
This document summarizes Brendan Gregg's experiences working at Netflix for over 4.5 years. Some key points include:
- The company culture at Netflix is openly documented and encourages independent decision making, open communication, and sharing information broadly.
- Gregg's first meeting involved an expected "intense debate" but was actually professional and respectful.
- Netflix values judgment, communication, curiosity, courage, and other traits that allow the culture and architecture to complement each other.
- The cloud architecture is designed to be resilient through practices like chaos engineering and rapid deployments without approvals, in line with the culture of freedom and responsibility.
The document describes a biolatency tool that traces block device I/O latency using eBPF. It discusses how the tool was originally written in the bcc framework using C/BPF, but has since been rewritten in the bpftrace framework using a simpler one-liner script. It provides examples of the bcc and bpftrace implementations of biolatency.
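For reference, a minimal biolatency-style bpftrace sketch (the attach points are assumptions; these kernel functions are renamed or inlined on some kernel versions):
#!/usr/local/bin/bpftrace
// Block I/O latency histogram, in microseconds, from issue to completion.
kprobe:blk_account_io_start
{
    @start[arg0] = nsecs;
}
kprobe:blk_account_io_done
/@start[arg0]/
{
    @usecs = hist((nsecs - @start[arg0]) / 1000);
    delete(@start[arg0]);
}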
Talk by Brendan Gregg and Martin Spier for the Linkedin Performance Engineering meetup on Nov 8, 2018. FlameScope is a visualization for performance profiles that helps you study periodic activity, variance, and perturbations, with a heat map for navigation and flame graphs for code analysis.
Talk by Brendan Gregg for All Things Open 2018. "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability and the new open source tools that use it, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems with the latest Linux kernels and exciting features."
Linux Performance 2018 (PerconaLive keynote) - Brendan Gregg
Keynote for PerconaLive 2018 by Brendan Gregg. Video: https://youtu.be/sV3XfrfjrPo?t=30m51s . "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems, whether they are databases or application servers, with the latest Linux kernels and exciting features."
Talk for USENIX LISA17: "Containers pose interesting challenges for performance monitoring and analysis, requiring new analysis methodologies and tooling. Resource-oriented analysis, as is common with systems performance tools and GUIs, must now account for both hardware limits and soft limits, as implemented using cgroups. A reverse diagnosis methodology can be applied to identify whether a container is resource constrained, and by which hard or soft resource. The interaction between the host and containers can also be examined, and noisy neighbors identified or exonerated. Performance tooling can need special usage or workarounds to function properly from within a container or on the host, to deal with different privilege levels and name spaces. At Netflix, we're using containers for some microservices, and care very much about analyzing and tuning our containers to be as fast and efficient as possible. This talk will show you how to identify bottlenecks in the host or container configuration, in the applications by profiling in a container environment, and how to dig deeper into kernel and container internals."
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
This document discusses using the Linux perf profiling tool at Netflix. It begins with an overview of why Netflix needs Linux profiling to understand CPU usage quickly and completely. It then provides an introduction to the perf tool, covering its basic workflow and commands. The document discusses profiling CPU usage with perf, including potential issues like JIT runtimes and missing symbols. It provides several examples of perf commands for listing, counting, and recording events. The overall summary is that perf allows Netflix to quickly and accurately profile CPU usage across the entire software stack, from applications to libraries to the kernel, to optimize performance.
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps - Safe Software
Ready to simplify workflow sharing across your organization without diving into complex coding? With FME Flow Apps, you can build no-code web apps that make your data work harder for you — fast.
In this webinar, we’ll show you how to:
Build and deploy Workspace Apps to create an intuitive user interface for self-serve data processing and validation.
Automate processes using Automation Apps. Learn to create a no-code web app to kick off workflows tailored to your needs, trigger multiple workspaces and external actions, and use conditional filtering within automations to control your workflows.
Create a centralized portal with Gallery Apps to share a collection of no-code web apps across your organization.
Through real-world examples and practical demos, you’ll learn how to transform your workflows into intuitive, self-serve solutions that empower your team and save you time. We can’t wait to show you what’s possible!
Field Device Management Market Report 2030 - TechSci Research - Vipin Mishra
The Global Field Device Management (FDM) Market is expected to experience significant growth in the forecast period from 2026 to 2030, driven by the integration of advanced technologies aimed at improving industrial operations.
📊 According to TechSci Research, the Global Field Device Management Market was valued at USD 1,506.34 million in 2023 and is anticipated to grow at a CAGR of 6.72% through 2030. FDM plays a vital role in the centralized oversight and optimization of industrial field devices, including sensors, actuators, and controllers.
Key tasks managed under FDM include:
Configuration
Monitoring
Diagnostics
Maintenance
Performance optimization
FDM solutions offer a comprehensive platform for real-time data collection, analysis, and decision-making, enabling:
Proactive maintenance
Predictive analytics
Remote monitoring
By streamlining operations and ensuring compliance, FDM enhances operational efficiency, reduces downtime, and improves asset reliability, ultimately leading to greater performance in industrial processes. FDM’s emphasis on predictive maintenance is particularly important in ensuring the long-term sustainability and success of industrial operations.
For more information, explore the full report: https://shorturl.at/EJnzR
Major companies operating in Global Field Device Management Market are:
General Electric Co
Siemens AG
ABB Ltd
Emerson Electric Co
Aveva Group Ltd
Schneider Electric SE
STMicroelectronics Inc
Techno Systems Inc
Semiconductor Components Industries LLC
International Business Machines Corporation (IBM)
#FieldDeviceManagement #IndustrialAutomation #PredictiveMaintenance #TechInnovation #IndustrialEfficiency #RemoteMonitoring #TechAdvancements #MarketGrowth #OperationalExcellence #SensorsAndActuators
UiPath Document Understanding - Generative AI and Active learning capabilities - DianaGray10
This session focuses on Generative AI features and the Active Learning modern experience with Document Understanding.
Topics Covered:
Overview of Document Understanding
How does Generative Annotation work?
What is Generative Classification?
How to use Generative Extraction activities?
What is Generative Validation?
How Active learning modern experience accelerate model training?
Q/A
❓ If you have any questions or feedback, please refer to the "Women in Automation 2025" dedicated Forum thread. You can find there extra details and updates.
There isn’t only one way to be a great technical leader.
To be a great technical leader you need to play to your strengths.
This talk explains why, shows you how to use the People, Process, Technology framework to identify your strengths, covers how using your strengths is best way to do your most important job, which is delivering business outcomes.
Data Intelligence Platform Transforming Data into Actionable Insights.pptxLisa Gerard
In today’s data-driven world, a Data Intelligence Platform plays a crucial role in empowering organizations to make informed, strategic decisions. By leveraging advanced analytics, seamless data integration, and robust governance, businesses can transform vast amounts of data into actionable insights.
World Information Architecture Day 2025 - UX at a CrossroadsJoshua Randall
User Experience stands at a crossroads: will we live up to our potential to design a better world? or will we be co-opted by “product management” or another business buzzword?
Looking backwards, this talk will show how UX has repeatedly failed to create a better world, drawing on industry data from Nielsen Norman Group, Baymard, MeasuringU, WebAIM, and others.
Looking forwards, this talk will argue that UX must resist hype, say no more often and collaborate less often (you read that right), and become a true profession — in order to be able to design a better world.
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog GavraScyllaDB
Learn how Responsive replaced embedded RocksDB with ScyllaDB in Kafka Streams, simplifying the architecture and unlocking massive availability and scale. The talk covers unbundling stream processors, key ScyllaDB features tested, and lessons learned from the transition.
Why Ivalua: A Relational Acquisition Model (RAM 2025) ComparisonJon Hansen
What makes Jon Hansen’s ProcureTech assessment solution RAM unique?
RAM (short for “Relational Acquisition Model,” based on historical context), stands out due to its pioneering approach to procurement efficiency, developed in the late 1990s and early 2000s. While specific technical details about RAM’s current iteration as of March 1, 2025, are not fully detailed in recent public sources, its uniqueness can be inferred from Hansen’s documented history, writings, and interviews, particularly from Procurement Insights and related discussions.
RAM stands out for its agent-based adaptability, interactive design, early AI intelligence, people-process-tech integration, and proven government success—features ahead of its time in the 1990s and resonant with 2025’s procurement needs. It tackled inefficiencies with a practical, transparent approach, not just tech hype, saving millions and streamlining operations where others failed. While its current form isn’t fully public, its legacy as a ProcureTech pioneer remains unique, blending foresight with results in a way few contemporaries matched then or now.
Today’s ProcureTech solution providers—such as Coupa, GEP, Jaggaer, Sievo, Ivalua—can benefit from the Relational Acquisition Model (RAM) by drawing on its foundational principles and proven strengths, adapting them to enhance their offerings in the context of 2025’s complex procurement landscape. While RAM, developed in the late 1990s, lacks the technological scale of modern platforms, its agent-based design, focus on transparency, and human-centric efficiency offer valuable lessons.
Today’s ProcureTech providers can benefit from RAM by adopting its agent-based adaptability, transparent AI, interactive simplicity, human-tech balance, operational focus, and proven credibility. These could enhance responsiveness (e.g., tariff tweaks), trust (e.g., black box fears), and ROI (e.g., faster savings), potentially lifting efficiency by 10-20% or adoption by 15-30%. RAM’s lessons—distilled from a $12 million success—offer a roadmap to refine, not replace, modern solutions like Ivalua. It’s a legacy worth mining for a market chasing the next big thing.
This is session #4 of the 5-session online study series with Google Cloud, where we take you onto the journey learning generative AI. You’ll explore the dynamic landscape of Generative AI, gaining both theoretical insights and practical know-how of Google Cloud GenAI tools such as Gemini, Vertex AI, AI agents and Imagen 3.
Bridging the Gap from Telco to Techco with Agile ArchitectureBATbern
The Telco industry is undergoing a major IT transformation, shifting from Telco to Techco. This shift is driven by a breakneck pace of technological change that traditional architectures simply cannot keep up with. In my presentation, I will explore the profound impact of digitization on Telco architecture. Using examples from Swisscom's Network Division, I will show why Agile Architecture is crucial for cutting complexity, accelerating time-to-market, and sparking innovation within the organization. This approach isn't just strategic; it's vital for the future success of our industry. Join me to uncover how Swisscom is navigating this transformation and what lessons can be applied to your organization.
18. Typical BaseAMI Cloud Instances
[Instance stack diagram:]
● Linux (Ubuntu)
● Java (JDK 8) with GC and thread dump logging
● Tomcat: application war files, base servlet, platform, hystrix, health check, metrics (Servo)
● Optional Apache, memcached, non-Java apps (incl. Node.js, golang)
● Atlas monitoring, S3 log rotation, ftrace, perf, bcc/eBPF
19. 5 Key Issues, and How the Netflix Cloud is Architected to Solve Them
20. 1. Load Increases → Spinnaker Auto Scaling Groups
– Instances automatically added or removed by a custom scaling policy
– Alerts & monitoring used to check scaling is sane
– Good for customers: Fast workaround
– Good for engineers: Fix later, 9-5
[Diagram: a scaling policy (loadavg, latency, …) feeds CloudWatch/Servo, which grows or shrinks the ASG of instances]
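Netflix drives this through Spinnaker, but the underlying mechanism is an AWS scaling policy. As a rough illustration only (not the Netflix tooling; the group name, policy name, and 50% CPU target are made-up values), a simple target-tracking policy can be attached with the AWS CLI:

$ cat > scale-config.json <<'EOF'
{
  "TargetValue": 50.0,
  "PredefinedMetricSpecification": { "PredefinedMetricType": "ASGAverageCPUUtilization" }
}
EOF
$ aws autoscaling put-scaling-policy \
    --auto-scaling-group-name example-asg-v010 \
    --policy-name example-cpu-target \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration file://scale-config.json

A custom policy like the loadavg/latency-driven one above would instead publish those metrics (eg, via Servo) to CloudWatch and scale on them.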
21. 2. Bad Push → Spinnaker ASG Cluster Rollback
– ASG red black clusters: how code versions are deployed
– Fast rollback for issues
– Traffic managed by Elastic Load Balancers (ELBs)
– Automated Canary Analysis (ACA) for testing
[Diagram: an ELB fronting ASG cluster prod1, containing instances in ASG-v010 and ASG-v011 plus a canary]
22. 3. Instance Failure → Spinnaker Hystrix Timeouts
– Hystrix: latency and fault tolerance for dependency services
  Fallbacks, degradation, fast fail and rapid recovery, timeouts, load shedding, circuit breaker, realtime monitoring
– Plus Ribbon or gRPC for more fault tolerance
[Diagram: a Tomcat application issues "get A" through Hystrix to Dependency A1 / Dependency A2, with a >100ms timeout]
23. 4. Region failure → Spinnaker Zuul 2 Reroute Traffic
– All device traffic goes through the Zuul 2 proxy: dynamic routing, monitoring, resiliency, security
– Region or AZ failure: reroute traffic to another region
[Diagram: Zuul 2 and DNS monitoring steering traffic across Region 1, Region 2, and Region 3]
27. Why Do Root Cause Perf Analysis?
[Diagram: the Netflix application service behind an ELB, an ASG cluster with ASG 1 and ASG 2 spanning AZ 1–3, and each Linux instance running a JVM, Tomcat, and the service]
Often for:
● High latency
● Growth
● Upgrades
28. Cloud Methodologies
● Resource Analysis
● Metric and event correlations
● Latency Drilldowns
● RED Method
[Diagram: microservices Service A–D]
For each microservice, check:
- Rate
- Errors
- Duration
30. Bad Instance Anti-Method
1. Plot request latency per-instance
2. Find the bad instance
3. Terminate it
4. Someone else’s problem now!
[Chart: per-instance request latency with one outlier labeled "Bad instance latency" → "Terminate!"]
Could be an early warning of a bigger issue
32. Netflix Cloud Analysis Process
[Flowchart, example path enumerated (plus some other tools not pictured):
1. Check Issue: Atlas Alerts, Atlas/Lumen Dashboards
2. Check Events: Chronos
3. Drill Down: Atlas Metrics
4. Check Dependencies: Zipkin, Slalom; PICSOU for cost
5. Root Cause: Instance Analysis
Slack for chat; a step may create a new alert or redirect to a new target]
33. Atlas: Alerts
Custom alerts on streams per second (SPS) changes, CPU usage, latency, ASG growth, client errors, …
38. Atlas & Lumen: Custom Dashboards
● Dashboards are a checklist methodology: what to show first, second, third...
● Starting point for issues
  1. Confirm and quantify issue
  2. Check historic trend
  3. Atlas metrics to drill down
● Lumen: more flexible dashboards (eg, go/burger)
41. Atlas: Metrics
● All metrics in one system
  – System metrics: CPU usage, disk I/O, memory, …
  – Application metrics: latency percentiles, errors, …
● Filters or breakdowns by region, application, ASG, metric, instance
● URL has session state: shareable
49. Netflix Cloud Analysis Process
[Same flowchart as slide 32, repeated as a recap: 1. Check Issue (Atlas Alerts, Atlas/Lumen Dashboards) → 2. Check Events (Chronos) → 3. Drill Down (Atlas Metrics) → 4. Check Dependencies (Zipkin, Slalom; PICSOU for cost) → 5. Root Cause (Instance Analysis); Slack for chat; example path enumerated, plus some other tools not pictured]
50. Generic Cloud Analysis Process
[Flowchart, example path enumerated (plus other tools as needed):
1. Check Issue: Alerts, Custom Dashboards
2. Check Events: Change Tracking
3. Drill Down: Metric Analysis
4. Check Dependencies: Dependency Analysis; Usage Reports for cost
5. Root Cause: Instance Analysis
Messaging for chat; a step may create a new alert or redirect to a new target]
55. Linux Tools
● vmstat, pidstat, sar, etc, used mostly normally
● Micro benchmarking can be used to investigate hypervisor behavior that can’t be observed directly (a minimal sketch follows the sar output below)
$ sar -n TCP,ETCP,DEV 1
Linux 4.15.0-1027-aws (xxx) 12/03/2018 _x86_64_ (48 CPU)
09:43:53 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
09:43:54 PM lo 15.00 15.00 1.31 1.31 0.00 0.00 0.00 0.00
09:43:54 PM eth0 26392.00 33744.00 19361.43 28065.36 0.00 0.00 0.00 0.00
09:43:53 PM active/s passive/s iseg/s oseg/s
09:43:54 PM 18.00 132.00 17512.00 33760.00
09:43:53 PM atmptf/s estres/s retrans/s isegerr/s orsts/s
09:43:54 PM 0.00 0.00 11.00 0.00 0.00
[…]
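As a minimal sketch of the micro-benchmarking point above (my example, not from the talk): a syscall-heavy loop and a clocksource check can surface hypervisor differences that system metrics alone won’t show; run the same commands on the instance types being compared.

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
$ perf stat dd if=/dev/zero of=/dev/null bs=1 count=1000k

The dd run issues roughly a million 1-byte read/write syscall pairs, so its runtime and context-switch counts give a crude syscall-overhead comparison.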
56. Exception: Containers
● Most Linux tools are still not container aware
  – From the container, will show the full host
● We expose cgroup metrics in our cloud GUIs: Vector
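Those cgroup counters can also be read directly on the host, which is handy when the GUI isn’t available. A minimal example for cgroup v1 (the layout typical of this era’s Ubuntu instances; under cgroup v2 the file names differ, and a specific container lives in a subdirectory of each controller):

$ cat /sys/fs/cgroup/cpu/cpu.stat                   # nr_periods, nr_throttled, throttled_time
$ cat /sys/fs/cgroup/cpuacct/cpuacct.usage          # CPU time consumed by this cgroup, in ns
$ cat /sys/fs/cgroup/memory/memory.usage_in_bytes   # current memory usage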
67. CPU Flame Graphs
[Example flame graph of stacks a()–i(); top edge: who is running on CPU, and how much (width); frames beneath show ancestry]
● Y-axis: stack depth
  – 0 at bottom
  – 0 at top == icicle graph
● X-axis: alphabet
  – Time == flame chart
● Color: random
  – Hues often used for language types
  – Can be a dimension, eg, CPI
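A typical recipe for generating a CPU flame graph like the one sketched above (a minimal example: assumes perf is installed and the FlameGraph repository is cloned to ~/FlameGraph):

$ perf record -F 99 -a -g -- sleep 30       # sample all CPUs at 99 Hertz for 30 seconds, with stacks
$ perf script | ~/FlameGraph/stackcollapse-perf.pl | \
    ~/FlameGraph/flamegraph.pl > cpu-flamegraph.svg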
68. Application Profiling
● Primary approach:
  – CPU mixed-mode flame graphs (eg, via Linux perf)
  – May need frame pointers (eg, Java -XX:+PreserveFramePointer)
  – May need a symbol file (eg, Java perf-map-agent, Node.js --perf-basic-prof)
● Secondary:
  – Application profiler (eg, via Lightweight Java Profiler)
  – Application logs
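Putting those bullets together for Java, a hedged end-to-end sketch (app.jar and the tool paths are placeholders; assumes perf-map-agent and FlameGraph are already built and cloned):

$ java -XX:+PreserveFramePointer -jar app.jar &                   # app started with frame pointers kept
$ perf record -F 99 -a -g -- sleep 30                             # profile while the workload runs
$ ~/perf-map-agent/bin/create-java-perf-map.sh $(pgrep -n java)   # emit /tmp/perf-<pid>.map for JIT symbols
$ perf script | ~/FlameGraph/stackcollapse-perf.pl | \
    ~/FlameGraph/flamegraph.pl --color=java > mixed-mode.svg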
80. # /usr/share/bcc/tools/biosnoop
TIME(s) COMM PID DISK T SECTOR BYTES LAT(ms)
0.000000000 tar 8519 xvda R 110824 4096 6.50
0.004183000 tar 8519 xvda R 111672 4096 4.08
0.016195000 tar 8519 xvda R 4198424 4096 11.88
0.018716000 tar 8519 xvda R 4201152 4096 2.43
0.019416000 tar 8519 xvda R 4201160 4096 0.61
0.032645000 tar 8519 xvda R 4207968 4096 13.16
0.033181000 tar 8519 xvda R 4207976 4096 0.47
0.033524000 tar 8519 xvda R 4208000 4096 0.27
0.033876000 tar 8519 xvda R 4207992 4096 0.28
0.034840000 tar 8519 xvda R 4208008 4096 0.89
0.035713000 tar 8519 xvda R 4207984 4096 0.81
0.036165000 tar 8519 xvda R 111720 4096 0.37
0.039969000 tar 8519 xvda R 8427264 4096 3.69
0.051614000 tar 8519 xvda R 8405640 4096 11.44
0.052310000 tar 8519 xvda R 111696 4096 0.55
0.053044000 tar 8519 xvda R 111712 4096 0.56
0.059583000 tar 8519 xvda R 8411032 4096 6.40
0.068278000 tar 8519 xvda R 4218672 4096 8.57
0.076717000 tar 8519 xvda R 4218968 4096 8.33
0.077183000 tar 8519 xvda R 4218984 4096 0.40
0.082188000 tar 8519 xvda R 8393552 4096 4.94
[...]
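A rough follow-up to this biosnoop output (my own sketch, not from the talk): capture the trace to a file and summarize the LAT(ms) column, or use the companion bcc tool biolatency for a latency histogram.

# /usr/share/bcc/tools/biosnoop > /tmp/bio.txt
# awk 'NR>1 {s+=$NF; n++} END {printf "I/Os: %d  avg LAT: %.2f ms\n", n, s/n}' /tmp/bio.txt
# /usr/share/bcc/tools/biolatency 10 1

The awk line skips the header and averages the last column; biolatency 10 1 prints a block I/O latency histogram over a single 10-second interval.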
96. Take Aways
1. Get push-button CPU flame graphs: kernel & user
2. Check out eBPF perf tools: bcc, bpftrace
3. Measure IPC as well as CPU utilization using PMCs (see the perf stat sketch below)
[Chart: "90% CPU busy" … really means the CPUs may be largely stalled (eg, waiting on memory) rather than retiring instructions, which is what IPC reveals]
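For take-away 3, a quick way to check IPC system-wide (a minimal example; the 10-second window is arbitrary):

$ perf stat -a -e cycles,instructions -- sleep 10

perf prints an "insn per cycle" (IPC) line; a low value (well under 1.0) suggests those "busy" CPUs are largely stalled, often on memory, rather than retiring instructions.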