NORDSTROM'S EVENT-SOURCED ARCHITECTURE AND KAFKA-AS-A-SERVICE
BEAU BENDER – DIRECTOR OF ENGINEERING, DATA & MACHINE LEARNING
ADAM WEYANT – SOFTWARE ENGINEERING MANAGER, DATA PROCESSING
SEPTEMBER 15, 2021
PRESENTERS
BEAU BENDER
Beau strives to democratize ML & AI within Nordstrom by empowering employees to seamlessly deliver production-quality solutions at scale.
ADAM WEYANT
Adam supports Nordstrom’s multi-tenant Kafka platform and related tools.
Beau and Adam are honored to work at Nordstrom, a retailer that strives to be the center of fashion authority. As part of the Data and Analytical Services organization, they work on solutions that enable world-class customer service and personalization.
AGENDA
ABOUT NORDSTROM
NORDSTROM ANALYTICAL PLATFORM
KAFKA-AS-A-SERVICE
ABOUT NORDSTROM
1901: Wallin & Nordstrom store opened
1998: Nordstrom.com launches
2017: First-generation custom analytical platform launches
2019: Design of Nordstrom Analytical Platform begins
2020: Real-time store analytics launches
2023: Majority of business decisions automated
NORDSTROM ANALYTICAL PLATFORM - OVERVIEW
NORDSTROM ANALYTICAL PLATFORM - CONSIDERATIONS
Hackathonability
Security and tokenization
CCPA/GDPR compliance
Data quality/discovery
Acceptable staleness
NORDSTROM ANALYTICAL PLATFORM - ALIGNMENT
Event-first design
Processes
Application design review
Centralized schema advocate group
SDKs
Engineering standards
Ownership of data quality
[Figure: data quality versus number of use cases, with Clickstream and NAP plotted; example event shown: ORDER SUBMITTED]
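To make "event-first design" concrete, here is a minimal sketch of what a well-defined business event such as ORDER SUBMITTED might look like. The field names, the envelope fields, and the JSON encoding are all assumptions for illustration; the slides do not show the actual NAP schema or SDK.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderSubmitted:
    """Hypothetical shape for an ORDER SUBMITTED business event."""

    order_id: str
    customer_token: str  # tokenized identifier rather than raw PII (assumption)
    total_cents: int
    # Envelope fields that a shared engineering standard might require (assumption).
    event_type: str = "ORDER_SUBMITTED"
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the event so it could be published to a stream."""
        return json.dumps(asdict(self), sort_keys=True)


event = OrderSubmitted(order_id="o-123", customer_token="tok_abc", total_cents=4599)
payload = json.loads(event.to_json())
print(payload["event_type"], payload["order_id"])
```

Keeping the envelope fields on every event is one way a centralized schema group could make events comparable across producers; the real platform may do this differently.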
NORDSTROM ANALYTICAL PLATFORM – BEFORE/AFTER
Considerations
Before NAP: Data is an afterthought of design and siloed.
With NAP: Analytics is a first-class consideration.
Workflow
Before NAP: Data is collected with opaque processes and transformations, and can only be accessed by data scientists once processed the next day.
With NAP: Well-defined business events are streamed live to any system that is interested.
Burdens
Before NAP: Ownership of quality is not well defined.
With NAP: Data quality is owned by producers, but all have a responsibility to drive improvements.
“If it’s not in NAP, it didn’t happen.”
NORDSTROM ANALYTICAL PLATFORM – WHAT’S NEXT?
Majority of business decisions automated
All business operations flow through NAP
Staleness reporting
Data quality
Data discoverability
Data lineage
A DISTRIBUTED STREAMING PLATFORM FOR NORDSTROM, BASED ON APACHE KAFKA.
190 teams
5-6x read:write ratio
150 TB stored
1.25 Gbps peak read
0.3 Gbps peak write
500K concurrent connections
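As a quick sanity check on the peak throughput figures above, the short calculation below divides peak read by peak write and converts both to megabytes per second. The ratio of the two peaks comes out at roughly 4.2x, which need not match the quoted 5-6x read:write ratio: the peaks may not coincide in time, and the quoted ratio is presumably measured over overall traffic (an assumption, since the slide does not say).

```python
# Figures quoted on the slide.
peak_read_gbps = 1.25
peak_write_gbps = 0.3

# Ratio of the two peaks; the peaks need not occur at the same moment.
peak_ratio = peak_read_gbps / peak_write_gbps
print(f"peak read / peak write = {peak_ratio:.2f}x")


def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


print(f"peak read  = {gbps_to_mb_per_s(peak_read_gbps):.2f} MB/s")
print(f"peak write = {gbps_to_mb_per_s(peak_write_gbps):.2f} MB/s")
```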
KAFKA-AS-A-SERVICE
Reliable and resilient
Self-service and automated
Flexible and evolvable
Clear expectations
Clear expectations
Monitoring and visibility
Troubleshooting client issues
Support SLAs
Quotas and access controls
Self-service and automated
Eliminate ticketing
Enable ClickOps
Empower DevOps
Reliable and resilient
Mature SLAs
Monitoring and alerting
Topic mirroring
Data archiving
Flexible and evolvable
Multi-tenant
Special-case clusters
Multi-region
Multi-provider
SELF-SERVICE LAYER
Proton resource management and core of multi-tenancy architecture.
PROTON API
Powers UI and Terraform provider.
PROTON UI
Proton resource management and documentation.
STREAMING AND SCHEDULED JOBS
ServiceNow integration, API key lifecycle, change-data capture operations, and operational observability.
[Architecture diagram. Components: Kafka Brokers, Schema Registry, Kafka Connect, Monitoring, Automation, Proton API, Proton UI, Streaming and Scheduled Jobs. Labels: Primary, Secondary]
KAFKA INFRASTRUCTURE
Kafka streaming infrastructure and related services.
KAFKA BROKERS
Data streaming and retention.
SCHEMA REGISTRY
Avro schema storage and management.
KAFKA CONNECT
Managed S3, SQS, and Lambda sinks.
INFRASTRUCTURE AUTOMATION
Event-driven control-plane and monitor for Kafka clusters, Kafka Connect, and Schema Registry.
AUTOMATION
Kafka topic, user, and connector resource provisioning and quota management.
MONITORING AND RECONCILIATION
Extract insights from infrastructure to self-heal and improve visibility in Proton UI for Kafka connector status, Schema details, etc.
SLAs
TOPIC MANAGEMENT
MANAGING KAFKA WITH KAFKA
TOPIC CREATION
1. Create topic request
2. Resource request event published
3. Consumed by automation
4. Actioned by automation
5. Resource event published
6. API Metadata updated
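The six topic-creation steps can be sketched as a small event-driven loop. This is a simplified illustration, not Proton's implementation: the in-memory `EventBus` stands in for Kafka, and the topic names (`resource-requests`, `resources`), event types, status values, and function names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EventBus:
    """In-memory stand-in for Kafka topics (illustration only)."""

    subscribers: Dict[str, List[Callable[[dict], None]]] = field(
        default_factory=lambda: defaultdict(list)
    )
    log: List[tuple] = field(default_factory=list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Record the event, then deliver it synchronously to each subscriber.
        self.log.append((topic, event["type"]))
        for handler in self.subscribers[topic]:
            handler(event)


bus = EventBus()
cluster_topics: set = set()        # topics that exist on the simulated brokers
api_metadata: Dict[str, str] = {}  # state the API reports back to users


def api_create_topic(name: str) -> None:
    # Steps 1-2: accept the request and publish a resource request event.
    api_metadata[name] = "PENDING"
    bus.publish("resource-requests", {"type": "TopicRequested", "name": name})


def automation_handler(event: dict) -> None:
    # Steps 3-4: automation consumes the request and acts on it.
    cluster_topics.add(event["name"])
    # Step 5: publish a resource event describing the outcome.
    bus.publish("resources", {"type": "TopicCreated", "name": event["name"]})


def api_metadata_handler(event: dict) -> None:
    # Step 6: the API updates its metadata from the resource event.
    api_metadata[event["name"]] = "ACTIVE"


bus.subscribe("resource-requests", automation_handler)
bus.subscribe("resources", api_metadata_handler)

api_create_topic("orders.order-submitted")
print(api_metadata)
print(bus.log)
```

Because the API only changes state in response to published events, the request path and the provisioning path stay decoupled, which is one plausible reading of "managing Kafka with Kafka".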
AREAS FOR GROWTH
Additional infrastructure providers
Additional source and sink Connectors
Platform intelligence
Advanced backup and restore tooling
Majority transition to OAuth
External Kafka integrations
Edge and in-store clusters
THANK YOU
