Tracking Crime as It Occurs with Apache Phoenix,
Apache HBase and Apache NiFi
TIMOTHY SPANN
Field Engineer, Data in Motion
Cloudera
Introduction
Tim Spann has been running meetups in Princeton on Big Data technologies since 2015.
Tim has spoken at many international conferences on Apache NiFi, Deep Learning and
Streaming.
https://community.hortonworks.com/users/9304/tspann.html
https://dzone.com/users/297029/bunkertor.html
https://www.meetup.com/futureofdata-princeton/
https://dzone.com/articles/integrating-keras-tensorflow-yolov3-into-apache-ni
Introduction
Using Apache NiFi, we can ingest crime data from various sources in real time as incidents happen, and monitor
live traffic cameras (Source: TrafficLand).
We can alert on, route and react to crime data as it arrives, but we need more. We need to update
totals, store this data for future machine learning analytics and make it available for instantly updating dashboards and
reports.
The best destination for this data is Apache HBase and Apache Phoenix. We’ll populate tables with ease and speed!
Resources:
https://community.hortonworks.com/articles/54947/reading-opendata-json-and-storing-into-phoenix-tab.html
https://community.hortonworks.com/articles/56642/creating-a-spring-boot-java-8-microservice-to-read.html
https://community.hortonworks.com/articles/64122/incrementally-streaming-rdbms-data-to-your-hadoop.html
4 © Cloudera, Inc. All rights reserved.
DATAFLOW
CONTROL DATA-IN-MOTION FROM EDGE-TO-ENTERPRISE
Cloudera DataFlow - Collect, Curate and Analyze Data-in-Motion
DataFlow & Streaming
• Edge-to-enterprise streaming data platform for management,
security and governance of real-time streaming data
• Edge data collection, processing and content routing of sensor data
from edge devices
• Continuous data ingestion from any streaming source or IoT device
• Ease-of-use in building sophisticated data flows with drag-and-drop
user interface
• Real-time stream processing and content syndication at the scale of
millions of messages per second
• Predictive and prescriptive analytics from streaming analytics
engines to gain actionable intelligence
CLOUDERA FLOW MANAGEMENT
● Web-based user interface
● Highly configurable
● Out-of-the-box data provenance
● Designed for extensibility
● Secure
● NiFi Registry
○ DevOps support
○ FDLC
○ Versioning
○ Deployment
300+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
Convert
Split
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
Fetch
HTTP
Syslog
Email
HTML
Image
HL7
FTP
UDP
XML
SFTP
AMQP
WebSocket
ARCHITECTURE
Apache Phoenix-5.0
• Expect similar timeframe for Phoenix-5.0
• We are working on HBase-2.0 support
• Re-write internals using Apache Calcite
• SQL parser, planner and optimizer
• Cost-based optimizer used by Hive, Drill, etc.
• Pluggable rules with default rules, and Phoenix specific ones
• SQL-92 support
• Apache NiFi calls Apache Calcite Avatica JDBC
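As a rough illustration of the last bullet, the sketch below builds the two common forms of Phoenix JDBC URL: the thin-client form that goes through Phoenix Query Server (Avatica over HTTP) and the thick-client form that goes through ZooKeeper. Which form the NiFi flow in this deck uses is not stated, and the class name, host names, ports and znode path are assumptions for illustration only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper; not part of Phoenix or NiFi.
final class PhoenixUrls {

    // Thin client: talks to Phoenix Query Server (Avatica) over HTTP.
    // Host and port are placeholders supplied by the caller.
    static String thinUrl(String host, int port) {
        return "jdbc:phoenix:thin:url=http://" + host + ":" + port
                + ";serialization=PROTOBUF";
    }

    // Thick client: connects through the ZooKeeper quorum used by HBase.
    // The znode parent differs between clusters, so it is a parameter here.
    static String thickUrl(String zkQuorum, int zkPort, String znodeParent) {
        return "jdbc:phoenix:" + zkQuorum + ":" + zkPort + ":" + znodeParent;
    }

    // Opens a connection; requires the matching Phoenix driver jar on the classpath.
    static Connection connect(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }
}
```

In a NiFi flow, a URL of this kind would typically be set as the database connection URL of a DBCPConnectionPool controller service, with the Phoenix client jar configured as the driver location.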
DEMO
SPRING BOOT APPLICATION TO PHOENIX
https://github.com/tspannhw/phoenix
https://community.hortonworks.com/articles/56642/creating-a-spring-boot-java-8-microservice-to-read.html
SPRING BOOT APPLICATION TO PHOENIX TABLE
CREATE TABLE phillycrime (
    dc_dist varchar,
    dc_key varchar not null primary key,
    dispatch_date varchar,
    dispatch_date_time varchar,
    dispatch_time varchar,
    hour varchar,
    location_block varchar,
    psa varchar,
    text_general_code varchar,
    ucr_general varchar
);
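A minimal sketch of how one row could be written to this table over plain JDBC, assuming a Phoenix connection is already open. Phoenix uses UPSERT rather than INSERT, so re-sending a record with the same dc_key overwrites the earlier row. The class and method names are hypothetical and are not taken from the linked repository.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for the phillycrime table defined above.
final class PhillyCrimeUpsert {

    // Column order mirrors the CREATE TABLE statement.
    static final List<String> COLUMNS = Arrays.asList(
            "dc_dist", "dc_key", "dispatch_date", "dispatch_date_time",
            "dispatch_time", "hour", "location_block", "psa",
            "text_general_code", "ucr_general");

    // Builds "UPSERT INTO phillycrime (...) VALUES (?, ?, ...)" with one
    // bind marker per column.
    static String upsertSql() {
        String cols = String.join(", ", COLUMNS);
        String marks = String.join(", ", Collections.nCopies(COLUMNS.size(), "?"));
        return "UPSERT INTO phillycrime (" + cols + ") VALUES (" + marks + ")";
    }

    // Writes one record; values must be in the same order as COLUMNS.
    static void upsert(Connection conn, List<String> values) throws SQLException {
        if (values.size() != COLUMNS.size()) {
            throw new IllegalArgumentException("expected " + COLUMNS.size() + " values");
        }
        try (PreparedStatement ps = conn.prepareStatement(upsertSql())) {
            for (int i = 0; i < values.size(); i++) {
                ps.setString(i + 1, values.get(i)); // JDBC parameters are 1-based
            }
            ps.executeUpdate();
        }
        // Phoenix connections do not auto-commit by default, so commit explicitly.
        if (!conn.getAutoCommit()) {
            conn.commit();
        }
    }
}
```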
java -Xms512m -Xmx2048m -Dhdp.version=3.1 -Djava.net.preferIPv4Stack=true -jar target/phoenix-0.0.1-SNAPSHOT.jar
@RequestMapping("/query/{query}")
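The slide does not show what the handler behind this mapping does. As a sketch only, the lookup below shows one way such an endpoint could read rows from phillycrime over JDBC, treating the path variable as a value for text_general_code. That interpretation, the selected columns, the row limit and all class and method names are assumptions, and the Spring wiring is omitted so the example stays self-contained.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical query helper; a controller method could delegate to find().
final class PhillyCrimeQuery {

    // The user-supplied value is bound as a parameter, never concatenated
    // into the SQL text.
    static final String SQL =
            "SELECT dc_key, dispatch_date_time, location_block, text_general_code "
          + "FROM phillycrime WHERE text_general_code = ? LIMIT ?";

    // Returns each matching row as an ordered column-label-to-value map.
    static List<Map<String, String>> find(Connection conn, String code, int limit)
            throws SQLException {
        List<Map<String, String>> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, code);
            ps.setInt(2, limit);
            try (ResultSet rs = ps.executeQuery()) {
                ResultSetMetaData meta = rs.getMetaData();
                while (rs.next()) {
                    Map<String, String> row = new LinkedHashMap<>();
                    for (int i = 1; i <= meta.getColumnCount(); i++) {
                        row.put(meta.getColumnLabel(i), rs.getString(i));
                    }
                    rows.add(row);
                }
            }
        }
        return rows;
    }
}
```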
DEMONSTRATION
