Using Apache NiFi, we read various open data REST APIs and camera feeds to ingest crime and related data in real time, streaming it into HBase and Phoenix tables. HBase makes an excellent storage option for our real-time, time-series data sources. We can immediately query our data with Apache Zeppelin against Phoenix tables, as well as through Hive external tables over HBase.
Apache Phoenix tables are also a great option since we can easily put microservices on top of them for application usage. I have an example Spring Boot application that reads from our Philadelphia crime table to serve front-end web applications as well as RESTful APIs.
Apache NiFi makes it easy to push records with schemas to HBase and insert into Phoenix SQL tables.
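As a concrete sketch of that last step, the helper below builds the parameterized UPSERT statement a NiFi flow (for example, PutSQL fed by ConvertJSONToSQL) would execute against Phoenix for each crime record. The table and column names are illustrative assumptions, not the actual schema from this deck.

```java
import java.util.Collections;
import java.util.List;

// Sketch: build the parameterized UPSERT that a NiFi flow would issue
// against Phoenix for each incoming crime record.
// PHILLY_CRIME and its columns are illustrative, not the deck's schema.
public class PhoenixUpsertSketch {

    // Phoenix uses UPSERT VALUES instead of INSERT; each statement
    // writes a row directly into the underlying HBase table.
    static String buildUpsert(String table, List<String> columns) {
        String cols = String.join(", ", columns);
        String params = String.join(", ",
                Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + cols + ") VALUES (" + params + ")";
    }

    public static void main(String[] args) {
        String sql = buildUpsert("PHILLY_CRIME",
                List.of("DC_KEY", "DISPATCH_DATE", "TEXT_GENERAL_CODE"));
        System.out.println(sql);
        // → UPSERT INTO PHILLY_CRIME (DC_KEY, DISPATCH_DATE, TEXT_GENERAL_CODE) VALUES (?, ?, ?)
    }
}
```

In a NiFi flow the `?` parameters are bound from FlowFile attributes, so the same prepared statement is reused for every record.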
Resources:
https://community.hortonworks.com/articles/54947/reading-opendata-json-and-storing-into-phoenix-tab.html
https://community.hortonworks.com/articles/56642/creating-a-spring-boot-java-8-microservice-to-read.html
https://community.hortonworks.com/articles/64122/incrementally-streaming-rdbms-data-to-your-hadoop.html
1. Tracking Crime as It Occurs with Apache Phoenix,
Apache HBase and Apache NiFi
TIMOTHY SPANN
Field Engineer, Data in Motion
Cloudera
2. Introduction
Tim Spann has been running meetups in Princeton on Big Data technologies since 2015.
Tim has spoken at many international conferences on Apache NiFi, Deep Learning and
Streaming.
https://community.hortonworks.com/users/9304/tspann.html
https://dzone.com/users/297029/bunkertor.html
https://www.meetup.com/futureofdata-princeton/
https://dzone.com/articles/integrating-keras-tensorflow-yolov3-into-apache-ni
3. Introduction
Using Apache NiFi we can ingest various sources of crime data in real time as incidents happen, as well as monitor
live traffic cameras (Source: TrafficLand).
We can do a lot of alerting and routing, and react to crime data as it arrives, but we need more. We need to update
totals, store this data for future machine learning analytics, and make it available for instantly updating dashboards
and reports.
The best destination for this data is Apache HBase and Apache Phoenix. We’ll populate tables with ease and speed!
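To illustrate the kind of schema this approach implies, here is a hedged sketch of a Phoenix time-series table and a dashboard aggregate over it. The DDL, table name, columns, and salt-bucket count are assumptions for illustration; the talk's actual schema may differ.

```java
// Sketch of a Phoenix schema for time-series crime events, plus the
// kind of running-totals query a Zeppelin notebook or dashboard tile
// would run. All names and options here are illustrative assumptions.
public class CrimeSchemaSketch {

    // The composite primary key (event time + incident id) becomes the
    // HBase row key, so time-range scans read contiguous rows.
    // SALT_BUCKETS spreads monotonically increasing timestamps across
    // region servers to avoid write hotspotting.
    static final String DDL = """
            CREATE TABLE IF NOT EXISTS PHILLY_CRIME (
              DISPATCH_TIME  TIMESTAMP NOT NULL,
              DC_KEY         VARCHAR   NOT NULL,
              CRIME_TYPE     VARCHAR,
              LOCATION_BLOCK VARCHAR,
              CONSTRAINT PK PRIMARY KEY (DISPATCH_TIME, DC_KEY)
            ) SALT_BUCKETS = 4""";

    // Running totals per crime type over roughly the last day.
    static final String TOTALS = """
            SELECT CRIME_TYPE, COUNT(*) AS INCIDENTS
            FROM PHILLY_CRIME
            WHERE DISPATCH_TIME > CURRENT_DATE() - 1
            GROUP BY CRIME_TYPE
            ORDER BY INCIDENTS DESC""";

    public static void main(String[] args) {
        System.out.println(DDL);
        System.out.println(TOTALS);
    }
}
```

Because Phoenix maps this table straight onto HBase, the same rows NiFi streams in are immediately visible to the aggregate query.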
11. Apache Phoenix-5.0
• Expect similar timeframe for Phoenix-5.0
• We are working on HBase-2.0 support
• Re-write internals using Apache Calcite
• SQL parser, planner, and optimizer
• Cost-based optimizer used by Hive, Drill, etc.
• Pluggable rules, with default rules and Phoenix-specific ones
• SQL-92 support
• Apache NiFi calls Apache Calcite Avatica JDBC
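The last bullet can be sketched in code: any JDBC client, including NiFi's database connection pool, reaches Phoenix through the Phoenix Query Server's Avatica endpoint via the thin-client driver. The host, port, and table name below are assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of the Avatica JDBC path: the thin client speaks protobuf
// over HTTP to the Phoenix Query Server (default port 8765).
// Host, port, and PHILLY_CRIME are illustrative assumptions.
public class AvaticaClientSketch {

    // Build the Phoenix thin-client JDBC URL for a Query Server.
    static String thinUrl(String host, int port) {
        return "jdbc:phoenix:thin:url=http://" + host + ":" + port
                + ";serialization=PROTOBUF";
    }

    // Would run against a live Query Server; shown here but not
    // invoked, since it needs a running cluster.
    static void queryTotal(String url) throws Exception {
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT COUNT(*) FROM PHILLY_CRIME")) {
            while (rs.next()) {
                System.out.println("rows: " + rs.getLong(1));
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(thinUrl("localhost", 8765));
    }
}
```

In NiFi, the same URL would go into a DBCPConnectionPool controller service, which ExecuteSQL and PutSQL processors then share.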