
Assignments Activity #19 - Big Data Analytics Recommendation Assignments Activity #20 - Evaluating Amazon Data With EMR Notebooks



u Assignments > Activity #19 - Big Data Analytics Recommendation


u Assignments > Activity #20 - Evaluating Amazon Data with EMR Notebooks
u Grad Projects due Tuesday, 11/15
u Team Project Presentations due Thursday, 11/17
u Term Project Check-in #3 next week (11/14-11/18)

u Final Exam – Thursday, 12/8 from 7-9:30pm in GOL 1550


Project check-ins – Term Project 2

u Project check-in #3 (11/14-11/18)


Meet w/Professor: Teams 7-8
Meet w/Su: Teams 1-3
Meet w/Chirayu: Teams 4-6

u This is worth 5% of your grade:


Check-in   Points   Deliverables
3          1        Updated Kanban board reflecting current state of project
           2        Able to demo IAC spinning up infrastructure for team project
           2        Able to successfully demo working project

u Teams can book time via Calendly (links on myCourses) by tomorrow


u Check-in details can be found on myCourses at Content > Project > Project
Check-in
u Big Data Analytics is the complex process of examining large and
varied data sets, or big data, to uncover information such as
hidden patterns, unknown correlations, market trends and
customer preferences that can help organizations make informed
business decisions
u Why important?
u Cost Reduction
u New products and services
u Faster, smarter decision making
u Time to market reductions
u A Data Lake allows you to store all your structured and unstructured
data, in one centralized repository, and at any scale
u With a Data Lake, you store your data as-is, without having to first
structure the data, based on potential questions you may have in the
future
u Spark SQL is a Spark module for
structured data processing
u It provides a programming
abstraction called DataFrames and
can also act as a distributed SQL
query engine
u It enables unmodified Hadoop Hive
queries to run up to 100x faster on
existing deployments and data
u It also provides powerful integration
with the rest of the Spark ecosystem
Internet of Things (IOT) on the Cloud

SWEN 514/614: Engineering Cloud Software Systems

Department of Software Engineering


Rochester Institute of Technology
What is IOT? 10

u The Internet of Things (IoT) is a reference to a collection of devices or objects that are
linked together using an Internet connection
u The hub for the collection (the “things” part) is what sends and collects data using the
Internet, which helps the devices to make decisions and remember particular patterns
and routines for action to be carried out without any manual involvement
u These devices can include multiple appliances that
need to be connected for reasons including
automation and real-time control of the device
u As the IoT has both real-time and historical data
stored, it can provide effective decision-making
instructions to devices, and control certain actions
and aspects of when and how they function
u This technology enables your systems and devices to
be automated cost-effectively
Examples of IOT 11
u Amazon Echo/Google Home - Voice assistants are some of the most popular
connected devices in consumer IoT
u Home Security - IoT connects a variety of sensors, alarms, cameras, lights, and
microphones to provide 24/7/365 security all of which can be controlled from a smart
phone. Example: Ring doorbell
u Activity Trackers - These sensor devices are designed to be worn during the day to
monitor and transmit key health indicators in real time, such as fatigue, physical
movement, oxygen levels and blood pressure. Example: Fitbit
u Smart Thermostats – Allows you to control the temperature of your home from
anywhere, with just a simple touch on your smart phone. They are also able to learn by
following your daily routine and change the temperature of your home without
bothering you. Example: Nest
u Smart Cars - These vehicles are equipped with Internet
access and can share that access with others, just like
connecting to a wireless network in a home or office.
Example: Tesla
When was the term “IOT” created? 12
u The year 1999 was easily one of the most significant in IoT history, as Kevin
Ashton coined the term “the internet of things”
u Ashton was giving a presentation for Procter & Gamble where he described
IoT as a technology that connected several devices with the help of Radio-
Frequency Identification (RFID) tags for supply chain management
u He specifically used the word “internet” in the title of his presentation in order
to draw the audience’s attention, since the internet was just becoming a big
deal at that time
u While his idea of RFID-based device
connectivity differs from today’s IP-based
IoT, Ashton’s breakthrough
played an essential role in the history
of the internet of things and in technological
development overall
Source: https://www.itransition.com/blog/iot-history#:~:text=The%20year%201999%20was%20easily,of%20RFID%20tags%20for%20supply
Internet of Things Facts 13
u In revenue terms, the total IoT market in 2019 was worth $465 billion,
a figure which will rise to $1.5 trillion in 2030
12 Facts You Need to Know About The Internet of Things
01 96% of senior business leaders plan to use IoT in the next 3 years. - Wired
02 30% of c-level execs believe IoT will unlock new revenue from existing products/services. - The Economist
03 94% of businesses have already seen a return on their investments in IoT. - CMO.com
04 “…IoT will have the biggest impact in customer service and support…” - The Economist
05 The IoT will lead to a 25% reduction in asset maintenance costs and 35% reduction in downtime. - U.S. Department of Energy
06 $970 will be saved per year per fleet vehicle. - Cisco
07 38% of businesses believe IoT will have a major impact over the next 3 years. - The Economist
08 $41 trillion will be spent over the next 20 years for infrastructure upgrades. - Intel
09 IoT could add $10-15 trillion to the global GDP. - GE
10 Because of the IoT there will be 22x more data traffic by 2020. - Freescale
11 0.06% of things that could be connected actually were in 2014. - Baseline Magazine
12 40% of all data generated by 2020 will come from connected sensors. - Frost & Sullivan
Source: https://lp.servicemax.com/rs/020-PCR-876/images/12Facts_IoT.pdf
IoT and Cloud Computing 14

u IoT and Cloud Computing complement
one another, working together to
provide a better overall IoT service
u Cloud Computing's role in this
collaboration is to store IoT data
u The Cloud is a centralized server
containing computer resources that can
be accessed whenever required
u Cloud Computing gives the large data
packages generated by the IoT an easy
way to travel over the Internet
How Does IOT Work? 15
1. Sensors/Devices
2. Connectivity/Data Transmission
3. Data Processing
4. Data Visualization/Analytics
1. Sensors/Devices 16
u Sensors or devices collect fine-grained data from the surrounding
environment
u The collected data varies in complexity, ranging from
a simple temperature reading to a full video feed
u A device can have multiple sensors that can bundle together to do more than
just sense things
u For example, our phone is a device that has multiple
sensors such as GPS, accelerometer, and camera, but our
phone does not simply sense things
u The first step is always to
collect data from the surrounding
environment, whether from a standalone sensor or multiple
devices
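A minimal sketch of a device that bundles several sensors into one reading. The `Device` class, the sensor names, and the stubbed values are all hypothetical; a real device would read hardware instead of returning fixed numbers.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Device:
    """A device that bundles several sensors (hypothetical model)."""
    device_id: str
    sensors: Dict[str, Callable[[], object]] = field(default_factory=dict)

    def read_all(self) -> dict:
        # Poll every attached sensor once and bundle the results.
        return {
            "device_id": self.device_id,
            "readings": {name: read() for name, read in self.sensors.items()},
        }


# Stubbed sensors: fixed values stand in for real hardware reads.
phone = Device(
    device_id="phone-1",
    sensors={
        "gps": lambda: {"lat": 43.08, "lon": -77.67},
        "accelerometer": lambda: [0.0, 0.0, 9.8],
    },
)

# Serialize the bundled reading so it can be sent onward.
payload = json.dumps(phone.read_all())
```

The JSON string in `payload` is what the next stage (connectivity) would transmit.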
2. Connectivity/Data Transmission 17
u Once the data is collected it is transferred to the cloud onto some sort of IoT
platform
u These devices connect to the Internet by sending data to your phone or some
other dedicated hardware in your home that acts as a hub over a local
communication method (right)
u These connections can be made directly
through your router or modem via WiFi or
wired methods like Ethernet, cable or
power line networking (signals sent directly
over your home's power lines)
u It could also bypass your home network
entirely via cellular communication
u They may also communicate with other smart devices in the vicinity

Source: https://computer.howstuffworks.com/internet-of-things.htm
2. Connectivity/Data Transmission – IOT Gateway 18
u An IoT gateway device bridges the communication gap between IoT devices,
sensors, equipment, systems and the cloud
u It provides a place to preprocess that data locally (at the “edge”, aka “fog
computing”) before sending it on to the cloud
u When data is aggregated and summarized at the edge, it minimizes the
volume of data that needs to be forwarded on to the cloud, which can have
a big impact on response times and network transmission costs
u Another benefit is that it can provide
additional security for the IoT network and
the data it transports
u Since the gateway manages information moving in both
directions, it can protect data moving to
the cloud from leaks and IoT devices from
being compromised
Source: https://www.lanner-america.com/blog/what-is-an-iot-gateway/
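The aggregation idea can be sketched as follows: a gateway collapses a window of raw readings into one summary record before forwarding it. The `summarize` function and the sample window are assumptions made for illustration, not part of any gateway product.

```python
from statistics import mean
from typing import Dict, List


def summarize(readings: List[float]) -> Dict[str, float]:
    """Collapse a window of raw readings into one summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 2),
    }


# One minute of hypothetical temperature readings, one per second.
window = [21.0 + (i % 5) * 0.1 for i in range(60)]

# The gateway forwards this single record instead of 60 raw values.
summary = summarize(window)
```

Sending one record per minute instead of sixty is where the savings in data volume and transmission cost come from.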
3. Data Processing 19
u Once the data gets to the cloud, the IoT platform processes it
u After reaching the cloud infrastructure the data has to be analyzed so
that the right action can be taken
u This can be as simple as checking if the
temperature is within the acceptable
range or it could be complex such as a
situation where an intruder comes in
and the device has to identify it
through cameras
u The IoT application is built to
process data quickly enough
to take immediate action
4. Data Visualization/Analytics 20
u The processed data is then delivered to the end-user
u This can be achieved by triggering alarms on their phones or notifying
them through texts or emails
u Depending on the IoT application, the user may also be able to perform
an action and affect the system
u For example, the user might remotely
adjust the temperature via an app on
their phone
u Some actions could be performed
automatically
u Rather than waiting for you to adjust the
temperature, the system could do it
automatically via predefined rules
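A rough sketch of predefined rules acting on a processed reading. The `Rule` class, the thresholds, and the action strings are hypothetical; an actual IoT platform would have its own rule syntax.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]   # when the rule fires
    action: Callable[[dict], str]       # what the system does


def apply_rules(reading: dict, rules: List[Rule]) -> List[str]:
    """Run every rule whose condition matches; return the actions taken."""
    return [rule.action(reading) for rule in rules if rule.condition(reading)]


rules = [
    # Automatic action: no user involvement needed.
    Rule("too_hot", lambda r: r["temp_c"] > 26,
         lambda r: "set thermostat to 24"),
    # Notification: the user is alerted and may act through the app.
    Rule("alert_user", lambda r: r["temp_c"] > 35,
         lambda r: f"notify user: {r['temp_c']} C"),
]

actions = apply_rules({"temp_c": 28}, rules)
```

Here a reading of 28 C triggers only the automatic thermostat rule; a hotter reading would also produce a user notification.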
Real Life Example – Controlling an AC 21
1. We have an AC in our room; the temperature sensor installed in it is
integrated with an IOT gateway, which connects the sensor to the cloud
2. The cloud has detailed records about every device connected to it, such as the device ID, the status
of the device, and the time the device was last accessed
3. An end-user uses a mobile app to interact with the cloud (and in turn the devices installed in our homes)
4. A request is sent to the cloud infrastructure with the device's authentication credentials and device
information
5. Once the cloud has authenticated the device, it sends a
request to the appropriate sensor network through the gateway
6. After receiving the request, the temperature sensor inside
the AC reads the current temperature in the room and
sends the response back to the cloud
7. The cloud infrastructure then identifies the user who
requested the data and pushes the requested
data to the app
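The numbered steps above can be sketched as a small in-process simulation. Every name here (`REGISTRY`, `GATEWAY`, `cloud_handle_request`, the device ID, the token) is made up for illustration, and the plain token comparison is a stand-in for real authentication.

```python
from datetime import datetime, timezone
from typing import Callable, Dict

# Step 2: cloud-side records for each connected device (hypothetical data).
REGISTRY: Dict[str, dict] = {
    "ac-livingroom": {"status": "online", "last_accessed": None,
                      "token": "secret-token"},
}


def sensor_read() -> float:
    """Step 6: the sensor reads the room temperature (stubbed value)."""
    return 23.5


# Step 5: the gateway maps device IDs to the sensors it fronts.
GATEWAY: Dict[str, Callable[[], float]] = {"ac-livingroom": sensor_read}


def cloud_handle_request(user: str, device_id: str, token: str) -> dict:
    """Steps 4-7: authenticate, forward via the gateway, reply to the app."""
    device = REGISTRY.get(device_id)
    if device is None or device["token"] != token:
        return {"user": user, "error": "authentication failed"}
    temperature = GATEWAY[device_id]()
    device["last_accessed"] = datetime.now(timezone.utc).isoformat()
    return {"user": user, "device_id": device_id, "temp_c": temperature}


# Step 3: the mobile app sends a request on behalf of the user.
response = cloud_handle_request("alice", "ac-livingroom", "secret-token")
```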
How Big is IOT? 22

u By 2025, there will be 8x more IoT devices than human beings

u According to International Data Corporation (IDC), IoT will constitute ~90
zettabytes, or 51% of all data generated in 2025
u A zettabyte is a trillion gigabytes
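A quick arithmetic check of the figures above, assuming decimal (SI) units, where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes. The implied total is derived only from the two numbers quoted on this slide.

```python
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21

# A zettabyte expressed in gigabytes: 10**12, i.e. a trillion.
gb_per_zb = BYTES_PER_ZB // BYTES_PER_GB

iot_zb = 90          # approximate IoT volume quoted above
iot_fraction = 0.51  # IoT share of all data quoted above

# If ~90 ZB is 51% of the total, the total is roughly 176 ZB.
total_zb = iot_zb / iot_fraction
```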

u Sounds like a job for…?


IoT and Big Data 23

u The relationship between IoT, Big Data and Cloud Computing creates
new opportunities for business to harness exponential growth
u When a company needs data for analysis, IoT is the source
of that data, and Big Data can analyze it and
extract the relevant data to create the required information
u As well as processing large amounts of data
in real time, Big Data stores the
information, which makes it invaluable for
making use of IoT’s capabilities and the
data it provides
u IoT and Big Data can show hidden
correlations, unidentified patterns and
expose new trends in your data set
IOT Concerns 24
u What do you think are some of the major concerns with IOT?

Source: https://www.allerin.com/blog/4-challenges-that-are-faced-by-iot-developers
Security Challenges 25
u Distributed Denial of Service (DDoS) Attacks
u A DDoS attack happens when many network devices are made to send a flood of
messages that congest the IoT network and eventually shut it down
u Network Hacks
u Takes place when an IoT device is compromised through the network that it is
connected to and allows a hacker to access and control the device
u Lack of Device Updates
u Companies are manufacturing IoT devices at an
increasing rate to meet growing demand; many devices on the
market receive few security updates, and
some of them are never updated at all
u Unsafe Communication
u Most of the IoT devices do not encrypt messages while
communicating over a network, which makes it one of the
biggest security challenges of IoT
Security Challenges - CloudPets 26

u Internet-connected teddy bears manufactured by toy maker


Spiral Toys have leaked the email addresses and password
details of more than 800,000 customers online
u The 821,296 account records had been stored on a MongoDB
database, which was not protected by a firewall or password,
and stored with Romanian mobile development company,
mReady
u The leak also included more than 2 million voice recordings
between children and their parents, shared over the Internet
via teddy bears known as CloudPets
u The voice recordings, on the other hand, were linked to an Amazon S3 cloud storage bucket,
which again required no specific authorization
Source: https://internetofbusiness.com/teddy-bears-iot-leak/
Security Challenges – Potential Solutions 27
u Secure the IoT Network
u With a VPN, traffic flows from the device, through an intermediary server, and then continues to
its destination, masking the user's IP address and replacing it with one from the VPN server
u Protect the back-end systems on the internet by implementing endpoint security features
such as antivirus, anti-malware, firewalls, and intrusion prevention and detection systems
u Authenticate the IoT Devices
u Allow the users to authenticate the devices by using multiple user management features and
authentication mechanisms (e.g. two-factor authentication, digital certificates, biometrics)
u Use IoT Data Encryption
u To protect the privacy of users and prevent IoT data breaches, encrypt the data at rest and
in-transit between IoT devices using an IOT Gateway
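As one illustration of device authentication, the sketch below uses a shared-secret HMAC tag rather than the certificate or two-factor mechanisms named above; it is a simpler stand-in, not a recommendation. The key store, device ID, and message shape are hypothetical. Note that an HMAC tag authenticates a message but does not encrypt it.

```python
import hashlib
import hmac
import json

# Hypothetical per-device secret, provisioned ahead of time.
DEVICE_KEYS = {"sensor-42": b"not-a-real-key"}


def sign(device_id: str, body: dict) -> dict:
    """Device side: attach an HMAC-SHA256 tag to the message."""
    raw = json.dumps(body, sort_keys=True).encode()
    tag = hmac.new(DEVICE_KEYS[device_id], raw, hashlib.sha256).hexdigest()
    return {"device_id": device_id, "body": body, "tag": tag}


def verify(message: dict) -> bool:
    """Cloud side: recompute the tag and compare in constant time."""
    key = DEVICE_KEYS.get(message["device_id"])
    if key is None:
        return False  # unknown device
    raw = json.dumps(message["body"], sort_keys=True).encode()
    expected = hmac.new(key, raw, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])


message = sign("sensor-42", {"temp_c": 21.5})
```

A message whose body is altered in transit, or that comes from a device without a known key, fails `verify`.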
Privacy Challenges 28
u When you store data in a company’s cloud service, do you own the data or does the
cloud provider?
u This can be hugely important for IoT applications involving personal data such as
healthcare or smart homes
u Another thing to take into consideration is that while data generated by a single
device may not be sensitive, it may reveal a lot of personal information when
combined with data from other appliances
u Most people do not read privacy policies for
every device they buy or every app they
download, and, even if they attempted to do
so, most would be written in legal language
unintelligible to the average consumer
u One potential solution:
u Demand better regulation that puts the onus
on data gatherers to protect against a breach
Connectivity Challenges 29
u One of the biggest challenges for IoT in the future is connecting a large number
of devices
u It takes time for data to be sent to the cloud and commands to return to the
device
u In certain IoT applications, such as health and safety, these milliseconds can
be critical
u For example, with Autonomous Vehicles
if a crash is imminent, you don’t want to
have to wait for the car to talk to the
cloud before deciding to swerve out of
the way
u This will continue to be a problem as
more and more IOT devices come
online
u One potential solution?
Game-changer for IOT – 5G
u The success of any IoT device is ultimately tied to its performance, which is
dependent on how quickly it can communicate with other IoT devices,
smartphones and tablets
u 5G will be a game-changer as data-transfer speeds will increase
significantly
u Compared to current 4G LTE networks, it will be 10 times faster
u This increase in speed will allow IoT devices to communicate and share
data faster than ever
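To make the "10 times faster" claim concrete, here is a back-of-the-envelope transfer-time calculation. The 100 Mbit/s baseline for 4G LTE and the 500 MB upload are assumed values chosen only for illustration, and the formula ignores latency and protocol overhead.

```python
def transfer_seconds(size_megabytes: float, rate_megabits_per_s: float) -> float:
    """Time to move a payload at a given link rate (ignores latency)."""
    return size_megabytes * 8 / rate_megabits_per_s


lte_rate = 100.0             # assumed 4G LTE rate in Mbit/s
five_g_rate = lte_rate * 10  # the 10x figure from the slide

video_clip_mb = 500.0        # hypothetical camera upload
lte_time = transfer_seconds(video_clip_mb, lte_rate)        # 40 seconds
five_g_time = transfer_seconds(video_clip_mb, five_g_rate)  # 4 seconds
```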
Compatibility Challenges 31
u Different technologies like ZigBee, Z-Wave, Wi-Fi, Bluetooth and Bluetooth Low Energy
(BTLE) are all battling to become the dominant transport mechanism between
devices and hubs
u This becomes a major source of problems when a lot of devices have to be
connected requiring the deployment of extra hardware and software
u Some of these technologies will eventually become obsolete in the next few years,
effectively rendering the devices implementing them useless
u One potential solution is the Open Connectivity
Foundation (OCF), an industry organization whose
stated mission is to develop specification standards,
promote a set of interoperability guidelines, and provide
a certification program for devices involved in IoT
u There are more than 500 member companies including
Samsung, Intel, Microsoft, Qualcomm and Electrolux
IOT Platforms 32

u IoT platforms are the middleware solutions that connect the IoT
devices to the cloud and help seamlessly exchange data over the
network

Source: https://www.kelltontech.com/kellton-tech-blog/best-iot-platforms
AWS and IOT… 33

u AWS IoT is an Amazon Web Services platform that collects and analyzes data
from internet-connected devices and sensors and connects that data to AWS
cloud applications

u It can collect data from billions of devices and connect them to endpoints for
other AWS tools and services, allowing a developer to tie that data into an
application
[Diagram: AWS IoT services] 34
u Collects, stores, organizes, and monitors data passed
from equipment by MQTT messages or APIs by providing
software that runs on a gateway in your facilities and
automates the process of collecting and organizing the
data and sending it to the AWS Cloud
u Detects and responds to events from IoT sensors and
applications. Events are patterns of data that
identify more complicated circumstances than
expected, such as motion detectors using movement
signals to activate lights and security cameras.
u Lets you efficiently run and operationalize sophisticated
analytics on massive volumes of unstructured IoT data
u Helps you track, monitor, and manage the plethora of
connected devices that make up your device fleets
u Managed cloud service that enables connected devices to
securely interact with cloud applications and other devices
u Extends AWS to edge devices so they can act locally on the data
they generate and use the cloud for management,
analytics, and durable storage
u Helps you efficiently connect your devices to AWS IoT
IOT Example #1 – Analyze Environment 35
u The Dropbear application performs multiple functions, including image capture, object
detection, tracking, and depth analysis of CCTV camera streams
u The platform can predict and alert operators about potential collisions based on deviations
from objects’ predicted paths, as shown in the following diagram
u Direct from-the-edge integration with AWS services
such as Amazon S3 to collect image samples and improve
image classification model training
u Lambdas trigger image transfers to AWS
u Data is streamed to the cloud, where it is used to generate a
consolidated multi-factory view on custom-built
visualization platform, which uses DynamoDB and Amazon EC2
u Dropbear uses the AWS IoT SDK to communicate
locally with AWS IoT Greengrass.

Source: https://aws.amazon.com/blogs/iot/improving-industrial-safety-with-video-analytics-aws-iot-core-and-aws-iot-greengrass/
IOT Example #2 – Contact Tracing 36
u SafeTrack automatically generates a customizable contact tracing report based on the
infected person’s interactions
u Using AWS IoT Core, the SafeTrack wearable integrates with the LoRaWAN (Low Power, Wide Area
networking protocol) network server, which enables offloading contact tracing data to
AWS
u The LoRaWAN gateway’s network server will forward the
wearable data to AWS IoT Core
u AWS IoT Core’s Rules Engine will trigger the
Lambda function to decode the message and
create a record in Amazon DynamoDB
u Amazon API Gateway enables frontend
applications (SafeTrack Lite) to retrieve the
device data and create a report of all the
contact tracing data
Source: https://aws.amazon.com/blogs/apn/how-to-build-and-deploy-a-contact-tracing-solution-with-aws-iot-core-and-safetrack-lite/
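A rough sketch of the decode-and-store step, written in the shape of a Lambda handler. The event field names `WirelessDeviceId` and `PayloadData` reflect my understanding of the AWS IoT Core for LoRaWAN uplink message and should be checked against the AWS documentation. The 5-byte payload layout is invented, and an in-memory dict stands in for the DynamoDB table so the example runs without AWS access.

```python
import base64
import struct

TABLE = {}  # in-memory stand-in for the DynamoDB table


def decode_payload(payload_b64: str) -> dict:
    """Decode a hypothetical 5-byte frame: contact ID (uint32) + RSSI (int8)."""
    contact_id, rssi = struct.unpack(">Ib", base64.b64decode(payload_b64))
    return {"contact_id": contact_id, "rssi_dbm": rssi}


def handler(event: dict, context=None) -> dict:
    """Decode the uplink message and store one contact record."""
    record = {"device_id": event["WirelessDeviceId"],
              **decode_payload(event["PayloadData"])}
    TABLE[(record["device_id"], record["contact_id"])] = record
    return record


# Simulated uplink: the wearable saw contact 1234 at -70 dBm.
frame = base64.b64encode(struct.pack(">Ib", 1234, -70)).decode()
result = handler({"WirelessDeviceId": "wearable-7", "PayloadData": frame})
```

In a deployed version the dict write would be replaced by a DynamoDB put, and the payload format would be whatever the wearable vendor specifies.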
IOT Example #3 – IOT and Big Data 37
u Vehicles with GPS and a bunch of other sensors connect to AWS IOT to
do both near-realtime and batch analytics, and visualize their current
locations on a map
u Sensor readings and updates flow from the
vehicles to the AWS IoT gateway
u Kinesis processes realtime data and
stores it in S3
u Non-realtime data can be written
directly to S3
u Hive is used to query the S3 data
and Hue to visualize it on a map
u Shadows can make a device’s
state available to apps and other
services whether the device is
connected to AWS IoT or not

Source: https://aws.amazon.com/blogs/big-data/integrating-iot-events-into-your-analytic-platform/
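The shadow idea can be sketched as a stored document with desired and reported state. The `state`/`desired`/`reported` layout mirrors the AWS device shadow document as I understand it; the `shadow_delta` helper and the vehicle fields are hypothetical.

```python
from typing import Any, Dict


def shadow_delta(desired: Dict[str, Any], reported: Dict[str, Any]) -> Dict[str, Any]:
    """Return the desired settings the device has not yet reported."""
    return {k: v for k, v in desired.items() if reported.get(k) != v}


shadow = {
    "state": {
        "desired": {"engine": "off", "report_interval_s": 30},
        "reported": {"engine": "on", "report_interval_s": 30},
    }
}

# An app can read the last reported state even while the vehicle is offline.
last_known = shadow["state"]["reported"]

# When the vehicle reconnects, it only needs to act on the difference.
delta = shadow_delta(shadow["state"]["desired"], shadow["state"]["reported"])
```

Here `delta` contains only the `engine` setting, since the reporting interval already matches.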
IOT Example #4 – Tracking Sporting Events 38
u Radio-frequency identification (RFID) chips or IoT devices can be worn by players or embedded
in the playing equipment. These devices emit 20–50 messages per second, which may include
player coordinate positions, player speed, statistics, health information, or more
u To process the game, leagues, coaches, or broadcasters can analyze this data using analytics
tools and/or machine learning
u These devices emit 20–50 messages per second. These messages are
collected and output using JSON
u Data is streamed to Kinesis, where it can be stored in S3 or passed
on so that analytics can be computed with Kinesis Data Analytics
u AWS Glue is a serverless data integration service that
makes it easy to discover, prepare, and combine data
for analytics, machine learning, and app development.
u Sagemaker can be used for model training. In
soccer, you can predict a goal percentage
based on the player’s position, acceleration,
and past performance history
u Amazon API Gateway provides an API
layer to clients like an OTT, broadcasting
service, or a web browser

Source: https://noise.getoto.net/tag/sports/
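A small sketch of what analytics on these JSON messages might look like: deriving player speed from consecutive position messages. The message fields (`player`, `t`, `x`, `y`) and the units (seconds, meters) are assumptions, since the slide does not give the actual schema.

```python
import json
import math
from typing import List

# Two hypothetical messages from one player's tag, 50 ms apart
# (a rate of 20 messages per second).
raw_messages = [
    '{"player": 10, "t": 0.00, "x": 30.0, "y": 20.0}',
    '{"player": 10, "t": 0.05, "x": 30.3, "y": 20.4}',
]


def speeds(messages: List[str]) -> List[float]:
    """Speed in m/s between each pair of consecutive position messages."""
    points = [json.loads(m) for m in messages]
    result = []
    for prev, cur in zip(points, points[1:]):
        distance = math.hypot(cur["x"] - prev["x"], cur["y"] - prev["y"])
        result.append(distance / (cur["t"] - prev["t"]))
    return result


player_speeds = speeds(raw_messages)  # roughly 10 m/s for this pair
```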

u Your team has been tasked with producing the next great IOT idea
u Choose one of the following:
1. Design the next great IOT device that uses 5G
u You can either take an existing IOT device and make it better or design something new
u What impacts will 5G have?
u What type of analytics could be used in the cloud to help with your IOT device?
2. Using all the data that’s already being collected for an IOT device(s) of your
choice, what could you do with it to create the next big thing?
u This could be a new product, new analytics or anything else that requires the data
collected from IOT
u Download the template from Assignments > Activity #21 - IOT Idea
u 1 submission per team
