The document discusses challenges with traditional data warehousing and analytics including high upfront costs, difficulty managing infrastructure, and inability to scale easily. It introduces Amazon Web Services (AWS) and Amazon Redshift as a solution, allowing for easy setup of data warehousing and analytics in the cloud at low costs without large upfront investments. AWS services like Amazon Redshift provide flexible, scalable infrastructure that is easier to manage than traditional on-premise systems and enables organizations to more effectively analyze large amounts of data.
This document discusses developing a quantum-based production economy using new "green" materials. It describes leveraging growth through quantum metrics to increase performance and conversion rates. A key concept is generating a "quantum ark" production model using string-based "quantum paper" as a new economic paradigm. This would allow for advanced manufacturing through high-speed composite assembly of complex materials using quantum principles like instances and distributions. The goal is establishing a sustainable "green economy" capable of meeting the needs of an advanced civilization through wireless technologies and agile workforce training approaches.
Seamless, Real-Time Data Integration with Connect - Precisely
As many of our customers have come to learn - integrating legacy data into modern data architecture is easier said than done! View this on-demand webinar to learn all about Precisely's seamless data integration solutions and how they have helped thousands of customers like you trust their data.
Learn about the two flavors of Precisely's Connect:
• Collect, prepare, transform and load your data to various targets using Connect ETL, with the flexibility of using clusters and running in many different environments. With our 'design once, deploy anywhere' capability, what is built on-prem today can run on a cloud platform tomorrow, with no development or mainframe expertise required.
• Capture data changes in real time with no coding, tuning or performance impact using Connect CDC. Replicate exactly WHAT you need and HOW you need it with over 80 built-in data transformation methods.
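The CDC pattern the bullet describes can be pictured in a few lines: change events from a source are applied to a target replica, optionally passing through a transformation step. This is a generic illustration of change capture and apply, not Precisely's API; all names and event shapes here are invented.

```python
# Generic sketch of the change-data-capture (CDC) apply loop that tools
# like Connect CDC automate. Event shapes and names are illustrative.

def apply_change(target: dict, event: dict, transform=None) -> None:
    """Apply one CDC event ({'op', 'key', 'row'}) to a target keyed by primary key."""
    row = event.get("row")
    if transform and row is not None:
        row = transform(row)          # stand-in for a built-in transformation
    op = event["op"]
    if op in ("insert", "update"):
        target[event["key"]] = row    # upsert semantics
    elif op == "delete":
        target.pop(event["key"], None)

# Replicate only WHAT you need, HOW you need it (here: uppercase the name).
uppercase_name = lambda r: {**r, "name": r["name"].upper()}

target = {}
events = [
    {"op": "insert", "key": 1, "row": {"name": "alice"}},
    {"op": "insert", "key": 2, "row": {"name": "bob"}},
    {"op": "update", "key": 1, "row": {"name": "alicia"}},
    {"op": "delete", "key": 2, "row": None},
]
for ev in events:
    apply_change(target, ev, transform=uppercase_name)

print(target)  # {1: {'name': 'ALICIA'}}
```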
Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses - Denodo
Watch the presentation on-demand now: https://goo.gl/kceFTe
Today’s digital economy demands a new way of running business. Flexible access to information and responses in real time are essential for outpacing competition.
Watch this Denodo DataFest 2017 session to discover:
• Data access challenges faced by organizations today.
• How data virtualization facilitates real-time analytics.
• Key use cases and customer success stories.
Airbyte @ Airflow Summit - The new modern data stack - Michel Tricot
The document introduces the modern data stack of Airbyte, Airflow, and dbt. It discusses how ELT addresses issues with traditional ETL processes by separating extraction, loading, and transformation. Extraction and loading use general-purpose routines to pull and push raw data, while transformation applies business logic specific to the organization. The stack is presented as an open solution that allows composing best-of-breed tools for each part of the data pipeline: Airbyte provides data integration, dbt enables data transformation with SQL, and Airflow handles scheduling. The demo shows how these tools can be combined to build a flexible, autonomous, and future-proof modern data stack.
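The ELT split described above can be sketched with plain Python and SQLite standing in for the warehouse: extraction and loading are general-purpose and carry no business logic, while the transformation is an organization-specific SQL model of the kind dbt manages. Table and column names below are illustrative, not Airbyte's or dbt's.

```python
import sqlite3

# EL is general-purpose: raw data lands in the warehouse unmodified.
# T is organization-specific SQL, the part a dbt model would own.

def extract() -> list:
    # Pull raw records from a source, no business logic applied.
    return [("2021-07-01", "EU", 120), ("2021-07-01", "US", 80),
            ("2021-07-02", "EU", 90)]

def load(conn, rows) -> None:
    # Push raw records into the warehouse as-is.
    conn.execute("CREATE TABLE raw_orders (day TEXT, region TEXT, amount INT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)

def transform(conn) -> None:
    # Business-specific SQL model on top of the raw table.
    conn.execute("""CREATE VIEW daily_revenue AS
                    SELECT day, SUM(amount) AS revenue
                    FROM raw_orders GROUP BY day ORDER BY day""")

conn = sqlite3.connect(":memory:")
load(conn, extract())
transform(conn)
print(conn.execute("SELECT * FROM daily_revenue").fetchall())
# [('2021-07-01', 200), ('2021-07-02', 90)]
```

Because the raw table is loaded untouched, the SQL model can be rewritten at any time without re-extracting, which is the main operational argument for ELT over ETL.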
Analytics-Enabled Experiences: The New Secret Weapon - Databricks
Tracking and analyzing how our individual products come together has always been an elusive problem for Steelcase. Our problem can be thought of in the following way: “we know how many Lego pieces we sell, yet we don’t know what Lego set our customers buy.” The Data Science team took over this initiative, which resulted in an evolution of our analytics journey. It is a story of innovation, resilience, agility and grit.
The effects of the COVID-19 pandemic on corporate America shined the spotlight on office furniture manufacturers to find ways in which the office could be made safe again. The team would never have imagined how relevant our work on product application analytics would become. Product application analytics became an industry priority overnight.
The proposal presented this year is the story of how data science is helping corporations bring people back to the office and set the path to lead the reinvention of the office space.
After groundbreaking milestones to overcome technical challenges, the most important question is: What do we do with this? How do we scale this? How do we turn this opportunity into a true competitive advantage? The response: stop thinking about this work as a data science project and start to think about this as an analytics-enabled experience.
During our session we will cover the technical elements that we overcame as a team to set up a pipeline that ingests semi-structured and unstructured data at scale, performs analytics and produces digital experiences for multiple users.
This presentation will be particularly insightful for Data Scientists, Data Engineers and analytics leaders who are seeking to better understand how to augment the value of data for their organization.
Empowering Real Time Patient Care Through Spark Streaming - Databricks
Takeda’s Plasma Derived Therapies (PDT) business unit has recently embarked on a project to use Spark Streaming on Databricks to empower how they deliver value to their Plasma Donation centers. As patients come in and interface with our clinics, we store and track all of the patient interactions in real time and deliver outputs and results based on those interactions. The current problem with our existing architecture is that it is very expensive to maintain and has an unsustainable number of failure points. Spark Streaming is essential for this use case because it allows for a more robust ETL pipeline. With Spark Streaming, we are able to replace our existing ETL processes (based on Lambdas, step functions, triggered jobs, etc.) with a purely stream-driven architecture.
Data is brought into our S3 raw layer as a large set of CSV files through AWS DMS and Informatica IICS, as these services bring data from on-prem systems into our cloud layer. We have a stream currently running which picks these raw files up and merges them into Delta tables established in the bronze/stage layer. We are using AWS Glue as the metadata provider for all of these operations. From the stage layer, we have another set of streams using the stage Delta tables as their source, which transform and conduct stream-to-stream lookups before writing the enriched records into RDS (silver/prod layer). Once the data has been merged into RDS, we have a DMS task which lifts the data back into S3 as CSV files. We have a small intermediary stream which merges these CSV files into corresponding Delta tables, from which we run our gold/analytic streams. The on-prem systems are able to speak to the silver layer, allowing for the near real-time latency that our patient care centers require.
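At its core, the bronze-layer stream described above repeatedly merges micro-batches of raw CSV rows into a keyed table. The pure-Python sketch below shows that upsert semantics; in production this would be Spark Structured Streaming merging into Delta tables, and the column names here are invented.

```python
import csv, io

# Each micro-batch of raw CSV rows is upserted into a table keyed by a
# primary key, last write wins -- the effect of a Delta MERGE per batch.
# Column names are illustrative, not Takeda's schema.

def merge_batch(table: dict, batch_csv: str, key: str = "patient_id") -> None:
    """Upsert every row of one CSV micro-batch into `table`."""
    for row in csv.DictReader(io.StringIO(batch_csv)):
        table[row[key]] = row

bronze = {}
merge_batch(bronze, "patient_id,status\n1,checked_in\n2,checked_in\n")
merge_batch(bronze, "patient_id,status\n1,donating\n3,checked_in\n")

print(sorted((k, v["status"]) for k, v in bronze.items()))
# [('1', 'donating'), ('2', 'checked_in'), ('3', 'checked_in')]
```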
M|18 How We Made the Move to MariaDB at FNI - MariaDB plc
FNI, a multi-tenant SaaS company providing credit strategy and loan origination services, decided to migrate from Oracle to MariaDB due to rising costs and need for a more scalable and secure solution. They evaluated several open source and commercial databases and selected MariaDB in 2015 as it met their requirements for high volume processing, failover capabilities, hardware agnosticism, scalability, and encryption. FNI implemented MariaDB in a phased approach starting with proof of concept and has now migrated 6 production and 64 test databases. MariaDB has provided cost savings and allowed FNI to standardize processes and code while improving products and services for their financial customers.
According to a recent Harvard Business Review study, there’s only a 43% chance that customers who have a poor experience will stick with you for the next 12 months. Contrast that to the 74% that will remain your customer if they have a great experience. Learn how Macy’s, a leading American department store chain founded in 1858 with over 750 stores in North America, is transforming their customer experience with DataStax Enterprise.
Webinar recording: https://youtu.be/CiUVxh6Ov_E
View current and past DataStax webinars: http://www.datastax.com/resources/webinars
This webinar follows the process of evaluating different big data platforms based on varying use cases and business requirements, and explains how big data professionals can choose the right technology to transform their business. During this session, Ooyala CTO, Sean Knapp will discuss why Ooyala selected DataStax as the big data platform powering their business, and how they provide real-time video analytics that help media companies create deeply personalized viewing experiences for more than 1/4 of all Internet video viewers each month.
View the webinar here - https://bit.ly/2ErkxYY
Enterprises are moving their data warehouse to the cloud to take advantage of reduced operational and administrative overheads, improved business agility, and unmatched simplicity.
The Impetus Workload Transformation Solution makes the journey to the cloud easier by automating the DW migration to cloud-native data warehouse platforms like Snowflake. The solution enables enterprises to automate conversion of source DDL, DML scripts, business logic, and procedural constructs. Enterprises can preserve their existing investments, eliminate error-prone, slow, and expensive manual practices, mitigate any risk, and accelerate time-to-market with the solution.
Join our upcoming webinar where Impetus experts will detail:
Cloud migration strategy
Critical considerations for moving to the cloud
Nuances of migration journey to Snowflake
Demo – Automated workload transformation to Snowflake.
To view - visit https://bit.ly/2ErkxYY
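One way to picture the automated workload transformation described above is as rule-based rewriting of source DDL into Snowflake-compatible DDL. The toy converter below maps a couple of Teradata-style constructs; the rules are purely illustrative and are not the Impetus rule set.

```python
import re

# Toy rule-based DDL converter: map source types to Snowflake equivalents
# and strip a physical-storage clause Snowflake does not use.
# The rule table is illustrative only.

TYPE_MAP = {
    r"\bBYTEINT\b": "TINYINT",
    r"\bVARCHAR\((\d+)\) CHARACTER SET UNICODE\b": r"VARCHAR(\1)",
}

def convert_ddl(src: str) -> str:
    out = src
    for pat, repl in TYPE_MAP.items():
        out = re.sub(pat, repl, out)
    # Drop the Teradata PRIMARY INDEX clause, which has no Snowflake equivalent.
    out = re.sub(r"\s*PRIMARY INDEX\s*\([^)]*\)", "", out)
    return out

src = ("CREATE TABLE accounts (id BYTEINT, "
       "name VARCHAR(50) CHARACTER SET UNICODE) PRIMARY INDEX (id)")
print(convert_ddl(src))
# CREATE TABLE accounts (id TINYINT, name VARCHAR(50))
```

A production converter must also handle procedural constructs and DML, as the solution description notes, which is where automation saves the most manual effort.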
Maxis is a data services company founded in 2010 that provides Maximo consulting, upgrades, and archival services using their Alchemize software. Alchemize started as a Maximo archive solution in 2015 and has since expanded to support full data migration, transformation, and analytics capabilities. Key features of Alchemize include its ability to archive and migrate data across different systems quickly and flexibly while maintaining data integrity. It is used by various organizations for projects such as compliance, system performance optimization, and application retirement.
The document discusses Oracle Analytics Cloud and its capabilities for data visualization and storytelling. It describes how the tool allows anyone to access and analyze data from various sources to gain insights. It provides rich visualization features, collaborative sharing abilities, and can be accessed on mobile, desktop or browsers to tell data-driven stories. The key benefits highlighted are that it offers powerful yet easy-to-use analytics accessible to all users.
Jan van Ansem - Help a friend: how the Developers community can help to get Data Warehousing development up to date with modern development technology.
Helsinki Cassandra Meetup #2: Introduction to CQL3 and DataModeling - Bruno Amaro Almeida
- Introduction to CQL3 and DataModeling (Johnny Miller, Cassandra Solutions Architect, Datastax):
Johnny Miller is an experienced developer, architect, team lead and agile coach with a history of working at Sky, AOL Broadband and Alcatel-Lucent. Johnny has architected and delivered a number of platforms using Cassandra as a key component for achieving high availability and efficient scaling.
The document discusses Red Hat's CloudForms product and its capabilities for managing containers and Kubernetes/OpenShift environments. It provides an overview of CloudForms' integration with Kubernetes and OpenShift, how it allows monitoring and management of containers, pods, images, nodes and other resources. It also demonstrates CloudForms' topology views and dashboards for containers. The objectives of the event are to share knowledge about Red Hat's container solutions and how CloudForms addresses common concerns around managing containers.
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ... - Databricks
Zalando transitioned from a centralized data platform to a data mesh architecture. This decentralized their data infrastructure by having individual domains own datasets and pipelines rather than a central team. It provided self-service data infrastructure tools and governance to enable domains to operate independently while maintaining global interoperability. This improved data quality by making domains responsible for their data and empowering them through the data mesh approach.
Audience Level
All levels
Synopsis
Our journey towards solving our Application and Infrastructure Problems using Immutability, Codification, Mesos, Docker and Ironic.
OpenStack Australia Day Melbourne 2017
https://events.aptira.com/openstack-australia-day-melbourne-2017/
Snowflake + Power BI: Cloud Analytics for Everyone - Angel Abundez
This document discusses architectures for using Snowflake and Power BI together. It begins by describing the benefits of each technology. It then outlines several architectural scenarios for connecting Snowflake to Power BI, including using a Power BI gateway, without a gateway, and connecting to Analysis Services. The document also provides examples of usage scenarios and developer best practices. It concludes with a section on data governance considerations for architectures with and without a Power BI gateway.
This webinar discussed how Jaspersoft delivered Cloud BI 60% faster for Kony by using Jaspersoft on AWS. It covered an introduction to AWS Cloud and big data services, Jaspersoft capabilities for building reports and dashboards from various data sources, and a customer case study of how Kony was able to build reports in 5 days instead of 4 weeks using this solution. The webinar emphasized that Jaspersoft on AWS provides flexible, affordable, self-service business intelligence without risks through its pay-as-you-go model starting at less than $1 per hour.
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud - Amazon Web Services
FINRA’s Data Lake unlocks the value in its data to accelerate analytics and machine learning at scale. FINRA's Technology group has changed its customer's relationship with data by creating a Managed Data Lake that enables discovery on Petabytes of capital markets data, while saving time and money over traditional analytics solutions. FINRA’s Managed Data Lake includes a centralized data catalog and separates storage from compute, allowing users to query from petabytes of data in seconds. Learn how FINRA uses Spot instances and services such as Amazon S3, Amazon EMR, Amazon Redshift, and AWS Lambda to provide the 'right tool for the right job' at each step in the data processing pipeline. All of this is done while meeting FINRA’s security and compliance responsibilities as a financial regulator.
Unlock Data-driven Insights in Databricks Using Location Intelligence - Precisely
Today’s data-driven organisations are turning to Databricks for a cloud-based, open, unified platform for data and AI. Yet many companies struggle to unlock the value of the data they have in Databricks. To capitalise on the promise of a competitive edge through increased efficiency and insight, data scientists are turning to location to make sense of massive volumes of business data.
Watch this on-demand webinar to hear from The Spatial Distillery Co. and Databricks on how to leverage advanced location intelligence and enrichment solutions in Databricks to:
- Simplify the complexity of location data and transform it into valuable insights
- Enrich data with thousands of attributes for better, more accurate analytics, AI, and ML models
- Leverage the power of Databricks to integrate geospatial data into business processes for real-time answers
- Create more meaningful and timely customer interactions by streamlining customer-facing and operational tasks
Attributes of a Modern Data Warehouse - Gartner Catalyst - Jack Mardack
Most data-driven enterprises continue to struggle to generate the insights they need from their data. More data volumes from more data sources, combined with escalating user concurrency, have led to declining query throughput performance and skyrocketing data warehouse costs. Moreover, modern use cases such as customer-360 and hyper-personalization have blurred the boundaries between operational and analytics systems, making even greater demands on data warehouse solutions.
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
AWS Summit Berlin 2013 - Big Data Analytics - AWS Germany
Learn more about the tools, techniques and technologies for working productively with data at any scale. This session will introduce the family of data analytics tools on AWS which you can use to collect, compute and collaborate around data, from gigabytes to petabytes. We'll discuss Amazon Elastic MapReduce, Hadoop, structured and unstructured data, and the EC2 instance types which enable high performance analytics.
This session provides an introduction to the AWS platform and services. It explains how you can get started on your cloud journey and what resources you can use to build sophisticated applications with increased flexibility, scalability and reliability. The session also covers the benefits customers are enjoying by moving to the AWS cloud: increased agility, faster decision making and the ability to fail fast and innovate.
ADV Slides: Comparing the Enterprise Analytic Solutions - DATAVERSITY
Data is the foundation of any meaningful corporate initiative. Fully master the necessary data, and you’re more than halfway to success. That’s why leverageable (i.e., multiple use) artifacts of the enterprise data environment are so critical to enterprise success.
Build them once (keep them updated), and use again many, many times for many and diverse ends. The data warehouse remains focused strongly on this goal. And that may be why, nearly 40 years after the first database was labeled a “data warehouse,” analytic database products still target the data warehouse.
Businesses are generating more data than ever before.
Doing real-time data analytics requires IT infrastructure that often needs to be scaled up quickly, and running an on-premises environment in this setting has its limitations.
Organisations often require a massive amount of IT resources to analyse their data and the upfront capital cost can deter them from embarking on these projects.
What’s needed is scalable, agile and secure cloud-based infrastructure at the lowest possible cost so they can spin up servers that support their data analysis projects exactly when they are required. This infrastructure must enable them to create proof-of-concepts quickly and cheaply – to fail fast and move on.
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat... - Amazon Web Services
In this session, we provide an update on Amazon Redshift, and look at a case study from Equinox Fitness Clubs. We show you how Amazon Redshift queries data across your data warehouse and data lake, without the need or delay of loading data, to deliver insights you cannot obtain by querying independent data silos. Discover how Equinox Fitness Clubs transitioned from on-premises data warehouses and data marts to a cloud-based, integrated data platform, built on AWS and Amazon Redshift. Learn about their journey from static reports, redundant data, and inefficient data integration to a modern and flexible data lake and data warehouse architecture that delivers dynamic reports based on trusted data.
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data - MongoDB
Are you tired of a tedious, drawn-out data-to-insights journey, siloed data, and unleveraged data? Would you like existing demographic data to help you drive business outcomes? Would you like to get direct insights on your data with a pre-fabricated data structure, without the effort of creating any data lake?
Data lakes can scale in step with the cloud, break down integration barriers and data locked in silos, and pave the way for new business opportunities. All of this contributes to a better basis for decision-making for management and employees. Come and hear how.
David Bojsen, Architect, Microsoft
This overview presentation discusses big data challenges and provides an overview of the AWS Big Data Platform by covering:
- How AWS customers leverage the platform to manage massive volumes of data from a variety of sources while containing costs.
- Reference architectures for popular use cases, including, connected devices (IoT), log streaming, real-time intelligence, and analytics.
- The AWS big data portfolio of services, including, Amazon S3, Kinesis, DynamoDB, Elastic MapReduce (EMR), and Redshift.
- The latest relational database engine, Amazon Aurora: a MySQL-compatible, highly available engine which provides up to five times better performance than MySQL at one-tenth the cost of a commercial database.
Created by: Rahul Pathak,
Sr. Manager of Software Development
This document provides an overview of Amazon Web Services (AWS) and why customers use AWS. It discusses how AWS enables agility for customers, allows them to avoid undifferentiated heavy lifting of managing infrastructure, provides a broad platform for innovation at scale, and offers cost savings and flexibility through various pricing models. The document then highlights how a variety of Nordic companies are using AWS across different use cases like development and testing, new workloads, supplementing existing workloads and infrastructure, migrating applications, data center migration, and moving entire IT operations to the cloud.
Last week, June 11th, AWS hosted a successful Partner Day in London, targeted at our existing APN partners.
This is what we've covered during the sessions:
- AWS product and services update
- The AWS partner program benefits and opportunities
- How to develop your partnership with AWS
- AWS competency program
- How to resell AWS services
How to Modernize an IT Architecture with Data Virtualization? - Denodo
Watch: https://bit.ly/347ImDf
In the digital era, efficient data management is fundamental to a company's competitiveness. However, most companies face data silos, which make working with data slow and costly. Moreover, the speed, diversity, and volume of data can outgrow traditional IT architectures.
How can data delivery be improved to extract its full value?
How can data be made available and usable in real time?
The experts at Vault IT and Denodo invite you to this webinar to discover how data virtualization can modernize an IT architecture in a context of digital transformation.
Analytics in a Day Ft. Synapse Virtual Workshop - CCG
Say goodbye to data silos! Analytics in a Day will simplify and accelerate your journey towards the modern data warehouse. Join CCG and Microsoft for a half-day virtual workshop, hosted by James McAuliffe.
This document summarizes a webinar about using Informatica Cloud to load big data into AWS services like Amazon Redshift for analytics. It discusses how Informatica Cloud can help consolidate and analyze customer data from multiple sources for a company called UBM to improve customer insights. The webinar also provides an example of how UBM used Informatica Cloud and Redshift to better understand customer behaviors and identify potential event attendees through analytics.
The document provides an overview of AWSome Day, an AWS event. It discusses the top reasons customers use AWS, including agility, platform breadth, innovation at scale, and cost savings/flexibility. It then discusses how various organizations in the Nordics are using AWS for development/testing, new workloads, supplementing existing workloads, migrating applications, data center migration, and moving their entire IT to the cloud.
This session explains how to back up and restore databases in the cloud. It introduces a range of data protection methods using AWS Backup, from full backup/restore to point-in-time recovery (PITR) backups, as well as multi-account and multi-region protection (demo included). It also looks at how quickly data can be restored and replicated when the Amazon FSx for NetApp ONTAP storage service is used as the data store for a self-managed database.
Companies often face database overload, service delays, and outages when traffic spikes unexpectedly, for example around events or new product launches. Aurora auto scaling cannot respond in real time because of provisioning delays, which in turn leads to over-provisioning for traffic headroom. To solve this, we introduce a mixed-configuration Amazon Aurora cluster architecture that combines provisioned Aurora instances with Aurora Serverless v2 (ASV2) instances, together with a custom auto scaling solution based on high-resolution metrics.
We introduce Amazon Aurora Limitless Database, which lets you scale an Aurora cluster to millions of write transactions per second and manage petabytes of data, extending relational database workloads in Aurora beyond the limits of a single Aurora writer instance without creating custom application logic or managing multiple databases.
Standard support for Amazon Aurora MySQL-compatible version 2 (with MySQL 5.7 compatibility) is scheduled to end on October 31, 2024. If you are therefore considering a major version upgrade of Aurora MySQL, Amazon Blue/Green Deployments are an ideal solution for performing the upgrade without affecting the production environment. In this session, we practice a major version upgrade of Aurora MySQL using Blue/Green Deployments.
Amazon DocumentDB (with MongoDB compatibility) is a fast, reliable, fully managed database service. With Amazon DocumentDB, you can easily set up, operate, and scale MongoDB-compatible databases in the cloud. In this hands-on session, we run the same application code used with MongoDB against Amazon DocumentDB, using the same drivers and tools.
Database Migration Service Through Customer Cases: A Tool for Database Migration, Consolidation, Separation, and Analysis - Presenter: ... - Amazon Web Services Korea
Database Migration Service (DMS) supports migrating a wide range of databases beyond RDBMS. Through real customer cases, we look at how DMS is used to migrate, consolidate, and separate databases, and what role it also plays in data ingest for analytics.
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev... - Amazon Web Services Korea
Amazon ElastiCache is a fully managed service compatible with Redis and Memcached that improves the performance of modern applications in real time at optimal cost. Through ElastiCache best practices, we look at how to achieve the best performance and optimize your service.
Internal Architecture of Amazon Aurora (Level 400) - Presenter: 정달영, APAC RDS Speci... - Amazon Web Services Korea
Amazon Aurora is a relational database built for the cloud. Aurora combines the performance and availability of commercial databases with the simplicity and cost-effectiveness of open-source databases. This session, aimed at advanced Aurora users, covers Aurora's internal architecture and performance optimization.
[Keynote] Choosing the Right AWS Database - Presenter: 강민석, Korea Database SA Manager, WWSO, A... - Amazon Web Services Korea
For a long time, relational databases were the most widely used, appearing in almost every application. That made choosing a database for an application architecture easier, but it limited the types of applications you could build. A relational database is like a Swiss Army knife: it can do many things, but it is not perfectly suited to any particular task. With the advent of cloud computing, it became possible to build more elastic and scalable applications economically, which changed what is technically feasible. This shift led to the rise of purpose-built databases. Developers no longer need to default to a relational database; they can carefully consider an application's requirements and choose a database that fits those requirements.
Demystify Streaming on AWS - Presenter: 이종혁, Sr Analytics Specialist, WWSO, AWS :::... - Amazon Web Services Korea
Real-time analytics is an increasingly common use case among AWS customers. Join this session to learn how streaming data technologies let you analyze data as it arrives, move data between systems in real time, and reach actionable insights faster. It covers common streaming data use cases, the steps to easily enable real-time analytics in your business, and how AWS helps you adopt AWS streaming data services such as Amazon Kinesis.
Amazon EMR - Enhancements on Cost/Performance, Serverless - Presenter: 김기영, Sr Anal... - Amazon Web Services Korea
Amazon EMR provides a managed service that makes it easy to run analytics applications using open-source frameworks such as Apache Spark, Hive, Presto, Trino, HBase, and Flink. The Amazon EMR runtimes for Spark and Presto include optimizations that deliver more than twice the performance of open-source Apache Spark and Presto. Amazon EMR Serverless is a new deployment option for Amazon EMR that lets data engineers and analysts run petabyte-scale data analytics in the cloud easily and cost-effectively. Join this session to explore Amazon EMR and EMR Serverless through concepts, design patterns, and a live demo, and see how easy it is to run Spark and Hive workloads and to use Amazon EMR Studio and the Amazon EMR integration with Amazon SageMaker Studio.
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance... - Amazon Web Services Korea
Learn more about the new features and capabilities of Amazon OpenSearch, including easily ingesting log and metric data, using the OpenSearch search APIs, and building visualizations with OpenSearch Dashboards. Learn about OpenSearch's observability features for debugging application issues, and how Amazon OpenSearch Service lets you focus on your search or monitoring problems instead of worrying about infrastructure management.
Enabling Agility with Data Governance - Presenter: 김성연, Analytics Specialist, WWSO,... - Amazon Web Services Korea
Data governance is the process of managing data throughout its lifecycle to ensure its accuracy and completeness and to make it accessible to the people who need it. Join this session to learn how AWS provides comprehensive data governance across its analytics services, from data preparation and integration to data access, data quality, and metadata management.
How CXAI Toolkit uses RAG for Intelligent Q&A - Zilliz
Manasi will be talking about RAG and how CXAI Toolkit uses RAG for Intelligent Q&A. She will go over what sets CXAI Toolkit's Intelligent Q&A apart from other Q&A systems, and how our trusted AI layer keeps customer data safe. She will also share some current challenges being faced by the team.
Network Auto Configuration and Correction using Python.pptx - saikumaresh2
- Implemented Zero Touch Provisioning, Network Topology Mapper, and Root Cause Analysis using Python, GNS3, Netmiko, SSH, OSPF, and Graphviz.
- Developed a Python script to automate network discovery based on Core Router IP and login details, significantly reducing manual intervention.
- Enhanced network visualization by generating detailed network graphs, aiding in quick network analysis and troubleshooting.
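The discovery flow in the bullets above can be sketched as a breadth-first walk over neighbor tables, starting from the Core Router IP, to build the graph the topology mapper renders. In the real project the neighbor lookup would SSH into each device via Netmiko and parse "show" output; here it reads a canned table, and every address is illustrative.

```python
from collections import deque

# Canned neighbor tables standing in for live "show cdp/ospf neighbor"
# output parsed over SSH. All IP addresses are illustrative.
NEIGHBOR_TABLE = {
    "10.0.0.1": ["10.0.0.2", "10.0.0.3"],   # core router
    "10.0.0.2": ["10.0.0.1", "10.0.0.4"],
    "10.0.0.3": ["10.0.0.1"],
    "10.0.0.4": ["10.0.0.2"],
}

def neighbors(ip: str) -> list:
    # In production: Netmiko SSH session + command-output parsing.
    return NEIGHBOR_TABLE.get(ip, [])

def discover(core_ip: str) -> dict:
    """Breadth-first walk from the core router; returns the adjacency map."""
    graph, queue, seen = {}, deque([core_ip]), {core_ip}
    while queue:
        ip = queue.popleft()
        graph[ip] = neighbors(ip)
        for n in graph[ip]:
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return graph

topology = discover("10.0.0.1")
print(sorted(topology))  # ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4']
```

The resulting adjacency map is exactly what a Graphviz layer would draw as the network graph.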
Project management Course in Australia.pptx - deathreaper9
Project Management Course
Over the past few decades, organisations have discovered something incredible: the principles that lead to great success on large projects can be applied to projects of any size to achieve extraordinary success. As a result, many employees are expected to be familiar with project management techniques and how they apply them to projects.
https://projectmanagementcoursesonline.au/
8. Take a look at the data processing “pipeline”
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
9. What has changed in this pipeline
Data is available everywhere, contains customer insight, and costs little to generate, but…
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
11. Big Gap in turning data into actionable information
12. The Explosion of Data
Existing Challenges with Analytics
The Cloud
13. Challenge 1: Capex Intensive
Provision all your infrastructure and tools before you get results
Cost of your infrastructure dictates what analytics you can perform
Source: Oracle technology global price list 11/1/2012
14. Most data never makes it to a data warehouse
The Data Analysis Gap
• Enterprise Data is growing at over 50% yearly
• Data Warehousing is growing at less than 10% yearly
[Chart: Enterprise Data vs. Data in Warehouse, 1990–2020]
Sources:
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
Most data is left on the floor
15. Challenge 2: Hard to setup, manage and scale
• Setup takes months of planning and work
• Extending your data warehouse can be heavy on time and cost
• Managing a data analytics platform requires expensive staff
• Complex tuning and management skills required
Enterprises average between 3 and 4 DBAs per data warehouse
Gartner: Critical factors in calculating the data warehouse TCO, July 2009
16. Very hard to move up the stack
These make it extremely hard to move up the Business Intelligence Maturity Stack
17. The Explosion of Data
Existing Challenges with Analytics
The Cloud
24. Value proposition of the AWS cloud
• No Upfront Investment: replace capital expenditure with variable expense
• Low ongoing cost: customers leverage our economies of scale (37 price reductions)
• Flexible capacity: no need to guess capacity requirements and overprovision
• Speed and agility: infrastructure in minutes, not weeks
• Focus on business: not undifferentiated heavy lifting
• Global Reach: go global in minutes and reach a global audience
25. Architected for Enterprise Security Requirements
“The Amazon Virtual Private Cloud [Amazon VPC] was a unique option that offered an additional level of security and an ability to integrate with other aspects of our infrastructure.”
Dr. Michael Miller, Head of HPC for R&D
26. Gartner Magic Quadrant for Cloud Infrastructure as a Service (August 19, 2013)
Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong (asteven@amazon.com). Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
27. Summarizing the problem and the opportunity
• The Explosion of Data: data is a competitive edge
• Existing challenges with analytics: hard and expensive to setup, manage and scale
• The Cloud: lowers cost and improves agility
28. The Solution
Data Analytics in the Cloud
Easy and inexpensive to get started
Easy to setup, scale and manage
Low cost to enable analytics on all your data
Open and flexible
29. Technology Process View
[Diagram: Data sources 1…n and unstructured data sources → Extract, Transform, Load and Cleanse → Data warehouse → Analytics]
The diagram above shows the functional architecture components of any data warehousing project.
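The functional components above can be illustrated with a minimal, hypothetical ETL sketch in Python — the source rows, the cleansing rules, and the list standing in for the warehouse are all placeholders, not part of the deck:

```python
# Minimal illustration of the Extract -> Transform/Cleanse -> Load flow.
# All names and data here are hypothetical stand-ins, not a real pipeline.

def extract(sources):
    """Pull raw rows from each data source (here: plain lists)."""
    for source in sources:
        yield from source

def transform(rows):
    """Cleanse and normalize: drop incomplete rows, standardize values."""
    for row in rows:
        if row.get("customer") and row.get("amount") is not None:
            yield {"customer": row["customer"].strip().title(),
                   "amount": float(row["amount"])}

def load(rows, warehouse):
    """Append cleansed rows to the 'warehouse' (here: a list)."""
    warehouse.extend(rows)
    return warehouse

source_1 = [{"customer": " alice ", "amount": "10.5"}]
source_n = [{"customer": "BOB", "amount": 3}, {"customer": None, "amount": 1}]

warehouse = load(transform(extract([source_1, source_n])), [])
print(warehouse)  # two cleansed rows; the incomplete row is dropped
```

In a real deployment each stage maps to a product in the stack the deck describes: extraction and cleansing to a data-integration tool, the load target to Amazon Redshift.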
30. Source systems
[Same architecture diagram, with the source systems highlighted]
31. Data Integration
[Same architecture diagram, with the Extract, Transform, Load and Cleanse component highlighted]
32. The Data Warehouse
[Same architecture diagram, with the data warehouse highlighted]
33. Business Intelligence and Analytics
[Same architecture diagram, with the analytics components highlighted]
34. Data Analytics – Technology Stack
[Stack diagram: Business Intelligence / Data Integration / Data Warehouse (Amazon Redshift), running on the AWS Cloud]
36. Data warehousing done the AWS way
Deploy
• Easy to provision
• Pay as you go, no up front costs
• Fast, cheap, easy to use
• SQL
37. Customer quotes
“Queries that used to take hours came back in seconds. Our analysts are orders of magnitude more productive.”
“Redshift is twenty times faster than Hive… The cost saving is even more impressive… Our analysts like [it] so much they don’t want to go back.”
“[Amazon Redshift] took an industry famous for its opaque pricing, high TCO and unreliable results and completely turned it on its head.”
“Team played with Redshift today and concluded it is awesome. Unindexed complex queries returning in < 10s.”
38. Amazon Redshift lets you start small and grow big
• Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
• Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
• Single Node (2 TB)
• Cluster of 2–32 XL Nodes (4 TB – 64 TB)
• Cluster of 2–100 8XL Nodes (32 TB – 1.6 PB)
Note: Nodes not to scale
39. Amazon Redshift Pricing – Singapore & Sydney
Price per hour for an XL node ($US):
• On-Demand: $1.25
• 1 Year Reservation: $0.75
• 3 Year Reservation: $0.45
Simple pricing: Number of Nodes × Cost per Hour
No charge for the Leader Node
Pay as you go
40. So for example…
• 1 XL node reserved for 3 years:
  = $0.45 × number of hours in a month
  = ~$340 per month
• 1 XL node cluster gives you:
  – 2 Cores
  – 16 GB RAM
  – 2 TB Disk
  – Plus 2 TB storage in S3 for backups & snapshots
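As a sanity check on the slide's arithmetic, a small sketch of the billing model (the hourly rates come from the pricing slide; 744 hours for a 31-day month is my assumption):

```python
def monthly_cost(nodes, price_per_hour, hours_in_month=744):
    """Redshift billing model from the slides: nodes x hourly rate x hours.
    744 hours assumes a 31-day month; the leader node is not charged."""
    return nodes * price_per_hour * hours_in_month

# 1 XL node at the 3-year reserved rate of $0.45/hour
print(round(monthly_cost(1, 0.45), 2))  # 334.8 -- roughly the slide's ~$340
```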
41. Amazon Redshift is easy to use
• Provision in minutes
• Monitor query performance
• Point and click resize
• Built-in security
• Automatic backups
42. Use cases
• Reporting data warehouse behind an OLTP system
• Data mart to take load off the existing data warehouse
• Log file analysis for clickstream or gaming data (e.g. Advertising, Retail, Gaming)
• Query-able archive for data compliance (e.g. Telco – Call Detail Records)
• Machine-generated sensor data analysis (e.g. Utility smart meters, Resources – equipment failure prediction)
• As a data analytics system for live data (Gaming, Advertising)
43. Flexibility & choice are key in the Cloud
[Stack diagram: Amazon Partner Network (Technology Partners) · Deployment & Administration · Application Services · Compute, Storage, Database, Networking · AWS Global Infrastructure]
47. Informatica: The Industry Leader in Cloud Integration
• #1 by Customer Count: 2000+ companies
• #1 by Customers/Analysts: AppExchange, Gartner
• #1 by Data Processed: 40B+ transactions/month
• #1 by Connectivity: Informatica Cloud Marketplace
52. Cloud Integration Customer Success Stories
• Data Migration: Consolidated Smith Barney and Morgan Stanley data on Day 1 of merger; managers didn't lose momentum in ongoing recruiting efforts
• App Integration: Synchronizing Salesforce CRM with NetSuite and other business apps; 1.5M rows of data synchronized daily
• iPaaS (Build): Customize cloud integration templates to execute sophisticated integration workflows; decreased operational issues from 70% to 30% of IT workload
• Extend PowerCenter: Reduce time to build and distribute connectivity to 3rd-party data sources; enabled faster, more accurate decision-making based on timely, trusted data
• Data Replication: Hybrid deployment gives integration flexibility and scalability to meet various use cases; lowered time and resources needed for integrations by 80%
53. Informatica Cloud
The Industry's Most Comprehensive Cloud Integration and Data Management Solution
• Cloud Process Automation: guiding users to work efficiently with the data
• Cloud Data Quality and MDM: delivering the “Single Customer View”
• Cloud Integration: connecting your cloud apps
57. Challenges with Traditional Approaches to Cloud Integration
• Mainframe-based Integration (Prism, ETI)
• Client/Server-based Integration
• Cloud-based Integration
58. Move to the Cloud…
IT transitions from skeptic to partner to driver, with increasing IT involvement in Cloud decision making:
• Pre-2010: LOB Owned (Outside of IT)
• 2010–2012: LOB Led (IT Approved)
• 2012–2013: Business-IT Collaboration
• 2013: Cloud First (IT Led)
59. Cloud is the Reality in the Enterprise
(Forrester, IDC, Gartner, 451 Group)
• Large, Accelerating Market: 4–6x the growth rate of on-premise IT; 20–27% CAGR, $20–40B market; SaaS is the largest category, PaaS the fastest growing (Forrester)
• Led by Large Enterprises: 76% of enterprises have a formal cloud strategy (Forrester); 84% of all companies will be using SaaS within 12 months (IDC); 74% of those using cloud will increase cloud spend by more than 20% (IDC)
• Driven by IT: 90% of cloud decisions and operations involve IT (IDC); 60% of net new software is now SaaS (Forrester); 66% of SaaS POs are signed by IT (IDC)
60. Informatica Cloud and Amazon Redshift: Enabling cost-effective data warehousing
• Redshift Connector pre-release announced in February
• General availability in August 2013
• InformaticaCloud.com/Amazon-Redshift
61. What did it use to take…
• Budget large capital expenditure
• Schedule a sales meeting with Oracle, IBM, Teradata, etc.
• Formal POC (Proof of Concept)
• Procure software and hardware
• Install and setup
• Start project
62. What it takes now…
• Go to the web and sign up
• Start project!
64. Informatica Cloud Amazon Redshift demonstration
[Diagram: Informatica Cloud exchanges metadata mappings with the Secure Agent running behind the firewall]
1. Build mapping and execute job
2. Retrieve Account Data
3. Put Account Data into Flat File
4. Transfer compressed Flat File to S3
5. Initiate copy from S3
6. Load data into Amazon Redshift
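Steps 5 and 6 boil down to issuing a Redshift COPY command against the staged S3 file. A minimal sketch of assembling that statement — the table name, bucket path, and IAM role are hypothetical placeholders, and IAM_ROLE authorization is the modern form (a 2013-era setup would have passed access keys via CREDENTIALS):

```python
def build_copy_statement(table, s3_path, iam_role):
    """Assemble a Redshift COPY command for the demo's load step.
    GZIP matches the compressed flat file produced in step 4."""
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV GZIP;"
    )

# Hypothetical example values, not from the deck.
sql = build_copy_statement(
    "accounts",
    "s3://demo-bucket/accounts.csv.gz",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
print(sql)
```

The point of COPY from S3, rather than row-by-row INSERTs, is that Redshift parallelizes the load across all slices in the cluster — which is what the batch-sizing best practice on the next slide exploits.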
65. Best practices to remember…
• The Amazon S3 bucket that holds the data files must be created in the same region as your cluster
  – Files are deleted from the Amazon S3 bucket when the upload is complete
• Choose a batch size where the number of batches matches the number of slices in your cluster
  – Each XL node has 2 slices; each 8XL node has 16
  – If you have a 2-node XL cluster and 40,000 rows of data, choose a batch size of 10,000
  – The Informatica Cloud Redshift connector can maximize Amazon's parallel processing capabilities this way
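The batch-sizing rule above is simple arithmetic; a small sketch (the function name is mine, the slice counts are the slide's):

```python
import math

# Slices per node for the node types named on the slide.
SLICES_PER_NODE = {"XL": 2, "8XL": 16}

def batch_size(rows, nodes, node_type):
    """Pick a batch size so the number of batches equals the number of
    slices in the cluster, per the best practice above."""
    slices = nodes * SLICES_PER_NODE[node_type]
    return math.ceil(rows / slices)

# The slide's example: a 2-node XL cluster (4 slices) and 40,000 rows.
print(batch_size(40_000, 2, "XL"))  # 10000
```

With one batch per slice, every slice in the cluster loads data in parallel and none sits idle.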
66. Next Steps
• Get started with Amazon Redshift
• Get started with Informatica Cloud – InformaticaCloud.com
• Learn more about our Redshift Connector – InformaticaCloud.com/Amazon-Redshift
85. Jaspersoft: The Intelligence Inside
• Embeddable Architecture: open web standard architecture makes integration with any app easy to perform
• Cloud Ready: multi-tenant architecture, 100's of SaaS customers, top-selling BI solution on Amazon
• Full Self-Service BI Suite: address all user requirements with interactive reports, dashboards, analysis, and data integration
• Affordable: up to 80% less than traditional BI platforms while delivering significant power & capabilities
• Proven Platform: millions of users, 380,000 community members, deployed in 130,000+ applications
93. … with a World-Class BI Platform
[Stack diagram:]
• Reporting, Dashboards, Visualization, OLAP Analysis
• Columnar-Based In-Memory Engine
• Business Metadata Layer
• Data access: Data Integration, Data Virtualization, Direct
• Extensive APIs: HTTP, SOAP, REST
• 100% Web Standards: CSS, JS, JSP, Java
• HTML5 Browser, Native Mobile Apps
• Data Connectivity to Any Data: RDS, Redshift, EMR, SaaS, On-Premises
100. Jaspersoft Pro on AWS
• Jaspersoft is the first BI service that you can buy per hour
  – No user limitations, no monthly fee, less than $1 per hour
• First BI service to automatically connect to your AWS data
  – 10 minutes from launch to visualizing your data in RDS or Redshift
  – AWS Security Integration
• Released February 2013
  – Over 500 customers