Lumira Sizing
Sizing Guide
Date Details
Feb 17, 2014 Additional information added regarding how HANA works in relation to workflow
estimation
April 14, 2014 Added option for measuring HANA impact using Lumira Desktop.
July 22, 2014 Sizing Methodology for Lumira Server updated. Sizing for Lumira Desktop removed.
Removed sample sizing calculation.
Oct 17, 2014 Added Background: Performance Impact section. Also added HANA cluster
considerations when doing sizing.
Nov 26, 2014 Added Getting Started with HANA and Lumira Server section, which provides
general sizing guidance for customers who don't yet have HANA systems. Changed
typical user think times to 10 seconds (from 600) to account for a more typical
Lumira-style workflow.
Feb 23, 2015 Added Lumira Edge and multi-user Lumira Desktop sections.
March 9, 2015 Updated product names, fixed typo in Lumira Server sizing calculation
June 8 2015 Added Lumira, Server for BI Platform section to reflect the Lumira 1.25 release.
Sep 29 2016 Added SAPS value guidance and recommendations for HANA Online documents in
the Lumira, Server for BI Platform section to reflect the Lumira 1.31 release.
Added recommendations for Lumira 1.31 to the Tuning Options section.
Think time
Refers to the time a user takes between actions that cause load on a system. Normally a user does not
constantly click in an application. The amount of time a user spends processing the information
displayed on screen before taking a subsequent action is referred to as think time.
Users
When sizing Lumira in terms of users, there are two dimensions to consider:
User Class: relates to the load generated in the system by user workflows.
User Types: help you anticipate the concurrency ratio of users in the system.
User Class
Lumira workflows are divided into two user categories: Consumer and Analyst. Consumers typically
use fewer system resources because their activities are principally related to viewing existing
documents.
Consumers: generally use the system to view cached data presented as information, drill, filter, and
occasionally refresh the data when a document is opened. Multiple consumers in a system would view
the same Lumira document created and distributed by analysts.
Analysts: perform more resource-intensive operations including ad-hoc analysis and customizing
content as required. Analysts use their own exclusive and detail-rich Lumira documents that contain the
most up-to-date data. A typical analyst workflow might involve opening a Lumira document, refreshing
the data, and editing visualizations.
Both consumers and analysts typically spend an average of 10 seconds of think time between
on-screen actions.
You should use think time in conjunction with user classes to reach the most accurate sizing estimates.
User classes help you quantify and distinguish system usage and the subsequent load generation for the
two Lumira user categories. In your sizing exercise, workflows by a certain user class may have a
different think time between operations compared to other system users.
The following example illustrates how to estimate the active concurrent users.
Let's assume your organization has 3,500 users with accounts for Lumira Server products. You estimate
that only 20% of these users will actually log on to Lumira Server. However, these 700 active users will
use Lumira Server for only part of their tasks. Let's assume that 20% of the active users will concurrently
view, edit, and refresh Lumira documents. Our example would therefore have 140 active concurrent users.
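The arithmetic can be sketched as follows; the 20% log-on and 20% concurrency ratios are this example's assumptions and should be replaced with estimates for your own organization:

    # Estimate active concurrent users from total named accounts.
    total_accounts = 3500
    logon_ratio = 0.20        # share of accounts expected to actually log on
    concurrency_ratio = 0.20  # share of active users working at the same time

    active_users = total_accounts * logon_ratio                  # 700
    active_concurrent_users = active_users * concurrency_ratio   # 140

    print(f"Size for {active_concurrent_users:.0f} active concurrent users")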
When sizing Lumira Server products, specify your inputs in terms of active concurrent users.
If you expect a concurrency ratio that is higher than normal, you should expect a heavier load and would
need to compensate accordingly. If you require guidance to estimate the user concurrency ratio in your
organization, please consult your SAP representative.
1. A CPU core refers to a physical (not hyper-threaded) core on a physical machine. In a virtual
machine, this is equivalent to a logical processor.
2. Merged data sets are more complex to process, and can require more memory and processing
resources.
3. The refresh and edit workflows are memory-intensive operations requiring new copies of the
data to be loaded in the in-memory data engine.
4. It is recommended that you build your system to have at most an average 65% CPU utilization
rate to support potential bursts in activity. Performance of both physical and virtual systems can
degrade when the CPU utilization rate exceeds 80%. The headroom arithmetic is sketched after this list.
5. The maximum memory utilized should not exceed the physical memory available. Refer to the
FRS disk recommendations mentioned in the BI 4 Sizing Guide.
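As an illustration of note 4, a minimal sketch of the headroom arithmetic; the measured demand figure here is a hypothetical input, not a benchmark result:

    import math

    measured_cpu_demand_cores = 12.0  # hypothetical steady-state CPU demand

    # Provision so that demand stays at or below 65% of capacity, leaving
    # headroom for bursts (performance can degrade above 80% utilization).
    target_utilization = 0.65
    required_cores = math.ceil(measured_cpu_demand_cores / target_utilization)

    print(f"Provision at least {required_cores} physical cores")  # 19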
SAP strongly recommends that BI virtual machines have reservations for the memory
and logical CPUs assigned to them.
For more in-depth information on BI and Virtualization, please refer to the SAP BI 4
Virtualization section at www.sap.com/bivirtualization.
Note: The recommendations outlined in this section do not apply to Lumira Desktop or Lumira, Server
for HANA as they have different architectures.
The sizing estimate below should only be used as a starting point as it is based on the
workflows outlined in Table 1. We strongly recommend volume testing to validate your
sizing estimate based on the expected usage in your deployment.
                                                Scenario 1   Scenario 2   Scenario 3
Recommended Maximum Active Concurrent Users
  HANA Online documents* - Medium Document          25           35           50
Recommended Configuration
  Minimum memory required (RAM in GB)               32           48           64
  SAPS Value                                      8550        25700        25700
Table 1.a: Sizing Recommendations for Lumira, Server for BI Platform (Online)
                                                Scenario 1   Scenario 2   Scenario 3
Recommended Maximum Active Concurrent Users
  Offline documents* - Medium Document              10           12           15
  Offline documents* - Small Document               15           25           35
Recommended Configuration
  Minimum memory required (RAM in GB)               32           48           64
  SAPS Value                                      8550        25700        25700
Table 1.b: Sizing Recommendations for Lumira, Server for BI Platform (Offline)
1. Your hardware should at least meet the minimum hardware specifications outlined in the SAP
Lumira, Server for BI Platform Product Availability Matrix (PAM).
2. To avoid the user load impacting other BI platform workflows, it is recommended that Lumira,
Server for BI Platform runs on dedicated hardware resources according to the recommendations
outlined in Table 1.
3. The sizing recommendations are specific to the hardware configurations outlined in Table 1. For
larger deployments involving more than 35 active concurrent users, we recommend that you
add more nodes to your deployment rather than adding additional Lumira servers to an existing
node, to avoid potential memory resource allocation conflicts. For example, if your deployment
requires 350 active concurrent users, you should consider deploying 10 nodes, each with 64GB
RAM and 24 CPU cores, as the sketch below illustrates.
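A sketch of the scale-out arithmetic from note 3, assuming the 35-user-per-node ceiling stated above:

    import math

    users_required = 350
    max_users_per_node = 35  # per-node ceiling from the recommendation above

    # Add nodes rather than stacking Lumira servers on an existing node.
    nodes_needed = math.ceil(users_required / max_users_per_node)
    print(f"Deploy {nodes_needed} nodes, each with 64GB RAM and 24 CPU cores")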
The recommendations listed in Table 1 are necessary to support the following user load:
Table 2: Typical Analyst workflow used for Lumira, Server for BI Platform sizing tests
For the workflows used, all users view different documents and then refresh them. We have factored in an
average idle think time of 10 seconds between each operation. The workflow involves loading and
updating data in the in-memory data engine.
Refer to the following links for more Lumira, Server for BI Platform details.
http://scn.sap.com/docs/DOC-63551: contains the latest information regarding Lumira, Server
for BI Platform functionalities and support statements.
http://scn.sap.com/docs/DOC-26507: refer to the Architecture Process Flow section for a step-by-step
explanation of how Lumira user loads are processed within a BI Platform deployment.
The following are a few server configuration parameters that could improve scalability. It is
recommended that you update these parameters in your deployment landscape and validate the
changes with a basic load test.
This information is provided as a starting point for your deployment planning. Your experience may vary
depending on:
The complexity of the documents created
The data set size and complexity
The general user workflow
The recommended sizing estimate below should only be used as a starting point as it is
based on the workflows outlined in Table 3. We strongly recommend volume testing to
validate your sizing estimate based on the expected usage in your deployment.
Data Set Size (rows)   Minimum Memory (RAM)   CPU Cores   Active Concurrent Users
10,000                 8GB                    4           8
500,000                32GB                   8           10
100,000,000            32GB                   8           4
Table 3: Sizing Recommendations for Lumira, Server for teams
Note: Table 3 is based on tests of Lumira, Server for teams 1.23.
The recommendations listed in Table 3 are necessary to support the following user load:
Table 4: Typical Analyst workflow used for Lumira, Server for teams sizing tests
For the workflows used, all users view different documents and then refresh them. We have factored in an
average idle think time of 10 seconds between each operation. The workflow involves loading and
updating data in the in-memory data engine.
-Xmx1024m
The default setting shown above allows a maximum of 1GB of heap memory (RAM). You may need to set
this value higher to accommodate larger data sets, taking into account that you will need more available RAM.
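For example, to allow up to 4GB of heap you would change the setting to the following (the exact configuration file location varies by Lumira Desktop version, so verify it for your installation, and confirm that the machine has the additional RAM available):

    -Xmx4096m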
By default, Lumira Desktop allocates 85% of the available memory on your machine at startup to the
in-memory data engine.
Multi-User Sizing
SAP Lumira, Desktop Edition supports deployment in multi-user environments such as Citrix and
Windows Remote Desktop Services. Be sure to check the SAP Lumira, Desktop Edition PAM for
supported platforms.
There are two general approaches to multi-user application delivery: multiple sessions in the same
machine or per-user virtual machines. Lumira works with both configurations and the sizing is identical.
Sizing for multi-user environments is, in general, a matter of multiplying the memory requirement by
the number of expected concurrent users. For disk storage, multiply the per-user disk storage
requirement by the total number of users assigned to a server.
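A sketch of that multiplication with hypothetical per-user figures; substitute the memory and disk requirements you determined for a single Lumira Desktop user:

    # Hypothetical per-user requirements for one Lumira Desktop session.
    memory_per_user_gb = 4   # RAM for one session (assumed figure)
    disk_per_user_gb = 2     # disk storage per assigned user (assumed figure)

    concurrent_users = 25    # expected concurrent sessions on the server
    assigned_users = 100     # total users assigned to the server

    total_memory_gb = memory_per_user_gb * concurrent_users  # 100 GB RAM
    total_disk_gb = disk_per_user_gb * assigned_users        # 200 GB disk

    print(f"RAM: {total_memory_gb} GB, disk: {total_disk_gb} GB")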
Lumira, Server for HANA is designed to scale to a large range of deployments and users. The number of
users, the types of users, usage patterns, the number of BI tools included in the Lumira family, the kinds
of data set supported by HANA, and the deployment options supported by the suite all factor into a
series of variables that affect the successful deployment of Lumira. No configuration fits all customers.
The purpose of this document is to help guide you through the sizing exercise for Lumira, Server for HANA.
Sizing HANA services is very different from sizing other types of enterprise software. BI in general is a
very resource-intensive task. In addition, BI load can be very bursty, since it relies heavily on the
interaction of users. The act of extracting information from a potentially large amount of data
requires adequate processing power and exercises all the subsystems of a HANA system.
Having the right amount of capacity in your system is crucial to success.
Given its architecture, sizing for HANA is different from traditional BI sizing. Historically, BI systems relied
on database servers and intermediate caching that had a large dependency on disk subsystems and
were sized with an I/O orientation. HANA executes analytics in-memory, which makes the sizing analysis
very different: it is CPU-oriented.
No tool or document can replace human judgment. So while this document attempts to cover as many
aspects of the Sizing Exercise as possible, Sizing Experts at SAP should always be consulted.
These are the things you need to do before you start your sizing exercise:
How many active concurrent users need to be supported by the deployment? See Active
Concurrent Users for more information.
What types of users will be using each tool? Consumers (consumption workflow) and Analysts
(design workflow)? See User Class for more information.
Do you know how users will use Lumira? Do they require always up-to-date data or can they
operate with cached results? How users use the tools affects the workload they produce.
What types of data sources will your users access? Is your data located on a HANA server for
testing?
How will you measure user workflow impact on HANA? Will you measure the user workflow via
Lumira Desktop or Lumira, Server for HANA?
How many HANA machines are to be involved in the Lumira analytics? Do you have a cluster of
HANA machines with data distributed among the nodes?
When considering a small or medium system, it is recommended that you choose a medium system
when anticipating more than 25 active concurrent users. The deciding factor should also include the
complexity of your data. You should also choose a medium or larger system when expecting individual
data set sizes greater than 1GB or 2 million rows. These thresholds are summarized in the sketch below.
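A small decision sketch using the cut-offs given above; treat the output as a starting point, not a final answer:

    def recommended_system_size(active_concurrent_users, dataset_gb, dataset_rows):
        # Thresholds from the guidance above: more than 25 active concurrent
        # users, or individual data sets over 1GB or 2 million rows, call for
        # a medium or larger system.
        if (active_concurrent_users > 25
                or dataset_gb > 1
                or dataset_rows > 2_000_000):
            return "medium or larger"
        return "small"

    print(recommended_system_size(30, 0.5, 500_000))  # medium or larger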
Larger systems require sizing help from the HANA Center of Excellence or your SAP representative. The
sizing guidance methodology provided in this document is expected to be used in such an engagement,
along with HANA sizing resources.
The impact on HANA comes from the analytics you run. Lumira displays data and visualizations based on
your data, views, and so on. If you create calculated views that are very complex (e.g. involving JOIN
operations or calculated expressions), they will cause a lot of processing when displayed by multiple
Lumira users. If you have a number of users running Lumira, that query impact is multiplied by the
number of active concurrent users. That is where sizing becomes very important: you need to predict
the load that your users will create and make sure you have enough HANA capacity to accommodate it.
Unlike a traditional BI system, which leverages intermediate caching and a disk-based database system,
Lumira, Server for HANA is built natively in HANA; it pushes most of the calculations down to the index
servers and processes the queries in-memory. As a result, there may not be a substantial difference
between the load generated in the system by a consumer and by an analyst. Whenever a user performs
an action that changes the state of the current view, all the calculations are executed again to
dynamically provide the updated content.
In a multi-user scenario, queries are executed concurrently via multiple threads. In situations where a
complex query takes notable time to process, any subsequent queries may need to wait until the
complex query is processed if all threads / CPUs are not available. See the Tuning Options section below
for performance-related tips.
Note: Sizing Lumira, Server for HANA applies to both on-premise deployments and HANA Enterprise
Cloud (HEC) deployments.
Once the user requirements are defined, you can define a system that will achieve the required
amount of processing.
Assumptions
It is assumed that the memory requirements of the HANA machines have been determined and
appropriate deployment calculations for hosting that data have been made. See the section below
regarding basic HANA sizing.
Prerequisites
The goal of the sizing exercise is to calculate the peak load that will be placed on the system. In order to
proceed with the steps below, you need to know the following:
Users: How many active concurrent Consumers and Analysts will be using the system?
It is very important to know whether the common workflow of the users will include refreshing
documents and, if so, how frequently. That is an important part of the load prediction and thus the sizing
estimate. The use of query result caching in HANA is an important factor here: caching can significantly
reduce load on the system if it is acceptable and allowed in the business context, i.e., if it is fine
for users to see slightly outdated data (seconds or minutes old). Conversely, if all users have a unique
security context that prevents query sharing, then caching will not have much effect on performance.
The following diagram demonstrates how various user workflows may fit together to form the overall
load on a HANA system.
Average Query Response Time: You need to know the amount of time it takes to run representative
queries against your data. Average and/or 90th percentile response times need to be measured using
representative data sets from the customer.
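A sketch of computing both statistics from a set of measured response times; the timings below are placeholders, not benchmark data:

    import statistics

    # Hypothetical response times (seconds) for representative queries.
    response_times = [0.8, 1.2, 0.9, 2.5, 1.1, 0.7, 3.0, 1.4, 1.0, 1.6]

    average = statistics.mean(response_times)
    p90 = statistics.quantiles(response_times, n=10)[-1]  # 90th percentile

    print(f"Average: {average:.2f}s, 90th percentile: {p90:.2f}s")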
Tip: HANA Studio can be used to obtain query execution time. The best way to measure query
performance is with Lumira, Server for HANA, using user simulation scripts run by Apache JMeter
or HP LoadRunner. However, in the absence of Lumira, Server for HANA, prediction and simulation can
be done using Lumira Desktop connected to a HANA system.
Scripts should be written to simulate the expected workflows of users. User simulation is meant to
approximate the actions of users, incorporating the steps they take through the product. Be sure to
allow for think time at the appropriate places in the workflow. Once user workflows are modelled, the
goal is typically to instantiate multiple script sessions to simulate multiple users, as in the sketch below.
You then analyze the performance of the system as load is increased.
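The scripts themselves would normally be built in JMeter or LoadRunner as recommended above; purely to illustrate the shape of such a simulation, a minimal sketch (the workflow steps are placeholders, not real Lumira endpoints) might look like this:

    import random
    import threading
    import time

    THINK_TIME_SECONDS = 10  # average think time between on-screen actions

    def user_workflow(user_id):
        # Placeholder steps; a real script would replay the HTTP requests
        # recorded from an actual Lumira session.
        for step in ("open document", "refresh data", "edit visualization"):
            print(f"user {user_id}: {step}")
            # Randomize think time so simulated users do not act in lockstep.
            time.sleep(random.uniform(0.5, 1.5) * THINK_TIME_SECONDS)

    # Instantiate multiple sessions to simulate concurrent users, then
    # observe system performance as the session count is increased.
    sessions = [threading.Thread(target=user_workflow, args=(i,))
                for i in range(10)]
    for s in sessions:
        s.start()
    for s in sessions:
        s.join()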
The reason for this is that HANA performs the analytics and query calculations on the node that holds
the data. The result of your representative workflows may show one HANA machine doing more
processing than another in accordance with the data distribution and the analytics on it.
If a HANA node will not scale to the number of users you need, the main course of action is to
redistribute the data within your cluster to spread out the processing. You may have capacity within
your existing cluster, or you may need to acquire more HANA hardware to achieve your needs.
Sizing Calculation
Sizing Lumira, Server for HANA is done by factoring together the number of users, the user types and
their expected activity frequency, and the time required to execute an average query.
Query execution times are measured by turning on Query Tracing in HANA Studio to determine how
much time is spent querying data for a given step of a workflow.
For more information on Query Tracing, see the Performance Trace tool, documented in the HANA
Admin Guide.
For each step of a workflow, make sure to allow enough time for the completion of the step so that all
the data has been delivered to the browser. Waiting for one minute between steps is generally
recommended.
Note: do this for all HANA nodes in a cluster, then take the processing time from the most
loaded node. See the HANA Cluster Considerations section above for more information.
Divide the CPU processing time by the full user time for the workflow to get the percentage of
system load produced.
Note: calculate the full user time as the amount of time all the CPUs provide during the user
workflow time. E.g., if the workflow takes 60 seconds and the machine is a 10 CPU system, that
HANA system will have had 600 seconds of CPU time available during that time period.
Multiply the number of required active users by the system load percentage to determine the
full system impact of this workflow
If the load on the HANA system is nearing its limit, you need to consider changes to
accommodate the load (by obtaining more HANA resources, changing the workflows,
redistributing the data in a cluster, etc.).
For each active user of the system, 1/20th of a CPU should be included in the sizing calculation. The
memory impact of a user session is negligible for the purposes of sizing.
Step 4: Deployment
In this final step, sum the system impact of all the workflows to determine the number of HANA
machines required. For example, if the percentages sum to more than 100, additional HANA machines
will be needed to support the predicted user load.
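Putting the steps together, a worked sketch of the calculation with hypothetical measurements; the CPU time, core count, and user count below are illustrative inputs, not benchmark results:

    import math

    # Hypothetical measurements for one representative workflow.
    workflow_duration_s = 60   # wall-clock time of the traced workflow
    cpu_cores = 10             # cores on the most loaded HANA node
    cpu_time_used_s = 6.0      # CPU time consumed during the workflow

    # Full user time: total CPU-seconds available during the workflow.
    full_user_time_s = workflow_duration_s * cpu_cores            # 600

    # System load produced by one user running this workflow.
    load_per_user_pct = cpu_time_used_s / full_user_time_s * 100  # 1.0%

    # Multiply by the active concurrent users of this workflow, then add
    # 1/20th of a CPU per active user for session overhead.
    active_users = 50
    workflow_load_pct = load_per_user_pct * active_users          # 50%
    session_overhead_pct = (active_users / 20) / cpu_cores * 100  # 25%

    total_load_pct = workflow_load_pct + session_overhead_pct     # 75%
    machines_needed = math.ceil(total_load_pct / 100)             # 1

    print(f"Total load: {total_load_pct:.0f}% on {machines_needed} HANA machine(s)")

With multiple distinct workflows, compute a load percentage for each in the same way and sum them before determining the number of machines.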
Notes:
This sizing algorithm assumes that peak usage of the system is no worse than an evenly distributed
pattern of workflows, so that summing the percentages of system load across all workflows is valid. If
you anticipate greater peak load and want to ensure that it is supported in a responsive manner,
adjust your workflows, user think times, and number of active users accordingly.
HANA Studio is required to obtain the performance statistics. You can obtain HANA Studio here:
http://scn.sap.com/community/developer-center/hana
Find the tables node and open the table HOST_SERVICE_STATISTICS. Scrolling farther to the right,
locate the PROCESS_CPU_TIME column and make note of the CPU time at the start and end of the
time period.
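If you prefer to collect the same figures programmatically rather than through HANA Studio, a sketch using the hdbcli Python client follows; the connection details are placeholders, and the _SYS_STATISTICS schema location of this table is an assumption to verify against your HANA release:

    from hdbcli import dbapi  # SAP HANA Python database client

    # Placeholder connection details; substitute your host and credentials.
    conn = dbapi.connect(address="hana-host", port=30015,
                         user="MONITOR_USER", password="********")
    cursor = conn.cursor()

    # Read cumulative process CPU time per host; sample at the start and
    # end of the workflow and subtract to get the CPU time consumed.
    cursor.execute(
        "SELECT HOST, PROCESS_CPU_TIME "
        "FROM _SYS_STATISTICS.HOST_SERVICE_STATISTICS"
    )
    for host, cpu_time in cursor.fetchall():
        print(host, cpu_time)
    conn.close()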
An additional HANA Sizing resource on the SAP Community Network is the article, SAP HANA Sizing.