Mastering QlikView Sample Chapter
Stephen Redmond
Chapter No 7
"Visualizing Data"
Mastering QlikView
This is a book about mastery. But what does this mean? What does being a QlikView
master mean?
When I wrote QlikView for Developers Cookbook, Packt Publishing, I started the preface
with the sentence:
"There is no substitute for experience."
When it comes to QlikView, experience is the thing that makes a difference. Experience
is the difference between the developers who can create good applications and the
consultants who can create real business solutions that solve real business problems.
I have been working with QlikView since 2006, and in this time, I have created some
fantastic solutions. I also created applications that I cringe to look at today. I like to think
that I have mastered the subject, even though I am still learning.
At CapricornVentis, I work with one of the brightest bunches of consultants that it has ever
been my pleasure to work with. I get to teach a lot but I also get to learn a
tremendous amount from these guys. We are constantly pushing the boundaries of the
product to get to the right solution. As a beginner in this area, I would have wanted to
work for an organization like CapricornVentis, where I could really learn and grow as a
consultant.
Let's be clear; I do not know every little detail about QlikView, but I do know most of
them. What I think I know, and know really well, are the important things to know about
when creating QlikView solutions. This knowledge is what I have tried to distil down
into this book.
You won't be a master by just reading this book. As Alfred Korzybski famously stated:
"The map is not the territory."
This book is not an ultimate guide to mastery; rather, it is like a map that guides us towards
our common destination: to become a QlikView master. Study the map well and you
will get there.
Qlik Sense
During the development of this book, Qlik released their next generation product, Qlik
Sense. Qlik Sense is not, currently, a replacement product for QlikView, and Qlik has
announced that they will have a two-product strategy and sell QlikView for guided BI
applications and Qlik Sense for self-service BI applications. A new version, QlikView
12.0, is slated for release in the second half of 2015.
While Qlik Sense is a new product, it is built on the same heritage as QlikView. There is
a new data engine, QIX, that stores the data in a format more columnar than that of
QlikView. However, the inference engine is still the same (green, white, and gray). The
script syntax is still the same; in fact, we can use QlikView scripts in Qlik Sense. The
frontend is very different because it is based on a new web design, but the expression
syntax is still the same.
Therefore, much of what is written in this book about QlikView will still apply to Qlik
Sense. Anyone who masters QlikView will be well on their way to mastering Qlik Sense.
Visualizing Data
"The greatest value of a picture is when it forces us to notice what we never
expected to see."
- John Wilder Tukey, statistician and developer of the box plot
"The purpose of visualization is insight, not pictures."
- Ben Shneiderman, developer of the treemap
These two quotes are interesting in their juxtaposition. One tells us to draw pictures
that reveal the unexpected. The other tells us that the purpose of visualization is not
pictures but insight. If they were part of the same conversation, one might believe that
the two famous contributors to the area of data visualization were in a disagreement.
Of course, this is not true, and these statements were made at different times and in
different contexts. However, they could be part of the same conversation. One that
extols us to, yes, create pictures, but not just pretty pictures; pictures that deliver
insight, pictures that reveal the unexpected.
In this chapter, we are going to explore where data visualization has come from.
We will also look at the important things to understand about how humans work with
data, and this will lead us to some rules about how to present data most effectively.
These are the topics we'll cover in this chapter:
Analyzing geometry
The first uses of visualization to represent numbers come from the area of
analytical geometry: using some kind of coordinate system to either resolve or
create equations.
Grecian influences
The earliest uses can be traced back to before 300 BC in ancient Greece, during the
great era of philosophers, at a time when scholastic pursuits were encouraged.
Menaechmus (around 380 BC to 320 BC) was a Greek mathematician and friend of
Plato, who is credited with discovering the conic sections: the realization that shapes
like the ellipse and parabola were actually cross-sections of a cone. His methods of
proving his theorems had a strong resemblance to the use of coordinates.
Apollonius of Perga (around 262 BC to 190 BC) developed a method that is very
similar to those developed by more modern mathematicians. He cannot be fully
credited with the development of analytical geometry, because he was also working
on conics and his equations related to curves. He was able to come up with equations
of the motions of planets, and his work influenced other important mathematicians
such as Ptolemy.
Claudius Ptolemy (around 90 AD to 168 AD) created one of the first widely replicated
data visualizations when he created his Geographia. He collected as much data as he
could, transformed it using rules that he established himself, and created his famous
world maps.
French discord
One of the most interesting debates in mathematics is that of who really created
analytical geometry. The debate centers on two famous French mathematicians,
and history appears to have come down in favor of the publishing date.
René Descartes (1596 to 1650) is the historical winner:
René Descartes
This photograph is licensed under Public Domain via Wikimedia Commons and is
available at http://commons.wikimedia.org/wiki/File:Ren%C3%A9_Descartes.jpg#mediaviewer/File:Ren%C3%A9_Descartes.jpg.
Descartes is famous as being both a mathematician and philosopher. He coined
the often used phrase, "I think, therefore I am". He has also had the honor of
having his name applied to the coordinate system used in analytical geometry:
Cartesian coordinates.
Descartes published his essay, La Geometrie, in 1637. Interestingly, although he
reduced geometry down to arithmetic and algebra and he introduced the concepts
of the coordinate system that now bears his name, there are no equations actually
graphed in this work.
Pierre de Fermat
This photograph is licensed under Public Domain via Wikimedia Commons and is
available at http://commons.wikimedia.org/wiki/File:Pierre_de_Fermat.png#mediaviewer/File:Pierre_de_Fermat.png.
Pierre de Fermat had also written a work on analytical geometry that was apparently
circulating in Paris in manuscript form in 1637, prior to Descartes' publication
of La Geometrie. It is unlikely that Descartes was aware of this as he was living in
the Dutch Republic at the time. So, it appears that both came up with their ideas
independently. Descartes' work was actually published in 1637 (with a Latin translation
published in 1649), whereas de Fermat's manuscript was not published until 1679.
The main difference between the two works was a matter of perspective. Descartes'
techniques started with a curve and produced the equation of the curve. Pierre de
Fermat's techniques started with an equation and then described the curve. Because
of this, Descartes had to deal with more complex equations but this meant that he
developed methods to deal with higher degree polynomial equations.
His A New Chart of History and Chart of Biography might have been influenced by
an earlier chart created by Jacques Barbeu-Dubourg (1709 to 1779) in 1753 in Paris.
However, Priestley's charts were much simplified (Barbeu-Dubourg's chart was
54 feet long!) and easier to understand.
His charts were much admired, and along with his influential work in the area
of Chemistry, this led him to be nominated by his peers to become a member of
the Royal Society.
When creating his work, Commercial and Political Atlas, 1786, Playfair had 43
plates that showed these line charts of import and export from various countries
over the years. However, he had a problem. He also wanted to include the data
for Scotland but did not have all the data. So, he came up with a different solution;
he just showed one year's data for Scotland's 17 trading partners with two lines for
each that represented the imports and exports:
This photograph is licensed under Public Domain via Wikimedia Commons and is
available at http://commons.wikimedia.org/wiki/File:Playfair_Barchart.gif#mediaviewer/File:Playfair_Barchart.gif.
Of course, this is what we know today as a bar chart.
In his work, Statistical Breviary, 1801, Playfair introduced another new chart;
the pie chart:
This pie chart is taken from The Commercial and Political Atlas and Statistical Breviary,
Cambridge University Press.
This photograph is licensed under Public Domain via Wikimedia Commons and is
available at http://commons.wikimedia.org/wiki/File:Playfair-piechart.jpg#mediaviewer/File:Playfair-piechart.jpg.
What Playfair achieved was not just the creation of a new chart type, but the
use of charts to bring numbers to the public. From that time, the use of charts in
financial and statistical publications has become the norm.
Creating infographics
A retired French engineer, Charles Joseph Minard (1781 to 1870), created a
visualization that had a big impact on infographics.
Minard retired in 1851 and spent his retirement doing private research. In his
career as a civil engineer, he worked on road and bridge projects and used maps
extensively. After his retirement, he started to produce some data visualizations
that made use of maps to position the data geographically. For example, in 1858,
he created a visualization of the cattle being sold in Paris. The map showed a
pie chart on each region that the cattle were coming from, with the segments
breaking down the breed of the animals.
This map is taken from Des chiffres et des cartes: la cartographie quantitative au XIXe
siècle, Gilles Palsky, Paris: Comité des travaux historiques et scientifiques.
This photograph is licensed under Public Domain via Wikimedia Commons and is
available at http://commons.wikimedia.org/wiki/File:Minard-carte-viande-1858.png#mediaviewer/File:Minard-carte-viande-1858.png.
His most famous work was published in 1869. Minard combined his ideas around
mapping and engineering flow diagrams to show the results of Napoleon Bonaparte's
disastrous Russian campaign of 1812/1813. The beauty of this visualization was that
the entire campaign was described in one image and the reader required very little
effort to understand it:
This photograph is licensed under Public Domain via Wikimedia Commons and
is available at http://commons.wikimedia.org/wiki/File:Nightingale-mortality.jpg#mediaviewer/File:Nightingale-mortality.jpg.
The segments in this chart show the total deaths of servicemen in the British Army.
The red segments in the middle are deaths from wounds. The black segments are
"others". The larger blue segments are preventable deaths caused by infections.
She used this chart to make the case for better sanitation in hospitals.
The digital revolution brought data visualization to the masses. Anyone with a PC
and Microsoft Excel could now quickly create charts and share them with colleagues.
While everyone was doing what they wanted with these tools, the academic study
of the subject has been slow to catch up. However, we now have a rich amount of
information and research available and there are several leading thinkers in the area.
Edward Tufte
Edward Tufte is alive and well and still talking to the world about data visualization.
His 1983 book is still in print and widely available. You can follow Edward on
Twitter at @EdwardTufte.
Stephen Few
Stephen Few published his first book on data visualization, Show Me the Numbers,
back in 2004. This was at a time when there was a real lack of thought leadership
on the subject. He has since published two additional works: Information Dashboard
Design and Now You See It. Both Show Me the Numbers and Information Dashboard
Design have had second editions published in recent years. Stephen regularly
publishes blog posts and comments on his own website, www.perceptualedge.com.
Robert Kosara
Robert Kosara was a professor at the University of North Carolina at Charlotte before
taking a sabbatical year and joining Tableau Software, where he still works.
His blog, www.eagereyes.com, has been very popular for many years, and he also
appears at data visualization conferences and is a regular contributor to various media.
Robert can be followed on Twitter at @eagereyes.
Alberto Cairo
Alberto Cairo is a professor teaching visualization at the University of Miami.
His book, The Functional Art, is a bestseller on the topic. He has also taught the subject
on a Massive Open Online Course (MOOC). Alberto can be followed on Twitter
at @albertocairo.
Andy Kirk
Andy Kirk is a freelance data visualization specialist, designer, speaker, and
researcher. He is the author of Data Visualization: A Successful Design Process.
He delivers public training on the subject worldwide. His data visualization
website is www.visualisingdata.com and Andy tweets on Twitter at
@visualisingdata.
Mike Bostock
Mike Bostock has had a huge influence on the area of data visualization because he is
the founder of and chief contributor to the d3.js JavaScript library. This library allows
developers to create engaging web content from their data with very little coding.
The library can also be relatively easily used within Qlik extension objects.
Mike's day job is working for the New York Times as part of their award-winning
visualization team where they regularly push the boundaries of how we view data.
He has his own blog at bost.ocks.org and he tweets at @mbostock.
There are some rules that we need to know when dealing with humans. These are
based on sound psychological studies, and therefore, aren't always true! They are
good guidelines that apply to the majority of the population, but we really need
to know that you can't please all of the people all of the time.
Matching patterns
One of the things that humans really excel at is recognizing things that they have seen
before, or that look similar to things that they have seen before, and associating those things
with other similar things that they have experienced before. As we tend to share a lot
of cultural experiences, many of us will share the same generalizations. For example,
you might have seen this in your Facebook or e-mail in the recent past:
"Olny srmat poelpe can raed tihs.
I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The
phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde
Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny
iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be
a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn
mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Amzanig huh?
yaeh and I awlyas tghuhot slpeling was ipmorantt!"
In the first paragraph, the letters aren't really completely scrambled. They are close
enough to the originals for us to easily read the paragraph as we scan across them
and match the patterns to the words. In the second sentence, the letters are truly
scrambled, and we need to try and employ our anagram-solving skills to try and
understand the sentence: understanding language seldom encourages confusion.
The process of seeing patterns in things that might otherwise be considered random
is called apophenia. It is something that we do a lot because we are very good at it.
Imagine driving down the freeway and seeing a cloud ahead of you.
What do you see? Is it merely a collection of water droplets, floating on air currents?
Or is it a dragon, flying through the sky? It could be anything. To each of us, it is
whatever our brains make of it, whatever pattern we match.
We are fantastic at seeing these pictures. We have a large proportion of our brain
devoted to the whole area of visuals and matching patterns against our memory,
far bigger than for any other sense.
Counting numbers
We, as humans, don't have a very long history with numbers. This is because
for very long stretches of our evolution, we just didn't need number systems.
As hunter-gatherers, it was not necessary for us to count accurately. All we
needed to do was make estimations.
We can still see this today in surviving hunter-gatherer tribes such as the Warlpiri
in Australia and the Munduruku in the Amazon. Both tribes have words in their
languages for small numbers such as one, two, or three, but after that they either
have no words at all or have some words but are inconsistent in their use.
About 10,000 years ago, things started to change. Although there is evidence of
limited agriculture in surrounding areas, the real changes happened in and around
an area known as the Fertile Crescent (http://en.wikipedia.org/wiki/Fertile_Crescent),
an area sitting between the Nile Delta in the southwest, the Caspian
Sea in the northeast, the Black Sea in the northwest, and the Persian Gulf in the
southeast. The main rivers of this area, the Tigris, Euphrates, and Nile, created a large
area of fertile land, and agriculture and animal husbandry exploded. Man started
changing from hunter-gatherer to farmer and shepherd.
As we settled down, we started trading with each other. Suddenly, we came up
with a reason to count things! When we went to bed with one hundred sheep in
the field, it was important to know that there were one hundred sheep still there
in the morning.
Given that we have had up to a million years of evolution but only about 10,000 years
of counting, it might not be too far a stretch to say that most humans are not as
comfortable with numbers as they think.
Estimating numbers
Consider this figure:
Now, consider how you answered both those questions. I would suggest that most
people will look at the upper circle and immediately see two dots. However, when
most people look at the second circle, they will not immediately see eight dots. Instead,
they will often switch to breaking the number down, perhaps see three + two + three
(vertically), three + three + two (horizontally), or some other breakdown, and then add
those back up to get the number eight. Even for a relatively small number such
as eight, we still tend to break it down into smaller groups. So, how can we count this
number of dots?
Of course, we can't count these in one go. We could spend a minute counting
them one by one, although we still might not get the correct answer as the random
arrangement could lead to mistakes. Alternatively, we could just have a guess and
estimate the correct answer. Wouldn't that be good enough for most situations?
It would be especially good enough if our goal is just to answer the question of
which side has more dots:
If we have to answer the question with any sort of immediacy, we need to quickly
estimate and decide. Quite often though, we will get the right answer! Our brains
are actually very good at this estimation, and it comes from a time a long way before
numbers existed.
When deciding where to spend valuable energy to chase down or gather food, early
man would have had to make calculations on return on investment. All of these
would have been done by estimation: how many wildebeest are there in the herd,
how far away are they, and how many people are needed to hunt them down?
We still do this today! If we walk into a fast-food restaurant at lunchtime and
see eight long queues of people waiting to be served, we immediately start making
evaluations and estimations about which queue we should spend
our valuable time in to get the reward; otherwise, we estimate that it is not worth
spending time for that reward and we leave.
So, knowing that we are naturally good at estimation, how does this help when
we are working with numbers, something that we have, relatively, spent a lot
less time with?
It would appear that when it comes to numbers, we still perform estimations. When
we see two numbers beside each other, especially large numbers, our brains will make
estimates of the size of the numbers and create a ratio, though not always accurately.
Let's consider this famous set of numbers:
Just spend a minute perusing the numbers and see whether you can see anything
interesting in them.
They look reasonably similar. We might think about doing some analysis of the data
to see whether there is a major difference. Perhaps, we should average them:
Quite interestingly, it appears that each set of columns has the same average for the
X and Y values. Perhaps, we should look at the standard deviation:
Again, it appears that we have a very similar dataset indeed. Perhaps, we should
calculate the slope of the regression line for these numbers:
Statistically speaking, this is a remarkably similar dataset. I wonder how this dataset
would look if we actually graphed it:
Incredible! We have a dataset that looks quite similar on casual inspection, and even
more so when we apply common statistical functions, but when we graph it we can
see that it is completely different!
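The figures for this example have not survived here, so as a minimal sketch the following Python recomputes the same three statistics. It assumes that the "famous set of numbers" is Anscombe's quartet, which matches the description above; the variable names and the `slope` helper are illustrative only.

```python
from statistics import mean, stdev

# Anscombe's quartet (assumed to be the dataset the text describes).
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def slope(xs, ys):
    """Least-squares slope of the regression line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# For each set: (mean of X, mean of Y, std dev of X, std dev of Y, slope).
summary = {
    name: (mean(xs), mean(ys), stdev(xs), stdev(ys), slope(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, stats in summary.items():
    print(name, [round(value, 2) for value in stats])
```

Rounded to two decimal places, the four rows are nearly indistinguishable, which is exactly why the summary statistics alone cannot tell us what the graph does.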
Drawing conclusions
So, we know that humans are excellent pattern matchers. We can see patterns in
shapes and create stories from those patterns that match our experiences. However,
we are not really that great with numbers. We like to think that we are, but we often
fail to see patterns in sets of numbers.
We are quite comfortable with very small numbers, but even with slightly bigger
numbers, we will adopt a strategy of breaking them down to smaller parts to help
us understand.
We don't really get exact numbers unless we can directly experience them. For
example, for many of us, the phrase, "20 minutes", has no real meaning whereas
the phrase, "about 20 minutes", is immediately understood! This is because we
have no natural reference point for exactly how long a second or a minute is, let
alone 20 minutes, but we can reference our experience and understand exactly
how long about 20 minutes is.
So, if humans are not very good at dealing with exact numbers, what is the most
effective way of communicating with numbers? We cannot rely on people gaining
insight from a column of numbers in a spreadsheet. The only way to help us
understand numbers is to present them graphically and in context.
We really need to show people the numbers.
Understanding affordances
Donald Norman is a famous person in the area of design. Not in the area of data
visualization at all, but just the design of everyday things. In fact, one of his best
works is called The Design of Everyday Things.
Norman adopted a term that is now central in design: the idea of affordance.
Originally, an affordance meant all of the things that an item allows you to do with
it. For example, a table affords us many options: we can place things, sit, write,
or even dance on it! Some other items have very few affordances. For example, a
button on a screen has pretty much only one: you can click on it. However, Norman
had a narrower definition of this term: not all the things that are physically possible, but
the possibilities for action that will be immediately apparent to the person
using the item (we don't all immediately think of dancing on tables!). I like to call these
"unwritten rules": you look at something and just know what to do with it.
A classic example of this is known as a Donald Door because so many people
reference Donald Norman when discussing it. When we see a door that has a flat
panel on it, we don't even need to look for the word, "PUSH," above the panel,
because we know how to open that door. Similarly, when we see a door that has
a long vertical bar, our natural instinct is to grasp the bar and pull; this is the
unwritten rule. However, we are often stymied in our attempts to open such a
door until we realize that actually we need to push it. Here is an example:
This image was taken from the UX article, The Usability of Garda Doors which can be
found at http://iqcontent.com/blog/2007/01/.
Here, we see a door that users want to grab and pull, but they should grab and push.
It is clear, from the wear on the word, PUSH, on the panel, that regular users of this
door completely bypass the use of the handle and push against the panel instead.
They choose to do the natural thing and reject the unnatural.
As user interface designers, we should always think about how the user will actually
use our layouts. We might do things that cause minor irritations to users that become
major irritations over extended use. If we have difficulty getting into the mind of a
user, it is useful to engage with users and talk to them about how they like or dislike
using an interface.
Nielsen's F
Jakob Nielsen is the cofounder, along with the aforementioned Donald Norman, of
the Nielsen Norman Group, a major design consultancy. He has done a large amount
of work in the area of user experience and has created several usability methods.
One of his experiments was to use eye-tracking equipment to track how users viewed
websites. You can refer to http://www.nngroup.com/articles/f-shaped-pattern-reading-web-content/ for more information.
The interesting thing for us to take note of is that when users first look at a page
on the screen, their gaze is directed immediately to the top-left area of the screen.
They will spend some time here and across the top and then move down and to
the left again, but spend less time on the lower areas. The gaze pattern often looks
like the shape of the capital letter, F.
There might be a difference in other cultures; however, a learned response by web
users in those cultures might also cause them to look to the upper-left area first.
The important conclusion for us is that the upper-left area of the screen is the most
important real estate and should contain the most important information.
Similar to Nielsen's F pattern, the upper-left area (1) is the primary area in which
information is inserted. Unlike the F pattern, the fallow areas are taken in, but only if
the user has a level of interest in the content. In this model, the lower-right area (4) is
actually an important area because it is the area where the gaze pattern will terminate.
This technique has been used by newspapers and magazines and latterly for web
designs, and it has been shown that the terminal area is the correct area to place
items where the user might take action. In this model, the lower-left area (3) is the
least important.
Every time the user has to make a new selection, their hand covers most of the
screen, so they have to then move their hand out of the way to see the effect of
the change. Now, what if the listboxes are on the right-hand side?
Now, the user can make selections and see the changes as they happen without having
to move their hand. This is a preferable situation.
Dates on top
Because of their nature, field values, such as year and month, lend themselves to
being rendered as horizontal listboxes. These are quite often rendered across the
top of the screen in QlikView applications.
It appears that this is acceptable because users will accept a certain amount of header
and navigation elements across the top of the screen and the date filters, as they are
horizontal across the top, become a part of this.
We can use this design grid to help implement a grid baseline design such as those
recommended by many web designers. By setting the Line Distance and Snap Step
values appropriately and then following the rule of always keeping objects one snap
step from the edge of the grid, we can achieve a clean and consistent layout with
regular spacing between objects:
Thinking quantitatively
There is an excellent show on BBC Radio 4 called More Or Less
(http://www.bbc.co.uk/radio4/moreorless) that takes statistics that have been presented in the media
and explains or debunks them. For a show about numbers, it is quite amusing and
worth listening to. Their podcasts can be downloaded worldwide.
They have a concept on the show of "big numbers". A big number is a number
that sounds quite big, is usually quite round, but is usually presented without any
additional context. It is the kind of number that headline writers love to use and that
the More Or Less team love to debunk. There are examples of them everywhere.
The important thing to do with big numbers or any number is to put it in context.
Otherwise, we face the prospect of the dreaded SFW question.
Of course, 2,000 is a big number. The news article gave zero context to the number.
We don't know what types of cases they were. We don't know how many cases were
pursued and what the total was. Is 2,000 more or less than in other
years, and how does it compare with the crime rate in those years?
This was the lead story and presented as a major scandal. The reality is that we
should say: 2,000 cases? So what?
Every number that is presented in a Qlik application should have some context.
Whether that is a breakdown by category in a bar chart, a ratio versus a target
or previous period, or a trend over several periods, it is vital to give the users
information about where that number sits. Otherwise, the users will just be
asking, so what?
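As a rough illustration of what giving a number context involves, the following Python sketch places a headline figure alongside a total and a previous period. Only the 2,000 comes from the story above; the other figures, the variable names, and the `with_context` function are hypothetical. It is not a Qlik expression, just the arithmetic that such an expression would need to perform.

```python
def with_context(value, total, previous):
    """Return (share of total, relative change versus the previous period)."""
    share = value / total
    change = (value - previous) / previous
    return share, change


# Only the 2,000 comes from the news story; the other figures are made up.
headline_cases = 2_000
total_cases = 40_000          # hypothetical total for the same year
previous_year_cases = 2_500   # hypothetical value for the previous year

share, change = with_context(headline_cases, total_cases, previous_year_cases)
print(f"{headline_cases:,} cases: {share:.1%} of the total, "
      f"{change:+.1%} versus the previous year")
```

With these made-up inputs, the bare 2,000 becomes 5 percent of the total and a 20 percent fall on the previous year, which is a very different story from the headline.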
Designing dashboards
For me, the ultimate design of a dashboard is the one sitting in front of me when
I drive my car. It gives me all of the information that I need to know to be able to
manage my progress along the road.
It contains indicators vital to the current situation of the car: my speed, the engine
oil temperature, and the engine RPMs. It doesn't have any information about the
speed I was driving at the same time last week; this isn't important for me to drive
the car along the motorway right now. I don't have an indicator for the amount of
oil that I have in my engine, but this is something that I can find out by opening the
hood and checking the dipstick; something that I should do on a semiregular basis,
but not something that I need to know when I am behind the wheel.
For a business, the dashboard should be designed along the same lines. It should
only show the information needed for the users to understand what is happening
right now in their business: their key performance indicators. There should be very
limited selectability, if any, on a dashboard and there should be no date selectors.
The dashboard shouldn't show us the KPI value last week; it should show it right
now. We can provide analysis sheets for users to investigate values at different
time periods if that is important to them.
Choosing charts
When picking the chart to display numbers, there is often a balance to achieve
between effective visualization and attractive visualization. Users will appreciate
an attractive display and will get the most information out of an effective chart.
Luckily, with QlikView and Qlik Sense, we can usually achieve both.
Categorical comparison
For a normal day-to-day comparison between different values in a category, it
is hard to beat a bar chart for simplicity and accuracy. Humans appear to have a
very good ability to discern differences in length, even if this is just a very small
difference. Bar charts beautifully encode their values by their lengths, so we can
quickly see the differences between the different categories:
Bar charts are also very effective when comparing two measures across category
values. It is important that the measures being compared are of similar magnitude,
or are at least expected to be (for example, budget versus actual), so that
they can share a common axis and therefore be comparable by length. We should also
be careful when we put two bar charts side by side: the different axis scales could
confuse users and lead them to the wrong conclusion. For example,
does black tea have as much caffeine as brewed coffee?
One of the things that we need to be aware of is that bar charts, as they encode
their values in their length, must always begin their axis at zero. If we are tempted
to change this, perhaps because the data doesn't look good, we are actually not
telling the truth about our data!
Back in 2007, the Quaker Oats company was making some quite interesting
claims about oatmeal and its effect on cholesterol in the body. They marketed
this with a graph showing the effect of consumption of oatmeal on cholesterol
over a four-week period:
At first glance, it appears that we should rush out and buy oatmeal! But wait!
We should notice that the axis here starts at about 195, not zero. How would it
look if we redrew it with a zero-based axis?
Now we see that the change is not quite so drastic! The company was later forced to
remove the graph, along with some of the other more exaggerated claims.
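To put a number on how much a truncated axis exaggerates a change, we can compare the ratio of the drawn bar lengths with and without the truncation. This is only a minimal sketch: the function name `drawn_ratio` and the two readings are hypothetical and are not the figures from the original advertisement; only the axis start of about 195 comes from the example above.

```python
def drawn_ratio(first, last, axis_start=0.0):
    """Ratio of the drawn bar lengths when the axis begins at axis_start."""
    if min(first, last) <= axis_start:
        raise ValueError("axis_start must be below both values")
    return (first - axis_start) / (last - axis_start)


# Hypothetical readings, chosen only to illustrate the effect.
week_0, week_4 = 210.0, 200.0

true_ratio = drawn_ratio(week_0, week_4)               # zero-based axis
truncated_ratio = drawn_ratio(week_0, week_4, 195.0)   # axis starting at 195

print(f"zero-based axis: first bar is {true_ratio:.2f}x the last")
print(f"axis from 195:   first bar is {truncated_ratio:.2f}x the last")
```

With these made-up values, a difference of about five percent is drawn as a bar three times as long, which is exactly the kind of distortion that a zero-based axis prevents.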
Trend analysis
When looking at patterns of change over time, there is no better chart than the simple
line graph. While a bar chart allows us to focus on the difference between individual
bars, a line graph is all about the shape of the line: peaks and troughs:
By adding additional expressions, usually an average along with control lines based
on standard deviations, we can create a statistical control chart to look out for times
where peaks and troughs are not just normal variation:
The rules about zero on the axis that we have for bar charts do not have to be applied
to line charts. This is because the important thing about line charts is the shape of the
line, and we might need to change the axis bounds to properly see that.
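The control lines described above can be sketched outside QlikView to make the arithmetic clear. This is an illustration under stated assumptions, not the chart expressions themselves: the function names, the sample data, and the choice of the sample standard deviation are all ours, and the number of standard deviations is a parameter because different control schemes use different widths.

```python
from statistics import mean, stdev


def control_limits(values, sigmas=3.0):
    """Return (lower, centre, upper) control lines for a series of values."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    return centre - sigmas * spread, centre, centre + sigmas * spread


def out_of_control(values, sigmas=3.0):
    """Return the positions of values that fall outside the control lines."""
    lower, _, upper = control_limits(values, sigmas)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical weekly values with one unusual peak.
weekly = [10, 12, 11, 13, 12, 11, 30, 12, 10, 11]
print(control_limits(weekly, sigmas=2.0))
print(out_of_control(weekly, sigmas=2.0))
```

In a chart, the centre and the two limits would each be an additional expression drawn as a flat line, so that any point of the main line crossing them stands out immediately.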
Comparing measures
When we are comparing measures, we can, of course, use a bar chart to juxtapose
one measure against another. However, this does not reveal whether there is any
correlation between the two measures, to see whether one measure appears to be
a driver for another. For this purpose, a scatter chart is the best choice:
As well as being able to see correlations, we can also spot outliers. We can also interact
with the chart and zoom in on areas of interest.
It can also be useful to be able to set the size of each of the bubbles based on a different
measure. We can also define the color of each bubble based on yet another measure.
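A scatter chart lets us judge correlation by eye, and it can help to know the number behind that impression. The sketch below computes the Pearson correlation coefficient, which is one common measure of linear correlation; the function name `pearson` and the sample values are hypothetical, and a strong coefficient does not by itself show that one measure drives the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Hypothetical measures for a handful of customers.
visits = [1, 2, 3, 4, 5]
sales = [12, 19, 33, 38, 52]
print(round(pearson(visits, sales), 3))
```

Values close to 1 or -1 correspond to points that sit near a straight line on the scatter chart, while values near 0 correspond to a shapeless cloud.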
Pie charts are all about ratio comparison. We are trying to compare a segment with
the whole of the circle. We should not be using a pie chart to compare one segment
with another; that task is much better served by a bar chart. Consider this example:
In this example, it becomes hard to separate the different segments from each other.
It can be argued that the legend on the right-hand side delivers more information
than the pie. We might also consider whether all regions are represented on the
chart; if not, then this is not a valid part-to-whole comparison.
A pie chart should really have a low number of segments (low cardinality) so that
a user can focus on the part-to-whole comparison. Ideally, this should be just one
segment versus the whole. We should also be sure that the whole does represent
the whole and not just a selected part.
Of course, QlikView and Qlik Sense are interactive, so they do give interesting
information when we hover over the segments, and we can add additional information
to that pop up using a pop-up text expression. We can also click to make a selection
that gives additional information.
It can also be interesting to do a true part-to-whole comparison by having one segment
representing the currently selected values and the whole showing all values:
Recently, I wrote a blog post on key performance indicator approaches that included
a proposed new KPI visualization called Pie-Gauge, which can be found at http://
www.qliktips.com/2013/12/key-performance-indicator-approaches.html.
Pie-Gauge is an interesting use of pie charts. It is a part-to-whole but the whole
depends on whether we have exceeded the target or not. If not, the whole is the
target value and we have a segment representing the shortfall. If we have exceeded
the target, then the whole is the actual value and we have a segment representing
the amount by which we have exceeded the target:
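The Pie-Gauge rule described above can be written down in a few lines. This is a minimal sketch of the logic only, not the chart expressions from the blog post; the function name `pie_gauge_segments` and the segment labels are hypothetical.

```python
def pie_gauge_segments(actual, target):
    """Return the two (label, value) segments of a Pie-Gauge.

    The segments always add up to the larger of actual and target,
    which is the "whole" of the pie.
    """
    if actual < 0 or target <= 0:
        raise ValueError("actual must be >= 0 and target must be > 0")
    if actual >= target:
        # Target met: the whole is the actual value, one segment is the excess.
        return [("target", target), ("excess", actual - target)]
    # Target missed: the whole is the target, one segment is the shortfall.
    return [("actual", actual), ("shortfall", target - actual)]


print(pie_gauge_segments(80, 100))   # target missed
print(pie_gauge_segments(120, 100))  # target exceeded
```

Keeping the two cases in one function makes it clear that the chart always has exactly two segments, so it stays a true part-to-whole comparison in both situations.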
Tabular information
The straight table is a very powerful tool to represent actual numbers. In general, it
will be used to show several calculations against a single dimension. However,
we know that raw numbers are not always processed well by humans, so we can add
additional graphical elements to aid understanding:
We can see two uses of a horizontal gauge here: one with an indicator and one using
the Fill to Value setting to represent a bar chart. We also have a sparkline, which
is a mini line chart that shows just the trend of a value over a period
without showing magnitudes. We also see a whisker chart here that shows, over time,
values above or below a reference value, in this case, budget.
Another visual that we can add in straight tables is setting the color of the text to
indicate positive or negative results.
Using color
It is good to use color in charts, but it is important to consider how we are going to
use it and what we are going to do with it.
The chart with only one color gives the same information as the chart with multiple colors.
In fact, it can be argued that the chart with all the colors might actually add some
confusion to what should be a simple chart.
We should, perhaps, learn a lesson from nature. Things that stand out from the
background can be seen. Things that stand out more than other objects will be
noticed even more. However, if everything is standing out, then nothing will
come to the forefront of our attention.
If we use softer colors for most of our bars, with a plain white background, then
we can see those bars very well. If we need one of the bars to stand out, because
it needs some action, then we can give just that bar a stronger color to
attract attention:
Anything that is okay should probably have no coloring at all. This means that other
areas are easier to find. Anything that is bad can remain red and should be a call to
action to have users click and discover.
But what about amber? We really need to think about this. Do we want users to click
and discover? If so, then perhaps it should be red. If not, perhaps it should have no
color at all.
So, instead of RAG, perhaps we should be implementing R.
We have some common elements here: background color, striped rows, and
grid lines. By making a few tweaks to the Style tab of this chart, we can clean
up superfluous pixels:
We can see that the background color has been removed completely. The vertical
grid lines have also been removed as the white areas in between the columns act
as very effective separators. The horizontal grid lines have been left but are now
almost transparent; they serve as effective guidelines, but are not impactful in the
cleaner display.
It is not just in tables that we should keep things clean. In bar charts, there are
options to have backgrounds on the display area and lines around the bars:
Color blindness
Color blindness affects up to eight or nine percent of the male
population. It is predominantly a male issue, as female color blindness is much
rarer, and in the majority of cases it is red and green that are hard to tell apart.
Of course, when we consider things such as RAG beacons on dashboards, we can
see that there might be problems for quite a large number of people in even seeing
data. We should really be aware of this and consider the colors that we choose for
different purposes.
A great resource for color selection is the Color Brewer website:
http://www.colorbrewer2.org.
This website suggests color ranges that we can use, including color-blind-safe
selections.
In general, we should avoid juxtaposing green and red. If we are using diverging
hues, we should not use green and red and instead use blue along with either green
or red. This gives most people the greatest chance of seeing the data.
Using maps
A lot of data that we deal with might have a spatial component. This could be a
post code or address that can be geocoded, or we might already have latitude and
longitude information. Just because we have this, it doesn't mean that we need to
plot the information on a map!
While the data might have a spatial component, it usually doesn't have a spatial
dependency; it doesn't really matter to our analysis exactly where the data occurred.
In these cases, a map is just a pretty display, while a bar chart is a better option.
Quite often, people use colored areas on a map to indicate information. This is
known as a choropleth, the classic example being used with US election polls
and results:
US election results
If we look at this map, with large swathes of red, we might be surprised that Obama
won the election! The problem is that quite a lot of the land area of the US, especially
in the Midwest, has a low population, and so contributes fewer votes to the overall result.
The New York Times came up with a novel approach to solving this: resizing the
states based on electoral vote size:
electoral-map/.
One other issue that we must consider with the use of maps is the general level of
education. Several studies have shown that a percentage of the population is unable
to correctly identify states and countries on a map. Consider whether a simple bar
chart would be more appropriate.
Summary
This has been quite an interesting chapter because a lot of the content wasn't true!
We can have some confidence about the historical information. We found out that the
beginnings of data visualization started with mathematics and analytical geometry.
Once the use of charts was established, largely by Joseph Priestley and then
by William Playfair, their use became more and more commonplace as useful ways
of telling stories with data.
We should have a better understanding of our audience. Of course, this can only apply
to most of our audience because there are always outliers.
Design guidelines are never set in stone. What is correct in design today will have
changed tomorrow; just look at the iPhone. However, fundamentals will not change
and we should be aware of them.
We have reached the end of the road for this book. By now, hopefully, your Qlik
education will have advanced towards mastery. Of course, you will not become a
master until you start to implement these practices and even create your own best
practice. Let's hope that this book is a good foundation.