b52 Bigdata
changed today’s data landscape. In addition, the growing number of sensor technologies, embedded in devices of all kinds, and the widespread use of the internet in daily life, has expanded exponentially the amount of information available and the different ways in which it is collected (Figure 1).

Even though there might be no universally accepted definition of Big Data, the term generally refers to the increased complexities in the use of data.[3] More precisely, it can be described as the product of the following elements.

Volume: a vast amount of data is generated every second. It has been estimated that every minute of the day, Google receives over 4,000,000 search queries, Facebook users share 2,460,000 pieces of content and YouTube users upload 72 hours of new videos.

Variety: many different types of data are now available. New technologies can analyse data from different sources such as messages, photos or video recordings, social media or sensor data. Data collected through all these sources can either be categorised as ‘structured’
1 A Pols Data is the New Oil, Privacy is the New Green (12/06/2015)
2 TechWeek Gartner: Big Data Could Put Your Business At Risk (07/10/2015)
3 The Wall Street Journal Big Data's Big Problem: Little Talent (29/04/2012). Despite the great popularity of ‘Big Data’ in the business world, the
author of this article points out that a widespread lack of understanding of the issue still persists.
4 Aci The Data Explosion in 2014 Minute by Minute – Infographic (12/07/2014)
5 B Marr Big Data: Using Smart Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance Wiley (01/02/15)
24 Greencoat Place, London SW1P 1BE • t +44 (0) 20 7798 6040 • e info@ibe.org.uk • www.ibe.org.uk • Charity No. 1084014
Business Ethics and Big Data
Issue 52 | June 2016
(easily quantified and organised for systematic analysis e.g. credit card transactions), or ‘unstructured’ (harder to analyse in an automated way e.g. video, blog posts, social media content).[6]

Velocity: the speed at which new data is generated and circulated is accelerating. In some cases, technology can analyse the data while it is being generated, without ever putting it into databases.

2. INTERNET OF THINGS (IoT)
The development of the IoT is increasing the volume of data collected, the velocity of the process and the variety of sources. It describes the ability of devices to communicate with each other using embedded sensors that are linked through wired and wireless networks. These devices could include appliances in everyday use (e.g. mobile phones or a thermostat), vehicles such as cars, or new ‘wearable’ technologies like the smartwatch. These connected devices, which generally assist people in their daily lives, use the internet to transmit, collect and analyse data. Each of these actions leaves digital traces, which are aggregated to form the bulk of Big Data.[7]

3. ALGORITHMS
An algorithm is a series of predefined programming instructions for a computer to solve sequentially a recurrent problem, especially one involving calculations and the processing of data.[8] The data collected is often processed using algorithms that can identify useful patterns within the vast amount of information available. As the complexity of datasets increases, so does the importance of applying an appropriate and flawless algorithm able to extract reliable information.

Big Data for Business: potential and concerns

“Making the world more intelligent, identifying patterns undetected before, taking decisions not based on the limited experts’ knowledge but on the huge mass of data from the inscrutable reality. This is the promise of Big Data.”[9]

In a time of digital revolution, information represents a valuable asset. Thanks to rapid developments in technologies, companies worldwide have increasingly come to rely on Big Data in order to solve problems, develop new strategies and target their messages and products. Recognising the importance of this, some companies have created a specific role to link Big Data with their organisation’s strategy.[10]

Moreover, it is increasingly acknowledged that the availability and use of such datasets can have benefits for both customers and society as a whole. Box 1 gives some examples of the potential positive impact of Big Data.

On the other hand, the public has developed a greater awareness and sensitivity towards the topic, driven by the increasingly prominent role of the IoT in everyone’s lives. The IoT, in particular, has made it possible for companies to collect data in ways that might not be fully understood by users (e.g. from mobile phone calls or public transport travel passes). As a result, some feel constantly under the scrutiny of a ‘Big Brother’ that serves the economic interests of businesses and over which they have little or no control.[11] This perception has produced what has been labelled as a ‘data trust deficit’: research shows that public trust in companies to use data appropriately is lower than trust generally. This can negatively affect the reputation of companies or whole industries, with media, internet, telecommunication and insurance companies being particularly affected.[12]
6 ‘Reinventing society in the wake of Big Data. A conversation with Alex (Sandy) Pentland’ Edge (30/08/2012).
7 The Guardian Defining the internet of things – time to focus on the data (06/11/2014)
8 U Shafaque et al. Algorithm and Approaches to Handle Big Data International Journal of Computer Applications (2014)
9 R Smolan and C Kucklick Der vermessene Mensch GEO (2013) p.85
10 As an example, Vodafone has advertised the position of Head of Big Data for Business Development, whose purpose is to “Develop,
commercialise, and continuously improve a competitively advantaged Big Data strategy, driving adoption, innovation, and business outcomes
throughout the operating countries. This includes the internal value realisation and external monetisation Big Data across Commercial and Enterprise
businesses.”
11 BBC When Big Data becomes Big Brother (05/06/2015)
12 Royal Statistical Society Public attitudes to the use and sharing of their data: research for the Royal Statistical Society by Ipsos MORI (2014)
13 The online portal developed by GSK is available at the following address: DataRequest.com.
14 Rob Frost presented the project at the conference Data for Humanity, held on 11 May 2015 by The Crowd, a platform for the business community
to share ideas and innovations.
15 L Taylor, R Schroeder Is bigger better? The emergence of Big Data as a tool for international development policy Springer Science (2014)
16 For more information, see Challenge 4 Development and IBE Data for Development Senegal: Report of the External Review Panel (April 2015).
17 The Report of the High-Level Panel of Eminent Persons on the Post-2015 Development Agenda A new global partnership: eradicate poverty and
transform economies through sustainable development (30/05/2013). See p. 23.
Box 2 How to protect human rights in the ‘Era of Big Data’.[18]

Stop High-Tech Profiling. New surveillance tools and data gathering techniques that can assemble detailed information about any person or group create a heightened risk of profiling and discrimination. Clear limitations and robust audit mechanisms are necessary to make sure that if these tools are used it is in a responsible and equitable way.

Ensure fairness in Automated Decisions. Computerised decision-making in areas such as employment, health, education, and lending must be judged by its impact on real people, must operate fairly for all communities, and in particular must protect the interests of those that are disadvantaged. Independent review and other remedies may be necessary to assure that a system works fairly.

Respect the Law. Laws and regulations need to be

be committed to respect it. However, its application poses some concerns.[19] When customers use particular services provided by a company, they are required to trust the organisation with their data, but often they have little insight into how information about them is being collected, analysed and used.

Once a dataset is shared, for instance, it is very hard to control its diffusion and to predict how it will be used. Different datasets could be merged and linked together, for example, providing extremely detailed information about individuals or groups. This introduces ethical grey areas around privacy.

Another important issue is data anonymisation. This is an important tool to protect privacy as it involves either encrypting or removing personally identifiable information from datasets. While there are efforts underway to ensure anonymisation is effective, there is still room for improvement and often the anonymisation cannot be guaranteed.[20]
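The two points above, that anonymisation removes or encrypts personal identifiers and that linked datasets can still reveal individuals, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the sample records, the `SECRET_KEY` value and the helper functions `pseudonymise` and `link` are hypothetical and do not come from the Briefing or from any cited source. The sketch replaces direct identifiers with a keyed hash, then shows how matching on remaining attributes (postcode, birth year, sex) against a second dataset can single out one person.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept apart from the data.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(record, direct_identifiers=("name", "email")):
    """Drop direct identifiers and add a keyed-hash pseudonym in their place."""
    token = hmac.new(
        SECRET_KEY, record["email"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["pseudonym"] = token[:16]
    return cleaned


def link(released, public, keys=("postcode", "birth_year", "sex")):
    """Match released rows to a second dataset on shared attributes."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for row in released:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches[row["pseudonym"]] = candidates[0]
    return matches


# Invented sample data, for illustration only.
customers = [
    {"name": "A. Example", "email": "a@example.org", "postcode": "SW1P",
     "birth_year": 1980, "sex": "F", "purchase": "vitamins"},
    {"name": "B. Sample", "email": "b@example.org", "postcode": "N1",
     "birth_year": 1975, "sex": "M", "purchase": "books"},
]
public_register = [
    {"name": "A. Example", "postcode": "SW1P", "birth_year": 1980, "sex": "F"},
    {"name": "B. Sample", "postcode": "N1", "birth_year": 1975, "sex": "M"},
    {"name": "C. Other", "postcode": "N1", "birth_year": 1975, "sex": "M"},
]

released = [pseudonymise(c) for c in customers]
reidentified = link(released, public_register)
```

In this invented example the first customer is re-identified because no one else in the second dataset shares her combination of attributes, while the second customer is not, because two people share his. This is one reason why removing names alone may not guarantee anonymity.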
18 The Leadership Conference on Civil and Human Rights Civil Rights Principles for the Era of Big Data
19 Royal Statistical Society Public attitudes to the use and sharing of their data: research for the Royal Statistical Society by Ipsos MORI (2014). The Royal Statistical Society sees real potential benefits from data-sharing; however, there are also public concerns about the use of data.
20 J Sedayao et al. Making Big Data, Privacy, and Anonymization work together in the Enterprise: Experiences and Issues Conference Paper: 2014
3rd International Congress on Big Data (Big Data Congress) (01/07/2014)
21 Frank Pasquale The Black Box Society: The Secret Algorithm Behind Money and Information Harvard University Press (2014)
unaware of her pregnancy and who found out as a consequence of Target’s approach. The company declined to comment on the specific situation, but numerous questions were raised about Target’s conduct.[22]

Group Privacy
The issue of group privacy is also of concern. When used to analyse large groups of people, the information that Big Data can reveal may be hugely beneficial. Examples include the possibility of tracking the spread of a disease more quickly, or bringing relief to a disaster zone more effectively.

However, there can also be downsides which require consideration, especially when operating in countries with limited regulation and potentially weak government. Datasets could easily be acquired by companies with ethically questionable marketing strategies, or political groups wanting to use the information to target specific sets of people.[23]

These privacy issues can only be magnified by the spread of the IoT particularly in low and middle income countries, which are generally less technologically advanced and might have less reliable privacy protection systems. This may particularly be the case in Africa, which has seen an exponential rise in the use of digital communication technologies and especially of mobile phones as users have embraced mobile communications to overcome a weak or non-existent landline infrastructure.[24]

Data security
A critical issue closely linked with privacy is the security of personal data and how companies make sure that their databases are protected from unauthorised users. Appropriate security mechanisms are essential to promote trust in business: customers and other stakeholder groups need to be assured that the information they provide is safely and confidentially stored. It is worth noting that threats to data security might be both external and internal, and the risk of misuse by employees of the company’s information should not be underestimated.

In the past few years, this topic has come to public attention with some well publicised cases of data security violation which have shown the significant impact of corporate data breaches on individuals. For example, in July 2015, a group called ‘The Impact Team’ hacked the database of Ashley Madison, a dating website for extramarital affairs. The group copied personal information about the site's user base, including real names, home addresses, search history and credit card transaction records, and threatened to release users' names and personally identifying information if Ashley Madison was not immediately shut down. Although this cyber attack was aimed at preventing what were considered ethically questionable activities, it was a violation of people’s right to privacy and the company was accused of not taking data protection seriously.[25]

INFORMED CONSENT AND OPENNESS WITH INFORMATION
How informed consent to process personal information is obtained from users is another critical issue. Traditional methods of data collection require the explicit consent of respondents, stating clearly the purpose and objectives of the data collection. The advent of the IoT has challenged this approach, blurring the borders of what can be considered informed consent to the use of personal data.

In the UK the ability of organisations and researchers to use such data is limited by the Data Protection Act 1998. This states that consent must be obtained from individuals before their data can be used for research or commercial purposes.[26] The primary method for obtaining consent, especially on social media platforms, is by asking users to agree to terms and conditions when they register to use the service. However, research shows that “Signing social media platforms'
22 The New York Times How Companies Learn Your Secrets (16/02/2012)
23 This issue emerged as a particularly material one within the project “Data for Development”, promoted by the telecommunication company
Orange. An example that emerged from the project is related to the risk that armed groups might use the information available to plan their military
strategy. For further information, see IBE Data for Development Senegal: Report of the External Review Panel (April 2015).
24 The Guardian Internet use on mobile phones in Africa predicted to increase 20-fold (05/06/2014)
25 Business Insider Extramarital affair website Ashley Madison has been hacked and attackers are threatening to leak data online (20/07/2015).
According to the BBC, the release of such sensitive information had strong consequences for some of the people involved and at least two people
are believed to have committed suicide as a result.
26 Data Protection Act (1998)
terms and conditions does not necessarily correlate to informed consent, as research has shown that users sign these complicated documents without reading them in order to open their accounts”.[27]

A recent example of this issue around consent is provided by the social experiment that Facebook undertook on 700,000 of its users to determine whether the company could alter their emotional state. The experiment was carried out without informing these users and was designed to assess whether more positive or negative comments in a Facebook newsfeed would impact how they updated their own page. Facebook used an algorithm to filter content. Researchers found those shown more negative comments responded by posting more negative comments and vice versa. The company stated that the experiment was legitimate as all users had to tick a box agreeing to their terms and conditions, including consenting to "internal operations, including troubleshooting, data analysis, testing, research and service improvement". However, the criticism sparked by the initiative forced Facebook to apologise and to produce more transparent guidelines on the matter.[28]

FAIR TREATMENT OF STAKEHOLDERS AND INTEGRITY OF BIG DATA
The reliability of Big Data and the ability of algorithms to generate valid conclusions are matters of debate. Whereas traditional statistical methodologies rely on samples that are chosen to be representative of the whole population being analysed, the new datasets produced by Big Data might not be statistically accurate and therefore could produce flawed results. For this reason, a fourth ‘V’ of Big Data is often added to Volume, Variety and Velocity, listed at the beginning of this Briefing: it is Veracity, which refers to the trustworthiness and integrity of data.

When the veracity of a dataset can’t be guaranteed, significant issues might arise. Some individuals or groups might accidentally be accorded more visibility and thus be favoured, or discriminated against, at the expense of those less visible. The access and the ability to use new information and technologies vary between individuals, communities and countries. Such disparities, often known as the ‘digital divide’, might produce inequalities in opportunities and outcomes.[29]

The city of Boston faced this type of issue when implementing ‘Street Bump’, a mobile application that used a smartphone accelerometer and GPS feed to collect data about road conditions such as potholes. The public were encouraged to download and use the app to report potholes to the city’s Public Works Department. However, as some groups of people (e.g. individuals on low income and the elderly) were less likely to own a smartphone or download the app, information from these groups was not being recorded. Repair services were therefore being concentrated in wealthier neighbourhoods. The city of Boston solved the problem by completing the dataset using other sources not subject to this bias, such as reports from city-roads inspectors and other more traditional channels.[30]

The Ethics Test

QUESTIONS FOR ETHICS AND COMPLIANCE PRACTITIONERS
Given the increasing importance these issues have for business, more structured forms of governance of Big Data appear necessary. During a workshop organised by the Royal Statistical Society in November 2015 to discuss the opportunities and ethical risks of Big Data, participants stressed the need for data governance to minimise harm and maximise benefits from the use of Big Data. They also emphasised the inclusion of considerations of risk and risk management. It was also pointed out that legal terms and conditions are more geared toward resolving corporate liability than addressing public understanding.[31]

These observations highlight potential new opportunities for the Ethics Function in companies, which could hold an oversight responsibility on the ethical aspects of Big Data collection and use.
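The sampling problem behind the Street Bump example discussed earlier on this page can be made concrete with a short Python sketch. The figures are invented purely for illustration, and the names `neighbourhoods`, `expected_app_reports` and `corrected_estimate` are hypothetical. The sketch assumes each pothole is reported in proportion to local smartphone ownership, and then applies inverse-coverage weighting as one possible correction. Note that this is a swapped-in technique: according to the Briefing, Boston addressed the bias by completing the dataset with other sources, not by reweighting.

```python
# Hypothetical figures, invented purely for illustration.
neighbourhoods = {
    "wealthier": {"true_potholes": 100, "smartphone_share": 0.8},
    "poorer": {"true_potholes": 100, "smartphone_share": 0.2},
}


def expected_app_reports(area):
    """Expected reports if potholes are reported in proportion to smartphone ownership."""
    return area["true_potholes"] * area["smartphone_share"]


def corrected_estimate(reports, smartphone_share):
    """Weight app reports by the inverse of coverage to estimate the true count."""
    return reports / smartphone_share


# Both areas have the same number of potholes, but the wealthier area
# yields four times as many reports because more residents use the app.
app_reports = {
    name: expected_app_reports(area) for name, area in neighbourhoods.items()
}

# After weighting by coverage, the two estimates are equal again.
estimates = {
    name: corrected_estimate(app_reports[name], area["smartphone_share"])
    for name, area in neighbourhoods.items()
}
```

A decision based on `app_reports` alone would direct most repairs to the wealthier area even though the underlying need is identical. Any correction of this kind depends on knowing the coverage rate, which in practice may itself be uncertain.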
27 Beninger et al Research using Social Media; Users' Views NatCen Social Research (February 2014)
28 The Guardian Facebook sorry – almost – for secret psychological experiment on users (02/10/15)
29 The Guardian, ‘Data could be the real draw of the internet of things – but for whom?’ (14/09/2015). This issue has been addressed also in other
IBE publications: IBE Briefing 48 Business Ethics across Generations (July 2015) – which explores the digital divide between age groups at work –
and IBE Report Data for Development Senegal: Report of the External Review Panel (April 2015) – where the digital divide between different
countries is taken into account.
30 See the website that the city of Boston launched to promote the initiative.
31 Royal Statistical Society The Opportunities and Ethics of Big Data. Workshop Report February 2016
In order to see what this role might involve, this Briefing now provides some questions which Ethics Professionals can usefully ask themselves.

Do we know how the company uses Big Data and to what extent it is integrated into strategic planning?
Knowing clearly for what purpose the data will be used is important both to make the most of this resource and to identify the critical issues that may arise. Moreover, research has found that public support increases if the context for data use is explained and people are able to deliberate on it.[32] As it represents a particularly sensitive area, Ethics Officers should make sure they are aware of and up to date with what is happening on Big Data within their organisations.

Do we send a privacy notice when we collect personal data? Is it written in a clear and accessible language which allows users to give a truly informed consent?
When customers or other stakeholders are required to provide personal information, having terms and conditions that state clearly how and to what extent the data will be used is an important first step in protecting privacy. In particular, it is advisable to be careful with the small print and/or sensitive information. A customer’s perception of being ‘blackmailed’ and forced to agree to conditions they do not fully understand can have a strongly negative impact on trust and reputation. The Privacy Notice Code of Practice issued by the Information Commissioner, the regulator in the UK, provides guidance on the matter.[33]

Does my organisation assess the risks linked to Big Data?
It is important that companies develop a structured and methodical approach to assessing the risks associated with Big Data. Identifying any possible outstanding negative impact that the use of Big Data might have on some groups of people, who may be the most vulnerable amongst the company’s stakeholders, and what might happen if the datasets become public, can increase awareness of the potential damage that may result from a data breach. Consequently, appropriate mechanisms can be put in place that could help prevent such potential outcomes. Before sharing data, it could be useful to map a company’s suppliers and other stakeholders in order to identify the most vulnerable links. Moreover, a privacy impact assessment may be advisable, as highlighted by the Information Commissioner's Office (ICO) in its Code of Practice on this subject.[34]

Does my organisation have any safeguard mechanisms in place to mitigate these risks?
Evidence shows that having preventative processes in place to enhance data security and protection is an effective way to promote trust. Customers are particularly sensitive about anonymity, and companies should adopt a method of anonymisation which, while allowing the company to make information derived from personal data available in a form that is rich and usable, clearly protects individual data subjects. Again, the ICO has issued a Code of Practice to help companies on this point.[35] Providing people with the ability to opt out, harsh penalties on data misuse and control on data access can also make a difference.[36]

Do we make sure that the tools to manage these risks are effective and measure outcome?
Audit has a key role to play in helping companies deal with these issues. The ICO can undertake a consensual audit across the public and private sector to assess the processing of personal information, and provides practical advice on how organisations can improve the way they deal with it.[37]

Do we conduct appropriate due diligence when sharing or acquiring data from third parties?
People expect organisations to share their personal data where it’s necessary to provide them with the services they want and to prevent certain risks. To do so, companies rely on different types of disclosure, involving very complex information chains that cross organisational and even national boundaries. Often companies rely on third parties to collect and acquire the data they need. It is important that due diligence procedures are in place when buying information in the same way as they are for the other kinds of goods and
32 Economic and Social Research Council Public dialogues on using administrative data (2014)
35 Information Commissioner’s Office (ICO) Anonymisation: Managing Data Protection Risk Code of Practice.
36 Royal Statistical Society Public attitudes to the use and sharing of their data: research for the Royal Statistical Society by Ipsos MORI (2014)
37 Information Commissioner’s Office (ICO) Auditing data protection a guide to ICO data protection audits.
services. Companies should ensure that their suppliers uphold similar ethical standards, and guarantee the transparency and accountability of these practices. Moreover, in circumstances where companies are required to share data with business partners or other organisations, it is important that the company lives up to its commitment to integrity. Some useful guidance on this aspect is provided by the Data Sharing Code of Practice issued by the ICO.[38]

Conclusion
It is important that companies realise that they have a responsibility to promote transparency and prevent misuse of personal data.

The consequences and repercussions of questionable ethical conduct when dealing with Big Data can be significant and affect a company’s reputation, customer relationships and ultimately revenues. Even the perception of unethical data handling has the power to undermine both internal and external trust.

In a fast-growing and fairly new regulatory area, it can be difficult for business to determine the right approach and define responsibilities. Some internationally recognised standards do exist (namely the ISO/IEC 27001 on Information Security Management),[39] and can provide some guidelines and assistance to organisations seeking to deal with these issues in their code of ethics or internal policies. It is also worth noting that the European Commission intends to strengthen and unify data protection for individuals within the European Union (EU) through the General Data Protection Regulation (GDPR). This regulation, which also addresses export of personal data outside the EU, was formally adopted by the EU Council and Parliament in April 2016 and will take effect after a two-year transition period. Nevertheless, each company is encouraged to articulate its own specific approach, based on its corporate values. Open dialogue and a joint effort of companies and public bodies can help promote effective action and ensure stakeholders are fully aware of the real risks that they face.

The threats and opportunities posed by Big Data represent issues that go beyond transparency or privacy. Responsible businesses are encouraged to take the initiative and tackle the main issues they face when using Big Data, to maintain a consistent alignment between values and behaviour and to mitigate the risks.
The IBE would like to thank all those who contributed to this Briefing. A special thanks goes to Natasha Le
Sellier from L’Oreal for the important input to the discussion and to Claudia Natanson – Security Practitioners,
Gareth Tipton – BT, and Roeland Beerten and Olivia Varley-Winter – Royal Statistical Society for reviewing the
briefing and providing useful comments and suggestions. We are grateful to all who contributed, but the IBE
remains solely responsible for its content.
This and other Business Ethics Briefings are available to download free of charge from the IBE website:
http://www.ibe.org.uk/list-of-publications/67/47/
If there is a topic you would like to see covered, please get in touch with us on +44 (0) 20 7798 6040 or email:
research@ibe.org.uk