Big data
Big data is a field that treats ways to analyze, systematically extract information from,
or otherwise deal with data sets that are too large or complex to be dealt with by
traditional data-processing application software. Data with many cases (rows) offer
greater statistical power, while data with higher complexity (more attributes or columns)
may lead to a higher false discovery rate.[2] Big data challenges include capturing
data, data storage, data analysis, search, sharing, transfer, visualization, querying,
updating, information privacy and data source. Big data was originally associated with
three key concepts: volume, variety, and velocity. When we handle big data, we may
not sample but simply observe and track what happens. Therefore, big data often
includes data with sizes that exceed the capacity of traditional software to process
within an acceptable time and value.
Current usage of the term big data tends to refer to the use of predictive analytics, user
behavior analytics, or certain other advanced data analytics methods that extract value
from data, and seldom to a particular size of data set. "There is little doubt that the
quantities of data now available are indeed large, but that's not the most relevant
characteristic of this new data ecosystem."[3] Analysis of data sets can find new
correlations to "spot business trends, prevent diseases, combat crime and so
on."[4] Scientists, business executives, medical practitioners, advertising
and governments alike regularly meet difficulties with large data-sets in areas
including Internet searches, fintech, urban informatics, and business informatics.
Scientists encounter limitations in e-Science work, including meteorology, genomics,
[5]
connectomics, complex physics simulations, biology and environmental research. [6]
Data sets grow rapidly, to a certain extent because they are increasingly gathered by
cheap and numerous information-sensing Internet of things devices such as mobile
devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency
identification (RFID) readers and wireless sensor networks.[7][8] The world's technological
per-capita capacity to store information has roughly doubled every 40 months since the
1980s;[9] as of 2012, every day 2.5 exabytes (2.5×2⁶⁰ bytes) of data are generated.[10]
An IDC report predicted that the global data volume would grow exponentially from 4.4 zettabytes to 44 zettabytes between 2013 and 2020; by 2025, IDC predicts there will be 163 zettabytes of data.[11] One question for large enterprises is
determining who should own big-data initiatives that affect the entire organization. [12]
Relational database management systems and desktop statistics and visualization packages often have difficulty handling big data. The work may
require "massively parallel software running on tens, hundreds, or even thousands of
servers".[13] What qualifies as being "big data" varies depending on the capabilities of the
users and their tools, and expanding capabilities make big data a moving target. "For
some organizations, facing hundreds of gigabytes of data for the first time may trigger a
need to reconsider data management options. For others, it may take tens or hundreds
of terabytes before data size becomes a significant consideration." [14]
Definition[edit]
The term has been in use since the 1990s, with some giving credit to John Mashey for
popularizing the term.[15][16] Big data usually includes data sets with sizes beyond the
ability of commonly used software tools to capture, curate, manage, and process data
within a tolerable elapsed time.[17] Big data philosophy encompasses unstructured, semi-structured and structured data; however, the main focus is on unstructured data.[18] Big
data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes
to many zettabytes of data.[19] Big data requires a set of techniques and technologies
with new forms of integration to reveal insights from data-sets that are diverse, complex,
and of a massive scale.[20]
"Variety", "veracity" and various other "Vs" are added by some organizations to describe
it, a revision challenged by some industry authorities. [21]
A 2018 definition states "Big data is where parallel computing tools are needed to
handle data", and notes, "This represents a distinct and clearly defined change in the
computer science used, via parallel programming theories, and losses of some of the
guarantees and capabilities made by Codd's relational model."[22]
The growing maturity of the concept more starkly delineates the difference between "big data" and "business intelligence".[23]
Characteristics[edit]
Figure: the growth of big data's primary characteristics of volume, velocity, and variety.
Architecture[edit]
Big data repositories have existed in many
forms, often built by corporations with a special
need. Commercial vendors historically offered
parallel database management systems for big
data beginning in the 1990s. For many years,
WinterCorp published the largest database
report.[31]
Teradata Corporation in 1984 marketed the
parallel processing DBC 1012 system. Teradata
systems were the first to store and analyze 1
terabyte of data in 1992. Hard disk drives were
2.5 GB in 1991 so the definition of big data
continuously evolves according to Kryder's Law.
Teradata installed the first petabyte class
RDBMS based system in 2007. As of 2017,
there are a few dozen petabyte class Teradata
relational databases installed, the largest of
which exceeds 50 PB. Systems up until 2008
were 100% structured relational data. Since
then, Teradata has added unstructured data
types including XML, JSON, and Avro.
In 2000, Seisint Inc. (now LexisNexis Risk
Solutions) developed a C++-based distributed
platform for data processing and querying
known as the HPCC Systems platform. This
system automatically partitions, distributes,
stores and delivers structured, semi-structured,
and unstructured data across multiple
commodity servers. Users can write data
processing pipelines and queries in a
declarative dataflow programming language
called ECL. Data analysts working in ECL are
not required to define data schemas upfront and
can rather focus on the particular problem at
hand, reshaping data in the best possible
manner as they develop the solution. In 2004,
LexisNexis acquired Seisint Inc.[32] and their
high-speed parallel processing platform and
successfully used this platform to integrate the
data systems of Choicepoint Inc. when they
acquired that company in 2008.[33] In 2011, the
HPCC systems platform was open-sourced
under the Apache v2.0 License.
CERN and other physics experiments have
collected big data sets for many decades,
usually analyzed via high-throughput
computing rather than the map-reduce
architectures usually meant by the current "big
data" movement.
In 2004, Google published a paper on a process
called MapReduce that uses a similar
architecture. The MapReduce concept provides
a parallel processing model, and an associated
implementation was released to process huge
amounts of data. With MapReduce, queries are
split and distributed across parallel nodes and
processed in parallel (the Map step). The
results are then gathered and delivered (the
Reduce step). The framework was very
successful,[34] so others wanted to replicate the
algorithm. Therefore, an implementation of the
MapReduce framework was adopted by an
Apache open-source project named Hadoop.[35]
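To make the Map and Reduce steps concrete, the following minimal Python sketch imitates the classic word-count pattern on a single machine. A real Hadoop or MapReduce deployment distributes the map and reduce tasks across many servers; the function names and sample documents here are purely illustrative and are not part of any framework's API.

```python
from collections import defaultdict
from itertools import chain

documents = [
    "big data needs parallel processing",
    "parallel processing splits big problems",
]

# Map step: each document is turned into (key, value) pairs independently,
# so this work could be spread across many nodes.
def map_phase(doc):
    return [(word, 1) for word in doc.split()]

# Shuffle step: group all values emitted for the same key.
def shuffle(mapped):
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)
    return groups

# Reduce step: combine the grouped values into one result per key.
def reduce_phase(key, values):
    return key, sum(values)

mapped = [map_phase(doc) for doc in documents]
results = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(results)  # e.g. {'big': 2, 'data': 1, 'parallel': 2, ...}
```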
Apache Spark was developed in 2012 in
response to limitations in the MapReduce
paradigm, as it adds the ability to set up many
operations (not just map followed by reducing).
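As a sketch of what chaining several operations (rather than a single map-then-reduce pass) can look like, the hypothetical PySpark fragment below assumes a local Spark installation; the input file name is invented for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("beyond-map-reduce").getOrCreate()
lines = spark.sparkContext.textFile("events.log")  # hypothetical input file

counts = (
    lines.flatMap(lambda line: line.split())  # one operation...
         .filter(lambda w: w.isalpha())       # ...chained with another...
         .map(lambda w: (w.lower(), 1))       # ...and another,
         .reduceByKey(lambda a, b: a + b)     # before the reduce-like step
         .cache()                             # keep intermediate results in memory
)

print(counts.take(10))
spark.stop()
```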
MIKE2.0 is an open approach to information
management that acknowledges the need for
revisions due to big data implications identified
in an article titled "Big Data Solution Offering".[36] The methodology addresses handling big
data in terms of useful permutations of data
sources, complexity in interrelationships, and
difficulty in deleting (or modifying) individual
records.[37]
2012 studies showed that a multiple-layer
architecture is one option to address the issues
that big data presents. A distributed
parallel architecture distributes data across
multiple servers; these parallel execution
environments can dramatically improve data
processing speeds. This type of architecture
inserts data into a parallel DBMS, which
implements the use of MapReduce and Hadoop
frameworks. This type of framework looks to
make the processing power transparent to the
end-user by using a front-end application
server.[38]
The data lake allows an organization to shift its
focus from centralized control to a shared model
to respond to the changing dynamics of
information management. This enables quick
segregation of data into the data lake, thereby
reducing the overhead time.[39][40]
Technologies[edit]
A 2011 McKinsey Global Institute report characterizes the main components and ecosystem of big data as techniques for analyzing data, big data technologies, and visualization.[41]
Applications[edit]
Image: a bus wrapped with SAP big data advertising, parked outside IDF13.
Case studies[edit]
Government[edit]
United States of America[edit]
In 2012, the Obama
administration announced the Big Data
Research and Development Initiative, to
explore how big data could be used to
address important problems faced by the
government.[94] The initiative is composed
of 84 different big data programs spread
across six departments.[95]
Big data analysis played a large role
in Barack Obama's successful 2012 re-
election campaign.[96]
The United States Federal Government owns five of the ten most powerful supercomputers in the world.[97][98]
Research activities[edit]
Encrypted search and cluster formation in big data were demonstrated in March 2014 at the American Society of Engineering Education. Gautam Siwach of the MIT Computer Science and Artificial Intelligence Laboratory and Dr. Amir Esmailpour of the UNH Research Group investigated the key features of big data, such as the formation of clusters and their interconnections. They focused on the security of big data and the orientation of the term towards the presence of different types of data in an encrypted form at the cloud interface, providing raw definitions and real-time examples within the technology. Moreover, they proposed an approach for identifying the encoding technique to advance towards an expedited search over encrypted text, leading to security enhancements in big data.[133]
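The encoding technique proposed in that work is not reproduced here. As a generic, heavily simplified illustration of the underlying idea of searching data without exposing plaintext keywords to the server, the sketch below builds a keyword index from HMAC tokens using only the Python standard library; all names and data are invented, and a production searchable-encryption scheme would also need to hide access patterns and token frequencies.

```python
import hmac
import hashlib

SECRET_KEY = b"client-side secret"  # held only by the data owner

def token(word):
    """Deterministic keyword token; the server never sees the plaintext word."""
    return hmac.new(SECRET_KEY, word.lower().encode(), hashlib.sha256).hexdigest()

# Client side: build the index over keyword tokens before uploading documents.
documents = {1: "patient record alpha", 2: "network traffic capture"}
index = {}
for doc_id, text in documents.items():
    for word in text.split():
        index.setdefault(token(word), set()).add(doc_id)

# Server side: match a submitted token against the index without decrypting anything.
def search(query_token):
    return index.get(query_token, set())

print(search(token("network")))  # {2}
```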
In March 2012, The White House announced a
national "Big Data Initiative" that consisted of six
Federal departments and agencies committing
more than $200 million to big data research
projects.[134]
The initiative included a National Science
Foundation "Expeditions in Computing" grant of
$10 million over 5 years to the AMPLab[135] at the
University of California, Berkeley.[136] The
AMPLab also received funds from DARPA and over a dozen industrial sponsors, and uses big data to attack a wide range of problems from predicting traffic congestion[137] to fighting cancer.[138]
Critique[edit]
Critiques of the big data paradigm come in two
flavors: those that question the implications of
the approach itself, and those that question the
way it is currently done.[162] One approach to this
criticism is the field of critical data studies.
Critiques of the big data
paradigm[edit]
"A crucial problem is that we do not know much
about the underlying empirical micro-processes
that lead to the emergence of the[se] typical
network characteristics of Big Data".[17] In their
critique, Snijders, Matzat, and Reips point out
that often very strong assumptions are made
about mathematical properties that may not at
all reflect what is really going on at the level of
micro-processes. Mark Graham has leveled broad critiques at Chris Anderson's assertion that big data will spell the end of theory,[163] focusing in particular on the notion that big data must always be contextualized in their social, economic, and political contexts.[164] Even
as companies invest eight- and nine-figure
sums to derive insight from information
streaming in from suppliers and customers, less
than 40% of employees have sufficiently mature
processes and skills to do so. To overcome this
insight deficit, big data, no matter how
comprehensive or well analyzed, must be
complemented by "big judgment," according to
an article in the Harvard Business Review.[165]
Much in the same line, it has been pointed out that the decisions based on the analysis of big data are inevitably "informed by the world as it was in the past, or, at best, as it currently is".[56] Fed by a large amount of data on past experiences, algorithms can predict future developments only if the future is similar to the past.[166] If the system's dynamics change in the future (if the process is not stationary), the past can say little about the future. In order to make predictions in changing environments, a thorough understanding of the system's dynamics would be necessary, which requires theory.[166]
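A small synthetic sketch of this point: a model fitted only to past observations extrapolates reasonably while the data-generating process stays the same, but its error grows once the dynamics shift. All numbers below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Past: a stable linear trend with noise.
t_past = np.arange(100)
y_past = 2.0 * t_past + rng.normal(0, 5, size=t_past.size)

# Fit a simple model to the past only.
slope, intercept = np.polyfit(t_past, y_past, 1)

t_future = np.arange(100, 130)
forecast = slope * t_future + intercept

# Future A: the process is stationary, so the forecast stays close.
y_same = 2.0 * t_future + rng.normal(0, 5, size=t_future.size)

# Future B: the dynamics change (the trend breaks), and the same forecast fails.
y_shifted = 200.0 - 3.0 * (t_future - 100) + rng.normal(0, 5, size=t_future.size)

print("mean error, stationary future :", np.mean(np.abs(forecast - y_same)))
print("mean error, changed dynamics  :", np.mean(np.abs(forecast - y_shifted)))
```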
As a response to this critique, Alemany Oliver and Vayre suggest using "abductive reasoning as a first step in the research process in order to bring context to consumers' digital traces and make new theories emerge".[167] Additionally, it has been suggested to combine big data approaches with computer simulations, such as agent-based models[56] and complex systems. Agent-based models are increasingly better at predicting the outcomes of social complexities, even for unknown future scenarios, through computer simulations based on a collection of mutually interdependent algorithms.[168][169] Finally, the use of multivariate methods that probe for the latent structure of the data, such as factor analysis and cluster analysis, has proven useful as an analytic approach that goes well beyond the bi-variate approaches (cross-tabs) typically employed with smaller data sets.
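As a minimal illustration of probing for latent structure rather than cross-tabulating pairs of variables, the sketch below applies factor analysis and k-means clustering to a synthetic multivariate data set. It assumes scikit-learn is available; the data are random and purely illustrative.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Synthetic data: 500 observations driven by 2 hidden factors,
# observed through 10 noisy variables.
latent = rng.normal(size=(500, 2))
loadings = rng.normal(size=(2, 10))
X = latent @ loadings + rng.normal(scale=0.5, size=(500, 10))

# Factor analysis: recover a low-dimensional latent structure
# from the 10 observed columns.
fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(X)

# Cluster analysis: group observations in the latent space
# rather than cross-tabulating individual variables.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

print("factor scores shape:", scores.shape)        # (500, 2)
print("observations per cluster:", np.bincount(labels))
```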
In health and biology, conventional scientific
approaches are based on experimentation. For
these approaches, the limiting factor is the
relevant data that can confirm or refute the
initial hypothesis.[170] A new postulate is accepted
now in biosciences: the information provided by
the data in huge volumes (omics) without prior
hypothesis is complementary and sometimes
necessary to conventional approaches based
on experimentation.[171][172] In the massive
approaches it is the formulation of a relevant
hypothesis to explain the data that is the limiting
factor.[173] The search logic is reversed and the
limits of induction ("Glory of Science and
Philosophy scandal", C. D. Broad, 1926) are to
be considered.
Privacy advocates are concerned about the
threat to privacy represented by increasing
storage and integration of personally identifiable
information; expert panels have released
various policy recommendations to conform
practice to expectations of privacy.[174][175][176] The misuse of big data in several cases by media, companies, and even the government has undermined trust in almost every fundamental institution holding up society.[177]
Nayef Al-Rodhan argues that a new kind of
social contract will be needed to protect
individual liberties in a context of Big Data and
giant corporations that own vast amounts of
information. The use of Big Data should be
monitored and better regulated at the national
and international levels.[178] Barocas and
Nissenbaum argue that one way of protecting
individual users is by being informed about the
types of information being collected, with whom
it is shared, under what constraints, and for what
purposes.[179]
Critiques of the 'V' model[edit]
The 'V' model of big data is concerning as it centres around computational scalability and neglects the perceptibility and understandability of information. This led to the framework of cognitive big data, which characterizes big data applications according to:[180]
See also[edit]
For a list of companies and tools, see also Category:Big data.
References[edit]
1. ^ Hilbert, Martin; López, Priscila (2011). "The World's Technological Capacity to Store, Communicate, and Compute Information". Science. 332 (6025): 60–65. Bibcode:2011Sci...332...60H. doi:10.1126/science.1200970. PMID 21310967. S2CID 206531385. Retrieved 13 April 2016.
2. ^ Breur, Tom (July 2016). "Statistical Power Analysis and the contemporary "crisis" in social sciences". Journal of Marketing Analytics. 4 (2–3): 61–65. doi:10.1057/s41270-016-0001-3. ISSN 2050-3318.
3. ^ boyd, danah; Crawford, Kate (21 September 2011). "Six Provocations for Big Data". Social Science Research Network: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. doi:10.2139/ssrn.1926431. S2CID 148610111.
4. ^ "Data, data everywhere". The Economist. 25 February 2010. Retrieved 9 December 2012.
5. ^ "Community cleverness required". Nature. 455 (7209): 1. September 2008. Bibcode:2008Natur.455....1.. doi:10.1038/455001a. PMID 18769385.
6. ^ Reichman OJ, Jones MB, Schildhauer MP (February 2011). "Challenges and opportunities of open data in ecology". Science. 331 (6018): 703–5. Bibcode:2011Sci...331..703R. doi:10.1126/science.1197962. PMID 21311007. S2CID 22686503.
7. ^ Hellerstein, Joe (9 November
2008). "Parallel Programming in the Age of
Big Data". Gigaom Blog.
8. ^ Segaran, Toby; Hammerbacher, Jeff
(2009). Beautiful Data: The Stories Behind
Elegant Data Solutions. O'Reilly Media.
p. 257. ISBN 978-0-596-15711-1.
9. ^ Hilbert M, López P (April 2011). "The world's technological capacity to store, communicate, and compute information" (PDF). Science. 332 (6025): 60–5. Bibcode:2011Sci...332...60H. doi:10.1126/science.1200970. PMID 21310967. S2CID 206531385.
10. ^ "IBM What is big data? – Bringing big
data to the enterprise". ibm.com.
Retrieved 26 August 2013.
11. ^ Reinsel, David; Gantz, John; Rydning,
John (13 April 2017). "Data Age 2025: The
Evolution of Data to Life-
Critical" (PDF). seagate.com.
Framingham, MA, US: International Data
Corporation. Retrieved 2 November 2017.
12. ^ Oracle and FSN, "Mastering Big Data:
CFO Strategies to Transform Insight into
Opportunity" Archived 4 August 2013 at
the Wayback Machine, December 2012
13. ^ Jacobs, A. (6 July 2009). "The
Pathologies of Big Data". ACMQueue.
14. ^ Magoulas, Roger; Lorica, Ben (February
2009). "Introduction to Big Data". Release
2.0. Sebastopol CA: O'Reilly Media (11).
15. ^ John R. Mashey (25 April 1998). "Big
Data ... and the Next Wave of
InfraStress" (PDF). Slides from invited talk.
Usenix. Retrieved 28 September 2016.
16. ^ Steve Lohr (1 February 2013). "The
Origins of 'Big Data': An Etymological
Detective Story". The New York Times.
Retrieved 28 September 2016.
17. ^ Jump up to:a b Snijders, C.; Matzat, U.; Reips,
U.-D. (2012). "'Big Data': Big gaps of
knowledge in the field of
Internet". International Journal of Internet
Science. 7: 1–5.
18. ^ Dedić, N.; Stanier, C. (2017). "Towards
Differentiating Business Intelligence, Big
Data, Data Analytics and Knowledge
Discovery". Innovations in Enterprise
Information Systems Management and
Engineering. Lecture Notes in Business
Information Processing. 285. Berlin ;
Heidelberg: Springer International
Publishing. pp. 114–122. doi:10.1007/978-
3-319-58801-8_10. ISBN 978-3-319-
58800-1. ISSN 1865-1356. OCLC 909580
101.
19. ^ Everts, Sarah (2016). "Information
Overload". Distillations. Vol. 2 no. 2.
pp. 26–33. Retrieved 22 March 2018.
20. ^ Hashem, Ibrahim Abaker Targio; Yaqoob, Ibrar; Badrul Anuar, Nor; Mokhtar, Salimah; Gani, Abdullah; Ullah Khan, Samee (2015). "The rise of "big data" on cloud computing: Review and open research issues". Information Systems. 47: 98–115. doi:10.1016/j.is.2014.07.006.
21. ^ Grimes, Seth. "Big Data: Avoid 'Wanna
V' Confusion". InformationWeek.
Retrieved 5 January 2016.
22. ^ Fox, Charles (25 March 2018). Data
Science for Transport. Springer Textbooks
in Earth Sciences, Geography and
Environment.
Springer. ISBN 9783319729527.
23. ^ "avec focalisation sur Big Data &
Analytique" (PDF). Bigdataparis.com.
Retrieved 8 October 2017.
24. ^ Jump up to:a b Billings S.A. "Nonlinear System
Identification: NARMAX Methods in the
Time, Frequency, and Spatio-Temporal
Domains". Wiley, 2013
25. ^ "le Blog ANDSI » DSI Big
Data". Andsi.fr. Retrieved 8 October2017.
26. ^ Les Echos (3 April 2013). "Les Echos –
Big Data car Low-Density Data ? La faible
densité en information comme facteur
discriminant – Archives". Lesechos.fr.
Retrieved 8 October 2017.
27. ^ Kitchin, Rob; McArdle, Gavin (17 February 2016). "What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets". Big Data & Society. 3 (1). doi:10.1177/2053951716631130.
28. ^ Onay, Ceylan; Öztürk, Elif (2018). "A
review of credit scoring research in the age
of Big Data". Journal of Financial
Regulation and Compliance. 26 (3): 382–
405. doi:10.1108/JFRC-06-2017-0054.
29. ^ Big Data's Fourth V
30. ^ Kitchin, Rob; McArdle, Gavin (5 January 2016). "What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets". Big Data & Society. 3 (1). doi:10.1177/2053951716631130. ISSN 2053-9517.
31. ^ "Survey: Biggest Databases Approach
30 Terabytes". Eweek.com. Retrieved 8
October 2017.
32. ^ "LexisNexis To Buy Seisint For $775
Million". Washington Post. Retrieved 15
July 2004.
33. ^ https://www.washingtonpost.com/wp-dyn/content/article/2008/02/21/AR2008022100809.html
34. ^ Bertolucci, Jeff "Hadoop: From
Experiment To Leading Big Data Platform",
"Information Week", 2013. Retrieved on 14
November 2013.
35. ^ Webster, John. "MapReduce: Simplified
Data Processing on Large Clusters",
"Search Storage", 2004. Retrieved on 25
March 2013.
36. ^ "Big Data Solution Offering". MIKE2.0.
Retrieved 8 December2013.
37. ^ "Big Data Definition". MIKE2.0.
Retrieved 9 March 2013.
38. ^ Boja, C; Pocovnicu, A; Bătăgan, L.
(2012). "Distributed Parallel Architecture
for Big Data". Informatica
Economica. 16 (2): 116–127.
39. ^ "SOLVING KEY BUSINESS
CHALLENGES WITH A BIG DATA
LAKE" (PDF). Hcltech.com. August 2014.
Retrieved 8 October2017.
40. ^ "Method for testing the fault tolerance of
MapReduce frameworks" (PDF). Computer
Networks. 2015.
41. ^ Jump up to:a b Manyika, James; Chui,
Michael; Bughin, Jaques; Brown, Brad;
Dobbs, Richard; Roxburgh, Charles;
Byers, Angela Hung (May 2011). "Big
Data: The next frontier for innovation,
competition, and productivity". McKinsey
Global Institute. Retrieved 16
January2016.
42. ^ "Future Directions in Tensor-Based
Computation and Modeling"(PDF). May
2009.
43. ^ Lu, Haiping; Plataniotis, K.N.;
Venetsanopoulos, A.N. (2011). "A Survey
of Multilinear Subspace Learning for
Tensor Data" (PDF). Pattern
Recognition. 44 (7): 1540–
1551. doi:10.1016/j.patcog.2011.01.004.
44. ^ Pllana, Sabri; Janciak, Ivan; Brezany, Peter; Wöhrer, Alexander (2016). "A Survey of the State of the Art in Data Mining and Integration Query Languages". 2011 14th International Conference on Network-Based Information Systems (NBIS 2011). IEEE Computer Society. pp. 341–348. arXiv:1603.01113. Bibcode:2016arXiv160301113P. doi:10.1109/NBiS.2011.58. ISBN 978-1-4577-0789-6. S2CID 9285984.
45. ^ Wang, Yandong; Goldstone, Robin; Yu, Weikuan; Wang, Teng (October 2014). "Characterization and Optimization of Memory-Resident MapReduce on HPC Systems". 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE. pp. 799–808. doi:10.1109/IPDPS.2014.87. ISBN 978-1-4799-3800-1. S2CID 11157612.
46. ^ L'Heureux, A.; Grolinger, K.; Elyamany, H. F.; Capretz, M. A. M. (2017). "Machine Learning With Big Data: Challenges and Approaches". IEEE Access. 5: 7776–7797. doi:10.1109/ACCESS.2017.2696365. ISSN 2169-3536.
47. ^ Monash, Curt (30 April 2009). "eBay's
two enormous data warehouses".
Monash, Curt (6 October 2010). "eBay
followup – Greenplum out, Teradata > 10
petabytes, Hadoop has some value, and
more".
48. ^ "Resources on how Topological Data
Analysis is used to analyze big data".
Ayasdi.
49. ^ CNET News (1 April 2011). "Storage
area networks need not apply".
50. ^ "How New Analytic Systems will Impact
Storage". September 2011. Archived
from the original on 1 March 2012.
51. ^ Hilbert, Martin (2014). "What is the
Content of the World's Technologically
Mediated Information and Communication
Capacity: How Much Text, Image, Audio,
and Video?". The Information
Society. 30 (2): 127–
143. doi:10.1080/01972243.2013.873748.
S2CID 45759014.
52. ^ Rajpurohit, Anmol (11 July
2014). "Interview: Amy Gershkoff, Director
of Customer Analytics & Insights, eBay on
How to Design Custom In-House BI
Tools". KDnuggets. Retrieved 14
July 2014. Dr. Amy Gershkoff: "Generally,
I find that off-the-shelf business
intelligence tools do not meet the needs of
clients who want to derive custom insights
from their data. Therefore, for medium-to-
large organizations with access to strong
technical talent, I usually recommend
building custom, in-house solutions."
53. ^ "The Government and big data: Use,
problems and potential". Computerworld.
21 March 2012. Retrieved 12
September 2016.
54. ^ "White Paper: Big Data for Development:
Opportunities & Challenges (2012) –
United Nations Global
Pulse". Unglobalpulse.org. Retrieved 13
April 2016.
55. ^ "WEF (World Economic Forum), & Vital
Wave Consulting. (2012). Big Data, Big
Impact: New Possibilities for International
Development". World Economic Forum.
Retrieved 24 August2012.
56. ^ Jump up to:a b c d Hilbert, Martin (15 January
2013). "Big Data for Development: From
Information- to Knowledge
Societies". SSRN 2205145.
57. ^ "Elena Kvochko, Four Ways To talk
About Big Data (Information
Communication Technologies for
Development Series)". worldbank.org. 4
December 2012. Retrieved 30 May 2012.
58. ^ "Daniele Medri: Big Data & Business: An
on-going revolution". Statistics Views. 21
October 2013.
59. ^ Tobias Knobloch and Julia Manske (11
January 2016). "Responsible use of
data". D+C, Development and
Cooperation.
60. ^ Huser V, Cimino JJ (July 2016). "Impending Challenges for the Use of Big Data". International Journal of Radiation Oncology, Biology, Physics. 95 (3): 890–894. doi:10.1016/j.ijrobp.2015.10.060. PMC 4860172. PMID 26797535.
61. ^ Sejdic, Ervin; Falk, Tiago H. (4 July
2018). Signal Processing and Machine
Learning for Biomedical Big Data. Sejdić,
Ervin,, Falk, Tiago H. [Place of publication
not
identified]. ISBN 9781351061216. OCLC
1044733829.
62. ^ Raghupathi W, Raghupathi V (December 2014). "Big data analytics in healthcare: promise and potential". Health Information Science and Systems. 2 (1): 3. doi:10.1186/2047-2501-2-3. PMC 4341817. PMID 25825667.
63. ^ Viceconti M, Hunter P, Hose R (July 2015). "Big data, big knowledge: big data for personalized healthcare" (PDF). IEEE Journal of Biomedical and Health Informatics. 19 (4): 1209–15. doi:10.1109/JBHI.2015.2406883. PMID 26218867. S2CID 14710821.
64. ^ O'Donoghue, John; Herbert, John (1
October 2012). "Data Management Within
mHealth Environments: Patient Sensors,
Mobile Devices, and Databases". Journal
of Data and Information Quality. 4 (1):
5:1–5:20. doi:10.1145/2378016.2378021.
S2CID 2318649.
65. ^ Mirkes EM, Coats TJ, Levesley J, Gorban AN (August 2016). "Handling missing data in large healthcare dataset: A case study of unknown trauma outcomes". Computers in Biology and Medicine. 75: 203–16. arXiv:1604.00627. Bibcode:2016arXiv160400627M. doi:10.1016/j.compbiomed.2016.06.004. PMID 27318570. S2CID 5874067.
66. ^ Murdoch TB, Detsky AS (April 2013). "The inevitable application of big data to health care". JAMA. 309 (13): 1351–2. doi:10.1001/jama.2013.393. PMID 23549579.
67. ^ Vayena E, Salathé M, Madoff LC,
Brownstein JS (February 2015). "Ethical
challenges of big data in public
health". PLOS Computational
Biology. 11 (2):
e1003904. Bibcode:2015PLSCB..11E3904
V. doi:10.1371/journal.pcbi.1003904. PMC
4321985. PMID 25664461.
68. ^ Copeland, CS (July–August 2017). "Data
Driving Discovery"(PDF). Healthcare
Journal of New Orleans: 22–27.
69. ^ Jump up to:a b Yanase J, Triantaphyllou E
(2019). "A Systematic Survey of
Computer-Aided Diagnosis in Medicine:
Past and Present Developments". Expert
Systems with Applications. 138:
112821. doi:10.1016/j.eswa.2019.112821.
70. ^ Dong X, Bahroos N, Sadhu E, Jackson
T, Chukhman M, Johnson R, Boyd A,
Hynes D (2013). "Leverage Hadoop
framework for large scale clinical
informatics applications". AMIA Joint
Summits on Translational Science
Proceedings. AMIA Joint Summits on
Translational Science. 2013:
53. PMID 24303235.
71. ^ Clunie D (2013). "Breast tomosynthesis
challenges digital imaging infrastructure".
72. ^ Yanase J, Triantaphyllou E (2019). "The
Seven Key Challenges for the Future of
Computer-Aided Diagnosis in
Medicine". Journal of Medical
Informatics. 129: 413–
422. doi:10.1016/j.ijmedinf.2019.06.017. P
MID 31445285.
73. ^ "Degrees in Big Data: Fad or Fast Track
to Career Success". Forbes. Retrieved 21
February 2016.
74. ^ "NY gets new boot camp for data
scientists: It's free but harder to get into
than Harvard". Venture Beat. Retrieved 21
February 2016.
75. ^ Wedel, Michel; Kannan, PK (2016). "Marketing Analytics for Data-Rich Environments". Journal of Marketing. 80 (6): 97–121. doi:10.1509/jm.15.0413. S2CID 168410284.
76. ^ Couldry, Nick; Turow, Joseph (2014).
"Advertising, Big Data, and the Clearance
of the Public Realm: Marketers' New
Approaches to the Content
Subsidy". International Journal of
Communication. 8: 1710–1726.
77. ^ "Why Digital Advertising Agencies Suck
at Acquisition and are in Dire Need of an
AI Assisted Upgrade". Ishti.org. 15 April
2018. Retrieved 15 April 2018.
78. ^ "Big data and analytics: C4 and Genius
Digital". Ibc.org. Retrieved 8
October 2017.
79. ^ Marshall Allen (17 July 2018). "Health
Insurers Are Vacuuming Up Details About
You – And It Could Raise Your
Rates". www.propublica.org. Retrieved 21
July 2018.
80. ^ "QuiO Named Innovation Champion of
the Accenture HealthTech Innovation
Challenge". Businesswire.com. 10 January
2017. Retrieved 8 October 2017.
81. ^ "A Software Platform for Operational
Technology Innovation"(PDF). Predix.com.
Retrieved 8 October 2017.
82. ^ Z. Jenipher Wang (March 2017). "Big
Data Driven Smart Transportation: the
Underlying Story of IoT Transformed
Mobility".
83. ^ "That Internet Of Things Thing".
84. ^ Jump up to: Solnik, Ray. "The Time Has
a b
External links[edit]
Media related to Big data at Wikimedia Commons
The dictionary definition of big data at Wiktionary