Week 6 - Topic Overview

Week 6: The differences between primary and secondary data.
The use of secondary data

Students’ Learning Outcomes
• explain the differences between quantitative, qualitative and mixed methods
research designs and choose between these;
• identify the full variety of available secondary data;
• appreciate ways in which secondary data can be utilized to help to answer your
research question(s) and to meet your objectives;
• understand the advantages and disadvantages of using secondary data in research
projects;
• use a range of techniques to search for secondary data;
• evaluate the suitability of secondary data for answering your research question(s)
and meeting your objectives in terms of measurement validity, coverage, reliability,
validity, and measurement bias.
Task - Forum
• Assess the suitability of secondary data for your research
• Form pairs and work on the Week 6 case study
• Critically evaluate the secondary data for your research
Introduction
In Week 5, we introduced the research onion as a way of depicting the issues underlying your choice of
data collection method or methods and peeled away the outer two layers – research philosophy and
approach to theory development. This week, we uncover the next three layers: methodological choice,
research strategy or strategies, and the choice of time horizon for your research. As we saw in Week 5,
the way you answer your research question will be influenced by your research philosophy and approach
to theory development. Your research philosophy and approach to theory development, whether this is
deliberate or by default, will subsequently influence your selections shown in the next three layers of the
research onion. These three layers can be thought of as focusing on the process of research design, which
is the way you turn your research question into a research project. The key to these selections will be to
achieve coherence all the way through your research design.
UU-MBA-711-ZM - Dissertation Page 1

Choice and coherence in research design
As discussed in Week 5, your research design is the general plan of how you will go about answering
your research question(s) (the importance of clearly defining the research question cannot be
overemphasised). It will contain clear objectives derived from your research question(s), specify the
sources from which you intend to collect data, how you propose to collect and analyse these, and discuss
ethical issues and the constraints you will inevitably encounter (e.g., access to data, time, location and
money). Crucially, it should demonstrate that you have thought through the elements of your particular
research design.
The first methodological choice is whether you follow a quantitative, qualitative, or mixed methods
research design. Each of these options is likely to call for a different mix of elements to achieve coherence
in your research design. The nature of your research project will also be either exploratory, descriptive,
explanatory, evaluative or a combination of these. Within your research design you will need to use one
or more research strategies to ensure coherence within your research project. Your methodological choice
and related strategies will also influence the selection of an appropriate time horizon. Each research
design will lead to potential ethical concerns, and it will be important to consider these in order to
minimise or overcome them. It is also important to establish the quality of your research design. Finally,
we recognise that practical constraints will affect research design, especially the nature of your own role
as researcher. These aspects of your research design are vital to understanding what you wish to achieve
and how you intend to do so, even if your design changes subsequently. You must have a clear design
with valid reasons for each of your research design decisions. Your justification for each element should
be based on the nature of your research question(s) and objectives, show consistency with your research
philosophy and demonstrate coherence across your research design.
It is useful at this point to recognise a distinction between design and tactics. Design is concerned with
the overall plan for your research project; tactics are about the finer details of data collection and analysis
– the centre of the research onion (Figure 6.1). Decisions about tactics will involve you being clear about
the different quantitative and qualitative data collection techniques (e.g., questionnaires, interviews, focus
groups, and secondary data) and subsequent quantitative and qualitative data analysis procedures, which
are discussed below.

Figure 6.1 The research ‘onion’ (Saunders, 2016, p. 164)
The meaning of primary data
Raw data, also known as primary data, is data (e.g., numbers, instrument readings, figures, etc.) collected
from a source. If a scientist sets up a computerized thermometer that records the temperature of a chemical
mixture in a test tube every minute, the list of temperature readings for every minute, as printed out on a
spreadsheet or viewed on a computer screen is "raw data." Raw data has not been subjected to processing,
"cleaning" by researchers to remove outliers, obvious instrument reading errors or data entry errors, or
any analysis (e.g., determining central tendency aspects such as the average or median result). As well,
raw data has not been subject to any other manipulation by a software program or a human researcher,
analyst or technician. Raw data is a relative term, because
even once raw data has been "cleaned" and processed by one team of researchers, another team may
consider this processed data to be "raw data" for another stage of research. Raw data can be inputted to a
computer program or used in manual procedures such as analyzing statistics from a survey. The term
"raw data" can refer to the binary data on electronic storage devices, such as hard disk drives (also referred
to as "low-level data"). Primary data is data that is collected by a researcher from first-hand sources, using
methods like surveys, interviews, or experiments. It is collected with the research project in mind, directly
from primary sources.
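To make the distinction between raw and processed data concrete, the short Python sketch below follows the thermometer example. The readings, the plausible temperature range, and the function name clean_readings are all invented for illustration and do not come from a real experiment. The raw list is left untouched, obvious instrument errors are filtered into a cleaned copy, and the mean and median are then calculated from that copy.

```python
from statistics import mean, median

# Hypothetical raw readings (degrees Celsius), one per minute.
# 999.0 stands in for an obvious instrument reading error.
raw_readings = [21.4, 21.6, 21.5, 999.0, 21.9, 22.1]


def clean_readings(readings, low=-50.0, high=150.0):
    """Return only the readings inside an assumed plausible range."""
    return [r for r in readings if low <= r <= high]


# The raw list is not modified; analysis is done on the cleaned copy.
cleaned = clean_readings(raw_readings)

print("raw count:", len(raw_readings))
print("cleaned count:", len(cleaned))
print("mean:", round(mean(cleaned), 2))
print("median:", median(cleaned))
```

Keeping the raw list separate from the cleaned copy reflects the point made above: another researcher may want to treat either version as their own starting data.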
Methodological choice: the use of a quantitative, qualitative or mixed methods research design for
primary data collection
One way of differentiating quantitative research from qualitative research is to distinguish between
numeric data (numbers) and non-numeric data (words, images, video clips, and other similar material).
In this way, ‘quantitative’ is often used as a synonym for any data collection technique (such as a
questionnaire) or data analysis procedure (such as graphs or statistics) that generates or uses numerical
data. In contrast, ‘qualitative’ is often used as a synonym for any data collection technique (such as an
interview) or data analysis procedure (such as categorising data) that generates or uses non-numerical
data. This is an important way to differentiate this methodological choice; however, this distinction is
both problematic and narrow.
It is problematic because, in reality, many business and management research designs are likely to
combine quantitative and qualitative elements. This may be for a number of reasons. For example, a
research design may use a questionnaire, but it may be necessary to ask respondents to answer some
‘open’ questions in their own words rather than ticking the appropriate box, or it may be necessary to
conduct follow-up interviews to seek to explain findings from the questionnaire. Equally, some
qualitative research data may be analysed quantitatively or be used to inform the design of a subsequent
questionnaire. In this way, quantitative and qualitative research may be viewed as two ends of a
continuum, which in practice are often mixed. Research design may, therefore, mix methods in a number
of ways, which we discuss later.
Quantitative research design
Research philosophy
Quantitative research is generally associated with positivism, especially when used with predetermined
and highly structured data collection techniques. However, a distinction needs to be drawn between data
about the attributes of people, organisations or other things and data based on opinions, sometimes
referred to as ‘qualitative’ numbers. In this way, some survey research, while conducted quantitatively,
may be seen to fit partly within an interpretivist philosophy. Quantitative research may also be used within
the realist and pragmatist philosophies (see ‘Mixed methods research design’ later).
Approach to theory development
Quantitative research is usually associated with a deductive approach, where the focus is on using data to
test the theory. However, it may also incorporate an inductive approach, where data are used to develop
theory.
Characteristics
Quantitative research examines relationships between variables, which are measured numerically and
analysed using a range of statistical and graphical techniques. It often incorporates controls to ensure the
validity of data, as in an experimental design. Because data are collected in a standard manner, it is
important to ensure that questions are expressed clearly, so they are understood in the same way by each
participant. This methodology often uses probability sampling techniques to ensure generalisability. The
researcher is seen as independent from those being researched, who are usually called respondents.
Quantitative research design may use a single data collection technique, such as a questionnaire, and
corresponding quantitative analytical procedure. This is known as a mono method quantitative study
(Figures 6.1 and 6.2). Quantitative research design may also use more than one quantitative data
collection technique and corresponding analytical procedure. This is known as a multi-method
quantitative study (Figures 6.1 and 6.2). You might, for example, decide to collect quantitative data
using both questionnaires and structured observation, analysing these data using statistical (quantitative)
procedures. Multi-method is the branch of multiple methods research that uses more than one
quantitative or qualitative method but does not mix the two (Figure 6.2).
The use of multiple methods has been advocated within business and management research (Bryman
2006) because it is likely to overcome weaknesses associated with using only a single or mono method,
as well as providing scope for a richer approach to data collection, analysis, and interpretation.
Figure 6.2 Methodological choice (Saunders, 2016, p. 167)
Research strategies
Quantitative research is principally associated with experimental and survey research strategies. In
quantitative research, a survey research strategy is typically conducted through the use of questionnaires
or structured interviews or, possibly, structured observation.
Qualitative research design
Research philosophy
Qualitative research is often associated with an interpretive philosophy (Denzin and Lincoln 2011). It is
interpretive because researchers need to make sense of the subjective and socially constructed meanings
expressed about the phenomenon being studied. Such research is sometimes referred to as naturalistic
since researchers need to operate within a natural setting, or research context, in order to establish trust,
participation, access to meanings, and in-depth understanding. Like quantitative research, qualitative
research may also be used within realist and pragmatist philosophies (see ‘Mixed methods research
design’ later).
Approach to theory development
Many varieties of qualitative research commence with an inductive approach to theory development,
where a naturalistic and emergent research design is used to build a theory or to develop a richer
theoretical perspective than already exists in the literature. However, some qualitative research strategies
start with a deductive approach to test an existing theory using qualitative procedures (Yin 2014). In
practice, much qualitative research uses an abductive approach to theory development where inductive
inferences are developed, and deductive ones are tested iteratively throughout the research.
Characteristics
Qualitative research studies participants’ meanings and the relationships between them, using a variety
of data collection techniques and analytical procedures to develop a conceptual framework and theoretical
contribution. Bansal and Corley (2011) point out that while qualitative research is characterised by
methodological variations, it remains vital, irrespective of the method used, to demonstrate methodological
rigour and theoretical contribution.
Data collection is non-standardised so that questions and procedures may alter and emerge during a
research process that is both naturalistic and interactive. It is likely to use non-probability sampling
techniques. The success of the researcher’s role is dependent not only on gaining physical access to
participants but also on building rapport and demonstrating sensitivity to gain cognitive access to their data.
Qualitative research design may use a single data collection technique, such as semi-structured
interviews, and corresponding qualitative analytical procedure. This is known as a mono method
qualitative study (Figures 6.1 and 6.2). Qualitative research design may also use more than one
qualitative data collection technique and corresponding analytical procedure. This is known as a multi-
method qualitative study (Figures 6.1 and 6.2). You might, for example, decide to collect qualitative
data using in-depth interviews and diary accounts, analysing these data using qualitative procedures.
Research strategies
Qualitative research is associated with a variety of strategies. While these share ontological and
epistemological roots and common characteristics, each strategy has a specific emphasis and scope as
well as a particular set of procedures. Some of the principal strategies used with qualitative research are
action research, case study research, ethnography, Grounded Theory, and narrative research. Some of
these strategies, such as a case study strategy, can also be used in a quantitative research design or in a
mixed methods research design, as we now discuss.
Mixed methods research design
Research philosophy
We consider two philosophical positions that often lead to mixed methods research designs. Mixed
methods research is the branch of multiple methods research that combines the use of quantitative and
qualitative data collection techniques and analytical procedures (Figure 6.2).
In Week 5, we discussed the philosophical position of realism and, in particular, that of the critical realists.
They believe that while there is an external, objective reality to the world in which we live, the way in
which each of us interprets and understands it will be affected by our particular social conditioning. To
accommodate this realist ontology and interpretivist epistemology (Tashakkori and Teddlie 2010),
researchers may, for example, use quantitative analysis of officially published data followed by
qualitative research methods to explore perceptions.
Pragmatism is also likely to influence a mixed methods research design. Pragmatists view the
exclusive adoption of one philosophical position as unhelpful and choose instead to see such positions as
the ends of a continuum, allowing a choice of whichever position or mixture of positions will help them to
undertake their research (Tashakkori and Teddlie 2010). For pragmatists, the nature of the research
question, the research context, and likely research consequences are driving forces determining the most
appropriate methodological choice (Nastasi et al. 2010). Both quantitative and qualitative research are
valued by pragmatists, and the exact choice will be contingent on the particular nature of the research.
Approach to theory development
A mixed methods research design may use a deductive, inductive, or abductive approach to theory
development. For example, quantitative or qualitative research may be used to test a theoretical
proposition or propositions, followed by further quantitative or qualitative research to develop a richer
theoretical understanding. Theory may also be used to provide direction for the research. In this way a
particular theory may be used to provide a focus for the research and to limit its scope (Tashakkori and
Teddlie 2010).
Characteristics
In mixed methods research, quantitative and qualitative techniques are combined in a variety of ways that
range from simple, concurrent forms to more complex and sequential forms (Figure 6.2). The ways in
which quantitative and qualitative research may be combined, as well as the extent to which this may
occur, have led to the identification of a number of variations of mixed methods research (Creswell and
Plano Clark 2011; Nastasi et al. 2010). Concurrent mixed methods research involves the separate use
of quantitative and qualitative methods within a single phase of data collection and analysis (a single-
phase research design) (Figure 6.3). This allows both sets of results to be interpreted together to provide
a richer and more comprehensive response to the research question in comparison to the use of a mono
method design. Where you collect qualitative and quantitative data in the same phase of research in order
to compare how these data sets support one another, you will be using a concurrent triangulation
design.

Figure 6.3 Mixed methods research designs (Saunders, 2016, p. 170)
Using a concurrent mixed methods design should provide richer data than a mono method design and be
shorter in timescale as well as more practical to undertake than a sequential mixed methods design.
Sequential mixed methods research involves more than one phase of data collection and analysis
(Figure 6.3). In this design, the researcher will follow the use of one method with another in order to
expand or elaborate on the initial set of findings. In a double-phase research design, this leads to two
alternative mixed-methods research strategies, either a sequential exploratory research design
(qualitative followed by quantitative) or a sequential explanatory research design (quantitative
followed by qualitative). In a more complex, sequential, multi-phase design, mixed methods research
will involve multiple phases of data collection and analysis (e.g. qualitative followed by quantitative, then
by a further phase of qualitative). Using a double-phase or multi-phase research design suggests a
dynamic approach to the research process, which recognises that mixed methods research is both
interactive and iterative, where one phase subsequently informs and directs the next phase of data
collection and analysis. The exact nature of this interaction and iteration in a particular research project
may shape the way in which qualitative and quantitative methods are chosen and integrated at each phase
of the research (Greene 2007; Nastasi et al. 2010; Ridenour and Newman 2008; Teddlie and Tashakkori
2009).
Where you mix quantitative and qualitative methods at every stage of your research (design, data
collection and analysis, interpretation, and presentation of the research), you will be using a fully
integrated mixed methods research design. Where you use quantitative and qualitative methods at only
one stage or particular stages of your research, you will be using a partially integrated mixed methods
research approach (Nastasi et al. 2010; Teddlie and Tashakkori 2009). Mixed methods research may use
quantitative research and qualitative research equally or unequally (Creswell and Plano Clark 2011). In
this way, the priority or weight given to either quantitative or qualitative research may vary, so that one
methodology has a dominant role, while the other plays a supporting role, depending on the purpose of
the research project. This prioritisation may also reflect the preferences of the researcher or the
expectations of those who commission the research (such as your project tutor or the managers in an
organisation).
The purpose of the research may emphasise the initial use and prioritisation of qualitative research (as in
an exploratory study, where qualitative precedes quantitative) or the initial use and prioritisation of
quantitative research (as in a descriptive study, before the possible use of supporting qualitative research
to explain particular findings further). The overall purpose of the research may also emphasise the
dominance of either quantitative or qualitative research (e.g., as in a sequential project which commences
with a qualitative, exploratory phase, followed by a quantitative, descriptive phase and which is
completed by a further qualitative, explanatory phase). The purpose of other research projects may lead
to the more equal use of quantitative and qualitative research methods. The research approach may also
lead to the relative prioritisation of either quantitative or qualitative methods. In this way, an inductive
approach designed to generate theoretical concepts and to build theory may lead to a greater emphasis on
the use of qualitative methods. The characteristics that help to define mixed methods research highlight
how quantitative and qualitative methods may be combined in a number of ways to provide you with
better opportunities to answer your research question (Tashakkori and Teddlie 2010).
The meaning of secondary data
Secondary data refers to data collected by someone other than the user. Common
sources of secondary data for social science include censuses, information collected by government
departments, organizational records, and data that was originally collected for other research purposes.
Secondary data analysis can save time that would otherwise be spent collecting data and, particularly in
the case of quantitative data, can provide larger and higher-quality databases that would be unfeasible for
any individual researcher to collect on their own. In addition, analysts of social and economic change
consider secondary data essential, since it is impossible to conduct a new survey that can adequately
capture past change and/or developments. However, secondary data analysis can be less useful in
marketing research, as data may be outdated or inaccurate.
The use of secondary data
When thinking about how to obtain data to answer their research question(s) or meet their objectives,
students are increasingly expected to consider undertaking further analyses of data that were collected
initially for some other purpose. Such data are known as secondary data and include both raw data and
published summaries. Once obtained, these data can be further analyzed to provide additional or different
knowledge, interpretations, or conclusions (Bulmer et al. 2009). Despite this, many students
automatically think in terms of collecting new (primary) data specifically for that purpose. Unlike national
governments, non-governmental agencies, and other organizations, they do not have the time, money, or
access to collect detailed large data sets themselves. Fortunately, over the past decade, the number of
sources of potential secondary data has proliferated, as has the ease of gaining access to them. Such secondary
data may enable you to answer or partially answer your research question(s).
Most organizations collect and store a wide variety and large volume of data to support their day-to-day
operations: for example, payroll details, copies of letters, minutes of meetings, and business transactions
such as sales queries and purchases. Quality daily newspapers contain a wealth of data, including reports
about takeover bids and companies’ share prices. Government departments undertake surveys and publish
official statistics covering social, demographic, and economic topics. Consumer research organizations
collect data that are used subsequently by different clients. Trade organizations collect data from their
members on topics such as sales that are subsequently aggregated and published. Search engines such as
Google collect data on the billions of searches undertaken daily, and social networking sites (such as
Facebook) host web pages for particular interest groups, including those set up by organizations, storing
them alongside other data such as group members’ posts and photographs.
Some of these data, in particular, documents such as company minutes, are available only from the
organizations that produce them, and so access will need to be negotiated. Others, such as web pages on
social networking sites, can range from being ‘open’ for everyone using the site to view to being
completely ‘restricted’ other than to group members. Governments’ survey data, such as censuses of
population, are widely available to download in the aggregated form via the Internet as governments
allow open access to data they have collected. Such survey data are also often deposited in, and available
from, data archives. Online computer databases containing company information, such as Amadeus and
Datamonitor, can frequently be accessed via your university library web pages. In addition, companies
and professional organizations usually have their own websites, which may contain data that are useful to your
research project.
For certain types of research projects, such as those requiring national or international comparisons or
data from a large number of people, secondary data will probably provide the main source to answer
your research question(s) and to address your objectives. However, if you are undertaking your research
project as part of a course of study, we recommend that you check the assessment regulations before
deciding whether you are going to use primary or secondary or a combination of both types of data. Some
universities explicitly require students to collect primary data for their research projects. Most research
questions are answered using some combination of secondary and primary data. Invariably where limited
appropriate secondary data are available, you will have to rely mainly on data you collect yourself.
This week, we examine the different types of secondary data that are likely to be available to help you to
answer your research question(s) and meet your objectives, how you might use them, and a range of
methods for locating these data. We then consider the advantages and disadvantages of using secondary
data and discuss ways of evaluating their validity and reliability. We do not attempt to provide a
comprehensive list of secondary data sources because, as these are expanding rapidly, it would be an
impossible task.
Types of secondary data and uses in research
Secondary data include both quantitative (numeric) and qualitative (non-numeric) data and are used
principally in both descriptive and explanatory research. The secondary data you analyze further may be
raw data, where there has been little, if any, processing, or compiled data that have received some form
of selection or summarising. Many secondary data sets currently available were primary data sets that
have been re-combined with other data sets to create larger data sets. Where data sets are massive,
complex, and difficult to process using traditional computational analysis techniques, they are referred
to as big data. Within business and management research projects, secondary
data are used most frequently as part of a case study or survey research strategy. However, there is no
reason not to include secondary data in other research strategies, including Archival Research, Action
Research, and Experimental Research. Different researchers (e.g., Bryman 1989; Dale et al. 1988; Hakim
1982, 2000) have generated a variety of classifications for secondary data. These classifications do not,
however, capture the full variety of data. We have, therefore, built on their ideas to create three main
subgroups of secondary data: document-based, survey-based, and those compiled from multiple sources
(Figure 4.1).
Figure 4.1 Types of secondary data
Source: Mark Saunders, Philip Lewis and Adrian Thornhill 2015
Document secondary data
Document secondary data are often used in research projects that also collect primary data. However, you
can also use them on their own or with other sources of secondary data, for example, for business history
research within an Archival Research strategy. Document secondary data are defined as data that, unlike
the spoken word, endure physically (including digitally) as evidence, allowing data to be transposed
across both time and space and reanalyzed for a purpose different to that for which they were originally
collected (Lee 2012). Increasingly available online, they include both text materials and non-text
materials. Text materials include notices, correspondence (including emails), minutes of meetings, reports
to shareholders, diaries, transcripts of speeches and conversations, administrative and public records, as
well as the text of web pages. Text can also include books, journal and magazine articles, and
newspapers. Although books, articles, journals, and reports are a common storage medium for
compiled secondary data, the text can be important raw secondary data in its own right. You could analyze
the text of companies’ annual reports to establish the espoused attitude of companies in different sectors
to environmental issues. Using Content Analysis, such text secondary data could also be used to generate
statistical measures such as the frequency with which environmental issues are mentioned.
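As a rough illustration of how such a frequency measure might be produced, the Python sketch below counts mentions of environmental terms in two invented report excerpts. The company names, the excerpts, the list of coded terms, and the function name count_terms are all assumptions made for this example. A real Content Analysis would need a carefully justified coding scheme and the full report text.

```python
import re
from collections import Counter

# Hypothetical report excerpts; real annual reports would be loaded from files.
reports = {
    "Company A": "We reduced emissions and cut waste. Emissions targets were met.",
    "Company B": "Revenue grew strongly. Recycling of packaging waste began this year.",
}

# Assumed coding scheme: terms treated as mentions of environmental issues.
environmental_terms = {"emissions", "waste", "recycling"}


def count_terms(text, terms):
    """Count how often each coded term appears in the text (case-insensitive)."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in terms)


for company, text in reports.items():
    counts = count_terms(text, environmental_terms)
    print(company, dict(counts), "total:", sum(counts.values()))
```

Simple word counts like these ignore context (for example, whether a mention is positive or negative), so they would normally be combined with a qualitative reading of the same text.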
Secondary data also include non-text materials (Figure 4.1), such as voice and video recordings,
pictures, drawings, films, and television programs (Lee 2012) as well as the non-text content of web
pages. These data can be analyzed both quantitatively and qualitatively, including transcribing spoken
words and analyzing them as text. In addition, these secondary data can be used to help triangulate
findings based on other data, such as text material and primary data collected through observation,
interviews, or questionnaires. Increasingly researchers are making use of web-based materials generated
by online communities as document secondary data. While data stored in the majority of web pages, such
as blogs and those set up by social networking sites’ user groups, were never intended to be used in this
way, they can still provide secondary data for research projects. There are, however, a number of issues
related to using such data, including locating it, evaluating its usefulness in relation to your research
question and objectives, and associated ethical issues.
For your research project, the document sources you have available can depend on whether you have
been granted access to an organization’s records as well as on your success in locating data archives, and
other Internet, commercial and library sources. Access to an organization’s data will be dependent on
gatekeepers within that organization. In our experience, those research projects that make use of
document secondary data often do so as part of a within-company Action Research project or a case study
of a particular organization. When you analyze text and non-text materials, such as a web page, a
television news report, or a newspaper article directly as part of your research, you are using those
materials as secondary data. However, often, such materials are just the source of your secondary data,
rather than the actual secondary data you are analyzing.
Survey-based secondary data
Survey-based secondary data refers to existing data originally collected for some other purpose using a
survey strategy, usually questionnaires. Such data normally refer to organizations, people, or households.
They are made available as compiled data tables or, increasingly frequently, as a downloadable matrix of
raw data for secondary analysis. Survey-based secondary data will have been collected through one of
three distinct subtypes of survey strategy: censuses, continuous/regular surveys, or ad hoc surveys (Figure
4.1). Censuses are usually carried out by governments and are unique because, unlike surveys,
participation is obligatory. Consequently, they provide excellent coverage of the population surveyed.
Census data are also usually clearly defined, well documented, and of high quality. Such data are easily
accessible in compiled form and are widely used by other organizations and individual researchers.

Continuous and regular surveys are those, excluding censuses, which are repeated over time (Hakim
1982). They include surveys where data are collected throughout the year and those repeated at regular
intervals. Census and continuous and regular survey data provide a useful resource with which to compare
or set in context your research findings from primary data. Aggregate data are usually available via the
Internet, in particular for government surveys. When using these data, you need to check when they were
collected, as there can be over a year between collection and publication!
Survey secondary data may be available in sufficient detail to provide the main data set from which to
answer your research question(s) and to meet your objectives. They may be the only way in which you
can obtain the required data. If your research question is concerned with national variations in consumer
spending, it is unlikely that you will be able to collect sufficient data of your own. You will, therefore,
need to rely on secondary data such as those contained in the report Family Spending (Office for National
Statistics 2014b). For some research questions and objectives, suitable data will be available in published
form. For others, you may need more disaggregated data. This is most likely to be available via the
Internet, often from data archives.
Multiple-source secondary data
Multiple-source secondary data can be compiled entirely from document or survey secondary data or
can be an amalgam of the two. The key factor is that different data sets have been combined to form
another data set prior to your accessing the data. One of the more common types of multiple-source data
that you are likely to come across is online compilations of company information stored in databases.
Other multiple-source secondary data include the various share price listings for different stock markets
reported in the financial pages of quality newspapers. While newspapers are usually available online,
there may be a charge to view their web pages. Fortunately, university libraries usually have recent paper
copies, while national and regional newspapers can also be accessed using online databases. The way in
which a multiple-source data set has been compiled will dictate the sorts of research question(s) or
objectives for which you can use it. One method of compilation is to extract and combine selected
comparable variables from a number of surveys or from the same survey that has been repeated a number
of times to provide longitudinal data. For many undergraduate and taught master’s courses’ research
projects, this is one of the few ways in which you will be able to obtain data over a long period. Other
ways of obtaining time-series data are to use a series of company documents, such as appointment letters
or public and administrative records, as sources from which to create your longitudinal secondary data
set.
Data can also be compiled for the same population over time using a series of ‘snapshots’ to form cohort
studies. Such studies are relatively rare, owing to the difficulty of maintaining contact with members of
the cohort from year to year. Secondary data from different sources can also be combined if they have
the same geographical basis to form area-based data sets (Hakim 2000). Such data sets usually draw
together quantifiable information and statistics. They are commonly compiled by national governments
for their country and their component standard economic planning regions and by regional and local
administrations for their region. Such area-based multiple-source data sets are increasingly only available
online through national governments’ information gateways, regional administrations’ information
gateways, or data archives.

Searching for secondary data
Unless you are approaching your research project intending to analyze one specific secondary data set
that you already know well, your first step will be to ascertain whether the data you need is available.
Your research question(s), objectives, and the literature you have already reviewed will guide this. For
many research projects, you are likely to be unsure as to whether the data you require is available as
secondary data. Fortunately, there are a number of clues to the sorts of data that are likely to be available.
The breadth of data discussed in the previous section serves only to emphasize that, despite the increasing
importance of the Internet, potential secondary data may be stored in a variety of locations. Finding
relevant secondary data requires detective work, which has two interlinked stages:
1. establishing whether the sort of data you require is likely to be available as secondary data;
2. locating the precise data you require.
Establishing the likely availability of secondary data
There are a number of clues to whether the secondary data you require is likely to be available. As part
of your literature review, you will have already read journal articles and modules on your chosen topic.
Where these have made use of secondary data, they will provide you with an idea of the sort of available
data. In addition, these articles and modules should contain full references to the sources of the data.
Where these refer to published secondary data such as those stored in online databases or multiple-source
or survey reports, it is usually relatively easy to find the original source. Establishing the availability of
relevant web-based materials generated by online communities that can be used as secondary data, such
as blogs and pages set up by social networking sites’ user groups, can be considerably more difficult. Tertiary
literature such as indexes and catalogs can also help you to locate secondary data. Online searchable data
archive catalogs may prove a useful source of the sorts of secondary data available.
Advantages and disadvantages of secondary data
Advantages
For many research questions and objectives, the main advantage of using secondary data is the enormous
saving in resources, in particular, your time and money (Vartanian 2011). In general, it is much less
expensive and time-consuming to use secondary data than to collect the data yourself, especially where
the data can be downloaded as a file that is compatible with your analysis software. You will also have
more time to think about theoretical aims and substantive issues, and subsequently, you will be able to
spend more time and effort analyzing and interpreting the data. If you need your data quickly, secondary
data may be the only viable alternative. In addition, they are often higher-quality data than could be
obtained by collecting your own (Smith 2006).
Unobtrusive
Using secondary data within organizations may also have the advantage that, because they have already
been collected, they provide an unobtrusive measure. Cowton (1998) refers to this advantage as
eavesdropping, emphasizing its benefits for sensitive situations.
Longitudinal studies may be feasible

For many research projects, time constraints mean that secondary data provide the only possibility of
undertaking longitudinal studies. This is possible either by creating your own or by using an existing
multiple-source data set. Comparative research may also be possible if comparable data are available.
You may find this to be of particular use for research questions and objectives that require regional or
international comparisons. However, you need to ensure that the data you are comparing were collected
and recorded using comparable methods. Comparisons relying on unpublished data or data that are
currently unavailable in that format, such as the creation of new tables from existing census data, are
likely to be expensive, as such tabulations will have to be specially prepared. Although your research is
dependent on access being granted by the owners of the data, principally governments, many countries
are enshrining increased rights of access to information held by public authorities through freedom of
information legislation. This gives a general right to access to recorded information held by public
authorities, although a charge may be payable. However, this is dependent upon your request not being
contrary to relevant data protection legislation or agreements.
Can provide comparative and contextual data
Often it can be useful to compare data that you have collected with secondary data. This means that you
can place your findings within a more general context or, alternatively, triangulate your findings. If you
have used a questionnaire, perhaps to collect data from a sample of potential customers, secondary data
such as a national census can be used to assess the generalisability of findings; in other words how
representative these data are of the total population.
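One way to make such a comparison concrete is a chi-square goodness-of-fit check of your sample's composition against the census proportions. The sketch below is illustrative only: the age bands, sample counts and census shares are invented, the function name is our own, and the critical value of 5.991 is the standard 5% threshold for two degrees of freedom.

```python
def chi_square_gof(observed, expected_share):
    """Chi-square goodness-of-fit statistic: sample counts vs. population shares."""
    n = sum(observed.values())
    stat = 0.0
    for group, count in observed.items():
        expected = n * expected_share[group]  # count expected if the sample mirrored the census
        stat += (count - expected) ** 2 / expected
    return stat

# Hypothetical questionnaire sample and hypothetical census proportions
sample_counts = {"18-34": 90, "35-54": 70, "55+": 40}
census_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

stat = chi_square_gof(sample_counts, census_shares)
CRITICAL_5PCT_DF2 = 5.991  # chi-square critical value, 2 degrees of freedom, 5% level
differs_from_census = stat > CRITICAL_5PCT_DF2
print(f"chi-square = {stat:.2f}; differs from census profile: {differs_from_census}")
```

A statistic above the critical value suggests the sample's profile differs from that of the population, so findings should be generalized with caution or the sample reweighted.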
Can result in unforeseen discoveries
Reanalysing secondary data can also lead to unforeseen or unexpected new discoveries. Dale et al. (1988)
cite establishing the link between smoking and lung cancer as an example of such a serendipitous
discovery. In this example, the link was established through secondary analysis of medical records that
had not been collected to explore any such relationship.
Permanence of data
Unlike data that you collect yourself, secondary data generally provide a source of data that is
permanent and available in a form that may be checked relatively easily by others (Denscombe 2007).
This means that the data and your research findings are more open to public scrutiny.
Disadvantages
Data that you collect yourself will be collected with a specific purpose in mind: to answer your research
question(s) and to meet your objectives. Unfortunately, secondary data will have been collected for a
specific purpose that differs from your research question(s) or objectives (Denscombe 2007).
Consequently, the data you are considering may be inappropriate for your research question. If this is the
case, then you need to find an alternative source, or collect the data yourself! More probably, you will be
able to answer your research question(s) or address your objectives only partially. Common reasons for
this include the data being collected a few years earlier and so not being current or the methods of
collection differing between the original data sources, which have been amalgamated subsequently to
form the secondary data set you intend to use.

Access may be difficult or costly
Where data have been collected for commercial reasons, gaining access may be difficult or costly. Market
research reports, such as those produced by Mintel or Key Note, may cost a great deal if the report(s) that
you require are not available online via your university’s library.
Aggregations and definitions may be unsuitable
The fact that secondary data were collected for a different purpose may result in other problems, including
ethical ones. Much of the secondary data you use is likely to be in published reports. As part of the
compilation process, data will have been aggregated in some way. These aggregations, while meeting the
requirements of the original research, may not be quite so suitable for your research. The definitions of
data variables may not be the most appropriate for your research question(s) or objectives. In addition,
where you intend to combine data sets, definitions may differ markedly or have been revised over time
(Box 8.7). Alternatively, the documents you are using may represent the interpretations of those who
produced them, rather than offer an objective picture of reality.
No real control over data quality
Although many of the secondary data sets available from governments and data archives are likely to be
of a higher quality than you could ever collect yourself, there is still a need to assess the quality of these
data. Wernicke (2014) notes that although many national statistical agencies are obliged by national law
to provide data of high quality, this may not be the case. Looking at official economic data, he argues that
these are distorted by the informal economy, hidden money, and false and non-responses.
The initial purpose may affect how data are presented
When using data that are presented as part of a report, you also need to be aware of the purpose of that
report and the impact that this will have on the way the data are presented. This is especially so for internal
organizational documents and external documents such as published company reports and newspaper
reports. Reichman (1962; cited by Stewart and Kamins 1993) emphasizes this point referring to
newspapers, although the sentiments apply to many documents. He argues that newspapers select what
they consider to be the most significant points and emphasize these at the expense of supporting data.
This, Reichman states, is not a criticism as the purpose of the reporting is to bring these points to the
attention of readers rather than to provide a full and detailed account. However, if we generalize from
these ideas, we can see that the culture, predispositions, and ideals of those who originally collected and
collated the secondary data will have influenced the nature of these data, at least to some extent. For these
reasons, you must evaluate carefully any secondary data you intend to use. Possible ways of doing this
are discussed in the next section.
Evaluating secondary data sources
Secondary data must be viewed with the same caution as any primary data that you collect. You need to
be sure that:
• they will enable you to answer your research question(s) and to meet your objectives;
• the benefits associated with their use will be greater than the costs;
• you will be allowed access to the data.
Secondary sources that appear relevant at first may not, on closer examination, be appropriate to your
research question(s) or objectives. It is therefore important to evaluate the suitability of secondary data
sources for your research. Invariably, as highlighted in the week’s opening vignette, this can be
problematic where insufficient information is provided by the data source to allow this.
Stewart and Kamins (1993) argue that, if you are using secondary data, you are at an advantage compared
with researchers using primary data. Because the data already exist, you can evaluate them prior to use.
The time you spend evaluating any potential secondary data source is time well spent, as rejecting
unsuitable data earlier can save much wasted time later! Such investigations are even more important
when you have a number of possible secondary data sources you could use. Most authors suggest a range
of validity and reliability criteria against which you can evaluate potential secondary data. These, we
believe, can be incorporated into a three-stage process (Figure 4.2). However, this is not always a
straightforward process, as sources of the secondary data do not always contain all the information you
require to undertake your evaluation.
Alongside this process, you also need to consider the accessibility of the secondary data. For some
secondary data sources, in particular, those available via the Internet or in your university library, this
will not be a problem. It may, however, still necessitate long hours working in the library if the sources
are paper-based and ‘for reference only.’ For other data sources, such as those within organizations, you
need to obtain permission prior to gaining access and may well also need to consider potential ethical
implications where personal data are involved. This will be necessary even if you are working for the
organization.
Overall suitability
Measurement validity
One of the most important criteria for the suitability of any data set is measurement validity. Secondary
data that fail to provide you with the information that you need to answer your research question(s) or
meet your objectives will result in invalid answers (Smith 2006). Unfortunately, there are no clear
solutions to problems of measurement invalidity. All you can do is try to evaluate the extent of the data’s
validity and make your own decision. A common way of doing this is to examine how other researchers
have coped with this problem for a similar secondary data set in a similar context. If they found that the
measures, while not exact, were suitable, then you can be more confident that they will be suitable for
your research question(s) and objectives. If they had problems, then you may be able to incorporate their
suggestions as to how to overcome them. Your literature search will probably have identified other such
studies already.
Coverage and unmeasured variables
The other important overall suitability criterion is coverage. You need to be sure that the secondary data
cover the population about which you need data, for the time period you need, and contain data variables
that will enable you to answer your research question(s) and to meet your objectives. For all secondary
data sets, coverage will be concerned with two issues:
• ensuring that unwanted data are or can be excluded;
• ensuring that sufficient data remain for analyses to be undertaken once unwanted data have been
excluded (Hakim 2000).
When analyzing secondary survey data, you will need to exclude those data that are not relevant to your
research question(s) or objectives. Some secondary data sets, in particular, those collected using a survey
strategy, may not include variables you have identified as necessary for your analysis. These are termed
unmeasured variables. Their absence may not be particularly important if you are undertaking descriptive
research. However, it could drastically affect the outcome of explanatory research as a potentially
important variable has been excluded.
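The two coverage issues and the check for unmeasured variables can be sketched as a simple screening step. This is a minimal illustration under assumed conditions: the records, variable names, selection rule and minimum case count are all hypothetical, and `assess_coverage` is our own helper rather than part of any package.

```python
def assess_coverage(records, is_wanted, required_vars, min_cases):
    """Exclude unwanted cases, then report whether enough remain and which variables are absent."""
    wanted = [r for r in records if is_wanted(r)]
    # A variable counts as unmeasured if any retained case lacks it entirely
    unmeasured = sorted(v for v in required_vars if not all(v in r for r in wanted))
    return {
        "cases_kept": len(wanted),
        "enough_cases": len(wanted) >= min_cases,
        "unmeasured": unmeasured,
    }

# Hypothetical secondary survey records
records = [
    {"region": "North", "year": 2013, "spend": 410.0},
    {"region": "North", "year": 2014, "spend": 395.5},
    {"region": "South", "year": 2014, "spend": 520.0},
    {"region": "North", "year": 2009, "spend": 380.0},
]

report = assess_coverage(
    records,
    is_wanted=lambda r: r["region"] == "North" and r["year"] >= 2012,
    required_vars=["spend", "household_size"],
    min_cases=3,
)
print(report)
```

Here only two cases survive the exclusion and one required variable is unmeasured, which would signal that this data set is unlikely to support explanatory research on its own.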
Precise suitability
Reliability and validity
The reliability and validity you ascribe to secondary data are functions of the method by which the data
were collected and the source. You can make a quick assessment of these by looking at the source of the
data. Dochartaigh (2007) and others refer to this as assessing the authority or reputation of the source.
Survey data from large, well-known organizations such as those found in Mintel and Key Note market
research reports are likely to be reliable and trustworthy. The continued existence of such organizations
is dependent on the credibility of their data. Consequently, their procedures for collecting and compiling
the data are likely to be well thought through and accurate. Survey data from government organizations
are also likely to be reliable, although they may not always be perceived as such. However, you will
probably find the validity of documentary data, such as organizations’ records, more difficult to assess.
While organizations may argue that their records are reliable, there are often inconsistencies and
inaccuracies. You, therefore, need also to examine the method by which the data were collected and try
to ascertain the precision needed by the original (primary) user. Dochartaigh (2007) suggests a number
of areas for initial assessment of the authority of documents available via the Internet. These can be
adapted to assess the authority of all types of secondary data. First, as suggested in the previous paragraph,
it is important to discover the person or organization responsible for the data and to be able to obtain
additional information through which you can assess the reliability of the source. For data in printed
publications, this is usually reasonably straightforward. However, for secondary data obtained via the
Internet, it may be more difficult. Dochartaigh (2007), therefore, suggests that you also look for a
copyright statement and the existence of published documents relating to the data to help validation. The
former of these, when it exists, can indicate who is responsible for the data. The latter, he argues,
reinforces the data’s authority, as printed publications are regarded as more reliable.
For all secondary data, a detailed assessment of the validity and reliability will involve you in an
assessment of the method or methods used to collect the data (Dale et al. 1988). These may be provided
as hyperlinks for Internet-based data sets, although they may not be sufficiently detailed to enable you to
make a full assessment. Alternatively, they may be discussed in the methodology section of an associated
report. Your assessment will involve looking at who was responsible for collecting or recording the
information and examining the context in which the data were collected. From this, you should gain some
feeling regarding the likelihood of potential errors or biases. In addition, you need to look at the process
by which the data were selected and collected or recorded. Where sampling has been used to select cases
(usually as part of a survey strategy), the sampling procedure adopted, and the associated sampling error
and response rates will give clues to validity. Secondary data collected using a questionnaire with a high
response rate are also likely to be more reliable than those from one with a low response rate. However,
commercial providers of high-quality, reliable data sets may be unwilling to disclose details about how
data were collected. This is particularly the case where these organizations see the methodology as
important to their competitive advantage.
The validity and reliability of collection methods for survey data will be easier to assess where you have
a clear explanation of the techniques used to collect the data. This needs to include a clear explanation of
any sampling techniques used and response rates (discussed earlier) as well as a copy of the data collection
instrument, which is usually a questionnaire. By examining the questions by which data were collected,
you will gain a further indication of the validity. Where data have been compiled, as in a report, you need
to pay careful attention to how these data were analyzed and how the results are reported. Where
percentages (or proportions) are used without actually giving the totals on which these figures are based,
you need to scrutinize the data. Where quotations appear to be used selectively without other supporting
evidence, you should beware, as the data may be unreliable.
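To see why percentages reported without totals deserve scrutiny, consider the approximate 95% margin of error for a proportion, 1.96 × √(p(1 − p)/n). The sketch below uses invented figures, and the normal approximation it relies on is itself rough for very small samples, which only strengthens the point.

```python
import math

def proportion_margin(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# The same reported "60%" based on two hypothetical totals
margin_small = proportion_margin(0.60, 10)    # 6 of 10 respondents
margin_large = proportion_margin(0.60, 1000)  # 600 of 1,000 respondents

print(f"n = 10:   60% plus or minus {margin_small:.0%}")
print(f"n = 1000: 60% plus or minus {margin_large:.0%}")
```

An identical percentage is far less informative when it rests on a handful of cases, so a report that withholds the totals leaves you unable to judge its precision.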
Measurement bias
Measurement bias can occur for three reasons (Hair et al. 2011):
• deliberate distortion of data;
• changes in the way data are collected;
• when the data collection technique did not truly measure the topic of interest.
Deliberate distortion occurs when data are recorded inaccurately on purpose, and is most common for
secondary data sources such as organizational records. Managers may deliberately fail to record minor
accidents to improve safety reports for their departments. Data that have been collected to further a
particular cause or the interests of a particular group are more likely to be suspect as the purpose of the
study may be to reach a predetermined conclusion (Smith 2006). Reports of consumer satisfaction surveys
may deliberately play down negative comments to make the service appear better to their target audience
of senior managers and shareholders, and graphs may deliberately be distorted to show an organization
in a more favorable light. Unfortunately, measurement bias resulting from deliberate distortion is difficult
to detect. While we believe that you should adopt a neutral stance about the possibility of bias, you still
need to look for pressures on the original source that might have biased the data. For written documents
such as minutes, reports, and memos, the intended target audience may suggest possible bias. Therefore,
where possible, you will need to triangulate the findings with other independent data sources. Where data
from two or more independent sources suggest similar conclusions, you can have more confidence that
the data on which they are based are not distorted. Conversely, where data suggest different conclusions
you need to be more wary of the results. Changes in the way in which data were collected can also
introduce changes in measurement bias.

Provided that the method of collecting data remains constant in terms of the people collecting it and the
procedures used, the measurement biases should remain constant. Once the method is altered, perhaps
through a new procedure for taking minutes or a new data collection form, then the bias also changes.
This is very important for longitudinal data sets where you are interested in trends rather than actual
numbers. Your detection of biases is dependent on discovering that the way data are recorded has
changed. Within-company sources are less likely to have documented these changes than
government-sponsored sources.
Box 4.1 Checklist

Evaluating your secondary data sources
Overall suitability
✔ Does the data set contain the information you require to answer your research question(s) and meet your objectives?
✔ Do the measures used match those you require?
✔ Is the data set a proxy for the data you really need?
✔ Does the data set cover the population that is the subject of your research?
✔ Does the data set cover the geographical area that is the subject of your research?
✔ Can data about the population that is the subject of your research be separated from unwanted data?
✔ Are the data for the right time period or sufficiently up to date?
✔ Are data available for all the variables you require to answer your research question(s) and meet your objectives?
✔ Are the variables defined clearly?
Precise suitability
✔ How reliable is the data set you are thinking of using?
✔ How credible is the data source?
✔ Is it clear what the source of the data is?
✔ Do the credentials of the source of the data (author, institution or organisation sponsoring the data) suggest it is likely
to be reliable?
✔ Do the data have an associated copyright statement?
✔ Do associated published documents exist?
✔ Does the source contain contact details for obtaining further information about the data?
✔ Is the method described clearly?
✔ If sampling was used, what was the procedure and what were the associated sampling errors and response rates?
✔ Who was responsible for collecting or recording the data?
✔ (For surveys) Is a copy of the questionnaire or interview checklist included?
✔ (For compiled data) Are you clear how the data were analysed and compiled?
✔ Are the data likely to contain measurement bias?
✔ What was the original purpose for which the data were collected?
✔ Who was the target audience and what was its relationship to the data collector or compiler (were there any vested
interests)?
✔ Have there been any documented changes in the way the data are measured or recorded including definition changes?
✔ How consistent are the data obtained from this source when compared with data from other sources?
✔ Have the data been recorded accurately?
✔ Are there any ethical concerns with using the data?

Costs and benefits

✔ What are the financial and time costs of obtaining these data?
✔ Can the data be downloaded into a spreadsheet, statistical analysis software or word processor?
✔ Do the overall benefits of using these secondary data sources outweigh the associated costs?
And finally
✔ Is permission required to use these data and, if ‘yes’, can you obtain it?
Sources: Dale et al. (1988); Dochartaigh (2007); Hair et al. (2011); Smith (2006); Stewart and Kamins (1993); Vartanian (2011)
Statistical Considerations
Statistical power is defined as the probability of rejecting a null hypothesis (H0), assuming that it is false,
and given additional assumptions about the true values of population parameters (see, e.g., Cohen 1988,
1992; Norcross et al. 2017). It differs for different study designs and different statistical tests; for example,
it can sometimes be improved by using pretests or repeated measures (see Guo et al. 2013; Vickers and
Altman 2001). However, for a given study design and a given analysis plan, power depends mainly on
effect size and sample size, so we focus on these two factors for simplicity.
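The dependence of power on effect size and sample size can be illustrated with a normal approximation for a two-sided comparison of two group means. This is a simplified sketch, not the exact calculation a t-test-based power analysis would perform: it assumes equal group sizes, a standardized effect size d (Cohen's d), and a large-sample z-test; the function name is our own.

```python
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test for standardized effect size d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # critical value for the chosen alpha
    shift = d * (n_per_group / 2) ** 0.5    # expected value of the test statistic under the alternative
    # Probability that the statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (20, 64, 200):
    print(f"d = 0.5, n per group = {n}: power = {approx_power(0.5, n):.2f}")
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}, n per group = 64: power = {approx_power(d, 64):.2f}")
```

With a medium effect (d = 0.5) and 64 cases per group, the approximation gives power of about 0.8; power rises with either a larger sample or a larger effect.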
There are methodological considerations that arise more frequently when using secondary data, but these
issues are certainly not unique to its use. Among these considerations are the treatment of missing data,
utilization of sampling weights, and being thoughtful about the statistical consequences of working with
extensive samples. Andersen et al. (2010) used secondary data to examine the association between
posttraumatic stress disorder (PTSD) and physical disease in over 4,000 Iraq and Afghanistan war
veterans. The large sample size was beneficial because it allowed for precise estimation of population
values and provided good statistical power for planned comparisons (i.e., a high probability of detecting an
association that exists in the population). However, an important consideration with large sample sizes is
the risk of having too much power and of detecting very small, perhaps trivial, effects as statistically
significant. Because the project was examining mental and physical disease conditions, this team
considered the significant findings in terms of clinical relevance. They found that within the first 5 years
of returning from war, veterans with PTSD were at over 30% increased risk of hypertensive and digestive
disease conditions. Compared to a statistically significant but weak association (1–2% increased risk), an
effect of this size is clinically meaningful for health care providers when considering physical disease
prevention and treatment programs for veterans with PTSD.
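The risk of "too much power" can be shown with a small numerical sketch. It assumes a two-group comparison analyzed with a large-sample z-test and an observed standardized difference of d = 0.05, which would usually be regarded as trivial; the sample sizes are invented and are not those of the study described above.

```python
from statistics import NormalDist

def two_sided_p(d, n_per_group):
    """Two-sided p-value for an observed standardized mean difference d (z approximation)."""
    z_stat = d * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

trivial_effect = 0.05
p_modest_sample = two_sided_p(trivial_effect, 200)     # 200 cases per group
p_huge_sample = two_sided_p(trivial_effect, 20_000)    # 20,000 cases per group

print(f"n = 200 per group:    p = {p_modest_sample:.3f}")
print(f"n = 20,000 per group: p = {p_huge_sample:.2g}")
```

The same trivial difference is non-significant in the modest sample but highly significant in the very large one, which is why statistical significance should be read alongside the size of the effect and its practical relevance.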
To determine the practical magnitude of significant findings, it can also be helpful to compare the
observed effect to standardized effect sizes (Cohen, 1988) or to a known effect size as presented in similar
published studies. Importantly, even with large sample sizes, there may not be adequate power for subset
analyses or to examine interactions between variables of interest. Although their sizeable secondary
dataset included both female and male veterans, Andersen et al. (2010) did not have adequate power to
examine interactions between gender, PTSD, and disease conditions among the smaller subsample of
women.
Missing data
Secondary datasets, particularly in longitudinal studies, often have missing data. The secondary data
repository should have information about the amount and location of missing data in the dataset. Although
this is an unavoidable problem, many commonly used statistical software programs offer various options
for handling missing data. One conventional approach deletes all observations with a missing value on
any one of the variables being used in the analysis (i.e., listwise deletion, complete case analysis). This
approach assumes that the missing data are ‘missing completely at random’ (MCAR), which means that
the subset of subjects with complete data represents a random sample of the original set of observations
(Allison, 2001). Unfortunately, this assumption is not often met, and if missing data are scattered across
many observations, this approach can reduce the sample size substantially, leading to inefficient use of the
data and reduced statistical power. There are different types of missing data mechanisms (Rubin, 1976)
and some statistical techniques make assumptions about the type of mechanism (e.g., Linear Mixed
Models assume the data are ‘missing at random’ [MAR], a much weaker [less restrictive] assumption
than MCAR) (Verbeke & Molenberghs, 2000; West, Welch, & Galecki, 2007).
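The cost of listwise deletion when missing values are scattered across observations can be seen in a small sketch. The six records and variable names below are invented, with `None` standing for a missing value; the helper function is our own illustration, not the routine used by any particular statistical package.

```python
def listwise_delete(rows, variables):
    """Keep only rows with no missing value (None) on any analysis variable."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]

variables = ["age", "income", "score"]
rows = [
    {"age": 34,   "income": 41000, "score": 7},
    {"age": None, "income": 38000, "score": 5},
    {"age": 51,   "income": None,  "score": 6},
    {"age": 29,   "income": 45000, "score": None},
    {"age": 46,   "income": 52000, "score": 8},
    {"age": 38,   "income": 47000, "score": 4},
]

# Share of missing values on each variable taken separately
missing_rate = {v: sum(r[v] is None for r in rows) / len(rows) for v in variables}
complete = listwise_delete(rows, variables)

print(missing_rate)
print(f"complete cases: {len(complete)} of {len(rows)}")
```

Each variable is missing for only one case in six, yet half of the cases are lost once all three variables are required together, with the corresponding loss of statistical power.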
In contrast to MCAR, under an assumed MAR missing data mechanism, the subset of cases with complete
data is not assumed to be a random sample of the original set of observations. This distinction has
implications for the validity of estimation procedures because maximum likelihood (ML) will produce
valid parameter estimates if the missing data mechanism is MCAR or MAR, although for MAR it is
necessary to assume that the specification of the joint distribution of the responses is correct. Under
generalized least squares estimation (GLS), parameter estimates are valid for MCAR but can be biased
for assumed MAR mechanisms. As a rule, it is important to assess the amount and mechanism of missing
data because this information will help inform the choice of a statistical method and the decision to use a
missing value imputation procedure (Little & Rubin, 2002). We strongly recommend consulting with an
individual well-versed in these statistical issues prior to beginning one’s data analysis.
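The practical difference between MCAR and MAR can be illustrated with a small simulation. This is a sketch under stated assumptions, not a recommended analysis procedure: the data are simulated, the missingness probabilities (0.4, 0.7 and 0.1) are arbitrary, and the stratified adjustment at the end is a simple stand-in chosen for transparency rather than the maximum likelihood estimation discussed above. Here y has a true mean of 0, and under MAR the chance that y is missing depends only on the fully observed variable x.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 20000

# x is always observed; y is correlated with x and has a true mean of 0.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

# MCAR: every y value has the same 40% chance of being missing.
mcar_keep = [rng.random() > 0.4 for _ in range(n)]
mcar_mean = mean(yi for yi, keep in zip(y, mcar_keep) if keep)

# MAR: y is missing far more often when the observed x is positive.
mar_keep = [rng.random() > (0.7 if xi > 0 else 0.1) for xi in x]
mar_mean = mean(yi for yi, keep in zip(y, mar_keep) if keep)

# Conditioning on x: average the observed y within each x group, then
# weight the groups by their share of the full sample.
pos_obs = [yi for xi, yi, keep in zip(x, y, mar_keep) if keep and xi > 0]
neg_obs = [yi for xi, yi, keep in zip(x, y, mar_keep) if keep and xi <= 0]
share_pos = sum(1 for xi in x if xi > 0) / n
adjusted_mean = share_pos * mean(pos_obs) + (1 - share_pos) * mean(neg_obs)

print(f"Complete-case mean under MCAR: {mcar_mean:.3f}")
print(f"Complete-case mean under MAR:  {mar_mean:.3f}")
print(f"MAR mean adjusted for x:       {adjusted_mean:.3f}")
```

Under MCAR the complete cases behave like a random sample, so their mean stays close to 0. Under MAR the complete cases over-represent low values of x, so the unadjusted mean is pulled well below 0, while the estimate that uses the observed x moves back towards the true value.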
References
Allison, P. D. (2001). Missing data. Sage publications.
Ataullah, A., Davidson, I., Le, H., & Wood, G. (2014). Corporate diversification, information asymmetry
and insider trading. British Journal of Management, 25(2), 228-251.
Andersen, J., Wade, M., Possemato, K., & Ouimette, P. (2010). Association between posttraumatic stress
disorder and primary care provider-diagnosed disease among Iraq and Afghanistan
veterans. Psychosomatic Medicine, 72(5), 498-504.
Bansal, P., & Corley, K. (2011). The coming of age for qualitative research: Embracing the diversity of
qualitative methods.
Becker, H.S. (1998) Tricks of the Trade. Chicago University Press.
Bryman, A. (1989). In Bulmer, M. (Ed.), Research methods and organisation studies.
Bulmer, M., Sturgis, P.J. and Allum, N. (2009) Editors' introduction. In Bulmer, M., Sturgis, P.J. and Allum,
N. (Eds.), Secondary Analysis of Survey Data.
Muller, K. (1989). Statistical power analysis for the behavioral sciences.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155.
Cowton, C. J. (1998). The use of secondary data in business ethics research. Journal of Business
Ethics, 17(4), 423-434.
Creswell, J.W., and Plano Clark, V.L. (2011) Designing and Conducting Mixed Methods Research .
Sage.
Dale, A., Arber, S., & Procter, M. (1988). Doing secondary analysis. Unwin Hyman.
Davis, A., Hirsch, D., and Padley, M. (2014) A Minimum Income Standard for the UK in 2014. A Joseph
Rowntree Foundation Report.
DeMers, J. (2014). The top ten benefits of social media marketing. Forbes.
Denscombe, M. (2007). The good research guide for small scale research.
Denzin, N. K., & Lincoln, Y. S. (Eds.). (2011). The Sage handbook of qualitative research. Sage.
Dochartaigh, N. Ó. (2012). Internet research skills. Sage.
European Commission (2015) European Union Labour Force Survey (EU LFS).
George, G., Haas, M. R., & Pentland, A. (2014). Big data and management.
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., & Brilliant, L. (2009).
Detecting influenza epidemics using search engine query data. Nature, 457(7232), 1012-1014.
Greene, J. C. (2007). Mixed methods in social inquiry (Vol. 9). John Wiley & Sons.
Guo, Y., Logan, H. L., Glueck, D. H., & Muller, K. E. (2013). Selecting a sample size for studies with
repeated measures. BMC Medical Research Methodology, 13(1), 1-8.
Hair, J.F., Celsi, J.W., Money, A.H., Samouel, P. and Page, M.J. (2011) Essentials of Business Research
Methods. Sharpe.
Hakim, C. (1982). Secondary analysis in social research: A guide to data sources and methods with
examples. Allen and Unwin/Unwin Hyman.
Hakim, C. (2000). Research Design, Successful designs for social and economic research. Routledge.
Hookway, N. (2008). 'Entering the blogosphere': some strategies for using blogs in social
research. Qualitative Research, 8(1), 91-113.
Kavanagh, M.J., Thite, M. and Johnson, R.D. (2012) Human Resource Information Systems: Basics,
Applications, and Future Directions. Sage.
Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: traps in big data
analysis. Science, 343(6176), 1203-1205.
Symon, G., & Cassell, C. (Eds.). (2012). Qualitative organizational research: core methods and current
challenges. Sage.
Little, R. J., & Rubin, D. B. (2002). Statistical inference with missing data.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Hung Byers, A. (2011). Big
data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute.
Nastasi, B. K., Hitchcock, J. H., & Brown, L. M. (2010). An inclusive framework for conceptualizing
mixed methods design typologies: Moving toward fully integrated synergistic research
models. Handbook of mixed methods in social & behavioral research, 2, 305-338.
Norcross, J. C., Hogan, T. P., Koocher, G. P., & Maggio, L. A. (2017). Clinician's guide to evidence-based
practices: Behavioral health and addictions. Oxford University Press.
Reichman, C.S. (1962) Use and Abuse of Statistics. Oxford University Press.
Saunders, M. N., Dietz, G., & Thornhill, A. (2014). Trust and distrust: Polar opposites, or independent but
co-existing?. Human Relations, 67(6), 639-665.
Benz, C. R., Ridenour, C. S., & Newman, I. (2008). Mixed methods research: Exploring the interactive
continuum. SIU Press.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581-592.
Saunders, M., Lewis, P., & Thornhill, A. (2016). Research methods for business students. Pearson
education.
Smith, E. (2008). Using secondary data in educational and social research. McGraw-Hill Education
(UK).
Stewart, D. W., & Kamins, M. A. (1993). Secondary research: Information sources and methods (Vol. 4).
Sage.
Tashakkori, A., & Teddlie, C. (2010). Sage handbook of mixed methods in social and behavioral research.
SAGE publications.
Teddlie, C. and Tashakkori, A. (2009) Foundations of Mixed Methods Research: Integrating Quantitative
and Qualitative Approaches in the Social and Behavioral Sciences. Sage.
Vartanian, T. P. (2010). Secondary data analysis. Oxford University Press.
Verbeke, G., & Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer-Verlag.
Vickers, A. J., & Altman, D. G. (2001). Analysing controlled trials with baseline and follow up
measurements. BMJ, 323(7321), 1123-1124.
Wernicke, I. H. (2014). Quality of Official Statistics Data on the Economy. Journal Of Finance,
Accounting & Management, 5(2).
West, B. T., Welch, K. B., & Galecki, A. T. (2006). Linear mixed models: a practical guide using
statistical software. Chapman and Hall/CRC.
Yin, R. K. (2014). Case study research: Design and methods (Vol. 5). Sage.
