
CHAPTER FIVE: PROCESSING AND ANALYSIS OF DATA

5.1 Nature and Meaning of Data Processing

What is data processing?


Data processing occurs when data is collected and translated into usable
information. Usually performed by a data scientist or team of data
scientists, data processing must be done correctly so as not to
negatively affect the end product, or data output.
Data processing starts with data in its raw form and converts it into a
more readable format (graphs, documents, etc.), giving it the form and
context necessary to be interpreted by computers and utilized by
employees throughout an organization.
Six stages of data processing
1. Data collection
Collecting data is the first step in data processing. Data is pulled from
available sources, including data lakes and data warehouses. It is
important that the data sources available are trustworthy and well-built
so the data collected (and later used as information) is of the highest
possible quality.
2. Data preparation
Once the data is collected, it then enters the data preparation stage. Data
preparation, often referred to as “pre-processing”, is the stage at which
raw data is cleaned up and organized for the following stage of data
processing. During preparation, raw data is diligently checked for any
errors. The purpose of this step is to eliminate bad data (redundant,
incomplete, or incorrect data) and begin to create high-quality data for
the best business intelligence.
3. Data input
The clean data is then entered into its destination (perhaps a CRM like
Salesforce or a data warehouse like Redshift), and translated into a
language that it can understand. Data input is the first stage in which raw
data begins to take the form of usable information.
4. Processing
During this stage, the data inputted to the computer in the previous stage
is actually processed for interpretation. Processing is done using
machine learning algorithms, though the process itself may vary slightly
depending on the source of data being processed (data lakes, social
networks, connected devices etc.) and its intended use (examining
advertising patterns, medical diagnosis from connected devices,
determining customer needs, etc.).
5. Data output/interpretation
The output/interpretation stage is the stage at which data is finally usable
to non-data scientists. It is translated, readable, and often in the form of
graphs, videos, images, plain text, etc. Members of the company or
institution can now begin to self-serve the data for their own data
analytics projects.
6. Data storage

The final stage of data processing is storage. After all of the data is
processed, it is then stored for future use. While some information may
be put to use immediately, much of it will serve a purpose later on. Plus,
properly stored data is a necessity for compliance with data protection
legislation like GDPR. When data is properly stored, it can be quickly
and easily accessed by members of the organization when needed.
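
As a rough illustration of these six stages, the sketch below walks a tiny table through collection, preparation, processing, output and storage. Python and pandas are used here only as one convenient tool, and the file name and the figures are invented; the point is simply how each stage hands cleaned-up data to the next.

    # A minimal, illustrative sketch of the six data processing stages.
    import pandas as pd

    # 1. Data collection: pull raw records from an assumed source.
    raw = pd.DataFrame({
        "region": ["North", "North", "South", None],
        "sales":  [120.0, 120.0, 95.5, 80.0],
    })

    # 2. Data preparation: remove duplicate and incomplete rows.
    prepared = raw.drop_duplicates().dropna()

    # 3. Data input: the cleaned frame is now in a structure the tools understand.
    # 4. Processing: derive the information we actually want.
    summary = prepared.groupby("region")["sales"].mean()

    # 5. Output/interpretation: a readable result for non-specialists.
    print(summary)

    # 6. Storage: persist the processed output for later use.
    summary.to_csv("sales_summary.csv")
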
5.1.1 Qualitative Data Processing: Editing; Coding; Classification.

After the data have been collected, the researcher turns to the task of analyzing them. The data, after collection,
has to be processed and analyzed in accordance with the outline laid down for the purpose at the time of
developing the research plan. This is essential for a scientific study and for ensuring that we have all relevant data
for making contemplated comparisons and analysis. The analysis of data requires a number of closely related
operations such as establishment of categories, the application of these categories to raw data through coding,
tabulation and then drawing statistical inferences. The unwieldy data should necessarily be condensed into a few
manageable groups and tables for further analysis. Thus, a researcher should classify the raw data into some
purposeful and usable categories. Coding operation is usually done at this stage through which the categories of
data are transformed into symbols that may be tabulated and counted. Editing is the procedure that improves the
quality of the data for coding. With coding the stage is ready for tabulation. Tabulation is a part of the technical
procedure wherein the classified data are put in the form of tables. The mechanical devices can be made use of at
this juncture. A great deal of data, especially in large inquiries, is tabulated by computers. Computers not only
save time but also make it possible to study a large number of variables affecting a problem simultaneously.
Analysis work after tabulation is generally based on the computation of various percentages, coefficients, etc., by
applying various well-defined methods or techniques. In the process of analysis, relationships or differences
supporting or conflicting with original or new hypotheses should be subjected to tests of significance to determine
with what validity the data can be said to indicate any conclusion(s).

DATA ANALYSIS BASICS: EDITING, CODING, & CLASSIFICATION


After collecting data, it must be reduced to some form suitable for analysis so that conclusions or findings can be
reported to the target population. For analyzing data, researchers must decide –
(a) Whether the tabulation of data will be performed by hand or by computer.
(b) How information can be converted into a form that will allow it to be processed efficiently.
(c) What statistical tools or methods will be employed?
Nowadays computers have become an essential tool for the tabulation and analysis of data. Even in simple
statistical procedures computer tabulation is encouraged for easy and flexible handling of data. Micro and laptop
computers can produce tables of any dimension and perform statistical operations much more easily and usually
with far less error than is possible manually. If the data set is large and the processing is undertaken by computer, the
following issues are considered.
1. Data preparation which includes editing, coding, and data entry.
2. Exploring, displaying and examining data which involves breaking down, examining and rearranging data so as
to search for meaningful description, patterns and relationships.

1. EDITING
The first step in analysis is to edit the raw data. Editing detects errors and omissions and corrects them wherever possible.
The editor’s responsibility is to guarantee that data are – accurate; consistent with the intent of the questionnaire;
uniformly entered; complete; and arranged to simplify coding and tabulation. Editing of data may be
accomplished in two ways – (i) field editing and (ii) in-house, also called central, editing. Field editing is
preliminary editing of data by a field supervisor on the same day as the interview. Its purpose is to identify
technical omissions, check legibility, and clarify responses that are logically and conceptually inconsistent. When
gaps are present from interviews, a call-back should be made rather than guessing what the respondent would
probably have said. The supervisor should also re-interview a few respondents, at least on some pre-selected questions, as a validity
check. In central or in-house editing, all the questionnaires undergo thorough editing. It is a rigorous job performed
by central office staff.
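
As a hedged sketch of what in-house editing checks can look like once questionnaires are in electronic form, the snippet below flags records that are incomplete or outside plausible bounds so an editor can follow them up. The field names and valid ranges are invented; pandas is assumed only as a convenient tool.

    # Illustrative in-house editing checks (field names and ranges are assumptions).
    import pandas as pd

    responses = pd.DataFrame({
        "respondent_id": [1, 2, 3],
        "age":           [34, 230, None],   # 230 is implausible, None is a gap
        "income":        [1200, 800, 950],
    })

    # Completeness: any missing answers?
    incomplete = responses[responses.isnull().any(axis=1)]

    # Accuracy/consistency: values outside a plausible range.
    implausible = responses[(responses["age"] < 0) | (responses["age"] > 120)]

    print("Incomplete records:\n", incomplete)
    print("Implausible records:\n", implausible)
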
2. CODING: Coding refers to the process of assigning numerals or other symbols to answers so that responses
can be put into a limited number of categories or classes. Such classes should be appropriate to the research
problem under consideration. They must also possess the characteristic of exhaustiveness (i.e., there must be a
class for every data item) and also that of mutual exclusivity, which means that a specific answer can be placed in
one and only one cell in a given category set. Another rule to be observed is that of unidimensionality, by which is
meant that every class is defined in terms of only one concept. Coding is necessary for efficient analysis and
through it the several replies may be reduced to a small number of classes which contain the critical information
required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This
makes it possible to precode the questionnaire choices, which in turn is helpful for computer tabulation as one
can keypunch directly from the original questionnaires. But in the case of hand coding some standard
method may be used. One such standard method is to code in the margin with a coloured pencil. The other
method is to transcribe the data from the questionnaire to a coding sheet. Whatever method is adopted, one
should see that coding errors are altogether eliminated or reduced to the minimum level.
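
The short sketch below shows one hedged way to apply a pre-coded scheme in software. The answer categories and numeric codes are invented; what matters is that every possible answer maps to exactly one code (exhaustiveness and mutual exclusivity), with a catch-all class for anything unexpected.

    # Illustrative coding scheme: answers -> numeric codes (categories are assumptions).
    CODE_BOOK = {
        "strongly agree": 1,
        "agree": 2,
        "neutral": 3,
        "disagree": 4,
        "strongly disagree": 5,
    }
    OTHER = 9  # catch-all class keeps the scheme exhaustive

    def code_answer(answer: str) -> int:
        """Return the single code for an answer (mutual exclusivity by construction)."""
        return CODE_BOOK.get(answer.strip().lower(), OTHER)

    replies = ["Agree", "strongly agree", "no opinion"]
    print([code_answer(r) for r in replies])   # -> [2, 1, 9]
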
3. CLASSIFICATION: Most research studies result in a large volume of raw data which must be reduced into
homogeneous groups if we are to get meaningful relationships. This fact necessitates classification of data which
happens to be the process of arranging data in groups or classes on the basis of common characteristics. Data
having a common characteristic are placed in one class and in this way the entire data get divided into a number
of groups or classes. Classification can be one of the following two types, depending upon the nature of the
phenomenon involved:
(a) Classification according to attributes: As stated above, data are classified on the basis of common
characteristics which can either be descriptive (such as literacy, sex, honesty, etc.) or numerical (such as weight,
height, income, etc.). Descriptive characteristics refer to qualitative phenomenon which cannot be measured
quantitatively; only their presence or absence in an individual item can be noticed. Data obtained this way on the
basis of certain attributes are known as statistics of attributes and their classification is said to be
classification according to attributes.
Such classification can be simple classification or manifold classification. In simple classification we consider
only one attribute and divide the universe into two classes—one class consisting of items possessing the given
attribute and the other class consisting of items which do not possess the given attribute. But in manifold
classification we consider two or more attributes simultaneously, and divide the data into a number of classes
(the total number of classes of the final order is given by 2^n, where n = number of attributes considered). Whenever data
are classified according to attributes, the researcher must see that the attributes are defined in such a manner that
there is the least possibility of any doubt/ambiguity concerning the said attributes.
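
To see why manifold classification over n attributes yields 2^n classes of the final order, the brief sketch below enumerates them for two invented attributes (literate/illiterate and employed/unemployed), giving 2^2 = 4 classes.

    # Enumerate the 2**n classes of a manifold classification (attributes are assumptions).
    from itertools import product

    attributes = ["literate", "employed"]          # n = 2 attributes
    classes = list(product([True, False], repeat=len(attributes)))
    print(len(classes))                            # -> 4 == 2**2
    for cls in classes:
        print(dict(zip(attributes, cls)))
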
(b) Classification according to class-intervals: Unlike descriptive characteristics, the numerical characteristics
refer to quantitative phenomenon which can be measured through some statistical units. Data relating to income,
production, age, weight, etc. come under this category. Such data are known as statistics of variables and are
classified on the basis of class intervals.
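
For classification according to class intervals, a common software step is to bin a numeric variable into intervals. Below is a small sketch using pandas; the income values and interval boundaries are purely illustrative.

    # Illustrative class-interval classification of incomes (values and bins are assumptions).
    import pandas as pd

    incomes = pd.Series([250, 900, 1500, 3200, 4100])
    bins = [0, 1000, 2000, 3000, 5000]
    labels = ["0-1000", "1000-2000", "2000-3000", "3000-5000"]

    income_class = pd.cut(incomes, bins=bins, labels=labels)
    print(income_class.value_counts())
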

5.1.2 Concept Mapping


It should be clear by now that qualitative data analysts spend a lot of time committing thoughts to paper (or to computer file), but
this process is not limited to text alone. Often, we can think out relationships among concepts more clearly by putting the concepts
in a graphic format, a process called concept mapping. Some researchers put all their major concepts on a single sheet of paper,
whereas others spread their thoughts across several sheets of paper, blackboards, magnetic boards, computer pages, or other media
(Strauss & Corbin, 1998).

5.1.3 Quantitative Data Processing:

A quantitative data set is typically a data matrix consisting of rows and columns, where one row corresponds
to one unit of observation and one column corresponds to one variable. To analyze quantitative data,
software for statistical analysis is needed, as well as at least basic knowledge of statistics and quantitative
methods.

In social sciences, empirical quantitative data are usually collected by surveys, for example, by a postal,
telephone, face-to-face or Internet survey. When such collection methods are used, the unit of observation is
most often the individual, and the variables in the data matrix represent the survey responses of these
individuals. Data matrices are sometimes also called micro data (referring to individual response data) or
numerical data.
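
As a concrete (and entirely invented) illustration of such a data matrix, each row below is one respondent and each column one survey variable; pandas is assumed only as a convenient container.

    # A tiny illustrative data matrix: rows = units of observation, columns = variables.
    import pandas as pd

    data_matrix = pd.DataFrame(
        {
            "age":        [29, 41, 35],
            "sex":        ["F", "M", "F"],
            "employment": ["employed", "unemployed", "employed"],
        },
        index=["respondent_1", "respondent_2", "respondent_3"],
    )
    print(data_matrix)
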

What Does Data Entry Mean?


Data entry is the process of transcribing information into an electronic medium such as a computer or other
electronic device. It can be performed either manually or automatically by using a machine or computer.
Most data entry tasks are time-consuming in nature; however, data entry is considered a basic, necessary task
for most organizations.

Some Problems in Data Processing


There are several challenges that come up while processing data. Let me shed some light on the
key challenges that appear while processing the data.

1- Collection of data

The first challenge in data processing comes in the collection or acquisition of the correct data
for the input. We have the following data sources from which we can acquire data:

 Administrative data sources,


 Mobile and website data,
 Social media,
 Tech. support calls,
 Statistical surveys,
 Census,
 Purchasing data from third parties.
There are many more examples. Sometimes, the data collection agent walks door to door to
collect the data that we need, which is rare but still happens.

The challenge here is to collect the exact data to get the proper result. The result directly depends
on the input data. Hence, it is vital to collect the correct data to get the desired result.

Solution
Choosing the right data collection technique can help to overcome this challenge. Below are
four different data collection techniques:

 Observation: Making direct observation is a quick and effective way to collect simple
data with minimal intrusion.
 Questionnaire: Surveys can be carried out to every corner of the globe. With them, the
researcher can structure and precisely formulate the data collection plan.
 Interview: Interviewing is the most suitable technique to interpret and understand the
respondents.
 Focus group session: The presence of several relevant people simultaneously debating on
the topic gives the researcher a chance to view both sides of the coin and build a balanced
perspective.

2- Duplication of data

As the data is collected from different data sources, it often happens that there is duplication in the
data. The same entries and entities may appear a number of times during the data encoding
stage. This duplicate data is redundant and may produce an incorrect result.

Hence, we need to check the data for duplication and proactively remove the duplicate data.

Solution
Data deduplication is adopted to reduce cost and free up storage space. Deduplication
technology identifies identical data blocks and eliminates the redundant data.
This technique significantly reduces disk usage and also reduces disk I/O traffic.
Hence, it enhances processing performance and helps in achieving precise, high-accuracy
data.
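
A minimal sketch of deduplication at the record level, assuming the data already sits in a pandas DataFrame (block-level deduplication in storage systems works differently, but the idea of dropping identical copies is the same). The records themselves are made up.

    # Illustrative record-level deduplication.
    import pandas as pd

    records = pd.DataFrame({
        "customer": ["A. Ali", "A. Ali", "B. Bekele"],
        "amount":   [100, 100, 250],
    })

    deduplicated = records.drop_duplicates()
    print(len(records), "->", len(deduplicated))   # 3 -> 2 rows
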

3- Inconsistency of data

When we collect a huge amount of data, there is no guarantee that the data will be complete or that
all the fields that we need are filled correctly. Moreover, the data may be ambiguous.

As the input/raw data is heterogeneous in nature and is collected from autonomous data sources,
the data may conflict with each other at three different levels:

 Schema Level: Different data sources have different data models and different schemas
within the same data model.
 Data representation level: Data in different sources are represented in different
structures, languages, and measurements.
 Data value level: Sometimes, the same data objects have factual discrepancies among
various data sources. This occurs when we obtain two data objects from different sources
and they are identified as versions of each other. But, the value corresponding to their
attributes differ.

Solution
In this situation, we need to check for the completeness of the data. We also have to assess the
dependency and importance of the inconsistent field to the desired result. Furthermore, we
need to proactively look for bugs to ensure consistency in the database.
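
The hedged sketch below illustrates two such checks on invented data: a completeness check (missing values per column) and a value-level check that surfaces conflicting attribute values for the same entity coming from two different assumed sources.

    # Illustrative consistency checks across two assumed data sources.
    import pandas as pd

    source_a = pd.DataFrame({"id": [1, 2], "city": ["Addis Ababa", "Adama"]})
    source_b = pd.DataFrame({"id": [1, 2], "city": ["Addis Ababa", "Nazret"]})

    # Completeness: count missing values per column.
    print(source_a.isnull().sum())

    # Data value level: same object, conflicting attribute values.
    merged = source_a.merge(source_b, on="id", suffixes=("_a", "_b"))
    conflicts = merged[merged["city_a"] != merged["city_b"]]
    print(conflicts)
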

4- Variety of data

The input data, as it is collected from different sources, can come in many different forms. The data
is not limited to the rows and columns of a relational database; it varies from application to
application and source to source. Much of this data is unstructured and cannot fit into a
spreadsheet or a relational database.

It may be that the collected data is in text or tabular format. On the other hand, it may be a
collection of photographs and videos, or sometimes just audio.

Sometimes to get the desired result, there is a need to process different forms of data altogether.

Solution
There are different techniques for resolving and managing data variety, some of them are as
follows:

 Indexing: Different and incompatible data types can be related together with the indexing
technique.
 Data profiling: This technique helps in identifying the abnormalities and interrelationship
between the different data sources.
 Metadata: Meta description of data and its management helps in achieving contextual
consistency in the data.
 Universal format conversion: In this technique, we can convert the collected data into a
universally accepted format such as Extensible Markup Language (XML), as sketched below.
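
As a sketch of universal format conversion, the snippet below turns one invented record into XML with Python's standard library; real pipelines would add schemas and validation on top.

    # Illustrative conversion of a single record to XML (field names are assumptions).
    import xml.etree.ElementTree as ET

    record = {"id": "42", "name": "Example respondent", "age": "35"}

    root = ET.Element("record")
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = value

    print(ET.tostring(root, encoding="unicode"))
    # -> <record><id>42</id><name>Example respondent</name><age>35</age></record>
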
5- Data Integration

Data integration means to combine the data from various sources and present it in a unified view.

With the increased variety of data and different formats of data, the challenge to integrate the
data becomes bigger.

Data integration involves various challenges, which are as follows:

 Isolation: The majority of applications are developed and deployed in isolation, which
makes it difficult to integrate the data across various applications.
 Technological Advancements: With the advancement of technology, the ways to store
and retrieve data change. The problem here occurs in the integration of newer data with
legacy data.
 Data Problems: The challenge in data integration arises when the data is incorrect,
incomplete, or in the wrong format.
We then have to figure out the right approach to integrate the data so that it remains
consistent.

Solution
There are mainly three techniques for integrating data:

 Consolidation: It captures data from multiple sources and integrates it into a single
persistent data store (a minimal consolidation sketch follows this list).
 Federation: It gives a single virtual view of multiple data sources. When a query is fired, it
returns data from the most appropriate data source.
 Propagation: Data propagation applications copy data from one source to another.
Furthermore, it guarantees a two-way data exchange regardless of the type of data
synchronization.
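
Here is a hedged sketch of the consolidation approach, assuming two source extracts already loaded as pandas DataFrames; the column names are invented and schema reconciliation is kept to a single rename.

    # Illustrative consolidation of two assumed sources into one persistent store.
    import pandas as pd

    crm_extract = pd.DataFrame({"customer_id": [1, 2], "revenue": [100, 250]})
    erp_extract = pd.DataFrame({"cust_id": [3], "revenue": [175]})

    # Reconcile schemas, then append into a single consolidated table.
    erp_extract = erp_extract.rename(columns={"cust_id": "customer_id"})
    consolidated = pd.concat([crm_extract, erp_extract], ignore_index=True)

    consolidated.to_csv("consolidated_customers.csv", index=False)
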
6- Volume and Storage of data

When processing big data, the volume of the data is considerably large. Big data consists of both
structured and unstructured data. This includes the data available on social networking sites,
records of companies, data from surveillance sources, research and development data, and much
more. Here comes the challenge of storing and managing this sheer volume of data. Another
challenge is deciding how much data to load into RAM so that processing is faster and
resource utilization is smart.

Also, we need to back up the data to ensure it is protected from any sort of loss. The data loss
could occur due to software or hardware issues, natural disasters, or human error.

Now, the data itself is huge in volume and we need to take a copy or backup of the data for
safety. This increases the amount of stored data by up to 150% or even more.

Solution
Below are the possible approaches that we may use to store a large amount of data:

 Object storage: With this approach, it is easier to store very large sets of data. It is a
replacement for the traditional, tree-like file system.
 Scale-out NAS: This is capable of scaling the capacity of the storage. It usually has its own
distributed or clustered file system.
 Distributed nodes: Most often implemented on low-cost commodity hardware that attaches
directly to the computer server or even to server memory.
7- Poor Description and Meta Data

One of the major sources of the input data is the data that is stored over time in a relational
database. But this data is not properly formatted and there is no meta description of the storage,
structure, and relation of the data entities with each other.

The scenario becomes even worse when the amount of data is large and the database itself links
to other databases. Without proper documentation of the database, it is quite difficult to extract
the correct input data from the databases.

Solution
 De-normalize the database for querying purposes.
 Use stored procedures to allow complex data management tasks.
 Use a NoSQL database for storing data.

8- Modification of Network Data

The data is distributed and simultaneously related to each other in a complex structure. The
challenge here is to modify the structure of the data or add some data to it.

The internet is a network that contains a great variety of data; many applications and websites
generate data that are all of different forms and characteristics, and a schema interconnects all of
them.

A schema is the definition of the indexes, packages, tables/rows, and metadata of a database.

It is difficult to transport data if a database doesn’t handle a schema.

Solution
SQL Server Data Tools (SSDT) includes a schema compare utility that we can use to compare two
database definitions. SSDT can compare any combination of source and target databases.
Moreover, it also reports any discrepancies between schemas and detects mismatched data types
and column defaults.

9- Security

Security plays the most important role in the data field. Hacking might result in a data
leak and could therefore be very costly to the data processing firm. The hacker might even change or
delete the data that we have acquired and processed after a lot of effort.

Security breaches in a database are mainly due to these reasons:

 Most of the data processing systems have a single level of protection.


 No encryption of either the raw data or the result/output data.
 Access to the data by unethical IT professionals, which presents a risk of data loss.
Solution
To ensure the security of the data we should follow the below-mentioned practices:

 Do not connect to public networks;
 Keep personal information safe and secure with a strong password;
 Limit human access to the data;
 Encrypt and back up the data (a minimal encryption sketch follows this list).
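
As a hedged illustration of the last point, the sketch below encrypts a backup before storage using the third-party cryptography package (an assumption: any vetted encryption tool would do); key management is deliberately left out.

    # Illustrative encryption of a backup (requires the 'cryptography' package).
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()    # in practice, store the key securely, never next to the data
    cipher = Fernet(key)

    backup = b"respondent_id,age\n1,34\n2,41\n"
    encrypted = cipher.encrypt(backup)

    with open("backup.enc", "wb") as f:
        f.write(encrypted)

    # Later: cipher.decrypt(encrypted) recovers the original bytes.
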
10- Cost

Cost is a matter of consideration. When the amount of data increases, the cost at each
stage of data processing increases accordingly.

The cost of data processing depends on the following factors:

 The type of processed data;


 Turnaround time to complete the processing of data and get the required result;
 The accuracy of the data;
 Workforce working on data processing.
Solution
The stakeholders or the management looking into data processing must consider the budget and
the expenses. Compressing the data reduces its size, so it occupies less disk space (a small
compression sketch follows). With proper planning of costs and expenses, the firm can earn well
with its data processing service.
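
A tiny sketch of the compression point, using Python's standard gzip module on an invented CSV export; the file name and content are assumptions.

    # Illustrative compression of an export file to save disk space.
    import gzip

    export = b"region,sales\nNorth,120\nSouth,95\n" * 1000

    with gzip.open("export.csv.gz", "wb") as f:
        f.write(export)

    print("raw bytes:", len(export), "compressed file written to export.csv.gz")
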

5.2 Nature of Data Analysis: [Non-Statistical Analysis and Statistical Analysis]


5.2.1 Qualitative Data Analysis
Qualitative data analysis is the classification and interpretation of linguistic (or visual) material to make statements about implicit
and explicit dimensions and structures of meaning-making in the material and what is represented in it. Meaning-making can refer
to subjective or social meanings. Qualitative data analysis could also be applied to discover and describe issues in the field or
structures and processes in routines and practices. Often, qualitative data analysis combines approaches of a rough analysis of the
material (overviews, condensation, summaries) with approaches of a detailed analysis (elaboration of categories, hermeneutic
interpretations or identified structures). The final aim is often to arrive at generalizable statements by comparing various materials
or various texts or several cases (Bryman, 2001).
Aims of Qualitative Data Analysis
According to Uwe Flick (2009), the analysis of qualitative data can have several aims. The first aim may be to describe a
phenomenon in some or greater detail. The phenomenon can be the subjective lived experiences of a specific individual or group
(e.g. the way people continue to live after being displaced from their homeland). This could focus on the case (individual or group) and its
special features and the links between them. The analysis can also focus on looking at the interplay of several cases (individuals or
groups) and on what they have in common or on the differences between them. The second aim is to explore the conditions on which
the existing differences are based. This means to look for explanations of the observed differences (e.g. circumstances which make
coping with a specific social disarticulation more successful in some cases than in others). The third aim may be to develop a theory
of the phenomenon under study from the analysis of empirical material (e.g. a theory of displacement).

Basic Principles of Qualitative Data Analysis


According to Bryman and Burgess (1994) there are basic principles of qualitative data analysis. These are mentioned below:
_ People differ in their experience and understanding of reality (Constructivist-many meanings).
_ A social phenomenon can’t be understood outside its own context (Context-bound).
_ Qualitative research can be used to describe phenomenon or generate theory grounded on data.
_ Understanding human behavior emerges slowly and non-linearly.
_ Exceptional cases may yield insights into a problem or new idea for further inquiry.
2.4. Basic Features of Qualitative Data Analysis
Bryman and Burgess (1994) further explained the basic features of qualitative data analysis. These are mentioned below:
_ Analysis is circular and non-linear.
_ Iterative and progressive.
_ Close interaction with the data.
_ Data collection and analysis is simultaneous.
_ Level of analysis varies.
_ Uses inflection.
_ Can be sorted in many ways.
_ Qualitative data by itself has meaning.

2.5. Stages to Analyze Qualitative Data

There are no universally agreed stages in the process of analyzing qualitative data. Different authors mention different stages for
analyzing qualitative data. For instance, Scott & Usher (2004) conceived a typical qualitative analytical
approach to consist of five stages. However, Bryman & Burgess (1994) argue that analysis of qualitative data has to follow six stages.
Moreover, Creswell (2009), contrary to the view of Bryman & Burgess (1994), believes that the process of qualitative data analysis
and interpretation can best be represented by a spiral image, a data analysis spiral, in which the researcher moves in analytic circles
rather than using a fixed linear approach.
Therefore, there are always variations in the number and description of the steps for doing qualitative data analysis given by different
authors. To this effect, I prefer to review different sources describing steps to analyze
stages (steps), namely: Familiarization, Data Reduction, Data Display, and Report Writing. The details of these analytical stages
are described and illustrated as follows.
2.5.1. Familiarization
Before beginning the process of filtering and sorting data, the researcher must become familiar with the variety and diversity of the
material gathered. Even if the researcher does not collect the data himself or herself, it is a must to form a feeling for key issues and emergent
themes in the data by considering the context.
Essentially, familiarization involves immersion in the data: listening to tapes, reading transcripts, studying observational notes
and so on. According to Bryman and Burgess (1994), in some cases it is possible to review all the material at the familiarization
stage, for example where only a few interviews have been carried out, or where there is a generous timetable for the research. They
further outline a number of features of the data collection process on which the selection of the material to be
reviewed depends, such as:
_ The range of methods used
_ The number of researchers involved
_ The diversity of people and circumstances studied
_ The time period over which the material was collected
_ The extent to which the research agenda evolved or was modified during that time
2.5.1.1. Identifying a Thematic Framework
In the familiarization stage, the researcher is not only gaining an overview of the richness, depth, and diversity of the collected
data, but also he/she starts the process of abstraction and conceptualization. While reviewing the material, the researcher is
expected to make notes, record the range of responses to questions posed by the researchers themselves, jot down frequent themes
and issues which emerge as important to the study participants themselves. As Bryman and Burgess (1994) mentioned, once the
selected material has been reviewed, the researcher returns to these research notes, and attempts to identify key issues, concepts
and themes according to which the data can be examined and referenced. That is, she or he sets up a thematic framework within
which the material can be filtered and sorted. When identifying and constructing this framework or index, the researcher will be
drawing upon a priori issues such as:
_ Issues informed by the original research aims and introduced into the interviews using the topic guide
_ Emergent issues raised by the respondents themselves
_ Analytical themes arising from the recurrence or patterning of particular views or experiences
2.5.2. Data Reduction
It is very likely that a qualitative research project is going to generate more data than can be used in its final write-up. However,
engaging in the data reduction process is very helpful in order to edit the data, summarize it, and make it presentable. Therefore, we
have to reduce our data to make things more manageable and evident. According to Huberman and Miles (1994), with data
reduction the potential universe of data is reduced in an anticipatory way as the researcher chooses a conceptual framework,
research questions, cases, and instruments. Once actual field notes, interviews, tapes, or other data are available, data summaries,
coding, finding themes, clustering, and writing stories are all instances of further data selection and condensation. Of the many
possible ways to reduce and organize data in a qualitative study, this section looks into coding of qualitative data, writing
memos, and mapping concepts graphically. These ideas give a useful starting point for finding order in qualitative data.
2.5.2.1. Coding
Saldana (2013) has argued that coding does not constitute the totality of data analysis, but it is a method to organize the data so that
underlying messages portrayed by the data may become clear to the researcher. Charmaz (2006) describes coding as the pivotal
link between data collection and explaining the meaning of the data. A code is a descriptive construct designed by the researcher to
capture the primary content or essence of the data. Coding is an interpretive activity and therefore it is possible that two researchers
will attribute two different codes to the same data. The context in which the research is done, the nature of the research and interest
of the researcher will influence which codes the researcher attributes to the data (Saldana, 2013). During the coding process, some
codes may appear repeatedly and that may be an indication of emerging patterns. These emerging patterns or similarity
among the codes may give rise to categories. Coding is not only labeling, but also linking, that is, linking data to an idea. It is a
cyclic process. By incorporating more cycles into the coding process, richer meanings, categories, themes and concepts can be
generated from the data (Saldana, 2013).
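
Although qualitative coding is an interpretive activity done by the researcher, a small hedged sketch can show the mechanical side of linking data segments to codes and spotting repetition. The interview fragments and code labels below are invented.

    # Illustrative tally of researcher-assigned codes to spot emerging patterns.
    from collections import Counter

    coded_segments = [
        ("We had to leave everything behind", "loss"),
        ("The neighbours helped us settle", "social support"),
        ("I still cannot find steady work", "livelihood insecurity"),
        ("My cousin lent us money", "social support"),
    ]

    code_frequencies = Counter(code for _segment, code in coded_segments)
    print(code_frequencies.most_common())
    # Repeated codes (here 'social support') hint at a candidate category.
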
2.5.2.1.1. Practical Aspects of Coding
Saldana (2013) gives a practical guide for the coding process. He mentioned that it is helpful to type the data on the left two-thirds
of a page and to leave the right margin open for notes. Whenever the topic of the data seems to change, the researchers can start a
new paragraph. In writing down the data, researchers need to decide whether they want to give a verbatim transcription of the
interviews for their specific study. Saldana (2013) further explains the importance of reading the data and doing some ‘pre-coding’ by
circling, highlighting or underlining significant words or sentences. However, different authors urge researchers to start the coding
process whilst they are collecting the data, keeping in mind that the codes may change during later cycles. Saldana (2009) suggests
that researchers should keep their research questions and aims of their studies in mind. The following questions may assist them in
their coding decisions:
_ What are people doing? What are they trying to accomplish?
_ Exactly how are they doing it? What strategies are they using?
_ How do they talk about, characterize and understand what is going on?
_ What assumptions are they making?
_ What is going on here? What do I, as researcher, learn from these notes? What strikes me?
These questions correspond with aspects that might be coded, namely activities or behavior, events, strategies or tactics, present
situations, meanings, participation, relationships or interactions, conditions or constraints, consequences, settings and the
researcher’s own reflections. The number of codes, Saldana (2013) states, depends on the context, the nature of the data and to what
degree of fineness the researcher wants to examine the detail. Data can be ‘lumped’ together with a single code or can be ‘split’
into many smaller parts, each bearing its own code. Both methods have advantages and disadvantages. Even though splitting is
time-consuming, it may produce a more nuanced analysis, whereas lumping gets to the essence of categorizing, although it may produce a superficial
analysis. The number of codes may also change during a second cycle of coding. Saldana (2013) advises that the initial coding should be done
on hard copies, although electronic resources are available, as hard copies tend to give a better perspective.
2.5.2.1.2. Writing Analytic Memos: Concurrently with coding
According to Saldana (2013) the analytic memos document how the coding process is developing and codes may trigger deeper
reflection on the side of the researcher on the meaning of the data. It is important that researchers write down their insights.
Analytical memos give researchers the opportunity to reflect and comment on the following:
_ How they personally relate to the participants and the phenomenon;
_ Their research questions;
_ The code choices;
_ Emergent patterns and categories;
_ Problems and ethical dilemmas in the study; and
_ The future direction for the study.
Writing analytical memos can be seen as the transitional phase from coding to the more formal writing of the report on the study.
The analytical memos can also be coded and categorized and may even lead to better codes or categories for the data (Saldana,
2013).
2.5.2.1.3. Useful Coding Methods for Qualitative Data
Saldana (2013) mentions that grounded theory, one of the approaches in qualitative research, has six coding techniques in its
coding catalog. Researchers normally use these coding methods during two coding cycles. During the first cycle, the data is split
into segments; then in vivo coding, process coding and initial coding may be used. During the second cycle, researchers compare
codes, note emerging patterns and reorganize the data into categories by using the focused, axial and theoretical coding techniques.

Despite the fact that these techniques are mentioned for grounded theory, it is important to note that researchers can also use these
coding methods in non-grounded-theory studies (Saldana, 2013).
A. In vivo Coding
This method of coding is useful for beginner qualitative researchers, as the exact word or phrase of the participant serves as a code.
In order to distinguish in vivo codes, the researchers put them between inverted commas. The researchers look for words or phrases
that seem to stand out, for example nouns with impact, action-orientated verbs, evocative word choices, clever phrases or
metaphors. In vivo coding can be the only coding method used during the first cycle of data analysis, but it may be limiting
(Saldana, 2013).
B. Process (Action) Coding
A process code is a word or a phrase that captures action. It is done by using gerunds (‘-ing’ words) as part of the code. Process
coding is useful to identify an on-going action as a response to situations, or an action to handle a problem, or to reach a goal. As a
process code usually conveys movement and shows how things have changed over time, it helps the researchers to give a dynamic
account of events. It conveys a path of the participant’s process (Saldana, 2013).
C. Initial (Open) Coding
Initial coding refers to the process of breaking the qualitative data down into distinct parts and coding these by using in vivo
coding, process coding, and other coding methods. The researchers then examine these parts closely and compare them for
similarities and differences. During this process, the researchers may already become aware of emerging categories and code them.
It is important to remember that initial codes and categories are tentative and may change as the analysis process progresses. After
initial coding, the researchers need time for reflection by
means of the writing of analytical memos (Saldana, 2013).
D. Focused Coding
Saldana (2013) explains that after initial coding, the researchers move on to focused coding by identifying the most frequent or
significant codes in order to develop the prominent categories (it is linked to axial coding). He warns that the researchers should be
aware that these categories do not always have well-defined boundaries and that the codes in a specific category may have
different degrees of belonging. Rubin and Rubin (quoted by Saldana, 2013) recommend that the researchers organize the categories
hierarchically in main categories and subcategories in order to understand the relationship between them.
E. Axial Coding
The goal of axial coding is the strategic reassembling of data that have been split during initial coding. In the process of crossing
out synonyms and redundant codes, the dominant codes will become apparent. The axis of the axial coding is a category. During
axial coding, categories are related to subcategories and the properties and dimensions of a category are specified (Saldana, 2013).
Central categories describe the key properties of the phenomenon, causal categories capture the circumstances that form the
structure of the studied phenomenon, strategies describe the actions or interactions of people in response to the phenomenon, and
consequential categories represent the outcomes of the actions or interactions.
F. Theoretical (Selective) Coding
It is the process to select the theoretical code or core category that functions like an umbrella that covers all codes and categories. It
relates to all categories and subcategories. It addresses the how and why questions to explain the phenomena. However, this is not
necessary for every qualitative study (Saldana, 2013).
2.5.2.2. Memoing
In qualitative data analysis approaches, particularly in grounded theory, the coding process involves more than simply categorizing
large pieces of text. As you code data, you should also be using the technique of memoing, writing memos or notes to yourself and
others involved in the study. Some of what you write during analysis may end up in your final report; much of it will at least excite
what you write. In many qualitative data analysis approaches, these memos have a special significance. Strauss and Corbin (1998)
distinguish three kinds of memos: code notes, theoretical notes, and operational notes.
A. Code Notes
It is useful to identify the code labels and their meanings. This is particularly important because, as in all qualitative studies, most
of the terms we use with technical meanings also have meanings in everyday language. Hence, it’s essential to write down a clear
account of what you mean by the codes used in your analysis (Strauss & Corbin, 1998).
B. Theoretical Notes

It tries to cover a variety of topics such as reflections on the dimensions and deeper meanings of concepts, relationships among
concepts, theoretical propositions, and so on. All of us have reflected over the nature of something, trying to think it out, to make
sense out of it. In qualitative data analysis, it’s vital to write down these thoughts, even those you’ll later discard as useless. They
will vary greatly in length, though you should limit them to a single main thought so that you can sort and organize them later
(Strauss & Corbin, 1998).
C. Operational Notes
It deals primarily with methodological issues. Some will draw attention to data collection circumstances that may be relevant to
understand the data later on. Others will consist of notes directing future data collection (Strauss & Corbin, 1998).
Generally speaking, writing these memos occurs throughout the data collection and analysis process.
Thoughts demanding memos will come to you as you reread notes or transcripts, code mass of text, or discuss the project with
others. It’s a good idea to get in the habit of writing out your memos as soon as possible after the thoughts come to you.
Notice that whereas we often think of writing as a linear process, starting at the beginning and moving through to the conclusion,
memoing is very different. It might be characterized as a process of creating chaos and then finding order within it.
3. Displaying Data in Qualitative Studies
Huberman and Miles’s (1994) notion of data display roughly involves using textual representations of your data for the purpose of
selecting segments that best illustrate your concepts of interest. Typically, this includes the following:
_ Carefully reading and rereading data transcriptions
_ Making notes in the margins (sometimes referred to as ‘research memos’)
_ Highlighting important passages or themes as representations of particular concepts.
The objective is to gradually transform seemingly disorganized raw data into a recognizable conceptual
scheme. For most sociologists, the medium of choice for display and selection purposes is paper; however, some might be more
comfortable viewing their data on a computer screen. In fact, there are some computer software programs, such as NUD*IST or
NVivo, that allow you to draw diagrams and write research memos on the margins of your computer screen.

5.2.6 Grounded Theory Methods
Grounded theory is a general research methodology, a way of thinking about and conceptualizing data.
Grounded theory is a systematic methodology that has been largely applied to qualitative research conducted
by social scientists. The methodology involves the construction of hypotheses and theories through the collection
and analysis of data. Grounded theory involves the application of inductive reasoning. The methodology
contrasts with the hypothetico-deductive model used in traditional scientific research.
A study based on grounded theory is likely to begin with a question, or even just with the collection of qualitative
data. As researchers review the data collected, ideas or concepts become apparent to the researchers. These
ideas/concepts are said to "emerge" from the data. The researchers tag those ideas/concepts with codes that
succinctly summarize the ideas/concepts. As more data are collected and re-reviewed, codes can be grouped
into higher-level concepts and then into categories. These categories become the basis of a hypothesis or a new
theory. Thus, grounded theory is quite different from the traditional scientific model of research, where the
researcher chooses an existing theoretical framework, develops one or more hypotheses derived from that
framework, and only then collects data for the purpose of assessing the validity of the hypotheses.
5.2.7 Semiotics
What Is Semiotics?
Semiotics is the study of how words and other symbolic systems of communication make meaning. The term
originates from the Greek word for sign, semeion, which means anything that is used to represent or stand in for
something. For example, the word "chair" is the sign that English speakers use to describe the thing with four
legs that people sit on.

What are the three areas in semiotics?
Cognitive Semiotics studies how individuals conceptualize meaning by using sign systems. Social and Cultural
Semiotics studies how sign systems develop and are used in specific cultures. Visual Semiotics focuses on non-
linguistic visual signs in art and design.
5.2.8 Conversation Analysis
Conversation analysis is less interested in interpreting the content of texts that have been produced for research
purposes, for instance interview responses. Rather it is interested in the formal analysis of everyday situations. The
theoretical background of conversation analysis is ethnomethodology. This approach mainly suits studies
aimed at exploring members' formal procedures for constructing social reality. To do this, data collection
involves recording everyday interaction processes as precisely as possible.
Bergmann (2004a) outlines this approach, as follows:
Conversation Analysis refers to a research approach dedicated to the investigation of social interaction as a
continuing process of producing and securing meaningful social order. Conversation analysis proceeds on the basis
that in all forms of linguistic and non-linguistic, direct and indirect communication, actors are occupied with the
business of analyzing the situation and the context of their actions, interpreting the expression of their own action,
producing situational appropriateness, intelligibility and effectiveness in their own expression. The goal of this
approach is to determine the constitutive principles and mechanisms by means of which actors, in the situational
completion of their actions and in reciprocal reaction to their action, create the meaningful structures and order of
a sequence of events and of the activities that constitute these events. In terms of method Conversation analysis
begins with the richest possible documentation, with audio-visual recording and subsequent transcription of real
social events, and breaks these down into individual structural principles of social interaction as well as the
practices used to manage them by participants in an interaction.

Conversation analysis (CA) is an approach to the study of social interaction that empirically
investigates the mechanisms by which humans achieve mutual understanding. It focuses on both
verbal and non-verbal conduct, especially in situations of everyday life. CA originated as
a sociological method, but has since spread to other fields. CA began with a focus on
casual conversation, but its methods were subsequently adapted to embrace more task- and
institution-centered interactions, such as those occurring in doctors' offices, courts, law
enforcement, helplines, educational settings, and the mass media, and focus on multimodal and
nonverbal activity in interaction, including gaze, body movement and gesture. As a consequence,
the term conversation analysis has become something of a misnomer, but it has continued as a
term for a distinctive and successful approach to the analysis of interactions. CA
and ethnomethodology are sometimes considered one field and referred to as EMCA.

5.3 Statistical Analysis: Statistics in Research


Measures of central tendency allow us to better understand the average of a dataset. Three commonly used
measures are described below: the mean, the median (and quartiles), and the mode.

Mean
One of the most common summary statistics is the mean, which can be calculated by summing all observations and
dividing this sum by the total number of observations. Note that the mean is quite sensitive to outliers in your
dataset. If some of your observations are extremely high or low relative to most of the data, then the mean of
all of these observations may be misleading in that it will be biased in the direction of these outliers.

What you need to know: The mean is a very “simple” statistical parameter. It is easy to calculate, and has
been used widely. The drawback of this is that:

1. Given its sensitivity to outliers, it can quickly become meaningless if you’re facing a complex
context, or if your data collection quality is lacking.
2. If the distribution of the data is not symmetrical, you should not use it, or should use it with caution, as
it won’t be representative/statistically robust.

For example, consider two series of data from KAP surveys looking at the average
quantity of water delivered (litres/person/day) in two different locations that have the same mean,
equal to 21.2 l/p/d.

But these two distributions are totally different: the first is centred around the mean, with most of the
values close to it and very little spread, so the information the mean conveys is quite robust: most of
the people have access to around 21 l/p/d.

But the second one shows a very different reality, as there are a lot of people who have access to a
very high quantity of water and a lot of people with very little water, which means that despite a
correct mean, the program still has a lot of work to do to cover the majority of the population (and we would
be wrong to assume that we reached a standard).

One of the elements that will allow you to highlight this difference, and not be fooled by similar means, is
the standard deviation.
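
To make the point concrete, the sketch below uses two invented series with the same mean (21.2, echoing the example above) but very different spread; the numbers are illustrative, not the actual survey data.

    # Two illustrative series of water delivered (l/p/d) with the same mean but different spread.
    from statistics import mean, stdev

    location_a = [20, 21, 21, 22, 22, 21, 21, 22, 21, 21]
    location_b = [5, 38, 40, 6, 35, 7, 36, 5, 37, 3]

    print(mean(location_a), stdev(location_a))   # 21.2 with a small standard deviation
    print(mean(location_b), stdev(location_b))   # 21.2 with a much larger standard deviation
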

In our case study, we have calculated the mean of the Food Consumption Scores (FCS), which can provide us with an
indication of the food insecurity among the sample population.

As seen in the table below, the mean value of the FCS is 42.7, which indicates an ‘Acceptable’ FCS score
according to the standardized indicator thresholds.

Threshold      Score
Poor           0 to 21
Borderline     21.5 to 35
Acceptable     35.5 +

However, as indicated below, this measure of central tendency alone may be misleading; if we stopped
there, we might think the sample population does not face food insecurity. The mean must be considered in
conjunction with further analysis!

Median (and quartiles)


We can also calculate the median value of a variable by ordering all observations from smallest to largest
and selecting the observation in the middle.

The median always corresponds to the second quartile and can be applied to most of the situations you will
encounter in the field.

What you need to know: You should always use the median, as it is not sensitive to outliers. After
calculating it, compare it to your mean, as this will give you a first idea of the spread of your data.

In our case study, we have calculated the median of the Food Consumption Scores. The median FCS
value is 40.8.

The median value also indicates an ‘Acceptable’ FCS score according to the standardized indicator
thresholds. However, given that the median is lower than the mean, we know that there is a higher
concentration of datapoints in the lower range of FCS scores among the sample population. Therefore, our
data is more concentrated below the mean than above the mean, and the mean may be influenced by higher, ‘outlier’
values.
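
A short sketch of computing the median and quartiles with Python's standard library (statistics.quantiles needs Python 3.8 or later), reusing the invented series from the mean example.

    # Median and quartiles with the standard library (data are the illustrative series above).
    from statistics import median, quantiles

    location_b = [5, 38, 40, 6, 35, 7, 36, 5, 37, 3]

    print(median(location_b))                     # the middle value (second quartile)
    q1, q2, q3 = quantiles(location_b, n=4)       # quartile cut points
    print(q1, q2, q3)
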

Mode
The mode is the value that appears most frequently in our observations. The mode is used most often for
qualitative data, where the mean and median aren’t appropriate to calculate.

In our case study, respondents were asked a number of questions that provided qualitative data. One
example is the question: “How was the food acquired?”, which included a list of potential answer options.
The mode indicates the most common response, which was ‘Household own production’. In conjunction
with the food security indicators, we could hypothesize that the target population continues to have high
levels of agricultural production that has not been fully disrupted by the conflict.
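
A minimal sketch of finding the mode of a categorical variable; the response list below is invented to mirror the case study question.

    # Illustrative mode of a categorical survey question.
    from collections import Counter

    responses = [
        "Household own production",
        "Purchased from market",
        "Household own production",
        "Food assistance",
        "Household own production",
    ]

    mode_value, count = Counter(responses).most_common(1)[0]
    print(mode_value, count)   # -> Household own production 3
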
