Assignment3
March 6, 2025
1 Scenario 1
1.1 Answer the following questions related to the design
of this experiment
• 1. What are the objects, subjects, treatments, and factors used in the four
designs listed above for this experiment?
Objects: The programming tasks
Subjects: 12 students
Treatments: Two different IDEs
Factors: the IDE (IDE-A or IDE-B) and the program (Prog-1 or Prog-2)
• 2. How would you describe Design-A and Design-D in terms of a standard
design type, e.g., one factor or two treatments?
Design-A is one factor with two treatments, while Design-D is a paired
comparison design.
This design also has limitations: there is a threat of maturation, because
the second time a student implements Program-1, they might be faster not
because of the IDE, but because it is the second time they are implementing
that specific program, and experience might have been gained from the
first implementation, even if it was done with a different IDE.
Null Hypothesis H0: There is no difference in the average implementation
time between IDE-A and IDE-B.
Alternative Hypothesis HA: There is a difference in the average
implementation time between IDE-A and IDE-B.
• 2. Use descriptive statistics and visualize the data in Table 5 using, e.g., box
plots, histograms, and scatter plots. Which visualization tool helped you
develop some insights into the data? What were the insights, e.g. any
interesting patterns or trends in the data, a clear difference in efficiency
between two IDEs, or outliers?
We used RStudio to visualize the data and run the data analysis.
We can see from the box plot that IDE-A has a lower median than IDE-B.
Still, the box plot alone does not suggest a clear difference between the
two. From this box plot, we can also see that the data for IDE-A are more
spread out (higher variance) than those for IDE-B. Finally, the plot shows
no outliers in the given data.
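The observations above (median, spread, and outliers) can be checked numerically as well as visually. The following is a minimal Python sketch, not the R analysis we ran: the two lists are hypothetical implementation times, not the values from Table 5, and the `describe` helper is introduced only for illustration. Outliers are flagged with the usual 1.5 × IQR box-plot rule; note that quartile conventions differ slightly between tools, so the fences may not match R's box plot exactly.

```python
import statistics

# Hypothetical implementation times in minutes; NOT the values from Table 5.
ide_a = [35, 42, 50, 58, 66, 75]
ide_b = [48, 52, 55, 57, 60, 63]


def describe(times):
    """Return median, sample variance, and IQR-based outliers for one group."""
    q1, _, q3 = statistics.quantiles(times, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # box-plot fences
    return {
        "median": statistics.median(times),
        "variance": statistics.variance(times),  # sample variance (n - 1)
        "outliers": [t for t in times if t < low or t > high],
    }


summary_a = describe(ide_a)
summary_b = describe(ide_b)
print("IDE-A:", summary_a)
print("IDE-B:", summary_b)
```

With these made-up numbers the sketch reproduces the same qualitative pattern described above: a lower median and a larger variance for IDE-A, and no points outside the fences.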
• 3. Choose and justify your choice of a parametric/nonparametric test
for analyzing the given data (document the steps you undertook and the
results).
To analyze the data, we chose a parametric test. Specifically, we ran the
Student’s t-test for differences between population means, as described in
section 6.6.2.2 of Fenton, N., & Bieman, J. (2014). Software metrics: a
rigorous and practical approach.
We ran this test because it is used to compare two independent groups.
Before running it, we needed to check whether the given data satisfy the
assumptions of the test, as described in section 6.6.2.2 of “Fenton, N.,
& Bieman, J. (2014). Software metrics: a rigorous and practical
approach.” Specifically, we checked whether the data in both groups are
normally distributed, since each group has fewer than 30 subjects. To do
so, we produced a Q-Q plot and ran the Shapiro-Wilk normality test.
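To make the chosen test concrete, here is a minimal Python sketch of the pooled-variance (Student's) two-sample t statistic. It is an illustration under stated assumptions, not our R analysis: the data are hypothetical (not Table 5), the function name `pooled_t` is invented for this example, and the normality checks (Q-Q plot, Shapiro-Wilk) are not reproduced here. The critical value 2.228 is the two-sided 5% point of the t distribution with 10 degrees of freedom, which applies only to this example's group sizes.

```python
import math
import statistics

# Hypothetical implementation times in minutes; NOT the values from Table 5.
ide_a = [35, 42, 50, 58, 66, 75]
ide_b = [48, 52, 55, 57, 60, 63]


def pooled_t(sample_x, sample_y):
    """Student's t statistic for two independent groups, assuming equal variances."""
    n_x, n_y = len(sample_x), len(sample_y)
    df = n_x + n_y - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_x - 1) * statistics.variance(sample_x)
        + (n_y - 1) * statistics.variance(sample_y)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    t = (statistics.mean(sample_x) - statistics.mean(sample_y)) / std_err
    return t, df


T_CRIT_DF10 = 2.228  # two-sided critical value, alpha = 0.05, df = 10

t_stat, df = pooled_t(ide_a, ide_b)
reject_h0 = abs(t_stat) > T_CRIT_DF10
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

For these made-up samples the statistic is small in absolute value, so the null hypothesis would not be rejected. When the group variances differ noticeably, Welch's variant of the t-test is a common alternative to the pooled version shown here.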
• 5. Based on the results would you be confident to recommend an IDE
either IDE-A or IDE-B for use in your company? Why or why not?
We would not be confident about recommending either IDE-A or IDE-B,
since, based on the current data, we cannot state whether one of the two is
better than the other in terms of efficiency. This is because, as discussed
in question 4, the null hypothesis cannot be rejected.
2 Scenario 2
• A. Describe the approach that you will follow to analyze the given data
(i.e. the three papers identified in Section 2.2). Please read Chapter 18
of C. Robson, K. McCartan, Real world research: A resource for social
scientists and practitioner-researchers. Fourth Edition. Wiley, 2016, to
make an informed decision about your approach and the steps you take.
For example, the analysis approach you will use (a. Quasi-statistical
approach, b. thematic coding approach or c. grounded theory approach).
Also describe your mechanism for coding the data. Also explain why you
chose the approach over other alternatives.
In this scenario, the approach chosen for qualitative data analysis is the
thematic coding approach. We chose this approach mainly because of its
flexibility. Indeed, as reported in Chapter 18 of C. Robson, K. McCartan,
“Real world research: A resource for social scientists and
practitioner-researchers”, thematic analysis is “very flexible, can be used
with virtually all types of qualitative data.” Moreover, it is a convenient
instrument for summarizing the key features of any volume of qualitative
data and can be used in a wide range of fields and disciplines. We will use
it to compare findings across multiple papers in a structured way and find
common key concepts. We also chose this approach because it can be used
for exploratory and descriptive studies, unlike approaches such as
grounded theory analysis, where the focus is on “generating a theory to
explain what is central in the data”. Such an approach is stricter and more
time-consuming. Our mechanism for coding the data involves carefully
reading the papers under analysis several times and assigning a code to
each identified unit. We carried out this approach manually, as we would
have done if we had worked on paper, highlighting the individual segments
and annotating the related code in the margin. Then, we grouped similar
ideas together to make the information easier to understand. This part of
the process is the one that allows us to produce themes.
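Although we coded manually, the bookkeeping behind this mechanism (segments labeled with codes, codes grouped into themes) can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the segments, the code labels, the theme names, and the `group_by_theme` helper are invented for illustration and are not taken from the three papers.

```python
from collections import defaultdict

# Hypothetical coded segments: (paper, text segment, code). Illustrative only.
coded_segments = [
    ("Paper 1", "Regular meetings kept both sides aligned.", "regular communication"),
    ("Paper 2", "Practitioners had little time for the study.", "limited practitioner time"),
    ("Paper 3", "Weekly calls helped resolve misunderstandings.", "regular communication"),
    ("Paper 3", "Managers did not prioritize the project.", "low management commitment"),
]

# Hypothetical mapping from each code to the theme (macro-topic) it belongs to.
code_to_theme = {
    "regular communication": "Strategies, Practices or Tactics",
    "limited practitioner time": "Conditions or Constraints",
    "low management commitment": "Conditions or Constraints",
}


def group_by_theme(segments, mapping):
    """Collect coded segments under their theme; unknown codes stay visible."""
    themes = defaultdict(list)
    for paper, text, code in segments:
        theme = mapping.get(code, "Unassigned")
        themes[theme].append((paper, code, text))
    return dict(themes)


themes = group_by_theme(coded_segments, code_to_theme)
for theme, items in themes.items():
    papers = sorted({paper for paper, _, _ in items})
    print(f"{theme}: {len(items)} segments from {', '.join(papers)}")
```

Keeping the paper name with every segment makes it easy to see which themes recur across papers, which is the comparison this approach is meant to support.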
• B. Please describe the coding procedure that you followed. For each step,
please provide an example of how you coded the information in the papers
Specifically, we segmented the data into smaller units (sentences) and
labeled each unit with a code. We used an iterative approach to review
and refine our coding, improving segmentation and label assignment.
To identify our codes, we focused mainly on “Consequences”, “Strategies,
Practices or Tactics”, and “Conditions or Constraints”, to try to identify
the core information about collaboration between researchers and
industry. These codes were then grouped into themes wherever we identified
similar units. To define themes, we looked for codes that discuss
the same macro-topic. This helped us discover patterns and understand
the main ideas in the papers.
Here we provide an example of our coding mechanism:
Figure 4: Example 1: coding - cont.
research, while industry can benefit from the fact that it can support
innovation in concrete ways. Finally, lack of commitment on either
side can lead to minimal or even no technology transfer, which can
mean failure for the collaboration itself.
2. What patterns have been proposed for industry-academia
collaborations?
Successful industry-academia collaborations need technology
transfer models. For example, Gorschek et al. proposed seven steps
for a successful collaboration, as we can see below.
applied. Failing in this aspect can lead to solutions that are
poorly applicable in practice. As stated by Gorschek et al., based
on their experience, “from our point of view the value of these results
were directly linked to usability in industry.”