Journal of Business Research xxx (2011) xxx–xxx
Website performance and behavioral consequences: A formative
measurement approach
Astrid Dickinger a,⁎, Brigitte Stangl b
a MODUL University Vienna, Department of Tourism and Hospitality Management, Am Kahlenberg 1, 1190 Vienna, Austria
b HTW Chur, University of Applied Sciences, Institute for Tourism and Leisure Research, ITF, Comercialstrasse 22, 7000 Chur, Switzerland
Article info
Article history:
Received 1 November 2010
Received in revised form 1 June 2011
Accepted 1 August 2011
Available online xxxx
Keywords:
Website evaluation
Formative measurement
Reflective measurement
Structural equation modeling
Abstract
Increasing internet use to access information and book trips significantly impacts the tourism industry and calls for learning more about website performance and evaluation. Specifically, how do tourists interact with websites? Which websites transform travelers into repeat visitors? Most research in this field employs multi-item measures conceptualized as reflective models. This paper suggests a theory-based alternative: a formative measurement approach for website performance. The construct comprises eight dimensions: system availability, ease of use, usefulness, navigational challenge, website design, content quality, trust, and enjoyment. Website users identify dimensional indicators using evaluation surveys. Structural equation modeling examines linkages between the website performance index and outcome measures (satisfaction, value, and loyalty). Data provided by 455 travelers show the formative index works well. A second sample of 316 respondents cross-validates the findings. The study develops a sound and parsimonious measure that allows monitoring and benchmarking of traveler perceptions over time.
© 2011 Elsevier Inc. All rights reserved.
High-quality websites are critical because today's travelers increasingly search for information online and buy tourism-related products and services via the internet (Bieger & Laesser, 2004; Crompton, 1979; Gursoy & McCleary, 2004; Marcussen, 2009; Vogt & Fesenmaier, 1998). Compared to offline shopping experiences, online consumers rely on fewer senses, making them dependent on product and service descriptions and pictures (Koufaris, 2002). Consequently, online consumers require excellent website performance.
Travelers planning a trip also expect web page content to be engaging
and well designed (Rice, 1997; Van der Heijden, 2003). Not surprisingly,
tourists demand websites to provide accurate information and easy
navigation (Chung & Law, 2003; Lu, 2004). These expectations require
effective website designs. To improve the online trip planning experience, a need exists to better conceptualize and operationalize website performance in a way that addresses travelers' needs and their website behavior (Park & Gretzel, 2007).
Most researchers use reflectively measured constructs to investigate website quality antecedents as well as outcome measures (e.g.,
satisfaction, positive word-of-mouth, or intention to use). Among
the most common antecedents are ease of use and usefulness
(Davis, 1989), enjoyment, website design, and content quality (De
Marsico & Levialdi, 2004; Parasuraman, Zeithaml, & Malhotra, 2005;
Van der Heijden, 2003; Venkatesh, Morris, Davis, & Davis, 2003).
⁎ Corresponding author. E-mail addresses: astrid.dickinger@modul.ac.at (A. Dickinger), brigitte.stangl@htwchur.ch (B. Stangl).
The literature's current focus discusses the suitability of multiple-item measurements. Rossiter (2005) summarizes this issue in his famous paper "Reminder: a horse is a horse." A second ongoing debate
questions how to appropriately conceptualize and operationalize latent constructs (Collier & Bienstock, 2009; Diamantopoulos & Siguaw,
2006; Rossiter, 2002). Some constructs might be better represented
by formative rather than by reflective indicators (see Diamantopoulos
& Winklhofer, 2001).
Although some studies on online formative measures exist (e.g.,
Collier & Bienstock, 2009; Gable, Sedera, & Chan, 2008; Mathieson,
Peacock, & Chin, 2001), a need to develop alternative measurements
for tourism still exists. The present study develops a formative measurement for website performance. Having a sound, parsimonious measure allows continual website evaluation: users need not complete extensive questionnaires in order for sites to be tailored to their behavior and needs. Thus, this study's contributions are to: 1) provide an overview of website performance and quality concepts and discuss the applied measurement approaches; 2) develop, test, and cross-validate a parsimonious formative measure for website performance; and 3) discuss theoretical as well as managerial implications regarding the formative measure.
1. Literature review
1.1. Approaches for website evaluation
Over the last decade, IS (Information Systems), Marketing, and Tourism scholars have devoted considerable attention to websites and online shopping behavior. Most studies follow or extend popular cognitive
0148-2963/$ – see front matter © 2011 Elsevier Inc. All rights reserved.
doi:10.1016/j.jbusres.2011.09.017
Please cite this article as: Dickinger A, Stangl B, Website performance and behavioral consequences: A formative measurement approach, J
Bus Res (2011), doi:10.1016/j.jbusres.2011.09.017
behavioral models such as the Theory of Reasoned Action (Fishbein &
Ajzen, 1975), Theory of Planned Behavior (Ajzen, 1991), Technology
Acceptance Model (Davis, 1989), and Unified Theory of Acceptance
and Use of Technology (Venkatesh et al., 2003). Marketing research examines the impact of antecedents on website quality and satisfaction
(Barnes & Vidgen, 2002; DeLone & McLean, 2003; Parasuraman et al.,
2005; Wolfinbarger & Gilly, 2001). More recently, meta-analyses reveal
the most examined drivers of website performance (see Table 1; also
see DeLone & McLean, 1992, 2003; Park & Gretzel, 2007).
Table 1 shows most constructs are reflective latent variables. Increasingly, scholars debate about reflectively measured constructs'
appropriateness (Diamantopoulos & Winklhofer, 2001; Edwards &
Bagozzi, 2000; MacKenzie, Podsakoff, & Jarvis, 2005). Some fields
now use alternative, formative measurement approaches in areas such as e-service quality (Collier & Bienstock, 2006; Collier & Bienstock, 2009;
Hsu, 2008), trustworthiness (Serva, Benamati, & Fuller, 2005), information system success (Gable et al., 2008), and effectiveness (Scott,
1994). Diamantopoulos and Winklhofer (2001) stress a need to critically reflect on which approach is appropriate for the specific research
question.
Thorough theoretical considerations are imperative to decide the
correct measurement perspective (Diamantopoulos & Winklhofer,
2001; Rossiter, 2002). An in-depth theoretical foundation is essential
due to fundamental differences between the two alternatives (see
Table 2).
Table 1
Website evaluation criteria.

1. Barnes and Vidgen (2002): WebQual. Qualitative and quantitative surveys, n = 46. Constructs: usability, design, information, trust, empathy, quality.
2. Davis, Bagozzi, and Warshaw (1989): TAM, TRA. Experiment, n = 107. Constructs: ease of use, usefulness, attitude, intention to use.
3. Dabholkar and Bagozzi (2002): TAM with extensions. Survey, experimental design, n = 392. Constructs: performance, fun, self-efficacy, novelty seeking, need for interaction, self-consciousness, perceived waiting time, social anxiety, attitude, ease of use, intention to use.
4. Park and Gretzel (2007): Meta-analysis of 153 academic papers. Constructs: ease of use, responsiveness, fulfillment, security/privacy, personalization, visual appearance, information quality, trust, interactivity.
5. Kim and Fesenmaier (2008): Website persuasiveness. Survey, n = 1416. Constructs: informativeness, usability, credibility, inspiration, involvement, reciprocity.
6. DeLone and McLean (1992): I/S Success. Conceptual study. Constructs: system quality, information quality, use, user satisfaction, individual impact, organizational impact.
7. Parasuraman et al. (2005): E-S-Qual. Survey, n = 549. Constructs: efficiency, system availability, fulfillment, privacy, responsiveness, compensation, contact, perceived value, loyalty intentions.
8. Van der Heijden (2003): TAM. Web-based survey, n = 828. Constructs: perceived attractiveness, perceived enjoyment, ease of use, usefulness, attitude, intention to use.
9. Venkatesh et al. (2003): UTAUT. Survey, n = 215. Constructs: effort expectancy, performance expectancy, facilitating conditions, social norms, intention to use, gender, age, experience, voluntariness of use.
10. Wolfinbarger and Gilly (2001): eTailQ. Focus groups, tasks, survey, n = 1013. Constructs: quality, fulfillment/reliability, website design, privacy/security, customer service.
11. Mathwick and Rigdon (2004): Survey, n = 110. Constructs: escapism, intrinsic enjoyment, attitude toward the brand and site, navigational challenge, internet search skill, internet usage, decisional control, product involvement.

Table 2
Reflective vs. formative constructs: comparison of characteristics.

Reflective constructs:
• Construct is reflected in the indicators.
• Account for observed variance in the outer model; error is assessed at the item level.
• Identification achieved with three effect indicators.
• Important aspects: internal consistency or reliability; positive correlation between measures; unidimensionality allows for removing indicators to improve construct validity without affecting content validity.

Formative constructs:
• Construct is a composite of indicators.
• Minimize residuals in the structural relationship; error is assessed at the construct level.
• Identification is given only if the construct is embedded in a larger model.
• Important aspects: indicators examine different dimensions; multicollinearity is a problem; removing an indicator affects content validity.

Note: Theoretical relationships between construct and indicators need to be thought through thoroughly.

Reflectively conceptualized measures show changes in the construct's indicators (Diamantopoulos & Siguaw, 2006). Reflective constructs' items are "chosen randomly from the universe of items relating to the construct of interest" (DeVellis, 1991, p. 55) and often included or deleted arbitrarily. Not all items have an equal chance of inclusion, potentially leading to inclusion or exclusion of conceptually unnecessary or necessary items (Rossiter, 2002). Consequently, measures are often idiosyncratic, limiting comparability to other studies (Gable et al., 2008). With formative measures, "indicators could be seen as causing rather than being caused by the latent variable measured by the indicators" (MacCallum & Browne, 1993, p. 533). Thus, formative constructs are a linear function of their respective measures (Jarvis, MacKenzie, & Podsakoff, 2003). Consequently, cause indicators are not interchangeable like reflective measures. Content validity is affected when indicators are removed from the formative model. Changing one indicator affects the latent variable, ceteris paribus.
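As a numerical sketch of this linear-function view (illustrative only: the γ weights below are the standardized Study 1 estimates the paper reports in Table 5, while the indicator scores are invented), shifting a single cause indicator shifts the index by exactly that indicator's weight, ceteris paribus:

```python
import numpy as np

# A formative index is a linear function of its cause indicators:
# eta = sum(gamma_i * x_i) + zeta (the disturbance zeta is omitted here).
# The gamma weights are the standardized Study 1 estimates from Table 5;
# the standardized indicator scores are invented for illustration.
gamma = np.array([.22, .19, .10, .16, .07, .34, .12, .07])
z_avg = np.zeros(8)            # an average respondent (all scores at the mean)
z_hi = z_avg.copy()
z_hi[5] = 1.0                  # content quality one SD higher, ceteris paribus
delta = gamma @ z_hi - gamma @ z_avg
print(round(float(delta), 2))  # the index shifts by the content-quality weight
```

This also illustrates why cause indicators are not interchangeable: dropping or swapping one changes the composite itself, not merely its measurement error.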
Reflective models account for observed variance in the measurement model, and the error is assessed at the item level. Formative models minimize residuals in the inner model, that is, in the structural relationships. The error is assessed at the construct level rather than at the observed variables and thus is not random measurement error. Instead, the error is a disturbance term of the latent formative construct, representing all causes not accounted for by the indicators, and it impacts the latent construct. Researchers suggest setting the error term to zero, but they stress that a theoretical foundation is needed to explain why the indicators capture the whole concept (Diamantopoulos, 2006).
These differences need to be considered when a model is conceptualized and estimated. In contrast to reflective models, formative model identification is achieved only if effects on other constructs are included. For reflective measures, reliability, positive correlations between measures, and unidimensionality are prerequisites. Formative measures' indicators must examine different dimensions and avoid multicollinearity between the indicators (Bollen, 1989; Bollen & Lennox, 1991; Edwards & Bagozzi, 2000; Fornell & Bookstein, 1982). Table 2 summarizes the differences between the two measurement perspectives.
2. Index construction
Construct development guidelines from classical test theory
do not apply to formative measures (Churchill, 1979). Following
Diamantopoulos and Winklhofer (2001), four critical issues need to
be addressed to successfully construct a formative measure. First, the
latent variable must be defined properly to specify the construct's scope.
Second, the indicators must specify the underlying construct's
total scope. Rossiter (2002) argues multiple items may increase a
homogenous or well-known construct's weight in reflective measurement, even if a single item measure serves the purpose and
potentially is more accurate. Formative constructs require at least one indicator for each dimension. However, an excessive number of indicators must be avoided. To measure the whole scope but at the same time achieve parsimony, specifying the concept's dimensions as well as identifying suitable items are crucial. This step requires guidance from a thorough literature review. Jarvis et al. (2003) reveal that mis-specified models are common (nearly one-third) in highly ranked journals. Other researchers confirm this conclusion, supporting the view that misspecifications result in measurement error
(e.g., Petter, Straub, & Rai, 2007). Additionally, misspecifications affect
structural models and influence theory testing (Edwards & Bagozzi,
2000). Bagozzi (1994, p. 333) stresses that “an index is more abstract
and ambiguous than a latent variable measured with reflective indicators”.
Similar to multiple regression analysis, omitting relevant variables from a formative measure leads to estimation problems and a substantially different index. One source for identifying such problems is the
error term. A high error term suggests revisiting the latent variable's
conceptual definition (Diamantopoulos, 2006; MacCallum & Browne,
1993).
The third step assesses multicollinearity. Since a formative measure is based on multiple regression analysis, the sample size and
intercorrelation strength between the indicators affect coefficients'
stability. Indicators' magnitudes serve as validity coefficients due to
the direct relationship between each indicator and the underlying
construct (Bollen, 1989). Multicollinearity suggests the indicators
contain unnecessary, redundant information. Both, indicator validity
bias and estimation difficulties may arise (Albers & Hildebrandt,
2006; Bollen & Lennox, 1991). The variance inflation factor (VIF)
and tolerance test for multicollinearity (Kleinbaum, Kupper, & Müller,
1988).
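This collinearity check can be illustrated with a small sketch (simulated data, not the study's; the helper name `vif_tolerance` is ours). For each indicator j, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing indicator j on the remaining indicators, and tolerance is simply 1 − R²_j:

```python
import numpy as np

def vif_tolerance(X):
    """Return (VIF, tolerance) per column, from regressing each indicator
    on the remaining ones: VIF_j = 1 / (1 - R2_j), tolerance = 1 - R2_j."""
    n, k = X.shape
    results = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        r2 = 1.0 - np.var(y - others @ beta) / np.var(y)
        results.append((1.0 / (1.0 - r2), 1.0 - r2))
    return results

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))      # three simulated formative indicators
X[:, 2] += 0.4 * X[:, 0]           # mild overlap between two of them
for j, (v, t) in enumerate(vif_tolerance(X)):
    print(f"indicator {j}: VIF = {v:.2f}, tolerance = {t:.2f}")
```

With this mild overlap, all VIFs stay well below the conventional cut-off of 10 and tolerances stay above .3, mirroring the checks reported later in the study.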
The final step concerns external validity. Since assessing indicator
appropriateness is problematic, “the best we can do … is to examine
how well the index relates to measures of other variables” (Bagozzi,
1994, p. 333). This process requires correlating each indicator to a
feasible, theory-based variable not part of the index (see Spector,
1992). A more common approach employs a multiple indicators and
multiple causes (MIMIC) model linking the index as an antecedent
to theoretical constructs (Hauser & Goldberger, 1971; Jöreskog &
Goldberger, 1975). A good fit of the MIMIC model suggests indicator
suitability for the formative construct (Diamantopoulos & Winklhofer,
2001). This approach evaluates nomological validity. To measure overall fit, inspection of stand-alone (e.g., chi-square, RMSEA) and incremental fit indices (e.g., TLI, CFI) follows standard cut-off criteria (see
Hu & Bentler, 1995). Finally, the index requires cross-validation
(Cudeck & Browne, 1983).
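To make the MIMIC idea concrete, here is a small simulation sketch (all parameter values are invented): formative causes drive a latent construct, which in turn emits reflective indicators. A real analysis would fit this structure simultaneously in an SEM package and inspect the fit indices; as a crude check here, regressing a composite of the reflective side on the causes roughly recovers the formative weights up to a scaling factor:

```python
import numpy as np

# Toy MIMIC structure: causes x -> latent eta -> reflective indicators y.
# All parameters are invented for illustration; a real analysis would fit
# this simultaneously in an SEM package and inspect chi-square/RMSEA/CFI/TLI.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))                      # formative cause indicators
gamma = np.array([0.5, 0.3, 0.2])                # formative weights
eta = X @ gamma + rng.normal(scale=0.3, size=n)  # latent construct + disturbance
lam = np.array([0.9, 0.8, 0.7])                  # reflective loadings
Y = np.outer(eta, lam) + rng.normal(scale=0.4, size=(n, 3))

# Crude recovery check: the composite of the reflective indicators relates
# to the causes in proportion to gamma (scaled by the mean loading).
composite = Y.mean(axis=1)
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, composite, rcond=None)
print(np.round(beta[1:] / lam.mean(), 2))        # close to gamma
```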
3. Website performance: conceptual measure development
Website performance measures overall appeal from a consumer's
point of view. The performance literature suggests that indicators cause
performance and not the other way round. Formative rather than reflective
measures best assess website performance (Diamantopoulos, 1999).
To define a construct, Rossiter (2002) suggests clarifying the rater
entity, the object, and the attribute. In this case, tourists searching information online are the rater entities, websites providing travel information are objects, and website performance evaluation is the
attribute. To better understand website performance and the subsequent measurement development, a qualitative survey was conducted. Data were collected through in-depth interviews (Bruhn,
Georgi, & Hadwich, 2008). Explorative, open-ended interviews collected 35 narratives from tourists. Respondents provided detailed answers and multiple perspectives about their conceptualization of
website performance. The qualitative results confirm the definition
of website performance and support the sub-dimensions discussed
in the next section following Diamantopoulos and Winklhofer's
(2001) four stages of index construction.
3.1. Scope of the construct
The index intends to capture website performance based on previous literature and qualitative primary data. After defining website performance, interviewees broke the concept down into sub-dimensions
relevant to performance. Meta-analyses by DeLone and McLean (1992,
2003) and Park and Gretzel (2007) provide an overview of concepts
considered necessary for website success. Condensing the qualitative study data and the literature review uncovers the consistently mentioned dimensions for the formative index.
The first dimension is system availability. This dimension refers to
the website's technical functionality and performance. Web users
quickly leave unresponsive or slow loading websites (Parasuraman
et al., 2005; Zeithaml, Parasuraman, & Malhotra, 2002). Ease of use
and usefulness are also key components. A useful system allows
users to find information without a great effort. Websites must be
easy to use and provide useful information (Collier & Bienstock,
2006; Davis, 1989; Novak, Hoffman, & Yung, 2000). Furthermore,
website users dislike navigational challenges. Effective websites need
to be well-structured, easy, and intuitive to navigate (De Marsico &
Levialdi, 2004; Rossi, Schwabe, & Lyardet, 1999). The qualitative interviews also support the importance of website attractiveness. Website design is a key driver of performance perception. This dimension deals with the site's
visual appearance including characteristics such as color, text presentation, pictures, videos, and sound (De Marsico & Levialdi, 2004;
Norman, 2002; Shneiderman, 2003). Not surprisingly, content quality
is of utmost importance. High-quality websites provide compelling
content (Barnes & Vidgen, 2000; Novak et al., 2000). Travelers access
a website searching for information; however, they also want to be
entertained (Koufaris & Hampton-Sosa, 2002; Park & Gretzel, 2007;
Venkatesh, 2000). Interview respondents mentioned exploratory
browsing and fun during their interaction with a website. Finally,
the interviews confirm website trust leads to increased usage and repeat visitors (Dickinger, 2011; Gefen, Straub, & Boudreau, 2000). The
formative construct comprises eight dimensions. Fig. 1 (H1a–h) captures the scope of website performance, including mutually exclusive and necessary aspects.
3.2. Indicator specification
Interviewees' suggestions helped specify indicators capturing the
formative construct's distinct dimensions. After explaining the website
performance dimensions, interviewees formulated precise questions
for each identified dimension. These questions were categorized and
structured by the researchers (Strauss & Corbin, 1998). Some items
were deleted due to redundancy; other items were refined or
rephrased. Respondents were contacted again to obtain feedback on
the completeness, phrasing, and appropriateness of the listed items.
The respondents consistently agreed that the listed items accurately
evaluate website performance for all eight dimensions. The procedure
also indicates a single item measurement approach is suitable. This exercise identified the best measurements for each dimension (Rossiter,
2002). Respondents picked the most precise items for each concept
and performed some final adaptations or rephrasing (see Table 3).
3.3. Indicator collinearity and external validity
Next, multicollinearity between indicators and the index's external validity need to be examined. Examining this broader framework
includes assessing a website's impact on satisfaction as well as on
site's value influencing loyalty. The literature commonly suggests
these relationships (e.g., Cronin, Brady, & Hult, 2000; Parasuraman
et al., 2005). Website performance (H1a-h) impacts satisfaction
(H2) and perceived value (H3) which in turn influence loyalty (H4
and H5 respectively). Fig. 1 shows the full research model.
Fig. 1. Website performance index.
4. Study design and measurement
To develop a sound formative measure, the current study follows
the four stages suggested in literature (content specification, indicator specification, collinearity test, and external validity) as well as
model cross-validation with a fresh data set.
The literature and the qualitative survey identify eight dimensions that capture the whole scope of the focal construct, website performance: system availability, ease of use, usefulness, navigational challenge, website design, content quality, trust, and enjoyment. Each
dimension's indicators were identified using a qualitative study. An
online questionnaire collected data. The questionnaire's first page
Table 3
Measurement items used in the research model.

Formative concept: website performance
H1a usefulness: I find the website useful for my search task.
H1b ease of use: I find it easy to get the website to do what I want it to do.
H1c enjoyment: I find the website entertaining.
H1d website design: I like the look and feel of the website.
H1e trust: I think the website is trustworthy.
H1f content quality: The website communicates relevant information.
H1g navigational challenge: It is easy to understand the overall navigation structure of the website.
H1h system availability: Pages on this site do not stop loading when I search.

Reflective concepts (factor loadings, Study 1 / Study 2)

Value (adapted from Brady et al., 2005; Cronin et al., 2000)
The search results are worth the effort. (.911 / .840)
The value received through the search justifies the effort. (.953 / .805)

Satisfaction (adapted from Brady et al., 2005; Cronin et al., 2000)
This website is what you need when you search for information. (.838 / .675)
After the search I was satisfied with the results. (.867 / .501)
I will recommend the website. (.833 / .746)
I am satisfied with the website. (.892 / .753)

Loyalty intentions (adapted from Moon & Kim, 2001; Zeithaml et al., 1996)
I will use the website on a regular basis in the future. (.867 / .870)
I will frequently use the website in the future. (.887 / .789)
I will strongly recommend others to use the website. (.865 / .839)
explains the study's context and the purpose. Details include the duration for completing the questionnaire, background information, and
an invitation to participate. First, respondents were asked to read a
search task and perform the task following a link to a tourism website. This site contained destination information from travel blogs,
media, magazines, and tourism organizations. After finishing the
search task, respondents completed the questionnaire.
Three latent reflective concepts assess external validity (satisfaction, value, and loyalty). They were measured by adapting previously
developed and tested Likert scales. Satisfaction and loyalty were measured by four and three items respectively adapted from Moon & Kim
(2001), Zeithaml et al. (1996), Brady et al. (2005) and Cronin et al.
(2000). Value was measured with two items adapted from Cronin et
al. (2000). The measurement instrument was pre-tested by 42 students
to assure clarity and readability.
The reflective latent concepts are evaluated applying Fornell and
Larcker's (1981) approach. This technique would not be feasible for
the formative part of the measurement instrument. Average variance
extracted (AVE) is in an acceptable range between .72 and .87, and construct reliability (CR) is between .91 and .93; both exceed the cut-off values of .5 (AVE) and .7 (CR). Discriminant validity is satisfied because squared shared variance does not exceed AVE (see Table 4).
The variance inflation factor (VIF) provides information on the presence of collinearity. VIF scores should remain below 10; the results show all VIFs are less than two (Kleinbaum et al., 1988). Furthermore, the tolerances were greater than .6, above the recommended minimum value of .3 (Diamantopoulos & Siguaw, 2006).
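The AVE and CR figures above follow the standard Fornell and Larcker (1981) formulas and can be verified directly from the Study 1 loadings in Table 3 (a sketch; the helper names are ours, not the paper's):

```python
# Standard Fornell-Larcker quantities, computed from the standardized
# Study 1 loadings reported in Table 3 (function names are ours).
def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum lambda)^2 / ((sum lambda)^2 + sum(1 - lambda^2))."""
    s = sum(loadings) ** 2
    e = sum(1 - l ** 2 for l in loadings)
    return s / (s + e)

loyalty = [.867, .887, .865]   # Study 1 loyalty-intention loadings
print(round(ave(loyalty), 3))                    # .762, Table 4 diagonal
print(round(composite_reliability(loyalty), 2))  # .91, the reported CR
```

The computed values reproduce the loyalty entries reported in Table 4, which makes the evaluation easy to audit.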
5. Analyses
The model was estimated using MPlus, a second generation SEM
tool (Muthén & Muthén, 2007). This tool offers some key advantages.
First, estimators do not require normal distribution or metric data.
Furthermore, the software allows for specification of formative measures. Fig. 1 shows both an estimated formative index and a multiple
Table 4
Evaluation of the measurement model.

Construct          CR    1      2      3
1. Loyalty         .91   .762
2. Value           .93   .460   .869
3. Satisfaction    .93   .621   .723   .716

Note: Average variance extracted is reported on the diagonal.
indicators multiple causes (MIMIC) model (Jöreskog & Goldberger,
1975). This analysis allows reflective indicators, providing fit indicators to help assess overall model fit. Further index assessment includes examining the individual γ-parameters (Diamantopoulos &
Winklhofer, 2001).
6. Results

6.1. Sample profile

Data cleaning reduced the sample to 445 fully completed questionnaires. Respondent gender included more females (58.4%) than males (41.6%). The average age of the sample is 28.3 years. For education level, more than half of the sample finished high school (53.9%), 36.6% acquired a university degree, and the rest completed compulsory schooling or vocational training, or did not specify their education. Most respondents can be classified as experienced internet users: more than 84% of respondents go online constantly or several times a day.

6.2. Estimation of the research model

Incremental fit indices as well as stand-alone fit indices show the model fits the data well. The comparative fit index (CFI) and the Tucker–Lewis index (TLI) results are .957 and .945 respectively. Root Mean Square Error of Approximation (RMSEA) is satisfactory (.058). Table 5 details the fit indicators.

Now the proposed formative measure is examined in more detail. Usefulness (γ = .22; p < .001), ease of use (γ = .19; p < .001), enjoyment (γ = .10; p < .008), website design (γ = .16; p < .001), trust (γ = .07; p < .020), content quality (γ = .34; p < .001), navigational challenge (γ = .12; p < .001), and system availability (γ = .07; p < .020) return significant coefficients in support of H1a through H1h. The r-square of the index is .77. These results confirm the proposed formative index for website performance. Next, investigating the results of the structural part (paths between the formative index and the reflective concepts) provides nomological validation. Diamantopoulos and Winklhofer (2001) suggest linking the index to reflective constructs with which it would normally be linked. The results show positive effects of website performance on satisfaction (β = .98; p < .001) and perceived value (β = .85; p < .001). These results confirm H2 and H3. In line with H4, satisfaction has a direct, positive effect on loyalty (β = .86; p < .001). However, the effect of value on loyalty is not significant.

6.3. Cross-validation
Cudeck and Browne (1983) recommend cross-validating the constructed index with fresh data. This cross-validation was performed by evaluating a national tourism organization's website. Data were collected through an online survey using the first study's questionnaire. A total of 316 usable questionnaires were collected. This sample's gender distribution is similar, with 53.5% female and 46.5% male respondents. Respondents' average age is about the same as in the first sample (27.7 years). Furthermore, participants also are highly educated; 59.8% finished high school and 31.3% obtained a university degree. Finally, participants of the second sample also are very knowledgeable about the internet, with more than 81% being online constantly or several times a day.
The variance inflation factor and tolerance values show good results. Investigating overall model fit shows values for CFI (.921) and TLI (.915) that are above the required cut-off value of .9. RMSEA (.066) remains well below the .08 threshold. Furthermore, the factor
loadings confirm an equally strong measurement model for the reflective measurements. These results confirm the evaluation of the
measurement instrument from Study 1.
Again, all the indicators forming the index show significant effects.
The r-square of website performance is high at .72. Usefulness (γ = .22; p < .001), enjoyment (γ = .24; p < .001), content quality (γ = .19; p < .001), and ease of use (γ = .23; p < .001) show the highest contributions to the index, as in the first study. Also, the effects within the model's structural part show the same patterns. The effects of website performance on satisfaction and value are again positive and highly significant (β = .98 and β = .91). Value's effect on loyalty is not significant; however, the satisfaction on loyalty relationship is
Table 5
Results of the model test and the cross validation.

Model for website performance (standardized γ):
                              Index construction (Study 1)   Cross validation (Study 2)
Usefulness                    .22 (p < .001)                 .22 (p < .001)
Ease of use                   .19 (p < .001)                 .23 (p < .001)
Enjoyment                     .10 (p < .008)                 .24 (p < .001)
Website design                .16 (p < .001)                 .12 (p < .009)
Trust                         .07 (p < .020)                 .13 (p < .001)
Content quality               .34 (p < .001)                 .19 (p < .001)
Navigational challenge        .12 (p < .001)                 .09 (p < .025)
System availability           .07 (p < .020)                 .16 (p < .001)

Structural paths (standardized β):
H2: website performance → satisfaction   .98 (p < .001)      .98 (p < .001)
H3: website performance → value          .85 (p < .001)      .91 (p < .001)
H4: satisfaction → loyalty               .86 (p < .001)      1.315 (p < .001)
H5: value → loyalty                      n.s.                n.s.

r-square:
Website performance           .77                            .72
Loyalty                       .61                            .76
Value                         .74                            .83
Satisfaction                  .97                            .98

Fit indicators:
CFI                           .957                           .921
TLI                           .945                           .915
RMSEA                         .058                           .066
confirmed. Surprisingly, the β-coefficient for satisfaction on loyalty is
above one. To rule out multicollinearity as a reason for the inflated
coefficient, the variance inflation factor and tolerance are inspected
(Kline, 2004; Maruyama, 1997). Inspecting the residuals rules out a
Heywood case as the cause of the inflated estimate (Dillon, Kumar,
& Mulani, 1987). Jöreskog (1999) notes that standardized coefficients
above one are acceptable when interpreted as regression coefficients.
Thus, both the index's cross-validation and the test of the whole model
are successful.
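The multicollinearity check mentioned above, inspecting variance inflation factors and tolerance values, can be sketched in Python. This is an illustration with hypothetical data, not the authors' analysis; in practice the predictor matrix would hold the model's dimension scores.

```python
import numpy as np

def vif_and_tolerance(X):
    """Variance inflation factor and tolerance for each column of the
    predictor matrix X (n observations x k predictors).
    VIF_j = 1 / (1 - R^2_j), where R^2_j is the R-square from regressing
    predictor j on the remaining predictors; tolerance_j = 1 / VIF_j."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # auxiliary regression of predictor j on all other predictors
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs, 1.0 / vifs

# Hypothetical data: two nearly collinear predictors and one independent one.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # almost a copy of x1
x3 = rng.normal(size=200)
vif, tol = vif_and_tolerance(np.column_stack([x1, x2, x3]))
print(vif.round(1), tol.round(3))
```

A common rule of thumb flags VIF values above 10 (tolerance below .10) as serious multicollinearity; in this toy example the collinear pair far exceeds that threshold while the independent predictor stays near 1.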
7. Discussion of the findings
Study 1 results show that the most important website performance
dimensions are content quality and usefulness, followed by ease of use
and website design. The effects of trust and system availability are not
as strong as those of the dimensions just mentioned; however, they are
still highly significant. This finding also confirms results from the
qualitative study: respondents note that trust is important for
information websites but even more essential for transaction sites. For
online booking, a website must make a greater effort to gain trust than
for information provision. The proposed index captures all relevant
dimensions and applies to multiple settings. The results show that the formative
index performs well because all identified dimensions show highly
significant γ-values. The hypothesized effects on behavioral variables
such as value and satisfaction are confirmed. These results are consistent with website quality/satisfaction/performance literature (Cronin
et al., 2000; DeLone & McLean, 2003; Oliver, 1997). The tested model
demonstrates the quality of the proposed formative measurement approach for website performance. Cross-validation also confirms the
formative website performance measure.
7.1. Theoretical implications
This research challenges the common reflective approach to website
evaluation by employing a formative measurement instrument. An in-depth
literature review and a qualitative study involving internet users lead
to the development of a formative index. The index comprises eight
dimensions representing the defined scope of the construct: system
availability, ease of use, usefulness, navigational challenge, website
design, content quality, trust, and enjoyment.
Petter et al. (2007) stress that measures used in previous studies need
careful evaluation to ensure that the chosen measurement approach is the
most appropriate. The present study demonstrates the need for a thorough
theoretical foundation when deciding on a measurement paradigm. To assist
researchers, this paper shows how the literature guides concept
definition. Then, the study presents a step-by-step approach to designing
adequate items for a formative construct (Bruhn et al., 2008). This process is essential
even if items for reflective measures exist because two completely
different approaches are employed. While the literature provides
many possible variables for testing reflective constructs (e.g., Barnes
& Vidgen, 2000; Moon & Kim, 2001), items for formative measures must be
identified anew. Formative measures are composites of indicators, and
their items cannot be selected from an item universe without first
establishing content validity. Thus, items of formative and reflective
measures differ from a conceptual point of view (DeVellis, 1991).
The qualitative study reveals the indicators relevant for the dimensions
of the construct. Study participants consistently agreed on the proposed
dimensions' importance, and a majority chose the same measurement items.
The qualitatively identified items perform well.
A formative model should be inclusive, preventing the uncontrolled
proliferation of new labels for website performance concepts. The present
study shows that all dimensions are essential, since all γ-values are
statistically significant. Some measures may be stronger in a different
consumption-behavior context (e.g., information search vs. purchasing
online). Including all theoretically sound dimensions prevents adding and
removing items arbitrarily. Taking these considerations into account
leads to a robust formative index.
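As a sketch of this formative logic, the index can be written as a weighted composite of its eight dimension scores. The γ weights below come from the Study 1 column of Table 5; the dimension ratings and the 1–7 scale are hypothetical illustrations, not data from the study.

```python
# Formative website performance index as a weighted composite of its
# dimensions (gamma weights from Study 1, Table 5). The ratings fed in
# below are hypothetical survey means on a 1-7 scale.

GAMMA = {
    "usefulness": 0.22,
    "ease_of_use": 0.19,
    "enjoyment": 0.10,
    "website_design": 0.16,
    "trust": 0.07,
    "content_quality": 0.34,
    "navigational_challenge": 0.12,
    "system_availability": 0.07,
}

def performance_index(scores):
    """Weighted sum over all dimensions. Every indicator must be present:
    dropping a formative indicator changes the meaning of the construct."""
    missing = set(GAMMA) - set(scores)
    if missing:
        raise ValueError(f"missing formative indicators: {sorted(missing)}")
    return sum(GAMMA[d] * scores[d] for d in GAMMA)

site = {d: 5.0 for d in GAMMA}      # hypothetical ratings
site["content_quality"] = 6.5       # the most heavily weighted dimension
print(round(performance_index(site), 2))
```

Because the indicators form the construct rather than reflect it, all eight enter the sum; this mirrors the argument that omitting a dimension alters, rather than merely shortens, the measure.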
7.2. Managerial implications
The proposed website performance index is a robust, parsimonious, and
simple measurement instrument. It has an advantage over most reflective
measures because items for reflective constructs are often chosen
arbitrarily (DeVellis, 1991; Rossiter, 2002), decreasing accuracy as well
as constraining comparability.
The index allows organizations to evaluate their websites and
assess the impact of website performance on value and satisfaction.
For companies, knowledge about consumers' needs is essential to
design effective websites. Satisfied users become loyal customers. A
large, loyal customer base is particularly important for transaction
sites or websites dependent on sponsors. Appropriate measures of
customer needs are essential. Applying mis-specified measures leads to
estimation problems that affect inference in hypothesis testing and, in
turn, to poor management decisions (MacCallum & Browne, 1993). In the
worst case, resources allocated to improving the website may not
contribute to improved performance at all.
The website performance index is valuable to companies that 1) want to
evaluate their website against customers' needs or requirements with an
easy-to-use, parsimonious measurement instrument; 2) seek to assess
website performance continuously without demanding excessive customer
time; and 3) want to establish a benchmark and compare their website
performance over time and/or with competitors.
7.3. Limitations and future research
This study has limitations to be addressed in future research. First, the
study needs replication across different contexts and with different
samples. The conceptual framework needs testing with more
transaction-oriented websites; such tests should yield higher
coefficients for system availability and trust, since respondents in a
transaction setting are more vulnerable to system failure (Dickinger,
2011; Gefen et al., 2000). Furthermore, additional content validation in
different contexts is worthwhile. Follow-up validation studies would
demonstrate the generalizability of the index across different settings.
Second, the model was developed and validated with data from one country.
Would the same formative index reflect the view of a more general,
worldwide audience? Replicating the study (i.e., the qualitative
item-identification study and the quantitative model-testing studies)
across different countries would provide insight into whether these
dimensions are global.
Finally, this work encourages further discussion of which paradigm is
more appropriate depending on research questions and theoretical
backgrounds. This debate encourages further index development and leads
to more parsimonious measurement instruments for market research. An
extensively validated and widely adopted website performance model
facilitates further research and insights regarding online consumer
behavior and preferences. The present study makes a significant step in
this direction.
References
Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision
Processes 1991;50:179–211.
Albers S, Hildebrandt L. Methodische Probleme Bei Der Erfolgsfaktorenforschung –
Messfehler, Formative Versus Reflektive Indikatoren Und Die Wahl Des
Strukturgleichungs-Modells. Zeitschrift für betriebswirtschaftliche Forschung
2006;58(1):2-33.
Bagozzi RP. Structural equation models in marketing research: basic principles. In: Bagozzi
RP, editor. Principles of marketing research. Cambridge, MA: Blackwell; 1994.
Barnes S, Vidgen R. Webqual: an exploration of web site quality. Proceedings of the
eighth European conference on information systems, Vienna; 2000.
Barnes S, Vidgen R. An integrative approach to the assessment of e-commerce quality.
Journal of Electronic Commerce Research 2002;3(3):114–27.
Bieger T, Laesser C. Information sources for travel decisions: toward a source process
model. Journal of Travel Research 2004;24:357–71. [May].
Bollen K. Structural equations with latent variables. New York: John Wiley & Sons; 1989.
Bollen K, Lennox R. Conventional wisdom on measurement: a structural equation perspective. Psychological Bulletin 1991;110(2):305–14.
Brady MK, Knight GA, Cronin JJJ, Hult TM, Keillor BD. Removing the contextual lens: A
multinational, multi-setting comparison of service evaluation models. Journal of
Retailing 2005;81(3):215–30.
Bruhn M, Georgi D, Hadwich K. Customer equity management as formative secondorder construct. Journal of Business Research 2008;61:1292–301.
Chung T, Law R. Developing a performance indicator for hotel websites. Hospitality
Management 2003;22:119–25.
Churchill Jr GA. A paradigm for developing better measures of marketing constructs.
Journal of Marketing Research 1979;16(1):64–73.
Collier JE, Bienstock CC. Measuring service quality in e-retailing. Journal of Service Research 2006;8(3):260–75.
Collier JE, Bienstock CC. Model misspecification: contrasting formative and reflective
indicators for a model of e-service quality. Journal of Marketing Theory and Practice
2009;17(3):283–93.
Crompton JL. Motivations for pleasure vacation. Annals of Tourism Research 1979;6:408–22.
Cronin JJJ, Brady MK, Hult GTM. Assessing the effects of quality, value, and customer
satisfaction on consumer behavioral intentions in service environments. Journal
of Retailing 2000;76(2):193–218.
Cudeck R, Browne MW. Cross-validation of covariance structures. Multivariate Behavioral Research 1983;18(2):147.
Dabholkar PA, Bagozzi RP. An attitudinal model of technology-based self-service: moderating effects of consumer traits and situational factors. Journal of the Academy of
Marketing Science 2002;30(3):184–201.
Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information
technology. The Management Information Systems Quarterly 1989;13(3):319–40.
Davis FD, Bagozzi RP, Warshaw PR. User acceptance of computer technology: a comparison
of two theoretical models. Management Science 1989;35(8):982-1003.
De Marsico M, Levialdi S. Evaluating web sites: exploiting user's expectations. International Journal of Human Computer Studies 2004;60:381–416.
DeLone WH, McLean ER. Information systems success: the quest for the dependent variable.
Information Systems Research 1992;3(1):60–95.
DeLone WH, McLean ER. The Delone and Mclean model of information systems success: a
ten-year update. Journal of Management Information Systems 2003;19(4):9-30.
DeVellis RF. Scale development: theory and applications. Newbury Park, CA: Sage Publications; 1991.
Diamantopoulos A. Export performance measurement: reflective versus formative indicators. International Marketing Review 1999;16(6):444–57.
Diamantopoulos A. The error term in formative measurement models: interpretation
and modeling implications. Journal of Modeling in Management 2006;1(1):7-17.
Diamantopoulos A, Siguaw JA. Formative versus reflective indicators in organizational
measure development: a comparison and empirical illustration. British Journal of
Management 2006;17:263–82.
Diamantopoulos A, Winklhofer HM. Index construction with formative indicators: an alternative to scale development. Journal of Marketing Research 2001;38(2):269–77.
Dickinger A. The trustworthiness of online channels for experience- and goal-directed search tasks. Journal of Travel Research 2011;50(4):378–91.
Dillon WR, Kumar A, Mulani R. Offending estimates in covariance structure analysis:
comments on the causes of and solutions to Heywood cases. Psychological Bulletin
1987;101(1):126–35.
Edwards JR, Bagozzi RP. On the nature of and direction of relationships between constructs. Psychological Methods 2000;5(2):155–74.
Fishbein M, Ajzen I. Belief, attitude, intention and behavior: an introduction to theory
and research. MA: Addison-Wesley, Reading; 1975.
Fornell C, Larcker DF. Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 1981;18(1):39–50.
Fornell C, Bookstein FL. Two structural equation models: Lisrel and Pls applied to consumer exit-voice theory. Journal of Marketing Research 1982;14:440–52.
Gable GG, Sedera D, Chan T. Re-conceptualizing information system success: the isimpact measurement model. Journal of the Association for Information Systems
2008;9(7):377–408.
Gefen D, Straub DW, Boudreau M-C. Structural equation modeling and regression
guidelines for research practice. Communications of the Association for Information Systems 2000;4(7):1-78.
Gursoy D, McCleary KW. An Integrative model of tourists' information search behavior.
Annals of Tourism Research 2004;31(2):353–73.
Hauser RM, Goldberger AS. The treatment of unobservable variables in path analysis.
In: Costner HL, editor. Sociological methodology. San Francisco: Jossey-Bass; 1971.
Hsu SH. Developing an index for online customer satisfaction: adaptation of American customer satisfaction index. Expert Systems with Applications 2008;34(4):3033–42.
Hu L-T, Bentler PM. Evaluating model fit. In: Hoyle RH, editor. Structural equation modeling:
concepts, issues and applications. Thousand Oaks: Sage Publications; 1995.
Jarvis CB, MacKenzie SB, Podsakoff PM. A critical review of construct indicators and
measurement model misspecification in marketing and consumer research. The
Journal of Consumer Research 2003;30(2):199–218.
Jöreskog KG. How large can a standardized coefficient be? 1999. Retrieved from http://www.ssicentral.com/lisrel/techdocs/HowLargeCanaStandardizedCoefficientbe.pdf.
Jöreskog KG, Goldberger AS. Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association
1975;10:631–9.
Kim H, Fesenmaier DR. Persuasive design of destination web sites: an analysis of first
impression. Journal of Travel Research 2008;47:3-13.
Kleinbaum DG, Kupper L, Müller KE. Applied regression analysis and other multivariate
methods. Boston: PWS-Kent; 1988.
Kline RB. Principles and practice of structural equation modeling. New York: Guilford
Press; 2004.
Koufaris M. Applying the technology acceptance model and flow theory to online consumer behavior. Information Systems Research 2002;13(2):205–23.
Koufaris M, Hampton-Sosa W. Customer trust online: examining the role of the experience with the web-site. 2002. Retrieved 4 December 2008, from http://cisnet.baruch.cuny.edu/papers/cis200205.pdf.
Lu J. Development, distribution and evaluation of online tourism services in China.
Electronic Commerce Research 2004;4:221–39.
MacCallum RC, Browne MW. The use of causal indicators in covariance structure
models: some practical issues. Psychological Bulletin 1993;114:533–41.
MacKenzie SB, Podsakoff PM, Jarvis CB. The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. The Journal of Applied Psychology 2005;90(4):710–30.
Marcussen C. Trends in European internet distribution — of travel and tourism services; 2009. Retrieved 25 March 2010, from http://www.crt.dk/UK/staff/chm/trends.htm.
Maruyama GM. Basics of structural equation modeling. Thousand Oaks: Sage; 1997.
Mathieson K, Peacock E, Chin WW. Extending the technology acceptance model: the
influence of perceived user resources. Database for Advances in Information Systems
2001;32(3):86-112.
Mathwick C, Rigdon E. Play, flow, and the online search experience. The Journal of Consumer Research 2004;31:324–32.
Moon J-W, Kim Y-G. Extending the tam for a world-wide-web context. Information
Management 2001;38:217–30.
Muthén L, Muthén B. Mplus user's guide: statistical analysis with latent variables. Los
Angeles: Muthén & Muthén; 2007.
Norman DA. The design of everyday things. New York: Basic Books; 2002.
Novak TP, Hoffman DL, Yung Y-F. Measuring the customer experience in online environments: a structural modeling approach. Marketing Science 2000;19(1):22–44.
Oliver RL. Satisfaction: a behavioral perspective on the consumer. New York: McGraw-Hill;
1997.
Parasuraman A, Zeithaml VA, Malhotra A. E-S-Qual: a multiple-item scale for assessing
electronic service quality. Journal of Service Research 2005;7(3):213–33.
Park YA, Gretzel U. Success factors for destination marketing web sites: a qualitative
meta-analysis. Journal of Travel Research 2007;46(1):46–63.
Petter S, Straub D, Rai A. Specifying formative constructs in information systems research. MIS Quarterly 2007;31(4):623–56.
Rice M. What makes users revisit a web site? Marketing News 1997;31(6):12.
Rossi G, Schwabe D, Lyardet F. Improving web information systems with navigational
patterns. Computer Networks 1999;31(11–16):1667–78.
Rossiter JR. The C-Oar-Se procedure for scale development in marketing. International
Journal of Research in Marketing 2002;19(4):305–35.
Rossiter JR. Reminder: a horse is a horse. International Journal of Research in Marketing
2005;22(1):23–5.
Scott J. The Measurement of information systems effectiveness: evaluating a measuring
instrument. Fifteenth international conference on information systems — ICIS,
Vancouver, British Columbia; 1994.
Serva MA, Benamati JS, Fuller MA. Trustworthiness in B2c E-commerce: an examination of alternative models. The DATA BASE for Advances in Information Systems
2005;36(3):89-108.
Shneiderman B. Designing your next generation foundation website. Foundation News
& Commentary 2003, November/December. p. 34–41.
Spector PE. Summated ratings scales construction. Newbury Park, CA: Sage Publications; 1992.
Strauss A, Corbin J. Basics of qualitative research: techniques and procedures for developing grounded theory. Newbury Park, CA: Sage Publications; 1998.
Van der Heijden H. Factors influencing the usage of websites: the case of a generic portal in
The Netherlands. Information Management 2003;40:541–9.
Venkatesh V. Determinants of perceived ease of use: integrating control, intrinsic motivation, and emotion into the technology acceptance model. Information Systems Research 2000;11(4):342–65.
Venkatesh V, Morris MG, Davis GB, Davis FD. User acceptance of information technology:
toward a unified view. The Management Information Systems Quarterly 2003;27(3):
425–78.
Vogt CA, Fesenmaier DR. Expanding the functional information search model. Annals of
Tourism Research 1998;25(3):551–78.
Wolfinbarger M, Gilly MC. Shopping online for freedom, control, and fun. California
Management Review 2001;43(2):34–55.
Zeithaml VA, Berry LL, Parasuraman A. The behavioral consequences of service
quality. Journal of Marketing 1996;60(2):31–46.
Zeithaml VA, Parasuraman A, Malhotra A. Service quality delivery through web sites: a
critical review of extant knowledge. Journal of the Academy of Marketing Science
2002;30(4):362–76.