Journal of Business Research xxx (2011) xxx–xxx

Website performance and behavioral consequences: A formative measurement approach

Astrid Dickinger a,⁎, Brigitte Stangl b

a MODUL University Vienna, Department of Tourism and Hospitality Management, Am Kahlenberg 1, 1190 Vienna, Austria
b HTW Chur, University of Applied Sciences, Institute for Tourism and Leisure Research, ITF, Comercialstrasse 22, 7000 Chur, Switzerland

Article history: Received 1 November 2010; received in revised form 1 June 2011; accepted 1 August 2011; available online xxxx.

Keywords: Website evaluation; Formative measurement; Reflective measurement; Structural equation modeling

Abstract: Increasing internet use to access information and book trips significantly impacts the tourism industry and calls attention to the need to learn more about website performance and evaluation. Specifically, how do tourists interact with websites? Which websites transform travelers into repeat visitors? Most research in this field employs multi-item measures conceptualized as reflective models. This paper suggests a theory-based alternative: a formative measurement approach for website performance. The construct comprises eight dimensions: system availability, ease of use, usefulness, navigational challenge, website design, content quality, trust, and enjoyment. Website users identify dimensional indicators using evaluation surveys. Structural equation modeling examines linkages between the website performance index and outcome measures (satisfaction, value, and loyalty). Data provided by 455 travelers show the formative index works well. A second sample of 316 respondents cross-validates the findings. The study develops a sound and parsimonious measure allowing the monitoring and benchmarking of traveler perceptions over time.

© 2011 Elsevier Inc. All rights reserved.
High-quality websites are critical because today's travelers increasingly search for information online and buy tourism-related products and services via the internet (Bieger & Laesser, 2004; Crompton, 1979; Gursoy & McCleary, 2004; Marcussen, 2009; Vogt & Fesenmaier, 1998). Compared to offline shopping experiences, online consumers rely on fewer senses, making them dependent on product and service descriptions and pictures (Koufaris, 2002). Consequently, online consumers require excellent website performance. Travelers planning a trip also expect web page content to be engaging and well designed (Rice, 1997; Van der Heijden, 2003). Not surprisingly, tourists demand websites that provide accurate information and easy navigation (Chung & Law, 2003; Lu, 2004). These expectations require effective website designs. To improve the online trip planning experience, a need exists to better conceptualize and operationalize website performance, addressing travelers' needs and their website behavior (Park & Gretzel, 2007). Most researchers use reflectively measured constructs to investigate website quality antecedents as well as outcome measures (e.g., satisfaction, positive word-of-mouth, or intention to use). Among the most common antecedents are ease of use and usefulness (Davis, 1989), enjoyment, website design, and content quality (De Marsico & Levialdi, 2004; Parasuraman, Zeithaml, & Malhotra, 2005; Van der Heijden, 2003; Venkatesh, Morris, Davis, & Davis, 2003). The literature's current focus discusses the suitability of multiple-item measurements. Rossiter (2005) summarizes this issue in his famous paper "Reminder: a horse is a horse." A second ongoing debate questions how to appropriately conceptualize and operationalize latent constructs (Collier & Bienstock, 2009; Diamantopoulos & Siguaw, 2006; Rossiter, 2002).

⁎ Corresponding author. E-mail addresses: astrid.dickinger@modul.ac.at (A. Dickinger), brigitte.stangl@htwchur.ch (B. Stangl).
Some constructs might be better represented by formative rather than by reflective indicators (see Diamantopoulos & Winklhofer, 2001). Although some studies on online formative measures exist (e.g., Collier & Bienstock, 2009; Gable, Sedera, & Chan, 2008; Mathieson, Peacock, & Chin, 2001), a need to develop alternative measurements for tourism still exists. The present study develops a formative measurement for website performance. Having a sound, parsimonious measure allows continual website evaluation. Users need not complete extensive questionnaires in order to tailor sites based on their behavior and needs. Thus, this study's contribution: 1) provides an overview of website performance and quality concepts and discusses the applied measurement approaches; 2) develops, tests, and cross-validates a parsimonious formative measure for website performance; and 3) discusses theoretical as well as managerial implications regarding the formative measure.

1. Literature review

1.1. Approaches for website evaluation

Over the last decade, IS (Information Systems), Marketing, and Tourism scholars have devoted considerable attention to websites and online shopping behavior. Most studies follow or extend popular cognitive behavioral models such as the Theory of Reasoned Action (Fishbein & Ajzen, 1975), Theory of Planned Behavior (Ajzen, 1991), Technology Acceptance Model (Davis, 1989), and Unified Theory of Acceptance and Use of Technology (Venkatesh et al., 2003).

0148-2963/$ – see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.jbusres.2011.09.017
Please cite this article as: Dickinger A, Stangl B, Website performance and behavioral consequences: A formative measurement approach, J Bus Res (2011), doi:10.1016/j.jbusres.2011.09.017
Marketing research examines the impact of antecedents on website quality and satisfaction (Barnes & Vidgen, 2002; DeLone & McLean, 2003; Parasuraman et al., 2005; Wolfinbarger & Gilly, 2001). More recently, meta-analyses reveal the most examined drivers of website performance (see Table 1; also see DeLone & McLean, 1992, 2003; Park & Gretzel, 2007). Table 1 shows most constructs are reflective latent variables.

Table 1. Website evaluation criteria.

1. Barnes and Vidgen (2002): WebQual. Qualitative and quantitative surveys, n = 46. Constructs: usability, design, information, trust, empathy, quality.
2. Davis, Bagozzi, and Warshaw (1989): TAM, TRA. Experiment, n = 107. Constructs: ease of use, usefulness, attitude, intention to use.
3. Dabholkar and Bagozzi (2002): TAM with extensions. Survey, experimental design, n = 392. Constructs: performance, fun, self-efficacy, novelty seeking, need for interaction, self-consciousness, perceived waiting time, social anxiety, attitude, ease of use, intention to use.
4. Park and Gretzel (2007): Meta-analysis of 153 academic papers. Constructs: ease of use, responsiveness, fulfillment, security/privacy, personalization, visual appearance, information quality, trust, interactivity.
5. Kim and Fesenmaier (2008): Website persuasiveness. Survey, n = 1416. Constructs: informativeness, usability, credibility, inspiration, involvement, reciprocity.
6. DeLone and McLean (1992): I/S Success. Conceptual study. Constructs: system quality, information quality, use, user satisfaction, individual impact, organizational impact.
7. Parasuraman et al. (2005): E-S-Qual. Survey, n = 549. Constructs: efficiency, system availability, fulfillment, privacy, responsiveness, compensation, contact, perceived value, loyalty intentions.
8. Van der Heijden (2003): TAM. Web-based survey, n = 828. Constructs: perceived attractiveness, perceived enjoyment, ease of use, usefulness, attitude, intention to use.
9. Venkatesh et al. (2003): UTAUT. Survey, n = 215. Constructs: effort expectancy, performance expectancy, facilitating conditions, social norms, intention to use, gender, age, experience, voluntariness of use.
10. Wolfinbarger and Gilly (2001): eTailQ. Focus groups, tasks, survey, n = 1013. Constructs: quality, fulfillment/reliability, website design, privacy/security, customer service.
11. Mathwick and Rigdon (2004): Survey, n = 110. Constructs: escapism, intrinsic enjoyment, attitude toward the brand and site, navigational challenge, internet search skill, internet usage, decisional control, product involvement.

Increasingly, scholars debate reflectively measured constructs' appropriateness (Diamantopoulos & Winklhofer, 2001; Edwards & Bagozzi, 2000; MacKenzie, Podsakoff, & Jarvis, 2005). Some fields now use alternative, formative measurement approaches, such as e-service quality (Collier & Bienstock, 2006; Collier & Bienstock, 2009; Hsu, 2008), trustworthiness (Serva, Benamati, & Fuller, 2005), information system success (Gable et al., 2008), and effectiveness (Scott, 1994). Diamantopoulos and Winklhofer (2001) stress a need to critically reflect on which approach is appropriate for the specific research question. Thorough theoretical considerations are imperative to decide the correct measurement perspective (Diamantopoulos & Winklhofer, 2001; Rossiter, 2002). An in-depth theoretical foundation is essential due to fundamental differences between the two alternatives (see Table 2).

Table 2. Reflective vs. formative constructs: comparison of characteristics.

Reflective constructs:
- Construct is reflected in the indicators.
- Account for observed variances in the outer model; error is assessed at the item level.
- Identification is achieved with three effect indicators.
- Important aspects: internal consistency or reliability; positive correlation between measures; unidimensionality allows for removing indicators to improve construct validity without affecting content validity.

Formative constructs:
- Construct is a composite of indicators.
- Minimize residuals in the structural relationship; error is assessed at the construct level.
- Identification is given only if the construct is embedded into a larger model.
- Important aspects: indicators examine different dimensions; multicollinearity is a problem; removing an indicator affects content validity.

Note: For both perspectives, theoretical relationships between construct and indicators need to be thought through thoroughly.

Reflectively conceptualized measures show changes in the construct's indicators (Diamantopoulos & Siguaw, 2006). Reflective constructs' items are "chosen randomly from the universe of items relating to the construct of interest" (DeVellis, 1991, p. 55) and often included or deleted arbitrarily. Not all items have an equal chance of inclusion, potentially leading to the inclusion of conceptually unnecessary items or the exclusion of necessary ones (Rossiter, 2002). Consequently, measures are often idiosyncratic, limiting comparability to other studies (Gable et al., 2008). With formative measures, "indicators could be seen as causing rather than being caused by the latent variable measured by the indicators" (MacCallum & Browne, 1993, p. 533). Thus, formative constructs are a linear function of their respective measures (Jarvis, MacKenzie, & Podsakoff, 2003). Consequently, cause indicators are not interchangeable like reflective measures. Content validity is affected when indicators are removed from the formative model. Changing one indicator affects the latent variable ceteris paribus. Reflective models account for observed variance in the measurement model, and the error is assessed at the item level.
Formative models minimize residuals in the structural (inner) model. The error is assessed at the construct level rather than with the observed variables and thus is not random measurement error. Instead, the error is a disturbance term of the latent formative construct, representing all causes not accounted for by the indicators, and it impacts the latent construct. Researchers suggest setting the error term to zero, but they stress that a theoretical foundation is needed to explain why the indicators capture the whole concept (Diamantopoulos, 2006). These differences need to be considered when a model is conceptualized and estimated. In contrast to reflective models, formative model identification is only achieved if effects on other constructs are included. For reflective measures, reliability, positive correlations between measures, and unidimensionality are prerequisites. Formative measures' indicators must examine different dimensions and avoid multicollinearity between the indicators (Bollen, 1989; Bollen & Lennox, 1991; Edwards & Bagozzi, 2000; Fornell & Bookstein, 1982). Table 2 summarizes the differences between the two measurement perspectives.

2. Index construction

Construct development guidelines from classical test theory do not apply to formative measures (Churchill, 1979). Following Diamantopoulos and Winklhofer (2001), four critical issues need to be addressed to successfully construct a formative measure. First, the latent variable must be defined properly to specify the construct's scope. Second, the indicators must cover the underlying construct's total scope.
Rossiter (2002) argues multiple items may increase a homogenous or well-known construct's weight in reflective measurement, even if a single-item measure serves the purpose and potentially is more accurate. Formative constructs require at least one indicator for each dimension. However, an excessive number of indicators must be avoided. To measure the whole scope while achieving parsimony, specifying the concept's dimensions and identifying items are crucial. This step requires guidance by a thorough literature review. Jarvis et al. (2003) reveal mis-specified models are common (nearly one-third) in highly ranked journals. Other researchers confirm this conclusion, supporting the view that misspecifications result in measurement error (e.g., Petter, Straub, & Rai, 2007). Additionally, misspecifications affect structural models and influence theory testing (Edwards & Bagozzi, 2000). Bagozzi (1994, p. 333) stresses that "an index is more abstract and ambiguous than a latent variable measured with reflective indicators". Similar to multiple regression analysis, omitting relevant variables from formative measures leads to estimation problems and a substantially different index. One source for identifying problems is the error term: a high error term suggests revisiting the latent variable's conceptual definition (Diamantopoulos, 2006; MacCallum & Browne, 1993). The third step assesses multicollinearity. Since a formative measure is based on multiple regression analysis, the sample size and the intercorrelation strength between the indicators affect the coefficients' stability.
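To illustrate this stability check, the following minimal sketch (simulated data, not the study's) computes each indicator's variance inflation factor, VIF_j = 1 / (1 − R²_j), by regressing indicator j on the remaining indicators, together with its tolerance (1/VIF):

```python
import numpy as np

def vif_and_tolerance(X):
    """Return VIF and tolerance for each column of indicator matrix X."""
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress indicator j on the remaining indicators (with intercept)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    vifs = np.array(vifs)
    return vifs, 1.0 / vifs

# Eight weakly correlated indicators for 455 simulated respondents
rng = np.random.default_rng(1)
X = rng.normal(size=(455, 8))
vif, tol = vif_and_tolerance(X)
print(vif.round(2), tol.round(2))  # VIFs near 1, tolerances near 1
```

With nearly independent indicators, as here, all VIFs stay close to one; values approaching the common cut-offs (VIF above 10, tolerance below .3) would flag redundant indicators.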
Indicators' magnitudes serve as validity coefficients due to the direct relationship between each indicator and the underlying construct (Bollen, 1989). Multicollinearity suggests the indicators contain unnecessary, redundant information. Both indicator validity bias and estimation difficulties may arise (Albers & Hildebrandt, 2006; Bollen & Lennox, 1991). The variance inflation factor (VIF) and tolerance statistics test for multicollinearity (Kleinbaum, Kupper, & Müller, 1988). The final step concerns external validity. Since assessing indicator appropriateness is problematic, "the best we can do … is to examine how well the index relates to measures of other variables" (Bagozzi, 1994, p. 333). This process requires correlating each indicator to a feasible, theory-based variable not part of the index (see Spector, 1992). A more common approach employs a multiple indicators and multiple causes (MIMIC) model linking the index as an antecedent to theoretical constructs (Hauser & Goldberger, 1971; Jöreskog & Goldberger, 1975). A good fit of the MIMIC model suggests indicator suitability for the formative construct (Diamantopoulos & Winklhofer, 2001). This approach evaluates nomological validity. To measure overall fit, inspection of stand-alone (e.g., chi-square, RMSEA) and incremental fit indices (e.g., TLI, CFI) follows standard cut-off criteria (see Hu & Bentler, 1995). Finally, the index requires cross-validation (Cudeck & Browne, 1983).

3. Website performance: conceptual measure development

Website performance measures overall appeal from a consumer's point of view. The performance literature suggests that indicators cause performance and not the other way round. Formative rather than reflective measures therefore best assess website performance (Diamantopoulos, 1999). To define a construct, Rossiter (2002) suggests clarifying the rater entity, the object, and the attribute.
In this case, tourists searching for information online are the rater entities, websites providing travel information are the objects, and website performance evaluation is the attribute. To better understand website performance and guide the subsequent measurement development, a qualitative survey was conducted. Data were collected through in-depth interviews (Bruhn, Georgi, & Hadwich, 2008). Explorative, open-ended interviews collected 35 narratives from tourists. Respondents provided detailed answers and multiple perspectives about their conceptualization of website performance. The qualitative results confirm the definition of website performance and support the sub-dimensions discussed in the next section, following Diamantopoulos and Winklhofer's (2001) four stages of index construction.

3.1. Scope of the construct

The index intends to capture website performance based on previous literature and qualitative primary data. After defining website performance, interviewees broke the concept down into sub-dimensions relevant to performance. Meta-analyses by DeLone and McLean (1992, 2003) and Park and Gretzel (2007) provide an overview of concepts considered necessary for website success. Condensing the qualitative study data and the literature review uncovers consistently mentioned dimensions for the formative index. The first dimension is system availability. This dimension refers to the website's technical functionality and performance. Web users quickly leave unresponsive or slow-loading websites (Parasuraman et al., 2005; Zeithaml, Parasuraman, & Malhotra, 2002). Ease of use and usefulness are also key components. A useful system allows users to find information without great effort. Websites must be easy to use and provide useful information (Collier & Bienstock, 2006; Davis, 1989; Novak, Hoffman, & Yung, 2000). Furthermore, website users dislike navigational challenges.
Effective websites need to be well-structured and easy and intuitive to navigate (De Marsico & Levialdi, 2004; Rossi, Schwabe, & Lyardet, 1999). The qualitative interviews support website attractiveness, too. Website design is a key driver of performance perception. This dimension deals with the site's visual appearance, including characteristics such as color, text presentation, pictures, videos, and sound (De Marsico & Levialdi, 2004; Norman, 2002; Shneiderman, 2003). Not surprisingly, content quality is of utmost importance. High-quality websites provide compelling content (Barnes & Vidgen, 2000; Novak et al., 2000). Travelers access a website searching for information; however, they also want to be entertained (Koufaris & Hampton-Sosa, 2002; Park & Gretzel, 2007; Venkatesh, 2000). Interview respondents mentioned exploratory browsing and fun during their interaction with a website. Finally, the interviews confirm website trust leads to increased usage and repeat visitors (Dickinger, 2011; Gefen, Straub, & Boudreau, 2000). The formative construct comprises eight dimensions. Fig. 1 (H1a–h) captures the scope of website performance, including mutually exclusive and necessary aspects.

3.2. Indicator specification

Interviewees' suggestions helped specify indicators capturing the formative construct's distinct dimensions. After explaining the website performance dimensions, interviewees formulated precise questions for each identified dimension. These questions were categorized and structured by the researchers (Strauss & Corbin, 1998). Some items were deleted due to redundancy; other items were refined or rephrased. Respondents were contacted again to obtain feedback on the completeness, phrasing, and appropriateness of the listed items. The respondents consistently agreed that the listed items accurately evaluate website performance for all eight dimensions. The procedure also indicates a single-item measurement approach is suitable.
This exercise identified the best measurements for each dimension (Rossiter, 2002). Respondents picked the most precise items for each concept and performed some final adaptations or rephrasing (see Table 3).

3.3. Indicator collinearity and external validity

Next, multicollinearity between indicators and the index's external validity need to be examined. Examining this broader framework includes assessing a website's impact on satisfaction as well as on the site's value influencing loyalty. The literature commonly suggests these relationships (e.g., Cronin, Brady, & Hult, 2000; Parasuraman et al., 2005). Website performance (H1a–h) impacts satisfaction (H2) and perceived value (H3), which in turn influence loyalty (H4 and H5, respectively). Fig. 1 shows the full research model.

Fig. 1. Website performance index.

4. Study design and measurement

To develop a sound formative measure, the current study follows the four stages suggested in the literature (content specification, indicator specification, collinearity test, and external validity) as well as model cross-validation with a fresh data set. The literature and the qualitative survey identify eight dimensions capturing the whole scope of the focal construct website performance: system availability, ease of use, usefulness, navigational challenge, website design, content quality, trust, and enjoyment. Each dimension's indicators were identified using a qualitative study. An online questionnaire collected data. The questionnaire's first page explains the study's context and purpose. Details include the duration for completing the questionnaire, background information, and an invitation to participate. First, respondents were asked to read a search task and perform the task following a link to a tourism website. This site contained destination information from travel blogs, media, magazines, and tourism organizations. After finishing the search task, respondents completed the questionnaire. Three latent reflective concepts assess external validity (satisfaction, value, and loyalty).

Table 3. Measurement items used in the research model.

Formative concept: website performance (single items)
H1a usefulness: I find the website useful for my search task.
H1b ease of use: I find it easy to get the website to do what I want it to do.
H1c enjoyment: I find the website entertaining.
H1d website design: I like the look and feel of the website.
H1e trust: I think the website is trustworthy.
H1f content quality: The website communicates relevant information.
H1g navigational challenge: It is easy to understand the overall navigation structure of the website.
H1h system availability: Pages on this site do not stop loading when I search.

Reflective concepts (factor loadings: Study 1 / Study 2)
Value (adapted from Brady et al., 2005; Cronin et al., 2000):
- The search results are worth the effort. (.911 / .840)
- The value received through the search justifies the effort. (.953 / .805)
Satisfaction (adapted from Brady et al., 2005; Cronin et al., 2000):
- This website is what you need when you search for information. (.838 / .675)
- After the search I was satisfied with the results. (.867 / .501)
- I will recommend the website. (.833 / .746)
- I am satisfied with the website. (.892 / .753)
Loyalty intentions (adapted from Moon & Kim, 2001; Zeithaml et al., 1996):
- I will use the website on a regular basis in the future. (.867 / .870)
- I will frequently use the website in the future. (.887 / .789)
- I will strongly recommend others to use the website. (.865 / .839)
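The reliability figures reported for the reflective concepts can be recomputed from standardized loadings such as those in Table 3. A minimal sketch using the Study 1 loadings of the two value items, with AVE as the mean squared loading and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)):

```python
import numpy as np

def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    lam = np.asarray(loadings)
    return np.mean(lam ** 2)

def composite_reliability(loadings):
    """Fornell-Larcker composite reliability from standardized loadings."""
    lam = np.asarray(loadings)
    num = lam.sum() ** 2
    return num / (num + np.sum(1 - lam ** 2))

value_study1 = [0.911, 0.953]  # value item loadings, Study 1 (Table 3)
print(round(ave(value_study1), 3))                    # 0.869
print(round(composite_reliability(value_study1), 2))  # 0.93
```

The computed values (.869 and .93) match the AVE diagonal and CR column reported for the value construct in Table 4.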
They were measured by adapting previously developed and tested Likert scales. Satisfaction and loyalty were measured by four and three items, respectively, adapted from Moon and Kim (2001), Zeithaml et al. (1996), Brady et al. (2005), and Cronin et al. (2000). Value was measured with two items adapted from Cronin et al. (2000). The measurement instrument was pre-tested by 42 students to assure clarity and readability. The reflective latent concepts are evaluated applying Fornell and Larcker's (1981) approach; this technique would not be feasible for the formative part of the measurement instrument. Average variance extracted (AVE) is in an acceptable range between .72 and .87, and construct reliability (CR) is between .91 and .93; both exceed the cut-off values of .5 (AVE) and .7 (CR). Discriminant validity is satisfied because squared shared variance does not exceed AVE (see Table 4).

Table 4. Evaluation of the measurement model.

Construct        CR    1     2     3
1 Loyalty        .91   .762
2 Value          .93   .460  .869
3 Satisfaction   .93   .621  .723  .716

Note: Average variance extracted is reported on the diagonal.

The variance inflation factor (VIF) provides information on the presence of collinearity. VIF scores should remain below 10; the results show all VIFs are less than two (Kleinbaum et al., 1988). Furthermore, the tolerances were greater than .6, above the recommended minimum value of .3 (Diamantopoulos & Siguaw, 2006).

5. Analyses

The model was estimated using Mplus, a second-generation SEM tool (Muthén & Muthén, 2007). This tool offers some key advantages. First, its estimators do not require normal distribution or metric data. Furthermore, the software allows for the specification of formative measures. Fig. 1 shows both an estimated formative index and a multiple indicators multiple causes (MIMIC) model (Jöreskog & Goldberger, 1975). This analysis allows reflective indicators, providing fit indicators to help assess overall model fit. Further index assessment includes examining the individual γ-parameters (Diamantopoulos & Winklhofer, 2001).

6. Results

6.1. Sample profile

Data cleaning reduced the sample to 445 fully completed questionnaires. Respondent gender included more females (58.4%) than males (41.6%). The average age of the sample is 28.3 years. For education level, more than half of the sample finished high school (53.9%), 36.6% acquired a university degree, and the rest completed compulsory schooling or vocational training, or did not specify their education. Most respondents can be classified as experienced internet users: more than 84% of respondents go online constantly or several times a day.

6.2. Estimation of the research model

Incremental fit indices as well as stand-alone fit indices show the model fits the data well. The comparative fit index (CFI) and the Tucker–Lewis index (TLI) results are .957 and .945, respectively. The root mean square error of approximation (RMSEA) is satisfactory (.058). Table 5 details the fit indicators. Now the proposed formative measure is examined in more detail. Usefulness (γ = .22; p < .001), ease of use (γ = .19; p < .001), enjoyment (γ = .10; p < .008), website design (γ = .16; p < .001), trust (γ = .07; p < .020), content quality (γ = .34; p < .001), navigational challenge (γ = .12; p < .001), and system availability (γ = .07; p < .020) return significant coefficients in support of H1a through H1h. The r-square of the index is .77. These results confirm the proposed formative index for website performance.

Next, investigating the results of the structural part (paths between the formative index and the reflective concepts) provides nomological validation. Diamantopoulos and Winklhofer (2001) suggest linking the index to reflective constructs with which it normally would be linked. The results show positive effects of website performance on satisfaction (β = .98; p < .001) and perceived value (β = .85; p < .001). These results confirm H2 and H3. In line with H4, satisfaction has a direct, positive effect on loyalty (β = .86; p < .001). However, the effect of value on loyalty is not significant.

Table 5. Results of the model test and the cross-validation.

Model for website performance     Study 1 (index construction)   Study 2 (cross-validation)
                                  Standardized γ   p-value       Standardized γ   p-value
Usefulness                        .22              <.001         .22              <.001
Ease of use                       .19              <.001         .23              <.001
Enjoyment                         .10              <.008         .24              <.001
Website design                    .16              <.001         .12              <.009
Trust                             .07              <.020         .13              <.001
Content quality                   .34              <.001         .19              <.001
Navigational challenge            .12              <.001         .09              <.025
System availability               .07              <.020         .16              <.001

Structural paths                  Standardized β   p-value       Standardized β   p-value
H2: website performance → satisfaction   .98       <.001         .98              <.001
H3: website performance → value          .85       <.001         .91              <.001
H4: satisfaction → loyalty               .86       <.001         1.315            <.001
H5: value → loyalty                      n.s.      n.s.          n.s.             n.s.

r-square                          Study 1                        Study 2
Website performance               .77                            .72
Loyalty                           .61                            .76
Value                             .74                            .83
Satisfaction                      .97                            .98

Fit indicators                    Study 1                        Study 2
CFI                               .957                           .921
TLI                               .945                           .915
RMSEA                             .058                           .066

6.3. Cross-validation

Cudeck and Browne (1983) recommend cross-validating the constructed index with fresh data. This cross-validation was performed by evaluating a national tourism organization's website. Data were collected through an online survey using the first study's questionnaire. A total of 316 usable questionnaires were collected. This sample's gender distribution is similar, with 53.5% female and 46.5% male respondents. Respondents' average age is about the same as in the first sample (27.7 years). Furthermore, participants also are highly educated; 59.8% finished high school and 31.3% obtained a university degree. Finally, participants of the second sample also are very knowledgeable about the internet, with more than 81% being online constantly or several times a day.

The variance inflation factor and tolerance values show good results. Investigating overall model fit shows values for CFI (.921) and TLI (.915) that are above the required cut-off value of .9. RMSEA (.066) remains below the .08 threshold. Furthermore, the factor loadings confirm an equally strong measurement model for the reflective measurements. These results confirm the evaluation of the measurement instrument from Study 1. Again, all the indicators forming the index show significant effects. The r-square of website performance is high at .72. Usefulness (γ = .22; p < .001), enjoyment (γ = .24; p < .001), content quality (γ = .19; p < .001), and ease of use (γ = .23; p < .001) show the highest contributions to the index, as found in the first study. Also, the effects within the model's structural part show the same patterns. The effects of website performance on satisfaction and value are again positive and highly significant (β = .98 and β = .91). Value's effect on loyalty is not significant; however, the satisfaction-on-loyalty relationship is confirmed. Surprisingly, the β-coefficient for satisfaction on loyalty is above one. To rule out multicollinearity as the reason for the inflated coefficient, the variance inflation factor and tolerance were inspected (Kline, 2004; Maruyama, 1997). Inspecting the residuals allows ruling out that a Heywood case causes the inflated estimate (Dillon, Kumar, & Mulani, 1987).
Jöreskog (1999) suggests that higher coefficients are acceptable when they are regarded as regression coefficients. Thus, the cross-validation of the index as well as of the whole model is successful.

7. Discussion of the findings

Study 1 results show that the most important website performance dimensions are content quality and usefulness, followed by ease of use and website design. The effects of trust and system availability are not as strong as those of the aforementioned dimensions; however, they remain statistically significant. This finding also confirms results from the qualitative study: respondents note that trust is important for information websites, but even more essential for transaction sites. For online booking, a website's effort to gain trust must be greater than for information websites. The proposed index captures all relevant dimensions and applies to multiple settings. The results show that the formative index performs well, as all identified dimensions show highly significant γ-values. The hypothesized effects on behavioral variables such as value and satisfaction are confirmed. These results are consistent with the website quality/satisfaction/performance literature (Cronin et al., 2000; DeLone & McLean, 2003; Oliver, 1997). The tested model demonstrates the quality of the proposed formative measurement approach for website performance. Cross-validation also confirms the formative website performance measure.

7.1. Theoretical implications

This research challenges the common reflective approach to evaluating websites by using a formative measurement instrument. An in-depth literature review and a qualitative study involving internet users lead to the development of a formative index. This index comprises eight dimensions representing the defined scope of the construct: system availability, ease of use, usefulness, navigational challenge, website design, content quality, trust, and enjoyment. Petter et al.
(2007) stress that measures used in previous studies need careful evaluation to ensure that the chosen measurement approach is the most appropriate one. The present study demonstrates the need for a thorough theoretical foundation when deciding upon a measurement paradigm. To assist researchers, this paper shows how the literature guides concept definition and then presents a step-by-step approach to designing adequate items for a formative construct (Bruhn et al., 2008). This process is essential even if items for reflective measures exist, because the two approaches differ fundamentally. While the literature provides many possible variables for testing reflective constructs (e.g., Barnes & Vidgen, 2000; Moon & Kim, 2001), a need remains to identify items for formative measures. These measures are a composite of indicators, and their items cannot be selected from a universe of items without first establishing content validity. Thus, items of formative and reflective measures differ from a conceptual point of view (DeVellis, 1991). The qualitative study reveals the indicators relevant for the dimensions of the construct. Study participants consistently agreed on the importance of the proposed dimensions, and a majority chose the same measurement items. The qualitatively identified items perform well. A formative model should be inclusive, avoiding the further uncontrolled appearance of new labels for website performance concepts. The present study shows that all dimensions are essential, since all γ-values are statistically significant. Some measures may be stronger in a different consumption behavior context (e.g., information search vs. purchasing online). Including all theoretically sound dimensions prevents adding and removing items arbitrarily. Taking these considerations into account leads to a robust formative index.

7.2. Managerial implications

The proposed website performance index is a robust, parsimonious, and simple measurement instrument.
This index has an advantage over most reflective measures because items for reflective constructs are often chosen arbitrarily (DeVellis, 1991; Rossiter, 2002), decreasing accuracy and constraining comparability. The index allows organizations to evaluate their websites and assess the impact of website performance on value and satisfaction. For companies, knowledge about consumers' needs is essential to designing effective websites. Satisfied users become loyal customers, and a large, loyal customer base is particularly important for transaction sites or websites dependent on sponsors. Appropriate measures of customer needs are therefore essential: applying mis-specified measures leads to estimation problems that affect inference in hypothesis testing, and this mistake leads to poor management decisions (MacCallum & Browne, 1993). In the worst case, resources allocated to improving the website may not contribute to improved website performance. The website performance index is valuable to companies that: 1) are interested in evaluating their website based on customers' needs or requirements and are looking for an easy-to-use and parsimonious measurement instrument; 2) seek to assess their website performance continuously without taking up customers' time extensively; and 3) desire to establish a benchmark and compare their website performance over time and/or with competitors.

7.3. Limitations and future research

This study has limitations to be addressed in future research. First, the study needs replication across different contexts and with different samples. The conceptual framework needs testing with more transaction-oriented websites. The results should lead to higher coefficients for system availability and trust, since respondents in a transaction setting are more vulnerable to system failure (Dickinger, 2011; Gefen et al., 2000). Furthermore, additional content validation in different contexts is worthwhile.
Follow-up validation studies would demonstrate the generalizability of the index across different settings. Second, the model was developed and validated with data from one country. Could the same formative index reflect the view of a more general, worldwide audience? Replicating the study (i.e., the qualitative item identification study and the quantitative model testing studies) across different countries would provide insights into whether or not these dimensions are global. Finally, this work encourages further discussion on which paradigm is more appropriate depending on research questions and theoretical backgrounds. This debate encourages more index development and leads to the employment of more parsimonious measurement instruments for market research. An extensively validated and widely adopted website performance model facilitates further research and insights regarding online consumer behavior and preferences. The present study makes a significant step in this direction.

References

Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision Processes 1991;50:179–211.
Albers S, Hildebrandt L. Methodische Probleme bei der Erfolgsfaktorenforschung – Messfehler, formative versus reflektive Indikatoren und die Wahl des Strukturgleichungs-Modells. Zeitschrift für betriebswirtschaftliche Forschung 2006;58(1):2–33.
Bagozzi RP. Structural equation models in marketing research: basic principles. In: Bagozzi RP, editor. Principles of marketing research. Cambridge, MA: Blackwell; 1994.
Barnes S, Vidgen R. WebQual: an exploration of web site quality. Proceedings of the eighth European conference on information systems, Vienna; 2000.
Barnes S, Vidgen R.
An integrative approach to the assessment of e-commerce quality. Journal of Electronic Commerce Research 2002;3(3):114–27.
Bieger T, Laesser C. Information sources for travel decisions: toward a source process model. Journal of Travel Research 2004;24:357–71.
Bollen K. Structural equations with latent variables. New York: John Wiley & Sons; 1989.
Bollen K, Lennox R. Conventional wisdom on measurement: a structural equation perspective. Psychological Bulletin 1991;110(2):305–14.
Brady MK, Knight GA, Cronin JJ Jr, Hult GTM, Keillor BD. Removing the contextual lens: a multinational, multi-setting comparison of service evaluation models. Journal of Retailing 2005;81(3):215–30.
Bruhn M, Georgi D, Hadwich K. Customer equity management as formative second-order construct. Journal of Business Research 2008;61:1292–301.
Chung T, Law R. Developing a performance indicator for hotel websites. Hospitality Management 2003;22:119–25.
Churchill GA Jr. A paradigm for developing better measures of marketing constructs. Journal of Marketing Research 1979;16(1):64–73.
Collier JE, Bienstock CC. Measuring service quality in e-retailing. Journal of Service Research 2006;8(3):260–75.
Collier JE, Bienstock CC. Model misspecification: contrasting formative and reflective indicators for a model of e-service quality. Journal of Marketing Theory and Practice 2009;17(3):283–93.
Crompton JL. Motivations for pleasure vacation. Annals of Tourism Research 1979;6:408–22.
Cronin JJ Jr, Brady MK, Hult GTM. Assessing the effects of quality, value, and customer satisfaction on consumer behavioral intentions in service environments. Journal of Retailing 2000;76(2):193–218.
Cudeck R, Browne MW. Cross-validation of covariance structures. Multivariate Behavioral Research 1983;18(2):147.
Dabholkar PA, Bagozzi RP. An attitudinal model of technology-based self-service: moderating effects of consumer traits and situational factors. Journal of the Academy of Marketing Science 2002;30(3):184–201.
Davis FD.
Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly 1989;13(3):319–40.
Davis FD, Bagozzi RP, Warshaw PR. User acceptance of computer technology: a comparison of two theoretical models. Management Science 1989;35(8):982–1003.
De Marsico M, Levialdi S. Evaluating web sites: exploiting user's expectations. International Journal of Human Computer Studies 2004;60:381–416.
DeLone WH, McLean ER. Information systems success: the quest for the dependent variable. Information Systems Research 1992;3(1):60–95.
DeLone WH, McLean ER. The DeLone and McLean model of information systems success: a ten-year update. Journal of Management Information Systems 2003;19(4):9–30.
DeVellis RF. Scale development: theory and applications. Newbury Park, CA: Sage Publications; 1991.
Diamantopoulos A. Export performance measurement: reflective versus formative indicators. International Marketing Review 1999;16(6):444–57.
Diamantopoulos A. The error term in formative measurement models: interpretation and modeling implications. Journal of Modeling in Management 2006;1(1):7–17.
Diamantopoulos A, Siguaw JA. Formative versus reflective indicators in organizational measure development: a comparison and empirical illustration. British Journal of Management 2006;17:263–82.
Diamantopoulos A, Winklhofer HM. Index construction with formative indicators: an alternative to scale development. Journal of Marketing Research 2001;38(2):269–77.
Dickinger A. The trustworthiness of online channels for experience- and goal-directed search tasks. Journal of Travel Research 2011;50(4):378–91.
Dillon WR, Kumar A, Mulani N. Offending estimates in covariance structure analysis: comments on the causes of and solutions to Heywood cases. Psychological Bulletin 1987;101(1):126–35.
Edwards JR, Bagozzi RP. On the nature and direction of relationships between constructs and measures. Psychological Methods 2000;5(2):155–74.
Fishbein M, Ajzen I.
Belief, attitude, intention and behavior: an introduction to theory and research. Reading, MA: Addison-Wesley; 1975.
Fornell C, Larcker DF. Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 1981;18(1):39–50.
Fornell C, Bookstein FL. Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. Journal of Marketing Research 1982;19:440–52.
Gable GG, Sedera D, Chan T. Re-conceptualizing information system success: the IS-Impact measurement model. Journal of the Association for Information Systems 2008;9(7):377–408.
Gefen D, Straub DW, Boudreau M-C. Structural equation modeling and regression: guidelines for research practice. Communications of the Association for Information Systems 2000;4(7):1–78.
Gursoy D, McCleary KW. An integrative model of tourists' information search behavior. Annals of Tourism Research 2004;31(2):353–73.
Hauser RM, Goldberger AS. The treatment of unobservable variables in path analysis. In: Costner HL, editor. Sociological methodology. San Francisco: Jossey-Bass; 1971.
Hsu SH. Developing an index for online customer satisfaction: adaptation of American customer satisfaction index. Expert Systems with Applications 2008;34(4):3033–42.
Hu L-T, Bentler PM. Evaluating model fit. In: Hoyle RH, editor. Structural equation modeling: concepts, issues and applications. Thousand Oaks: Sage Publications; 1995.
Jarvis CB, MacKenzie SB, Podsakoff PM. A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research 2003;30(2):199–218.
Jöreskog KG. How large can a standardized coefficient be? 1999. From: http://www.ssicentral.com/lisrel/techdocs/HowLargeCanaStandardizedCoefficientbe.pdf
Jöreskog KG, Goldberger AS. Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association 1975;70:631–9.
Kim H, Fesenmaier DR. Persuasive design of destination web sites: an analysis of first impression. Journal of Travel Research 2008;47:3–13.
Kleinbaum DG, Kupper LL, Muller KE. Applied regression analysis and other multivariate methods. Boston: PWS-Kent; 1988.
Kline RB. Principles and practice of structural equation modeling. New York: Guilford Press; 2004.
Koufaris M. Applying the technology acceptance model and flow theory to online consumer behavior. Information Systems Research 2002;13(2):205–23.
Koufaris M, Hampton-Sosa W. Customer trust online: examining the role of the experience with the web site. 2002. From: http://cisnet.baruch.cuny.edu/papers/cis200205.pdf
Lu J. Development, distribution and evaluation of online tourism services in China. Electronic Commerce Research 2004;4:221–39.
MacCallum RC, Browne MW. The use of causal indicators in covariance structure models: some practical issues. Psychological Bulletin 1993;114:533–41.
MacKenzie SB, Podsakoff PM, Jarvis CB. The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology 2005;90(4):710–30.
Marcussen C. Trends in European internet distribution of travel and tourism services. 2009. Retrieved 25 March 2010, from http://www.crt.dk/UK/staff/chm/trends.htm
Maruyama GM. Basics of structural equation modeling. Thousand Oaks: Sage; 1997.
Mathieson K, Peacock E, Chin WW. Extending the technology acceptance model: the influence of perceived user resources. Database for Advances in Information Systems 2001;32(3):86–112.
Mathwick C, Rigdon E. Play, flow, and the online search experience. Journal of Consumer Research 2004;31:324–32.
Moon J-W, Kim Y-G. Extending the TAM for a world-wide-web context. Information & Management 2001;38:217–30.
Muthén L, Muthén B. Mplus user's guide: statistical analysis with latent variables. Los Angeles: Muthén & Muthén; 2007.
Norman DA.
The design of everyday things. New York: Basic Books; 2002.
Novak TP, Hoffman DL, Yung Y-F. Measuring the customer experience in online environments: a structural modeling approach. Marketing Science 2000;19(1):22–44.
Oliver RL. Satisfaction: a behavioral perspective on the consumer. New York: McGraw-Hill; 1997.
Parasuraman A, Zeithaml VA, Malhotra A. E-S-QUAL: a multiple-item scale for assessing electronic service quality. Journal of Service Research 2005;7(3):213–33.
Park YA, Gretzel U. Success factors for destination marketing web sites: a qualitative meta-analysis. Journal of Travel Research 2007;46(1):46–63.
Petter S, Straub D, Rai A. Specifying formative constructs in information systems research. MIS Quarterly 2007;31(4):623–56.
Rice M. What makes users revisit a web site? Marketing News 1997;31(6):12.
Rossi G, Schwabe D, Lyardet F. Improving web information systems with navigational patterns. Computer Networks 1999;31(11–16):1667–78.
Rossiter JR. The C-OAR-SE procedure for scale development in marketing. International Journal of Research in Marketing 2002;19(4):305–35.
Rossiter JR. Reminder: a horse is a horse. International Journal of Research in Marketing 2005;22(1):23–5.
Scott J. The measurement of information systems effectiveness: evaluating a measuring instrument. Fifteenth international conference on information systems (ICIS), Vancouver, British Columbia; 1994.
Serva MA, Benamati JS, Fuller MA. Trustworthiness in B2C e-commerce: an examination of alternative models. The DATA BASE for Advances in Information Systems 2005;36(3):89–108.
Shneiderman B. Designing your next generation foundation website. Foundation News & Commentary, November/December 2003. p. 34–41.
Spector PE. Summated rating scale construction. Newbury Park, CA: Sage Publications; 1992.
Strauss A, Corbin J. Basics of qualitative research: techniques and procedures for developing grounded theory. Newbury Park, CA: Sage Publications; 1998.
Van der Heijden H.
Factors influencing the usage of websites: the case of a generic portal in The Netherlands. Information & Management 2003;40:541–9.
Venkatesh V. Determinants of perceived ease of use: integrating control, intrinsic motivation, and emotion into the technology acceptance model. Information Systems Research 2000;11(4):342–65.
Venkatesh V, Morris MG, Davis GB, Davis FD. User acceptance of information technology: toward a unified view. MIS Quarterly 2003;27(3):425–78.
Vogt CA, Fesenmaier DR. Expanding the functional information search model. Annals of Tourism Research 1998;25(3):551–78.
Wolfinbarger M, Gilly MC. Shopping online for freedom, control, and fun. California Management Review 2001;43(2):34–55.
Zeithaml VA, Berry LL, Parasuraman A. The behavioral consequences of service quality. Journal of Marketing 1996;60(2):31–46.
Zeithaml VA, Parasuraman A, Malhotra A. Service quality delivery through web sites: a critical review of extant knowledge. Journal of the Academy of Marketing Science 2002;30(4):362–76.