
Measuring Information Systems Service Quality: Concerns on the Use of the SERVQUAL Questionnaire

MIS Quarterly, 1997
Measuring Information Systems Service Quality: Concerns on the Use of the SERVQUAL Questionnaire[1]

By: Thomas P. Van Dyke
Department of Information Systems and Technologies
College of Business and Economics
Weber State University
3804 University Circle
Ogden, Utah 84408-3804 U.S.A.
tvandyke@weber.edu

Leon A. Kappelman
Business Computer Information Systems Department
College of Business Administration
University of North Texas
Denton, Texas 76203-3677 U.S.A.
kapp@unt.edu

Victor R. Prybutok
Business Computer Information Systems Department
College of Business Administration
University of North Texas
Denton, Texas 76203-3677 U.S.A.

[1] Robert Zmud was the accepting senior editor for this paper.

Abstract

A recent MIS Quarterly article rightfully points out that service is an important part of the role of the information systems (IS) department and that most IS assessment measures have a product orientation (Pitt et al. 1995). The article went on to suggest the use of an IS-context-modified version of the SERVQUAL instrument to assess the quality of the services supplied by an information services provider (Parasuraman et al. 1985, 1988, 1991).[2] However, a number of problems with the SERVQUAL instrument have been discussed in the literature (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993). This article reviews that literature and discusses some of the implications for measuring service quality in the information systems context. Findings indicate that SERVQUAL suffers from a number of conceptual and empirical difficulties. Conceptual difficulties include the operationalization of perceived service quality as a difference or gap score, the ambiguity of the expectations construct, and the unsuitability of using a single measure of service quality across different industries. Empirical problems, which may be linked to the use of difference scores, include reduced reliability, poor convergent validity, and poor predictive validity. This suggests that (1) some alternative to difference scores is preferable and should be utilized; (2) if used, caution should be exercised in the interpretation of IS-SERVQUAL difference scores; and (3) further work is needed in the development of measures for assessing the quality of IS services.

Keywords: IS management, evaluation, measurement, service quality, user attitudes, user expectations

ISRL Categories: AI0104, AI0109, AI04, EL0206.03, GB02, GB07

[2] The terms "information services provider" or "IS services provider" are used to refer to any provider of information systems services. This includes the information systems function within an organization as well as external vendors of information systems products and/or services.

Research Note

Introduction

Due to the growth of outsourcing, end-user-controlled information assets, joint ventures, and other alternative mechanisms by which organizations are meeting their need for information systems services, IS managers are increasingly concerned about improving the perceived (as well as actual) service quality of the IS function (Kettinger and Lee 1994). In recent years, the use of SERVQUAL-based instruments has become increasingly popular with information systems (IS) researchers. However, a review of the literature suggests that the use of such instruments may result in a number of measurement problems.
A recent article makes several important contributions to the assessment and evaluation of the effectiveness of information systems (IS) departments in organizations (Pitt et al. 1995). The article:

1. Points out that, although service is an important part of the role of the IS department, most IS assessment measures have a product orientation.

2. Proposes an extension of the categorization of IS success measures (DeLone and McLean 1992) to include service quality.

3. Proposes the use of the SERVQUAL instrument from marketing (Parasuraman et al. 1985, 1988, 1991) to operationalize the IS service quality construct and modifies the wording of the instrument to better accommodate its use in an IS context.

4. Adapts and augments a theory regarding the determinants of service quality expectations to an IS context and offers ideas for future research.

A number of studies, however, identify potential difficulties with the SERVQUAL instrument (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993, 1994). This research note reviews some of the literature regarding difficulties with the SERVQUAL instrument in general and examines the implications of these difficulties for the use of the instrument in an IS context.

The SERVQUAL Instrument: Problems Identified in the Literature

The difficulties with the SERVQUAL instrument identified in the literature can be grouped into two main categories: (1) conceptual and (2) empirical, although the boundary between them blurs because they are closely inter-related. The conceptual problems center around (1) the use of two separate instruments, one for each of two constructs (i.e., perceptions and expectations), to operationalize a third conceptually distinct construct (i.e., perceived service quality) that is itself the result of a complex psychological process; (2) the ambiguity of the expectations construct; and (3) the suitability of using a single instrument to measure service quality across different industries (i.e., content validity). The empirical problems are, by and large, the result of these conceptual difficulties, most notably the use of difference scores, in conjunction with the atheoretical nature of the process used in the construction of the original five dimensions of service quality. The empirical difficulties most often attributed to the SERVQUAL instrument include low reliability, unstable dimensionality, and poor convergent validity. A review of these conceptual and empirical problems should serve to caution those who wish to use SERVQUAL to measure the service quality of an information systems provider.

Conceptual Difficulties with SERVQUAL

Subtraction as a "Simulation" of a Psychological Process

Many of the difficulties associated with the SERVQUAL instrument stem from the operationalization of a service quality construct that is theoretically grounded in a discrepancy or gap model. In conceptualizing service quality, Parasuraman et al. (1985, 1988, 1991, 1994b) use the "service quality model," which posits that one's perception of service quality is the result of an evaluation process whereby "the customer compares . . . the perceived service against the expected service" (Gronroos 1984, p. 37).

Rather than develop an instrument to directly measure the perception of service quality that is the outcome of this cognitive evaluation process, the SERVQUAL instrument (Parasuraman et al. 1988, 1991) separately measures the expected level of service and the experienced level of service. Service quality scores are then calculated as the difference between these two measures. These three sets of scores are commonly referred to as expectation (E), perception (P), and SERVQUAL (whereby SERVQUAL = P - E). Although not without precedent, and certainly worthy of fair empirical evaluation, the implicit assumption that subtraction accurately portrays this cognitive process seems overly simplistic. Even if one fully accepts this discrepancy model of experiences vis-a-vis expectations as indicative of the general process whereby one arrives at an evaluation of a service experience, the notion that the specific mechanism is merely subtraction does not logically follow. The use of differences was, and remains, an operational decision. Regrettably, it does not appear to have been a particularly good one. The direct measurement of one's perception of service quality that is the outcome of this cognitive evaluation process seems more likely to yield a valid and reliable outcome. If the discrepancy is what one wants to measure, then one should measure it directly.
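To make the gap-scoring procedure concrete, the following minimal sketch (ours, not part of the original instrument; the data and item counts are hypothetical) computes per-item and overall SERVQUAL difference scores from paired expectation and perception responses on a seven-point scale.

```python
import numpy as np

# Illustrative sketch of SERVQUAL (P - E) gap scoring; the responses
# below are simulated and stand in for 22 paired Likert items.
rng = np.random.default_rng(0)
n_respondents, n_items = 100, 22

E = rng.integers(4, 8, size=(n_respondents, n_items)).astype(float)  # expectations (1..7)
P = rng.integers(1, 8, size=(n_respondents, n_items)).astype(float)  # perceptions (1..7)

gap = P - E                      # per-item difference ("gap") scores
item_gap = gap.mean(axis=0)      # mean gap per item across respondents
overall_gap = gap.mean()         # unweighted overall gap score

print("mean per-item gap (first five items):", np.round(item_gap[:5], 2))
print("overall SERVQUAL (P - E) score:", round(overall_gap, 2))
```

Note that the same perception data scored directly (P alone) need not rank providers in the same order as these gap scores, which is precisely the concern developed in the sections that follow.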
Ambiguity of the "Expectations" Construct

Teas (1994) notes that SERVQUAL expectations are variously defined as desires, wants, what a service provider should possess, normative expectations, ideal standards, desired service, and the level of service a customer hopes to receive (e.g., Parasuraman et al. 1985, 1988, 1991; Zeithaml et al. 1993). These multiple definitions and corresponding operationalizations of "expectations" in the SERVQUAL literature result in a concept that is loosely defined and open to multiple interpretations (Teas 1994). Yet even when concise definitions are provided, various interpretations of the expectations construct can result in potentially serious measurement validity problems.

Teas (1993) found three different interpretations of "expectations" derived from an analysis of follow-up questions to an administration of the SERVQUAL questionnaire. One interpretation of expectations is as a forecast or prediction. The forecast interpretation of expectations cannot be discriminated from the disconfirmed expectations model of consumer satisfaction (Oliver 1980). This interpretation is inconsistent with the definition of service quality put forth by Parasuraman et al. (1988) and results in a discriminant validity problem with respect to consumer satisfaction. A second interpretation of expectations is as a measure of attribute importance. When respondents use this interpretation, the resulting perception-minus-expectation scores exhibit an inverse relationship between attribute importance and perceived service quality, all other things being equal. The third interpretation identified by Teas (1993) is the "classic ideal point" concept.
Parasuraman et al. (1994) describe this when they note that "the P-E [i.e., perceptions-minus-expectations] specification could be problematic when a service attribute is a classic ideal point attribute--that is one on which a customer's ideal point is at a finite level and therefore, performance beyond which will displease the customer (e.g., friendliness of a salesperson in a retail store)" (p. 116). This interpretation of expectations results in an inverse relationship between the SERVQUAL score, calculated as perception minus expectation (P - E), and actual service quality whenever perception scores are greater than expectation scores (i.e., P > E). This interpretation is consistent with the finding that user satisfaction scores were highest when actual user participation was in congruence with the user's need for participation, rather than merely maximized (Doll and Torkzadeh 1989).

The conceptualization of "expectations" consistent with the SERVQUAL model is the vector attribute interpretation--"that is one on which a customer's ideal point is at an infinite level" (Parasuraman et al. 1994, p. 116). Unfortunately, as the proportion of extreme responses (e.g., seven on a seven-point scale) increases, the expectation scores become less useful, since an increasing proportion of the variation in difference-based SERVQUAL scores is due only to changes in perception scores.

These various interpretations of the "expectation" construct lead to a number of measurement problems. The findings suggest that a considerable portion of the variance in the SERVQUAL instrument is the result of measurement error induced by respondents' varying interpretations of the "expectations" construct (Teas 1993).

Three separate types of expectations have been described (Boulding et al. 1993): (1) the will expectation, what the customer believes will happen in the next service encounter; (2) the should expectation, what the customer believes should happen in the next service encounter; and (3) the ideal expectation, what a customer wants in an ideal sense. The ideal interpretation of expectation is often used in the SERVQUAL literature (Boulding et al. 1993). Boulding et al. (1993) differentiate between should and ideal expectations by stating that what customers think should happen may change as a result of what they have been told to expect by the service provider, as well as what the consumer views as reasonable and feasible based on what they have been told and their experience with the firm or a competitor's service. In contrast, an ideal expectation may "be unrelated to what is reasonable/feasible and/or what the service provider tells the customer to expect" (Boulding et al. 1993, p. 9).

A series of experiments demonstrated results that were incompatible with the gap model of service quality (Boulding et al. 1993). Instead, the results demonstrate that service quality is influenced only by perceptions. Moreover, the results indicate that perceptions are influenced by both will and should expectations, but in opposite directions. Increasing will expectations leads to a higher perception of service quality, whereas an increasing expectation of what should be delivered during a service encounter will actually decrease the ultimate perception of the quality of the service provided (Boulding et al. 1993). Not only do these findings fail to support the gap model of service quality, but these results also demonstrate the wildly varying impact of different interpretations of the expectations construct.

Different methods have been used to operationalize "expectations" in developing IS versions of SERVQUAL (Pitt et al. 1995; Kettinger and Lee 1994). One study used the instructions to the survey to urge respondents to "think about the kind of IS unit that would deliver excellent quality of service" (Pitt et al. 1995). The items then take a form such as:

E1. They will have up-to-date hardware and software.
Whereas the second study (Kettinger and Lee 1994) used the form:

E1. Excellent college computing services will have up-to-date equipment.

Recall that some respondents to SERVQUAL were found to interpret expectations as forecasts or predictions (Teas 1993). This interpretation corresponds closely with the will expectation (Boulding et al. 1993). It is easy to see how this interpretation might be formed, especially with the "They will" phrasing (Pitt et al. 1995). Unfortunately, the impact of the will expectation on perceptions of service quality is opposite from that intended by the SERVQUAL authors and the (P - E) or gap model of service quality (Boulding et al. 1993).

In summary, a review of the literature indicates that respondents to SERVQUAL may have numerous interpretations of the expectations construct and that these various interpretations have different and even opposite impacts on perceptions of service quality. Moreover, some of the findings demonstrate that expectations influence only perceptions and that perceptions alone directly influence overall service quality (Boulding et al. 1993). These findings fail to support the (P - E) gap model of service quality and indicate that the use of the expectations construct as operationalized by SERVQUAL-based instruments is problematic.

Applicability of SERVQUAL Across Industries

Another often-mentioned conceptual problem with SERVQUAL concerns the applicability of a single instrument for measuring service quality across different industries. Several researchers have articulated concerns on this issue. A study of SERVQUAL across four different industries found it necessary to add as many as 13 additional items to the instrument in order to adequately capture the service quality construct in various settings, while at the same time dropping as many as 14 items from the original instrument based on the results of factor analysis (Carman 1990). The conclusion was that considerable customization is required to accommodate differences in service settings. Another study attempted to utilize SERVQUAL in the banking industry (Brown et al. 1993). The authors were struck by the omission of items which they thought a priori would be critical to subjects' evaluations of service quality. They concluded that it takes more than simple adaptation of the SERVQUAL items to effectively address service quality across diverse settings. A study of service quality in the retail sector also concluded that utilizing a single measure of service quality across industries is not feasible (Dabholkar et al. 1996).

Researchers of service quality in the information systems context appear to lack consensus on this issue. Pitt et al. (1995) state that they could not discern any unique features of IS that make the standard SERVQUAL dimensions inappropriate, nor could they discern any dimensions of service quality meaningful in the IS domain that had been excluded from SERVQUAL. Kettinger and Lee (1994), however, found that SERVQUAL should be used as a supplement to the UIS (Baroudi and Orlikowski 1988) because that instrument also contains items that are important determinants of IS service quality. Their findings suggest that neither the UIS nor SERVQUAL alone can capture all of the factors which contribute to perceived service quality in the IS domain.
For example, items contained in the UIS include the degree of training provided to users by the IS staff, the level of communication between the users and the IS staff, and the time required for new systems development and implementation, all of which possess strong face validity as determinants of IS service quality. In addition, Kettinger and Lee dropped the entire tangibles dimension from their IS version of SERVQUAL based on the results of confirmatory factor analysis. These findings contradict the belief that all dimensions of SERVQUAL are relevant and that there are no unique features of the IS domain not included in the standard SERVQUAL instrument (Pitt et al. 1995). It is difficult to argue that items concerning the manner of dress of IS employees and the visual attractiveness of IS facilities (i.e., tangibles) should be retained as important factors in the IS domain while issues such as training, communication, and time to complete new systems development are excluded. We agree that using a single measure of service quality across industries is not feasible (Dabholkar et al. 1996) and therefore future research should involve the development of industry-specific measures of service quality.

Empirical Difficulties with the SERVQUAL Instrument

A difference score is created by subtracting the measure of one construct from the measure of another in an attempt to create a measure of a third, distinct construct. For example, in scoring the SERVQUAL instrument, an expectation score is subtracted from a perception score to create such a "gap" measure of service quality. Even if one assumes that the discrepancy theory is correct and that these are the only (or at least the last) two inputs into this cognitive process, the question remains: can calculated difference scores operationalize the outcome of a cognitive discrepancy? It appears that several problems with the use of difference scores make them a poor measure of psychological constructs (e.g., Edwards 1995; Johns 1981; Lord 1958; Peter et al. 1993; Wall and Payne 1973). Among the difficulties related to the use of difference measures discussed in the literature are low reliability, unstable dimensionality, and poor predictive and convergent validities.

Reliability Problems with Difference Scores

Many studies demonstrate that Cronbach's (1951) alpha, a widely used method of estimating instrument reliability, is inappropriate for difference scores (e.g., Cronbach and Furby 1970; Edwards 1995; Johns 1981; Lord 1958; Peter et al. 1993; Prakash and Lounsbury 1983; Wall and Payne 1973). This is because the reliability of a difference score depends on the reliabilities of the component scores and the correlation between them. The correct formula for calculating the reliability of a difference score, $r_D$, is

$$ r_D = \frac{\sigma_1^2 r_{11} + \sigma_2^2 r_{22} - 2 r_{12}\,\sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2 r_{12}\,\sigma_1 \sigma_2} $$

where $r_{11}$ and $r_{22}$ are the reliabilities of the two component scores, $\sigma_1^2$ and $\sigma_2^2$ are the variances of the component scores, and $r_{12}$ is the correlation between the component scores (Johns 1981). This formula shows that as the correlation between the component scores increases, the reliability of the difference score decreases. An example was provided where the reliability of the difference score formed from two components with an average reliability of .70 and a correlation of .40 is only .50 (Johns 1981). Thus, while the average reliability of the two components is .70, which is considered acceptable (Pitt et al. 1995; cf. Nunnally 1978), the correlation between the components reduces the reliability of the difference score to a level that most researchers would consider unacceptable (Peter et al. 1993).
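As a quick check of this formula, the short sketch below (our illustration; the function name is ours) computes $r_D$ for the equal-variance case and reproduces the Johns (1981) example of .70 component reliabilities with a .40 inter-component correlation.

```python
def difference_score_reliability(r11, r22, var1, var2, r12):
    """Reliability of a difference score D = X1 - X2 (Johns 1981 formula)."""
    sd1, sd2 = var1 ** 0.5, var2 ** 0.5
    num = var1 * r11 + var2 * r22 - 2 * r12 * sd1 * sd2
    den = var1 + var2 - 2 * r12 * sd1 * sd2
    return num / den

# Johns' (1981) example: equal variances, component reliabilities .70,
# correlation .40 between components -> difference-score reliability .50.
print(difference_score_reliability(0.70, 0.70, 1.0, 1.0, 0.40))  # 0.5

# As the correlation between perceptions and expectations rises,
# the reliability of the P - E score falls even with reliable components.
for r12 in (0.0, 0.2, 0.4, 0.6):
    print(r12, round(difference_score_reliability(0.80, 0.80, 1.0, 1.0, r12), 2))
```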
An example of the overestimation of reliability caused by the misuse of Cronbach's alpha can be found in the analysis of service quality for a computer manufacturer (Parasuraman et al. 1994a; see Table 1). Note that Cronbach's alpha computed on the difference scores consistently overestimates the actual reliability of the difference scores for each dimension (compare the last two columns of Table 1). Also note that applying the correct formula for the reliability of a difference score shows that the actual reliabilities of the SERVQUAL dimensions may be as much as .10 lower than reported by researchers incorrectly using Cronbach's alpha. In addition, these findings show that the non-difference, direct response method results in consistently higher reliability scores than the (P - E) difference method of scoring.

Table 1. Reliability of SERVQUAL: The Misuse of Cronbach's Alpha

A Priori Dimensions | Cronbach's alpha (Non-Difference) | Cronbach's alpha (Difference) | Johns' alpha for Differences (Difference)
Tangibles | .83 | .75 | .65
Reliability | .91 | .87 | .83
Responsiveness | .87 | .84 | .81
Assurance | .86 | .81 | .71
Empathy | .90 | .85 | .81

Note: Difference scores calculated as perception minus expectation (P - E).

These results have important implications for the IS-SERVQUAL instrument (Pitt et al. 1995), in which Cronbach's alpha, which consistently overestimates the reliability of difference scores, was used incorrectly. Even when using the inflated alpha scores, Pitt et al. note that two of three reliability measures for the tangibles dimension fall below the 0.70 level required for commercial applications. Had they utilized the appropriate modified alpha, they might have concluded that the tangibles dimension is not reliable in the IS context, a finding which would have been consistent with the results of Kettinger and Lee (1994). A review of the literature clearly indicates that by utilizing Cronbach's alpha, researchers tend to overestimate the reliabilities of difference scores, especially when the component scores are highly correlated; such is the case with the SERVQUAL instrument (Peter et al. 1993).

Predictive and Convergent Validity Issues with Difference Scores

Another problem with the SERVQUAL instrument concerns the poor predictive and convergent validities of the measure. Convergent validity is concerned with the extent to which multiple measures of the same construct agree with each other (Campbell and Fiske 1959). Predictive validity refers to the extent to which scores of one construct are empirically related to scores of other conceptually related constructs (Bagozzi et al. 1992; Kappelman 1995; Parasuraman et al. 1991). One study reported that perceptions-only SERVQUAL scores had higher correlations with an overall service quality measure (i.e., the convergent measure) and with complaint resolution scores (i.e., the predictive measure) than did the perception-minus-expectation difference scores used with SERVQUAL (Babakus and Boller 1992). A different study performed regression analyses in which an overall single-question service quality rating was regressed separately on both difference scores (i.e., perception minus expectation) and perception-only scores (Parasuraman et al. 1991). The perception-only SERVQUAL scores produced higher adjusted R-squared values (ranging from .72 to .81) than the SERVQUAL difference scores (ranging from .51 to .71).
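The kind of comparison reported above can be illustrated with a small sketch (ours; the data are simulated, not drawn from any of the cited studies, and are generated so that perceptions drive the overall rating). An overall service quality rating is regressed separately on perception-only scores and on P - E difference scores, and the adjusted R-squared values are compared.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Simulated, illustrative data: perceptions drive the overall rating,
# while expectations contribute only noise to the gap score.
P = rng.normal(5.0, 1.0, n)                 # mean perception score per respondent
E = rng.normal(6.0, 0.7, n)                 # mean expectation score per respondent
overall = 0.9 * P + rng.normal(0, 0.5, n)   # single-item overall quality rating

def adjusted_r2(x, y):
    """Adjusted R-squared from a simple linear regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - resid.var() / y.var()
    k = 1  # one predictor
    return 1 - (1 - r2) * (len(y) - 1) / (len(y) - k - 1)

print("perception-only predictor :", round(adjusted_r2(P, overall), 2))
print("P - E gap-score predictor :", round(adjusted_r2(P - E, overall), 2))
```

Under these stated assumptions the gap score predicts the overall rating less well than the perception score alone, mirroring the pattern reported in the studies discussed here.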
The predictive validity of difference scores, a non-difference direct response score, and perceptions-only scores for SERVQUAL in the context of a financial institution has been compared (Brown et al. 1993). Correlation analysis was performed between the various scores and a three-item behavioral intentions scale. Behavioral intentions include such concepts as whether the customer would recommend the financial institution to a friend or whether they would consider the financial institution first when seeking new services. The results of the study show that both the perceptions-only (.31) and direct response (.32) formats demonstrated higher correlations with the behavioral intentions scale than did the traditional difference score (.26). The superior predictive and convergent validity of perception-only scores was confirmed by Cronin and Taylor (1992), whose results indicated higher adjusted R-squared values for perception-only scores across four different industries. The perception component of the perception-minus-expectation score consistently performs better as a predictor of overall service quality than the difference score itself (Babakus and Boller 1992; Boulding et al. 1993; Cronin and Taylor 1992; Parasuraman et al. 1991).

Unstable Dimensionality of the SERVQUAL Instrument

The unstable nature of the factor structure of the SERVQUAL instrument may be related to the atheoretical process by which the original dimensions were defined. The SERVQUAL questionnaire is based on a multi-dimensional model (i.e., theory) of service quality. A 10-dimensional model of service quality was developed based on a review of the service quality literature and the extensive use of both executive and focus group interviews (Parasuraman et al. 1985).
During instrument development, Parasuraman et al. (1988) began with 97 paired questions (i.e., one for expectation and one for perception). Items (i.e., question pairs) were first dropped on the basis of within-dimension Cronbach coefficient alphas, reducing the pool to 54 question pairs. More items were then dropped or reassigned based on oblique-rotation factor loadings and within-dimension Cronbach coefficient alphas, resulting in a 34 paired-item instrument with a proposed seven-dimensional structure. A second data collection and analysis with this "revised" definition and operationalization of service quality resulted in the 22 paired-item SERVQUAL instrument with a proposed five-dimensional structure. Two of these five dimensions contained items representing seven of the original 10 dimensions. We are cautioned, however, that those who wish to interpret factors as real dimensions shoulder a substantial burden of proof (Cronbach and Meehl 1955). Moreover, such proof must rely on more than just empirical evidence (e.g., Bynner 1988; Galletta and Lederer 1989). The results of several studies have demonstrated that the five dimensions claimed for the SERVQUAL instrument are unstable (see Table 2).

Table 2. Unstable Dimensionality of SERVQUAL

Study | Instrument | Analysis | Factor Structure
Carman (1990) | Four modified SERVQUALs using 12-21 of the original items | Principal axis factor analysis with oblique rotation | Five to nine factors
Brensinger and Lambert (1990) | Original 22 items | Principal axis factor analysis with oblique rotation | Four factors with eigenvalues > 1
Parasuraman, Zeithaml, and Berry (1991) | Original 22 items | Principal axis factor analysis with oblique rotation | Five factors, but different from a priori model; tangibles dimension split into two factors, while responsiveness and assurance dimensions loaded on a single factor
Finn and Lamb (1991) | Original 22 items | LISREL confirmatory factor analysis | Five-factor model had poor fit
Babakus and Boller (1992) | Original 22 items | (1) Principal axis factor analysis with oblique rotation; (2) confirmatory factor analysis | (1) Five-factor model not supported; (2) two factors
Cronin and Taylor (1992) | Original 22 items | Principal axis factor analysis with oblique rotation | Unidimensional structure
*Van Dyke and Popelka (1993) | 19 of original 22 items | Principal axis factor analysis with oblique rotation | Unidimensional structure
*Pitt, Watson, and Kavan (1995) | Original 22 items | Principal components and maximum likelihood with varimax rotation | (1) Financial institution: seven-factor model with tangibles and empathy split into two; (2) consulting firm: five factors, none matching the original; (3) information systems service firm: three-factor model
*Kettinger and Lee (1994) | Original 22 items | LISREL confirmatory factor analysis | Four-factor model, tangibles dimension dropped
*Kettinger, Lee, and Lee (1995) | Original 22 items | Principal axis factor analysis with oblique rotation | (1) Korea: three-factor model, tangibles retained; (2) Hong Kong: four-factor model, tangibles retained

*Measured information systems service quality.

SERVQUAL studies in the information systems domain have also demonstrated the unstable dimensionality of the SERVQUAL instrument. The service quality of IS services was measured in three different industries: a financial institution, a consulting firm, and an information systems service business (Pitt et al. 1995). Factor analysis was conducted using principal components and maximum likelihood methods with varimax rotation for a range of models. The analysis indicated differing factor structures for each type of firm. Analysis of the results for the financial institution indicated a seven-factor model with both the tangibles and empathy dimensions split into two. These breakdowns should not be surprising. Pitt et al. note that "up-to-date hardware and software" are quite distinct from physical appearances in the IS domain. The empathy dimension was created by the original SERVQUAL authors from two distinctly different constructs, namely understanding and access, which were combined due to the factor loadings alone, without regard to underlying theory. Not only did IS-SERVQUAL not match the proposed model, its factor structure varied across settings. Analysis of the data from the consulting firm resulted in a five-factor model, although none of these factors matched the original a priori factors. The factor analysis of the information systems business data resulted in the extraction of only three factors. LISREL confirmatory factor analysis was used on SERVQUAL data collected from users (i.e., students) of a college computing services department (Kettinger and Lee 1994). Analysis of these data resulted in a four-factor solution; the entire tangibles dimension was dropped. An IS version of SERVQUAL was used in a cross-national study (Kettinger et al. 1995). Results of exploratory common factor analysis with oblique rotation indicated a three-factor model for a Korean sample, while a four-factor model was extracted from a Hong Kong data set.
The tangibles dimension was retained in the analysis of both of the Asian samples.

The unstable dimensionality of SERVQUAL, demonstrated in many domains including information services, is not just a statistical curiosity. The scoring procedure for SERVQUAL calls for averaging the P - E gap scores within each dimension (Parasuraman et al. 1988). Thus a high expectation coupled with a low perception for one item would be canceled by a low expectation and high perception for another item within the same dimension. This scoring method is only appropriate if all of the items in that dimension are interchangeable. This type of analysis would be justified if SERVQUAL demonstrated a clear and consistent dimensional structure. However, given the unstable number and pattern of the factor structures, averaging groups of items to calculate separate scores for each dimension cannot be justified. Therefore, for scoring purposes, each item should be treated individually and not as part of some a priori dimension.

In summary, numerous problems with the original SERVQUAL instrument are described in the literature (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993, 1994). The evidence suggests that difference scores, like the SERVQUAL perception-minus-expectation calculation, tend to exhibit reduced reliability, poor discriminant validity, spurious correlations, and restricted variance problems (Edwards 1995; Peter et al. 1993). The fact that the perception component of the difference score exhibits better reliability, convergent validity, and predictive validity than the perception-minus-expectation difference score itself calls into question the empirical and practical usefulness of both the expectation scores and the difference scores (Babakus and Boller 1992; Cronin and Taylor 1992; Parasuraman et al. 1994). Moreover, inconsistent definitions and/or interpretations of the "expectation" construct lead to a number of problems. The findings of Teas (1993) suggest that a considerable portion of the variance in SERVQUAL scores is the result of measurement error induced by respondents' varying interpretations of the expectations construct. In addition, since expectations, as well as perceptions, are subject to revision based on experience, concerns are raised regarding the temporal reliability of SERVQUAL difference scores.

Furthermore, the dimensionality of the SERVQUAL instrument is problematic. It was reported that an analysis of IS-SERVQUAL difference scores resulted in either three, five, or seven factors depending on the industry (Pitt et al. 1995). A portion of the instability in the dimensionality of SERVQUAL can be traced to the development of the original instrument (i.e., Parasuraman et al. 1988). Given these problems, users of the SERVQUAL instrument should be cautioned to assess the dimensionality implicit in their specific data set in order to determine whether the hypothesized five-factor structure (Parasuraman et al. 1988, 1991) is supported in their particular domain.
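As one simple way to heed that caution, the sketch below (ours; it assumes a respondents-by-items response matrix has already been collected, and the data shown are hypothetical) inspects the eigenvalues of the inter-item correlation matrix, a common first look at how many factors a given data set will actually support before any a priori dimension scoring is applied.

```python
import numpy as np

def eigenvalues_of_item_correlations(responses):
    """responses: (n_respondents, n_items) array of item scores."""
    corr = np.corrcoef(responses, rowvar=False)   # item-by-item correlations
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

# Hypothetical example: 22 SERVQUAL-style items from 150 respondents.
rng = np.random.default_rng(2)
responses = rng.integers(1, 8, size=(150, 22)).astype(float)

eigs = eigenvalues_of_item_correlations(responses)
print("largest eigenvalues:", np.round(eigs[:8], 2))
print("factors with eigenvalue > 1 (Kaiser criterion):", int((eigs > 1).sum()))
# A scree plot, parallel analysis, or confirmatory factor analysis would
# normally follow before any five-factor structure is assumed.
```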
Moreover, if the item elimination and dimensional collapsing utilized in the development of SERVQUAL have resulted in a 22 paired-item instrument that in fact does not measure all of the theoretical dimensions of the service quality construct (i.e., content validity), then the use of linear sums of those items for purposes of measuring overall service quality is problematic as well (Galletta and Lederer 1989).

Many of the difficulties identified with the SERVQUAL instrument also apply to the IS-modified versions of the SERVQUAL instrument used by Pitt et al. (1995) and by Kettinger and Lee (1994). It appears that the IS-SERVQUAL instrument, utilizing difference scores, is neither a reliable nor a valid measure for operationalizing the service quality construct for an information systems services provider. The IS versions of the SERVQUAL instrument, much like the original instrument (Parasuraman et al. 1988), suffer from unstable dimensionality and are likely to exhibit relatively poor predictive and convergent validity, as well as reduced reliability, when compared to non-difference scoring methods. The existing literature provides impressive evidence that the use of perception-minus-expectation difference scores is problematic.

This critique of the perceptions-minus-expectations gap score should not be interpreted as a claim that expectations are not important or that they should not be measured. On the contrary, evidence indicates that both should and will expectations are precursors to perceptions but that perceptions alone directly influence overall perceived service quality (Boulding et al. 1993). Our criticism is not with the concept of expectations per se, but rather with the operationalization of service quality as a simple subtraction of an ambiguously defined expectations construct from the perceptions of the service actually delivered. IS professionals have been known to raise expectations to an unrealistically high level in order to gain user commitment to new systems and technologies. This can make it much more difficult to deliver systems and services that will be perceived as successful. According to the model developed by Boulding et al. (1993), perceived service quality can be increased either by improving actual performance or by managing expectations, specifically by reducing should expectations and/or increasing will expectations. These two different types of expectations are not differentiated by the traditional SERVQUAL gap scoring method. A better approach to understanding the impact of expectations on perceived service quality may be to measure will and should expectations separately and then compare them to a service quality measure that utilizes either a direct response or perceptions-only method of scoring.

Prescriptions for the Use of SERVQUAL

The numerous problems associated with the use of difference scores suggest the need for an alternative response format. One alternative is to use the perceptions-only method of scoring. A review of the literature (Babakus and Boller 1992; Boulding et al. 1993; Cronin and Taylor 1992; Parasuraman et al. 1991, 1994) indicates that perceptions-only scores are superior to perception-minus-expectation difference scores in terms of reliability, convergent validity, and predictive validity. In addition, the use of perceptions-only scores reduces by 50% the number of items that must be answered and measured (from 44 items to 22). Moreover, the findings of Boulding et al. (1993) suggest that expectations are a precursor to perceptions and that perceptions alone directly influence service quality.
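The two alternative formats discussed in this section can be sketched as follows (our illustration; the anchors are paraphrased from the text and the data are hypothetical): a perceptions-only score is simply the mean of the perception items, with no expectation battery administered at all, while a direct response item folds the comparison to expectations into a single question so no subtraction is ever performed by the researcher.

```python
import numpy as np

rng = np.random.default_rng(3)

# Perceptions-only scoring: average the 22 perception items per respondent.
P = rng.integers(1, 8, size=(100, 22)).astype(float)        # hypothetical data
perceptions_only_score = P.mean(axis=1)

# Direct response scoring: one answer per item on a scale such as
# 1 = "falls far short of expectations" ... 7 = "greatly exceeds expectations",
# so the respondent (not the analyst) makes the comparison to expectations.
direct = rng.integers(1, 8, size=(100, 22)).astype(float)   # hypothetical data
direct_response_score = direct.mean(axis=1)

print("mean perceptions-only score:", round(perceptions_only_score.mean(), 2))
print("mean direct-response score :", round(direct_response_score.mean(), 2))
```

Either format halves the response burden relative to the 44 paired SERVQUAL items and avoids the difference-score problems described above.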
A second alternative, suggested by Carman (1990) and Babakus and Boller (1992), is to revise the wording of the SERVQUAL items into a format combining both expectations and perceptions in a single question. Such an approach would maintain the theoretical value of expectations and perceptions in assessing service quality, as well as reduce the number of questions to be answered by 50%. This direct response format holds promise for overcoming the inherent problems with calculated difference scores. Items with this format could be presented with anchors such as "falls far short of expectations" and "greatly exceeds expectations." One study indicates that such direct measures possess higher reliability and improved convergent and predictive validity when compared to difference scores (Parasuraman et al. 1994a).

Conclusion

Recognizing that we cannot manage what we cannot measure, the increasingly competitive market for IS services has emphasized the need to develop valid and reliable measures of the service quality of information systems services providers, both internal and external to the organization. An important contribution to this effort was made with the suggestion of an IS-modified version of the SERVQUAL instrument (Pitt et al. 1995). However, earlier studies raised several important questions concerning the SERVQUAL instrument (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Peter et al. 1993; Teas 1993, 1994). A review of the literature suggests that the use of difference scores with the IS-SERVQUAL instrument results in neither a valid nor a reliable measure of perceived IS service quality. Those choosing to use any version of the IS-SERVQUAL instrument are cautioned. Scoring problems aside, the consistently unstable dimensionality of the SERVQUAL and IS-SERVQUAL instruments intimates that further research is needed to determine the dimensions underlying the construct of service quality. Given the importance of the service quality concept in IS theory and practice, the development of improved measures of service quality for an information systems services provider deserves further theoretical and empirical research.

References

Babakus, E., and Boller, G. W. "An Empirical Assessment of the SERVQUAL Scale," Journal of Business Research (24:3), 1992, pp. 253-268.

Bagozzi, R., Davis, F., and Warshaw, P. "Development and Test of a Theory of Technological Learning and Usage," Human Relations (45:7), 1992, pp. 659-686.

Baroudi, J., and Orlikowski, W. "A Short-Form Measure of User Information Satisfaction: A Psychometric Evaluation and Notes on Use," Journal of Management Information Systems (4:4), 1988, pp. 44-49.

Boulding, W., Kalra, A., Staelin, R., and Zeithaml, V. A. "A Dynamic Process Model of Service Quality: From Expectations to Behavioral Intentions," Journal of Marketing Research (30:1), 1993, pp. 7-27.

Brensinger, R. P., and Lambert, D. M. "Can the SERVQUAL Scale be Generalized to Business-to-Business Services?" in Knowledge Development in Marketing, AMA's Summer Educators Conference Proceedings, Boston, MA, 1990, p. 289.

Brown, T. J., Churchill, G. A., and Peter, J. P. "Improving the Measurement of Service Quality," Journal of Retailing (69:1), 1993, pp. 127-139.

Bynner, J. "Factor Analysis and the Construct Indicator Relationship," Human Relations (41:5), 1988, pp. 389-405.
Campbell, D. T., and Fiske, D. W. "Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix," Psychological Bulletin (56), 1959, pp. 81-105.

Carman, J. M. "Consumer Perceptions of Service Quality: An Assessment of the SERVQUAL Dimensions," Journal of Retailing (66:1), 1990, pp. 33-55.

Cronbach, L. J. "Coefficient Alpha and the Internal Structure of Tests," Psychometrika (16), 1951, pp. 297-334.

Cronbach, L. J., and Furby, L. "How We Should Measure 'Change'--Or Should We?" Psychological Bulletin (74), July 1970, pp. 68-80.

Cronbach, L. J., and Meehl, P. "Construct Validity in Psychological Tests," Psychological Bulletin (52), 1955, pp. 281-302.

Cronin, J. J., and Taylor, S. A. "Measuring Service Quality: A Reexamination and Extension," Journal of Marketing (56:3), 1992, pp. 55-68.

Cronin, J. J., and Taylor, S. A. "SERVPERF versus SERVQUAL: Reconciling Performance-Based and Perceptions-Minus-Expectations Measurements of Service Quality," Journal of Marketing (58:1), 1994, pp. 125-131.

Dabholkar, P. A., Thorpe, D. I., and Rentz, J. O. "A Measure of Service Quality for Retail Stores: Scale Development and Validation," Journal of the Academy of Marketing Science (24:1), 1996, pp. 3-16.

DeLone, W., and McLean, E. "Information Systems Success: The Quest for the Dependent Variable," Information Systems Research (3:1), March 1992, pp. 60-95.

Doll, W. J., and Torkzadeh, G. "A Discrepancy Model of End-User Computing Involvement," Management Science (35:10), 1989, pp. 1151-1171.

Edwards, J. R. "Alternatives to Difference Scores as Dependent Variables in the Study of Congruence in Organizational Research," Organizational Behavior and Human Decision Processes (64:3), 1995, pp. 307-324.

Finn, D. W., and Lamb, C. W. "An Evaluation of the SERVQUAL Scales in a Retailing Setting," Advances in Consumer Research (18), 1991, pp. 338-357.

Galletta, D. F., and Lederer, A. L. "Some Cautions on the Measurement of User Information Satisfaction," Decision Sciences (20:3), 1989, pp. 419-439.

Gronroos, C. "A Service Quality Model and its Marketing Implications," European Journal of Marketing (18:4), 1984, pp. 36-55.

Johns, G. "Difference Score Measures of Organizational Behavior Variables: A Critique," Organizational Behavior and Human Performance (27), 1981, pp. 443-463.

Kappelman, L. "Measuring User Involvement: A Diffusion of Innovation Approach," The DATA BASE for Advances in Information Systems (26:2-3), 1995, pp. 65-83.

Kettinger, W. J., and Lee, C. C. "Perceived Service Quality and User Satisfaction with the Information Services Function," Decision Sciences (25:5), 1994, pp. 737-766.

Kettinger, W., Lee, C., and Lee, S. "Global Measures of Information Service Quality: A Cross-National Study," Decision Sciences (26:5), 1995, pp. 569-585.
Lord, F. M. "The Utilization of Unreliable Difference Scores," Journal of Educational Psychology (49:3), 1958, pp. 150-152.

Nunnally, J. Psychometric Theory, 2nd ed., McGraw-Hill, New York, 1978.

Oliver, R. L. "A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions," Journal of Marketing Research (17:3), 1980, pp. 460-469.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "A Conceptual Model of Service Quality and its Implications for Future Research," Journal of Marketing (49:4), 1985, pp. 41-50.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality," Journal of Retailing (64:1), 1988, pp. 12-40.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Refinement and Reassessment of the SERVQUAL Scale," Journal of Retailing (67:4), 1991, pp. 420-450.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Alternative Scales for Measuring Service Quality: A Comparative Assessment Based on Psychometric and Diagnostic Criteria," Journal of Retailing (70:3), 1994a, pp. 201-229.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Reassessment of Expectations as a Comparison Standard in Measuring Service Quality: Implications for Further Research," Journal of Marketing (58:1), 1994b, pp. 111-124.

Peter, J. P., Churchill, G. A., and Brown, T. J. "Caution in the Use of Difference Scores in Consumer Research," Journal of Consumer Research (19:1), 1993, pp. 655-662.

Pitt, L. F., Watson, R. T., and Kavan, C. B. "Service Quality: A Measure of Information Systems Effectiveness," MIS Quarterly (19:2), June 1995, pp. 173-187.

Prakash, V., and Lounsbury, J. W. "A Reliability Problem in the Measurement of Disconfirmation of Expectations," in Advances in Consumer Research, Vol. 10, A. M. Tybout and R. P. Bagozzi (eds.), Association for Consumer Research, Ann Arbor, MI, 1983, pp. 244-249.

Teas, R. K. "Expectations, Performance Evaluation and Consumers' Perceptions of Quality," Journal of Marketing (57:4), 1993, pp. 18-34.

Teas, R. K. "Expectations as a Comparison Standard in Measuring Service Quality: An Assessment of a Reassessment," Journal of Marketing (58:1), 1994, pp. 132-139.

Van Dyke, T. P., and Popelka, M. E. "Development of a Quality Measure for an Information Systems Provider," in Proceedings of the Decision Sciences Institute (3), 1993, pp. 1910-1912.

Wall, T. D., and Payne, R. "Are Deficiency Scores Deficient?" Journal of Applied Psychology (56:3), 1973, pp. 322-326.

Zeithaml, V. A., Berry, L., and Parasuraman, A. "The Nature and Determinants of Customer Expectations of Service," Journal of the Academy of Marketing Science (21:1), 1993, pp. 1-12.

About the Authors

Thomas P. Van Dyke is an assistant professor of information systems and technologies at Weber State University and recently completed his doctoral dissertation in business computer information systems at the University of North Texas. His current research interests include the effects of alternative presentation formats on biases and heuristics in human decision making, and MIS evaluation and assessment. He has published articles in The Journal of Computer Information Systems and the Proceedings of the Decision Sciences Institute.

Leon A. Kappelman is an associate professor of business computer information systems in the College of Business Administration at the University of North Texas and associate director of the Center for Quality and Productivity. After a successful career in industry, he received his Ph.D. (1990) in management information systems from Georgia State University. He has published over two dozen journal articles. His work has appeared in Communications of the ACM, Journal of Management Information Systems, DATA BASE for Advances in Information Systems, Journal of Systems Management, Journal of Computer Information Systems, InformationWeek, National Productivity Review, Project Management Journal, Journal of Information Technology Management, and Industrial Management, as well as other journals and conference proceedings. He authored Information Systems for Managers, McGraw-Hill (1993).
His current research interests include the management of information assets, information systems (IS) development and implementation, IS project management, and MIS evaluation and assessment. He is co-chair of the Society for Information Management's (SIM) Year 2000 Working Group.

Victor R. Prybutok is the director of the University of North Texas Center for Quality and Productivity and an associate professor of management science in the Business Computer Information Systems Department. He has published articles in Technometrics, Operations Research, Economic Quality Control, and Quality Progress, as well as other journals and conference proceedings. Dr. Prybutok is a senior member of the American Society for Quality Control (ASQC), an ASQC certified quality engineer, a certified quality auditor, and a 1993 Texas quality award examiner. His current research interests include project management, assessment of quality programs, neural networks, and MIS evaluation and assessment.