Scientometrics
DOI 10.1007/s11192-016-2034-y

Does high impact factor successfully predict future citations? An analysis using Peirce's measure

Gangan Prathap, S. Mini, P. Nishy

Received: 9 April 2015
© Akadémiai Kiadó, Budapest, Hungary 2016

Abstract  Journals are routinely evaluated by journal impact factors. More controversially, these same impact factors are often used to evaluate authors and groups as well. A more meaningful approach is to use actual citation rates. Since each journal has a very highly skewed distribution of articles according to citation rates, there is little correlation between journal impact factor and the actual citation rate of articles from individual scientists or research groups. Simply stated, the journal impact factor does not successfully predict high citations in future. In this paper, we propose the use of Peirce's measure of predictive success (Peirce in Science 4(93):453–454, 1884) to see whether the use of journal impact factors to predict high citation rates is acceptable. This measure is independent of Pearson's correlation (Seglen 1997) and gives a more quantitative refinement of the Type I and Type II classification of Smith (Financ Manag 133–149, 2004). The measures are used to examine the portfolios of some active scientists. It is clear that the journal impact factor is not effective in predicting future citations of successful authors.

Keywords  Performance analysis · Bibliometrics · Impact factor · Citations · Peirce's measure

Gangan Prathap (gp@niist.res.in), S. Mini (mini@niist.res.in), P. Nishy (nishy@niist.res.in)
CSIR National Institute for Interdisciplinary Science and Technology, Thiruvananthapuram 695019, India

Introduction

Seglen (1997) observed that evaluating the scientific quality of a result published in a recognized standard journal "is a notoriously difficult problem which has no standard solution." Ideally, each scientific result should be evaluated through peer review, in which subject experts assess the work for quality and quantity. However, this is becoming difficult to perform, and even experts resort to simpler approaches, such as quantitative indicators like citation rates and journal impact factors (Seglen 1997). In terms of causality, it is citation rates that determine journal impact factors and not the other way round, but this is often forgotten or ignored. Ever since the journal impact factor was introduced (Garfield 1955, 1999, 2005), it has been used to evaluate not only articles but also individuals, groups and institutions (Calza and Garbisa 1995; Taubes 1993; Vinkler 1986; Maffulli 1995). Such an approach is badly flawed, because within each journal there is a very highly skewed distribution of articles according to citation rates (Seglen 1992), which implies that there is little correlation between journal impact factor and the actual citation rate of articles from individual scientists or research groups (Vinkler 1986; Seglen 1994). Simply stated, the journal impact factor does not successfully predict high citations in future.

Smith (2004) looked at the problematic issue of using decision rules based on journal impact factor to promote or reward scientists or to award grants to proposals. Specifically, the question asked was whether a "top N journals" approach provides a reasonable decision rule when it comes to identifying top articles in the finance literature.
The accuracy of these decision rules was interpreted in terms of Type I errors (a "top" article is rejected by a particular decision rule, e.g. appearance in the top three journals) and Type II errors (a "non-top" article is accepted as a top article) for each journal and for combinations of the journals. High error rates were observed, suggesting that identifying top articles requires looking beyond the "top N journals".

In this paper, we propose another measure to see whether the use of journal impact factors to predict high citation rates is acceptable: Peirce's measure of predictive success (Peirce 1884). This measure is independent of Pearson's correlation (Seglen 1997) and gives a more quantitative refinement of the Type I and Type II classification of Smith (2004).

Fig. 1  The four quadrants of the Predictor-Event space following Peirce (1884)

Peirce's measure of success of prediction

We are dealing with a problem where we use the impact factor (IF) of the journal in which an article has appeared as a predictor to separate a top article from an article that does not make the cut (e.g. if IF ≥ IF_t it is a top article, where IF_t is the chosen threshold). This is done a priori and is nothing more than a promise. Long after the event, it is possible to count the actual citations C, and if a suitable threshold C_t is identified, then the event is said to have taken place if C ≥ C_t. What we now need to do is to assess the success of our decision rule in predicting future highly cited articles. Peirce's measure does precisely that.

Figure 1 shows the four quadrants of the Predictor-Event space following Peirce (1884). The TT quadrant signifies all cases where the predictor has promised a top article (T for True) and the event shows that this has been realised (T for True).
Similarly, the FF quadrant collects all cases where the predictor has rejected the article (F for False) and the event shows that this has been correctly predicted (F for False, i.e. a non-event). The FT quadrant is therefore the one that represents Type I errors: incorrectly predicted events, where we have rejected a case that should have been accepted (Smith 2004). The TF quadrant represents all Type II errors: incorrectly predicted non-events, where we have accepted cases that should have been rejected. Peirce's measure of "the science of the method" is given by the simple formula

i = TT/(TT + FT) − TF/(TF + FF).

Very simple calculations show that i ranges from 1 (the decision rule is 100 % successful and there are no Type I or Type II errors), through 0 (when TT/TF = FT/FF), to −1 (all cases are Type I or Type II errors). In the next section we take some examples to demonstrate the method. The first is taken from Seglen (1994); the other is real-life data collected for three highly decorated scientists from the authors' institution, covering 1987 to 2014.

Demonstration of Peirce's measure of success of prediction

We start with the example taken from Seglen (1994) and cited in Fig. 3 of Seglen (1997). The correlation between journal impact factor (IF) and actual citation rate (C) of articles from four individual scientists ranges from 0.05 to 0.63. Table 1 shows how Pearson's correlation r compares with the computed value of i [this is done by counting the cases by inspection of the charts in Fig. 3 of Seglen (1997)]. The thresholds for counting were taken as IF_t = 1.0 and C_t = 1.0. We see no pattern of relationship between r and i.

Table 1  The relationship between Pearson's correlation r and Peirce's i for the four authors in Fig. 3 of Seglen (1997)

Author    r       i
1         0.05    0.223
2         0.27   -0.083
3         0.44   -0.118
4         0.63   -0.093
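The i values above are obtained from Peirce's formula applied to the four quadrant counts. As a minimal illustrative sketch (the function name is ours, not from the original papers):

```python
def peirce_i(tt, ft, tf, ff):
    """Peirce's measure i = TT/(TT + FT) - TF/(TF + FF).

    The first term is the fraction of realised events (highly cited
    articles) that the rule correctly promised; the second is the
    fraction of non-events that the rule wrongly promised.
    i = 1 means no Type I or Type II errors, i = -1 means the rule is
    always wrong, and i = 0 whenever TT/TF = FT/FF.
    """
    return tt / (tt + ft) - tf / (tf + ff)

# A perfect decision rule has no FT (Type I) or TF (Type II) cases:
# peirce_i(10, 0, 0, 90) -> 1.0
# A completely inverted rule has only errors:
# peirce_i(0, 10, 90, 0) -> -1.0
```

The counts tt, ft, tf, ff are obtained exactly as in the text: by classifying each article according to whether IF ≥ IF_t (the promise) and whether C ≥ C_t (the event).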
Thus even if numerically there is a high correlation between journal impact factor and actual citation rate, it does not mean that decision rules based on IF are effective in separating highly cited work from poorly cited work.

We next look at some real-life data collected for three highly decorated scientists from the authors' institution. From 1987 to 2014, the CSIR National Institute for Interdisciplinary Science and Technology has maintained a list of all its papers and the impact factors (IF), for the corresponding year, of the journals in which they appeared. In each case, we also calculated the total number of citations (C) received by each paper as of March 2015, using the Web of Science Core Collection. Many exploratory studies showed that there was little relationship (both by slope and by Pearson's correlation) between IF and C. In this paper we report the results for the top three scientists, ranked in terms of total citations C, who have been active during this period. Incidentally, these three also ranked as the top three in terms of their h- and g-indices. In each case, after many exploratory studies using various combinations of thresholds, we found IF_t = 2 and C_t = 50 to be reasonable thresholds for predicting and confirming a top article. With these thresholds, 1178 of the 2098 articles published from NIIST during 2004–2014 appeared in journals with IF ≥ 2; however, only 175 articles had received more than 50 citations. Table 2 shows, for each of the three leading authors, the number of papers P published, the total citations C, the h- and g-indices, the fractions of Type I and Type II articles if these thresholds are used to define predicted and realised success, and the value of Peirce's measure of success i. In all three cases, we see a small fraction of Type I errors (<4 %) and a large fraction of Type II errors (ranging from nearly 25 to 75 %).
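The bookkeeping just described — classify every article against the two thresholds, count the quadrants, then report the error fractions and i — can be sketched end to end. The portfolio below is invented toy data for illustration only, not the NIIST dataset:

```python
def evaluate_rule(articles, if_t, c_t):
    """Count Predictor-Event quadrants over (IF, citations) pairs and
    report the Type I / Type II error fractions and Peirce's i."""
    counts = {"TT": 0, "TF": 0, "FT": 0, "FF": 0}
    for impact_factor, citations in articles:
        predicted = "T" if impact_factor >= if_t else "F"  # the promise
        realised = "T" if citations >= c_t else "F"        # the event
        counts[predicted + realised] += 1
    n = len(articles)
    tt, tf, ft, ff = counts["TT"], counts["TF"], counts["FT"], counts["FF"]
    return {
        "type_i": ft / n,   # rejected by the rule, yet highly cited
        "type_ii": tf / n,  # promised by the rule, yet poorly cited
        "i": tt / (tt + ft) - tf / (tf + ff),
    }

# Toy portfolio under the thresholds used in the text (IF_t = 2, C_t = 50):
toy = [(4.1, 120), (3.5, 12), (2.8, 7), (1.2, 60), (0.9, 3), (1.5, 4)]
result = evaluate_rule(toy, if_t=2, c_t=50)
# Here TT=1, TF=2, FT=1, FF=2, so TT/TF = FT/FF and i = 0:
# the rule is no better than chance for this toy set.
```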
In all cases, the measure of success when we use a decision rule based on IF is poor.

Table 2  The indicators for three top scientists from CSIR-NIIST

Author  Field              P    C     h   g   Type I  Type II  i
1       Photochemistry     85   5533  42  74  0.035   0.494    0.044
2       Organic chemistry  104  4118  31  62  0.000   0.760    0.071
3       Biotechnology      214  3971  35  54  0.037   0.248    0.344

Concluding remarks

An erroneous and unjustifiable practice still followed in many places is the use of journal impact factors to evaluate the quality of the scientific work of authors, groups and institutions. A more justifiable practice is the use of actual citation rates. An interesting challenge is to ask whether journal impact factors can successfully predict high citations in future. In this paper, we used Peirce's measure of predictive success (Peirce 1884) to test whether the use of journal impact factors to predict high citation rates is acceptable. This measure is independent of Pearson's correlation (Seglen 1997) and gives a more quantitative refinement of the Type I and Type II classification of Smith (2004). The measures were used to examine the portfolios of some active scientists. Type I errors are seen to be low (<4 %) and Type II errors are significant (ranging from nearly 25 to 75 %). In all cases, the measure of success when we use a decision rule based on IF is poor. It is clear that the journal impact factor is not effective in predicting future citations of successful authors.

References

Calza, L., & Garbisa, S. (1995). Italian professorships. Nature, 374, 492.
Garfield, E. (1955). Citation indexes to science: A new dimension in documentation through association of ideas. Science, 122(3159), 108–111.
Garfield, E. (1999). Journal impact factor: A brief review. Canadian Medical Association Journal, 161(8), 979–980.
Garfield, E. (2005). The agony and the ecstasy: The history and meaning of the journal impact factor. International Congress on Peer Review and Biomedical Publication.
http://garfield.library.upenn.edu/papers/jifchicago2005.pdf.
Maffulli, N. (1995). More on citation analysis. Nature, 378, 760.
Peirce, C. S. (1884). The numerical measure of the success of predictions. Science, 4(93), 453–454.
Seglen, P. O. (1992). The skewness of science. Journal of the American Society for Information Science, 43, 628–638.
Seglen, P. O. (1994). Causal relationship between article citedness and journal impact. Journal of the American Society for Information Science, 45, 1–11.
Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. British Medical Journal, 314, 498–502.
Smith, S. D. (2004). Is an article in a top journal a top article? Financial Management, 133–149.
Taubes, G. (1993). Measure for measure in science. Science, 260, 884–886.
Vinkler, P. (1986). Evaluation of some methods for the relative assessment of scientific publications. Scientometrics, 10, 157–177.