
A Pilot Citation Analysis of Australian Information Systems Researchers

Roger Clarke **

Review Version of 26 April 2006

© Xamax Consultancy Pty Ltd, 2006

Available under an AEShareNet Free
for Education licence or a Creative Commons 'Some
Rights Reserved' licence.

This document is at http://www.anu.edu.au/people/Roger.Clarke/SOS/AISCit0605.html


Abstract

Citation-counts of refereed articles are a potentially valuable measure of the impact of a researcher's work, in the information systems discipline as in many others. This paper reports on a pilot study of the apparent impact of Australian IS researchers, as disclosed by citation-counts of their works and those of leading international researchers. The study is placed in the context of the emergent Research Quality Framework (RQF). Citation analysis using currently available indexes is found to be fraught with many problems. There is high risk that, if implemented, the RQF will have substantial negative impacts on access to research funding by Australian IS academics.


Contents

1. Introduction
2. The Research Quality Framework
3. Citation Analysis
4. The Research Purposes and Method
5. Thomson/ISI
6. Google Scholar
7. Potential Applications and Implications
8. Conclusions
References
Appendix 1: Thomson/ISI Cited Ref Search
Appendix 2: Thomson/ISI cf. Google Comparisons for Selected Researchers


1. Introduction

Information systems (IS) is a maturing discipline, with a considerable specialist literature, and relationships with reference disciplines that are now fairly stable and well-understood. In a mature discipline, various forms of 'score-keeping' are undertaken. One purpose is to distinguish among applicants for promotion, and among contenders for senior appointments.

An increasingly significant application of score-keeping is as a factor in the allocation of resources to support research. A dramatic shift has occurred in the Australian higher education sector since the late 1980s. The longstanding model of collegial self-regulation within universities has been subjected to the dictates of economic rationalism, with the imposition of managerialist hierarchies within institutions, and of bureaucratic regulation from without. The latest round of external regulation takes the form of the emergent Research Quality Framework (RQF).

The work reported on in this paper was undertaken as part of a project on the history of the information systems discipline in Australia (Clarke 2006). The motivation for the sub-project was to gather evidence of the impact of Australian IS academics. The paper commences by briefly reviewing the emergent RQF, and its potential impacts on the discipline. This is followed by a brief review of citation analysis, and its hazards. The paper then describes the objectives of the research, and the method applied. The raw scores are presented, and a number of issues arising from the analysis are identified and examined.


2. The Research Quality Framework

The Department of Education, Science and Training (DEST) has been developing a Research Quality Framework (DEST 2005, 2006). The RQF seeks to measure both the quality of research, which includes "its intrinsic merit and academic impact", and the broader impact or use of the research, which refers to "the extent to which it is successfully applied in the broader community". The outcomes of the measurements would be rankings, which would be used in decision-making about the allocation of research-funding. The unit of study is 'research groupings', which are to be decided by each institution, but will be subject to considerable constraints in terms of disciplinary focus and minimum size.

The thinking behind the RQF derives largely from the UK Research Assessment Exercise (RAE), which has been operational since 1986, and to some extent also from a New Zealand scheme, the Performance-Based Research Fund (PBRF 2005). The procedure in the most recent RAE, in 2001, was described as follows: "Every higher education institution in the UK may make a submission to as many of the units of assessment as they choose. Such submissions consist of information about the academic unit being assessed, with details of up to four publications and other research outputs for each member of research-active staff. The assessment panels award a rating on a scale of 1 to 5, according to how much of the work is judged to reach national or international levels of excellence" (RAE 2001, p. 3). The 2008 Exercise also involves evaluation of "up to four items ... of research output produced during the publication period (1 January 2001 to 31 December 2007) by each individual named as research active and in post on the census date (31 October 2007)" (RAE 2005, p. 13). Similarly, in the New Zealand scheme, a major part of the 'quality evaluation' depends on assessment of the 'evidence portfolio' prepared by or for each eligible staff-member, which contains up to four 'nominated research outputs' (PBRF 2005, p. 41).

The design of all three schemes involves lengthy and bureaucratic specifications of how research groupings and individuals are to fill in forms, including definitions of the publications that can be included, large evaluation panels comprising people from disparate disciplines, and lengthy and bureaucratic assessment processes. The RAE in particular has been a highly resource-intensive activity. A recent report suggests that it is shortly to be abandoned (MacLeod 2006), on the basis that it has achieved its aims.

The particular aspect of the RQF that is relevant to this paper is the use of publications and citations as indicators of research quality and research impact. A conventional indicator of research quality is publications in refereed venues, primarily in 'quality, refereed journals' and secondarily in the stronger refereed conferences. This may be supplemented by the rankings of those journals and conference proceedings. Rankings tend to reflect the perspective of disciplines. They therefore tend to de-value journals that focus on research domains, because such venues are inherently multi-disciplinary in nature. Based on the relevant panel's assessment of the "contribution" of the research grouping's "research work", and the "significance" of the area in which the work is undertaken, the research grouping would be allocated a rating, on a scale of 1 (low) to 5 (high).

A potential indicator of research impact, on the other hand, is evidence of the use of refereed publications, in particular citations of them. Citations within journals and refereed conferences would indicate the researcher's impact on other researchers; whereas citations within less formal literatures such as professional magazines and newsletters, government reports, and the trade press, might be used as indicators of impact in the broader community. Other indicators of impact include re-publications, translations and inclusion in collections and anthologies. Impact is, however, largely a cumulative matter, rather than being related to a single publication. The focus might be more broadly on reputation or 'esteem'. Other indicators of reputation could include international appointments, prizes and awards, membership of academies and editorial boards, keynote-speaker invitations, and collaborations with other highly-reputed researchers. Impact is to be assessed against a 3-point scale of High, Moderate and Limited.

There are many concerns about the RQF undertaking, and what impacts it will have. Its design is entirely consistent with a political agenda to concentrate research funding on large, longstanding, well-established groups that have good connections overseas, and to reduce research funding to other groups and to individuals.

One concern specific to the IS discipline is that the term 'information systems' is found only once in DEST (2005). It has been assigned to Panel 4: Mathematical and information sciences and technology (RFCD Codes 2301-2399). It is not mentioned under Panel 10: Economics, commerce & management (RFCD Codes 3401-3599).

The IS discipline straddles the technical and applications domains. In this author's view, however, the majority of Information Systems academics would relate their work to 'Commerce and Management' rather than 'Information, Computing and Communication Sciences'. It is to be expected that the mathematicians, computer scientists and engineers who populate the panel entitled 'Mathematical and information sciences and technology' will lack interest in and appreciation of many aspects of IS research, how it is undertaken, and what constitutes quality, impact and reputation.

The IS discipline is a 40-year-old upstart, with about 700 members, who are scattered across a hundred departments, only half of which are identifiably IS. Moreover, it has had a long battle to achieve recognition within the wider academic community. The ARC coding scheme - now called the Research Fields, Courses and Disciplines (RFCD) Classification - contained no entries for IS topics until the late 1990s, and the 75-strong ARC College of Experts has included IS Professors only since 2001 (successively, Janice Burn, Graeme Shanks, and Michael Rosemann). However, the chances of panellists being assigned in proportion to the number of academics practising in the relevant disciplines appear low, and the chances of IS academics being invited onto panels appear low as well.

In DEST (2005), Panel 4 (Mathematical and information sciences and technology) cross-refers to RFCD Codes 2301-2399, and Panel 10 (Economics, commerce & management) to RFCD Codes 3401-3599. But those codes differ from the current version of the RFCD Classification (ARC 2006), whose relevant entries are shown in Exhibit 1.

Exhibit 1: Relevant Entries in the RFCD

280000 INFORMATION, COMPUTING AND COMMUNICATION SCIENCES

280100 Information Systems

This discipline covers information science and information systems. It includes:

The subsidiary codes are:

350000 COMMERCE, MANAGEMENT, TOURISM AND SERVICES

The sole subsidiary code is:

As the research-pot becomes slimmer, the competition heats up, and 'better-connected' disciplines make their plays, IS appears very likely to face yet slimmer prospects of getting a reasonable proportion of the available research resources. One motivation for this paper is to assess the extent to which citation analysis may support or hinder the effectiveness of score-keeping for IS research endeavour, whether in the context of the envisaged RQF or otherwise.


3. Citation Analysis

"Citations are references to another textual element [relevant to] the citing article. ... In citation analysis, citations are counted from the citing texts. The unit of analysis for citation analysis is the scientific paper" (Leydesdorff 1998). Leydesdorff and others apply 'citation analysis' to the study of cross-references within a literature, in order to document the intellectual structure of a discipline. This paper is concerned with its use for the somewhat different purpose of evaluating the quality and/or impact of works and their authors by means of the references made to them in refereed journal articles.

Authors have cited prior works for centuries. Gradually, the extent to which a work was cited in subsequent literature emerged as an indicator of the work's influence, which in turn implied significance of the author. Whether the influence of work or author was of the nature of notability or notoriety was, and remains, generally ignored by citation analysis. Every citation counts equally, always provided that it is in a work recognised by whoever is doing the counting.

Attempts to formally measure the quality and/or impact of works, and of their authors, on the basis of the number of citations that they gather, are a fairly recent phenomenon. Indeed, the maintenance of citation indices appears to date only to about 1960, with the establishment of the Science Citation Index (SCI), associated with Garfield (1964). Considerable progress in the field of 'bibliometrics' has ensued.

The advent of the open, public Internet, particularly since the Web exploded in 1993, has stimulated many developments. Individual journal publishing companies such as Elsevier, Blackwell, Kluwer, Springer and Taylor & Francis have developed automated cross-linkage services, at least within their own journal-sets. Meanwhile, the open access movement is endeavouring to produce not only an open eLibrary of refereed works, but also full and transparent cross-referencing within the literature. A leading project in the area was the Open Citation (OpCit) project in 1999-2002. An outgrowth from the OpCit project, the Citebase prototype, was referred to as 'Google for the refereed literature'. It took little time for Google Inc. to discover the possibility of a lucrative new channel. It launched Google Scholar in late 2004.

These need to be regarded as initial forays. All still fall far short of the vision of the electronic library, which was conceived by Vannevar Bush (1945), and articulated by Ted Nelson in the 1960s as 'hypertext'. As outlined in his never-completed Project Xanadu, the electronic library would include the feature of 'transclusion', that is to say that quotations are included by precise citing of the source, rather than by replicating some part of the content of the source.

In the complex, inter-locking and ambiguous field of learned publishing, it is to be expected that citation analysis will give rise to a degree of contention. Dissatisfaction with it as a means of evaluating the quality and impact of works and of researchers has a long history. See, for example, Hauffe (1994), MacRoberts & MacRoberts (1997) and Adam (2002).

Comparisons within a discipline are problematic enough. Comparisons between disciplines would be extraordinarily difficult. Disciplines are at different stages of maturity. They are reflected in the indices to varying degrees. They have widely varying numbers of academics. And their publishing patterns vary widely, e.g. some publish heavily in journals, others heavily in books, and yet others in other forms, including music, art and software. An inherent bias arises throughout academic endeavour in favour of works in the English language, and against works in other languages.

The remainder of this section briefly summarises deficiencies that are relevant to the present concern, viz. the effectiveness of citation analysis in the IS discipline in the 2005-2010 timeframe, with particular reference to Australian researchers. Problems with citation analysis include the following:

In applying citation analysis to Australian IS researchers, this work seeks to avoid or overcome these problems, and where that cannot be achieved, to remain mindful of their potential consequences.


4. The Research Purposes and Method

The work reported on in this paper had two objectives:

Clearly, there is more than a little scope for conflict between the two. In a mature context, the effectiveness would have been previously established, and the assessment could proceed with much greater confidence. But neither the technique of citation analysis, nor the available databases, nor the discipline under study is mature.

The following method was adopted:

Further detail on each of these steps is now provided.

The set of venues was developed by reference to the now well-established literature on IS journals and their rankings, for which a bibliography is provided at Saunders (2005). Consideration was given to the lists and rankings there, including the specific rankings used by several universities, and available on that site. Details of individual journals were checked in the most comprehensive of the several collections, which is maintained at Deakin University (Lamp 2005).

The author believes that the set selected, and listed in Exhibit 2, represents a fairly conventional view of the key refereed journals on the management side of the IS discipline. It significantly under-represents journals in the reference discipline of computer science, and at the intersection between IS and computer science. The reason is that those fields are themselves highly diverse, and a very large number of venues would need to be considered, and many included, in which the great majority of IS researchers neither read nor publish. Less conventionally, the list separates out a few 'AA-rated' journals, and divides the remainder into general, specialist and regional journals. There is, needless to say, ample scope for debate on all of these matters.

Consideration could be given to supplementing the journals with the major refereed conferences. In this author's experience, ICIS is widely regarded as approaching AA status, ECIS as a generic A, and ACIS, PACIS and AMCIS may be considered by some as being generic A as well. These are accessible and indexed in the Association for Information Systems' AIS eLibrary, ICIS since it commenced in 1980, ECIS since 2000, and ACIS since 2002. However, no attempt was made to extend the analysis undertaken in this work to even the strongest of the refereed conferences.

A brief survey was conducted of available citation indices. It was clear that Thomson / ISI needed to be included, because it is well-known and would be very likely to be used by evaluators. Others considered included:

Thomson Scientific, previously known as the Institute for Scientific Information (ISI), publishes the Science Citation Index (SCI) and the Social Science Citation Index (SSCI). In January 2006, its site stated that SCI indexes 6,496 'journals' (although some are proceedings), and that SSCI indexes 1,857 'journals'. The company's policies in relation to inclusion (and hence exclusion) of venues are explained at http://scientific.thomson.com/mjl/selection/. An essay on the topic is at Thomson (2005), but access is unreliable.

Elsevier's Scopus has only been operational since late 2004. The next three are computer science indexes adjacent to IS, and at the time the research was conducted the last on the list was still experimental.

The decision was taken to utilise the Thomson/ISI SCI and SSCI Citation Indices, and to extract comparable data from Google Scholar. A more comprehensive project would be likely to add Scopus into the mix.

Exhibit 2 shows the set of journals. The third and fourth columns indicate whether the journal is included in the Thomson/ISI SCI and SSCI Citation Indices. The final column shows the inferences drawn by the author regarding the extent of the Thomson/ISI coverage of the journal.

Exhibit 2: Refereed Venues Selected

Journal Name | Abbrev. | SSCI | SCI | Issues Included

AA Journals (3)
Information Systems Research | ISR | Y | - | Only from 1994, Vol. 4 ?
Journal of Management Information Systems | JMIS | Y | - | Only from 1999, Vol. 16 !
Management Information Systems Quarterly | MISQ | Y | - | Only from 1984, Vol. 8 !

AA Journals in the Major Reference Disciplines (4)
Communications of the ACM (Research Articles only) | CACM | - | Y | From 1958, Vol. 1
Management Science | MS | Y | - | From 1955, Vol. 1
Academy of Management Journal | AoMJ | Y | - | From 1958, Vol. 1
Organization Science | OS | Y | - | From 1990, Vol. 1 ?

A Journals - General (9)
Communications of the AIS (Peer Reviewed Articles only) | CAIS | - | - | None !
Database | Data Base | - | Y | Only from 1982, Vol. 14 ?
Information Systems Frontiers | ISF | - | Y | Only from 2001, Vol. 3
Information Systems Journal | ISJ | Y | - | Only from 1995, Vol. 5
Information & Management | I&M | Y | - | Only from 1983, Vol. 6
Journal of the AIS | JAIS | - | - | None !
Journal of Information Systems | JIS | - | - | None !
Journal of Information Technology | JIT | Y | - | Only 18 articles
Wirtschaftsinformatik | WI | - | Y | Only from 1990, Vol. 32

A Journals - Specialist (15)
Decision Support Systems | DSS | - | Y | Only from 1985, Vol. 1
Electronic Markets | EM | - | - | None
International Journal of Electronic Commerce | IJEC | Y | - | From 1996, Vol. 1
Information & Organization | I&O | - | - | None
Information Systems Management | ISM | Y | - | Only from 1994, Vol. 11
Information Technology & People | IT&P | - | - | None !
Journal of End User Computing | JEUC | - | - | None
Journal of Global Information Management | JGIM | - | - | None
Journal of Information Systems Education | JISE | - | - | None
Journal of Information Systems Management | JISM | - | - | None
Journal of Management Systems | JMS | - | - | None
Journal of Organizational and End User Computing | JOEUC | - | - | None
Journal of Organizational Computing and Electronic Commerce | JOCEC | - | - | None
Journal of Strategic Information Systems | JSIS | - | Y | From 1992, Vol. 1 ?
The Information Society | TIS | Y | - | Only from 1997, Vol. 13 ! - ?

A Journals - Regional (3)
Australian Journal of Information Systems | AJIS | - | - | None
European Journal of Information Systems | EJIS | - | Y | Only from 1995, Vol. 4
Scandinavian Journal of Information Systems | SJIS | - | - | None

When assembling a list of individuals active in the field, it is challenging to be sure of being comprehensive. People enter and depart from the discipline. There are overlaps with the Computer Science discipline, with various management disciplines, and with the IS profession. Immigration, emigration and expatriates make the definition of 'Australian' less than straightforward.

The set of names of Australian academics was developed based on the author's experience in the field since about 1970, but in particular by reference to the Australian IS Academics Directory (Clarke 1988), the Australasian IS Academics Directory (Clarke 1991), the Asia Pacific Directory of IS Researchers (Gable & Clarke 1994 and 1996), and the ISWorld Directory (1997-). Arbitrary decisions were taken as to who was an expatriate Australian, and how long immigrants needed to be active in Australia to be treated for the purposes of this analysis as being Australian. Data was sought in relation to about 100 leading Australian IS researchers, including 4 well-known and successful expatriates.

Data was extracted from the SCI and SSCI citation indices over several days in late January 2006. Access was gained through the ANU Library Reverse Proxy, by means of the company's 'Web of Science' offering. Both sets of searches were restricted to 1978-2006, across all Citation Indices (Science Citation Index - SCI-Expanded, Social Sciences Citation Index - SSCI and Arts & Humanities Citation Index - A&HCI). Multiple name-spellings and initials were checked, and where doubt arose were also cross-checked with the AIS eLibrary.

In order to provide comparison with leaders in the IS discipline outside Australia, the same process was applied to a small sample of leading overseas researchers. The selection was not intended to be random. It relied on this author's longstanding involvement in the field, and his knowledge of the literature and the individuals concerned. Relatively uncommon surnames were used, in order to reduce the likelihood of pollution through the conflation of articles by multiple academics.

Google Scholar was then searched for a sub-set of the 30 apparently most-highly-cited Australian researchers, and supplementary research was undertaken within the Thomson/ISI database. These elements were performed in early and late April 2006 respectively.


5. Thomson/ISI

This section presents the results of the pilot citation analyses of the Thomson/ISI collections.

5.1 Method

The data collected for each of the authors was the apparent count of articles, and the apparent total count of citations. The Thomson/ISI site provides several search-techniques. The search-technique used was the 'General Search'. In each case, the search-terms used were <Author includes author-surname> combined with <Author includes author-initial or -initials>, with the date-range restricted to 1978 onwards. In the case of common names, however, this was qualified with <Address includes 'Australia'>. This provides a list of articles by all authors sharing the name in question, provided that they were published in venues that are in the ISI database. For each such article, a citation-count is provided, which is defined as the number of other articles in the ISI database that cite it. The list was inspected, in order to remove articles that appeared to be by people other than the person being targeted.
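
To make the counting rule concrete, a minimal sketch follows, in Python. The record fields assumed here (authors, address, citation_count) are hypothetical stand-ins, not the actual Web of Science export format, and the searches reported in this paper were performed interactively rather than programmatically.

    def tally_citations(records, surname, initials, require_australia=False):
        """Mimic the 'General Search' rule: <Author includes surname> combined
        with <Author includes initial(s)>, optionally qualified with
        <Address includes 'Australia'> for common names."""
        kept = []
        for rec in records:
            name_forms = {n.lower() for n in rec["authors"]}
            # Assumes ISI-style name forms such as "Vessey, I."
            if not any(f"{surname}, {i}".lower() in name_forms for i in initials):
                continue
            if require_australia and "australia" not in rec["address"].lower():
                continue
            kept.append(rec)
        counts = [rec["citation_count"] for rec in kept]
        return {
            "articles": len(kept),              # 'Number of Articles' in Exhibit 3
            "citations": sum(counts),           # 'Citation Count'
            "largest": max(counts, default=0),  # 'Largest Per-Article Count'
        }

Even with such automation, the final inspection step described above remains manual: articles by other people sharing the name must still be weeded out by someone who knows the target's publication record.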

This measure was selected primarily because it is the most obvious, and hence the most likely to be used by someone evaluating the apparent contribution of a particular academic or group of academics. It is also the most restrictive definition available, and hence could be argued to be the most appropriate to use when evaluating applicants for the most senior appointments, and when evaluating the reputations of the most prominent groups of researchers. Other possible approaches are discussed at the end of this section.

5.2 Citation-Counts for Australian IS Researchers

Exhibit 3 provides data for the highest-scoring Australian IS academics, using an arbitrary cut-off of 20 total citations. This resulted in the inclusion in the table of the 4 expatriates and 26 local researchers. For each person, the data shown is the total citation-count, the number of papers in the index, and the citation-count for that person's most-cited paper.

Exhibit 3: Citation-Counts for Australian IS Researchers

Researcher | Citation Count | Number of Articles | Largest Per-Article Count

Expatriates
Iris Vessey (as I.) | 601 | 35 | 111
Rick Watson (as R.T.) | 485 | 28 | 78
Ted Stohr (as E.A.) | 217 | 12 | 108
Peter Weill (as P.) | 178 | 13 | 47

Locals
Marcus O'Connor (as M.) | 354 | 31 | 66
Ron Weber (as R.) | 328 | 22 | 38
Philip Yetton (as P. and P.W.) | 270 | 26 | 57
Michael Lawrence (as M.) | 208 | 27 | 66
Michael Vitale (as M. and M.R.) | 179 | 14 | 107
Ross Jeffery (as D.R., and as R.) | 172 | 28 | 38
Marianne Broadbent (as M.) | 166 | 24 | 36
Bob Edmundson (as R.H.) | 86 | 4 | 66
Graham Low (as G. and G) | 83 | 19 | 38
Peter Seddon (as P. and P.B.) | 70 | 6 | 60
June Verner (as J. and J.M.) | 65 | 18 | 22
Graeme Shanks (as G.) | 61 | 13 | 14
Paula Swatman (as P.M) | 53 | 9 | 29
Kit Dampney (as C and C.N.G.) | 47 | 16 | 25
John D'Ambra (as J.) ** | 47 | 7 | 16
Roger Clarke (as R. and R.A.) | 44 | 22 | 16
Michael Rosemann (as M.) | 41 | 14 | 18
Graham Winley (as G.) | 40 | 13 | 13
Chris Sauer (as C.) | 40 | 18 | 13
Simpson Poon (as S.) | 37 | 3 | 29
Robert Johnston (as R.B.) | 36 | 10 | 11
Guy Gable (as G.G.) | 34 | 9 | 23
Errol Iselin (as E.R.) | 31 | 5 | 15
Mike Papazoglou (as M.) | 31 | 16 | 6
Paul Ledington (as P.) | 27 | 10 | 15
Shirley Gregor (as S.) | 25 | 7 | 16
David Wilson (as D.) | 22 | 17 | 4
Joan Cooper (as J.) | 20 | 7 | 10

** [John D'Ambra has pointed out a specific problem with his data, in that ISI (and, to a lesser degree, Google Scholar) does not cope at all well with apostrophes embedded in names. His adjusted total 6 months before the date this data was extracted was 59, and his article-count several higher - added Jan 2008]

5.3 Citation-Counts for Leading International IS Academics

Exhibit 4 endeavours to provide a benchmark. The appropriate comparison is not with people in other disciplines, because their data reflects different numbers of academics, and different numbers of journals, which are differentially indexed. The table shows the same data as for Australian IS academics, but for well-known leaders in the discipline in North America and Europe.

Exhibit 4: Citation-Counts for Leading International IS Academics

Researcher | Citation Count | Number of Articles | Largest Per-Article Count

North American
Lynne Markus (as M.L.) | 1,335 | 39 | 296
Izak Benbasat (as I.) | 1,281 | 71 | 155
Dan Robey (as D.) | 1,247 | 45 | 202
Sirkka Jarvenpaa (as S.L.) | 960 | 40 | 107
Detmar Straub (as D. and D.W.) | 873 | 49 | 160
Rudy Hirschheim (as R.) | 600 | 44 | 107
Gordon Davis (as G.B.) | 428 | 48 | 125
Peter Keen (as P.G.W.) | 427 | 21 | 188
Eph(raim) McLean (as E. and E.R.) | 119 | 30 | 31

Europeans
Kalle Lyytinen (as K.) | 458 | 55 | 107
Leslie Willcocks (as L.) | 231 | 42 | 28
Bob Galliers (as R.D. and R.) | 173 | 34 | 25
Guy Fitzgerald (as G.) | 121 | 50 | 38
Enid Mumford (as E.) | 103 | 21 | 42
Claudio Ciborra (as C. and C.U.) | 60 | 13 | 26
Frank Land (as F.) | 59 | 18 | 19
David Avison (as D.) | 51 | 37 | 44
Ron Stamper (as R. and R.K.) | 32 | 13 | 16
Niels Bjorn-Andersen (as N.) | 3 | 3 | 2

5.4 Quality Assessment

In the European list, I was surprised by the low counts for Ron Stamper and Niels Bjorn-Andersen. I report later on the results of an investigation of their citations as indicated by Google Scholar.

In the North American list, I was surprised by the low results for Eph McLean, and consequently I examined his list more closely. An article that, in my view, could have been expected to be among the most highly-cited in the discipline (DeLone & McLean's 'Information Systems Success: The Quest for the Dependent Variable') is missing from the Index. It was published in ISR, which would have been AA-rated by most in the discipline from its inception. But ISR is indexed only from 5, 1 (March 1994), whereas the DeLone & McLean paper appeared in 3, 1 (March 1992). Using the 'Cited Ref Search' provided by ISI also fails to detect it under McLean E.R., but it does detect a single citation when the search is performed on 'INFORM SYST RES' and '1992', and finds it under Author DeLone W.H. with 7 variants of the citation, all of which are misleadingly foreshortened to 'INFORMATION SYSTEMS'. These disclose the very substantial sum of 448 hits. A later section reports on the citations of this paper as indicated by Google Scholar.

In order to gain some insight into the impact on individual researchers, a comprehensive assessment was undertaken of the inclusion and exclusion of the refereed works of the author of this paper. In the author's view, this tiny and highly convenient sample is legitimate, because I am not young, have a substantial publications record, have had some impact, have diverse interests, have published in diverse literatures, and am well-positioned to ensure accuracy in the analysis because I know all of my own works. The outcome of the analysis was as follows:

Citation analysis based on the Thomson/ISI collection leaves a great deal to be desired. Particular problems included the following:

These deficiencies result in differential bias against researchers. The variability is not easily predictable, although it does appear to depend on the individual's areas of interest, with very poor coverage of specialisations that have emerged during the last decade, of journals that focus on research-domains and that accordingly accept papers written from varying disciplinary perspectives, and of the interfaces between IS and law, and IS and public policy.

Some of the deficiencies would appear to affect all disciplines (e.g. the apparent incompleteness of journals that are claimed to be indexed, and the failure to differentiate refereed from unrefereed content in at least some journals). Many others would appear to mainly affect relatively new disciplines, such as the long delay before journals are included, and refusals to include some journals even when representations are made.

The question remains which factors affect citation-counts, and therefore need to be considered when seeking to infer something from them, particularly something about an individual researcher (such as the significance of their overall contribution to the discipline, or perhaps the academic impact of their individual papers). The following appear to be important factors, with observations about how IS compares with disciplines generally:

5.5 Alternative Methods

This sub-section considers alternative ways in which the Thomson/ISI database could be applied to the purpose. Two other categories of search are available. One is 'Advanced Search', which provides the ability to apply Boolean operators to combinations of fields. This is appropriate when conducting tightly focussed searches. If a common evaluation method were able to be defined, it might be feasible to use 'Advanced Search' to apply it. It was not considered relevant to the pilot study conducted here.

The other category of search is called 'Cited Ref Search'. This is looser than the 'General Search' used in this pilot study. It lists every article by all authors sharing the name in question that has been cited by any article in any publishing venue in the database. It therefore includes articles that were published in venues that are not in the ISI database. It is arguable that this scope of citation-count represents a more appropriate measure of the reputation, worth or impact of an individual researcher or group of researchers. The measure counts citations, in articles in the ISI database, of any of the author's works, rather than counting only citations of works that are themselves in the ISI database.
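
The distinction between the two measures can be restated schematically, as in the sketch below. The names citing (a mapping from each article in the ISI database to the set of works it cites) and isi (the set of works that are themselves indexed) are illustrative assumptions, not ISI terminology.

    def general_search_count(author_works: set, citing: dict, isi: set) -> int:
        # Citations, in ISI-indexed articles, of the author's works
        # that are themselves ISI-indexed.
        return sum(1 for refs in citing.values()
                     for w in refs if w in author_works and w in isi)

    def cited_ref_count(author_works: set, citing: dict) -> int:
        # Citations, in ISI-indexed articles, of any of the author's
        # works, wherever those works were published.
        return sum(1 for refs in citing.values()
                     for w in refs if w in author_works)

The Expansion Factor reported in Appendix 1 is simply the ratio of the second count to the first.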

In order to test the likely impact of this alternative approach, a small number of researchers in each category were selected, and the study repeated. The design of the search-facility creates challenges. For example, it does not enable restriction to <Address includes 'Australia'>. In addition, very little information is provided for each hit, the sequence provided is alphabetical by short journal title, and common names generate in excess of 1,000 hits.

Appendix 1 provides the results of this part of the study. All of the researchers for whom the analysis was undertaken had substantial numbers of articles that were cited in the ISI database, but that were not themselves in the ISI database. In some cases, relaxing the criterion resulted in an increase in the citation-count by 20-30%, but in others in an increase by factors of as much as 5. The data in Exhibit 5 enables the following inferences to be drawn:

Dependence on the General Search alone provides only a restricted measure of the reputation, worth or impact of an academic. Moreover, it may give a seriously misleading impression of the impact of researchers who publish in non-ISI venues such as books, and journals targeted at the IS profession and management. To the extent that citation analysis of Thomson/ISI data is used for evaluation purposes, a method needs to be carefully designed that reflects the objectives of the analysis. In addition, it would be essential for an opportunity to be provided for the individuals concerned to consider the results and submit supplementary information.


6. Google Scholar

Google Scholar is still an experimental service. From a bibliometric perspective, it is crude, because it is based on brute-force free-text analysis, without recourse to metadata, and without any systematic approach to testing venues for quality before including them. On the other hand, it has the advantages of substantial reach, ready accessibility, and popularity. It is inevitable that it will be used as a basis for citation analysis, and therefore important that it be compared against Thomson's more formal database.

The analysis presents some challenges. The method adopted was to test the hits for a sample of the researchers whose data from Thomson/ISI appears in Exhibits 3 and 4. Statistics were extracted from Google Scholar using various search-strings. For example, <A>, <I>, <IS> and <IT> are all 'stop-words', and are ignored if included in search-terms. Search-terms of the form <"I Vessey" OR "Vessey I"> appeared to generate the most useful results, and this format was generally applied.
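
A sketch of that search-string construction follows. The stop-word list shown is an illustrative subset only, not Google's actual list.

    STOP_WORDS = {"a", "i", "is", "it"}  # illustrative subset only

    def scholar_query(initial: str, surname: str) -> str:
        """Build the <"I Vessey" OR "Vessey I"> form used in this study.
        Even where the initial is a stop-word and is ignored as a term on
        its own, the quoted phrases still constrain word adjacency, which
        is why this format generated the most useful results."""
        if initial.rstrip(".").lower() in STOP_WORDS:
            print(f"note: initial '{initial}' is a stop-word on its own")
        return f'"{initial} {surname}" OR "{surname} {initial}"'

    print(scholar_query("I", "Vessey"))  # prints a note, then the query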

A very small sample was used, because of the resource-intensity involved, and the experimental nature of the procedure. An intentionally biased sample was selected, firstly because the intention was to test the comparability of the two sets of data, both in terms of coverage and citation-counts, and secondly in order to avoid conflation among multiple authors and the omission of entries.

The multiple-author problem is highly varied. For many of the researchers for whom data is presented below, there was no evident conflation with others, i.e. their Top-10 appeared on the first page of 10 entries displayed by Google Scholar. For a few, it was necessary to skip some papers, and move to the second or even third page. To reach Eph McLean's 10th-ranked paper, it was necessary to check 60 titles, and to reach Ron Weber's 10th, 190 titles were inspected. A test on my own, rather common, name on the above basis resulted in 12,700 hits. To reach the 10th-most-cited, it was necessary to inspect 558 entries. The challenges involved in this kind of analysis are underlined by the fact that the first 558 entries include a moderate number of papers by other R. Clarkes on topics and in literatures that are at least adjacent to topics I have published on and venues I have published in. These could easily have been mistakenly assigned to me by a researcher who lacked a detailed knowledge of my publications list. There are many researchers with common names, and hence accurate citation analysis based on name alone is difficult to achieve. Experiments with searching based on article-titles gave rise to other challenges.
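
Where the searcher holds a known publications list, the screening of hits can be partially mechanised, as in the sketch below. The 0.85 cutoff is an arbitrary assumption; fuzzy title-matching of this kind narrows the candidate list, but cannot replace inspection by someone who knows the publication record.

    import difflib

    def plausible_match(hit_title, known_titles, cutoff=0.85):
        """True if a Google Scholar hit's title closely resembles a title
        in the researcher's known publications list."""
        return bool(difflib.get_close_matches(hit_title.lower(),
                                              [t.lower() for t in known_titles],
                                              n=1, cutoff=cutoff))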

Appendix 2 contains tables (numbered 6A through 6G) which show comparisons between the raw Google citation-count and the Thomson/ISI citation-count. Results are shown for seven researchers. In each case, careful comparison was necessary, to ensure accurate matching of the articles uncovered by Google with those disclosed by Thomson/ISI.

A number of conclusions can be drawn, some of which are intuitive, some less so; and some of which are disturbing. In particular:

An important further test was the measure generated for Delone & McLean's 'Information Systems Success: The Quest for the Dependent Variable'. This was undertaken by keying the search-term <E McLean W DeLone> into Google Scholar, and critically considering the results. The test was performed twice, in early April 2006 and late April 2006. The results differed in ways that suggested that, during this period, Google was actively working on the manner in which its software counts citations and presents hits. The later, apparently better organised counts are used here. The analysis was complicated by the following:

The raw results comprised 824 citations for the main entry (and a total of 832 hits). Based on a limited pseudo-random sample from the first 40 of the 824, many appeared indeed to be attributable to the paper. This is a citation-count of a very high order. An indication of this is that the largest Thomson/ISI citation-count for an IS paper located during this research was 296, for a paper in CACM by Lynne Markus. In Google, that paper scored 472, less than 60% of the count for DeLone & McLean. In short, it appears that, as a result of what is most likely a data-capture error, Thomson/ISI may currently omit the single most-referenced paper in the entire discipline, in the process reducing Eph McLean's citation-count by many hundreds.

A further test was undertaken to compare the citation-counts generated by ISI and Google Scholar for two European researchers whose ISI counts were lower than I had expected:

Some perspective on what 100 citations means in the IS discipline can be gained from an assessment of the total citation-count of the top 10 items that are discovered in response to some terms of considerable popularity in recent years. The terms were not selected in any systematic manner. The count of the apparently 10th-most-cited article containing the expression is also highlighted, because it provides an indication of the depth of the heavily-cited literature using that term:

Another aspect of interest is the delay-factor before citations begin to accumulate. Some insight was gained from an informal sampling of recent MISQ articles, supplemented by searches for the last few years' titles of this author's own refereed works. A rule of thumb appears to be that there is a delay of 6 months before any citations are detected by Google, and of 18 months before any significant citation-count is apparent. The delay is rather longer on Thomson/ISI. This is to be expected, because of the inclusion of edited and lightly-refereed venues in Google, which have a shorter review-and-publication cycle than Thomson/ISI's journals, most of which are heavily refereed.

In summary, citation-counts above about 75-100 on Google Scholar appear to indicate a high-impact article, and above 40-50 a significant-impact article. Appropriate thresholds on Thomson/ISI would appear to be about half of those on Google. These thresholds are of course specific to IS, and other levels would be likely to be appropriate in other disciplines, and in other countries.
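
Restated as a simple lookup (a sketch only; the band boundaries are this paper's IS-specific estimates and would need recalibration for other disciplines or countries):

    def impact_band(citations, source="google"):
        # Lower bounds of the ranges quoted above: Google 75+/40+,
        # Thomson/ISI at about half of those levels.
        high, significant = (75, 40) if source == "google" else (38, 20)
        if citations >= high:
            return "high-impact"
        if citations >= significant:
            return "significant-impact"
        return "no inference warranted"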


7. Potential Applications and Implications

Reputation is a highly multi-dimensional construct, and reduction to a score is morally dubious, intellectually unsatisfying, and economically and practically counter-productive. On the other hand, the frequency with which papers are cited by other authors is a factor that an assessment of reputation would ignore at its peril.

The scope exists for citation analysis to be used carefully, as an aid in evaluation. Members of search and selection committees responsible for the appointment of key staff-members apply the technique, but incorporate subtlety, sophistication and insight into their work. An approach that could be adopted is for individuals to themselves undertake citation analysis of their own works, and submit both the raw data and additional information needed to fill out the picture.

In the context of the RQF, the likelihood of subtlety, sophistication and insight being applied appears remote. The RQF is from the outset a political mechanism aimed at focussing research funding on a small proportion of research centres within a small proportion of institutions. In addition, the RQF is a mass-production exercise, and is subject to heavily bureaucratic processes and definitions. To the extent that citation analysis is used in the process, it will inevitably be largely mechanical.

Simplistic application of raw citation-counts to evaluate the performance of individual researchers and of research groupings would disadvantage some disciplines, many research groupings, and many individual researchers. Citation-counts will be a factor in quality rankings and/or impact scores; and funding will follow rankings, at least from DEST and the ARC to universities, and possibly to research groupings, and probably to individuals, both from the ARC and through universities' internal allocation mechanisms.

The IS discipline is especially exposed to the risk of simplistic application of citation analysis. It is highly dispersed across departments. In only a small number of cases is IS likely to achieve recognition as a Research Grouping in its own right. For the many reasons identified above, citation-counts will suggest that most IS researchers fall short of the criteria demanded for the higher rankings.

There is a possibility that rankings will genuinely reflect much more than just quantitative measures. There is a possibility that individuals and research groupings will have a meaningful opportunity to make submissions as to why the quantitative measures do not give rise to an appropriate ranking. There is a possibility that those submissions may be given strong weighting by Panels. But the probability of each of these is low, because this is a zero-sum game for a limited pool, and Panel-members are unlikely to be from the IS discipline, and are unlikely to be sympathetic to IS, particularly when such sympathy would be at cost to their own discipline and colleagues.


8. Conclusions

There may be a world in which the electronic library envisioned by Bush and Nelson has come into existence, in which all citations are counted, and in which venues are subject to a weighting scheme that reflects differences among disciplines and research-domains, and that is subject to progressive adaptation.

Back in the real world, however, the electronic library is deficient in a great many ways. It is fragmented and very poorly cross-linked. And the interests of copyright-owners (including discipline associations but particularly the for-profit corporations that publish and exercise control over the majority of journals) are currently in building more and more substantial barriers rather than working towards integration. It remains to be seen whether that will be broken down by the communitarian open access movement, or by the new generation of corporations spear-headed by Google.

In the current and near-future contexts, citation analysis is a very blunt weapon, which should be applied only with great care, but which appears very likely to harm the interests of the less politically powerful disciplines such as IS.


References

In all cases in which URLs are shown, the web-page was last accessed in early April 2006.

Adam D. (2002) 'Citation analysis: The counting house' Nature 415 (2002) 726-729

ARC (2006) 'Research Fields, Courses and Disciplines Classification (RFCD)' Australian Research Council, undated, apparently of 21 February 2006, at http://www.arc.gov.au/apply_grants/rfcd_seo_codes.htm

Bush V. (1945) 'As We May Think' The Atlantic Monthly. July 1945, at http://www.theatlantic.com/doc/194507/bush

Clarke R. (Ed.) (1988) 'Australian Information Systems Academics: 1988/89 Directory' Australian National University, November 1988

Clarke R. (Ed.) (1991) 'Australasian Information Systems Academics: 1991 Directory' Australian National University, April 1991

Clarke R. (2006) 'A Retrospective on the Information Systems Discipline in Australia' Xamax Consultancy Pty Ltd, 26 January 2006, at http://www.anu.edu.au/people/Roger.Clarke/SOS/AISHist0601.html

DEST (2005) 'Research Quality Framework: Assessing the quality and impact of research in Australia: Final Advice on the Preferred RQF Model' Department of Education, Science & Training, December 2005, at http://www.dest.gov.au/sectors/research_sector/policies_issues_reviews/key_issues/research_quality_framework/final_advice_on_preferred_rqf_model.htm

DEST (2006) 'Research Quality Framework Guidelines Scoping Workshop: Workshop Summary' Department of Education, Science & Training, 9 February 2006, at http://www.dest.gov.au/NR/rdonlyres/1D5A7163-A754-48B7-88B7-5530F486EDD4/9772/RQFGuidelinesScopingWorkshopOutcomesFINAL17March06.pdf

Gable G. & Clarke R. (Eds.) (1994) 'Asia Pacific Directory of Information Systems Researchers: 1994' National University of Singapore, 1994

Gable G. & Clarke R. (Eds.) (1996) 'Asia Pacific Directory of Information Systems Researchers: 1996' National University of Singapore, 1996

Garfield E. (1964) 'Science Citation Index - A New Dimension in Indexing' Science 144, 3619 (8 May 1964) 649-654, at http://www.garfield.library.upenn.edu/essays/v7p525y1984.pdf

Hauffe H. (1994) 'Is Citation Analysis a Tool for Evaluation of Scientific Contributions?' Proc. 13th Winterworkshop on Biochemical and Clinical Aspects of Pteridines, St.Christoph/Arlberg, 25 February 1994, at http://www.uibk.ac.at/ub/mitarbeiter_innen/publikationen/hauffe_is_citation_analysis_a_tool.html

Lamp J. (2005) 'The Index of Information Systems Journals', Deakin University, version of 16 August 2005, at http://lamp.infosys.deakin.edu.au/journals/index.php

Leydesdorff L. (1998) 'Theories of Citation?' Scientometrics 43, 1 (1998) 5-25, at http://users.fmg.uva.nl/lleydesdorff/citation/index.htm

MacLeod D. (2006) 'Research exercise to be scrapped' The Guardian, 22 March 2006, at http://education.guardian.co.uk/RAE/story/0,,1737082,00.html

MacRoberts M.H. & MacRoberts B.R. (1997) 'Citation content analysis of a botany journal' J. Amer. Soc. for Infor. Sci. 48 (1997) 274-275

PBRF (2005) 'Performance-Based Research Fund', N.Z. Tertiary Education Commission, July 2005, at http://www.tec.govt.nz/downloads/a2z_publications/pbrf2006-guidelines.pdf

Perkel J.M. (2005) 'The Future of Citation Analysis' The Scientist 19, 20 (2005) 24

RAE (2001) 'A guide to the 2001 Research Assessment Exercise', U.K. Department for Employment and Learning, apparently undated, at http://www.hero.ac.uk/rae/Pubs/other/raeguide.pdf

RAE (2005) 'Guidance on submissions' Department for Employment and Learning, RAE 03/2005, June 2005, at http://www.rae.ac.uk/pubs/2005/03/rae0305.pdf

Saunders C. (2005) 'Bibliography of MIS Journals Citations', Association for Information Systems, undated but apparently of 2005, at http://www.isworld.org/csaunders/rankings.htm


Appendix 1: Thomson/ISI Cited Ref Search

This Appendix provides comparisons of results obtained using the ISI General Search and ISI Cited Ref Search. A small number of academics were selected, from among expatriates, local researchers, North Americans and Europeans. All but one (this author) were selected because of their relatively uncommon names, in order to ease the difficulties of undertaking the searches and thereby achieve reasonable quality in the data, and no significance should be inferred from inclusion in or exclusion from this short list.

In this table, the first three columns show the number of citations for each author of articles that are in the ISI database, together with the count of those articles, and the largest citation-count found. This data should correspond with that for the same researcher in Exhibits 3 and 4, but in practice there are many small variants, some caused by the 3-month gap between the studies. The next three columns show the same data for articles by the author that are not in the ISI database. The final two columns show the sum of the two Citation-Count columns, and the apparent Expansion Factor, computed by dividing the Total Citations by the Citation-Count for articles in the ISI database.
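
For clarity, the Expansion Factor computation is shown below, with Ron Stamper's row as the worked example.

    def expansion_factor(in_isi_citations, not_in_isi_citations):
        # Total Citations divided by the citation-count for articles
        # that are themselves in the ISI database.
        total = in_isi_citations + not_in_isi_citations
        return round(total / in_isi_citations, 1)

    print(expansion_factor(59, 255))  # Ron Stamper: 314 / 59 -> 5.3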

Exhibit 5: Thomson/ISI Cited Ref Search

Researcher | Cites (in ISI) | Articles (in ISI) | Highest (in ISI) | Cites (not in ISI) | Articles (not in ISI) | Highest (not in ISI) | Total Citations | Expansion Factor

North Americans
Sirkka Jarvenpaa (as S.L.) | 973 | 34 | 110 | 575 | 118 | 88 | 1548 | 1.6
Peter Keen (as P.G.W.) | 425 | 14 | 190 | 1625 | 325 | 463 | 2050 | 4.8
Eph McLean (as E. and E.R.) | 132 | 14 | 41 | 84 | 29 | 42 | 216 | 1.6
(Eph McLean, corrected) | 132 | 14 | 41 | 532 | 20 | 448 | 664 | 5.0

Europeans
David Avison (as D.) | 66 | 9 | 46 | 99 | 37 | 16 | 165 | 2.5
Ron Stamper (as R. and R.K.) | 59 | 13 | 16 | 255 | 149 | 19 | 314 | 5.3
Frank Land (as F.) | 74 | 16 | 19 | 161 | 88 | 25 | 235 | 3.2

Expatriates
Iris Vessey (as I.) | 622 | 32 | 114 | 186 | 52 | 76 | 808 | 1.3

Locals
Philip Yetton (as P. and P.W.) | 278 | 20 | 59 | 65 | 51 | 6 | 343 | 1.2
Peter Seddon (as P. and P.B.) | 81 | 4 | 69 | 67 | 30 | 22 | 148 | 1.8
Graeme Shanks (as G.) | 66 | 10 | 15 | 51 | 32 | 7 | 117 | 1.8
Paula Swatman (as P.M) | 43 | 4 | 31 | 102 | 39 | 23 | 145 | 3.4
Roger Clarke (as R. and R.A.) | 41 | 11 | 17 | 176 | 131 | 8 | 217 | 5.3
Guy Gable (as G.G.) | 35 | 4 | 29 | 73 | 24 | 39 | 108 | 3.1

Appendix 2: Thomson/ISI cf. Google Comparisons for Selected Researchers

This Appendix provides detailed comparisons of results for both ISI and Google Scholar. Seven Australian academics were selected, from among both expatriates and local researchers. All but one (this author) were selected because of their relatively uncommon names, in order to ease the difficulties of undertaking the searches and thereby achieve reasonable quality in the data, and no significance should be inferred from inclusion in or exclusion from this short list.

In each of the following tables:

Exhibit 6A: Thomson/ISI cf. Google - Iris Vessey

Google Count | Thomson Count | Venue
145 | 111 | Journal
92 | Unindexed (!!) | Journal (ISR)
88 | 83 | Journal
86 | Unindexed (!!) | Journal (CACM)
56 | 26 | Journal
52 | Unindexed | Conference (ICIS)
52 | Unindexed | Journal (IJMMS)
48 | Unindexed (!!) | Journal (CACM)
41 | Unindexed (!!) | Journal (IEEE Software)
31 | 9 | Journal
691 or 320 | 229 (of 601) | Totals

Exhibit 6B: Thomson/ISI cf. Google - Ron Weber

Google Count | Thomson Count | Venue
125 | 38 | Journal
102 | Unindexed | Journal (JIS)
106 | 30 | Journal
87 | 36 | Journal
72 | 26 | Journal (Commentary)
65 | 20 | Journal (Commentary)
45 | Unindexed | Book
34 | Unindexed | Journal (JIS)
34 | 22 | Journal
31 | 24 | Journal
701 or 520 | 196 (of 328) | Totals

Exhibit 6C: Thomson/ISI cf. Google - Philip Yetton

Google Count | Thomson Count | Venue
302 | Unindexed | Book
55 | 11 | Journal
42 | 12 | Journal
32 | 12 | Journal
31 | 34 | Journal (1988)
27 | Unindexed | Book
26 | 57 | Journal (1982)
20 | 23 | Book
18 | 6 | Journal (1985)
18 | Unindexed | Government Report
571 or 224 | 155 (of 270) | Totals

Exhibit 6D: Thomson/ISI cf. Google - Peter Seddon

Google Count | Thomson Count | Venue
133 | 60 | Journal
47 | Unindexed | Journal (CAIS)
43 | Unindexed | Conference (ICIS)
33 | Unindexed | Conference
22 | Unindexed (!) | Journal (DB, 2002)
24 | 2 | Journal (I&M, 1991)
18 | 2 | Journal
18 | Unindexed | Journal (JIS)
13 | Unindexed | Conference (ECIS)
9 | 0 | Journal (JIT, Editorial)
360 or 184 | 64 (of 70) | Totals

Exhibit 6E: Thomson/ISI cf. Google - Paula Swatman

Google Count | Thomson Count | Venue
117 | 29 | Journal
73 | Unindexed | Journal (Int'l Mkting Rev)
61 | Unindexed | Journal (TIS)
43 | Unindexed | Journal (JSIS)
29 | Unindexed (!) | Journal (IJEC)
26 | 12 | Journal
26 | Unindexed | Journal (JIS)
24 | 6 | Journal
22 | Unindexed | Conference
20 | Unindexed | Journal (EM)
441 or 167 | 47 (of 53) | Totals

Exhibit 6F: Thomson/ISI cf. Google - Roger Clarke

Position | Google Count | Thomson Count | Venue
57 | 81 | 14 | Journal
59 | 85 | 16 | Journal
102 | 60 | Unindexed | Journal (IT&P)
148 | 47 | Unindexed | Journal (TIS)
253 | 33 | Unindexed | Conference
325 | 28 | Unindexed | Conference
373 | 25 | 3 | Journal
407 | 23 | Unindexed | Journal (JSIS)
539 | 18 | Unindexed | Journal
558 | 17 | Unindexed | Conference
- | 417 or 191 | 33 (of 44) | Totals

Exhibit 6G: Thomson/ISI cf. Google - Guy Gable

Google Count | Thomson Count | Venue
102 | Unindexed (!) | Journal (EJIS, 1994)
56 | Unindexed (!) | Journal (ISF, 2000)
40 | Unindexed | Journal (JGIM)
27 | 6 | Journal (MS)
27 | Unindexed | Conference
24 | 23 | Journal (I&M, 1991)
23 | Unindexed | Conference
14 | Unindexed | Conference
13 | Unindexed | Conference
10 | Unindexed | Conference
336 or 51 | 29 (of 34) | Totals

Acknowledgements

This paper has benefitted from comments from several colleagues on an earlier draft, in particular from Peter Seddon of the University of Melbourne, who identified a methodological flaw that needed to be addressed. Responsibility for the work rests, however, entirely with the author.


Author Affiliations

Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., a Visiting Professor in the E-Commerce Programme at the University of Hong Kong, and a Visiting Professor in the Department of Computer Science at the Australian National University.


