Roger Clarke's Web-Site
© Xamax Consultancy Pty Ltd, 1995-2021
Principal, Xamax Consultancy Pty Ltd, Canberra
Visiting Fellow, Department of Computer Science, Australian National University
Version of 28 November 1993
Published in the Journal of Law and Information Science 4,2 (December 1993)
© Xamax Consultancy Pty Ltd, 1998
This document is at http://www.rogerclarke.com/DV/PaperProfiling.html
Profiling is a data surveillance technique which is little-understood and ill-documented, but increasingly used. It is a means of generating suspects or prospects from within a large population, and involves inferring a set of characteristics of a particular class of person from past experience, then searching data-holdings for individuals with a close fit to that set of characteristics.
It is rather different from better-known data surveillance techniques such as front-end verification and data matching. It raises rather different issues, and requires rather different regulatory measures. This paper surveys the limited information available, defines and describes the technique and its social implications, and argues the case for action by regulators.
Data surveillance, usefully abbreviated to dataveillance, is the systematic use of personal data systems in the investigation or monitoring of the actions or communications of one or more persons. It is supplanting more traditional forms of surveillance because it is cheap and effective.
Personal dataveillance involves subjecting an identified individual to monitoring, whereas in mass dataveillance groups of people are monitored in order to generate suspicion about particular members of the population. Personal dataveillance techniques include transaction-triggered screening, front-end verification, front-end audit and cross-system enforcement. Mass dataveillance techniques include general use of the above techniques, without any transaction to trigger them, plus additional tools such as profiling.
Personal and mass dataveillance can be facilitated by concentrating data from hitherto separate sources. Alternatively, a more comprehensive data collection can be assembled using computer matching (as the technique is called in the United States) or data matching (as it is referred to in Australia). For a comprehensive review of dataveillance, see Clarke (1988). See also Rule (1974), Smith (1974-), Kling (1978), Rule et al (1980), OTA (1985), OTA (1986), Laudon (1986), Flaherty (1989), Bennett (1992), Clarke (1992), and Madsen (1992).
Profiling is a particular dataveillance technique which is little documented. No active measures have been identified anywhere in the world which have been designed to explicitly subject it to controls. The purposes of this paper are:
It has proven extremely difficult to undertake original empirical research into profiling practices. Considerable difficulties were previously encountered by the author during the period 1987-92, in undertaking a study of data matching (Clarke 1992). Particularly in Australia, general enquiries and freedom of information requests were countered by invocation of exemption clauses. There was also very little published, and very little of that was by researchers independent of the organisations concerned. During the last few years, openness in relation to data matching has improved markedly. In the United States, this resulted from the firm congressional and presidential support for computer matching programs and the extent to which they have become embedded in agency practices, followed by the Computer Matching and Privacy Protection Act of 1988, which expressly required a degree of publicity for such programs. In Australia, the decreased secretiveness followed the passage of the Privacy Act 1988, the inclusion in that statute of explicit mention of data matching, and the interest shown in the topic by the Privacy Commissioner since his appointment in 1989.
No such liberalisation has yet occurred in either the United States or Australia concerning profiling. Accordingly, this paper has been developed predominantly by reflection on technological capabilities, anecdotes and unofficial information, and through use of the limited secondary sources. This is clearly unsatisfactory, but so too would be the continued absence of a critical literature on the topic.
The sense in which the term 'profile' is used in this paper is "2. ... [the] schematic representation of [a] person's interests for use in information retrieval" (Concise Oxford, 1976, p.885). The term 'profiling' refers to the process of creating and using such a profile.
There appear to be few authoritative definitions in the literature. One which is oriented specifically toward law enforcement uses is "correlating a number of distinct data items in order to assess how close a person comes to a predetermined characterisation or model of infraction" (Marx & Reichman, 1984, p.429). Another, oriented toward uses by the direct marketing industry, is the application of statistical techniques such as regression analysis, non-responder segmentation and models for recency, frequency and monetary value of purchases to find out which consumers are good prospects for an offer and which are not (Novek et al 1990, p.529, referencing Stevenson 1987).
In order to encompass both public and private sector applications, the following is proposed as a working definition:
Profiling is a technique whereby a set of characteristics of a particular class of person is inferred from past experience, and data-holdings are then searched for individuals with a close fit to that set of characteristics.
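The working definition can be illustrated with a minimal sketch. It is not drawn from any documented profiling scheme: the attribute names, the use of a simple mean to infer the profile, the per-attribute tolerances and the fit threshold are all hypothetical choices made for illustration only.

```python
from statistics import mean

def infer_profile(past_cases, attributes):
    """Infer a profile: the mean of each attribute across known past cases."""
    return {a: mean(case[a] for case in past_cases) for a in attributes}

def fit_score(record, profile, tolerances):
    """Fraction of profile attributes on which the record lies within tolerance."""
    hits = sum(1 for a, target in profile.items()
               if abs(record[a] - target) <= tolerances[a])
    return hits / len(profile)

def search(holdings, profile, tolerances, threshold):
    """Return the ids of records whose fit score meets the threshold."""
    return [r["id"] for r in holdings
            if fit_score(r, profile, tolerances) >= threshold]

# Hypothetical 'past experience': cases already known to belong to the class.
past_cases = [
    {"claims_per_year": 4, "address_changes": 3},
    {"claims_per_year": 6, "address_changes": 5},
]
# Hypothetical data-holdings to be searched.
holdings = [
    {"id": "A", "claims_per_year": 5, "address_changes": 4},
    {"id": "B", "claims_per_year": 1, "address_changes": 0},
    {"id": "C", "claims_per_year": 5, "address_changes": 0},
]

profile = infer_profile(past_cases, ["claims_per_year", "address_changes"])
tolerances = {"claims_per_year": 1, "address_changes": 1}
suspects = search(holdings, profile, tolerances, threshold=1.0)
print(suspects)
```

Lowering the threshold widens the net: with `threshold=0.5`, record C is also returned although it matches on only one of the two characteristics. The choice of threshold therefore determines how many people are drawn into suspicion on the strength of a partial fit.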
This author's research has been unable to locate any comprehensive review of actual applications of profiling. In particular, most Government publications (e.g. PCIE 1981) are generally unhelpful. One early reference to profiling referred to use by the United States Internal Revenue Service to predict the 'audit potential' of individuals' tax returns (Rule 1974, p.282). Marx & Reichman (1984) provided some evidence of uses in law enforcement. The Office of Technology Assessment of the U.S. Congress noted that most U.S. federal agencies had applied the technique to develop a wide variety of profiles including drug dealers, taxpayers who underreport their income, likely violent offenders, arsonists, rapists, child molesters, and sexually exploited children (OTA, 1986, pp.87-95).
The manifold potential applications in governmental contexts are variously supportive of the interests of the individuals they identify, or of society at large, or benign, or inimical to those interests. To provide a feel for the diversity of its potential, consider the following possible target groups:
In the private sector, profiling can be applied to such matters as the location of employees with particular education, experience and language-skills. The primary use has been, however, in the identification of customers likely to be interested in buying a new product or service. There has been a shift in marketing budgets away from advertising in the mass media towards direct marketing, or 'individualised mass marketing': "sufficiently detailed information on the buying habits and personal preferences of individuals ... enable firms to create individual messages for each consumer.
This need for accurately identifying buyers, combined with the technological capability of 'massaging' and manipulating massive quantities of data about thousands of people in a coherent fashion, has spurred a massive reworking of the methods used by marketers in reaching and controlling potential customers" (Mukherjee & Samarajiva 1993, p.52. See also Novek et al 1990, p.526). Much of the direct marketing literature compromises careful analysis with hyperbole, and confuses the present with futures desired by the author, or his employer or clients. See, however, Stevenson (1986, 1987), and reviews of direct marketing practices in Novek et al (1990), Burns et al (1992), Larsen (1992), Mukherjee & Samarajiva (1993) and Gandy (1993).
The steps in the profiling process can be abstracted as follows:
Profiling can be conceived as a sophisticated variant of single-factor screening techniques (which are conducted at the time a transaction is processed), or single-factor file-analysis (conducted at some subsequent time). Screening involves the comparison of just a single characteristic against:
Profiling constitutes multi-factor screening (if conducted on transactions) or multi-factor file-analysis (if conducted at some subsequent time).
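The distinction between single-factor and multi-factor screening can be sketched as follows. The characteristics, the fixed limits they are compared against, and the rule that a set number of factors must be exceeded are assumptions made for illustration, not features of any documented scheme.

```python
def single_factor_screen(transaction, attribute, limit):
    """Flag a transaction when one characteristic exceeds a fixed limit."""
    return transaction[attribute] > limit

def multi_factor_screen(transaction, limits, min_factors):
    """Flag a transaction when at least min_factors characteristics
    exceed their limits; also report which characteristics did so."""
    exceeded = [a for a, limit in limits.items() if transaction[a] > limit]
    return len(exceeded) >= min_factors, exceeded

# Hypothetical transaction and limits.
txn = {"amount": 9500, "claims_this_year": 4, "address_changes": 0}
limits = {"amount": 5000, "claims_this_year": 3, "address_changes": 2}

single = single_factor_screen(txn, "amount", 5000)
flagged, reasons = multi_factor_screen(txn, limits, min_factors=2)
print(single, flagged, reasons)
```

Applied as each transaction is processed, these functions correspond to screening; looped over stored records at some later time, the same functions correspond to file-analysis.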
Profiling may be based entirely on data which the organisation already holds. More commonly, it draws on the data-holdings of multiple organisations using the facilitative techniques of data concentration and/or data matching (Clarke 1988, pp.504-5, Clarke 1992). There is an apparent incentive for organisations conducting profiling not only to search out and acquire existing data from various sources, but also to undertake or stimulate new data collection. In some cases, static demographic data is sufficient; more commonly, however, the need is for an ongoing stream of 'transaction generated information' (McManus 1990, Mukherjee & Samarajiva 1993, Gandy 1993). This is a facet of what was referred to by Rule et al (1980) as the increasing 'information intensity' of modern society.
Like any other technique, profiling is neither good nor evil. It is capable of application for very worthwhile purposes. Unfortunately, as has been the case with data matching (Clarke 1992, pp.41-6), there is little on the public record which evidences serious attempts to assess the real value of profiling by government agencies. The benefits of marketing applications, however, can be summarised as:
This paper's primary concern is with the 'downside' of profiling, i.e.:
Profiling is a mass dataveillance as distinct from a personal dataveillance technique: it does not involve the monitoring of an identified individual for a specific reason, but is instead concerned with finding individuals about whom to be suspicious, who can then be subjected to personal surveillance. It therefore has all of the potential negative impacts of mass dataveillance techniques generally, as outlined in Exhibit 1. The more general, social impacts are best discussed in the context of dataveillance in general, rather than of one particular technique (see Clarke 1988). This paper accordingly focuses on the dangers to the individual.
In the case of private sector use, concerns exist about selectiveness in advertising. Profiling has considerable potential to improve the efficiency with which companies undertake marketing communications with their customers and prospects. At some point, however, selectivity in advertising crosses a boundary to become consumer manipulation (Packard 1957, Larsen 1992). Novek et al (1990) perceive dangers that go beyond mere moral arguments: "profiles ... allow companies to pre-judge the future behavior of consumers, leading some of these firms to ignore certain types of people, and thereby limiting such persons' access to information about goods and services" (p.533). They suggest that the combination of consumer profiling with 'geodemographic clustering' techniques is inevitably leading to "'electronic redlining', where calls from low-income neighborhoods identified by their telephone exchange, can be routed to a busy signal, a long queue, or a recorded message suggesting that the desired information service is not presently available" (p.535).
More generally, "the segmentation and marginalisation of consumer information markets further limits the availability of information necessary for informed consumer choice, while simultaneously increasing consumer dependence upon the direct marketer's tightly managed information stream. The result is a market dominated by sellers ... The wider this information gap, the more difficult it becomes to ensure the equitable and efficient working of the marketplace" (p.536).
Exhibit 1: Dangers of Mass Dataveillance To The Individual. From: Clarke (1988), p.505
In the case of public sector applications, several issues are of importance:
Associated with these concerns is the extent to which judgmental valuations enter into applications of profiling. Cultural, racial and gender biases, for example, are inevitable, because of the facts of the matter (e.g. the arrest rates of Aboriginal people in Australia, and persons of negro and latino origin in the United States, are higher than those for white people), the way in which data is collected, organised and presented (e.g. more data is collected about people of lower socio-economic origins, because they are more commonly applicants for benefits), and the way in which characteristics are inferred (i.e. the people who prepare the profiles bring with them their own theories about which kinds of people are prone to behave in the manner being targeted).
A variety of factors might act to prevent unreasonable uses of profiling, and constrain unreasonable practices in relation to such profiling as is done. These factors are summarised in Exhibit 2.
An organisation might decide on the basis of 'good corporate citizenship', or good faith or fair dealings with their clients, or good relations with their customers, not to use the technique, or to apply particular controls to ensure that the procedures are not unfair. In addition, it is conceivable that employees and contractors who are important to the process may regard themselves as constrained by the code of ethics of their professional body. This might preclude use of the technique at all, use of it for particular purposes, or use of it without particular features protective of the data subjects. There is little evidence, however, of such mechanisms having significant impact on the use of dataveillance practices generally, or of profiling in particular.
Stated government policies regarding such matters as fairness, equity and anti-discrimination might act as constraints, as may codes of conduct, undertakings or policy stances by oversight agencies in the public sector or industry associations in the private sector. There have been instances of constraints on dataveillance arising in this manner, but none is apparent in relation to profiling.
There may be circumstances in which another organisation acts in such a way that an organisation feels itself constrained to not use profiling, or to apply protections. In particular, a competitor may use negative advertising to project its much greater respect for the privacy of its customers; or a government agency whose participation is crucial to the project may decide not to make its data available because of the scheme's privacy-invasive or discriminatory nature. Once again, some instances of this mechanism in operation can be seen in both Australia and the United States in relation to dataveillance practices generally, but not to profiling.
Public opinion can have an effect on profiling practices. Companies which are known to use the technique can be avoided, and publicly vilified, by consumers and their representatives. The passage of the U.S. Video Privacy Protection Act of 1988 (following the publication of Justice Bork's video rental history), the withdrawal of the Lotus/Equifax Marketplace product in 1990, and the cancellation of the Blockbuster video rental profile database in 1991 (see Mukherjee & Samarajiva 1993, p.51) provide evidence that public opinion can be effective in constraining use of profiling and its facilitative mechanisms.
These constraints are, however, entirely dependent upon the practices and their implications becoming common knowledge: they are entirely non-operative if the organisation is successful in suppressing the fact of its use of the technique. Moreover, government agencies are generally much less responsive to pressure from public interest groups through the media. They do, however, tend to appreciate the point better if their Ministers or Secretaries of State, encouraged by the party's constituents and financiers, require them to adapt their procedures.
Another constraining factor may be the insufficiency of the available infrastructure. The hardware, networking and software capabilities to support the project must be in place, or able to be acquired or developed; for example, there may be doubts about the ability of the participants to develop suitable algorithms. Historical data must be available to support the development of a profile; and current data must be available which can be run against the profile in order to generate suspects or prospects. Given the demand powers of government agencies, the serious weaknesses of controls over flows of personal data in the public sector, and the almost complete absence of controls in the private sector, it cannot be expected that these factors have represented a significant brake on profiling activities.
Allowance must be made for circumstances in which the organisation or organisations concerned are unable to bring a scheme to fruition, despite its technical feasibility. Inadequacies in the skills of individuals in writing software and running it against data-holdings may act as a (probably fairly limited) constraint. Difficulties in bilateral and multilateral negotiations among government agencies, and among companies, have been a more effective defence against dataveillance techniques. They have been particularly pronounced between layers of government (i.e. federal, state and local), and between countries. A variety of measures have been adopted by individual governments, and between pairs and among sets of governments, in order to overcome these limitations.
Finally, it would be expected that economic factors would act as a constraint on schemes which were not worthwhile, or which were only worthwhile once, or occasionally, rather than as regular and ongoing programmes. The method used for such economic evaluation is generally referred to as cost-benefit analysis or CBA (Sassone & Schaffer 1978, Thompson 1980, Gramlich 1981, DOF 1991, DOF 1993). In practice, both the extent to which formal CBA techniques are applied in government and the quality of their application leave a great deal to be desired (Clarke 1992, 1993). CBA has seldom acted as a constraint in the public sector in the United States or Australia. It may be more effective in the private sector, where financial viability is a more immediate consideration.
Given that profiling is potentially highly intrusive, and that intrinsic control mechanisms appear to act as at best patchy and partial constraints on unreasonable uses of profiling, it is important to give consideration to regulatory mechanisms.
This author's research has to date unearthed no regulatory measures dealing explicitly with profiling. Moreover, long-standing common law protections such as the laws of confidence and defamation, and privacy and data protection measures, were conceived without any understanding of the technique. Such protections as do exist are therefore generic, or accidental and incidental.
One provision which exists in some form in most statutes is that referred to by the OECD Guidelines (1980) as the Openness Principle, i.e. "there should be a general policy of openness about developments, practices and policies with respect to personal data". Unfortunately the implementation of the Principle in most countries falls far short of that aspiration, particularly because of the wording chosen by Parliamentary Draftsmen to implement it, and the manifold exemptions and exceptions provided. The Australian Privacy Act 1988, for example, requires only that "a record-keeper ... shall ... take such steps as are, in the circumstances, reasonable to enable any person to ascertain ... the nature of [personal information held] ... [and] the main purposes for which that information is used" (Principle 5). Unsurprisingly, disclosure of profiling activities is rare, even in response to direct requests for information.
In most countries, there are constraints on the collection, storage, disclosure and use of personal data, and these would appear to act as controls on profiling, as for other dataveillance techniques. In fact, much of the apparent control is illusory. Many agencies are wholly or partly exempt, and many exceptions exist (for example, in the Information Privacy Principles in the Australian Privacy Act, the qualifier 'reasonable' occurs fifteen times, and 'protection of the public revenue' is a sufficient ground for use and disclosure of personal data). Agencies generally claim practices to be 'authorised by law' simply because they are not prohibited, and are generally consistent with the agency's mission. The private sector in both the United States and Australia is subject to only very limited regulation of dataveillance practices.
Profiling is an important application of information technology, but also one which embodies considerable dangers to individuals and society. On the basis of the limited evidence publicly available, the conclusion is inescapable that the law in most countries provides only very limited means of constraining, or even ensuring public knowledge about, the use of profiling. Profiling is largely being conducted without public knowledge, without justification, and without appropriate safeguards.
Time works in the favour of even the most objectionable uses of dataveillance techniques, because Parliaments, Ministers and regulatory agencies are hesitant to reverse long-established practices: they tend to be convinced by the argument that the activity must have an economic justification because it would not otherwise have been in existence for so long. Meanwhile technological advances are resulting in further improvements to the effectiveness and the economics of the technique, and in increasingly large pools of accessible personal data.
Government agencies and corporations are taking advantage of the lack of regulation to apply profiling as they see fit. Intrinsic controls over dataveillance techniques have been repeatedly shown to be utterly inadequate. Profiling therefore demands far more attention than has to date been given to it by researchers, by executives in both the public and private sector, by regulatory agencies, and by legislators.
Due to the copious weaknesses in existing privacy protections, the inventiveness of agencies in circumventing them, and the ravages of technological change, many authors have argued the urgent need for 'second-generation' privacy protective legislation (e.g. Laudon 1986, Clarke 1988, Flaherty 1989, Clarke 1992, Bennett 1993). Further piecemeal measures could be considered, including the extension of existing statutes to cope with profiling. The challenge to legislators is, however, much broader than profiling alone. The nature and scope of privacy protections need to be re-examined, and a much more comprehensive framework provided. Such a framework would remove the exemptions and exceptions, address the social and economic justification for privacy-invasive programmes, and empower a suitably resourced 'watchdog' agency to study all uses of dataveillance techniques, and submit each of them to an appropriate and detailed regulatory regime.
Bennett C. (1992) 'Regulating Privacy: Data Protection and Public Policy in Europe and the United States' Cornell University Press, New York, 1992
Bennett C. (1993) 'The Public Surveillance of Personal Data: Comparative Responses and Policy Options' Working Paper, Department of Political Science, Uni. of British Columbia (August 1993)
Burns R.E., Samarajiva R. & Mukherjee R. (1992) 'Utility Customer Information: Privacy and Competitive Implications' NRRI 92-11, National Regulatory Research Institute (September 1992)
Clarke R.A. (1988) 'Information Technology and Dataveillance' Comm. ACM 31,5 (May 1988) Re-published in C. Dunlop and R. Kling (Eds.), 'Controversies in Computing', Academic Press, 1991
Clarke R.A. (1992) 'Computer Matching: A Normative Regulatory Framework' Working Paper, Dept of Commerce, Australian National Uni. (August 1992)
Clarke R.A. (1993) 'Cost/Benefit Analysis as a Control Over Privacy-Intrusive Data Processing: Experience in the Computer Matching Area' Working Paper, Dept of Commerce, Australian National Uni. (110 pp.) (August 1993)
DOF (1991) 'Handbook of Cost-Benefit Analysis' Commonwealth Department of Finance, Canberra, 1991
DOF (1993) 'Value for Your IT Dollar: Guidelines for Cost-Benefit Analysis of Information Technology Proposals' Commonwealth Department of Finance, Canberra (August 1993)
EC (1992) 'Amended Proposal for a Council Directive on the protection of individuals with regard to the processing of personal data and on the free movement of such data' Council of the European Communities, Brussels, October 15, 1992, document COM(92) 422 Final - SYN 287, reproduced in Transnational Data and Communications Report (Nov/Dec 1992)
Flaherty D.H. (1989) 'Protecting Privacy in Surveillance Societies' Uni. of North Carolina Press, 1989
Gandy O.H. (1993) 'The Panoptic Sort. Critical Studies in Communication and in the Cultural Industries' Westview, Boulder CO, 1993
Gramlich E.M. (1981) 'Benefit-Cost Analysis of Government Programs' Prentice-Hall 1981
Kling R. (1978) 'Automated Welfare Client-Tracking and Service Integration: The Political Economy of Computing' Commun. ACM 21,6 (June 1978) 484-493
Larsen E. (1992) 'The Naked Consumer: How Our Private Lives Become Public Commodities' Henry Holt and Company, New York, 1992
Laudon K.C. (1986) 'Dossier Society: Value Choices in the Design of National Information Systems' Columbia U.P., 1986
McManus T.E. (1990) 'Telephone Transaction-Generated Information: Rights and Restrictions' Program on Information Resources Policy, Harv. Uni., Cambridge, 1990
Madsen W. (1992) 'Handbook of Personal Data Protection' MacMillan, London, 1992
Marx G.T. & Reichman N. (1984) 'Routinising the Discovery of Secrets' Am. Behav. Scientist 27,4 (Mar/Apr 1984) 423-452
Mukherjee R. & Samarajiva R. (1993) 'The transaction web' Media Information Australia, 1993
Novek E., Sinha N. & Gandy O. (1990) 'The Value of Your Name' Media, Culture & Society 12 (1990) 525-543
OECD (1980) 'Guidelines for the Protection of Privacy and Transborder Flows of Personal Data' Organisation for Economic Cooperation and Development, Paris, 1980
OTA (1985) 'Federal Government Information Technology: Electronic Surveillance and Civil Liberties' OTA-CIT-293, U.S. Govt Printing Office, Washington DC, Oct 1985
OTA (1986) 'Federal Government Information Technology: Electronic Record Systems and Individual Privacy' OTA-CIT-296, U.S. Govt Printing Office, Washington DC, Jun 1986
Packard V. (1957) 'The Hidden Persuaders' Penguin, London, 1957
PCIE (1981) 'Inventory of Federal Computer Applications to Prevent/Detect Fraud, Waste and Mismanagement' U.S. Dept. of Labor, undated, c. 1981?
Rule J.B. (1974) 'Private Lives and Public Surveillance: Social Control in the Computer Age' Schocken Books, 1974
Rule J.B., McAdam D., Stearns L. & Uglow D. (1980) 'The Politics of Privacy' New American Library, 1980
Sassone P.G. & Schaffer W.A. (1978) 'Cost-Benefit Analysis: A Handbook' Academic Press, 1978
Smith R.E.(Ed.) (1974-) 'Privacy Journal' monthly since November 1974
Stevenson J. (1986) 'The Next Direct Marketing' Direct Marketing (October 1986)
Stevenson J. (1987) 'The History and Family Tree of 'Databased' Direct Marketing' Direct Marketing (December 1987)
Thompson M. (1980) 'Benefit-Cost Analysis for Program Evaluation' Sage, 1980
Created: 13 October 1995 - Last Amended: 24 June 1998 by Roger Clarke - Site Last Verified: 15 February 2009