Roger Clarke's 'Computer Science Research Ethics'

© Xamax Consultancy Pty Ltd, 1995-2024

Ethical Considerations in Computer Science Research

Notes of 11 October 2013
For a Panel Session run by CSIRO IR & Friends and ANU RSCS IHCC on 14 October 2013

Roger Clarke **

Available under an AEShareNet licence or a Creative Commons licence.

This document is at http://www.rogerclarke.com/SOS/CSREth.html

Introduction

My first foray into research ethics was in 1976, when I co-authored a set of Privacy Guidelines for Market Research, for the then NSW Privacy Committee and the Market Research Society. My publication indexes tell me that, over the last 25 years, I've contributed something to Ethics discussions about every 2 years.

In 2008, I worked with two others in a Task Force to substantially revise the Code of Research Conduct for the Association for Information Systems (AIS). A great deal of the Code has to do with behaviour as a professional researcher including as an author, editor, reviewer and supervisor. It has a particularly strong focus on the moral panic of the period, plagiarism by academics. This Panel session, on the other hand, is concerned with researchers' exploitation of people.

There are many different research techniques used in IS and CS. As I understand it, the focus of this Panel is intended to be on:

field research, where people's behaviour is observed in natural settings
controlled experiments, where people's behaviour is observed in contrived and controlled settings, typically in a laboratory
surveys, where people provide data in response to verbal stimuli (interviews and focus groups), or to written stimuli (questionnaires)
secondary research, using data arising from other activities, such as log files, databases, search-terms and social media

These notes accordingly commence with a brief review of the ethical framework within which research ought to operate, and then consider issues arising firstly in relation to observations of human behaviour, and secondly in relation to the handling of personal data.

Human Rights

Many health care researchers and professionals use the word 'patient' in a way that makes it sound like 'trainee cadaver'. Social, IS and CS researchers do something similar, by using the dehumanising term 'research subjects' when they mean 'people'. So the first step in a discussion of ethical behaviour is to talk about 'person', 'people' and 'human beings', not 'subjects'.

A substantial body of literature relates to the rights of human beings. Unfortunately, the laws relating to human rights are in most countries vague, and they vary enormously among jurisdictions. In Australia, most formal rights are incidental and even accidental, rather than being a central plank of the constitution. The frameworks provided by instruments such as the ICCPR (1966) and the Human Rights Act (ACT, 2004) extend to rights to life, constraints on capital punishment, slavery, arbitrary arrest and detention, and suchlike concerns. These are rather too broad for a discussion of CS research ethics.

In my work since the mid-1980s, primarily in eBusiness design and adoption, and secondarily in surveillance, I've found it much more effective to focus on important interests that human beings share, which I usually refer to as 'the dimensions of privacy'. (I've used the first four since at least the mid-1990s, but the fifth is a recent addition yet to appear in any of my refereed papers):

1. privacy of the physical person
2. privacy of personal behaviour
3. privacy of personal communications
4. privacy of personal data
5. privacy of personal experience

Both 1 and 3 have some relevance to IS and CS research, but are not further addressed in these notes. The following section focusses on number 2 - behaviour, and the final section considers 4 - data, and the recently-emerged 5th dimension.

People's Behaviour

So-called privacy laws, apart from being extremely weak, are also very narrow, applying only to the protection of data. Individuals have considerable concerns about observation of their behaviour. This is quite intense in relation to behaviour in private places, but people also have a need and an expectation of private space in public places.

People who feel that they are under observation behave differently. A number of terms apply to this phenomenon, such as the fishbowl effect and the Hawthorne effect. It fundamentally invalidates a great deal of field and experimental research, because what is observed is not necessarily 'what people do', but rather 'what people do when they're being observed'.

Nonetheless, a great deal of observation is done, in an endeavour to achieve progress in the application of IT to human needs. So a regulatory framework is needed for the design, conduct and exploitation of observation-based research techniques. Desirably, that framework needs to be informed by a deep understanding of human needs and human rights.

A fundamental requirement is that the person give their consent to observation, to the provision of stimuli (particularly if they are physically disturbing, such as high-pitched sound, or are emotionally disturbing), and to any recording of image, video, sound or data arising from their behaviour. Consent does not exist unless it has all of the following characteristics:

informed, by which is meant that the individual understands its implications. The consent must be specific and bounded, and it must be clear from the expression, or at least from the context:
- what action(s) the consent authorises
- to whom the authorisation is provided
- for what purpose(s) the authorisation applies; and
- over what time-period it operates
freely-given, by which is meant that there must be no legal compulsion, duress, coercion or undue influence, and any inducements offered must not be so significant as to undermine the person's capacity to exercise their judgement
revocable and variable
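The characteristics listed above can be read as the fields of a consent record that a research system checks before each observation or recording. The following is a minimal sketch in Python under stated assumptions: the class name `Consent`, its field names and the `permits` method are all hypothetical, and 'freely-given' is recorded only as a researcher's attestation, because no code can verify the absence of duress or undue influence. Variation of consent is not modelled here; it could be handled by revoking one record and creating another.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Consent:
    """One person's consent, bounded in the four ways listed above (hypothetical model)."""
    actions: frozenset      # what action(s) the consent authorises
    grantee: str            # to whom the authorisation is provided
    purposes: frozenset     # for what purpose(s) the authorisation applies
    valid_from: date        # start of the time-period over which it operates
    valid_until: date       # end of that time-period
    freely_given: bool      # an attestation only; cannot be verified in code
    revoked_on: Optional[date] = None

    def revoke(self, when: date) -> None:
        """Record that the person has withdrawn their consent."""
        self.revoked_on = when

    def permits(self, action: str, grantee: str, purpose: str, on: date) -> bool:
        """True only if every bound is satisfied and consent has not been revoked."""
        if not self.freely_given:
            return False
        if self.revoked_on is not None and on >= self.revoked_on:
            return False
        return (
            action in self.actions
            and grantee == self.grantee
            and purpose in self.purposes
            and self.valid_from <= on <= self.valid_until
        )
```

A consent given to one laboratory for video recording in a usability study would, in this sketch, return False from `permits` for any other action, recipient or purpose, and for any date outside the period or on or after the date of revocation.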

The need for consent collides headlong with the risk of biassing behaviour through fore-knowledge of the purpose, stimuli or target-behaviour. Various approaches can be taken that have some ethical justification, such as the use of a pretext prior to the observation, but the disclosure of the real purpose afterwards, with the consent to use measures arising from the observation not provided until after that disclosure has occurred. A very poor proxy is the authorisation of a breach of the standards by some organisation that has the legal capacity to render individuals' consent, or some aspect of consent, irrelevant.

A further significant factor is the sensitivity, variously of the context, the stimuli and the behaviours. Some contexts, stimuli and behaviours are sensitive matters for people generally. On the other hand, sensitivity has to be judged by the individual concerned, and not by some designer on their behalf. A great deal of IS and CS research would trigger a sensitivity threshold for only a few people. However, in the CHI area, for example, such design factors as colour-combinations, reverse-blink and strobe-effects have serious physical effects on some people, and images and videos that appear anodyne to most people can give rise to strong emotional reactions in a few.

What guidance is available to IS and CS researchers? The ARC relies on documents published by the NH&MRC. The ARC web-site points firstly to the NH&MRC Australian Code for the Responsible Conduct of Research (2007), which provides little guidance, saying only:

1.8 Respect research participants
Researchers must comply with ethical principles of integrity, respect for persons, justice and beneficence.
Written approval from appropriate ethics committees, safety and other regulatory bodies must be obtained when required.

The ARC then defers to the more detailed NH&MRC National Statement on Ethical Conduct in Human Research (2007, with updates to 2013). This document provides on pp. 19-21 a reasonably workable interpretation of consent as described above. This is undermined, however, by pp. 23-24, which contain two full pages of exceptions under which consent can be fraudulently obtained, or ignored entirely. The requirement that these be approved by an in-house ethics committee does little to dispel the disquiet felt by many people about the abuse of human rights involved in such research processes.

It is not clear on what basis the NH&MRC prescriptions achieve any authority at law in relation to IS and CS research. The NH&MRC has authority, arising under s. 95 and s. 95A of the Privacy Act combined with the Privacy Commissioner's approval of two sets of NH&MRC Guidelines for the public sector and for the private sector (which date to 2000 and 2001 respectively). But this authority relates solely to medical research and health data, and does not extend to anything of relevance to most IS and CS research.

It may well be that IS and CS research is subject to the common law, and that the NH&MRC documents, and Ethics Committee clearance, do not provide researchers and their employers with protection against lawsuits in, say, assault, negligence or breach of confidence.

People's Data

People have high expectations about the privacy of data relating to them, and (contrary to the prognostications of self-interested commentators like Sun's Scott McNealy, Google's Eric Schmidt and Facebook's Mark Zuckerberg) those expectations are increasing, not decreasing. The OAIC's 2013 survey, published on 9 October, is good evidence of that.

Legal protections exist for only a sub-set of people's expectations about the privacy of personal data. Moreover, the protections are subject to a mass of exemptions, exceptions, authorisations, planned loopholes and unplanned loopholes. They are subject to very weak enforcement powers, and Privacy Commissioners generally, and especially the Commonwealth Privacy Commissioner, are extremely timid in applying even such powers as they have.

Law enforcement, and to a considerable extent the law itself, are largely irrelevant to the behaviour of government agencies, corporations, universities and individual researchers. In practice, the main regulatory mechanisms are the threat of exposure by the media, the opprobrium that goes with it, the risk of loss of reputation leading to lower standing and less ready access to research funding, and hence intra-institutional procedures to rein in the most risky practices.

In making decisions about research designs, once again freely-given, informed and revocable consent is central. Some guidance may be found on pp. 25-31 of the NH&MRC National Statement on Ethical Conduct in Human Research (2007, with updates to 2013). However, it is unclear whether, and on what basis, an Ethics Committee can legally authorise misleading statements about research purposes and processes, let alone waive the need for consent.

Personal data may become available to an IS or CS researcher from:

observational techniques
As noted above, these suffer from the risk of measuring not 'what people do', but rather 'what people do when they think they're being observed'
survey techniques
These suffer from the risk of measuring not 'what people do', but rather 'what people say they do'
another source
This suffers the risk of uncertainty about what it was that was measured
other sources
This suffers the further risk of uncertainty about compatibility among the measures

As with the observation of behaviour, the handling of data raises issues about sensitivity. And, again, some general statements are possible, such as that most people are particularly concerned about health and financial data, and about political attitudes and religious beliefs. And, again, many people have sensitivities specific to themselves or their current context. Often, these sensitivities relate to seemingly bland data, such as birthdate, address and phone-number. IS and CS researchers need to take great care where their work involves personal feelings, location and tracking, social network data, and especially cross-linkages that break down the separations among their often-separate social networks of family, workmates, professional colleagues, and friendship groups.

All personal data, from all sources, needs to be subject to a range of safeguards. Its use and disclosure need to be constrained to the original purpose. It needs to be secured. It needs to be deidentified as soon as practicable, and as effectively as practicable. It needs to be destroyed as soon as practicable. These are well-travelled paths, and a great deal of experience has been gained. Unfortunately, one of the lessons learnt is that a remarkably large percentage of organisations are hopelessly incapable of implementing even well-known data security safeguards.

The degree of public concern rises in the case of what is often called secondary research, which applies data to purposes additional to the original purpose. Key issues include identification, anonymity and pseudonymity, and that all-too-seldom-studied concept, re-identification. Data protection laws are commonly very weak in these areas. Moreover, a great deal of the data exploited in this way is under the control of US corporations. The US is an outlaw and a scoff-law in relation to privacy. Thanks to Edward Snowden, US national security agencies are now well-documented as flouting both public expectations and such limited legal protections as exist. US corporations do the same. The abuse of identified data streams by social networking services and other forms of social media is fundamental to those companies' business models - 'you are the product being sold'.

A further substantial step is the consolidation of multiple data-streams. Each data-stream has purpose, meaning and relevance, within its context. Exploitation for other purposes, disclosure to other divisions of a conglomerate, to 'strategic partners', and to other business partners, not only rips up the compact that existed between the individual and the organisation that collected the data but also creates the strong likelihood of misinterpretation and erroneous inference. This is inevitable with the fuzzy inferencing associated with data mining.

A further step, to 'big data', raises the temperature to new heights. Reidentifiability is intrinsic to big, multiply-sourced data sets. The errors will have significant harmful effects on individuals about whom judgements are made in secret, based on erroneous data. In all cases, the inclusion of sensitive data within the data-set acts as a multiplier of the levels of concern about the practices.
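The claim that reidentifiability grows as data-sets are enriched and combined can be illustrated with a small calculation. The sketch below, in Python, counts the fraction of records that are singled out by a combination of attributes, none of which is a name. The function name `unique_fraction` and the six records are hypothetical and entirely made up; uniqueness on a set of attributes is only a rough proxy for re-identification risk, not a measure of it.

```python
from collections import Counter


def unique_fraction(records, fields):
    """Fraction of records whose values on `fields` are shared with no other record."""
    keys = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(keys)
    singled_out = sum(1 for k in keys if counts[k] == 1)
    return singled_out / len(records)


# Hypothetical, made-up records: no names, only seemingly bland attributes.
records = [
    {"birth_year": 1970, "postcode": "2611", "sex": "F"},
    {"birth_year": 1970, "postcode": "2611", "sex": "M"},
    {"birth_year": 1970, "postcode": "2600", "sex": "F"},
    {"birth_year": 1985, "postcode": "2611", "sex": "F"},
    {"birth_year": 1985, "postcode": "2611", "sex": "F"},
    {"birth_year": 1985, "postcode": "2600", "sex": "M"},
]

# Each added attribute can only keep or increase the share of unique records.
print(unique_fraction(records, ["birth_year"]))
print(unique_fraction(records, ["birth_year", "postcode"]))
print(unique_fraction(records, ["birth_year", "postcode", "sex"]))
```

In this toy data, no record is unique on birth year alone, a third are unique once postcode is added, and two-thirds are unique once sex is added as well. Real data-sets carry far more attributes per person, which is why joining them makes it progressively easier to single individuals out.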

A final, still-emergent issue is impacts on the privacy of personal experience. There was no data about what I read when I was young, whereas your reading behaviour is captured. My social networks were my own business, but yours are inferred by analysis of personal communications. No device monitored my gaze in order to infer points-of-interest. My purchasing and travel behaviours were not recorded at all, or were anonymous, or were recorded in silo'd databases. You've lost all of those protections. Next comes analysis of co-location with other deviants. The point-to-point cameras that are being installed on NSW highways and on Canberra's trunk-roads have only a little to do with speeding fines, and everything to do with mass surveillance of road traffic and a mass database of sightings of vehicle registration plates with date-time-and-location stamps.

IS and CS researchers need to be aware of where their work sits in this staircase of intrusiveness. It's entirely feasible to design and conduct research involving personal data in ways that are ethical as well as informative and useful. But it's very easy to breach ethics and to breach human rights, and a considerable amount of research is doing precisely that.

Some rough guidelines would start with the following:

consent is fundamental, and has to be freely-given and informed
but consent cannot be elastic or 'bundled'. Secondary uses of personal data are out-of-bounds
credible pseudonymisation, with strong controls on the index that links pseudonyms to respondents' identities, was sufficient to enable highly sensitive AIDS epidemiological research to be conducted, tracing people's sex-partners, and at a time when homosexuality was still a crime in some jurisdictions
but security against hackers and accidents and staff-turnover and government agency demand powers is a must
credible anonymisation is a powerful way to wash data free from ethical constraints
but most anonymisation isn't credible, especially where data-sets are retained, and combined, because the richness of the data makes reidentification at least tenable and even easy
automated data falsification has potential, but it will have to exhibit a number of key attributes. It will have to be credible, it will have to be irreversible, it will have to be known to have been done (so that organisations don't attempt to use it for administrative purposes and hence make seriously wrong inferences about individuals), and it will have to undermine reidentifiability without undermining the research purpose
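The pseudonymisation guideline above depends on an index that links pseudonyms to identities and that is held apart from the research data. A minimal sketch of that arrangement in Python follows. The names `PseudonymIndex` and `pseudonymise` are hypothetical, the pseudonyms are random tokens rather than values derived from the identity, and the 'strong controls' on the index (access restrictions, separate storage, resistance to demand powers) are assumed to exist outside the code.

```python
import secrets


class PseudonymIndex:
    """Holds the only link between pseudonyms and identities (hypothetical sketch)."""

    def __init__(self):
        self._by_identity = {}
        self._by_pseudonym = {}

    def pseudonym_for(self, identity: str) -> str:
        """Return a stable random pseudonym for this identity, creating one if needed."""
        if identity not in self._by_identity:
            token = secrets.token_hex(8)
            while token in self._by_pseudonym:  # guard against an unlikely collision
                token = secrets.token_hex(8)
            self._by_identity[identity] = token
            self._by_pseudonym[token] = identity
        return self._by_identity[identity]

    def reidentify(self, pseudonym: str) -> str:
        """Reverse lookup; assumed to sit behind strong access controls in practice."""
        return self._by_pseudonym[pseudonym]

    def destroy(self) -> None:
        """Discard the index, removing the direct link back to identities."""
        self._by_identity.clear()
        self._by_pseudonym.clear()


def pseudonymise(records, index, id_field="name"):
    """Return copies of the records with the identifying field replaced by a pseudonym."""
    out = []
    for r in records:
        copy = {k: v for k, v in r.items() if k != id_field}
        copy["pseudonym"] = index.pseudonym_for(r[id_field])
        out.append(copy)
    return out
```

Because the tokens are random, the research data-set alone reveals nothing about who a pseudonym belongs to, and the same person receives the same pseudonym across records so that linkage within the study remains possible. Destroying the index removes only the direct link; as the guidelines note, the remaining attributes may still make reidentification tenable.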

Conclusions

IS and CS research may face less serious challenges than some disciplines, such as health and particular kinds of social science; but we have ethical issues to contend with, and as we integrate IT artefacts more closely with people our work is becoming more contentious.

Public disenchantment with the practices of governments and business, and with the utterly inadequate legal protections provided by parliaments and oversight agencies, is increasingly resulting in surges of concern about particular activities. (During the week these words were written, the cause of the week in the Sydney media was the scanning of ID documents in entertainment venues).

The public is also increasingly taking privacy protection into its own hands. I've written on multiple occasions from the 1980s onwards that "we're training people to omit data, falsify data, provide misleading data, and use multiple identities". The OAIC's 2013 Survey showed that 17% of respondents say that they often or always provide false personal details, and often or always use a false name when giving personal information, and the proportion who say they never do has dropped to two-thirds. More of them will learn very soon.

Did we really want to undermine public confidence in its institutions, and teach people to be habitual liars? And what effect does that have on the quality of survey-based research?

Appendix: Extract from the AIS Code of Research Conduct

In relation to the rights of research subjects, the AIS Code of Research Conduct says:

3. Respect the rights of research subjects, particularly their rights to information privacy, and to being informed about the nature of the research and the types of activities in which they will be asked to engage.

Scholars are expected to maintain, uphold and promote the rights of research subjects, especially rights associated with their information privacy. Subjects in academic research routinely volunteer information about their behavior, attitudes, intellect, abilities, experience, health, education, emotions, aspirations, and so on. If you are collecting such data, you have an obligation to respect the confidentiality of your subjects by storing data in a secure place, destroying it after a specified period of time, and never using it for any purpose other than that to which the subjects agreed prior to their participation.

In addition, unless an institutionally-approved research protocol allows otherwise, research subjects should be informed in advance of the purpose of any research procedure or activities in which they may be asked to participate. They also have the right to withdraw from the research at any stage.

Researchers must respect these rights and not coerce or otherwise force research subjects to participate against their will, or in a manner that is not conducive with their best interests.

Author Affiliations

Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., and a Visiting Professor in the Research School of Computer Science at the Australian National University.

Created: 10 October 2013 - Last Amended: 11 October 2013 by Roger Clarke - Site Last Verified: 15 February 2009