Roger Clarke's 'Responsible Data Analytics'


Guidelines for Responsible Data Analytics

Notes for a Panel Session at a Consultation Event on Big Data / Open Data
of the UN Special Rapporteur on the Right to Privacy
UNSW Sydney, 26-27 July 2018

Notes of 28 July 2018

Roger Clarke **

Available under an AEShareNet licence or a Creative Commons licence.

This document is at http://www.rogerclarke.com/EC/BDOD18.html

I was invited to participate in a panel providing feedback on the UN Rapporteur's Draft Report. I make these comments from the perspectives of the information systems practice and discipline, tinged with my background as a consultant.

Prof. Graham Greenleaf has identified the benefits that will arise from linking the Rapporteur's Recommendations into the existing international legal framework. My proposition is that there is also a need to embody the Recommendations within the modus operandi of each organisation. I'll make a couple of specific suggestions about how this can be done.

1. The Need for Practical Guidelines

The UN Rapporteur's work is leading to the enunciation of Principles. But for organisations to be able to apply such ideas to their business operations, those Principles have to be operationalised in the form of Guidelines.

A guidance document needs to result in effective management of the costs and risks that affect all parties. However, to be adopted, the Guidelines need to be communicated in such a manner that relevant organisations appreciate that their application will deliver them sufficient benefits. This is most easily achieved by showing how the Guidelines address the negative impacts on organisations, particularly financial cost and reputational harm.

Examples of the kinds of topics that need to be addressed by the Guidelines are data quality, data comparability, process quality, decision quality, transparency, recourse, sanctions and enforcement.

From the perspectives of human rights and consumer rights, it is of course highly desirable that organisations be subject to formal requirements, perhaps embodied in Standards, but preferably built into enforceable Codes, within either a statutory regulatory framework or a co-regulatory scheme.

There are two reasons why I'm proposing the weak regulatory form of mere 'Guidelines'. Firstly, Guidelines can achieve an impact in the relatively short term, while formal regulatory arrangements are being established. Otherwise there will inevitably be a long interlude during which organisations can justify inaction on the basis that it's as yet unclear what their responsibilities are. Secondly, Guidelines provide an opportunity to learn from their application, such that they can be refined into a practicable form prior to being enshrined in legal or quasi-legal form.

2. An Example: Transparency

There are a number of aspects of transparency, of which I address here just one. Natural justice demands that when an organisation makes a decision that may be adverse to an individual, that person must have access to a humanly-understandable explanation of the reasons for that decision. Unless such an explanation of the rationale is provided, the organisation fails the fundamental requirement that it be accountable for its actions, and the individual is precluded from providing a coherent argument in support of a request for review, a complaint, or an action before a tribunal or court.

Until the mid-1980s, computer-based decision-making was expressed in 3rd generation languages such as COBOL, which inherently involved a procedural or algorithmic solution, and hence a statement of a problem to be solved. (The assistance of a programmer may have been needed to interpret code and its application in a particular case, but plenty of programmers were readily available.) So the transparency requirement was satisfied.

Later generations of software development tools have overrun 3rd generation languages. Rule-based expert systems (the 5th generation) embody a description of a problem-domain, but not of a problem or a solution. It's theoretically feasible to generate explanations from rule-based systems, but many applications fail to provide that capability.
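To illustrate the point that explanations can in principle be generated from rule-based systems, the following is a minimal sketch in Python. It assumes a toy loan decision; the rule names, thresholds and field names (`annual_income`, `defaults_last_5y`) are hypothetical and are not drawn from any real system. The decision is returned together with the rules that fired, which is the raw material for a humanly-understandable explanation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str                     # label quoted in the explanation
    test: Callable[[dict], bool]  # True when the rule counts against the applicant
    reason: str                   # plain-language reason given when the rule fires


# Hypothetical rules for an illustrative loan decision (assumed, not from the source).
RULES = [
    Rule("income-floor", lambda c: c["annual_income"] < 30_000,
         "annual income is below 30,000"),
    Rule("recent-default", lambda c: c["defaults_last_5y"] > 0,
         "a default was recorded in the last 5 years"),
]


def decide(case: dict) -> tuple[str, list[str]]:
    """Return the decision together with the reasons that produced it."""
    fired = [r for r in RULES if r.test(case)]
    decision = "decline" if fired else "approve"
    reasons = [f"{r.name}: {r.reason}" for r in fired]
    return decision, reasons


decision, reasons = decide({"annual_income": 25_000, "defaults_last_5y": 0})
print(decision)
for line in reasons:
    print(" -", line)
```

Because every adverse outcome is traceable to named rules, the explanation costs almost nothing to produce. Whether a given application actually exposes it is a design choice, which is the gap noted above.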

The current fashion in software development is neural networks. This approach can be described in various ways: as 6th generation; as a form of AI (artificial intelligence); as a form of AI/ML (a sub-set of AI called machine learning); and as empirical, i.e. based on real-world data. The development of software using neural networks involves a very limited amount of modelling, followed by the shovelling of a lot of data into a generic program. This collection of data, called a 'training set', delivers a set of weightings on the various factors identified in the model. Data describing a new case can be run against those weightings in order to draw inferences. The new case may or may not be added into the training set. If it is added in, the weightings shift a little to reflect the new case.
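The process just described can be sketched, under strong simplifying assumptions, with a single artificial neuron written in plain Python. The data, the learning rate and the number of passes are invented for illustration, and a real neural network has many layers and far more factors. The sketch shows only the three steps named above: a training set delivers weightings, a new case is run against them, and adding that case to the training set shifts the weightings.

```python
import math


def predict(weights, features):
    """Score a case: weighted sum of its factors, squashed into the range 0..1."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(training_set, epochs=200, rate=0.5):
    """Derive one weighting per factor by repeated small corrections."""
    weights = [0.0] * len(training_set[0][0])
    for _ in range(epochs):
        for features, outcome in training_set:
            error = outcome - predict(weights, features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
    return weights


# Invented data: (factor values, observed outcome). The first factor is a constant.
training_set = [
    ([1.0, 0.2, 0.9], 0),
    ([1.0, 0.8, 0.1], 1),
    ([1.0, 0.6, 0.4], 1),
    ([1.0, 0.3, 0.7], 0),
]

weights_before = train(training_set)

# Run a new case against the current weightings to draw an inference.
new_case = [1.0, 0.5, 0.5]
score = predict(weights_before, new_case)

# Add the new case (with an assumed outcome of 0) and refit: the weightings shift.
weights_after = train(training_set + [(new_case, 0)])

print("weightings before:", [round(w, 3) for w in weights_before])
print("weightings after: ", [round(w, 3) for w in weights_after])
```

Nothing in `weights_before` or `weights_after` states a problem, a rule or a reason. The numbers are simply whatever values fitted the data supplied.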

6th generation neural network approaches create major problems for transparency. There is no definition of a problem, or a solution, or even of a problem-domain. The weightings are correlations, not a rationale. It is highly misleading to use the term 'algorithm' when describing such approaches. Applications of neural networks are merely empirical and are not based on a defined procedure. No explanation is feasible, other than some vague statement about the weightings currently assigned to different factors. Further, where machine-learning results in change over time, repeatability and auditability are undermined.
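The effect on repeatability can be shown with a small Python sketch. It assumes a single-neuron scorer and two invented sets of weightings, labelled `weights_monday` and `weights_friday`, standing for the same application before and after further learning. The function `decide_with_record` and the idea of storing a fingerprint of the weightings are illustrative assumptions, not a practice described in these notes.

```python
import hashlib
import json
import math


def score(weights, features):
    """Weighted sum of the factors, squashed into the range 0..1."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fingerprint(weights):
    """Short, stable identifier for one exact set of weightings."""
    payload = json.dumps(weights).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def decide_with_record(weights, features, threshold=0.5):
    """Return a decision plus the minimum needed to re-run it later."""
    s = score(weights, features)
    return {
        "decision": "approve" if s >= threshold else "decline",
        "score": round(s, 4),
        "weights_id": fingerprint(weights),
        "features": list(features),
    }


# Hypothetical weightings before and after further learning (values invented).
weights_monday = [0.1, 1.2, -1.0]
weights_friday = [0.1, 0.9, -1.3]
case = [1.0, 0.5, 0.5]

monday = decide_with_record(weights_monday, case)
friday = decide_with_record(weights_friday, case)

print(monday["decision"], monday["weights_id"])
print(friday["decision"], friday["weights_id"])
```

The identical case is approved under one set of weightings and declined under the other. Unless the exact weightings in force at the time are retained, the earlier decision cannot be re-run, let alone explained.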

In short, 6th generation, neural network-based applications breach the transparency requirement, and preclude organisational accountability.

Proponents offer the assurance that the generation of explanations from neural networks is an active area of research. But for such research to ever bear fruit, it has to overcome very serious challenges. The inferences arise from a long chain of data. Generating a summary of that data, and then generating from that summary an expression of a rationale or a causal structure, is a fearsome intellectual endeavour. Human brains are good at 'justification', i.e. the ex post facto rationalisation of previous behaviour; but producing a generalised artificial intelligence as devious as the human mind is another very long-term project.

Then, even if those challenges are overcome, the problem remains of appreciating context. Data is almost invariably incomplete, because only that data was gathered which appeared at the time to be both necessary and economic. Trying to program a device to consider what missing data might have affected the outcome is an exercise in futility.

3. Guidelines Mapped to Business Processes

This discussion of transparency was a quick dive-down into just one of perhaps 30-40 factors that need to be covered in a comprehensive set of Guidelines. A guidance document needs sufficient depth in relation to all of these factors in order to assist organisations to manage the risks to others, and to themselves.

There's then one further requirement. The Guidelines need to be mapped effectively into the organisation's business processes. Most commonly, this will be through quality assurance and/or risk assessment activities.

Being a good consultant, I've produced a set of Guidelines, and a further paper has shown where in the life-cycle of a data analytics project each particular guideline needs to be addressed. Being also a bad consultant, I've given the intellectual property away, by publishing it (Clarke 2018, Clarke & Taylor 2018).

4. Conclusion

My proposition is that the UN Rapporteur's Recommendations need to include the articulation of Principles into Guidelines, and the mapping of Guidelines into Business Processes.

Various activities by various organisations are giving rise to various guidance documents. Many of these guidance documents, however, are seriously deficient, undermined by myopia or self-interest. (For a critique of one such document, see Raab & Clarke 2016).

I submit that the Guidelines I've published be applied, either as a basis for UN-promulgated Guidelines, or as a tool, template or exemplar whereby other sets of Guidelines can be evaluated or developed.

Sources

Clarke R. (1991) 'A Contingency Approach to the Application Software Generations' Database 22, 3 (Summer 1991) 23-34, PrePrint at http://www.rogerclarke.com/SOS/SwareGenns.html

Raab C. & Clarke R. (2016) 'Inadequacies in the UK's Data Science Ethical Framework' Euro. Data Protection L. 2, 4 (Dec 2016) 555-560, PrePrint at http://www.rogerclarke.com/DV/DSEFR.html

Clarke R. (2018) 'Guidelines for the Responsible Application of Data Analytics' Computer Law & Security Review 34, 3 (May-Jun 2018) 467-476, https://doi.org/10.1016/j.clsr.2017.11.002, PrePrint at http://www.rogerclarke.com/EC/GDA.html

Clarke R. & Taylor K. (2018) 'Towards Responsible Data Analytics: A Process Approach' Proc. Bled eConference, 17-20 June 2018, PrePrint at http://www.rogerclarke.com/EC/BDBP.html

Author Affiliations

Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in Cyberspace Law & Policy at the University of N.S.W., and a Visiting Professor in the Research School of Computer Science at the Australian National University.

Created: 28 July 2018 - Last Amended: 28 July 2018 by Roger Clarke - Site Last Verified: 15 February 2009