Review Draft of 11 February 2010
© Xamax Consultancy Pty Ltd, 2009
Available under an AEShareNet licence or a Creative Commons licence.
This document is at http://www.rogerclarke.com/II/RCMal.html
The media and the general public have a very hazy understanding of what malware is about. Meanwhile, the negative impacts of malware continue to grow. Effective technical, policy and legal measures are necessary; but policy formation is hindered by the absence of a reliable dialect in which discussions can take place. This paper presents a categorisation scheme for malware which enables each form of malware to be defined in terms of some combination of the vector whereby it is transmitted, its payload, and the manner in which it is invoked.
There is an enormous degree of confusion among the general public about viruses, worms, trojans, bots and the rest of the menagerie. These confusions are fostered by media reports that use terms loosely, and that mangle the facts as they try to cut through the complexity and present a digestible story.
The root cause of the mess is the lack of a clear model and terminology. It's high time that the IT security discipline, profession and industry got their collective act together, and promulgated a simple body of theory that can be communicated widely, and that provides a basis for informed discussion and analysis. This paper proposes an approach that applies existing terms in ways generally consistent with current usages in such texts as Skoudis (2003), Erbschloe (2005), Szor (2005) and Aycock (2006). Its purpose is to provide clear distinctions among concepts to enable better-informed public discussion and analysis. The Working Paper on which this paper is based, Clarke (2009a), has an associated Glossary that consolidates the definitions (Clarke 2009b).
Some of the confusion in the popular literature arises from the conflation of 'undesired content' issues (such as spam) and 'malbehaviour' (such as spamming and phishing) with 'undesired software' (malware). The paper accordingly commences by placing the discussion in context, and distinguishing those three topic-areas.
In Clarke & Maurushat (2007), an analysis of the security of consumer devices, particularly in the context of mobile payments, underlined the importance of the distinction between vectors and payloads. This paper extends that approach to include the manner of invocation of deployed malware. It then shows how the various forms of malware can be readily categorised using those three dimensions, and demonstrates the advantages that the categorisation offers.
This preliminary section provides an outline of the security model within which the analysis was conducted, and distinguishes malware from related notions.
The paper works within the conventional information technology (IT) security model, as depicted in Exhibit 1. Threatening events, which are specific instances of generic threats, impinge on vulnerabilities, resulting in harm. Safeguards are used in an endeavour to protect against threats, and to detect and address vulnerabilities. Safeguards may focus on deterrence, prevention, detection or investigation of the threats or events, or amelioration of the harm. Security is a condition in which harm does not arise, because threats and vulnerabilities are effectively countered by safeguards.
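The relationships in this model can be sketched in a few lines of Python. This is an illustrative simplification rather than part of the model itself: the class names `Safeguard` and `Device`, the method `harm_arises`, and the example strings are all assumptions introduced here, and a safeguard is treated as either countering a given threat or vulnerability or not.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Safeguard:
    """Hypothetical safeguard: names the threats and/or vulnerabilities it counters."""
    name: str
    counters: frozenset


@dataclass
class Device:
    """Hypothetical device with known vulnerabilities and installed safeguards."""
    vulnerabilities: set = field(default_factory=set)
    safeguards: list = field(default_factory=list)

    def harm_arises(self, threat: str, vulnerability: str) -> bool:
        """A threatening event results in harm only if it impinges on a
        vulnerability that no safeguard effectively counters."""
        if vulnerability not in self.vulnerabilities:
            return False  # nothing for the event to impinge on
        return not any(
            threat in s.counters or vulnerability in s.counters
            for s in self.safeguards
        )


# Example values are assumptions chosen for illustration only.
device = Device(vulnerabilities={"permissive default settings"})
unprotected = device.harm_arises("email virus", "permissive default settings")

device.safeguards.append(
    Safeguard(
        "restricted user-account permissions",
        frozenset({"permissive default settings"}),
    )
)
protected = device.harm_arises("email virus", "permissive default settings")
print(unprotected, protected)
```

In this sketch, security in the paper's sense corresponds to `harm_arises` returning `False` for every threatening event the device encounters.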
The paper's scope generally does not extend to attack mechanisms, such as vulnerability scanners, port scanners and password testers, nor to tools used in intermediary devices on the Internet, such as packet sniffers.
The term 'mal-content' usefully describes data that exists on a device, which the person controlling the device did not intend to be there and/or does not want there. Forms of mal-content include spam, email-attachments pushed to the device (including unrequested promotional material and pornography), unexpected content arriving on the device as a result of requests submitted in web-browsers (including advertisements), and unsolicited material arriving through P2P networks.
Many of the means whereby mal-content comes to be on a device involve forms of malbehaviour. A common example is 'phishing' for valuable information, particularly people's authenticators such as passwords and PINs. A longstanding example is incitement to users to download and execute software. The social engineering techniques used to inveigle users into enabling the reticulation of mal-content often mimic the actions of trusted providers of legitimate content (Mitnick & Simon 2002).
Malware is mal-content in the form of software. It would be unsatisfactory, however, to define malware simply as 'software that exists on a device, which the person controlling the device did not intend to be there'. Such a definition would have multiple problems, including the following:
A further aspect of some definitions of malware is the notion of causing harm. A useful definition of malware must also, however, encompass instances in which no harm is intended (e.g. because the software is a proof-of-concept or demonstration), or no harm is done (e.g. because the software is inoperable on the device in question, as occurs when a specifically Windows executable reaches a device that does not run a Windows operating system), or the item remains latent (i.e. on the device but never invoked).
A further aspect that creates difficulties in defining malware is the question of intent. In most cases, the designer and/or distributor intend to cause harm. However, some software that may need to be included in the definition of malware is intentionally beneficial (such as a worm designed to counter the negative impact of a harmful worm). Moreover, some is not intended to perform any harmful function, but only to demonstrate 'proof of concept', or enable measurement of the rate of dissemination arising from a particular design feature.
Extending to a broadly inclusive definition is also problematical, because it risks encompassing software that causes harm as a result of programming errors rather than malicious intent. One benefit of this approach, however, is that it avoids the legally and practically difficult problem of inferring intent (particularly where the perpetrator is unknown). Another benefit is that it explicitly recognises low-quality software as a threat.
In order to encompass the intended range of items, and to reflect the complexities discussed above, the following operational definition is adopted in this paper:
Malware is:
In terms of the IT security model, items of malware are threats, and need to be distinguished from vulnerabilities; but differentiating between the two is not always simple. An example of a category that can be either a vulnerability or a threat is a 'backdoor'. It may be used to cause harm to the device or an interest of a user of the device (in which case it is a threat), or it may be used to cause harm to some other device or person (in which case it is a vulnerability - although the capability of the device thereby becomes a threat to the other devices or persons).
The proposition put forward in this paper is that categories of malware can be simply explained and understood, and alternative counter-measures analysed and discussed, by distinguishing attributes in three dimensions. Malware:
The following sub-sections consider the three dimensions in turn.
The term 'vector' refers to the means whereby undesired content comes to be on a device. The term 'vector' encompasses both the act of transmission and the techniques whereby the transmission occurs. Two broad sub-categories are based on the medium of transmission:
There are several other ways in which cross-categorisation can be undertaken. One is based on the instigation of the transmission:
Another basis for differentiation is the vehicle for transmission, and in particular dependence on:
These distinctions can be applied to mainstream user services, in order to construct a set of sub-categories of vector that can be communicated to users in terms they are likely to be familiar with. See Exhibit 2.
Devices running the various versions of Windows operating systems have been especially rich in the kinds of vulnerabilities that make these vectors attractive to attackers (Chen & Robert 2004). As indicated earlier, there is a longstanding (but recently much-reduced) problem of highly permissive default parameter-settings, and a norm of wide-ranging permissions granted to user-accounts. That problem is compounded by a feature of Microsoft's architecture for server interactions with networked clients, which has had a variety of names, but is usefully referred to as 'ActiveX controls'. Microsoft's design inherently provides code that arrives at the client with access to the entire device. This is quite different from the much less insecure approach adopted by alternatives such as Java and (with qualifications) ECMAScript/JavaScript, which are limited to a 'sandbox'.
The term 'payload' distinguishes the content of a transmission from the means whereby it is delivered, including the addressing information. In the case of spam, for example, the payload is the text of the message itself, and/or the contents of an attachment to the message. The contents of an attachment can be in any form, including text, formatted text such as HTML, image, sound, video, executable code, and invokable code that is not directly executable, including invokable code within HTML, in such forms as JavaScript and VBScript.
In the case of malware, the payload is the active code that is delivered to the target device in order to perform some function or functions. The scope of the payload may include functions ancillary to the ultimate purpose, such as means of obscuring the existence or operation of the malware. However, the term is often used to exclude code whose function is to cause the malware to replicate itself.
A considerable number of functions might be performed by malware. Exhibit 3 provides a structured list of sub-categories, which enables various kinds of malware to be reliably distinguished from one another.
A commonly-used term in the malware arena that has been omitted from the above list is 'trojan'. The reason is that this term is subject to a wide variety of undisciplined usages. The common element in the various senses of the term is that a trojan appears to the user to perform one or more desirable functions, and may do so, but is designed to also perform one or more additional functions that the user is not aware of. This is a particularly easy confidence trick to play, because legitimate and illegitimate invitations to download and to run software are essentially indistinguishable from one another. The offer of anti-virus software is a particular favourite technique used by attackers.
In relation to the transmission vector, the two primary flavours of the term have the following attributes:
In relation to the payload, the two primary flavours of the term 'trojan' can be represented as follows:
The analysis in this paper adopts the following terms and definitions, on the grounds that these two categories are valuable tools in security analysis and benefit from a term to describe them, whereas the others are already satisfactorily covered by other established terms:
As is evident from the definitions offered in Exhibit 3, the nature of the payload is the key determinant for most categories of malware. Many categories may, however, be usefully qualified by a vector-descriptor in order to identify a sub-category; hence, for example, a 'backdoor trojan' and an 'email virus'. A special case is so-called 'drive-by download'. The term refers generally to downloads that a user authorises unwittingly; but it is often used specifically within the context of the Web; and it is sometimes used specifically for exploits on devices using Microsoft Windows that take advantage of the inbuilt vulnerability that ActiveX controls represent (Howes 2004).
The notion of 'invocation' is used here to refer to the causing of the code to run in the target device. The generic 'invoke' is preferred to 'execute' because of the many forms in which code may be delivered. In particular, the code may be native to the instruction-set of the target device, or it may be in a form that requires a compiler, an interpreter or a run-time interpreter. The term 'invoke' also encompasses embedded code, such as macros within word processing and spreadsheet documents.
The device's operating system may include some safeguards against the unauthorised invocation of programs, such as specific permissions associated with each user-account, which limit the software that can be run from that account and/or the data that can be accessed from it. Effective malware needs to be able to circumvent or subvert such safeguards.
Exhibit 4 identifies key sub-categories of malware invocation.
The above categories can be cross-categorised in several ways. One involves distinguishing between local invocation within the device, and invocation triggered remotely by an action by an attacker.
An item of malware, when invoked, may perform its function(s) and then terminate. Alternatively, it may remain memory-resident and active. A further variant is that it may remain memory-resident, but dormant, pending some trigger. Such software is commonly referred to as a 'daemon' or in Microsoft environments a 'Windows service'. A daemon is capable of performing functions that a one-time program is not, such as taking advantage of ephemeral data or a communications channel that is only open briefly.
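The three post-invocation behaviours described above can be captured as a small enumeration. The following is a minimal sketch only: the names `RunMode`, `is_daemon` and `can_use_ephemeral_channel` are hypothetical, and it assumes that both memory-resident variants count as a 'daemon', which is one reading of the text.

```python
from enum import Enum


class RunMode(Enum):
    """What an item of malware does once it has been invoked."""
    ONE_TIME = "performs its function(s) and then terminates"
    RESIDENT_ACTIVE = "remains memory-resident and active"
    RESIDENT_DORMANT = "remains memory-resident but dormant, pending a trigger"


def is_daemon(mode: RunMode) -> bool:
    # Assumption: both memory-resident variants are treated as a 'daemon'.
    return mode in (RunMode.RESIDENT_ACTIVE, RunMode.RESIDENT_DORMANT)


def can_use_ephemeral_channel(mode: RunMode) -> bool:
    # Only code that stays resident can exploit ephemeral data or a
    # communications channel that is open only briefly.
    return is_daemon(mode)


for mode in RunMode:
    print(f"{mode.name}: {mode.value} (daemon: {is_daemon(mode)})")
```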
Clarity of language is crucial to understanding and analysis. This paper has documented an approach to the categorisation of malware which enables each particular kind of malware to be defined in terms of the vector it uses, the payload it carries, and/or the manner in which it is invoked. A Glossary of malware terms arising from this analysis is in Clarke (2009b).
The use of the definitions proposed in this paper should significantly reduce the ambiguity of text that uses malware-related terms. A statement constructed along the following lines should result in far less confusion than has been the case in the past: "<A payload-type> has reached the device by means of <a vector-type>, has run as a result of <an invocation-type>, and has resulted in <a harm-type>". For example: "A bot has reached the device by means of a web-site application attack, its active code has been invoked by means of instructions stored on a web-site, and it has participated in the relaying of spam. (This has been detected by systems elsewhere, and has in turn resulted in our IP-addresses being black-listed, blocking our own email services.)"
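The statement template lends itself to a simple record structure. The following Python sketch is offered as an illustration under stated assumptions: the class name `MalwareIncident`, its field names and the `describe` method are hypothetical, and the fields are free-form strings rather than enumerations, because the sub-categories themselves are defined in the Exhibits and the Glossary rather than here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MalwareIncident:
    """One incident described along the three dimensions, plus the resulting harm."""
    payload: str     # what the code does, e.g. "a bot"
    vector: str      # how it reached the device
    invocation: str  # what caused it to run
    harm: str        # what resulted

    def describe(self) -> str:
        # Mirrors the sentence template quoted in the text.
        text = (
            f"{self.payload} has reached the device by means of {self.vector}, "
            f"has run as a result of {self.invocation}, "
            f"and has resulted in {self.harm}"
        )
        return text[0].upper() + text[1:]


# Field values paraphrase the bot example given in the text.
incident = MalwareIncident(
    payload="a bot",
    vector="a web-site application attack",
    invocation="instructions stored on a web-site",
    harm="the relaying of spam",
)
print(incident.describe())
```

Keeping the four elements as separate fields means that a report cannot omit the vector or the invocation, which is the discipline the template is intended to impose.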
Beyond overcoming mere ambiguity, use of this reconceptualisation of malware should result in fewer accidental generalisations whose scope needs to be qualified. It should also have the effect of guiding conversations and analyses towards appropriate inferences, at least among the media and the general public, and among lawyers and others responsible for policy-formation, but potentially even among IT professionals.
The ultimate purpose of a carefully formulated suite of definitions, set within a simple theoretical framework, is the more effective design and implementation of safeguards. At a technical level, benefits include more reliable recognition of malware, the avoidance of false-positives, and guidance in the design of technical countermeasures. Organisational measures should be enhanced through clearer communication as part of awareness, education and training of staff, and greater clarity in risk assessment and risk management plans. The framework can also be used in devising more effective legal measures such as definitions of malware-related crimes.
Aycock J. (2006) 'Computer Viruses and Malware' Springer Advances in Information Security, Vol. 22, 2006, at http://vx.netlux.org/lib/mja01.html
Chen T. & Robert J.-M. (2004) 'The Evolution of Viruses and Worms' Chapter in Chen W.W.W. (Ed.) 'Statistical Methods in Computer Security' CRC Press, 2004, pp. 265-286, at http://vx.netlux.org/lib/atc01.html
Clarke R. (1991) 'A Contingency Approach to the Application Software Generations' Database (Summer 1991), at http://www.rogerclarke.com/SOS/SwareGenns.html
Clarke R. (2009a) 'Categories of Malware' Xamax Consultancy Pty Ltd, Working Paper, 21 September 2009, at http://www.rogerclarke.com/II/MalCat-0909.html
Clarke R. (2009b) 'Glossary of Malware Definitions' Xamax Consultancy Pty Ltd, Working Paper, 21 September 2009, at http://www.rogerclarke.com/II/MalCat-0909.xls
Clarke R. & Maurushat A. (2007) 'Passing the Buck: Who Will Bear the Financial Transaction Losses from Consumer Device Security' J. of Law, Information and Science 18 (2007) 8-56, PrePrint at http://www.rogerclarke.com/II/ConsDevSecy.html
Erbschloe M. (2005) 'Trojans, Worms and Spyware: A Computer Security Professional's Guide to Malicious Code' Elsevier Butterworth-Heinemann, 2005, at http://vx.netlux.org/lib/ame01.html
Howes E.L. (2004) 'The Anatomy of a "Drive-by-Download"' Spyware Warrior, 29 March 2004, at http://www.spywarewarrior.com/uiuc/dbd-anatomy.htm
Mitnick K.D. & Simon W.L. (2002) 'The Art of Deception: Controlling the Human Element of Security' Wiley, 2002
Skoudis E. (2003) 'Malware: Fighting Malicious Code' Prentice Hall, 2003
Solomon A. (1993) 'A Brief History of PC Viruses' various versions 1986-1993, at various locations, including http://vx.netlux.org/lib/aas14.html
Szor P. (2005) 'The Art of Computer Virus Research and Defense' Addison Wesley Professional, 2005, at http://vx.netlux.org/lib/aps00.html
This paper draws heavily on work undertaken in conjunction with colleagues in the Cyberspace Law & Policy Centre at the University of N.S.W. in Sydney, and the contributions of Graham Greenleaf and Alana Maurushat are acknowledged.
Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., and a Visiting Professor in the Department of Computer Science at the Australian National University.
Created: 7 February 2010 - Last Amended: 11 February 2010 by Roger Clarke - Site Last Verified: 15 February 2009