Working Paper Version of 21 September 2009
© Xamax Consultancy Pty Ltd, 2009
Available under an AEShareNet licence or a Creative Commons licence.
This document is at http://www.rogerclarke.com/II/MalCat-0909.html
The slide-set is at http://www.rogerclarke.com/II/MalCat-0909.ppt
A printable version for workshop purposes is at http://www.rogerclarke.com/II/MalCat-0909-DiscnSet.pdf
Malware, particularly the sophisticated forms of malware now being applied to criminal purposes, threatens not only the integrity of computing services, but also the Internet and network-dependent economies and societies more generally. The assessment of alternative technical and policy measures is hindered by the low quality of discussion about malware, which is rooted in the absence of a reliable dialect in which discussions can take place. This paper develops a set of categories and working definitions of malware, intended to enable meaningful discussions and analysis. The terminology is designed to be at a level sufficiently technical to avoid undue simplification, but to facilitate appreciation by policy analysts without deep knowledge of computing or networks.
Popular treatments of malware are characterised by a vast amount of confusion. Over 20 years after viruses became well-understood and worms erupted into public discussions, it's high time that much more order was brought to the field.
The need for much greater clarity extends beyond the broad public arena. Malware is becoming an increasingly serious public policy issue. To counter this, innovations are necessary in technical, organisational and regulatory measures, and coordination must be achieved among organisations in the public and private sectors. Meaningful discussions depend on mutual understanding, and hence a stable and commonly-agreed language is a pre-requisite to progress being made.
This paper proposes a terminological framework within which analysis and discussion can be undertaken. Its purpose is to distinguish the various categories of malware in such a manner as to support the evaluation of alternative policy measures that might be effective in combatting them.
The paper works within the conventional information technology (IT) security model, as depicted in Exhibit 1. Threatening events, which are specific instances of generic threats, impinge on vulnerabilities, resulting in harm. Safeguards are used in an endeavour to protect against threats and detect and address vulnerabilities. Safeguards may focus on deterrence, prevention, detection or investigation of the threats or events, or amelioration of the harm. Security is a condition in which harm does not arise, because threats and vulnerabilities are effectively countered by safeguards.
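The relationships in this security model can be sketched as a small data structure. The following Python is a minimal illustration only, not part of the paper's framework: the class names, the field names and the rule that a safeguard 'counters' a named vulnerability are all assumptions introduced here, and the check deliberately ignores the differences among the five safeguard foci.

```python
from dataclasses import dataclass
from enum import Enum


class SafeguardFocus(Enum):
    # The five foci named in the paragraph above.
    DETERRENCE = "deterrence"
    PREVENTION = "prevention"
    DETECTION = "detection"
    INVESTIGATION = "investigation"
    AMELIORATION = "amelioration"


@dataclass(frozen=True)
class Safeguard:
    name: str
    focus: SafeguardFocus
    counters: frozenset  # names of vulnerabilities this safeguard addresses (assumed model)


@dataclass(frozen=True)
class ThreateningEvent:
    threat: str    # the generic threat of which this event is an instance
    exploits: str  # name of the vulnerability the event impinges on


def results_in_harm(event, vulnerabilities, safeguards):
    """Return True if the event impinges on a vulnerability that is present
    on the device and that no safeguard counters."""
    if event.exploits not in vulnerabilities:
        return False
    return not any(event.exploits in s.counters for s in safeguards)
```

In this simplified reading, 'security' is the condition in which `results_in_harm` is false for every threatening event that can occur.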
The paper's scope generally does not extend to attack mechanisms, such as vulnerability scanners, port scanners and password testers, nor to tools used in intermediary devices on the Internet, such as packet sniffers.
Various forms of malware have emerged over time, as both IT infrastructure and attacks on it have become progressively more complex and sophisticated. The first form in which malware appeared was referred to as a 'virus', a term borrowed from biology in 1983. The form itself, however, can be recognised in phenomena that first appeared as far back as 1971.
The notion arrived in the public arena when viruses were transmitted by floppy disk among Apple micro-computers in 1981. Major infections first appeared during the late 1980s on what were then called 'IBM PCs'. Even at that time, some virus transmission was occurring over networks rather than via disks, but networks only became the dominant means of transmission after Internet accessibility became widespread from the mid-1990s (Solomon 1993).
Until about 2000, viruses remained the dominant form of malware. Indeed the term 'virus' was initially used for any form of what is referred to in this paper as malware, and in popular literature that habit is still apparent. During the early years of the new century, however, independent programs called worms, propagating over networks, and primarily over the Internet, became the more common form (Chen & Robert 2004). Sample short descriptions of a virus and a worm are provided in Appendix 1. Fuller expositions of the nature of viruses and worms are in texts such as Skoudis (2003), Erbschloe (2005), Szor (2005) and Aycock (2006).
Both viruses and worms are threats which need to exploit vulnerabilities in order to propagate, and in order to perform whatever their function may be. Vulnerabilities exist in all operating systems, in all systems software, and in all applications. During the relevant period, the most widespread systems software family by far was the succession of Microsoft Windows products. On workstations that used that operating system, the Microsoft Office suite of applications was very common. Further, until the early 2000s, Microsoft's quality control over security vulnerabilities was lamentable. For this combination of reasons, Microsoft products have both harboured a vast array of vulnerabilities, and been the primary target for malware writers. As late as 2004, Chen & Robert (2004) reported that "the vast majority of Windows PCs are vulnerable to new outbreaks". Although Microsoft's quality control has been improved, its products are also much more complex, and hence a wide array of vulnerabilities still exists, and more emerge regularly.
The terms 'virus' and 'worm' distinguish forms that malware takes. Those terms do not provide an indication at any level of detail about what 'vector', or means of distribution, any given item of malware may use. Similarly, those terms do not provide any information about the nature of the function that the malware performs, usefully referred to as its 'payload'. A range of terms has come into use to describe particular categories of payload. The following section will distinguish among such categories of malware as trojans, spyware, bots, rootkits and web-based attacks. In addition to the literature cited in the body of this paper, a range of resources listed at the end of the paper were considered during the analysis and compilation of the definitions.
In Clarke & Maurushat (2007), the distinction between vectors and payloads was used in conducting an analysis of the security of consumer devices, particularly in the context of mobile payments. This paper extends that approach to include the manner of invocation of deployed malware. It then shows how the various forms of malware can be readily categorised using those three dimensions, and demonstrates the advantages that the categorisation offers.
Some of the confusion in the popular literature arises from the conflation of 'undesired content' issues (such as spam) and 'malbehaviour' (such as spamming and phishing) with 'undesired software' (malware). The paper accordingly commences by distinguishing those three topic-areas. It then introduces and applies the vector, payload and invocation dimensions, in order to deliver a set of operational definitions of the primary categories of malware. The discussion is summarised in a glossary.
It is important to disentangle malware from related notions. Firstly, undesired software is a special case of undesired content more generally. Secondly, malware may come to be on a device as a result of the manipulation of human behaviour. This section outlines each of these ideas, and in the third sub-section an operational definition of malware is developed, appropriate to the purpose of the undertaking.
By undesired content is meant here data that exists on a device which the person controlling the device did not intend to be there. It includes spam, email-attachments pushed to the device (including unrequested promotional material and pornography), unexpected content arriving on the device as a result of requests submitted in web-browsers (including advertisements), and unsolicited material arriving through P2P networks. The term also encompasses software. Undesired content comes to be on a device by means of a 'vector' that delivers a 'payload'. These terms will be discussed in the following sections.
The term 'social engineering' is applied to ways of inveigling users into providing the vector whereby undesired content reaches their device. A common example is 'phishing'. A message is presented to a user, commonly by email, but in principle using any form of messaging. The message encourages the person to divulge important information, e.g. by replying to the email, or (in recent times, more commonly) by going to what presents as though it were the web-site of a trusted party such as a bank, and then providing an authenticator such as a password or PIN.
An even more longstanding example of malbehaviour is incitement to users to download and execute software. It is particularly challenging to devise countermeasures against this reticulation method, because there have to be channels for downloading software to machines, and malware-providers can easily mimic the actions of trusted providers of legitimate software.
A recently-popular form of social engineering is the 'botnet herder' technique of luring a person into downloading and executing malware by means of some form of chat or instant messaging (IM) service (i.e. a service provided by particular protocols and associated client- and server-software that support human-real-time / synchronous communication).
From the perspective of the discussion in the preceding sub-sections, malware is a particular category of undesired content, and generally comes to be on a device as a result of some form of malbehaviour. The term 'malware' is widely used as the generic term for a considerable family of software and techniques that are implemented by means of software. Malware results, may result, or is intended to result in some deleterious consequence.
A wide diversity of forms is encompassed, including viruses, worms, trojans, spyware, bots, rootkits and web-based attacks. In security terms, items of malware are threats, and need to be distinguished from vulnerabilities; but differentiating between the two is not always simple. An example of a category that can be either a vulnerability or a threat is a 'backdoor'. This is a means whereby access is gained to a user-account on a device, bypassing safeguards. It may be an inherent vulnerability (such as a still-available default user-account with high privileges and an obvious username/password pair such as SYSTEM/SYSTEM - as was once the case with the DEC VMS operating system). Alternatively, it may be a threat, in the form of a known way in which some other vulnerability can be exploited to create such a user-account, whose details are known only to the attacker.
A variety of proposals have been made for categorising malware. For example, Aycock (2006) distinguishes malware along three dimensions:
The overlapping nature of Aycock's categories is typical of the clumsiness of so many of the attempts to develop a workable taxonomy. As the following paragraphs will demonstrate, several challenges arise in establishing operational definitions of both malware and particular forms of malware. The explanations provided in popular literatures, and even those in the refereed literature, are of little use to a study intended to support technical management and the design of regulatory measures. The remainder of this section considers the factors that may need to be reflected in a definition, and then proposes an operational definition of malware. The subsequent section proposes a means of categorising particular forms of malware.
Most malware (including worms, trojans, spyware, bots, rootkits and web-based attacks) takes the form of an independent executable program. A virus, on the other hand, is a segment of code inserted into a host program. It is not handled by the device's operating system as an independent executable, but rather performs its function when the host program is loaded and executed.
Malware may not be delivered in directly executable code expressed in the target device's machine-language, but may be in a high-level language that requires a compiler or an interpreter, or a mid-level language that requires a run-time interpreter (Clarke 1991). Some environments have serious vulnerabilities, in that code may be automatically accepted and invoked. For example, for many years, the default on a range of Microsoft products when they were shipped was to immediately execute VBScript code that they received.
Further, the code may be a fragment within an otherwise inert document-file. Some utilities, such as word processors and spreadsheet modellers, enable small code fragments or 'macros' to be embedded within the documents produced using them. These macros may be used as a means of transferring malware between devices. Microsoft's Office suite is dominant in this arena, and has long suffered, and continues to suffer, from a range of vulnerabilities. Unlike executable files, macros are expressed in a high-level language and depend on an interpreter. As a result, they are potentially 'cross-platform' viruses, i.e. they are not specific to an operating system, processor-chip, or machine-language, but rather can run in any environment for which the appropriate interpreter has been delivered.
Most forms of malware (including many instances of viruses, worms, trojans, spyware, bots and rootkits) are stored on the device and need to be expressly invoked in order to perform their function, whereas others (particularly web-based attacks) are received and executed without prior storage.
Most malware is not intentionally loaded, and is not intentionally invoked, by a user of the device. That applies to viruses, worms, some spyware, bots, rootkits and web-based attacks. On the other hand, a trojan is software that claims to (and may well) perform a desired function; but also performs some other undesired function that the person who installs it on the device was not aware of at the time. Some spyware may also have this feature.
Malware generally performs a function that is harmful. In most cases, the harm is to some interest of the user of the device, or of the person responsible for it. The harm may take such forms as deletion of data, modification of data, and disclosure of data. In other cases, the target of the harm may be some other person. This most commonly arises from bots, which are used for relaying spam and running distributed denial of service attacks. A useful definition of malware must also, however, encompass instances in which no harm is intended (e.g. because the software is a proof-of-concept or demonstration), or no harm is done (e.g. because the software is inoperable on the device in question, as occurs when a specifically Windows executable reaches a device that does not run a Windows operating system), or the item remains latent (i.e. on the device but never invoked).
Most malware is intentionally harmful. On the other hand, some is intentionally beneficial (such as a worm designed to counter the negative impact of a harmful worm); and some is not intended to perform any harmful function, but only to demonstrate 'proof of concept', or enable measurement of the rate of dissemination arising from a particular design feature. A more broadly inclusive definition risks extending to software that causes harm as a result of programming errors rather than malicious intent. One benefit of this approach, however, is that it avoids the legally and practically difficult problem of inferring intent (particularly where the perpetrator is unknown). Another benefit is that it recognises low-quality software as a threat.
In order to encompass the intended range of items, and to reflect the complexities discussed above, the following operational definition is proposed:
Malware is:
The following section considers three aspects or dimensions of differentiation among the various categories of malware, and develops from that a definition of each category.
Malware uses a 'vector' to deliver a 'payload' which performs a function that is harmful to some party and which is 'invoked' by some means. This section discusses a number of categories of malware at a sufficient depth to enable them to be categorised according to these three dimensions. The final sub-section provides a tabular summary. Appendix 2 consolidates into glossary form the definitions developed in the analysis that follows.
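The idea of describing each item of malware along the three dimensions can be sketched as a simple record. The Python below is a hypothetical illustration: the `MalwareProfile` class, the `group_by` helper, the item names and the free-text labels are all assumptions made for this sketch (the labels borrow words used elsewhere in the paper, but they are not the paper's own category lists).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MalwareProfile:
    name: str
    vector: str      # how the item came to be on the device
    payload: str     # the function the delivered code performs
    invocation: str  # how the code is caused to run


def group_by(profiles, dimension):
    """Group item names by one dimension: 'vector', 'payload' or 'invocation'."""
    groups = defaultdict(list)
    for profile in profiles:
        groups[getattr(profile, dimension)].append(profile.name)
    return dict(groups)


# Hypothetical items with illustrative labels; not drawn from any real catalogue.
SAMPLES = [
    MalwareProfile("item-1", vector="email", payload="disclosure of data", invocation="local"),
    MalwareProfile("item-2", vector="web download", payload="relaying spam", invocation="remote"),
    MalwareProfile("item-3", vector="email", payload="deletion of data", invocation="local"),
]

if __name__ == "__main__":
    print(group_by(SAMPLES, "vector"))
    print(group_by(SAMPLES, "invocation"))
```

Keeping the three dimensions as separate fields means that a descriptor such as 'email virus' constrains only the vector, leaving payload and invocation to be stated independently.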
The term 'vector' refers to the means whereby undesired content comes to be on a device. The term 'vector' encompasses both the act of transmission and the techniques whereby the transmission occurs. It encompasses two broad sub-categories:
In the case of transmission or download from a separate device, a variety of sub-categories arise, in particular:
Devices running the various versions of Windows operating systems are especially rich in vulnerabilities that make these vectors attractive to attackers. As indicated earlier, there is a longstanding (but recently much-reduced) problem of highly permissive default parameter-settings, and a norm of wide-ranging permissions granted to user-accounts. That problem is compounded by a feature of Microsoft's architecture for server interactions with networked clients, which has had a variety of names, but is usefully referred to as 'ActiveX controls'. Microsoft's design inherently provides code that arrives at the client with access to the entire device. This is quite different from the much less insecure approach adopted by alternatives such as Java and (with qualifications) ECMAscript/Javascript, which are limited to a 'sandbox'.
The definitions of most categories of malware do not limit the vector used, although many categories may be usefully qualified by a vector-descriptor in order to identify a sub-category; hence, for example, an 'email virus'. A special case is so-called 'drive-by download'. The term refers generally to downloads that a user authorises unwittingly; but it is sometimes used specifically within the context of the Web; and it is sometimes used specifically for exploits on devices using Microsoft Windows that take advantage of the inbuilt vulnerability that ActiveX controls represent (Howes 2004).
The term 'payload' was originally used in the U.K. in about 1930, to refer to the carriage capacity of aircraft. In IT, from about the 1970s, it distinguishes the content of a communication from the means whereby it is delivered, including the addressing information. In the case of spam, for example, the payload is the text of the message itself, and/or the contents of an attachment to the message. The contents of an attachment can be in any form, including text, formatted text such as HTML, image, sound, video, executable code and invokable code that is not directly executable, including invokable code within HTML, in such forms as Javascript and VBscript.
In the case of malware, the payload is the active code that is delivered to the target device in order to perform some function or functions. The scope of the payload may include functions ancillary to the ultimate purpose, such as means of obscuring the existence or operation of the malware. However, the scope usually excludes code whose function is to cause the malware to replicate itself.
A considerable number of functions might be performed by malware. The following list of categories has been devised in order to reflect known functions and specialist terms, but also to be wide enough to cover other likely applications:
One commonly-used term in the malware arena has been omitted from the above list. The reason is that the term 'trojan' is subject to a wide variety of undisciplined usages. The common element in the various senses of the term is that a trojan appears to the user to perform one function, and may do so, but is designed to perform an additional function that the user is not aware of. This is a particularly easy confidence trick to use, because legitimate and illegitimate invitations to download and to run software are essentially indistinguishable from one another. The offer of anti-virus software is a particular favourite technique used by attackers.
In relation to the transmission vector, the two primary flavours of the term are:
In relation to the payload, the two primary flavours of the term can be represented as follows:
The analysis in this paper adopts the following terms and definitions, on the grounds that these two categories are valuable tools in security analysis and benefit from a term to describe them, whereas the others are already satisfactorily covered by other established terms:
The notion of 'invocation' is used to refer to the causing of the code to run in the target device. The generic 'invoke' is preferred to 'execute' because of the many forms in which code may be delivered. In particular, the code may be native to the instruction-set of the target device, or it may be in a form that requires a compiler, an interpreter or a run-time interpreter. The term 'invoke' also encompasses embedded code, such as macros within word processing and spreadsheet documents.
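The reason for preferring 'invoke' can be illustrated by mapping each form of delivered code to the facility the target device needs before that code can run. This Python sketch is an assumption-laden simplification: the enum members, the facility names and the `can_be_invoked` function are invented for illustration, and real environments are considerably more varied.

```python
from enum import Enum


class CodeForm(Enum):
    NATIVE = "code native to the device's instruction-set"
    HIGH_LEVEL = "high-level language code"
    MID_LEVEL = "mid-level language code"
    EMBEDDED_MACRO = "macro embedded in a document"


# For each form, the facilities any one of which suffices to invoke it (assumed names).
REQUIRES = {
    CodeForm.NATIVE: {"matching processor"},
    CodeForm.HIGH_LEVEL: {"compiler", "interpreter"},
    CodeForm.MID_LEVEL: {"run-time interpreter"},
    CodeForm.EMBEDDED_MACRO: {"macro interpreter"},
}


def can_be_invoked(form, facilities):
    """Return True if the device offers at least one facility able to run this form of code."""
    return bool(REQUIRES[form] & set(facilities))
```

For example, `can_be_invoked(CodeForm.EMBEDDED_MACRO, {"macro interpreter"})` is true on any platform that offers the interpreter, which is the sense in which macro-borne malware is potentially cross-platform.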
The device's operating system may include some safeguards against the unauthorised invocation of programs, such as specific permissions associated with each user-account, which limit the software that can be run from that account and/or the data that can be accessed from it. Effective malware needs to be able to circumvent or subvert such safeguards.
The following is a summary of key ways in which malware can be invoked:
The above categories can be grouped in several ways. One involves distinguishing between local invocation within the device, and invocation triggered remotely by an action by an attacker.
An item of malware, when invoked, may perform its function(s) and then terminate. Alternatively, it may remain memory-resident and active. A further variant is that it may remain memory-resident, but dormant, pending some trigger. Such software is commonly referred to as a 'daemon' or in Microsoft environments a 'Windows service'. A daemon is capable of performing functions that a one-time program is not, such as taking advantage of ephemeral data or a communications channel that is only open briefly.
Clarity of language is crucial to understanding an analysis. This paper has documented an approach to the categorisation of malware, which enables each particular kind of malware to be defined in terms of the vector it uses, the payload it carries, or the manner in which it is invoked.
The use of the definitions proposed in this paper should significantly reduce the ambiguity of statements that use the dozen or so relevant terms. Beyond overcoming mere ambiguity, this should result in fewer accidental generalisations whose scope needs to be qualified. It should also have the effect of guiding conversations and analyses towards appropriate inferences.
The ultimate purpose of a carefully formulated set of definitions set within a theoretical framework is the more effective design of safeguards. At a technical level, benefits include more reliable recognition of malware, and the avoidance of false-positives. Organisational measures should be enhanced through clearer communication as part of awareness, education and training. The definitions can also be used to ensure effective legal measures such as definitions of malware-related crimes.
Anderson R. (2008) 'Security Engineering: A Guide to Building Dependable Distributed Systems' Wiley, 2nd Edition, 2008
AS/NZS 3931 (1998) 'Risk Analysis of Technological Systems - Application Guide' Standards Australia, 1998
AS/NZS 4360 (1999) 'Risk Management' Standards Australia, 1995, 1999
AusCERT (2006) 'Protecting your computer from malicious code' AusCERT, 10 April 2006, at http://national.auscert.org.au/render.html?it=3352
Aycock J. (2006) 'Computer Viruses and Malware' Springer Advances in Information Security, Vol. 22, 2006, at http://vx.netlux.org/lib/mja01.html
Belton M. (2006) 'Understanding Malware and Internet Browser Security' Berbee Information Networks Corporation, May 2006, at http://www.berbee.com/public/learning/WP_UnderstandingMalware.aspx
Bradley T. (2006) 'Essential Computer Security: Everyone's Guide to Email, Internet, and Wireless Security' Syngress, 2006
Chen T. & Robert J.-M. (2004) 'The Evolution of Viruses and Worms' Chapter in Chen W.W.W. (Ed.) 'Statistical Methods in Computer Security' CRC Press, 2004, pp. 265-286, at http://vx.netlux.org/lib/atc01.html
Clarke R. (1988) 'Who Is Liable for Software Errors? Proposed New Product Liability Law in Australia' Xamax Consultancy Pty Ltd, December 1988, at http://www.rogerclarke.com/SOS/PaperLiaby.html
Clarke R. (1991) 'A Contingency Approach to the Application Software Generations' Database (Summer 1991), at http://www.rogerclarke.com/SOS/SwareGenns.html
Clarke R. & Maurushat A. (2007) 'Passing the Buck: Who Will Bear the Financial Transaction Losses from Consumer Device Security' J. of Law, Information and Science 18 (2007) 8-56, PrePrint at http://www.rogerclarke.com/II/ConsDevSecy.html
Erbschloe M. (2005) 'Trojans, Worms and Spyware: A Computer Security Professional's Guide to Malicious Code' Elsevier Butterworth-Heinemann, 2005, at http://vx.netlux.org/lib/ame01.html
Gutmann P. (2005?) 'The Convergence of Internet Security Threats (Spam, Viruses, Trojans, Phishing)', at http://www.cs.auckland.ac.nz/~pgut001/pubs/blended.pdf
Howes E.L. (2004) 'The Anatomy of a "Drive-by-Download"' Spyware Warrior, 29 March 2004, at http://www.spywarewarrior.com/uiuc/dbd-anatomy.htm
Jakobsson M. & Myers S. (eds.) (2006) 'Phishing and Countermeasures: Understanding the Increasing Problem of Electronic Identity Theft' Wiley, 2006
James L. (2005) 'Phishing Exposed' Syngress, 2005
Landoll D.J. (2005) 'The Security Risk Assessment Handbook: A Complete Guide for Performing Security Risk Assessments' CRC, 2005
Lehtinen R. & Gangemi G.T. (2006) 'Computer Security Basics' O'Reilly, 2nd ed., 2006
Mitnick K.D. & Simon W.L. (2002) 'The Art of Deception: Controlling the Human Element of Security' Wiley, 2002
Peltier T.R. (2005) 'Information Security Risk Analysis' Auerbach, 2nd Edition, 2005
Pfleeger C. & Pfleeger S. (2006) 'Security in Computing' Prentice Hall, 4th ed., 2006
Skoudis E. (2003) 'Malware: Fighting Malicious Code' Prentice Hall, 2003
Slay J. & Koronios A. (2006) 'Information Technology Security & Risk Management' Wiley, 2006
Solomon A. (1993) 'A Brief History of PC Viruses' various versions 1986-1993, at various locations, including http://vx.netlux.org/lib/aas14.html
Stafford T.F. & Urbaczewski A. (2004) 'Spyware: The Ghost in the Machine' Commun. Association for Information Systems 14 (2004) 291-306, at http://web.njit.edu/~bieber/CIS677F04/stafford-spyware-cais2004.pdf
Symantec (2009) 'Web Based Attacks' Symantec White Paper, February 2009, at http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/web_based_attacks_02-2009.pdf, accessed July 2009
Szor P. (2005) 'The Art of Computer Virus Research and Defense' Addison Wesley Professional, 2005, at http://vx.netlux.org/lib/aps00.html
AusCERT, at http://www.auscert.org.au/
Baranovich A. 'Virus Exchange Library', at http://vx.netlux.org/lib/
CERT, Bibliography of Security Books and Articles, at http://www.cert.org/other_sources/books.html
DOXdesk 'Definitions of parasite-related terms', at http://www.doxdesk.com/parasite/definitions.html
Edelman B. '"Spyware": Research, Testing, Legislation, and Suits', at http://www.benedelman.org/spyware/
Kaspersky, at http://www.viruslist.com/
SANS Institute, at http://www.sans.org/
Sophos, at http://www.sophos.com
Spyware Guide, at http://www.spywareguide.com/term_list.php
Symantec, at http://www.symantec.com/
US-CERT, at http://www.us-cert.gov/
Wikipedia Entries:
These two sample short descriptions are extracted from Chen & Robert (2004).
(1) The Bubbleboy virus, which came to light in early 2000:
"[the virus] took advantage of a security hole in Internet Explorer that automatically executed Visual Basic Script embedded within the body of an e-mail message. The virus would arrive as e-mail with the subject "BubbleBoy is back" and the message would contain an embedded HTML file carrying the viral VB Script. If read with Outlook, the script would be run even if the message is just previewed. A file is added into the Windows start-up directory, so when the computer starts up again, the virus e-mails a copy of itself to every address in the Outlook address books".
(2) The Blaster or LovSan worm, which came to light in August 2003:
"[The worm is] targeted to a Windows DCOM RPC (distributed component object model remote procedure call) vulnerability announced only a month earlier on July 16, 2003. The worm probes for a DCOM interface with RPC listening on TCP port 135 on Windows XP and Windows 2000 PCs. Through a buffer overflow attack, the worm causes the target machine to start a remote shell on port 4444 and send a notification to the attacking machine on UDP port 69. A tftp (trival file transfer protocol) "get" command is then sent to port 4444, causing the target machine to fetch a copy of the worm as the file MSBLAST.EXE".
[INSERT A SUITABLY FORMATTED VERSION OF THE SPREADSHEET]
The current version of the spreadsheet is here
Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., and a Visiting Professor in the Department of Computer Science at the Australian National University.
Created: 5 July 2009 - Last Amended: 21 September 2009 by Roger Clarke - Site Last Verified: 15 February 2009