Roger Clarke's 'Practicable Backup of Cloud Data'

© Xamax Consultancy Pty Ltd, 1995-2024

Can Small Users Recover from the Cloud?

Version of 9 July 2017

Computer Law & Security Review 33, 6 (December 2017) 754-767

Roger Clarke **

Available under an AEShareNet licence or a Creative Commons licence.

This document is at http://www.rogerclarke.com/EC/PBAR-SP.html

A fuller Working Paper version is at http://www.rogerclarke.com/EC/PBAR-SP-WP.html

The previous version is at http://www.rogerclarke.com/EC/PBAR-SP-160727.html

Abstract

Large numbers of small organisations and prosumers have shifted away from managing data on their own devices and are now heavily reliant on service-providers for both storage and processing of their data. Most such entities are also dependent on those service-providers to perform backups and enable data recovery. Prior work defining users' backup needs was applied to this context in order to establish specifications for appropriate backup arrangements. A sample of service-providers was assessed against those specifications. Their backup and recovery mechanisms were found to fall seriously short of the need.

1. Introduction
2. The Research Method Adopted
3. The Process Applied
4. Naive Cloud Usage
5. DIY Backup
6. Backup as a Service (BaaS)
7. Conclusions
References

1. Introduction

All organisations are responsible for managing their data effectively, but only those organisations that are of substantial size are capable of bringing appropriate resources to bear on the problem. The focus of this paper is on entities that lack scale and IT expertise. Further, the paper is concerned specifically with those aspects of data management that relate to data availability and integrity, and that are addressed by backup and recovery arrangements.

1.1 Small Users

The entities within scope of the paper are of several kinds. One is 'micro-organisations' that involve at most one or two individuals. They may or may not be incorporated, their activities may be stimulated by economic or social motivations, and they may be for-profit or otherwise. Some small organisations, with up to c.20 employees, have similar characteristics.

A further relevant category of entities is individuals who make relatively sophisticated personal use of computing facilities. This may be for the management of personal finance, tax and pension fund, for correspondence, for databases of images, videos and audio, or for a family-tree. Such individuals are referred to here as 'prosumers'. The term was coined by Toffler (1970, 1980), and has progressively matured (Tapscott & Williams 2006, Clarke 2008). A prosumer is a consumer who is proactive (e.g. is demanding, and expects interactivity with the producer) and/or is a producer as well as a consumer. In the context of computer usage, a third attribute of relevance is professionalism, to some extent of the person themselves but also in relation to their expectation of the quality of the facilities and services that they use.

The significance of the work reported in this paper extends beyond micro-organisations and prosumers, however. During the last two centuries, workers were mostly engaged full-time by organisations under 'contracts of service'. The last few decades have seen increasing casualisation of workforces, with large numbers of individuals engaged through 'contracts for services', giving rise to 'the gig economy'. This requires each individual to take a far greater degree of self-responsibility. To the extent that large organisations depend on sub-contractors' use of IT and management of data, the security risks faced by sub-contractors impact upon the organisations that engage them.

Risk importation occurs even in the case of conventional employees, because of the Bring Your Own Device (BYOD) phenomenon. On the one hand, this outsources IT device provision from the employer to the employee. On the other, it insources to the employer the insecurities of their employees' devices. A key risk is that data on which the organisation depends may not be subject to adequate backup and recovery arrangements.

A prior study was undertaken of the risks involved in consumer migration from local applications to remote services (Clarke 2011). That paper concluded with the following quotation: "Some cloud computing outfit is going to quickly and quietly shut down, taking with it the data (business, photos, video, memories, etc.) of tens of thousands of users. Once we're storing everything in the cloud, what's to keep us from losing everything in the cloud?" (Cringely 2011, emphasis in original). Clarke (2012) documented the extent and nature of cloud interruptions and failures in the period 2005-11. The present paper is motivated by the need for clear guidance for small users faced with such service-provider frailties.

1.2 Backup and Recovery

A key purpose of backup and recovery arrangements is to ensure that data continues to be available in its appropriate form, despite loss of, or compromise to, the primary copy. A range of alternative approaches exists. References of value to the research reported here were Chervenak et al. (1998), plus Lennon (2001), Gallagher (2002), Preston (2007), de Guise (2008), Strom (2010), TOB (2012) and Cole (2013).

Backup may be performed at the level of a database or a file. Alternatively, multiple versions of, or all changes to, data may be stored within a database or file. The separate copy/ies may be on the same storage-device, or on another local storage-device, or on a device in a known location that is sufficiently remote that it is not subject to the same local-area risks as the original copy, or on a device whose location is unknown. The copy/ies may be online or offline. The device(s) containing the copy/ies may be in the possession of the relevant entity or of another party. The relevant entity may or may not own the device(s) in question, and in either case exercise by the entity of its rights may be subject to limitations because of the rights of other parties.

Backup processes vary in terms of their immediacy and frequency, their scale, the quality assurance applied to the resulting copy/ies, and their accessibility by the relevant entity, and by other authorised parties. A substantial range of alternative forms of backup procedures exists, including within-file backup that sustains version history, full and incremental backups of storage volumes, mirroring, and spooling to new media.
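The distinction between full and incremental backups mentioned above can be illustrated with a minimal Python sketch. The function name `select_for_backup`, the dictionary of modification times and the example timestamps are all hypothetical, introduced only for illustration. Real backup tools generally use more robust change detection than a simple timestamp comparison.

```python
from typing import Optional


def select_for_backup(mtimes: dict[str, float],
                      last_backup: Optional[float] = None) -> list[str]:
    """Return the names of the files to copy in a backup run.

    mtimes maps each file name to its last-modification time (in seconds).
    With no previous backup (last_backup is None) the run is a full backup.
    Otherwise it is incremental and covers only files changed since then.
    """
    if last_backup is None:
        return sorted(mtimes)  # full backup: every file
    return sorted(name for name, t in mtimes.items() if t > last_backup)


# Hypothetical example data, not drawn from the study
files = {"ledger.csv": 300.0, "letter.txt": 100.0, "photo.jpg": 200.0}
full_run = select_for_backup(files)
incremental_run = select_for_backup(files, last_backup=150.0)
```

In this sketch the incremental run copies only the two files modified after the previous backup, which is why incremental backups are smaller but depend on an earlier full backup for recovery.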

A predecessor paper reported on an in-depth analysis of the backup needs of micro-organisations and prosumers that store the primary copy of their data in-house, under their own responsibility (Clarke 2016). That paper also addressed circumstances in which an entity uses a remote file-hosting service, but processes the files on the entity's devices. Appendix 2 to the predecessor paper provides comprehensive catalogues of the characteristics of backup data (incl. logical, physical and organisational locations, and accessibility), and of backup processes (incl. timeframe, scale, quality assurance and archival). Appendix 3 further identifies the contexts, processes and attributes of a dozen specific forms of backup procedure.

1.3 The Cloud

During the last decade, the user's proximity to their data has diminished. The data used to be 'here', on the consumer's own device. It moved to 'there', as consumers used relatively local Internet Services Providers, with a known footprint. As the dependency came to be on large national ISPs, and particularly on ISPs outside the consumer's local jurisdiction, the footprint became less visible, and the data moved 'somewhere'. To the extent that cloud computing is applied, consumers' data is now 'anywhere'.

At the applications level, cloud computing takes the form of so-called [Application] Software as a Service (SaaS) offerings. Under the SaaS model, the service-provider both stores the primary copy of the data and performs much of the processing (Armbrust et al. 2010, Höfer & Karagiannis 2011). The user has a relatively thin application on their desktop and/or laptop, in many cases in the form of scripts downloaded to their browsers, or a small 'app' on their smartphone and/or tablet.

The term 'SaaS' is associated with office automation services such as Google Docs, and the customer relationship management (CRM) service Salesforce. However, the pattern was emergent for some years, in such forms as webmail (operated both by local ISPs and by large providers such as Hotmail, Yahoo! and Gmail), family-tree data (e.g. ancestry.com) and textual documents, commonly called web-logs or blogs (e.g. wordpress.com). The more sophisticated forms that have emerged since about 2005 extend 'outsourcing' to 'cloudsourcing' by taking advantage of inexpensive commoditised hosts and virtualisation features, which has had the effect of articulating the industry into a wholesale-retail network model.

A SaaS segmentation analysis was presented in Clarke (2011). Since then, however, there have been further developments in SaaS offerings. No empirically-based taxonomy was located in the formal literature. One reason for this is that market offerings continue to develop. For example, backup as a service, and disaster recovery as a service, were emerging during the period in which this study was undertaken. The following segments are proposed as a means of identifying potential objects of study, on the basis of examples and partial classification schemes evident in both refereed and commercial literatures:

Communications and Collaboration Services
Examples include email services such as Yahoo!, Hotmail and Gmail, proprietary asynchronous messaging services, synchronous messaging services (chat, Instant Messaging), shared diaries, and project management services
Office Suites for Document Preparation and Maintenance
Examples include Zoho, Google Docs / Apps (in late 2016 re-named G Suite), MS Office 365 and collaboration-oriented services such as Huddle and Atlassian Confluence
Blog-Sites and Content Management System (CMS) Services
Examples include Wordpress.com and Google Blogger, and the highly-fragmented market of SaaS for CMS
Database Services, which comprise both generic offerings and multiple sub-categories:
- Customer Relationship Management (CRM), e.g. Salesforce
- Accounting, Financial Management Information Systems (FMIS) and Enterprise Resource Planning (ERP), such as NetSuite, MYOB, Xero
- Investment Portfolios, such as SelfWealth
- Audio Repositories, such as iTunes
- Image Galleries, such as Instagram, Flickr and Picasa
- Video Galleries, such as YouTube and Vimeo
- Family Trees, such as Ancestry.com and FamilySearch.org

1.4 SaaS and Backup

Dependence on Software as a Service gives rise to additional data risks (Clarke 2013), and backup and recovery processes have vital roles to play in managing a number of them. An early statement of the problem appeared in Armbrust et al. (2010): "customers cannot easily extract their data and programs from one site to run on another". This was more fully articulated by Buffington (2012): "What about if the SaaS provider goes dark? Maybe out of business? Perhaps a victim of Denial of Service attacks or broad data corruption (that is then replicated between sites). What is your plan? Do you back up the data from your SaaS provider? In what format(s) is the backup in? Is the data readable or importable into a platform that you own? How would you bring the functionality back online for your local users? for your remote users? Most importantly, have you tested that recovery?". Some further insights are available in the commercial literature, e.g. Taylor (2014), Buffington (2014, 2015).

In formal literatures, on the other hand, even within the cloud and SaaS security arena, the topic of backup and recovery has attracted limited attention. A Google Scholar search on <SaaS backup>, even in early 2017, delivered just a small number of articles, only a handful of which have achieved Google citation-counts, the highest a mere 20. In the IEEE Library, <SaaS backup> finds only 3 hits, one of them a preliminary report by this author (Clarke 2015) on the project reported in greater detail in the present paper. In the 30,000-entry eLibrary of the Association for Information Systems (AISeL), no articles had <SaaS AND backup> in the Title or Abstract, and the 12 with SaaS in the Abstract and backup in the text made only fleeting mention of backup, and almost all were attitudinal surveys that made no contributions to design. Even an article on selection criteria for SaaS services barely mentioned the topic (Repschlaeger et al. 2012).

Broader searches on <SaaS security> delivered far more articles, with far more citations. On the other hand, of the 70 hits on the IEEE library using variants of the search-terms <cloud security backup>, few actually focus on backup issues. A 350-page book on 'cloud security' has less than a page on backups, and only 8 pages on business continuity and disaster recovery as a whole (Krutz & Vines 2010). Of the highly-cited papers in the area, Balachandra et al. (2009), Subashini & Kavitha (2011), Javaraiah (2011), Chen & Zhao (2012) and even AlZain et al. (2012) address security generally but with either superficial mentions of data backup and recovery, or none at all. Among the relevant publications of standards bodies, NIST (2011) is vacuous. ENISA (2015) advises SMEs that "Customers should assess which backups are made by the provider and if they need to implement or request additional back-up mechanisms ... Customers should assess which data is stored server-side, and client-side" (p.9). However, ENISA fails to provide any guidance as to what information to gather, how to evaluate it, and what requirements to communicate to their suppliers. The modest commercial literature contains useful observations, but limited guidance (e.g. Menn 2011, Cringely 2011).

The research reported on in this article set out to fill what appears to be a yawning gap in the literature, by defining the needs of the micro-organisation and prosumer market-segment, and evaluating a sample of SaaS providers against these requirements.

2. The Research Method Adopted

The purpose of the research reported in this paper was defined as:

To develop practical guidance on how micro-organisations and individuals can use backup and recovery techniques to address data risks arising in the context of remote, cloud-based (SaaS) services

The analysis in the prior paper on backup of locally-stored data (Clarke 2016) proposed a customised process for performing risk assessment focussed specifically on matters for which backup and recovery are relevant safeguards. It applied the conventional security model, as summarised in Appendix 1 to that paper, and drew on the accumulated literature on risk assessment generally, in particular AS 4360-2004 (which is more specific than the successor ISO 27000 series including 27017 and 27018) and NIST (2012, 2013). See also ISO 19086 re Service Level Agreements. A straightforward process was proposed whereby risk assessment could be conducted by or for small organisations and prosumers. That process is presented in Table 1.

Table 1: The Risk Assessment and Management Process

Analyse

Define the Objectives and Constraints
Identify the relevant Stakeholders, Assets, Values and categories of Harm
Analyse Threats and Vulnerabilities
Identify existing Safeguards
Identify and Prioritise the Residual Risks

Design

Identify alternative Backup and Recovery Designs
Evaluate the alternatives against the Objectives and Constraints
Select a Design, or adapt / refine the alternatives to achieve an acceptable Design

Plan the implementation
Implement
Review the implementation

____________________

It is not feasible to conduct risk assessment on a general case. So a particular profile was developed for use in the research. It is defined in Table 2. The criteria used in devising it were:

simplicity, in order to keep the scale of the work and its presentation within bounds
representative value, i.e. including a moderate proportion of the elements that arise in more complex cases, such that it can reasonably be used as a proxy for prosumers more generally, and for micro-organisations; and
familiarity to the author, in order to take advantage of existing knowledge and obviate the need for field-work during this phase of the project

Table 2: The Test-Case

A person who:

is a moderately sophisticated user of computing devices
uses their computing devices for personal activities and/or in support of a micro-organisation
has limited professional expertise in information technology matters

The functions that the person performs are primarily:

preparation and amendment of documents
maintenance of data-sets and databases in a variety of formats, including text, structured data, image, sound and video
exchange of communications with other people
access to web-sites
maintenance of their own web-sites
the use of Internet Banking and eCommerce, but only as a purchaser, not as a merchant

The person operates out of a home-office using equipment as follows:

a desktop device
a portable / laptop / clam-shell device
a handheld device

The person has limited and/or somewhat haphazard backup and recovery arrangements in place

The person is not likely to be a specific target for attackers - as distinct from being subject to random unguided attacks by malware. This intentionally excludes categories such as private detectives, IT security contractors, and social and political activists, all of whom face the risk of being directly targeted by opponents and by government agencies

___________________

The research reported here applied risk assessment and risk management techniques in order to design backup plans. The final step was an initial contribution towards evaluation of those plans. This was achieved by examining the feasibility of implementing the plans in relation to a small sample of SaaS services. The choice among the many services needed to reflect the diversity of the market segments discussed earlier and to embody sufficient differences in the nature of the services that they were likely to be somewhat independent studies rather than, in effect, the same study performed several times. On the other hand, the set needed to be sufficiently small to enable performance of the analysis with available resources. The following were selected:

an image database - Instagram
This is a mature service, targeted at consumers generally
a family-tree database - ancestry.com
This is a mature example of a structured database, and is depended on by many prosumers
an accounting service - Xero
This is a modern product that has enjoyed considerable growth in use by both small organisations and individuals
the leading CRM - Salesforce
This is mature, widely-used, the clear leader in the market-segment, and much-used by small organisations
a popular office suite - Google Docs
This is reasonably mature (although ever-changing), and its use within relevant market segments is widespread. The 'enterprise' version, Google Apps, was not evaluated, because it is for larger organisations. Nor was the relevant Apps add-on service called Vault - which has, however, been subject to criticisms (Gwava 2015)

This paper provides an overview of the process that was applied, summarises the requirements that were developed, and reports on the assessment of the above sample of SaaS services. A comprehensive report on the study is provided in an underlying Working Paper (Clarke 2017).

3. The Process Applied

This section summarises the assessment of risks for which backup and recovery are relevant safeguards. It follows the sequence of steps specified under 'Analyse' in Table 1.

A lengthy, formal statement of the objectives and constraints was defined. Expressed in terms more appropriate to small users, on the other hand, the aim was:

To achieve reasonable levels of data availability and integrity for reasonable cost

An analysis was undertaken of the interests of the particular test-case user defined in Table 2. This identified relevant data assets, values associated with those assets, and categories of harm that the data could suffer. The user values that could be damaged are listed in Table 3.

Table 3: Damage to Small Users' Values

Reduced Asset Value
Degraded Operational Capacity
Reduced Revenue or Amenity
Cost, Time, Effort and Economic Loss Incurred during Recovery
Damaged Reputation
Negative Privacy Impact on Individuals
Non-Compliance with Obligations or Commitments

The next phase of the process involved the identification of threats and vulnerabilities that could give rise to such harm. To facilitate communication with small users, a mnemonic approach was adopted to Threats, using FATE (Fire, Attack, Training and Equipment) to represent respectively Environmental Events (F), Attacks (A), Human-Caused Accidents (T) and Accidents within Infrastructure (E). Further details are in Table 7 of Clarke (2016).

A number of key aspects of SaaS service-provision then needed to be considered. One concern is the lack of clarity about what obligations a SaaS provider owes to its customers, how long they last, and whether and how they could be enforced. A contract may exist - provided that all of the conditions for the formation of a contract are satisfied, notably the provision of consideration by both parties. If so, then the contract is governed by the terms that the provider displays, together with any overriding conditions imposed by such jurisdiction(s) as may be relevant to the context (Millard 2013, especially Chapters 2-5). Generally, the terms published by SaaS providers offer very little in the way of undertakings to the user: Warranties in relation to the availability of the data are uncommon, even more so warranties in relation to ongoing availability, and especially long-term availability. Moreover, terms are in most cases imposed by providers on small users, are generally malleable (typically changeable unilaterally by the provider, with little or no notice), and are in practice (and possibly even at law) unenforceable (because small users rarely have a realistic ability to sue a large corporation that has little or no local footprint, that is difficult to make contact with, and whose terms stipulate a jurisdiction remote from the user and expensive to litigate in). Regulators have been slow to act in relation to SaaS providers' terms, although the Competition and Markets Authority has very recently brought pressure to bear on providers in the UK. In short, the backup features of SaaS services are generally whatever they are, and as a matter of practice, and even of law, there is generally very little that the user can do about them.

Secondly, it is unclear what degree of reliance a small organisation or individual can place on the continued existence of each particular SaaS service. Most SaaS providers offer services in order to make a profit. So lines of business that are not sufficiently profitable are routinely withdrawn. Some providers also close users' accounts if they consider that they have not been used sufficiently recently. In addition, it is in the interests of a commercial service-provider to make it difficult for users to extract their data, because that raises the switching costs and achieves 'loyal' (in the narrow sense of 'locked-in') clients. In practice, reliability of access has mostly been reasonably good, but many instances of service-interruption and data-loss were documented in Clarke (2012), and little appears to have happened since then to reduce the concerns identified in that study. There are questions not only about the longevity of individual services, but also about the survival of each SaaS provider. For example, of the 25 CMS SaaS catalogued in Kaskade (2011), 7 (28%) appeared, a mere 4 years later, to no longer exist.

Withdrawals of services from the market, and failures of SaaS providers, have generally involved only short notice to users. This includes multiple withdrawn services that had been offered by very large corporations such as Google, Apple, Facebook and Microsoft. In some cases, the service was deemed to be no longer of interest to the company. In other cases, however, it was a competitive service that the larger corporation acquired and promptly closed down. An example of a withdrawn service is the drop.io file-sharing service, which was closed at 6 weeks' notice following takeover by Facebook. Similar fates befell Pownce and FitFinder (micro-blogging), Yahoo!'s Geocities (web-hosting), Photos and Briefcase (file-hosting), and a great many Google products including Buzz, Video, Knol, Health, Notebook, Orkut and Panoramio.

Given the considerable scope for SaaS services to fail, data escrow becomes an attractive idea. Data escrow involves a copy of the data being on deposit with a trusted third party, which is contractually bound to deliver it to the right people in the right circumstances. Circumstances relevant to the present context are failure of the cloud service-provider, and withdrawal of its service. (The focus of escrow arrangements is on disaster scenarios rather than routine operations). There are a number of patents in the area, but the slim formal literature is primarily concerned with personal privacy protections rather than the data values addressed in this paper. An example of a functioning data escrow service is that provided by the NCC Group in relation to DNS registry data; but it does not appear that services implementing data escrow principles are generally available, and particularly not in relation to SaaS services.

A third concern about SaaS services is that many of them store their customers' data in proprietary formats. The question therefore arises as to whether data recovered from a backup is in a format that makes it effectively available to the customer. When the EU's General Data Protection Regulation (GDPR) comes into effect in May 2018, Art. 20 relating to Data Portability will create a requirement for the export of data in "a structured, commonly-used and machine-readable format"; but whether this will be an effective solution in all circumstances remains to be seen.

Some data is of ephemeral value, and the functions that a prosumer or micro-organisation needs to have assured access to may be no more than creation, in most cases by themselves; access, perhaps only by themselves, but in many cases by other people as well; and prevention of inappropriate access. Indeed, some SaaS providers appear to work on the presumption that immediacy and 'living for the now' are dominant and that long-term planning for data is out-of-step with the zeitgeist.

On the other hand, a considerable amount of data retains value over an extended period of time. In some circumstances, that period may be definable (e.g. 5-7 years for tax law compliance); but in others the period may be indeterminate (e.g. family trees and photo-albums). For data with longer-term value, users are reliant on ongoing access to a broader range of functions, including creation, maintenance, management, access, prevention of inappropriate access, backup and recovery.

A range of pre-existing factors intentionally or incidentally mitigate the risks confronting the test-user described in Table 2. Examples of existing safeguards include cautious human behaviours, physical security safeguards, accounting controls, and insurance policies. By considering the specific user profile on which this study was based, it was then feasible to draw conclusions about the residual risks that need to be managed, together with an assessment of their likelihood and severity. The assignment of probability and severity ratings is most comprehensively presented in NIST (2012, Appendices G, H and I). The ratings are a matter of judgement, which should in principle be made by the responsible user, in their own particular context. In practice, however, many small users are likely to have common needs. The outcomes are presented in Table 4.

Table 4: Priority Threat-Vulnerability Combinations

	Risk	Probability Rating (H, M, L)	Severity Rating (E, H, M, L)
1.	Unavailable Data or Unavailable Service to Access the Data Long-Term	Medium	Extreme
2.	Inaccessible Service – Wide Area Network Failure / Congestion Long-Term	Medium	Extreme
3.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term	Medium	Extreme
4.	Inaccessible Data (Data-Format unable to be processed by the user) Long-Term	Medium	Extreme
5.	Unavailable Data or Unavailable Service to Access the Data Short-Term	High	High
6.	Inaccessible Service – Wide Area Network Failure / Congestion Short-Term	High	High
7.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term	High	High
8.	Inaccessible Data (Data-Format unable to be processed) Short-Term	Medium	High
9.	Mistaken User Amendment, Deletion or Overwriting of a File	Medium	High
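As an illustration only, the entries in Table 4 could be held in a simple risk register and ordered programmatically. The following Python sketch assumes ordinal scores for the qualitative ratings and a sort on severity first and probability second. Both the scores and the sort rule are assumptions made for this example rather than a method stated in the paper, and the function name `prioritise` and the shortened risk labels are hypothetical.

```python
# Ordinal scores are an assumption made for illustration only;
# the paper assigns qualitative ratings and does not prescribe numbers.
PROBABILITY = {"Low": 1, "Medium": 2, "High": 3}
SEVERITY = {"Low": 1, "Medium": 2, "High": 3, "Extreme": 4}

# (item number, shortened label, probability, severity), transcribed from Table 4
RISKS = [
    (1, "Unavailable data or service, long-term", "Medium", "Extreme"),
    (2, "Wide area network failure, long-term", "Medium", "Extreme"),
    (3, "Local network or connection failure, long-term", "Medium", "Extreme"),
    (4, "Data-format unable to be processed, long-term", "Medium", "Extreme"),
    (5, "Unavailable data or service, short-term", "High", "High"),
    (6, "Wide area network failure, short-term", "High", "High"),
    (7, "Local network or connection failure, short-term", "High", "High"),
    (8, "Data-format unable to be processed, short-term", "Medium", "High"),
    (9, "Mistaken user amendment, deletion or overwriting", "Medium", "High"),
]


def prioritise(risks):
    """Order risks by severity, then probability, then item number."""
    return sorted(
        risks,
        key=lambda r: (-SEVERITY[r[3]], -PROBABILITY[r[2]], r[0]),
    )


# Start from a scrambled (reversed) list to show that the ordering is recovered
ranked = prioritise(list(reversed(RISKS)))
for number, label, probability, severity in ranked:
    print(f"{number}. {label}: probability {probability}, severity {severity}")
```

Under these assumptions, the sort happens to reproduce the sequence in which Table 4 lists the items. A user who judges probability to matter more than severity would obtain a different ordering, which reflects the point that the ratings are a matter of judgement.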

Each of the items in Table 4 may arise from a variety of circumstances. In the case of item 1, for example, the cause of unavailability might be:

failure of technical infrastructure of the SaaS service-provider, or of an organisation on which it depends
malfunction of an automated process, or malperformance of a manual procedure, of the SaaS service-provider or of an organisation on which it depends
denial of service attack on the network, equipment, processes or data of the SaaS service-provider, or of an organisation on which it depends. These include technical attacks, organisational attacks such as exercise of a lien due to non-payment of fees or other breaches of contract, and legal attacks such as injunctions and lawful (or otherwise) instructions (or even mere requests) by a government agency
withdrawal of service by the SaaS provider, or by an organisation on which it depends, due to a corporate decision to cease offering it
insolvency of the SaaS service-provider, or of an organisation on which it depends
withdrawal of service to the particular customer by the SaaS provider, e.g. due to non-payment of fees levied on the customer
deletion or loss of files, e.g. after formal termination of contract, or after expiry of contract is deemed by the SaaS provider to have occurred, due to non-access to the data within some period of time, or breach or non-fulfilment of a term of service

The process then moved into the design phase, by considering alternative backup and recovery strategies and tactics, and selecting appropriate designs for various circumstances. Any particular user may have data of varying degrees of criticality. For example, a set of images of previous work that a person has performed may be primarily of archival interest and have latent rather than actual value; whereas a debtors' ledger may be critical to the survival of the user's business. The analysis conducted here is relevant to all small users for whom any of their data is critical to the performance of their functions. Of course, many of them may only become aware of that criticality too late, once the damage has been done.

Because of the uncertainties about supplier and service reliability and about accessibility during the lifetime of the data's value, one of the most critical design choices is whether the backup is stored locally by the user, or remotely by some other party. Three approaches are shown in Table 5. In the first approach, the user relies entirely on the SaaS provider. In the second, the user draws backups down to their own site. In the third case, the user relies on the services of a further party, referred to here as a 'Backup as a Service' or BaaS provider. In these three cases, the user is respectively entirely dependent upon and trusting of the service provider (referred to here as the 'naive cloud' approach); or is required to understand, design, implement and perform backup ('DIY Backloads'); or is dependent on two service-providers rather than just one ('BaaS on SaaS').

Table 5: SaaS Backup Alternatives

Short Description	Indicative Timeframe	Location: 1st-Level Backup	Location: 2nd-Level Backup	User Experience
Naive Cloud	2010-	SaaS Provider	SaaS Provider	Implicit Trust
DIY Backloads	2015-	SaaS Provider	Local	More Demanding
BaaS on SaaS	2015-	SaaS Provider	BaaS Provider	Less Demanding

The following sections consider in turn the three contexts identified in Table 5. In each case, a requirements statement was first derived from the literature and the preceding analysis, and then the offerings of a sample of SaaS service-providers were evaluated against the requirements.

4. Naive Cloud Usage

This section considers the most common, and generally the default, approach, which involves implicit trust placed by users in the provider. A list of requirements was developed which addresses the high-priority threat-vulnerability combinations identified in Table 4. The requirements were categorised into infrastructure features, file-precautions, backup runs and business processes. Of particular significance were a high to very high level of service-availability; continual saves of changes made to files; an appropriate approach to file-versioning; malware management; backup to a geographically remote location and into off-line storage; and effective and efficient recovery procedures. Further details are available in the underlying Working Paper (Clarke 2017).

Strategies were sought for dealing with the priority threat-vulnerability combinations. Unfortunately, as indicated in Table 6, where full reliance is placed on the SaaS provider, almost all of them are incapable of being satisfactorily mitigated.

Table 6: Risk Management Strategies - Naive Cloud Usage

	Risk	Strategy
1.	Unavailable Data or Unavailable Service to Access the Data Long-Term	Nil, for the duration of the (long) Unavailability
2.	Inaccessible Service – Wide Area Network Failure / Congestion Long-Term	Nil, for the duration of the (long) Inaccessibility
3.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term	Nil, for the duration of the (long) Inaccessibility
4.	Inaccessible Data (Data-Format unable to be processed by the user) Long-Term	Nil
5.	Unavailable Data or Unavailable Service to Access the Data Short-Term	Nil, for the duration of the (short) Unavailability
6.	Inaccessible Service – Wide Area Network Failure / Congestion Short-Term	Nil, for the duration of the (short) Inaccessibility
7.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term	Nil, for the duration of the (short) Inaccessibility
8.	Inaccessible Data (Data-Format unable to be processed) Short-Term	Nil
9.	Mistaken User Amendment, Deletion or Overwriting of a File	Continuous or Continual Versioning

The final step in the research reported in this paper was to conduct an assessment of the capabilities of a sample of SaaS providers in comparison with the requirements of small users. This was done by reviewing each supplier's product description, terms of service, product commentaries, and such other sources as could be located, particularly organisations that offer backup products or services relevant to it. The reviews were complemented by searches for relevant published papers in both the academic and popular literatures - although only modest numbers of them were found.

A number of features were common across most or all of the five providers. In all cases, the SaaS providers' terms emphatically deny all avoidable warranties about the service's quality, and about the availability of the service and the availability or integrity of the data. Little or no information is provided about the nature of the backups undertaken, or the procedures involved. There appears to be no data escrow. There appear to be no long-term archival copies. Use of the SaaS provider's facilities for the recovery of lost data is at best obscure, and in some cases at least discouraged and even non-existent. The approach adopted to malware management is at least unclear and in some cases is deficient. Further, in at least three cases, and possibly all five, the provider's terms enable the company to use and disclose user-content, for a very wide range of purposes, forever.

The remainder of this section identifies some of the more significant aspects relating to the individual services.

Instagram

Instagram offers a simple storage service for static files, primarily in order to make them accessible by other people, but of necessity also as a repository for them. It projects the feeling of social immediacy, and, in many circumstances, the images that users store there are of ephemeral interest. Halfway through the company's c. 10-page Terms of Use page is the statement "Instagram encourages you to maintain your own backup of your Content. In other words, Instagram is not a backup service and you agree that you will not rely on the Service for the purposes of Content backup or storage. Instagram will not be liable to you for any modification, suspension, or discontinuation of the Services, or the loss of any Content" (emphases added). It is unlikely that many users ever read that sentence, let alone grasp its significance. To the extent that images have ongoing value to users, it is abundantly clear that reliance on Instagram even as a medium-term repository, let alone as a long-term archive of visual memories (cf. 'a photo album'), is extremely unwise.

ancestry.com

This service enables the development and maintenance of one or more segments of structured family tree, supported by a substantial set of resources. Many of those resources are drawn or captured from public sources, but a vast amount of data has been contributed by the service's users. It appears that only the current content may be available for display, and that either versioning is not available to users, or is not performed. The availability and reliability of the services, and of backup and recovery, are unclear. Two-thirds of the way through the company's c. 10-page 'Terms and Conditions' web-page is this brief passage: "we shall not be liable to you for ... (v) loss or corruption of, or damage to, data ..." (emphasis added). Given the great value that people generally place on the data that they accumulate about their families, it is remarkable that there appears to be no mention of the provider's backup policies or practices anywhere on the site.

Xero

Xero provides an online accounting service. Given that accounting for even the tiniest micro-business involves periodic cycles, the need to capture data and generate reports for various periods at various times, and 5-7 year archival requirements, it would be reasonable to expect that the design and the advice to users would emphasise backup and recovery. During late 2016, a statement appeared on the Security page saying "Your backups are kept up to date. Online backups are updated throughout the day, every day, and stored in multiple secure locations". However, the following statements can be found half-way through the company's c. 10-page Terms of Use page: "Xero adheres to its best practice policies and procedures to prevent data loss, including a daily system data back-up regime, but does not make any guarantees that there will be no loss of Data", and "Xero expressly excludes liability for any loss of Data no matter how caused". The sum total of the assistance it provides to users concerning the steps that they need to take is "You must maintain copies of all Data inputted into the Service" (emphases added). Blind reliance on Xero by small users is a recipe for accounting disaster.

Salesforce

Salesforce is quite blunt about its shortcomings in this area. Its web-page on Backup and restore your Salesforce data declares that "Salesforce Support doesn't offer a comprehensive data restoration service", and warns the user not to rely on it for recovery purposes: "Although Salesforce does maintain back up data and can recover it, it's important to regularly back up your data locally so that you have the ability [to] restore it to avoid relying on Salesforce backups to recover your data. The recovery process is time consuming and resource intensive and typically involves an additional fee". It also explains that it is wildly expensive: "Data Recovery is a last resort process ... The price is a flat rate of $US 10,000" (emphases added). This would be very expensive for many micro-organisations and individuals, and prohibitive for some.

Google Docs

The core functions of Google Docs, as with any office suite, are the creation and modification of word processing documents and spreadsheets; but several further capabilities are included (Jeffries 2012). The storage of user files is managed by Google Drive. Google Drive has several modes of operation, which give rise to considerable complexity in relation to the ways in which backup and recovery need to be performed. When documents are being edited, progressive saves are made, and a concept of file-versions exists; but the frequency with which a new version of the file is created is not entirely clear. During the conduct of this project, despite searches being conducted on Google sites, and links being followed, no source was located that explained what backup and recovery processes exist within the Google Docs and related services. After the paper had been completed, a publication emerged that provided some of the relevant information in relation to Apps / G Suite, for large users, but not for Docs, for small users (Google 2017).

Security risks are highest in the case of word processing and spreadsheet documents, because they may contain active code. Despite this vulnerability, malware management is implemented by the service-provider in such a manner that some users have reported complete loss of data as a result of the penetration of their working devices by ransomware - because the Google Drive copies were then overwritten by encrypted copies when auto-synch'ing occurred (Austin 2013). It appears that, under more recent arrangements, prior copies on Google Drive may be able to be manually recovered.

______________

The support for backup provided by these five mainstream SaaS services is such that individuals and organisations that are reliant on being able to access data should not place implicit trust in any of them. Put another way, a rough score arising from the analysis is 0 (zero) out of 5. Every user needs to ensure that they retain copies of all source-materials that they used to create the data submitted to the SaaS service-provider, so that they can start again with an alternative provider if and when the first one fails them. This may require special measures if some of the source-materials were ephemeral, e.g. voice, telephone-calls or temporary on-screen displays. Moreover, given the current, clearly immature stage of the development of SaaS services, the cautionary statements made about the five test-providers may well be applicable to SaaS providers generally.

5. DIY Backup

This section investigates how a user can take Do-It-Yourself (DIY) responsibility for their own backup and recovery, despite entrusting both the primary copy of their data, and all processing functions, to a service-provider. This requires a backup to be held locally. The service-provider needs to fulfil a number of requirements, but safeguards continue to be needed on the individual's own devices as well.

The key issues are whether it is feasible for a user to extract their data from the SaaS provider's site, and whether the data that is extracted is in a usable format. Strategies were sought for dealing with the priority threat-vulnerability combinations identified in Table 4. As indicated in Table 7, most can, at least in principle, be addressed provided that one element of the plan is mirroring or periodic backload to the user's own site.

Table 7: Risk Management Strategies - DIY Backup

	Risk	Strategy
1.	Unavailable Data or Unavailable Service to Access the Data Long-Term	Mirroring or Periodic Backload to a local storage-device
2.	Inaccessible Service – Wide Area Network Failure / Congestion Long-Term	Mirroring or Periodic Backload to a local storage-device
3.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term	Mirroring or Periodic Backload to a local storage-device but only if directly-connected
4.	Inaccessible Data (Data-Format unable to be processed by the user) Long-Term	Mirroring or Periodic Backload to a local storage-device but only if reformatting is feasible
5.	Unavailable Data or Unavailable Service to Access the Data Short-Term	Mirroring or Periodic Backload to a local storage-device
6.	Inaccessible Service – Wide Area Network Failure / Congestion Short-Term	Mirroring or Periodic Backload to a local storage-device
7.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term	Mirroring or Periodic Backload to a local storage-device but only if directly-connected
8.	Inaccessible Data (Data-Format unable to be processed) Short-Term	Mirroring or Periodic Backload to a local storage-device but only if reformatting is feasible
9.	Mistaken User Amendment, Deletion or Overwriting of a File	Mirroring or Periodic Backload to a local storage-device, combined with Continuous or Continual File-Versioning
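
As a minimal sketch of the last row of Table 7, periodic backload combined with file-versioning could take a form such as the following Python fragment. It assumes that the user has already downloaded an export file from the provider, by whatever means the provider permits. The function name `store_version`, the directory layout and the version-naming scheme are illustrative assumptions, not features of any of the services assessed.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_version(export_file: Path, backup_dir: Path) -> Optional[Path]:
    """Copy export_file into backup_dir as a new numbered version.

    Returns the path of the new version, or None if the content is
    identical to the most recent stored version. Earlier versions are
    never overwritten, so a mistaken amendment or deletion upstream
    cannot destroy the older local copies.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded suffixes keep lexicographic order equal to version order
    # (hypothetical naming scheme; assumes no glob metacharacters in the name).
    existing = sorted(backup_dir.glob(f"{export_file.name}.v*"))
    if existing and sha256_of(existing[-1]) == sha256_of(export_file):
        return None
    target = backup_dir / f"{export_file.name}.v{len(existing) + 1:04d}"
    shutil.copy2(export_file, target)
    return target
```

Run after each scheduled download, a routine of this kind would retain every distinct state of the export, which is what allows recovery from risk 9 as well as from loss of the service. It does nothing to address risks 4 and 8, which depend on the format of the exported data.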

An assessment was undertaken of the extent to which the selected sample of SaaS providers enable small users' requirements to be fulfilled.

Instagram

It appears that Instagram does not support a backload to the user's own working-device. On the other hand, Instagram does provide an API that facilitates extraction of files by third parties, and three services were found that appear to use that API to enable files to be downloaded to the user's device: Instaport (which downloads all files), Grabinsta (for individual files only) and digi.me. Hence it would appear that an alert user can find a way to back up their images from Instagram. However, the statement on the company's c. 10-page Terms of Use page that "Instagram encourages you to maintain your own backup of your Content" is not supported by any indication of how it might be done.

Moreover, the Terms of Use of the API preclude its use "for any application that replicates or attempts to replace the essential user experience of Instagram.com or the Instagram apps". This has the appearance of an attempt to prevent users from switching their images to an alternative provider of image-hosting services; yet that is precisely what a user needs to do if Instagram under-performs, materially changes the service, withdraws it, or closes down.

ancestry.com

The family-tree data stored by ancestry.com is capable of being extracted by the user into a proprietary format, downloaded, and loaded into a copy of the inexpensive software product, Family Tree Maker. In early 2016, ancestry sold that product to the developer of the Mac OSX version, Software MacKiev. It appears that the product continues to be an effective means of performing DIY backup from the ancestry.com site, and the ancestry.com site provides guidance in relation to its use. It is also possible to extract data into another format, GEDCOM, which is used by a number of alternative software products (FHD 2017).

Xero

Xero provides functions that enable users to export files to their own devices. However, despite the five-decade history of electronic data interchange (EDI), the company does not appear to have applied an industry-standard format for expressing the chart of accounts, transaction data and report-format parameters.

Moreover, the primary format in which data can be extracted is comma-separated values (CSV) files. These suffer from the serious problem that no reliable field-separation meta-character exists. Further, there remain some doubts as to whether additional files loaded onto the SaaS service, such as scans of inbound invoices, are extractable. The effectiveness of backup to the user's site is accordingly a concern, and considerable accounting and IT expertise would be essential if an attempt were to be made to reconstruct the accounts from backups.
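
The field-separation concern can be illustrated with a short Python sketch. The record used is invented for illustration, and whether any particular provider's export applies quoting consistently is not established by the sources reviewed. The sketch shows that a naive split on commas mis-parses a field that legitimately contains a comma, whereas a reader that honours the quoting convention of RFC 4180 recovers the field, but only if the exporter applied that convention in the first place.

```python
import csv
import io

# Invented sample: the payee name legitimately contains a comma.
row = ["2017-06-30", "Smith, Jones & Co", "1250.00"]

# A writer that quotes any field containing the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted_line = buffer.getvalue().strip()

# An export that omits quoting loses the field boundary.
unquoted_line = ",".join(row)

naive = unquoted_line.split(",")
parsed = next(csv.reader([quoted_line]))

print(len(naive))     # 4: the payee name has been split in two
print(parsed == row)  # True: quoting preserved the three original fields
```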

Salesforce

Salesforce enables data to be extracted, but according to the company's site:

extraction can only be performed weekly or monthly
there may be frequent and long delays in a scheduled extraction being performed
it can only be performed manually by the user and cannot be automated
the data is delivered in comma-separated values (CSV) files, with the only other separator available being a space. This is seriously problematic, because:
- many of the fields already legitimately contain commas and/or spaces
- each CSV is an unstructured, flat file, whereas the database is structured
- there may be scores of record-types and hence scores of flat files
- no documentation was evident of the data model used by the service, and hence the relationships among the many flat files may be obscured and hence may need to be interpolated
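
To illustrate what interpolating the relationships among flat files involves, the following Python sketch relinks two invented CSV extracts through a shared identifier. The file contents and the column names (`Id`, `AccountId`, `Name`, `LastName`) are hypothetical. In the absence of a documented data model, a user would first have to infer which columns play these roles.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extracts: one flat file per record-type.
accounts_csv = "Id,Name\nA1,Acme Pty Ltd\nA2,Bravo & Sons\n"
contacts_csv = "Id,AccountId,LastName\nC1,A1,Nguyen\nC2,A1,Patel\nC3,A9,Orphan\n"

# Index the parent records by their identifier.
accounts = {r["Id"]: r for r in csv.DictReader(io.StringIO(accounts_csv))}

contacts_by_account = defaultdict(list)
orphans = []
for contact in csv.DictReader(io.StringIO(contacts_csv)):
    if contact["AccountId"] in accounts:
        contacts_by_account[contact["AccountId"]].append(contact["LastName"])
    else:
        # A dangling reference: the parent record is missing from the backup.
        orphans.append(contact["Id"])

for account_id, account in accounts.items():
    print(account["Name"], contacts_by_account.get(account_id, []))
print("Unlinked contacts:", orphans)
```

With scores of record-types, each such link has to be discovered and verified separately, which is the basis for the concern expressed above.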

In any case, there is no evidence that the data is of any value other than as a basis for reloading into Salesforce. Further, the company acknowledges that recreating a Salesforce database from a backup may be very difficult, and it cannot provide assistance: "Overall planning, development and implementation of a strategy to manage record restoration falls outside the scope of Support's offerings" (emphasis added).

Google Docs

Google Docs files are stored in a proprietary format, and as a result mirrored files are only readable using Google Drive Desktop App and/or the Google Chrome browser. In addition, only the current version is available, and versioning support is highly restricted.

In mid-2016, considerable difficulty was encountered in assessing the feasibility of conducting DIY backup from Google Docs to users' own storage-devices. Reasons for this include the lack of straightforward documentation on Google's many sites, the existence of multiple documents with mutually inconsistent content scattered across different structures, and the frequent changes made to Google services, often without notice to their users. A history of the techniques involved at various times is to be found in the successive, archived versions of Malunui et al. (2016). Some of the current pitfalls are described in Witzel (2016).

On the basis of the available sources, it appears that, until the end of 2014, there were considerable difficulties involved in extracting a usable backup of documents, and any particular user may or may not have been able to achieve it. Since the beginning of 2015, a backup-to-local-storage mechanism has been available, usefully described in Rowe (2015). Finding authoritative documentation on Google sites in the period September 2015 to March 2016 was challenging, and continually led to different source-pages. A snapshot of one page was taken in March 2016 (Google 2016).

By early 2017, however, it appeared that DIY backup was operational, with three relevant approaches identified by Malunui et al. (2016) - Download into .docx and similar, Sync to a local storage-volume, and Google Takeout to extract a compressed version of the contents of the user's Google Drive. An established third-party product, Insync, was available, and well-priced for small users.

_________________

Of the five services in the sample, only two, ancestry.com and Google Docs, enable an individual or small organisation to perform an effective backup to their own site in a reasonably straightforward manner. In one case, Instagram, third-party services may satisfy the need, but many small users may not find them. In the other two cases, there are conceptual and technical challenges that are sufficiently substantial as to be insurmountable for many users. So a rough score of the sample is 2.5 out of 5.

It would be a serious concern if the results from the sample apply to the population, because that would mean that only about half of all SaaS providers enable straightforward implementation of DIY backload to users' own devices. The problem may affect large and medium-sized organisations as well, but is particularly significant for individuals and small organisations that seek to, for example, maintain an image-library, ensure that they have access to an archive of their electronic tax records, not lose their data relating to customers and prospects, and not lose their correspondence files with government agencies, suppliers, employers, financial institutions, insurers and pension funds.

6. Backup as a Service (BaaS)

In this section, the possibility is considered of using an additional cloud service - a 'Backup as a Service' (BaaS) provider - as a defence of last resort against malfunction or malperformance by a SaaS provider.

The BaaS provider's focus is on backup and recovery services. However, they of course share common features with all other SaaS providers, in particular:

their services may cease if the account is not maintained
their commitment to providing the service may cease if it doesn't generate a profit or otherwise satisfy the corporation's current strategic objectives. For example, Google withdrew its Postini email backup service in 2012
circumstances could arise under which the user cannot access the data when they need it
the data that is accessed may not be in a usable format

Applying basic security principles, it is important to ensure that the SaaS and BaaS providers do not have common vulnerabilities, because that would reduce what appears to be the relative security of two points-of-failure to the relative insecurity of only one. Examples of situations to be avoided include physical co-location of the SaaS and BaaS, network co-location, common ownership, and storage of the two copies of data within the same jurisdiction. At this stage, the BaaS market-segment appears to be under-developed. A wholesale option exists in the form of Amazon Web Services (AWS) Glacier, but micro-organisations and prosumers would need an intermediary provider to utilise that service.
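
The independence test described above can be expressed as a simple comparison. The following Python sketch is illustrative only: the `Provider` fields mirror the four examples given in the text, and the provider names and attribute values are invented.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Provider:
    name: str
    data_centre: str   # physical location
    network: str       # upstream network or hosting platform
    owner: str         # ultimate corporate owner
    jurisdiction: str  # legal jurisdiction in which the data is stored


def shared_vulnerabilities(saas: Provider, baas: Provider) -> List[str]:
    """List the attributes on which the two providers coincide."""
    return [
        f.name
        for f in fields(Provider)
        if f.name != "name" and getattr(saas, f.name) == getattr(baas, f.name)
    ]


# Invented example values.
saas = Provider("ExampleSaaS", "Sydney-1", "HostCo", "Alpha Inc", "AU")
baas = Provider("ExampleBaaS", "Dublin-2", "HostCo", "Beta Ltd", "IE")
print(shared_vulnerabilities(saas, baas))  # ['network']
```

Any non-empty result indicates a point at which a single event could affect both copies of the data. In practice, the difficulty lies less in the comparison than in obtaining reliable values for the attributes, since providers seldom disclose them.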

Strategies were sought for dealing with the priority threat-vulnerability combinations identified in Table 4. The results are presented in Table 8.

Table 8: Risk Management Strategies - BaaS

	Risk	Strategy
1.	Unavailable Data or Unavailable Service to Access the Data Long-Term	Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected
2.	Inaccessible Service – Wide Area Network Failure / Congestion Long-Term	Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected
3.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term	Nil, for the duration of the (long) inaccessibility
4.	Inaccessible Data (Data-Format unable to be processed by the user) Long-Term	Mirroring or Periodic Backload to another cloud-provider but only if reformatting is feasible
5.	Unavailable Data or Unavailable Service to Access the Data Short-Term	Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected
6.	Inaccessible Service – Wide Area Network Failure / Congestion Short-Term	Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected
7.	Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term	Nil, for the duration of the (short) inaccessibility
8.	Inaccessible Data (Data-Format unable to be processed) Short-Term	Mirroring or Periodic Backload to another cloud-provider but only if reformatting is feasible
9.	Mistaken User Amendment, Deletion or Overwriting of a File	Mirroring or Periodic Crossload to another cloud-provider, combined with Continuous or Continual File-Versioning

The five sample SaaS were evaluated, to establish the extent to which the use of a BaaS is feasible.

Instagram

Instagram provides an API that facilitates extraction of files by third parties, such as Frostbox, iDrive and StreamNation. The nominal prohibition on use of the API "for any application that replicates or attempts to replace the essential user experience of Instagram.com or the Instagram apps" is presumably ignored by all parties. If a user searches for a BaaS provider, they may find one; but Instagram appears to do nothing to assist users' awareness of or knowledge about the available options.

ancestry.com

Although ancestry.com enables extraction of family-tree data into a proprietary format and possibly a second format, there do not appear to be any BaaS providers of backup services. It appears that ancestry.com may actively block this, in that it provides no API, and its terms expressly prohibit "distribution of your password to others for access to Ancestry".

Xero

Xero makes an API available to BaaS providers. Services that utilise the API include LedgerBackup (which emails a CSV file for each of the tables in Xero), Boxkite (which mirrors Xero files to Dropbox) and Safeguard-My-Xero. However, it is unclear whether any of these BaaS providers offer any service other than making CSV files available to the user in the event that the Xero service ceases to perform its function. It is also unclear whether all of the files that are extracted are capable of even being re-imported into Xero, let alone into another package or service, in order to re-constitute the set of accounts. It is accordingly far from clear that Xero is fit for use by any business that wants to be assured of its capacity to manage its accounts and comply with its obligations to tax agencies. There appears to be very little literature on this serious deficiency, but see Dorricott (2013).

Salesforce

Salesforce publishes APIs to BaaS providers, and a range of third parties claim to provide backup services. These include Backupify, Cloudally, Cloudfinder, Ownbackup, SesameSoftware, Skyvia and Spanning. However, the only way in which these backups can be used is to reload the data back into Salesforce. Moreover, it appears that all such products are oriented towards 'enterprises', i.e. large and medium-sized organisations, that none provide straightforward explanations of the kinds that would be comprehensible to micro-organisations and prosumers, and that costs may be of the order of USD 6,000 p.a. There appears to be no BaaS solution suitable for small users.

Google Docs

Google Docs is the subject of multiple third-party backup offerings, but they are oriented towards enterprises. In mid-2016, the BaaS service offerings for consumers and small business were far from clear, although a new offering was heralded by Witzel (2016) as providing the kind of BaaS service needed. In early 2017, Malunui et al. (2016) mentioned as third-party (BaaS) providers Spanning, Syscloud, or Backupify. However, Spanning's and Syscloud's offerings appeared to address only 'enterprise' needs in relation to Google Apps, not small users' needs in relation to Docs, and Syscloud's site stated that Backupify had been replaced by Syscloud. BackupGoo similarly focussed on Apps not Docs. In short, it is unclear what, if any, BaaS options exist for small users.

______________

The BaaS approach to backups appears to be in one case feasible, although not advertised by the provider (Instagram), and in one it may have become feasible as this paper was being finalised (Google Docs). In a third case, it is fraught with difficulty because of the absence of services oriented towards small organisations and individuals (Salesforce). In one case it appears to be impractical (Xero), and in the other infeasible (ancestry.com). A rough score would seem to be about 1.5 out of 5.

7. Conclusions

This paper has reported on research into the backup needs of micro- and some other small organisations, and of individuals. It has focussed on a test-case, in order to not merely provide general guidance, but also to deliver a specification that fulfils the declared objective. A companion paper presented backup plans for contexts in which the user is self-sufficient, or uses a backup service, or uses a file-hosting service for the primary copy of their files. This paper has examined the contemporary context in which the user is dependent on a service-provider for hosting not only the data but also the application software. It has devised strategies for three alternative approaches, and reported on an analysis of their feasibility in respect of a sample of SaaS service-providers. The combined rough score (0 for Naive Cloud Usage, 2.5 for DIY Backup and 1.5 for BaaS) was about 4 out of 15, or about 27%, which, in any normal evaluation scheme, is awarded a dismal Fail.

Complete reliance on these five SaaS providers is highly unlikely to result in an outcome that addresses the backup and recovery needs of individuals and small organisations. A user who takes responsibility for their own backups may be able to do so, but may need to reliably perform somewhat onerous and technically challenging tasks, and even then may not be able to achieve the intended result. The BaaS approach is currently at best problematical and at worst infeasible.

Possibly these five services are still at an immature stage in their development. On the other hand, all have been in existence for quite some time - Instagram 6 years (since 2011), Salesforce 18 years (1999) and Google Docs 11 years (2006), and the SaaS forms of ancestry.com 17 years (c.2000) and of Xero 7 years (c.2010). Allowance also has to be made for the possibility that these services are not intended for serious use - although only Instagram appears to have that kind of profile. A further consideration is that the sample might not be representative of the population of SaaS services intended for serious but small users.

The analysis needs to be subjected to review by peers, and improvements made to reflect their feedback. Further, the analysis needs to be applied to additional and special test-cases, reflecting the needs of small organisations and individuals with particular profiles different from that applied here. The analysis is likely to require adaptation to the extent that usage among the target market-segment of general-purpose computing devices (desktops and laptops) declines to the point that datafile creation and amendment are generally undertaken on appliances (whose current forms are smartphones and tablets). Because the capabilities of such devices are limited by their providers, they may not provide the scope for users to manage their backup arrangements. To the extent that usage of general-purpose computing devices by micro-organisations and prosumers ceases, or becomes prohibitively difficult or expensive, or simply prohibited, the DIY alternative would become infeasible.

The specific backup plans developed during this project and/or variants of and successors to them, are capable of being productised by providers. These include corporations that sell hardware, that sell operating systems, that sell pre-configured hardware and software, that sell value-added hardware and software installations, that sell storage-devices, that sell storage services, that sell SaaS services, and that sell BaaS services. In addition, current service-providers can use the risk assessment, or the specific backup plans, as a basis for evaluating their offerings and planning improvements to address deficiencies.

The analysis of five mainstream SaaS providers concluded that, in many cases, very substantial barriers prevent the implementation of effective backup strategies by small organisations and individuals using SaaS services. This is a very important conclusion, and one that does not appear to have been published in the literature to date. Given the vital economic, social and personal importance of assuring ongoing access to data, this is a fairly remarkable finding.

Reasons why the ideas suggested in this paper will not come about are easily found, including:

insecurity is an externality, whose negative consequences are not experienced by the cloud service provider
security features can be costly
demand by users is at least muted, and even absent
no comparative advantage is apparent to providers

On the other hand, there are drivers for these activities, and processes exist whereby the features may come about. For example:

features may 'trickle down' from enterprise versions
security-aware market segments may demand them
regulators such as tax agencies may exert moral suasion on providers

It is important that researchers apply the analysis conducted in this project to further SaaS providers, and that pressure be brought to bear on those, possibly many, providers that fall materially short of users' needs. The adoption of cloud solutions may have been enthusiastic, but it has been blind, and that blindness will have serious negative consequences for some users. The recent actions of the UK Competition and Markets Authority may provide a template for adoption by regulators elsewhere.

A further implication for academics is that the very limited attention paid to backup of SaaS-maintained data in the formal research literature, a full decade after it emerged, is harmful to users. One reason for the lack of a literature may be that Computer Science is concerned with much more technically advanced matters than backup and recovery. Another may be that the emphasis of Information Systems research is so strongly on social science perspectives and the observation of technology in use that only a limited amount of constructive research is being undertaken. Alternatively, IS researchers may have such a strong commitment to the needs of large corporations that the interests of small organisations and individuals are being largely ignored. A literature on consumer law in cloud contexts is emergent, however. See, for example, Millard (2013, Chapter 13).

As individuals increasingly act as prosumers, they can be expected to become more demanding, and more interested in investing in effective but practical backup arrangements. Greater economic significance will arise from the large numbers of micro-organisations that have become highly dependent on cloud-stored data, and whose performance will be seriously undermined by unrecoverable data loss. Meanwhile, the many large organisations that are significantly virtualised and dependent on subcontractors will become concerned about importing those subcontractors' security risks. They are therefore likely to bring pressure to bear on small and micro-businesses to demonstrate the appropriateness of their backup arrangements, and to provide warranties and indemnities.

The work reported here sounds the alarm. But it also lays a foundation for significant improvements in key aspects of the data security not only of prosumers and micro-organisations, but also of the larger organisations that depend on them.

References

AlZain M.A., Pardede E., Soh B. & Thom J.A. (2012) 'Cloud Computing Security: From Single to Multi-Clouds' Proc. 45th Hawaii International Conference on System Sciences, at http://www.computer.org/csdl/proceedings/hicss/2012/4525/00/4525f490.pdf

Armbrust M., Fox A., Griffith R., Joseph A.D., Katz R., Konwinski A. & Zaharia M. (2010) 'A view of cloud computing' Communications of the ACM, 53, 4 (April 2010) 50-58

AS 4360 (2004) 'Risk Management' Standards Australia, 2004

Austin R.P.B. (2013) 'Virus encrypted all google drive files - Cryptolocker virus' Google Drive Help Forum, October 2013, at https://productforums.google.com/forum/#!msg/drive/DmZKoIcAPzg/siCsN_lZDlQJ

Balachandra B.R., Paturi V.R. & Rakshit A. (2009) 'Cloud Security Issues' Proc. IEEE International Conference on Services Computing, 2009, at https://xa.yimg.com/kq/groups/2584474/89013670/name/NDU-3.pdf

Bradley T. (2011) '30 Days With...Google Docs: Day 25: Don't Lose Your Google Docs Data' PCWorld, May 2011, at http://www.pcworld.com/article/228707/day_25_dont_lose_your_google_docs_data.html

Buffington J. (2012) 'How do you back up SaaS? I'd like to know' ESG, 21 December 2012, at http://www.esg-global.com/blogs/how-do-you-back-up-saas-id-like-to-know/

Buffington J. (2014) 'SaaS Backup ... Hunger Games style' ESG, 11 December 2014, at http://blog.esg-global.com/saas-backup-...-hunger-games-style

Buffington J. (2015) 'Your SaaS Application needs to be Backed Up!' ESG, 14 May 2015, at http://research.esg-global.com/reportaction/Blog0514201504/TOC?include=backup%20SaaS

Chen D. & Zhao H. (2012) 'Data Security and Privacy Protection Issues in Cloud Computing' Proc. IEEE International Conference on Computer Science and Electronics Engineering, 2012, at http://xa.yimg.com/kq/groups/2584474/417972861/name/NDU-1.pdf

Chervenak A. L., Vellanki V. & Kurmas Z. (1998) 'Protecting file systems: A survey of backup techniques' Proc. Joint NASA and IEEE Mass Storage Conference, March 1998, at http://www.storageconference.us/1998/papers/a1-2-CHERVE.pdf

Clarke R. (2008) 'B2C Distrust Factors in the Prosumer Era' Invited Keynote, Proc. CollECTeR Iberoamerica, Madrid, 25-28 June 2008, pp. 1-12, at http://www.rogerclarke.com/EC/Collecter08.html

Clarke R. (2011) 'The Cloudy Future of Consumer Computing' Proc. 24th Bled eConference, June 2011, PrePrint at http://www.rogerclarke.com/EC/CCC.html

Clarke R. (2012) 'How Reliable is Cloudsourcing? A Review of Articles in the Technical Media 2005-11' Computer Law & Security Review 28, 1 (February 2012) 90-95, PrePrint at http://www.rogerclarke.com/EC/CCEF-CO.html

Clarke R. (2013) 'Data Risks in the Cloud' Journal of Theoretical and Applied Electronic Commerce Research (JTAER) 8, 3 (December 2013) 59-73, Preprint at http://www.rogerclarke.com/II/DRC.html

Clarke R. (2015) 'SaaS Backup Fails the Fitness for Purpose Test' IEEE Cloud Computing 2, 6 (Nov-Dec 2015) 58-63, PrePrint at http://www.rogerclarke.com/EC/FPTB.html

Clarke R. (2016) 'Practicable Backup Arrangements for Micro-Organisations and Individuals' Australasian Journal of Information Systems, 20 (September 2016), at http://dx.doi.org/10.3127/ajis.v20i0.1250, PrePrint at http://www.rogerclarke.com/EC/PBAR.html

Clarke R. (2017) 'Backup Strategies for Users Dependent on Service-Providers' Xamax Consultancy Pty Ltd, at http://www.rogerclarke.com/EC/PBAR-SP-WP.html

Cole E. (2013) 'Personal Backup and Recovery' Sans Institute, September 2013, at http://www.securingthehuman.org/newsletters/ouch/issues/OUCH-201309_en.pdf

Cringely R. (2011) '2011 prediction #8: Cloudburst' I, Cringely, 6 January 2011, at http://www.cringely.com/2011/01/2011-prediction-8-cloudburst/

Dorricott B. (2013) 'Read Xero's Terms and Conditions and then weep!' Meteorical, 31 December 2013, at http://www.meteorical.com.au/2013/12/read-xeros-terms-and-conditions-and-then-weep/

ENISA (2015) 'Cloud Security Guide for SMEs' European Union Agency for Network and Information Security, April 2015, at https://www.enisa.europa.eu/activities/Resilience-and-CIIP/cloud-computing/security-for-smes/cloud-security-guide-for-smes/at_download/fullReport

FMD (2017) 'Yes, You CAN Download Your Tree From Ancestry.com - Here's How' Family History Daily, 11 January 2017, at http://familyhistorydaily.com/genealogy-help-and-how-to/yes-you-can-download-your-tree-from-ancestry-com-heres-how/

Gallagher M.J. (2002) 'Centralized Backups' SANS Institute, July 2001, at http://www.sans.org/reading-room/whitepapers/backup/centralized-backups-513

Google (2016) 'Download your data' Google Inc., accessed 3 March 2016, at https://support.google.com/accounts/answer/3024190?hl=en, mirrored at http://www.rogerclarke.com/EC/Google-Dyd-160303.pdf

Google (2017) 'Google Cloud Security and Compliance Whitepaper: How Google protects your data' Google, May 2017, at https://storage.googleapis.com/gfw-touched-accounts-pdfs/google-cloud-security-and-compliance-whitepaper.pdf, mirrored at http://www.rogerclarke.com/EC/google-cloud-security-and-compliance-whitepaper.pdf

de Guise P. (2008) 'Enterprise Systems Backup and Recovery: A Corporate Insurance Policy' Auerbach, 2008

Gwava (2015) 'Top 10 Google Vault Archiving Drawbacks' Gwava, original of 18 June 2014, at http://www.gwava.com/blog/top-10-google-vault-archiving-drawbacks

Höfer C.N. & Karagiannis G. (2011) 'Cloud computing services: taxonomy and comparison' J Internet Serv Appl 2 (2011) 81-94, at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.460.3939&rep=rep1&type=pdf

IASME (2013) 'Information Assurance For Small And Medium Sized Enterprises' IASME Standard v. 2.3, March 2013, at https://www.iasme.co.uk/images/docs/IASME%20Standard%202.3.pdf

Javaraiah V. (2011) 'Backup for Cloud and Disaster Recovery for Consumers and SMBs' Proc. IEEE 5th Int'l Conf. on Advanced Networks and Telecommunication Systems (ANTS), December 2011

Jefferies C.P. (2012) 'Google Drive: The Differences Between the Web App and the Desktop App' Backupify, 15 August 2012, at http://blog.backupify.com/2012/08/15/google-drive-the-differences-between-the-web-app-and-the-desktop-app/

Kaskade J. (2011) 'The Reality Of Public Cloud' James Kaskade, June 2011, at http://jameskaskade.com/?p=1722

Krutz R.L. & Vines R.D. (2010) 'Cloud Security: A Comprehensive Guide to Secure Cloud Computing' Wiley, 2010

Lennon S. (2001) 'Backup Rotations - A Final Defense' SANS Institute, August 2001, at http://www.sans.org/reading-room/whitepapers/sysadmin/backup-rotations-final-defense-305

Malunui et al. (2016) 'How to Back Up Google Docs', December 2016, at http://www.wikihow.com/Back-Up-Google-Docs, with many successive archived versions from December 2013 onwards, at https://web.archive.org/web/20161219153448/http://www.wikihow.com/Back-Up-Google-Docs

Menn J. (2011) 'Cloud creates tension between accessibility and security' Financial Times, 14 November 2011, at http://www.ft.com/intl/cms/s/0/6513a4d6-0a06-11e1-85ca-00144feabdc0.html;2011. com/EC/CCEF-ITMediaReports-1109.rtf

Millard C. (ed.) (2013) 'Cloud Computing Law' Oxford University Press, 2013

NIST (2011) 'Guidelines on Security and Privacy in Public Cloud Computing' Special Publication 800-144, National Institute of Standards and Technology, December 2011, at http://csrc.nist.gov/publications/nistpubs/800-144/SP800-144.pdf

NIST (2012) 'Guide for Conducting Risk Assessments' National Institute of Standards and Technology, Special Publication SP 800-30 Rev. 1, September 2012, at http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-30r1.pdf

Preston W.C. (2007) 'Backup & Recovery' O'Reilly Media, 2007

Repschlaeger J., Wind S., Zarnekow R. & Turowski K. (2012) 'Selection Criteria for Software as a Service: An Explorative Analysis of Provider Requirements' AMCIS 2012 Proceedings, July 29, 2012, Paper 3, at http://aisel.aisnet.org/amcis2012/proceedings/EnterpriseSystems/3

Rowe W. (2015) 'How to Back Up Google Docs' Tech-Recipes, 4 January 2015, at http://www.tech-recipes.com/rx/52333/backup-google-docs/

Strom S. (2010) 'Online Backup: Worth the Risk?' SANS Institute, May 2010, at http://www.sans.org/reading-room/whitepapers/backup/online-backup-worth-risk-33363

Subashini S. & Kavitha V. (2011) 'A survey on security issues in service delivery models of cloud computing' Journal of Network and Computer Applications 34 (2011) 1-11, at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.259.6975&rep=rep1&type=pdf

Tapscott D. & Williams A.D. (2006) 'Wikinomics: How Mass Collaboration Changes Everything' Portfolio, 2006

Taylor C. (2014) 'Backup as a Service: To BaaS or Not to BaaS' Datamation, 10 November 2014, at http://www.datamation.com/cloud-computing/backup-as-a-service-to-baas-or-not-to-baas-1.html

TOB (2012) 'Types of Backup' typesofbackup.com, June 2012, at http://typesofbackup.com

Toffler A. (1970) 'Future Shock' Pan, 1970

Toffler A. (1980) 'The Third Wave' Pan, 1980

Witzel L. (2016) 'Do I Need To Back Up Google Docs? Absolutely' Spanning, 25 January 2016, at http://spanning.com/blog/do-i-need-to-back-up-google-docs-absolutely/

Acknowledgements

The assistance of Russell Clarke is gratefully acknowledged, in relation to conception, detailed design and implementation of backup and recovery arrangements for the author's business and personal needs, and for review of a draft of this paper. An assignment on a sub-set of this topic was set for ANU Computer Science students. The analysis reported here benefited from the submissions by Rebecca Catanzariti and Julie Noh, and by Patrick McCawley and Simon Whittenbury.

Author Affiliations

Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., and a Visiting Professor in the Research School of Computer Science at the Australian National University.

Created: 28 August 2014 - Last Amended: 9 July 2017 by Roger Clarke - Site Last Verified: 15 February 2009