Roger Clarke's 'Practicable Backup of Cloud Data'

Backup Strategies for Users Dependent on Service-Providers
A Working Paper in support of
'Can Small Users Recover from the Cloud?'

Version of 10 March 2017

Roger Clarke **

© Xamax Consultancy Pty Ltd, 2014-17

Available under an AEShareNet Free
for Education licence or a Creative Commons 'Some
Rights Reserved' licence.

This document is at http://www.rogerclarke.com/EC/PBAR-SP-WP.html


Abstract

Large numbers of small organisations and prosumers have shifted away from managing data on their own devices and are now heavily reliant on service-providers for both storage and processing of their data. Most such entities are also dependent on those service-providers to perform backups and enable data recovery. Prior work defining users' backup needs was applied to this context in order to establish specifications for appropriate backup arrangements. A sample of service-providers was assessed against those specifications. Their backup and recovery mechanisms were found to fall seriously short of the need.

This Working Paper provides details of the analysis that was undertaken, and includes text and tables that support the following:



1. Introduction

Organisations are responsible for managing their data effectively, but only those that are of substantial size are capable of bringing sufficient resources to bear on the problem. The focus of this paper is on entities that lack scale and IT expertise. Further, the paper is concerned specifically with the question of backup arrangements, whose purpose is to ensure that data continues to be available despite loss of, or compromise to, the primary copy.

The entities within scope of the paper are of several kinds. One is 'micro-organisations' that involve at most one or two individuals. They may or may not be incorporated, their activities may be stimulated by economic or social motivations, and they may be for-profit or otherwise. Some small organisations, with up to 20 employees, have similar characteristics.

A further relevant category of entities is individuals who make relatively sophisticated personal use of computing facilities. This may be for the management of personal finance, tax and pension fund, for correspondence, for databases of images, videos and audio, or for a family-tree. Such individuals are referred to here as 'prosumers'. The term was coined by Toffler (1970, 1980), and has progressively matured (Tapscott & Williams 2006, Clarke 2008). A prosumer is a consumer who is proactive (e.g. is demanding, and expects interactivity with the producer) and/or is a producer as well as a consumer. In the context of computer usage, a third attribute of relevance is professionalism, to some extent of the person themselves but also in relation to their expectation of the quality of the facilities and services that they use.

The significance of the work reported in this paper extends beyond micro-organisations and prosumers, however. During the last two centuries, workers were mostly engaged full-time by organisations under 'contracts of service'. The last few decades have seen increasing casualisation of workforces, with large numbers of individuals engaged through 'contracts for services'. This requires them to take a far greater degree of self-responsibility. To the extent that large organisations depend on sub-contractors' use of IT and management of data, the security risks faced by sub-contractors impact upon the organisations that engage them.

Risk importation occurs even in the case of conventional employees, because of the Bring Your Own Device (BYOD) phenomenon. On the one hand, this outsources IT device provision from the employer to the employee. On the other, it insources to the employer the insecurities of their employees' devices. A key risk is that data on which the organisation depends may not be subject to adequate backup and recovery arrangements.

A prior study was undertaken of the risks involved in consumer migration from local applications to remote services (Clarke 2011). That paper concluded with the following quotation: "Some cloud computing outfit is going to quickly and quietly shut down, taking with it the data (business, photos, video, memories, etc.) of tens of thousands of users.  Once we're storing everything in the cloud, what's to keep us from losing everything in the cloud?" (Cringely 2011, emphasis in original). Clarke (2012a) documented the extent and nature of cloud interruptions and failures in the period 2005-11. This paper is motivated by the need for clear guidance for small users faced with such service-provider frailties.

A predecessor paper reported on an in-depth analysis of the backup needs of micro-organisations and prosumers that store the primary copy of their data in-house, under their own responsibility (Clarke 2016). That paper also addressed circumstances in which an entity relies on a remote file-hosting service for storage of the primary copy of the files, but processes the files on the entity's own devices.

During the last decade, however, the user's proximity to their data has further diminished. The data used to be 'here', on the consumer's own device. It moved to 'there', as consumers used relatively local Internet Service Providers (ISPs), with a known footprint. As the dependency came to be on large national ISPs, and particularly on ISPs outside the consumer's local jurisdiction, the footprint became less visible, and the data moved 'somewhere'. To the extent that cloud computing is applied, consumers' data is now 'anywhere'. At the applications level, cloud computing takes the form of so-called [Application] Software as a Service (SaaS) offerings. Under the SaaS model, the service-provider both stores the primary copy of the data and performs much of the processing (Armbrust et al. 2010, Höfer & Karagiannis 2011). The user has a relatively thin application on their desktop and/or laptop, in many cases in the form of scripts downloaded to their browsers, or a small 'app' on their smartphone and/or tablet.

The term 'SaaS' is associated with office automation services such as Google Docs, and the customer relationship management (CRM) service Salesforce. However, the pattern was emergent for some years, in such forms as webmail (operated both by local ISPs and by large providers such as Hotmail, Yahoo! and Gmail), family-tree data (e.g. ancestry.com) and textual documents, commonly called web-logs or blogs (e.g. wordpress.com). The more sophisticated forms that have emerged since about 2005 extend 'outsourcing' to 'cloudsourcing' by taking advantage of inexpensive commoditised hosts and virtualisation features, and articulating the industry into a wholesale-retail network model.

A SaaS segmentation analysis was presented in Clarke (2011). Since then, however, there have been further developments in SaaS offerings. No empirically-based taxonomy was located in the formal literature. The following segments are proposed as a means of identifying potential objects of study, on the basis of examples and partial classification schemes evident in both refereed and commercial literatures:

Dependence on such services gives rise to additional data risks (Clarke 2013), and backup and recovery have a role to play in managing a number of them. An early statement of the problem appeared in Armbrust et al. (2010): "customers cannot easily extract their data and programs from one site to run on another". This was more fully articulated by Buffington (2012): "What about if the SaaS provider goes dark? Maybe out of business? Perhaps a victim of Denial of Service attacks or broad data corruption (that is then replicated between sites). What is your plan? Do you back up the data from your SaaS provider? In what format(s) is the backup in? Is the data readable or importable into a platform that you own? How would you bring the functionality back online for your local users? for your remote users? Most importantly, have you tested that recovery?". Some further insights are available in the commercial literature (e.g. Taylor 2014, Buffington 2014, 2015).

In formal literatures, on the other hand, even within the cloud and SaaS security arena, the topic of backup and recovery has attracted limited attention. Searches on <SaaS backup> delivered a small number of articles, only a dozen of which have achieved Google citation-counts, the largest of which was 12. In the IEEE Library, <SaaS backup> finds only 3 hits, one of them a brief opinion piece by this author (Clarke 2015b). Google Scholar finds a modest number of articles, but none have more than 20 citations. In the 30,000-entry eLibrary of the Association for Information Systems (AISeL), no articles have <SaaS AND backup> in the Abstract, and the 12 with SaaS in the Abstract and backup in the text were almost all attitudinal surveys, and made no contributions to design. Typical of the limited level of detail in these papers was the comment in one paper that "companies should place particular importance on defining careful and granular SLAs on security/privacy aspects including clear ... backup policies". Backup has attracted very little interest within the AIS community, with only 4 papers in the last decade including <backup> in the Abstract. Even an article on selection criteria for SaaS services barely mentioned the topic (Repschlaeger et al. 2012).

Broader searches on <SaaS security> delivered far more articles, with far more citations. A search of the IEEE library using <cloud security backup> found 70 hits, but few of these focus on backup issues. A 350-page book on 'cloud security' has less than a page on backups, and only 8 pages on business continuity and disaster recovery as a whole (Krutz & Vines 2010). Of the highly-cited papers in the area, Balachandra et al. (2009), Subashini & Kavitha (2011), Javaraiah (2011), Chen & Zhao (2012) and even AlZain et al. (2012) address security generally but with superficial mentions of data backup and recovery, or none at all. Among the relevant publications of standards bodies, NIST (2011) is vacuous. ENISA (2015) advises SMEs that "Customers should assess which backups are made by the provider and if they need to implement or request additional back-up mechanisms ... Customers should assess which data is stored server-side, and client-side" (p.9). However, ENISA fails to provide any guidance as to what information to gather, how to evaluate it, and what requirements to communicate to their suppliers. The modest commercial literature contains useful observations, but limited guidance (e.g. Menn 2011, Cringely 2011).

This article sets out to fill this gap in the IS literature, by defining the needs of the micro-organisation and prosumer market-segment, and evaluating a sample of SaaS providers against these requirements.


2. The Research Method Adopted

The previous section identified a shortfall in research relating to backup in SaaS contexts, including in relation to practicable backup plans for micro-organisations and prosumers. The purpose of the research reported in this paper was defined as:

to develop practical guidance on how micro-organisations and individuals can use backup techniques to address data risks arising in the context of remote, cloud-based (SaaS) services

Reflecting that purpose, the paper is not addressed exclusively to researchers. It is expressed in language that is intended to be accessible to professionals as well, and consequently avoids unnecessary intellectualisation of the issues.

The work adopted the design science approach to research (Hevner et al. 2004, Hevner 2007). In terms of the research method described by Peffers et al. (2007), the research's entry-point is 'problem-centred'. The process commences by applying risk assessment techniques in order to develop an articulated definition of the problem and of the objectives. An artefact is then designed - in this case a set of backup plans. In terms of Hevner (2007), the article's important contributions are to the requirements phase of the Relevance Cycle, and to the Design Cycle, drawing on existing theory in the areas of risk assessment, data management, and data security. The article contributes to the evaluation phase of the Relevance Cycle by applying the new artefact to a sample of SaaS services, and laying a firm foundation for application and field testing.

The analysis in the companion paper (Clarke 2016) proposed a customised process for performing risk assessment focussed specifically on matters for which backup and recovery are relevant safeguards. It applied the conventional security model, as summarised in Appendix 1 to that paper, and drew on the accumulated literature on risk assessment generally, in particular AS 4360-2004 (which is more specific than the successor ISO 27000 series) and NIST (2012, 2013). A straightforward process was proposed whereby risk assessment could be conducted by or for small organisations and prosumers. The process is presented in Table 1.

Table 1: The Process

From Table 2 of Clarke (2016)

Analyse

(1) Define the Objectives and Constraints

(2) Identify the relevant Stakeholders, Assets, Values and categories of Harm

(3) Analyse Threats and Vulnerabilities

(4) Identify existing Safeguards

(5) Identify and Prioritise the Residual Risks

Design

(1) Identify alternative Backup and Recovery Designs

(2) Evaluate the alternatives against the Objectives and Constraints

(3) Select a Design (or adapt / refine the alternatives to achieve an acceptable Design)

Do

(1) Plan the implementation

(2) Implement

(3) Review the implementation

It is not feasible to conduct risk assessment on a general case. The companion paper accordingly declared a particular profile to be used in the research. The criteria used in devising it were:

Consideration of the test-case used in the companion paper led to the conclusion that it was as applicable to the SaaS contexts as it was to earlier models of data-storage and processing. The test-case is defined in Table 2.

Table 2: The Test-Case

From Section 3 of Clarke (2016)

  • A person who:

    • is a moderately sophisticated user of computing devices
    • uses their computing devices for personal activities and/or in support of a micro-organisation
    • has limited professional expertise in information technology matters

  • The functions the person performs are primarily:

    • preparation and amendment of documents
    • maintenance of data-sets and databases in a variety of formats, including text, structured data, image, sound and video
    • exchange of communications with other people
    • access to web-sites
    • maintenance of their own web-sites
    • the use of Internet Banking and eCommerce, but only as a purchaser, not as a merchant

  • The person operates out of a home-office using equipment as follows:

    • a desktop device
    • a portable / laptop / clam-shell device
    • a handheld device

  • The person has limited and/or somewhat haphazard backup and recovery arrangements in place
  • The person is not likely to be a specific target for attackers (as distinct from being subject to random unguided attacks by malware), i.e. such categories are excluded as private detectives, IT security contractors, and social and political activists who face the risk of being directly targeted by opponents and by government agencies

The main body of the research reported in this paper applied risk assessment and risk management techniques, in order to design backup plans. The final step was an initial contribution towards evaluation of those plans. This was achieved by examining the feasibility of implementing the plans in relation to a small sample of SaaS services. The set needed to reflect the diversity of the market segments discussed earlier and embody sufficient differences in the nature of the services that they were likely to be somewhat independent studies rather than the same study performed several times. On the other hand, the set needed to be sufficiently small to enable performance of the analysis with available resources. The following were selected:

The remainder of this paper proceeds as follows. In section 3, the analysis segment of the process declared in Table 1 is applied to the test-case defined in Table 2. This results in a prioritised set of threat-vulnerability combinations that the backup mechanism needs to address. On the basis of that assessment of risks, section 4 applies the design segment of Table 1. Both of those sections draw on the prior study of backup contexts that pre-date SaaS. They summarise, rather than re-present, the relevant sections of Clarke (2016). The analysis identifies three alternative approaches to the hosting of SaaS backups, which are depicted as 'Naive Cloud', 'DIY Backup' and 'BaaS on SaaS'. In sections 5-7, Backup Plans are presented for each of those three approaches. For each approach, the offerings of the sample of providers are evaluated against the Plan, in order to provide insight into the extent to which currently-available services satisfy the needs of the test-case.


3. Risk Assessment

This section summarises the assessment of risks for which backup and recovery are relevant safeguards. It follows the sequence of steps specified under 'Analyse' in Table 1.

(1) Objectives and Constraints

Based on previous work (Clarke 2015a), the following statement was adopted of the purpose of the backup plans, and the constraints within which the design needs to work:

To avoid, prevent or minimise harm arising from environmental incidents, attacks and accidents, avoiding harm where practicable, and coping with harm when it arises, by balancing the reasonably predictable financial costs and other disbenefits of safeguards against the less predictable and contingent financial costs and other disbenefits arising from security incidents

On the other hand, many practitioners are likely to prefer a simpler formulation, such as:

To achieve reasonable levels of security for reasonable cost

(2) Stakeholders, Assets, Value, Harm

Stakeholder analysis was undertaken, followed by the identification of relevant assets. The resultant list is in Table 3.

Table 3: Relevant Data Assets

  • Business-Related Content
    such as databases, transaction files, reports, work-in-progress, sources of data and information, customer information, details of outstanding debts
  • Financial Data
    such as records of assets and transactions, insurance details
  • Payment Authenticators
    such as PINs, credit-card details
  • Identity Authenticators
    such as passwords, passport and driver's licence details
  • Funds
    such as bitcoin wallets
  • Personal Data, in some cases of a sensitive nature:

    • of an individual
      e.g. diaries, address-books, music collections, health-related data
    • of the individual's family
      e.g. family albums, family history, tax return data
    • of other people
      e.g. if the individual performs counselling, mentoring or coaching

  • Infrastructure Configuration Data
    such as settings, parameters and scripts that support computing operations

The analysis reviewed approaches to data attributes, properties and values. The longstanding set of Confidentiality, Integrity and Availability ('the CIA list', Saltzer & Schroeder 1975) is much-criticised, and many alternatives and adjuncts have been offered (e.g. Parker 1998, Cherdantseva & Hilton 2012). The set of values adopted in this analysis is defined in Table 4.

Table 4: Relevant Values Associated with Data

  • Accessibility
    The data is accessible to appropriate entities in appropriate circumstances
  • Inaccessibility
    The data is otherwise not accessible
  • Quality
    The data adequately satisfies all dimensions of data integrity,
    defined as accuracy, precision, and timeliness, the last of which in turn comprises temporal applicability, up-to-dateness and currency
  • Completeness
    Sufficient contextual information is available that the data is not liable to be misinterpreted. Of particular concern are provenance, and the data's syntax and semantics

Categories of harm to data were derived from the literature, most usefully ISO 27005 (2012, Annex B, pp. 39-40). They are listed in Table 5. The forms of consequential harm to stakeholders' values are listed in Table 6.

Table 5: Harm to Values Associated with Data

  • Accessibility

    The data is not accessible to appropriate entities in appropriate circumstances:

    • Data Loss:

      • Data in volatile memory is dependent on continuous functioning of the CPU and electrical power
      • Data in non-volatile memory is at risk of being over-written, in many cases at file-level and in some cases at record-level within databases
      • Data storage-media and data storage-devices containing storage-media are subject to theft, destruction and malfunction

    • Data Unavailability at a relevant time, in particular due to shortfalls in infrastructure performance

  • Inaccessibility

    The data is otherwise accessible. This takes several forms:

    • Data Access, whereby data in storage is accessed by an inappropriate person, or for an inappropriate purpose
    • Data Disclosure, whereby data in storage is communicated to an inappropriate person, or for an inappropriate purpose
    • Data Interception, whereby data in transit is accessed by an inappropriate person, or for an inappropriate purpose

  • Quality

    The data does not adequately satisfy all dimensions of data integrity:

    • Data Quality is low at the time of collection
    • Data Quality is low at the time of use, due to Data Modification, Loss of Data Integrity or Corruption

  • Completeness
    Sufficient contextual information, in particular provenance and data syntax and semantics, is not available

Table 6: Harm to Stakeholder Values
Arising from Harm to Values Associated with Data

  • Reduced Asset Value
    e.g. loss of a debtors ledger or prospects database with intrinsic value
  • Degraded Operational Capacity
    Tasks cannot be performed
  • Degraded Service Quality
    Tasks cannot be performed well
  • Reduced Revenue or Amenity
    (depending on whether the purpose is economic or social)
  • Cost, Time, Effort and Economic Loss Incurred during Recovery
    incl. the acquisition of backup data, the performance of recovery procedures, transport and communications, and the replacement of payment or identity authenticators
  • Damaged Reputation
    incl. the confidence of family, employees, customers, investors or regulators
  • Negative Privacy Impact on Individuals
    e.g. through unauthorised access to personal data, or disclosure or interception of personal data
  • Non-Compliance with Obligations or Commitments
    e.g. through loss of tax records

(3) Threats and Vulnerabilities

Drawing on relevant sources, most usefully ISO 27005 (pp. 42-49) and NIST (2012, pp. 65-76), threats to data were catalogued. The list is lengthy, and is presented in Appendix 1. To facilitate communication with small organisations and individuals, a mnemonic approach was adopted to Threats, using FATE (Fire, Attack, Training and Equipment) to represent respectively Environmental Events (F), Attacks (A), Human-Caused Accidents (T) and Accidents within Infrastructure (E).

Similarly, relevant data vulnerabilities were catalogued. See Appendix 2. In both Appendices, the differences between the SaaS context and the simpler contexts considered in the predecessor paper are highlighted by means of the 'SaaS' prefix.

(4) Existing Safeguards

The next step in the process was to identify pre-existing factors that intentionally or incidentally mitigate risks. Examples include conservative human behaviours, physical security safeguards, accounting controls, and insurance policies.

(5) Priority Threat-Vulnerability Combinations

Consideration of existing safeguards enabled the final step to be taken - that of identifying and prioritising the residual risks that are not satisfactorily addressed by existing safeguards. The industry norm for prioritisation involves assigning to each residual risk a severity rating and a probability rating, and then sorting the residual risks into descending order, showing extreme ratings before high ratings, etc. This is most comprehensively presented in NIST (2012, Appendices G, H and I). Severity and probability ratings were assigned on the basis of the author's assessment of the test-case described in Table 2. The results are summarised in Table 7.

Table 7: Priority Threat-Vulnerability Combinations

No. | Risk | Probability Rating (H, M, L) | Severity Rating (E, H, M, L)
1. | Unavailable Data or Unavailable Service to Access the Data – Long-Term | Medium | Extreme
2. | Inaccessible Service – Wide Area Network Failure / Congestion – Long-Term | Medium | Extreme
3. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term | Medium | Extreme
4. | Inaccessible Data (Data-Format unable to be processed) – Long-Term | Medium | Extreme
5. | Unavailable Data or Unavailable Service to Access the Data – Short-Term | High | High
6. | Inaccessible Service – Wide Area Network Failure / Congestion – Short-Term | High | High
7. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term | High | High
8. | Inaccessible Data (Data-Format unable to be processed) – Short-Term | Medium | High
9. | Mistaken Amendment, Deletion or Overwriting of a File | Medium | High
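
The prioritisation step described above can be illustrated with a minimal Python sketch. It is not drawn from the paper or from NIST (2012). The class name ResidualRisk, the function name prioritise and the numeric rank values are hypothetical, and the use of the probability rating as a tie-breaker within a severity level is an assumption, although it is consistent with the ordering shown in Table 7.

```python
from dataclasses import dataclass

# Ordinal scales taken from the column headings of Table 7.
# The numeric ranks are illustrative assumptions; only their order matters.
SEVERITY_RANK = {"Extreme": 3, "High": 2, "Medium": 1, "Low": 0}
PROBABILITY_RANK = {"High": 2, "Medium": 1, "Low": 0}


@dataclass(frozen=True)
class ResidualRisk:
    """One threat-vulnerability combination with its two ratings."""
    description: str
    probability: str  # "High", "Medium" or "Low"
    severity: str     # "Extreme", "High", "Medium" or "Low"


def prioritise(risks):
    """Return risks in descending order of severity.

    Probability is used as a tie-breaker within a severity level
    (an assumption). Python's sort is stable, so risks with identical
    ratings keep their original relative order.
    """
    return sorted(
        risks,
        key=lambda r: (SEVERITY_RANK[r.severity], PROBABILITY_RANK[r.probability]),
        reverse=True,
    )


# Three rows of Table 7, deliberately listed out of order.
risks = [
    ResidualRisk("Mistaken Amendment, Deletion or Overwriting of a File",
                 "Medium", "High"),
    ResidualRisk("Unavailable Data or Unavailable Service - Short-Term",
                 "High", "High"),
    ResidualRisk("Unavailable Data or Unavailable Service - Long-Term",
                 "Medium", "Extreme"),
]

ranked = prioritise(risks)
for position, risk in enumerate(ranked, start=1):
    print(position, risk.severity, risk.probability, risk.description)
```

Ordinal ranks are used rather than a severity-times-probability product, because the ratings in Table 7 are categories and not measurements.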

The following section builds on the risk assessment reported above in order to establish a suitable framework within which the Design segment of the Process outlined in Table 1 can be applied.


4. Risk Management

The consideration of alternative backup designs drew on the existing literature, most usefully Chervenak et al. (1998), plus Lennon (2001), Gallagher (2002), Preston (2007), de Guise (2008), Strom (2010), TOB (2012) and Cole (2013). The companion paper, Clarke (2016), provides a consolidated list of the characteristics of backup data and of backup processes, in Appendix 2 to that paper; and descriptions of each of the various categories of backup procedure, in Appendix 3 to that paper. In designing a backup regime for any particular context, key considerations include the frequency with which full backups are performed, whether incremental backups are performed and if so how frequently, whether copies are kept online or offline, and whether second-level archives are kept and if so whether they are later over-written or archived.
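
The design considerations listed above can be captured as a small record, which may help when comparing candidate regimes. The following Python sketch is an illustration only. The class name BackupRegime, its field names and the method worst_case_loss_window are hypothetical and do not appear in Clarke (2016). The loss-window calculation assumes that every scheduled backup run succeeds and completes instantaneously.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class BackupRegime:
    """Key design choices for a backup regime (illustrative field names)."""
    full_interval: timedelta                   # how often full backups run
    incremental_interval: Optional[timedelta]  # None means no incrementals
    copies_offline: bool                       # True = offline, False = online
    second_level_archive: bool                 # whether 2nd-level archives are kept
    archive_overwritten: Optional[bool] = None # only meaningful if archives are kept

    def __post_init__(self):
        # Reject combinations that contradict themselves.
        if (self.incremental_interval is not None
                and self.incremental_interval >= self.full_interval):
            raise ValueError("incremental backups must be more frequent than full backups")
        if not self.second_level_archive and self.archive_overwritten is not None:
            raise ValueError("archive_overwritten applies only when archives are kept")

    def worst_case_loss_window(self) -> timedelta:
        """Longest span of work at risk if the primary copy is lost
        immediately before the next scheduled backup run (assumes every
        run succeeds and completes instantaneously)."""
        if self.incremental_interval is not None:
            return self.incremental_interval
        return self.full_interval


# Example: weekly full backups, daily incrementals, offline copies,
# second-level archives that are retained rather than over-written.
regime = BackupRegime(
    full_interval=timedelta(days=7),
    incremental_interval=timedelta(days=1),
    copies_offline=True,
    second_level_archive=True,
    archive_overwritten=False,
)
print(regime.worst_case_loss_window())
```

A record of this kind makes each choice explicit, so that a regime with no incremental backups or no second-level archive is visible as a deliberate decision rather than an omission.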

In the SaaS context, because of the heavy reliance on another party, a number of characteristics of supplier and service quality need to be taken into account. The first concern is the lack of clarity about what obligations a SaaS provider owes to its customers, how long they last, and whether and how they could be enforced. A contract may exist - provided that all of the conditions for the formation of a contract are satisfied, notably the provision of consideration by both parties. If so, the contract is governed by the Terms that the provider displays, together with any overriding conditions imposed by such jurisdiction(s) as may be relevant to the contractual relationship. Generally, the Terms published by SaaS providers offer very little in the way of undertakings to the user, and are malleable, and are in practice (and possibly even at law) unenforceable. Warranties in relation to the accessibility of the data are uncommon, even more so warranties in relation to ongoing accessibility, and especially long-term accessibility. In short, the backup features of SaaS services are whatever they are, and as a matter of practice, and even of law, there is generally very little that the user can do about them.

Secondly, it is unclear what degree of reliance a small organisation or individual can place on the continued existence of each particular SaaS service. Most SaaS providers offer services in order to make a profit. So lines of business that are not sufficiently profitable are routinely withdrawn. Some providers also close users' accounts if they consider that they have not been used sufficiently recently. In addition, it is in the interests of a commercial service-provider to make it difficult for users to extract their data, because that raises the switching costs and achieves 'loyal' (in the narrow sense of 'locked-in') clients. In practice, reliability of access has mostly been reasonably good, but many instances of service-interruption and data-loss were documented in Clarke (2012a), and little appears to have happened since then to reduce the concerns identified in that study. There are questions not only about the longevity of individual services, but also about the survival of each SaaS provider. For example, of the 25 CMS SaaS catalogued in Kaskade (2011), 7 (28%) appeared, a mere 4 years later, to no longer exist.

Withdrawals of services from the market, and failures of SaaS providers, have generally involved little or no notice to users. This includes multiple withdrawn services that had been offered by very large corporations such as Google, Apple, Facebook and Microsoft. In some cases, the service was deemed to be no longer of interest to the company. In other cases, however, it was a competitive service that the larger corporation had acquired in order to close it down. An example of a withdrawn service is the drop.io file-sharing service, which was closed at 6 weeks' notice following takeover by Facebook. Similar fates befell Pownce and FitFinder (micro-blogging), Yahoo! Geocities (web-hosting), Yahoo! Photos, Yahoo! Briefcase (file-hosting), Google Buzz, Google Video, Google Knol, Google Health, Google Notebook and Google Orkut.

Given the considerable scope for SaaS services to fail, data escrow becomes an attractive idea. Data escrow involves a copy of the data being on deposit with a trusted third party, which is contractually bound to deliver it to the right people in the right circumstances. Circumstances relevant to the present context are failure of the provider, and withdrawal of the service. There are a number of patents in the area, but the slim formal literature is primarily concerned with personal privacy protections rather than the data values addressed in this paper. An example of a functioning data escrow service is that provided by the NCC Group in relation to DNS registry data; but it does not appear that services implementing data escrow principles are generally available, and particularly not in relation to SaaS services.

A third concern about SaaS services is that many of them store their customers' data in proprietary formats. The question therefore arises as to whether backed-up data is sufficiently portable to satisfy the accessibility value. Some data is of ephemeral value, and the functions that a prosumer or micro-organisation needs to have assured access to may be no more than:

In some areas of business, there is a presumption that immediacy and 'living for the now' are dominant and that long-term planning for data is out-of-step with the zeitgeist. On the other hand, a considerable amount of data retains value over an extended period of time. In some circumstances, that period may be definable (e.g. 5-7 years for tax law compliance); but in others the period may be indeterminate (e.g. family trees and photo-albums). For data with longer-term value, users are reliant on ongoing access to a broader range of functions:

Because of the uncertainties about supplier and service reliability and about accessibility during the lifetime of the data's value, one of the most critical design choices is whether the first-level backup is stored locally by the user, or remotely by some other party. Within the SaaS context, there are several approaches that can be taken. In the first case, the user relies entirely on the SaaS provider. In the second, the user draws backups down to their own site. In the third case, the user relies on the services of a further party, referred to here as a 'Backup as a Service' or BaaS provider. In these three cases, the user is respectively entirely dependent upon and trusting of the service provider; or is required to understand, design, implement and perform backup; or is dependent on two service-providers rather than just one. See Table 8.

Table 8: SaaS Backup Alternatives

| Short Description | Indicative Timeframe | Location of the 1st-Level Backup | Location of the 2nd-Level Backup | User Experience |
|---|---|---|---|---|
| Naive Cloud | 2010- | SaaS Provider | SaaS Provider | Implicit Trust |
| DIY Backup | 2010- | Local | SaaS Provider | Demanding |
| BaaS on SaaS | 2015- | SaaS Provider | BaaS Provider | Undemanding |

The following sections consider in turn the three contexts identified in Table 8. In each case, a requirements statement is first derived from the literature and the preceding analysis, and then the offerings of a sample of SaaS service-providers are evaluated against the requirements. Section 5 considers the extent to which SaaS providers appear to justify the implicit trust that many users place in them. Section 6 specifies a Backup Plan whereby users can ensure that they have a copy of their data in their own storage-devices. Section 7 considers the possibility of using an additional cloud service - a 'Backup as a Service' provider - as a defence of last resort against malfunction or malperformance by a SaaS provider.


5. Backup Intrinsic to SaaS Services

This section considers the most common, and generally default approach, particularly during the early years of SaaS services. A list of requirements was developed which addresses the high-priority threat-vulnerability combinations that confront a small organisation or individual that has the profile declared in Table 2 and that depends on a SaaS provider to perform backup and recovery on the user's behalf. The Requirements Statement is in Table 9. The requirements are categorised into infrastructure features, file-precautions, backup runs and business processes. A list of essential elements is presented first, followed by a further set of recommended actions, and then by a list of unaddressed risks.

Table 9: SaaS Backup Requirements Statement

ESSENTIAL

    Infrastructure Features

  1. a. The SaaS provider is to store the Primary copies of all files
    b. The SaaS provider is to provide facilities for creating, amending and accessing the data from the user's working-devices
    c. The SaaS provider is to provide 99.9% service-availability (10 minutes' downtime per week)
    This ensures that the user has access to their data and to the ability to create, amend, access and delete their data

    File-Precautions

  2. The SaaS provider is to perform continual saves when data is being changed
    This guards against the loss of recently-completed amendments due to infrastructure failure
  3. The SaaS provider is to create a new file-version when making significant amendments
    This ensures recoverability from mistaken amendments and deletions
  4. The SaaS provider is to run malware detection and eradication software
    a. on all incoming files
    b. on all stored files
    These two safeguards greatly reduce the frequency with which malware will affect the individual's data. This is particularly important on devices using a Microsoft OS

    Backup Runs

  5. The SaaS provider is to maintain a full Backup to a separate storage-medium no less frequently than daily
    This provides an accessible and reasonably up-to-date First-Level Backup
  6. The SaaS provider is to
    a. maintain a full Second-Level Backup on a weekly, fortnightly or monthly basis
    b. store the Second-Level Backup in such a manner that it is not subject to the location-based risks that apply to the First-Level Backup
    These together address the risk of the Primary and the First-Level Backup both being inaccessible for any reason
    c. store the Second-Level Backup offline
    This addresses the risk of simultaneous corruption of all versions of the data, e.g. by ransomware
  7. The SaaS provider is to ensure that, periodically, a Full Backup is provided to a Data Escrow service, under conditions that ensure its availability to the user
    This addresses the risk that the Primary copy and the Backups, which are all in the possession of the service-provider, may become inaccessible by the individual, for such reasons as undetected loss of data, bankruptcy, withdrawal of service, a commercial dispute, or seizure by another party

    Business Processes

  8. The SaaS provider is to document and periodically rehearse backup procedures to implement all of the Backup Runs
  9. The SaaS provider is to document and periodically rehearse recovery procedures for individual files, and for the complete set of files, from First-Level Backup and from Second-Level Backup

RECOMMENDED

    Infrastructure Features – Additional Measures

  1. The SaaS provider is to facilitate mutual authentication and channel encryption between all of the individual's devices and the service-provider
    This greatly reduces the risks of data interception and data corruption, particularly when connecting from insecure external locations

    Backup Runs – Additional Measures

  2. a. The SaaS provider is to, half-yearly, retire a Full Backup to Archive
    b. The SaaS provider is to store successive Archive copies locally and remotely
    This ensures that an occasional set of old file-copies exists and hence earlier versions of files can be recovered
  3. The SaaS provider is to, annually, spool 3-year-old Archives to new media
    This addresses the risk of storage-media decay
  4. The SaaS provider is to, 5-yearly, spool all Archives to a new media-type
    This addresses the risk of having storage-media that no storage-device can read

UNADDRESSED RISKS

  1. It is not feasible to require the SaaS provider to ensure that the format is readable by software readily available to the individual
    This means that the user is at risk that the data may be in principle accessible, but in practice unusable
  2. It is not feasible to require the SaaS provider to ensure that the user's data is inaccessible to the SaaS provider
    This is feasible (although not simple) in relation to backup copies, which can be encrypted to a key that only the user has access to. But the Primary copy is processed in clear and hence, even if it is stored encrypted, the SaaS provider must be able to decrypt it
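
The availability figure in the Essential requirements (99.9%, equated with about 10 minutes' downtime per week) can be checked with simple arithmetic. The following is a minimal sketch only; the function name and the choice of a 365-day year are assumptions made for illustration, not part of the Requirements Statement.

```python
MINUTES_PER_WEEK = 7 * 24 * 60      # 10,080 minutes
MINUTES_PER_YEAR = 365 * 24 * 60    # 525,600 minutes (assumes a 365-day year)


def downtime_minutes(availability_pct: float, period_minutes: float) -> float:
    """Return the downtime permitted over a period at a given availability level."""
    return period_minutes * (1 - availability_pct / 100)


weekly = downtime_minutes(99.9, MINUTES_PER_WEEK)
yearly_hours = downtime_minutes(99.9, MINUTES_PER_YEAR) / 60

print(f"99.9% availability permits about {weekly:.1f} minutes of downtime per week")
print(f"99.9% availability permits about {yearly_hours:.1f} hours of downtime per year")
```

On these assumptions the weekly allowance is a little over 10 minutes, which is consistent with the figure quoted in the requirement.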

Strategies were sought for dealing with the priority threat-vulnerability combinations. Unfortunately, as indicated in Table 10, where full reliance is placed on the SaaS provider, almost all of them are incapable of being addressed.

Table 10: Risk Management Strategies - Naive Cloud Usage

 
| No. | Risk | Strategy |
|---|---|---|
| 1. | Unavailable Data or Unavailable Service to Access the Data – Long-Term | Nil, for the duration of the (long) Unavailability |
| 2. | Inaccessible Service – Wide Area Network Failure / Congestion – Long-Term | Nil, for the duration of the (long) Inaccessibility |
| 3. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term | Nil, for the duration of the (long) Inaccessibility |
| 4. | Inaccessible Data (Data-Format unable to be processed) – Long-Term | Nil |
| 5. | Unavailable Data or Unavailable Service to Access the Data – Short-Term | Nil, for the duration of the (short) Unavailability |
| 6. | Inaccessible Service – Wide Area Network Failure / Congestion – Short-Term | Nil, for the duration of the (short) Inaccessibility |
| 7. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term | Nil, for the duration of the (short) Inaccessibility |
| 8. | Inaccessible Data (Data-Format unable to be processed) – Short-Term | Nil |
| 9. | Mistaken Amendment, Deletion or Overwriting of a File | Continuous or Continual Versioning |

The sample of SaaS service-providers was then evaluated against the Requirements Statement in Table 9. A review was conducted of each supplier's product description, of product commentaries, and of other sources, particularly organisations that offer backup products or services relevant to it. The reviews were complemented by searches for relevant published papers in both the academic and popular literatures, although few were found.

A number of features were common across most or all of the five providers:

The remainder of this section presents the outcomes of the evaluation that were specific to the individual services.

Instagram

Instagram offers a simple storage service for static files, primarily in order to make them accessible by other people, but of necessity also as a repository for them. It projects the feeling of social immediacy, and, in many circumstances, the images that users store there are of ephemeral interest. Halfway through the company's c. 10-page Terms of Use page is the statement "Instagram encourages you to maintain your own backup of your Content. In other words, Instagram is not a backup service and you agree that you will not rely on the Service for the purposes of Content backup or storage. Instagram will not be liable to you for any modification, suspension, or discontinuation of the Services, or the loss of any Content" (emphasis added). It is unlikely that many users ever read that sentence, let alone grasp its significance. To the extent that images have ongoing value to users, it is abundantly clear that reliance on Instagram even as a medium-term repository, let alone as a long-term archive of visual memories (cf. a 'photo album'), is extremely unwise.

ancestry.com

This service enables the development and maintenance of one or more segments of structured family tree, supported by a substantial set of resources, a great deal of it drawn or captured from public sources, plus a vast amount of data contributed by its users. It appears that there is no notion of versioning, with only the current content available for display. The reliability of service, of service-availability, and of backup and recovery, is unclear. Two-thirds of the way through the company's c. 10-page 'Terms and Conditions' web-page is this brief passage: "we shall not be liable to you for ... (v) loss or corruption of, or damage to, data ...". Given the great value that people generally place on the data that they accumulate about their families, it is remarkable that there appears to be no mention of the provider's backup policies or practices anywhere on the site.

Xero

Xero provides an online accounting service. Given that accounting for even the tiniest micro-business involves periodic cycles, the need to capture data and generate reports for various periods at various times, and 5-7 year archival requirements, it would be reasonable to expect that the design and the advice to users would emphasise backup and recovery. During late 2016, a statement appeared on the Security page saying "Your backups are kept up to date. Online backups are updated throughout the day, every day, and stored in multiple secure locations". However, statements half-way through the company's c. 10-page Terms of Use page say that "Xero adheres to its best practice policies and procedures to prevent data loss, including a daily system data back-up regime, but does not make any guarantees that there will be no loss of Data", and "Xero expressly excludes liability for any loss of Data no matter how caused". The sum total of the assistance it provides to users concerning the steps that they need to take is "You must maintain copies of all Data inputted into the Service" (emphases added).

Salesforce

Salesforce is quite blunt about its shortcomings in this area. Its web-page on Backup and restore your Salesforce data declares that "Salesforce Support doesn't offer a comprehensive data restoration service", and warns the user not to rely on it for recovery purposes: "Although Salesforce does maintain back up data and can recover it, it's important to regularly back up your data locally so that you have the ability [to] restore it to avoid relying on Salesforce backups to recover your data. The recovery process is time consuming and resource intensive and typically involves an additional fee". It also explains that it is wildly expensive: "Data Recovery is a last resort process ... The price is a flat rate of $US 10,000" (emphases added). This would be very expensive for many micro-organisations and individuals, and prohibitive for some.

Google Docs

The core functions of Google Docs, as with any office suite, are the creation and modification of word processing documents and spreadsheets; but several further capabilities are included (Jeffries 2012). The storage of user files is managed by Google Drive. Google Drive has several modes of operation, which give rise to considerable complexity in relation to the ways in which backup and recovery needs to be performed. When documents are being edited, progressive saves are made, and a concept of file-versions exists; but the frequency with which a new version of the file is created is not entirely clear. Despite searches being conducted on Google sites, and links being followed, no source was located that explained what backup and recovery processes exist within the Google Docs and related services.

Security risks are highest in the case of word processing and spreadsheet documents, because they may contain active code. Despite this vulnerability, malware management is implemented by the service-provider in such a manner that some users have reported complete loss of data as a result of the penetration of their working devices by ransomware (Austin 2013).

______________

The support for backup provided by these five mainstream SaaS services is such that individuals and organisations that are reliant on being able to access data should not place implicit trust in any of them. Put another way, a rough score arising from the analysis is 0 out of 5. Every user needs to ensure that they retain copies of all source-materials that they used to create the data submitted to the SaaS service-provider, so that they can start again with an alternative provider if and when the first one fails them. This may require special measures if some of the source-materials were ephemeral, e.g. voice, telephone-calls or temporary on-screen displays. Moreover, given the current, clearly immature stage of the development of SaaS services, the cautionary statements made about the five test-providers may well be applicable to SaaS providers generally.


6. DIY Backup of SaaS Data

The previous section considered the appropriateness of an individual or small organisation simply trusting a SaaS provider to protect its users' interests. This section investigates how a user can take Do-It-Yourself (DIY) responsibility for their own backup and recovery, despite entrusting both the Primary copy and all processing functions to a service-provider.

An early reference described a simple Linux-based server to be installed on the premises of the consumer or micro-organisation to automatically perform incremental backloads of SaaS contents such as SharePoint (Javaraiah 2011). However, the paper was superficial and addressed very few of the user needs and practical challenges identified earlier in this paper.

In comparison with the local storage approach addressed in the predecessor paper (Clarke 2016), the use of remote storage reverses the logic and requires a backup to be held locally. The service-provider needs to fulfil a number of requirements, but safeguards continue to be needed on the individual's own devices as well. The Plan, in Appendix 3, is closely aligned with that for SaaS backup in Table 9, but with several important differences.

The key issues are whether it is feasible for a user to extract their data from the SaaS provider's site, and whether the data that is extracted is in a usable format. Strategies were sought for dealing with the priority threat-vulnerability combinations identified in Table 7. As indicated in Table 11, most can, at least in principle, be addressed through mirroring or periodic backload to the user's own site.

Table 11: Risk Management Strategies - DIY Backup

 
| No. | Risk | Strategy |
|---|---|---|
| 1. | Unavailable Data or Unavailable Service to Access the Data – Long-Term | Mirroring or Periodic Backload to a local storage-device |
| 2. | Inaccessible Service – Wide Area Network Failure / Congestion – Long-Term | Mirroring or Periodic Backload to a local storage-device |
| 3. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term | Mirroring or Periodic Backload to a local storage-device but only if directly-connected |
| 4. | Inaccessible Data (Data-Format unable to be processed) – Long-Term | Mirroring or Periodic Backload to a local storage-device but only if reformatting is feasible |
| 5. | Unavailable Data or Unavailable Service to Access the Data – Short-Term | Mirroring or Periodic Backload to a local storage-device |
| 6. | Inaccessible Service – Wide Area Network Failure / Congestion – Short-Term | Mirroring or Periodic Backload to a local storage-device |
| 7. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term | Mirroring or Periodic Backload to a local storage-device but only if directly-connected |
| 8. | Inaccessible Data (Data-Format unable to be processed) – Short-Term | Mirroring or Periodic Backload to a local storage-device but only if reformatting is feasible |
| 9. | Mistaken Amendment, Deletion or Overwriting of a File | Mirroring or Periodic Backload to a local storage-device, combined with Continuous or Continual File-Versioning |
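
The combination of periodic backload with file-versioning, which Table 11 proposes for risk 9, can be illustrated by a small sketch. It assumes that the user has already extracted a file from the SaaS provider by whatever means that provider supports; the function names, the directory layout and the version-naming scheme are all hypothetical choices made for illustration, not features of any of the services discussed. Each run stores a new version only if the content differs from the most recent one, so a mistaken amendment or deletion at the provider's site does not overwrite earlier local copies.

```python
import hashlib
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backload(export: Path, store: Path) -> Optional[Path]:
    """Store `export` as a new version under `store`, unless it is unchanged.

    Versions are kept in a directory named after the exported file.
    Returns the path of the new version, or None if the content matches
    the most recent stored version.
    """
    versions_dir = store / export.name
    versions_dir.mkdir(parents=True, exist_ok=True)
    existing = sorted(versions_dir.iterdir())
    if existing and sha256_of(existing[-1]) == sha256_of(export):
        return None
    # A zero-padded sequence number keeps versions in sortable order;
    # the UTC timestamp is included only to help the user identify them.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    name = f"v{len(existing) + 1:06d}_{stamp}{export.suffix}"
    target = versions_dir / name
    shutil.copy2(export, target)
    return target


if __name__ == "__main__":
    # Demonstration using a temporary directory and invented content.
    with tempfile.TemporaryDirectory() as tmp:
        export = Path(tmp) / "accounts-export.csv"
        store = Path(tmp) / "local-backup"
        export.write_text("date,amount\n2016-07-01,100\n")
        print(backload(export, store))   # a new version is stored
        print(backload(export, store))   # unchanged, so nothing is stored
```

A sketch of this kind addresses only the local end of the process. It does nothing about the prior questions raised in this section, namely whether the data can be extracted at all and whether the extracted format is usable.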

An assessment was undertaken of the extent to which the selected sample of SaaS providers enable the requirements specified in Appendix 3 to be fulfilled.

Instagram

It appears that Instagram does not support a download to the user's own working-device. On the other hand, Instagram does provide an API that facilitates extraction of files by third parties, and three services were found that appear to use that API to enable files to be downloaded to the user's device: Instaport (which downloads all files), Grabinsta (for individual files only) and digi.me. Hence it would appear that an alert user can find a way to back up their images from Instagram; but nothing on the company's site suggests that this might be important, nor how it might be done. This is surprising, given that some users may want to upload photos from a handheld device, and later download them onto a desktop device.

Moreover, the Terms of Use of the API preclude its use "for any application that replicates or attempts to replace the essential user experience of Instagram.com or the Instagram apps". This has the appearance of an attempt to prevent users from switching their images to an alternative provider of image-hosting services; yet that is precisely what a user needs to do if Instagram under-performs, materially changes the service, withdraws it, or closes down.

ancestry.com

The family-tree data stored by ancestry.com is capable of being extracted by the user into a proprietary format, downloaded, and loaded into a copy of the inexpensive software product, Family Tree Maker. In early 2016, ancestry sold that product to the developer of the Mac OSX version, Software MacKiev. It appears that the product continues to be an effective means of performing DIY backup from the ancestry.com site. It is also possible to extract into another format, GEDCOM, which is used by a number of alternative software products (FHD 2017).
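
One practical advantage of GEDCOM for backup purposes is that it is a line-oriented text format, in which each line carries a level number, an optional cross-reference identifier, a tag and an optional value. A user can therefore run a basic readability check on an extracted file without relying on any particular product. The following is a minimal sketch under that assumption. The sample data is invented, the function names are hypothetical, and the check covers only line structure, not conformance with the full GEDCOM specification.

```python
import re
from collections import Counter

# level, optional @XREF@, tag, optional value
GEDCOM_LINE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\w+)(?: (.*))?$")


def parse_gedcom(text):
    """Parse GEDCOM text into (level, xref, tag, value) tuples.

    Raises ValueError on the first line that does not fit the basic structure.
    """
    records = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        match = GEDCOM_LINE.match(line)
        if match is None:
            raise ValueError(f"line {number} is not valid GEDCOM: {raw!r}")
        level, xref, tag, value = match.groups()
        records.append((int(level), xref, tag, value or ""))
    return records


def count_top_level(records):
    """Count level-0 records by tag (e.g. INDI for individuals, FAM for families)."""
    return Counter(tag for level, _xref, tag, _value in records if level == 0)


# Invented sample data, for illustration only.
SAMPLE = """0 HEAD
1 GEDC
2 VERS 5.5.1
0 @I1@ INDI
1 NAME Jane /Example/
1 BIRT
2 DATE 1 JAN 1900
0 @F1@ FAM
1 WIFE @I1@
0 TRLR
"""

if __name__ == "__main__":
    print(count_top_level(parse_gedcom(SAMPLE)))
```

Comparing the counts of individuals and families against what the service displays gives a rough indication of whether the extraction was complete.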

Xero

Xero provides functions for exporting files to the user's own device. However, despite the five-decade history of electronic data interchange (EDI), the company does not appear to have applied an industry-standard format for expressing the chart of accounts, transaction data and report-format parameters. Apart from PDF and TXT/XML, the primary format that data can be extracted into appears to be comma-separated values (CSV) files - which suffer from the serious problem that no reliable field-separation meta-character exists. Data can also be extracted into formats intended to be compatible with various other software products, but the site acknowledges a variety of incompatibilities. Further, there remain some doubts as to whether additional files loaded onto the SaaS service, such as scans of inbound invoices, are extractable. The effectiveness of backup to the user's site is accordingly a concern, and considerable accounting and IT expertise would be essential if an attempt were to be made to reconstruct the accounts from backups.
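
The field-separation problem with CSV can be shown with a short sketch. The records below are invented and do not represent Xero's actual export layout. The example uses Python's standard `csv` module to write fields that themselves contain commas and quotation marks, and then compares a naive split on commas with a reader that applies the quoting convention. The data is recoverable only if the software that wrote the file and the software that reads it apply the same convention, which is not something the user can take for granted.

```python
import csv
import io

# Invented sample records; several fields contain commas or quotation marks.
rows = [
    ["Date", "Payee", "Description", "Amount"],
    ["2016-07-01", "Smith, Jones & Co", 'Invoice "A-17", paid late', "1,250.00"],
]

# Write the records using the csv module's default quoting rules.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerows(rows)
exported = buffer.getvalue()

# A naive consumer splits on every comma and gets the wrong number of fields.
naive = [line.split(",") for line in exported.splitlines()]

# A consumer that applies the same quoting convention recovers the original.
parsed = list(csv.reader(io.StringIO(exported)))

print("fields in data row, naive split:", len(naive[1]))
print("fields in data row, csv reader :", len(parsed[1]))
print("round trip preserved the data  :", parsed == rows)
```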

Salesforce

Salesforce enables data to be extracted, but according to the company's site:

In any case, there is no evidence that the data is of any value other than as a basis for reloading into Salesforce. Further, the company acknowledges that recreating a Salesforce database from a backup may be very difficult, and it cannot provide assistance: "Overall planning, development and implementation of a strategy to manage record restoration falls outside the scope of Support's offerings" (emphasis added).

Google Docs

Google Docs files are stored in a proprietary format, and as a result mirrored files are only readable using Google Drive Desktop App and/or the Google Chrome browser. In addition, only the current version is available, and versioning support is highly restricted. In mid-2016, considerable difficulty was encountered in assessing the feasibility of conducting DIY backup from Google Docs to users' own storage-devices. Reasons for this include the lack of straightforward documentation on Google's many sites, the existence of multiple documents with mutually inconsistent content scattered across different structures, and the frequent changes made to Google services, often without notice to their users. Indicative of the problems is that an auto-synch service advertised in Bradley (2011), called Memeo, was withdrawn within a year - presumably at least in part because ongoing changes to Google's storage arrangements make the maintenance of such third-party services untenable. A history of the techniques involved at various times is to be found in the successive, archived versions of Malunui et al. (2016). Some of the current pitfalls are described in Witzel (2016).

On the basis of the available sources, it appears that, until the end of 2014, there were considerable difficulties involved in extracting a usable backup of documents, and any particular user may or may not have been able to achieve it. Since the beginning of 2015, a backup-to-local-storage mechanism has been available, usefully described in Rowe (2015). Finding authoritative documentation on Google sites in the period September 2015 to March 2016 was challenging, and continually led to different source-pages. A snapshot of one page was taken in March 2016 (Google 2016).

By early 2017, however, it appeared that DIY backup was operational. Malunui et al. (2016) identified three relevant approaches: Download, into .docx and similar formats; Sync to a local storage-volume; and Google Takeout, which extracts a compressed version of the contents of the user's Google Drive. In addition, an established third-party product, Insync, was available and well-priced for small users.

_________________

Of the five services in the sample, only two, ancestry.com and Google Docs, enable an individual or small organisation to perform an effective backup to their own site in a reasonably straightforward manner. In the other three cases, there are conceptual and technical challenges that are sufficiently substantial as to be insurmountable for many users. So a rough score of the sample is 2 out of 5.

It would be a serious concern if the results from the sample apply to the population, and fewer than half of all SaaS providers enable straightforward implementation of DIY backload to users' own devices. The problem may affect large and medium-sized organisations as well, but is particularly significant for individuals and small organisations that seek to, for example, maintain an image-library, ensure that they have access to an archive of their electronic tax records, not lose their data relating to customers and prospects, and not lose their correspondence files with government agencies, suppliers, employers, financial institutions, insurers and pension funds.


7. Backup as a Service (BaaS)

This section considers the third approach, whereby an additional, specialist service-provider undertakes to perform backup of a user's data from their SaaS provider - Backup-as-a-Service (BaaS). In the terms used in this paper, a BaaS provider maintains a Second-Level Backup of the data whose Primary copy and First-Level Backup are held by the main SaaS service-provider.

The BaaS provider's focus is on backup and recovery services. However, they of course share common features with all other SaaS providers:

Applying basic security principles, it is important to ensure that the SaaS and BaaS providers do not have common vulnerabilities, because that would reduce what appears to be the relative security of two points-of-failure to the relative insecurity of only one. Examples of situations to be avoided include physical co-location of the SaaS and BaaS, network co-location, common ownership, and storage of the two Backups within the same jurisdiction. At this stage, the BaaS market-segment appears to be under-developed. A background/wholesale option exists in the form of Amazon Web Services (AWS) Glacier, but micro-organisations and prosumers would need an intermediary provider to utilise that service.

A Backup Plan was specified whereby users can have confidence in a combination of SaaS and BaaS providers providing reliable backup and recovery of their data. See Appendix 4. This has a common structure with the previous two Plans and some key differences.

Strategies were sought for dealing with the priority threat-vulnerability combinations identified in Table 7. The results are presented in Table 12.

Table 12: Risk Management Strategies - BaaS

 
| No. | Risk | Strategy |
|---|---|---|
| 1. | Unavailable Data or Unavailable Service to Access the Data – Long-Term | Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected |
| 2. | Inaccessible Service – Wide Area Network Failure / Congestion – Long-Term | Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected |
| 3. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Long-Term | Nil, for the duration of the (long) inaccessibility |
| 4. | Inaccessible Data (Data-Format unable to be processed) – Long-Term | Mirroring or Periodic Backload to another cloud-provider but only if reformatting is feasible |
| 5. | Unavailable Data or Unavailable Service to Access the Data – Short-Term | Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected |
| 6. | Inaccessible Service – Wide Area Network Failure / Congestion – Short-Term | Mirroring or Periodic Crossload to another cloud-provider but only if BaaS Provider not affected |
| 7. | Inaccessible Service – Local Area Network or Internet Connection Failure / Congestion – Short-Term | Nil, for the duration of the (short) inaccessibility |
| 8. | Inaccessible Data (Data-Format unable to be processed) – Short-Term | Mirroring or Periodic Backload to another cloud-provider but only if reformatting is feasible |
| 9. | Mistaken Amendment, Deletion or Overwriting of a File | Mirroring or Periodic Crossload to another cloud-provider, combined with Continuous or Continual File-Versioning |

The five sample SaaS services were evaluated against the Backup Plan in Appendix 4, to establish the extent to which the use of a BaaS is feasible.

Instagram

Instagram provides an API that facilitates extraction of files by third parties, such as Frostbox, iDrive and StreamNation. The nominal prohibition on use of the API "for any application that replicates or attempts to replace the essential user experience of Instagram.com or the Instagram apps" is presumably ignored by all parties. If a user searches for a BaaS provider, they may find one; but Instagram appears to do nothing to assist users' awareness of or knowledge about the available options.

ancestry.com

Although ancestry.com enables extraction of family-tree data into a proprietary format and possibly a second format, there do not appear to be any BaaS providers of backup services. It appears that ancestry.com may actively block this, in that it provides no API, and its Terms expressly prohibit "distribution of your password to others for access to Ancestry".

Xero

Xero makes an API available to BaaS providers. Services that utilise the API include LedgerBackup (which emails a CSV file for each of the tables in Xero), Boxkite (which mirrors Xero files to Dropbox) and Safeguard-My-Xero. However, it is unclear whether any of these BaaS providers offer any service other than making CSV files available to the user in the event that the Xero service ceases to perform its function. It is also unclear whether all of the files that are extracted are capable of even being re-imported into Xero, let alone into another package or service, in order to re-constitute the set of accounts. It is accordingly far from clear that Xero is fit for use by any business that wants to be assured of its capacity to manage its accounts and comply with its obligations to tax agencies. There appears to be very little literature on this serious deficiency, but see Dorricott (2013).

Salesforce

Salesforce publishes APIs to BaaS providers, and a range of third parties claim to provide backup services. These include Backupify, Cloudally, Cloudfinder, Ownbackup, SesameSoftware, Skyvia and Spanning. However, the only way in which these backups can be used is to reload the data back into Salesforce. Moreover, it appears that all such products are oriented towards 'enterprises', i.e. large and medium-sized organisations, none provide straightforward explanations of the kinds that would be comprehensible to micro-organisations and prosumers, and the only fixed-fee that was readily found was a minimum USD 6,000 p.a. There appears to be no BaaS solution suitable for small users.

Google Docs

Google Docs is the subject of multiple third-party backup offerings, but they are oriented towards enterprises. In mid-2016, the BaaS service offerings for consumers and small business were far from clear, although a new offering was heralded by Witzel (2016) as providing the kind of BaaS service needed. In early 2017, Malunui et al. (2016) mentioned Spanning, Syscloud and Backupify as third-party (BaaS) providers. However, Spanning's and Syscloud's offerings appeared to address only 'enterprise' needs in relation to Google Apps, not small users' needs in relation to Docs, and Syscloud's site stated that Syscloud had replaced Backupify. BackupGoo similarly focussed on Apps not Docs. In short, it is unclear what, if any, BaaS options exist for small users.

______________

The BaaS approach to backups appears to be in two cases feasible, but challenging and ill-fitted to the needs of small organisations and individuals (Instagram and Google Docs). In one case, it is fraught with difficulty because of the absence of services oriented towards small organisations and individuals (Salesforce). In one case it appears to be impractical (Xero), and in the other infeasible (ancestry.com). A rough score would seem to be about 1 out of 5.


8. Conclusions

This paper has reported on research into the backup needs of micro- and some other small organisations, and of individuals. It has focussed on a test-case, in order to not merely provide general guidance, but also to deliver a specification that fulfils the declared objective. A companion paper presented backup plans for contexts in which the user is self-sufficient, or uses a backup service, or uses a file-hosting service for the primary copy of their files. This paper has examined the contemporary context in which the user is dependent on a service-provider for hosting not only the data but also the application software. It has defined Backup Plans for three alternative approaches, and reported on an analysis of their feasibility in respect of a sample of SaaS service-providers.

The design science research method as described by Peffers et al. (2007) was applied, commencing with 'problem-centred initiation', through the problem definition, objectives formulation and articulation phases, and into the design phase, resulting in three sets of specifications. Tables 3-7 and Appendices 1 and 2 provide analyses of Assets, forms of Harm, Data Threats, Vulnerabilities and Priority Threat-Vulnerability Combinations. They represent templates or exemplars that can be applied to similar studies of somewhat different contexts. That analysis laid a foundation for the risk management phase, and Backup Plans were defined for three alternative approaches, expressed in Table 9 and Appendices 3 and 4.

The approach adopted to the evaluation phase was to select a small but diverse sample of SaaS services, and assess the extent to which those services appear to enable each of the three alternative approaches to backup to be performed. From the analysis presented in section 5, it was evident that complete reliance on SaaS providers, at least at this stage in the maturation of such services, is unlikely to result in an outcome that addresses the backup and recovery needs of individuals and small organisations. In section 6, it was clear that a user who takes responsibility for their own First-Level Backups and Archives at least needs to reliably perform somewhat onerous and technically challenging tasks, and even then may not be able to achieve the intended result. In section 7, the BaaS approach is currently at best problematical and at worst infeasible.

The analysis needs to be subjected to review by peers, and improvements made to reflect their feedback. Further, the analysis needs to be applied to additional and special test-cases, reflecting the needs of small organisations and individuals with particular profiles. The analysis is likely to require adaptation to the extent that the usage among the target market-segment of general-purpose computing devices (desktops and laptops) declines to the point that datafile creation and amendment are generally undertaken on limited-capability appliances (whose current forms are smartphones and tablets).

The specific Backup Plans proposed in this paper, and/or variants of and successors to them, are capable of being productised by providers. These include corporations that sell hardware, that sell operating systems, that sell pre-configured hardware and software, that sell value-added hardware and software installations, that sell storage-devices, that sell storage services, that sell SaaS services, and that sell BaaS services. In addition, current service-providers can use the risk assessment, or the specific Backup Plans, as a basis for evaluating their offerings and planning improvements to address deficiencies.

The analysis of five mainstream SaaS providers concluded that, in many cases, very substantial barriers prevent the implementation of effective backup strategies by small organisations and individuals using SaaS services. This is a very important conclusion, and one that does not appear to have been published in the literature to date. Given the vital economic, social and personal importance of assuring ongoing access to data, this is a fairly remarkable finding. The adoption of cloud solutions may have been enthusiastic, but it has been blind, and that blindness will have serious negative consequences for some users.

A further concern is the very limited attention paid to backup of SaaS-maintained data in the formal research literature, a full decade after SaaS emerged. One reason for the lack of a literature may be that Computer Science is concerned with much more technically advanced matters than backup and recovery. Another may be that the emphasis of Information Systems research is so strongly on social science perspectives and the observation of technology in use that only a limited amount of constructive research is being undertaken. Alternatively, researchers may have such a strong commitment to the needs of corporations that the interests of small organisations and individuals are being largely ignored.

As individuals increasingly act as prosumers, they can be expected to become more demanding, and more interested in investing in effective but practical backup arrangements. Meanwhile, many large organisations will become concerned about importing subcontractors' security risks, and will bring pressure to bear on them to demonstrate the appropriateness of their backup arrangements, and to provide warranties and indemnities. The work reported here sounds the alarm. But it also lays a foundation for significant improvements in key aspects of the data security not only of prosumers and micro-organisations, but also of the larger organisations that depend on them.


References

AlZain M.A., Pardede E., Soh B. & Thom J.A. (2012) `Cloud Computing Security: From Single to Multi-Clouds' Proc. 45th Hawaii International Conference on System Sciences, at http://www.computer.org/csdl/proceedings/hicss/2012/4525/00/4525f490.pdf

Armbrust M., Fox A., Griffith R., Joseph A.D., Katz R., Konwinski A. & Zaharia M. (2010) 'A view of cloud computing' Communications of the ACM, 53, 4 (April 2010) 50-58

AS 4360 (2004) `Risk Management' Standards Australia, 2004

Austin R.P.B. (2013) 'Virus encrypted all google drive files - Cryptolocker virus' Google Drive Help Forum, October 2013, at https://productforums.google.com/forum/#!msg/drive/DmZKoIcAPzg/siCsN_lZDlQJ

Balachandra B.R., Paturi V.R. & Rakshit A. (2009) `Cloud Security Issues' Proc. IEEE International Conference on Services Computing, 2009, at https://xa.yimg.com/kq/groups/2584474/89013670/name/NDU-3.pdf

Bradley T. (2011) `30 Days With...Google Docs: Day 25: Don't Lose Your Google Docs Data' PCWorld, May 2011, at http://www.pcworld.com/article/228707/day_25_dont_lose_your_google_docs_data.html

Buffington J. (2012) 'How do you back up SaaS? I'd like to know' ESG, 21 December 2012, at http://www.esg-global.com/blogs/how-do-you-back-up-saas-id-like-to-know/

Buffington J. (2014) 'SaaS Backup ... Hunger Games style' ESG, 11 December 2014, at http://blog.esg-global.com/saas-backup-...-hunger-games-style

Buffington J. (2015) 'Your SaaS Application needs to be Backed Up!' ESG, 14 May 2015, at http://research.esg-global.com/reportaction/Blog0514201504/TOC?include=backup%20SaaS

Chen D. & Zhao H. (2012) `Data Security and Privacy Protection Issues in Cloud Computing' Proc. IEEE International Conference on Computer Science and Electronics Engineering, 2012, at http://xa.yimg.com/kq/groups/2584474/417972861/name/NDU-1.pdf

Cherdantseva Y. & Hilton J. (2012) 'A Reference Model of Information Assurance & Security' Proc. IEEE ARES 2013 SecOnt workshop, 2-6 September, 2013, Regensburg, at http://users.cs.cf.ac.uk/Y.V.Cherdantseva/RMIAS.pdf

Chervenak A. L., Vellanki V. & Kurmas Z. (1998) 'Protecting file systems: A survey of backup techniques' Proc. Joint NASA and IEEE Mass Storage Conference, March 1998, at http://www.storageconference.us/1998/papers/a1-2-CHERVE.pdf

Clarke R. (2008) 'B2C Distrust Factors in the Prosumer Era' Invited Keynote, Proc. CollECTeR Iberoamerica, Madrid, 25-28 June 2008, pp. 1-12, at http://www.rogerclarke.com/EC/Collecter08.html

Clarke R. (2011) 'The Cloudy Future of Consumer Computing' Proc. 24th Bled eConference, June 2011, PrePrint at http://www.rogerclarke.com/EC/CCC.html

Clarke R. (2012a) 'How Reliable is Cloudsourcing? A Review of Articles in the Technical Media 2005-11' Computer Law & Security Review 28, 1 (February 2012) 90-95, PrePrint at http://www.rogerclarke.com/EC/CCEF-CO.html

Clarke R. (2012b) 'A Framework for the Evaluation of CloudSourcing Proposals' Proc. 25th Bled eConference, June 2012, PrePrint at http://www.rogerclarke.com/EC/CCEF.html

Clarke R. (2013) 'Data Risks in the Cloud' Journal of Theoretical and Applied Electronic Commerce Research (JTAER) 8, 3 (December 2013) 59-73, Preprint at http://www.rogerclarke.com/II/DRC.html

Clarke R. (2015a) 'The Prospects of Easier Security for SMEs and Consumers' Computer Law & Security Review 31, 4 (August 2015) 538-552, PrePrint at http://www.rogerclarke.com/EC/SSACS.html

Clarke R. (2015b) 'SaaS Backup Fails the Fitness for Purpose Test' IEEE Cloud Computing 2, 6 (Nov-Dec 2015) 58-63, PrePrint at http://www.rogerclarke.com/EC/FPTB.html

Clarke R. (2016) 'Practicable Backup Arrangements for Micro-Organisations and Individuals' Xamax Consultancy Pty Ltd, February 2016, at http://www.rogerclarke.com/EC/PBAR.html

Cole E. (2013) 'Personal Backup and Recovery' Sans Institute, September 2013, at http://www.securingthehuman.org/newsletters/ouch/issues/OUCH-201309_en.pdf

Cringely R. (2011) '2011 prediction #8: Cloudburst' I, Cringely, 6 January 2011, at http://www.cringely.com/2011/01/2011-prediction-8-cloudburst/

Dorricott B. (2013) 'Read Xero's Terms and Conditions and then weep!' Meteorical, 31 December 2013, at http://www.meteorical.com.au/2013/12/read-xeros-terms-and-conditions-and-then-weep/

ENISA (2015) `Cloud Security Guide for SMEs' European Union Agency for Network and Information Security, April 2015, at https://www.enisa.europa.eu/activities/Resilience-and-CIIP/cloud-computing/security-for-smes/cloud-security-guide-for-smes/at_download/fullReport

Gallagher M.J. (2002) 'Centralized Backups' SANS Institute, July 2001, at http://www.sans.org/reading-room/whitepapers/backup/centralized-backups-513

Google (2016) `Download your data' Google Inc., accessed 3 March 2016, at https://support.google.com/accounts/answer/3024190?hl=en, mirrored at http://www.rogerclarke.com/EC/Google-Dyd-160303.pdf

de Guise P. (2008) 'Enterprise Systems Backup and Recovery: A Corporate Insurance Policy' Auerbach, 2008

Gwava (2015) 'Top 10 Google Vault Archiving Drawbacks' Gwava, original of 18 June 2014, at http://www.gwava.com/blog/top-10-google-vault-archiving-drawbacks

Hevner A.R. (2007) 'A Three Cycle View of Design Science Research' Scandinavian Journal of Information Systems, 2007, 19(2):87-92

Hevner A.R., March S.T. & Park, J. (2004) 'Design research in information systems research' MIS Quarterly, 28, 1 (2004), 75-105

Höfer C.N. & Karagiannis G. (2011) `Cloud computing services: taxonomy and comparison' J Internet Serv Appl 2 (2011) 81-94, at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.460.3939&rep=rep1&type=pdf

IASME (2013) 'Information Assurance For Small And Medium Sized Enterprises' IASME Standard v. 2.3, March 2013, at https://www.iasme.co.uk/images/docs/IASME%20Standard%202.3.pdf

ISO 27005 (2012) 'Information Technology - Security Techniques - Information Security Risk Management' International Standards Organisation, 2012

ISO 31010 (2009) 'Risk Management - Risk Assessment Techniques' International Standards Organisation, 2009

Javaraiah V. (2011) 'Backup for Cloud and Disaster Recovery for Consumers and SMBs' Proc. IEEE 5th Int'l Conf. on Advanced Networks and Telecommunication Systems (ANTS), December 2011

Jefferies C.P. (2012) 'Google Drive: The Differences Between the Web App and the Desktop App' Backupify, 15 August 2012, at http://blog.backupify.com/2012/08/15/google-drive-the-differences-between-the-web-app-and-the-desktop-app/

Kaskade J. (2011) 'The Reality Of Public Cloud' James Kaskade, June 2011, at http://jameskaskade.com/?p=1722

Krutz R.L. & Vines R.D. (2010) `Cloud Security: A Comprehensive Guide to Secure Cloud Computing' Wiley, 2010

Lennon S. (2001) 'Backup Rotations - A Final Defense' SANS Institute, August 2001, at http://www.sans.org/reading-room/whitepapers/sysadmin/backup-rotations-final-defense-305

Malunui et al. (2013) `How to Back Up Google Docs', many successive archived versions from 30 December 2013, at https://web.archive.org/web/20151024172831/http://www.wikihow.com/Back-Up-Google-Docs

Menn J. (2011) 'Cloud creates tension between accessibility and security' Financial Times, 14 November 2011, at http://www.ft.com/intl/cms/s/0/6513a4d6-0a06-11e1-85ca-00144feabdc0.html;2011. com/EC/CCEF-ITMediaReports-1109.rtf

NIST (2011) 'Guidelines on Security and Privacy in Public Cloud Computing' Special Publication 800-144, National Institute of Standards and Technology, December 2011, at http://csrc.nist.gov/publications/nistpubs/800-144/SP800-144.pdf

NIST (2012) 'Guide for Conducting Risk Assessments' National Institute of Standards and Technology, Special Publication SP 800-30 Rev. 1, September 2012, at http://csrc.nist.gov/publications/nistpubs/800-30-rev1/sp800_30_r1.pdf

Parker D.B. (1998) 'Fighting Computer Crime' John Wiley & Sons, 1998

Peffers K., Tuunanen T., Rothenberger M.A. & Chatterjee S. (2007) 'A Design Science Research Methodology for Information Systems Research' Journal of Management Information Systems 24, 3 (Winter 2007-8) 45-77

Preston W.C. (2007) 'Backup & Recovery' O'Reilly Media, 2007

Repschlaeger J., Wind S., Zarnekow R. & Turowski K. (2012) 'Selection Criteria for Software as a Service: An Explorative Analysis of Provider Requirements' AMCIS 2012 Proceedings, July 29, 2012, Paper 3, at http://aisel.aisnet.org/amcis2012/proceedings/EnterpriseSystems/3

Rowe W. (2015) `How to Back Up Google Docs' Tech-Recipes, 4 January 2015, at http://www.tech-recipes.com/rx/52333/backup-google-docs/

Saltzer J. & Schroeder M. (1975) 'The protection of information in computer systems' Proc. IEEE 63, 9 (1975), pp. 1278-1308

Strom S. (2010) 'Online Backup: Worth the Risk?' SANS Institute, May 2010, at http://www.sans.org/reading-room/whitepapers/backup/online-backup-worth-risk-33363

Subashini S. & Kavitha V. (2011) `A survey on security issues in service delivery models of cloud computing' Journal of Network and Computer Applications 34 (2011) 1-11, at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.259.6975&rep=rep1&type=pdf

Tapscott D. & Williams A.D. (2006) 'Wikinomics: How Mass Collaboration Changes Everything' Portfolio, 2006

Taylor C. (2014) 'Backup as a Service: To BaaS or Not to BaaS' Datamation, 10 November 2014, at http://www.datamation.com/cloud-computing/backup-as-a-service-to-baas-or-not-to-baas-1.html

TOB (2012) 'Types of Backup' typesofbackup.com, June 2012, at http://typesofbackup.com

Toffler A. (1970) 'Future Shock' Pan, 1970

Toffler A. (1980) 'The Third Wave' Pan, 1980

Witzel L. (2016) `Do I Need To Back Up Google Docs? Absolutely' Spanning, 25 January 2016, at http://spanning.com/blog/do-i-need-to-back-up-google-docs-absolutely/


Appendix 1: Threats to Data

Environmental Event

  • Electrical Event (interruption, surge)
  • Fire Event
  • Water Event
  • Impact Event

SaaS: The basic set of locations in which events may occur includes the individual's location, and elements of the remote networks used from time to time to achieve Internet connectivity for the laptop and handheld. With SaaS, these all remain, but events at further locations come into play, including the service-provider's servers, infrastructure, and network connections

SaaS: In addition to events affecting the individual's location, threats now arise from actions by the service-provider, and events affecting the service-provider's location

Attack

  • On Data Storage

    • Sabotage
    • Theft
    • Seizure

  • On Traffic

    • Interception

  • On a Business Process

    • Abuse of Privilege, e.g. unauthorised disclosure by an insider
    • Masquerade, e.g. access by adopting an authorised identity
    • Social Engineering

  • On a Computer-Based Process

    • 'Hacking' / Cracking
    • Malware, incl. Ransomware

SaaS: The data is subject to attacks by the service-provider, such as use for unauthorised purposes, or suspension of service based on a claim of breach of terms (e.g. late payment, breach of intellectual property, or use of the service for a purpose not approved of by the service-provider)

SaaS: The data is subject to third-party attacks targeting not only the individual's location, but also the service-provider's location and communication links between them

SaaS: Because a SaaS provider is an agglomerator of valuable data, it constitutes a 'honey-pot', and hence is subject to more, and more professional, attacks

SaaS: The data-holdings and the operations of SaaS providers are highly visible to, and subject to attacks by, a wide array of organisations that mostly ignore individuals and small organisations. The categories of attacker include:

  • Strategic Partners of the service-provider, e.g. 'small print' in the contract terms that permits exploitation of the individual's data by other parties
  • 'Co-Tenants' that use the same service-provider's infrastructure, e.g. gaining unauthorised access to the individual's data by taking advantage of weaknesses in the service-provider's security safeguards
  • Litigants, who may, by means of subpoenas, court orders or injunctions, achieve unauthorised access to the individual's data, seizure of the individual's data, prevention of the service-provider's operations, or seizure of the service-provider's servers or infrastructure
  • Receivers, who may suspend operations or exercise a lien on data held by the service-provider, in order to gain an advantage for the company's creditors
  • Government Agencies, which may gain access to the individual's data by exercising judicially sanctioned or merely administratively sanctioned powers, seizing the individual's data, seizing all of the data held by the service-provider, or seizing the service-provider's infrastructure
  • Staff-Members or Contractors, who may successfully mount insider attacks
  • 'Hackers', who may mount outsider attacks, for purposes as varied as the technical challenge, extortion, sabotage, industrial espionage and spying, whether as principal or agent, including on behalf of competitors, foreign governments and the government of the individual's own country

Accident, i.e. Unintentional Error

  • By A Human

    • Business Process Design Error
    • Inadequate Training in a Business Process
    • Business Process Performance Error

SaaS: Because processes have been delegated from a user (who is insufficiently trained and reliable) to a service-provider (that is presumed to be specialist, professional and reliable), mainstream errors such as forgetting to run a backup are less likely to occur.

SaaS: In addition to accidents caused by the individual, accidents caused by the provider and its staff are now relevant, such as non-performance of processes, mistaken suspension of service, migration of the service to a new data format and/or software platform, withdrawal of the service, bankruptcy

  • Within Infrastructure

    • Equipment Failure (Processor, Storage Device, Infrastructure, Power)
    • Storage-Medium Failure
    • Network Malfunction
    • Data Incompatibility

SaaS: In addition to failures within the individual’s infrastructure, failures within the provider’s much more substantial infrastructure come into play

SaaS: The format in which data is backed up may be specific to the service-provider or the software it uses. If so, backups may not enable the data to be recovered or the service restored.


Appendix 2: Data Vulnerabilities

Infrastructural Vulnerabilities

  • Dependence on the availability, reliability and integrity of:

    • Power Supply, subject to the Threats of
      blackouts, brownouts, voltage variability, UPS failure
    • Computing Facilities, subject to the Threats of
      planned and unplanned downtime, unavailability of a storage-device that can read a particular storage-medium, seizure powers
    • Networking Facilities, subject to the Threats of
      outages, congestion, DOS attack

SaaS: The basic set of locations in which vulnerabilities exist includes the individual's devices, their local network, any local network storage, and the remote networks used from time to time to achieve Internet connectivity for the laptop and handheld. With SaaS, these all remain, but additional vulnerability locations come into play, including the service-provider's servers, the service-provider's infrastructure, and the network connection between the individual's devices and the service-provider

SaaS: Network unavailability is highly significant, because of the need to synchronise between any file-copies on user-devices and the master-copy on the SaaS server, between the master-copy on the SaaS server and the Backup copy, and between the master-copy on the SaaS server and any Second-Level Backup copy

    • Storage-Media, subject to the Threats of
      disk crash, corruption, encryption/hostage/ransom, loss, online accessibility of live and backup data at the same time, seizure powers, unreadability due to humidity, dust, magnetic disturbance, corrosion, etc.
    • Ancillary Services, e.g. air-conditioning, fire equipment,
      subject to the Threats of outages and malfunctions
    • Automated Processes, subject to the Threats of
      design and coding errors, malware, wrong versions of software or data, erroneous recovery of software or data, overwrite of valid backups with corrupted backups

  • Dependence on the effectiveness of Access Controls over:

    • Authenticators
    • Software Execution
    • Remote Access
    • Message Transmission
    • Encryption and Decryption

Human Vulnerabilities

  • Dependence on the availability, reliability and integrity of individuals, subject to the Threats of:

    • Inadequate Performance
    • Inadequate Training
    • Inadequate Loyalty
    • Insufficient Wariness and Scepticism

SaaS: Generally, SaaS staff-members are likely to be more professional than individuals and small organisations.

SaaS: Generally, SaaS providers are far from transparent in relation to their operations, and contracts with small organisations and individuals are not subject to comprehensive Service-Level Agreements (SLAs), but only to Terms of Service dictated by the SaaS provider. Hence, for example, the existence, frequency of update, location and accessibility of backups are unlikely to be able to be discovered. This precludes effective risk management being performed.


Appendix 3: DIY Backup Plan to the User's Site

ESSENTIAL

    Infrastructure Features

  1. The user is to install power-surge protection and an 'uninterruptible' power supply with battery backup (UPS)
    The first feature greatly reduces the likelihood of electrical surges and failures harming the user's Backup copy, and the second provides sufficient time for an orderly close-down of mains-dependent devices when the power goes off
  2. a. The SaaS provider is to store the Primary copies of all files
    b. The SaaS provider is to provide facilities for creating, amending and accessing the data from the user's working-devices
    c. The SaaS provider is to provide 99.9% service-availability (10 minutes' downtime per week)
    This ensures that the user has access to their data and to the ability to create, amend, access and delete their data
  3. The SaaS provider is to facilitate access by the user's working-devices, in order to enable extraction of copies of all of the user's data
    This ensures that the user will be able to perform backup runs
  4. The user is to install storage device(s) that include removable storage-media, to support the process of downloading backups and storing them off-line

    File-Precautions

  5. The SaaS provider is to perform continual saves when data is being changed
    This guards against the loss of recently-completed amendments due to infrastructure failure
  6. The SaaS provider is to create a new file-version when making significant amendments
    This ensures recoverability from mistaken amendments and deletions
  7. The SaaS provider is to run malware detection and eradication software
    a. on all incoming files
    b. on all stored files
    These two safeguards greatly reduce the frequency with which malware will affect the individual's data. This is particularly important on devices using a Microsoft OS
  8. The user is to run malware detection and eradication software
    a. on each storage-device at the time it is connected to any working-device
    b. on all incoming files arriving via email, fetches using a web-browser, etc.
    These two safeguards greatly reduce the frequency with which malware will affect the user's Backup copy. This is particularly important on devices using a Microsoft OS

    Backup Runs

  9. a. The user is to perform weekly, fortnightly or monthly Full Backup to a rotating set of 2, 3 or 4 storage-media, as First-Level Backup
    This addresses the risk of the Primary and the Second-Level Backup both being inaccessible for any reason
    b. The user is to store the First-Level Backup storage-media offline
    This addresses the risk of simultaneous corruption of all versions of the data, e.g. by ransomware
  10. The SaaS provider is to maintain a full Backup to a separate storage-medium no less frequently than daily
    This provides an accessible and reasonably up-to-date Second-Level Backup
  11. a. The user is to, annually, and after each significant upgrade to software, perform a complete Disk-Image Backup of all working-devices, including all software and parameter-files
    This addresses the risk of serious contamination of software by malware
    b. The user is to store the resulting Disk-Images remotely and offline
    This addresses the risks associated with the primary location.

    Business Processes

  12. The user is to document and periodically rehearse backup procedures to implement the First-Level Backup Runs. These procedures may need to include format-conversion, from that exported by the SaaS provider to that needed by the user
  13. The user is to document and periodically rehearse recovery procedures for the following activities. These procedures may need to include format-conversion, from that exported by the SaaS provider to that needed by the user:

    • recovery of an individual file, from First-Level Backup
    • recovery of the complete set of files, from First-Level Backup
    • recovery of the operating environments, from the annual Disk-Images

  14. The SaaS provider is to document and periodically rehearse backup procedures to implement all of the Second-Level Backup Runs
  15. The SaaS provider is to document and periodically rehearse recovery procedures for individual files, and for the complete set of files, from Second-Level Backup
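
The rotation of First-Level Backup storage-media described under Backup Runs above can be illustrated with a short sketch. It is not part of the Backup Plan itself: the function name, the media labels and the start date are hypothetical, and it assumes that runs occur at a fixed interval counted from the first run.

```python
from datetime import date, timedelta
from typing import Sequence


def medium_for_run(run_date: date, first_run: date,
                   interval_days: int, media: Sequence[str]) -> str:
    """Return the label of the storage-medium to use for the Full Backup on run_date.

    Assumption: runs occur every interval_days, starting on first_run, and the
    media are used in strict rotation (the oldest copy is overwritten next).
    """
    if run_date < first_run:
        raise ValueError("run_date precedes the first backup run")
    elapsed_days = (run_date - first_run).days
    if elapsed_days % interval_days != 0:
        raise ValueError("run_date is not a scheduled backup date")
    run_index = elapsed_days // interval_days
    return media[run_index % len(media)]


# Illustrative weekly schedule using a rotating set of three media.
media_set = ["Medium-A", "Medium-B", "Medium-C"]
start = date(2017, 3, 6)
for week in range(5):
    run = start + timedelta(weeks=week)
    print(run.isoformat(), medium_for_run(run, start, 7, media_set))
```

With a set of three media and weekly runs, the medium written in any given week is not overwritten until three weeks later, so the two most recent prior Full Backups remain available offline while the third is being rewritten.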

RECOMMENDED

    Infrastructure Features – Additional Measures

  1. The SaaS provider is to facilitate mutual authentication and channel encryption between all of the individual's devices and the service-provider
    This greatly reduces the risks of data interception and data corruption, particularly when connecting from insecure external locations

    File-Precautions – Additional Measures

  2. The user is to, weekly, run malware detection and eradication software on all stored files
    This is particularly important on devices using a Microsoft OS

    Backup Runs – Additional Measures

  3. a. The user is to, half-yearly, retire a Full Backup to Archive
    b. The user is to store successive Archive copies locally and remotely, and possibly also on a third site
    This ensures that an occasional set of old file-copies exists and hence earlier versions of files can be recovered
  4. The user is to, annually, spool 3-year-old Archives to new media
    This addresses the risk of storage-media decay
  5. The user is to, 5-yearly, spool all Archives to a new media-type
    This addresses the risk of having storage-media that no storage-device can read
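
The Archive maintenance measures above (annual re-spooling of 3-year-old Archives, and 5-yearly migration to a new media-type) lend themselves to a simple due-date check. The following is a minimal sketch only. The `Archive` record, its field names and the function names are hypothetical, and it assumes that the user records the date on which each Archive copy was last written to its current medium.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Archive:
    label: str
    written_on: date   # date the copy was last written to its current medium
    media_type: str


def whole_years_between(earlier: date, later: date) -> int:
    """Number of complete years elapsed between two dates."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def due_for_respool(archives: List[Archive], today: date,
                    max_age_years: int = 3) -> List[Archive]:
    """Archives whose current copy is at least max_age_years old."""
    return [a for a in archives
            if whole_years_between(a.written_on, today) >= max_age_years]


def media_migration_due(last_migration: date, today: date,
                        interval_years: int = 5) -> bool:
    """True if all Archives should now be spooled to a new media-type."""
    return whole_years_between(last_migration, today) >= interval_years


# Illustrative data only.
holdings = [
    Archive("2013-H2", date(2013, 12, 31), "DVD"),
    Archive("2016-H1", date(2016, 6, 30), "DVD"),
]
for archive in due_for_respool(holdings, date(2017, 3, 10)):
    print("Re-spool to new media:", archive.label)
```

The thresholds are parameters rather than constants so that a user with different media, or a different appetite for risk, can vary them without changing the logic.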

UNADDRESSED RISKS

  1. It may not be feasible to require the SaaS provider to ensure that the format is readable by software readily available to the individual
    This would mean that the user is at risk that the data may be in principle accessible, but in practice unusable
  2. It is not feasible to require the SaaS provider to ensure that the user's data is inaccessible to the SaaS provider
    Inaccessibility to the provider is feasible (although not simple) in relation to Backup copies, which can be encrypted to a key that only the user has access to. But the Primary copy is processed in clear and hence, even if it is stored encrypted, the SaaS provider must be able to decrypt it


Appendix 4: Backup Plan Using BaaS

ESSENTIAL

    Infrastructure Features

  1. a. The SaaS provider is to store the Primary copies of all files
    b. The SaaS provider is to provide facilities for creating, amending and accessing the data from the user's working-devices
    c. The SaaS provider is to provide 99.9% service-availability (10 minutes' downtime per week)
    This ensures that the user has access to their data and to the ability to create, amend, access and delete their data
  2. The SaaS provider is to facilitate access by the BaaS provider, in order to enable extraction of copies of all of the user's data
    This ensures that the BaaS provider will be able to perform backup runs

    File-Precautions

  3. The SaaS provider is to perform continual saves when data is being changed
    This guards against the loss of recently-completed amendments due to infrastructure failure
  4. The SaaS provider is to create a new file-version when making significant amendments
    This ensures recoverability from mistaken amendments and deletions
  5. The SaaS provider is to run malware detection and eradication software
    a. on all incoming files
    b. on all stored files
    These two safeguards greatly reduce the frequency with which malware will affect the individual's data. This is particularly important on devices using a Microsoft OS

    Backup Runs

  6. The SaaS provider is to maintain a full Backup to a separate storage-medium no less frequently than daily
    This provides an accessible and reasonably up-to-date First-Level Backup
  7. The BaaS provider is to:
    a. maintain a full Second-Level Backup on a weekly, fortnightly or monthly basis
    b. store the Second-Level Backup in such a manner that it is not subject to the location-based risks that apply to the First-Level Backup
    These together address the risk of the Primary and the First-Level Backup both being inaccessible for any reason
    c. store the Second-Level Backup offline
    This addresses the risk of simultaneous corruption of all versions of the data, e.g. by ransomware
    d. facilitate user access to and use of the Second-Level Backup
    This addresses the risk that the Primary copy and the First-Level Backup, which are in the possession of the SaaS provider, may become inaccessible by the individual, for such reasons as undetected loss of data, bankruptcy, withdrawal of service, a commercial dispute, or seizure by another party

    Business Processes

  8. The SaaS provider is to document and periodically rehearse backup procedures to implement all of the Backup Runs
  9. The SaaS provider is to document and periodically rehearse recovery procedures for individual files, and for the complete set of files, from First-Level Backup
  10. The BaaS provider is to document and periodically rehearse backup procedures to implement the Second-Level Backup Runs. These procedures may need to include format-conversion, from that exported by the SaaS provider to that needed by the user
  11. The BaaS provider is to document and periodically rehearse recovery procedures for individual files, and for the complete set of files, from Second-Level Backup. These procedures may need to include format-conversion, from that exported by the SaaS provider to that needed by the user
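
The service-availability requirement under Infrastructure Features above can be checked with simple arithmetic. The sketch below converts an availability percentage into the downtime it permits over a given period. The function name is hypothetical, and the calculation assumes that availability is measured over elapsed clock time, which a provider's Terms of Service may define differently.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes


def permitted_downtime_minutes(availability_percent: float,
                               period_minutes: float = MINUTES_PER_WEEK) -> float:
    """Minutes of downtime permitted in the period at the stated availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must lie between 0 and 100")
    return period_minutes * (100 - availability_percent) / 100


# 99.9% availability permits roughly 10 minutes of downtime per week.
weekly = permitted_downtime_minutes(99.9)
print(f"{weekly:.2f} minutes per week")
```

At 99.9%, the permitted downtime is 0.1% of 10,080 minutes, i.e. about 10 minutes per week.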

RECOMMENDED

    Infrastructure Features – Additional Measures

  1. The SaaS provider is to facilitate mutual authentication and channel encryption between all of the individual's devices and the service-provider
    This greatly reduces the risks of data interception and data corruption, particularly when connecting from insecure external locations
  2. Backup Runs – Additional Measures

  3. a. The BaaS provider is to, half-yearly, retire a Full Backup to Archive
    b. The BaaS provider is to store successive Archive copies locally and remotely, and possibly also on a third site
    This ensures that an occasional set of old file-copies exists and hence earlier versions of files can be recovered
  4. The BaaS provider is to, annually, spool 3-year-old Archives to new media
    This addresses the risk of storage-media decay
  5. The BaaS provider is to, 5-Yearly, spool all Archives to a new media-type
    This addresses the risk of having storage-media that no storage-device can read
  6. The BaaS provider is to ensure that:
    • the Second-Level Backup is encrypted
    • the decryption key is accessible by the user
    • the decryption key is not accessible by either the SaaS or the BaaS provider
      These features reduce the risk of the remote backup data being accessed by parties other than the user
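
The archive-related measures above amount to a simple schedule. The following is a minimal sketch of how the actions due on a given date might be determined. It rests on several assumptions: the half-yearly, 3-year and 5-year periods are approximated in days, '3-year-old' is interpreted as the age of the media on which an Archive copy was last written, and the names `Archive` and `actions_due` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# Approximations of the periods named in the text (assumed, in days)
HALF_YEAR_DAYS = 183
MEDIA_REFRESH_AGE_DAYS = 3 * 365
MEDIA_TYPE_LIFETIME_DAYS = 5 * 365


@dataclass
class Archive:
    label: str
    written_on: date  # date the copy was last written to its current media


def actions_due(
    archives: list[Archive],
    last_archive_created: date,
    media_type_adopted: date,
    today: date,
) -> list[str]:
    """List the archive-maintenance actions that are due on `today`."""
    actions = []
    # Half-yearly: retire a Full Backup to Archive
    if (today - last_archive_created).days >= HALF_YEAR_DAYS:
        actions.append("retire a Full Backup to Archive")
    # Checked annually: re-spool copies whose media is 3 or more years old
    for archive in archives:
        if (today - archive.written_on).days >= MEDIA_REFRESH_AGE_DAYS:
            actions.append(f"re-spool {archive.label} to new media")
    # 5-yearly: move every Archive to a new media-type
    if (today - media_type_adopted).days >= MEDIA_TYPE_LIFETIME_DAYS:
        actions.append("migrate all Archives to a new media-type")
    return actions
```

For example, `actions_due([Archive("2014-H1", date(2014, 1, 1))], date(2016, 7, 1), date(2014, 1, 1), date(2017, 3, 10))` reports that a new Archive is due and that the 2014 copy should be re-spooled, but that the media-type does not yet need to change.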

UNADDRESSED RISKS

  1. It may not be feasible to require the SaaS provider to ensure that the format is readable by software readily available to the individual
    The user therefore bears the risk that the data may be accessible in principle but unusable in practice
  2. It is not feasible to require the SaaS provider to ensure that the user's data is inaccessible to the SaaS provider
    Such protection is feasible (although not simple) for Backup copies, which can be encrypted to a key that only the user has access to. The Primary copy, however, is processed in clear, so even if it is stored encrypted, the SaaS provider must be able to decrypt it


Acknowledgements

The assistance of Russell Clarke is gratefully acknowledged, in relation to conception, detailed design and implementation of backup and recovery arrangements for the author's business and personal needs, and for review of a draft of this paper. An assignment on a sub-set of this topic was set for ANU Computer Science students. The analysis reported here benefited from the submissions by Rebecca Catanzariti and Julie Noh, and by Patrick McCawley and Simon Whittenbury.


Author Affiliations

Roger Clarke is Principal of Xamax Consultancy Pty Ltd, Canberra. He is also a Visiting Professor in the Cyberspace Law & Policy Centre at the University of N.S.W., and a Visiting Professor in the Research School of Computer Science at the Australian National University.



Xamax Consultancy Pty Ltd
ACN: 002 360 456
78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Tel: +61 2 6288 6916

Created: 28 August 2014 - Last Amended: 10 March 2017 by Roger Clarke - Site Last Verified: 15 February 2009