Disaster Recovery

1. Business Processes Analysis

1.1 Core Processes

The Core Processes need to be identified from the BusinessProcesses list.

How to do a revocation:

So, in terms of business continuity / disaster recovery, these are core:

  1. critical systems

  2. OCSP/CRL servers

1.2 Secondary Processes

Secondary:

  1. email + maillists - all redundant
  2. support - receive certificate complaints, do revocations on them
  3. arbitration

Discretionary -- all other processes in the list are discretionary. In the context of a Disaster, these are ignored for the time being.

2. Standard Process Times

Standard Process Times (SPT) are needed as a baseline.

  1. revocation
    • Support -- rebuild + startup?
      • redundant channels:
        • email support
        • website POST box
        • phone??? VoIP??? SMS???
        • IRC + chat
    • 0 time for receiving certificate complaints
    • 1 hour to pass to arbitration
  2. Arbitration
    • 1 mailing list
    • 1 hour to designate Arbitrator
    • 24 hours to get 1st ruling on revocation
    • does arbitrator need guidelines?
      • false positives, false negatives, discretion amongst arbitrators....
  3. Revocation by Support
    • 1 hour to revoke
  4. Critical Systems
    • new CRL from support - 0 time
    • distribution to OCSP / CRL servers - 0 time

Then the SPT for revocation is 3 + 24 = 27 hours (three 1-hour steps plus 24 hours for the first ruling).
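The arithmetic behind the 27-hour figure can be sketched as follows. This is only an illustration in Python, not part of the plan: the step names and the `spt_hours` helper are hypothetical, the durations are the ones listed above, and the steps are assumed to run strictly one after another.

```python
# Durations in hours, taken from the list above.
# The step names are illustrative labels, not official process names.
REVOCATION_STEPS = [
    ("receive certificate complaint", 0),
    ("pass to arbitration", 1),
    ("designate Arbitrator", 1),
    ("first ruling on revocation", 24),
    ("revoke by Support", 1),
    ("new CRL from support", 0),
    ("distribute to OCSP / CRL servers", 0),
]


def spt_hours(steps):
    """Sum the step durations, assuming the steps run strictly in sequence."""
    return sum(hours for _name, hours in steps)


total = spt_hours(REVOCATION_STEPS)
print(total)  # 27
```

If any steps could overlap, the total would be lower; the sequential sum is the conservative baseline.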

3. Recovery Time Objectives

Recovery Time Objectives (RTOs) state how long it may take to recover the core processes and the secondary processes they depend on.

27 hours

  1. critical systems -- rebuild and start up -- ??
    • this would have to be faster than total revocation time
    • board will have to define this time:
    • within 24 hours
  2. OCSP/CRL -- rebuild and start up????
    • 0 time: must have redundancy
  3. Mail+mailing lists (Arbitration)
    • 0 time - redundant -- requirement, we need redundant mail for arbitrators?
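The constraint that each rebuild must be faster than the total revocation time can be checked mechanically. The sketch below is an assumption-laden illustration: the dictionary keys and the `rto_violations` helper are hypothetical, and the 24-hour value for critical systems is the proposal above, which the board still has to confirm.

```python
# Total revocation time (SPT) in hours, from section 2.
REVOCATION_SPT_HOURS = 27

# Proposed RTOs in hours. The critical systems value is still to be
# defined by the board; 24 is the figure suggested in the list above.
RTO_HOURS = {
    "critical systems": 24,
    "OCSP/CRL": 0,
    "mail + mailing lists": 0,
}


def rto_violations(rto_hours, limit_hours):
    """Return the names whose RTO is not strictly below the limit."""
    return sorted(
        name for name, hours in rto_hours.items() if hours >= limit_hours
    )


violations = rto_violations(RTO_HOURS, REVOCATION_SPT_HOURS)
print(violations)  # []
```

An empty result means every proposed RTO fits inside the 27-hour target.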

3.1 Failure Times

How long will it take then? Target is 27 hours.

== 27 hours

4. Maximum Acceptable Outage

Maximum Acceptable Outage (MAO) is the total time that the business decrees it can be down for in this context.

  1. OCSP/CRL == 0 time for existing ones
  2. 2 days before new revocations issued
  3. email / support / maillists == 0 time (redundant)
    • how long does it take to realise problems with mail systems?
    • throw at tech people ... we want redundant mail + 0 time

5. Recovery Point Objective

Recovery Point Objective (RPO) is the point in time back to which we recover data.

  1. up to what time before the Disaster do we have data (backups)?
  2. revocation: 24 hours (normal incremental backups)
    • ==> revocations can be lost

    • ==> user / Arbitrator must do confirm/retry manually

    • ==> write in CPS "you must check within 24 hours to confirm/retry"

      • mail: RPO == 1 hour on mail incoming (so 1 hour SPT can be met)
  3. OCSP/CRL: no issue because source files on critical systems
    • and on other OCSP servers
    • ==> requirement to load up from other OCSPs and from source.

    • RPO == 0 time
  4. critical systems: RPO == 24 hours
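To illustrate why a 24-hour RPO means revocations can be lost and must be confirmed or retried manually, here is a minimal Python sketch. The `lost_in_restore` helper and all timestamps are hypothetical; it only assumes that anything recorded after the last backup and up to the Disaster is missing from the restored data.

```python
from datetime import datetime, timedelta

# RPO for revocations on the critical systems, from the list above.
REVOCATION_RPO = timedelta(hours=24)


def lost_in_restore(recorded_at, last_backup_at, disaster_at):
    """True if a record made at `recorded_at` is missing after a restore."""
    return last_backup_at < recorded_at <= disaster_at


# Hypothetical timestamps, worst case: the last backup is a full RPO old.
disaster_at = datetime(2012, 6, 26, 12, 0)
last_backup_at = disaster_at - REVOCATION_RPO

recent = disaster_at - timedelta(hours=3)   # recorded after the last backup
older = disaster_at - timedelta(hours=30)   # recorded before the last backup

print(lost_in_restore(recent, last_backup_at, disaster_at))  # True
print(lost_in_restore(older, last_backup_at, disaster_at))   # False
```

Any revocation for which this check is true is the kind that the user or Arbitrator would have to confirm or retry.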

6. Others

Service Delivery Objectives: not offered (community CA).

Best efforts standard for revocation:

7. Strategy and Planning

What plans exist to put in place the systems and infrastructure required to meet the targets?

Maintenance

8. Decision Contact Info

8.1 Oophaga

9. Threats & Disasters

As in the Security Manual.

  1. data breach
  2. false certificate issuance
    • arbitration -> revocation

    • arbitration -> investigation, checking the logs

  3. root compromise
    • revoke root with vendors (business protocol)
    • reissue root
    • revoke subroot / certs

10. Side Question


DisasterRecovery (last edited 2012-06-26 08:49:48 by UlrichSchroeter)