

Stateless cryptography for virtual environments

T. Visegrady, S. Dragone, M. Osborne

Migrating systems onto virtualized environments, such as cloud platforms, is becoming a business imperative. Such platforms offer the promise of higher resilience combined with a relatively low cost of ownership. The platforms also involve a number of challenges that hinder their adoption, and a primary concern involves security. These security concerns stem in part from vulnerabilities that underlying virtualization functionality introduces, such as the ability to capture and replay the execution state of a virtualized machine. In systems where security is paramount, HSMs (hardware security modules) are often used. HSMs provide a tamper-resistant environment for storing sensitive cryptographic material and for executing cryptographic operations using this material. HSMs may appear to be important components for enhancing the security of virtual environments; however, current implementations are not well suited for this purpose. In this paper, we describe a typical HSM solution stack based on the de facto industry standard called PKCS #11 (Public Key Cryptography Standard #11). We explain the challenges introduced by virtualized platforms and show why the typical architectures based on PKCS #11 are not suitable for such environments. Finally, we describe an alternative IBM HSM solution called EP11 (Enterprise PKCS #11) and show how it addresses many of these challenges.

Introduction

In this paper, we describe how HSMs (hardware security modules) have traditionally been embedded in static architectures, and we discuss the need to offer cloud-based applications a more dynamic and reliable cryptographic API (application programming interface). We describe the de facto cryptographic-token application programming interface Public Key Cryptography Standard 11 (PKCS #11) and present how we have changed the architecture to support cryptographic users of virtualized clients. While the changes involved are substantial, we describe how our system can continue to support existing, legacy PKCS #11 applications while servicing the workloads of virtualized clients.

We describe security vulnerabilities inherent in virtual environments and identify the key features that virtual clients require, namely the reliable storage and migration of security state, and high-quality randomness. We demonstrate how the additional features needed for stateless implementation allow the efficient management of large datasets and key databases, and how our system can accommodate extremely dynamic environments, even for large numbers of hosts, which themselves appear and disappear frequently, and without notification.

We describe how our disintermediation of driver state allows local replication within HSMs, thus improving the error resiliency. Finally, we show how randomness from an HSM-internal “true-random” entropy source may be locally virtualized within an HSM and how this local replication can make computation error-resilient while requiring no further state replication within driver or host applications.

©Copyright 2014 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied by any means or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

T. VISEGRADY ET AL. IBM J. RES. & DEV. VOL. 58 NO. 1 PAPER 5 JANUARY/FEBRUARY 2014

0018-8646/14 © 2014 IBM

Digital Object Identifier: 10.1147/JRD.2013.2287811

Hardware security modules

Hardware security modules (HSMs) belong to the family of cryptographic tokens that are physical devices attached to general-purpose computing hosts. They provide a secure environment for generating and storing cryptographic material and for executing cryptographic operations. HSMs have traditionally been used as trust points in carefully controlled and audited infrastructures for which conventional wisdom has generally dictated that cryptography should be applied as close to the consuming application as possible in order for it to be secure. Their cryptographic contents are sometimes considered so critical that the built-in HSM defense mechanisms are augmented by placement of the HSM in specially designed security architectures. These architectures are designed to restrict access at the infrastructure layer by careful use of firewalls and network isolation.

In an ideal, secure system, cryptographic material should be generated and used within the protected boundary of an HSM. Sensitive material such as keys should never leave this secure and tamper-resistant environment. In reality, business continuity is important, and systems need to be resilient. HSMs may fail completely, sometimes intentionally responding to tamper, requiring key material to be securely replicated across multiple HSMs. Software components within HSMs may also malfunction, causing critical security situations that are difficult to detect. For example, failure may occur in the components that are used to generate random numbers. Random numbers are the basis for many cryptographic algorithms and protocols, and being able to predict the value of a particular number is a major security weakness. A particular class of random numbers, called pseudorandom numbers (PRNs), is important due to their characteristic of generating a repeatable sequence given a particular unpredictable input-value “seed.” Seeds should be supplied by a highly unpredictable “entropy” source, a TRNG (True Random Number Generator), which by itself is a sensitive resource. In practice, reliability is achieved by comparing values returned from multiple HSMs, which is only possible for deterministic algorithms but not for TRNGs.
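The last point, that cross-checking works for deterministic algorithms but not for TRNG output, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the HMAC-SHA-256 counter construction and the names `prng_block` and `cross_check` are hypothetical and are not taken from the paper or from any HSM firmware.

```python
import hashlib
import hmac
import os


def prng_block(seed: bytes, counter: int) -> bytes:
    """Deterministic output block: HMAC-SHA-256 of a counter under the seed.

    Assumption: this construction is only a stand-in for whatever
    deterministic generator a real HSM would use.
    """
    return hmac.new(seed, counter.to_bytes(8, "big"), hashlib.sha256).digest()


def cross_check(seed: bytes, counter: int) -> bytes:
    """Run the same deterministic step on two simulated HSMs and compare."""
    out_a = prng_block(seed, counter)   # hypothetical HSM A
    out_b = prng_block(seed, counter)   # hypothetical HSM B
    if not hmac.compare_digest(out_a, out_b):
        raise RuntimeError("HSM outputs disagree: possible malfunction")
    return out_a


seed = os.urandom(32)                   # stands in for TRNG output
block = cross_check(seed, 0)
```

Two independent draws from `os.urandom` cannot be compared in this way, because a correct entropy source is expected to return different values each time; only the seeded, deterministic step lends itself to comparison.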

The PKCS #11 standard

Applications use HSMs through APIs. The most widespread cryptographic API is PKCS #11, originally published by RSA Laboratories [1]. Responsibility for the PKCS #11 standard has since been transferred to OASIS (Organization for the Advancement of Structured Information Standards) [2]. OASIS is a nonprofit consortium that supports the development and adoption of open standards for the global information society. The PKCS #11 standard defines a platform-independent API that host applications use to access cryptographic tokens, such as HSMs and smart cards. Since the cryptographic tokens themselves are not standardized, an abstraction layer is defined. The API defines the most commonly used cryptographic object types such as ECC (Elliptic Curve Cryptography) keys, RSA (Rivest, Shamir, and Adleman) keys, DSA (Digital Signature Algorithm) keys, and AES (Advanced Encryption Standard) keys. The PKCS #11 standard also defines all of the functions needed to use, create/generate, modify, and delete those objects. HSM vendors supply implementations of the standard in the form of PKCS #11 Cryptographic Providers. There may be multiple such providers on any given system. Multiple simultaneous applications are supported, with the separation between applications accomplished using public and private session handles. A session handle is an identifier that uniquely identifies a stateful sequence of interactions (session) between an application and a PKCS #11 provider.

A typical interaction between a host application and a PKCS #11 provider is shown in Figure 1. There are a number of setup and breakdown steps that are required. These include selecting and initializing the provider (1), selecting an HSM (2), opening a session with the HSM (3), and then authenticating to the HSM (4). Certain functions may be called directly, such as a request to return a sequence of random data (5). Functions such as digitally signing a piece of data (9) require that the associated cryptographic objects stored within the HSM are located (5–8). In the example digital signature operation, a private key is required whose reference is needed for the signing function. Breaking down (i.e., terminating) the session requires releasing all known identifying state information (10), releasing all of the session state (11), and, finally, releasing all of the provider state (12).
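The stateful call sequence described above can be sketched with a toy provider in Python. This is not a binding to a real PKCS #11 library: the method names follow the standard's function names, but the argument lists are simplified, the PIN is not checked, `C_SignInit` and `C_Sign` are collapsed into one call, and the "signature" is an HMAC placeholder. The class `ToyProvider` and its data are hypothetical.

```python
import hashlib
import hmac
import os


class ToyProvider:
    """Toy stand-in for a stateful PKCS #11 provider (not a real binding)."""

    def __init__(self):
        self.initialized = False
        self.sessions = {}   # session handle -> state kept on behalf of the caller
        self.objects = {1001: {"label": "signing-key", "secret": os.urandom(32)}}

    def C_Initialize(self):
        self.initialized = True

    def C_GetSlotList(self):
        return [0]                                   # one hypothetical HSM slot

    def C_OpenSession(self, slot):
        handle = len(self.sessions) + 1
        self.sessions[handle] = {"slot": slot, "logged_in": False, "found": []}
        return handle

    def C_Login(self, session, pin):
        self.sessions[session]["logged_in"] = True   # PIN check omitted in this toy

    def C_GenerateRandom(self, session, length):
        return os.urandom(length)

    def C_FindObjectsInit(self, session, label):
        self._require_login(session)
        self.sessions[session]["found"] = [
            h for h, obj in self.objects.items() if obj["label"] == label
        ]

    def C_FindObjects(self, session):
        return list(self.sessions[session]["found"])

    def C_FindObjectsFinal(self, session):
        self.sessions[session]["found"] = []

    def C_Sign(self, session, key_handle, data):
        self._require_login(session)
        secret = self.objects[key_handle]["secret"]
        return hmac.new(secret, data, hashlib.sha256).digest()   # toy "signature"

    def C_Logout(self, session):
        self.sessions[session]["logged_in"] = False

    def C_CloseSession(self, session):
        del self.sessions[session]

    def C_Finalize(self):
        self.sessions.clear()
        self.initialized = False

    def _require_login(self, session):
        if not self.sessions[session]["logged_in"]:
            raise PermissionError("session is not authenticated")


# Setup: initialize provider, select HSM, open session, authenticate.
provider = ToyProvider()
provider.C_Initialize()
slot = provider.C_GetSlotList()[0]
session = provider.C_OpenSession(slot)
provider.C_Login(session, pin="1234")

# Direct call, then object location followed by signing.
random_bytes = provider.C_GenerateRandom(session, 16)
provider.C_FindObjectsInit(session, label="signing-key")
handles = provider.C_FindObjects(session)
provider.C_FindObjectsFinal(session)
signature = provider.C_Sign(session, handles[0], b"message")

# Breakdown: release identity, session state, and provider state.
provider.C_Logout(session)
provider.C_CloseSession(session)
provider.C_Finalize()
```

The point of the sketch is how much state (`sessions`, login flags, search results) the provider has to hold per caller before a single signature can be produced.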

Distributing keys

We have described the need to replicate and distribute keys for resilience and business continuity purposes. As the key storage format and secure distribution mechanisms were not part of the PKCS #11 standard, HSM vendors resorted to proprietary mechanisms for distributing keys. Some of these mechanisms introduced unwanted side effects such as enabling owners of key material stored in some products to extract private key material in unencrypted form [3]. Once outside of the HSM, keys are generally more vulnerable. A common approach to protecting keys during distribution is to encrypt (wrap) a key with another key. These keys are sometimes called KEKs (key encryption keys), key transport keys, or key wrapping keys. This approach to protecting keys is called key wrapping. Keys protected in this fashion can reside quite securely outside of an HSM, either in transit or in an external repository such as a database. The PKCS #11 standard defines wrap (export) and unwrap (import) function interfaces to import and export such keys. Note that one is still left with the problem of distributing key encryption keys.

Binding attributes to keys

In order to be able to use keys in a secure fashion, a series of conditions and restrictions need to be defined and tightly associated or bound to the key. One example is usage restrictions on which cryptographic operations can be performed using a specific key. The PKCS #11 standard defines these restrictions in terms of attributes and specifies a command to attach these attributes to a key. A problem arises when the binding between a key and its attributes is lost. This happens, for example, when exporting a key through the PKCS #11 wrapping function or when deriving new keys from existing keys. This loss of binding has led to a number of API-related vulnerabilities being discovered [4, 5]. The reaction by HSM vendors has been to implement a number of proprietary extensions to PKCS #11 that remove some of the API-related vulnerabilities. However, the fundamental problem of attribute disassociation persists for keys exported outside of the HSM using the wrap command.

In summary, we have described the importance of distributing keys for resilience purposes, and the use of the wrap and unwrap interface in the PKCS #11 standard to facilitate this distribution, and introduced the security problems that arise when key attributes are removed from keys during the distribution process.

Virtual environments

Virtual environments are built on concepts such as abstraction of the underlying hardware into software representations and the ability to save and restore the state of these representations. These concepts introduce some particular challenges with respect to cryptographic operations and security in general.

A virtual machine (VM) snapshot is a copy of its state, configuration, disk data, and memory at a specific time. The snapshot is saved as a disk image and can be restored at any time. Snapshots may be manual or automated during VM execution. Snapshots can be considered an extremely rapid backup technique that allows risky operations such as upgrading or patching an application to be quickly reverted. They also allow a VM to be quickly instantiated on another part of the network in the case where a failover is required or when performing load balancing.

To revert to a snapshot simply means restarting the VM with the same processor, peripheral, and memory state that was saved when the snapshot was taken. In addition to the new vulnerabilities created by saving physical machine representations to disk images, we will show how the concept of saving and replaying state has certain critical implications when it comes to cryptographic algorithms.

Virtual machines and security

VM environments introduce a number of new vulnerabilities that are not present in more static infrastructures. Instinctively, one thinks of those vulnerabilities introduced by running multiple VMs on the same hardware platform. If an attacker is able to gain control over one VM, it may be possible for the attacker to launch a number of side-channel attacks against a second VM co-located on the same physical server. It has been shown that using such attacks it is possible to extract decryption keys from a neighboring VM running on the same machine [6].

Figure 1. An example PKCS #11 interaction. The dashed arrows indicate information returned in an asynchronous fashion.

More interesting from our perspective are vulnerabilities that have an impact on cryptographic algorithms and their implementation. This category of vulnerabilities results from the mechanism of using snapshots to restore, replicate, back up, and transfer VMs. VM snapshots often contain security-relevant state that can be exploited. It has been shown, given repeated use of a VM snapshot, how one of the most common security mechanisms of the Internet, TLS (Transport Layer Security), can be compromised and how it is even possible to extract the secret DSA authentication key of the server [7, 8].

A number of cryptographic algorithms rely on the assumption that the previous execution history is not available. Restoring a VM that contains cryptographically relevant state “breaks” this assumption.

A specific category of execution history vulnerabilities is related to problems with random data. Many of today’s cryptographic algorithms rely on good quality random data that is “fresh” and used only once. It has been described how in virtual environments, it is conceivable that a VM is rolled back (i.e., returned) to a point at which randomness has been selected but not used [9]. This has grave consequences for cryptographic protocols that use random data for generating session keys or nonces, numbers that are used once in a cryptographic protocol, for example, to prevent a previous server interaction from being replayed (replay attacks). Digital signature schemes such as those based on DSA or ECDSA (Elliptic Curve DSA) have also been shown to be vulnerable [8–10], as have cryptographic protocols used in some privacy-enhanced applications [9].

In summary, it is important that applications that use cryptographic algorithms are constructed in such a way that a certain state cannot be captured in a snapshot and that applications have access to a reliable source of randomness.
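To make the DSA risk concrete, the following Python sketch works through the textbook algebra for two signatures that share one nonce, as could happen if a VM is restored to a snapshot taken after the nonce was chosen. The parameters are toy values picked for readability (far too small for real use), and the digests are given directly as small integers; none of these numbers come from the paper or from the cited attacks.

```python
# Toy DSA parameters (far too small for real use): Q divides P - 1.
P, Q = 607, 101
G = pow(2, (P - 1) // Q, P)      # generator of the order-Q subgroup


def inv(a: int) -> int:
    """Modular inverse modulo the prime Q via Fermat's little theorem."""
    return pow(a % Q, Q - 2, Q)


def sign(x: int, k: int, h: int) -> tuple:
    """Textbook DSA signature on digest h with private key x and nonce k."""
    r = pow(G, k, P) % Q
    s = inv(k) * (h + x * r) % Q
    return r, s


def recover_private_key(h1: int, sig1: tuple, h2: int, sig2: tuple) -> int:
    """Recover x from two signatures that share the same nonce (same r).

    From s1 = (h1 + x*r)/k and s2 = (h2 + x*r)/k it follows that
    k = (h1 - h2)/(s1 - s2) and x = (s1*k - h1)/r, all modulo Q.
    """
    (r, s1), (_, s2) = sig1, sig2
    k = (h1 - h2) * inv(s1 - s2) % Q
    return (s1 * k - h1) * inv(r) % Q


x = 57              # private key
k = 33              # nonce captured in the snapshot, reused after each restore
h1, h2 = 17, 58     # digests of two different messages, already reduced mod Q
sig1 = sign(x, k, h1)
sig2 = sign(x, k, h2)
recovered = recover_private_key(h1, sig1, h2, sig2)
```

An observer who sees both signatures needs no access to the VM itself: the repeated `r` value reveals the nonce reuse, and two linear equations then give up the private key.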

Using cryptographic modules in virtual environments

Two of the major reasons for migrating applications into virtual environments are the benefits of additional scalability and resilience. In the case of scalability, the small performance penalty incurred by the abstraction or virtualization of the hardware is more than compensated for by the ease with which it is possible to scale systems horizontally (i.e., scaling systems by adding additional virtualization units). Resilience is increased through the ability to use VM snapshots to rapidly restore an application in the event of a failure or to dynamically migrate an application from one server to another in response to planned maintenance activity or an infrastructure problem.

Some vendors offer some form of network-attached HSM, commonly a PCI (Peripheral Component Interconnect)-based device packaged within a host server. PKCS #11 providers on the application host connect via TCP/IP (Transmission Control Protocol/Internet Protocol) to the HSM host, which maps commands through the local I/O interface to the HSM. An application is generally “unaware” that the HSM being used is not locally attached, as the network transport mechanisms are implemented below the PKCS #11 provider interfaces.

The advantage of this approach is that it is simple to share a financially costly cryptographic device with a number of applications. The disadvantages are that 1) there is still a strong link between an application and a particular HSM, and 2) the application still has to store state in order to interact with the HSM.

In order to benefit from virtual environments, applications need to be independent of specific hardware, and be simple to scale and simple to migrate. We now introduce the IBM Enterprise PKCS #11 (EP11) stateless concept that addresses a number of the challenges that we have described.

Introducing the IBM Enterprise PKCS #11 stateless concept

The EP11 library provides applications with an interface very similar to PKCS #11 with two main differences: 1) cryptographic material, such as keys, is stored persistently and encrypted outside of the HSM, and 2) HSM devices themselves are kept essentially stateless. Figure 2 shows a typical PKCS #11 software stack and an equivalent IBM Enterprise PKCS #11 software stack side-by-side. The figure clearly shows the major difference between the two schemes, namely the relocation of state, sessions, and cryptographic objects [labeled (A)] from within the HSM to a layer outside of the HSM in the PKCS #11 host library. Cryptographic material stored outside of the HSM is encrypted using wrapping keys and packed into self-contained objects. Wrapping keys may be defined per application or per group of applications. They need to be distributed to each HSM that is being provisioned [labeled (B)]. An application can use an HSM only if the appropriate wrapping key is present. A routing table is used that maps wrapping keys to HSMs [labeled (C)]. This table can be local or centrally administered. Synchronization of wrapping keys and associated routing tables is an offline, audited administrative activity, outside of the scope of PKCS #11.
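The routing-table idea can be sketched in a few lines of Python. This is an illustrative model only: the `Request` fields, the identifiers in `ROUTING_TABLE`, and the first-available selection policy are all assumptions, and the paper does not describe the actual table format or selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Self-contained request: operation, wrapped key blob, and payload."""
    operation: str
    wrapping_key_id: str
    key_blob: bytes
    payload: bytes


# Hypothetical routing table: wrapping-key identifier -> HSMs provisioned with it.
ROUTING_TABLE = {
    "wk-payments": ["hsm-1", "hsm-2"],
    "wk-archive": ["hsm-3"],
}


def route(request: Request, available: set) -> str:
    """Pick any available HSM that holds the wrapping key for this request."""
    for hsm in ROUTING_TABLE.get(request.wrapping_key_id, []):
        if hsm in available:
            return hsm
    raise LookupError("no available HSM holds " + request.wrapping_key_id)


req = Request("sign", "wk-payments", b"<wrapped key>", b"message")
target = route(req, available={"hsm-1", "hsm-2"})
fallback = route(req, available={"hsm-2"})      # hsm-1 has disappeared
```

Because the request carries its own wrapped key, any HSM holding the matching wrapping key can serve it, which is what lets the second call succeed on a different device without any session transfer.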

Supporting legacy applications

Legacy PKCS #11 compliant applications are supported in the EP11 library by mapping PKCS #11 handles to objects within the host library, and by simulating those functions not required under our scheme, namely those related to the stateful initialization and teardown of sessions and to object discovery (see Figure 1, numbers 2, 5, 6, 7, and 11).

Mapping handles to objects

Standard PKCS #11 applications use references to objects (handles) stored within an HSM. EP11 has direct access to these objects in encrypted form. The EP11 host library simply replaces PKCS #11 handles with encrypted objects. All other PKCS #11 parameters are simply passed through together with the encrypted object in a single request to the HSM. This approach allows a lightweight (i.e., efficient in terms of size and complexity) host implementation that maintains the appearance of a standard PKCS #11 provider while utilizing multiple HSMs. Figure 3 shows the EP11 architecture in the mainframe environment, where host assistance is provided by the central Integrated Crypto Service Facility (ICSF). ICSF is a component of z/OS and is offered with the base product. It is the software component that provides access to the IBM System z crypto hardware. The figure shows the storage of state, sessions, and objects outside of the HSM and the routing table that allows EP11 requests to be distributed across multiple HSMs, both within a single host and across multiple hosts. The serialization and transport of such requests is handled by the PKCS #11 Transport layer. Within the HSM, requests and their associated state are reconstructed before being executed with the aid of cryptographic libraries and card firmware.
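A minimal Python sketch of the handle substitution follows. The class `HostLibrary`, its methods, and the `fake_hsm` callback are hypothetical names used for illustration; the real EP11 host library interface is not shown in the text.

```python
import itertools


class HostLibrary:
    """Toy host library: PKCS #11-style handles map to wrapped key blobs."""

    def __init__(self, send_to_hsm):
        self._send = send_to_hsm            # callable(operation, blob, data)
        self._blobs = {}                    # handle -> encrypted object
        self._next = itertools.count(1)

    def register(self, blob: bytes) -> int:
        """Store a wrapped object on the host and hand back a small handle."""
        handle = next(self._next)
        self._blobs[handle] = blob
        return handle

    def sign(self, handle: int, data: bytes) -> bytes:
        # Replace the handle with the encrypted object and pass everything
        # else through unchanged in a single request.
        return self._send("sign", self._blobs[handle], data)


received = []


def fake_hsm(operation, blob, data):
    """Stand-in for the transport layer and HSM; records what it was sent."""
    received.append((operation, blob, data))
    return b"signature-placeholder"


lib = HostLibrary(fake_hsm)
handle = lib.register(b"<wrapped key blob>")
result = lib.sign(handle, b"message")
```

The application only ever sees the integer handle, so a legacy caller cannot tell that the key actually lives on the host in wrapped form.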

Startup/shutdown code

The HSM in our EP11 concept relies on external cryptographic objects, and, unlike the PKCS #11 standard, these objects are combined together with the cryptographic function in a single self-contained request. This alleviates the need for an EP11 application to execute a number of functions such as login, key search, or object disposal. The host library may immediately submit functional requests when called by an application, assuming that at least one HSM with a valid wrapping key is available.

In practice, host libraries, transforming from PKCS #11 to EP11 calls, manage startup and shutdown operations. The startup operation generally omits any library or HSM initialization, since none is necessary at the level of PKCS #11 services. PKCS #11 functions that are used to search for keys on HSMs, such as C_FindObjects, are implemented in the host library as scalable hash lookups on either a local or central database. The calling application is presented with responses that simulate the original stateful, linear search of a finite HSM database.

The EP11 interface is particularly beneficial to virtualized hosts in cloud infrastructures because so little startup or shutdown processing is required. Traditional stateful CSPs (cryptographic service providers) would require authentication to all HSMs they would like to utilize, and the CSP would need to track virtualized hosts that disappear or reappear. Similarly, regular PKCS #11 lookups such as C_FindObjects require state maintenance on a per-caller basis, which, in our case, is simply an administrative function of the host key database.

Figure 2. Comparison of PKCS #11 and EP11 architectures.

Since there is very little startup/shutdown API code, and scalable search replaces stateful object location, the EP11 host library easily accommodates an essentially unbounded number of host images, with most startup/shutdown-related operations serviced completely within host code.
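The replacement of a stateful HSM search by a host-side hash lookup might look like the following Python sketch. The `KeyDatabase` class, its attribute-value index, and the batch interface are assumptions made for illustration; the paper does not specify the database layout.

```python
from collections import defaultdict


class KeyDatabase:
    """Toy host key database indexed by attribute values (hash lookups)."""

    def __init__(self):
        self._index = defaultdict(list)     # (attribute, value) -> wrapped blobs

    def add(self, blob: bytes, attributes: dict) -> None:
        """Index a wrapped object under each of its (attribute, value) pairs."""
        for item in attributes.items():
            self._index[item].append(blob)

    def find(self, attribute: str, value, max_count: int):
        """Yield matches in batches, mimicking repeated C_FindObjects calls."""
        matches = self._index.get((attribute, value), [])
        for start in range(0, len(matches), max_count):
            yield matches[start:start + max_count]


db = KeyDatabase()
db.add(b"blob-1", {"label": "tls", "type": "rsa"})
db.add(b"blob-2", {"label": "tls", "type": "ec"})
db.add(b"blob-3", {"label": "tls", "type": "rsa"})
batches = list(db.find("label", "tls", max_count=2))
```

The caller still receives results in batches, as a PKCS #11 application would expect, but the lookup itself is a single dictionary access on the host and requires no per-caller search state inside the HSM.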

State encryption

Cryptographic objects stored outside of the HSM are protected using an authenticated encryption scheme (comprising encryption followed by a MAC). One characteristic of this scheme is that it prevents many of the attacks on standard PKCS #11 APIs. Prohibiting the separation of attributes and unencrypted keys is essential in order to prevent attacks based on manipulation of attributes [4, 5].

EP11 stores attributes of public keys as standard SPKIs (subject public key information structures) with Message Authentication Code (MAC) additions. This allows the attribute-management infrastructure to simply accommodate public keys. Key usage restrictions are relevant for public keys when they are used to wrap other sensitive data. While usage restrictions are automatically available for keys generated within EP11, unencrypted external public keys may be imported to MAC form in an administrator-assisted, audited process.

Since updates of wrapping keys follow a two-stage commit procedure, and may be scheduled across multiple HSMs, system availability is not affected by an ongoing migration process. Host libraries may re-encrypt wrapped structures on multiple HSMs if the migration process is properly planned. In this case, the re-encryption of host key databases may utilize as much HSM computational capacity as feasible in the form of a lower-priority HSM job. Ignoring hardware failures, the only host-observable effect of an ongoing wrapping-key migration is a transient decrease of aggregate HSM capacity during the period that wrapping keys are switched.
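The encrypt-then-MAC protection of host-resident objects, with attributes covered by the same MAC as the key, can be sketched as follows. This is a toy under loud assumptions: the keystream built from SHA-256 is only a standard-library stand-in for a real cipher, the blob layout is invented for this example, and none of it reflects the actual EP11 wrapped-key format.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(enc_key: bytes, mac_key: bytes, key_material: bytes, attributes: dict) -> bytes:
    """Encrypt the key, then MAC attributes, nonce, and ciphertext together."""
    attrs = json.dumps(attributes, sort_keys=True).encode()
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(key_material))
    ct = bytes(a ^ b for a, b in zip(key_material, stream))
    body = len(attrs).to_bytes(2, "big") + attrs + nonce + ct
    return body + hmac.new(mac_key, body, hashlib.sha256).digest()


def unwrap(enc_key: bytes, mac_key: bytes, blob: bytes):
    """Verify the MAC first; only then decrypt and return (key, attributes)."""
    body, tag = blob[:-32], blob[-32:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob failed authentication")
    n = int.from_bytes(body[:2], "big")
    attrs, nonce, ct = body[2:2 + n], body[2 + n:18 + n], body[18 + n:]
    stream = _keystream(enc_key, nonce, len(ct))
    key = bytes(a ^ b for a, b in zip(ct, stream))
    return key, json.loads(attrs)


enc_key, mac_key = os.urandom(32), os.urandom(32)
blob = wrap(enc_key, mac_key, b"k" * 32, {"sign": True, "wrap": False})
key, attrs = unwrap(enc_key, mac_key, blob)
```

Because the MAC covers the attributes and the ciphertext as one unit, flipping a usage flag in a stored blob makes `unwrap` fail instead of yielding the same key with weaker restrictions.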

Attribute Bound Objects

Attribute Bound Objects (ABOs) are container objects that cryptographically bind key material with usage control attributes. The lack of such formats is a known limitation of the PKCS #11 standard [4, 5]. In our implementation, ABOs can coexist with standard PKCS #11 objects (“non-AB” keys) using the same function calls. Aside from wrapping keys, AB and non-AB keys are functionally interchangeable. During wrapping, non-AB keys may be used for standard PKCS #11 wrapping algorithms, where EP11 is compatible with any other PKCS #11 library. There are minor restrictions, such as the settings-controlled restrictions prohibiting some PKCS #11 functionality with known dependencies, for example, preventing weaker keys from wrapping stronger ones.

Figure 3. EP11 architecture in z/OS environment. (PCIe: PCI Express.)

When keys that are bound to attributes are transported, AB keys use the same PKCS #11 wrapping and unwrapping services, but their usage is incompatible with PKCS #11 wrapping. The AB methods use a documented, non-PKCS #11, authenticated-encryption (AE) format. The fact that PKCS #11 lacks a standard AE mode means that, currently, our AB format is proprietary. While our format is well documented, it does not match any of the PKCS #11 wrap formats. Our AB-wrap format is closely related to the native EP11 wrapped-key format. This format has been defined in an architecture-neutral form to allow semi-standardized interchange with other crypto providers.

Sessions

While essentially stateless, the EP11 key-encryption format allows the binding of state objects to sessions. Our sessions only loosely resemble the PKCS #11 concept, and introduce a modest amount of state to register host entities to each eligible HSM. EP11 sessions differ from PKCS #11 sessions particularly in the setup and breakdown steps: log in to token, search for keys, perform operations, dispose keys, and log out. In our system, sessions are disjoint entities controlling groups of keys that are stored external to the HSM.

Session identifiers are pseudo-random values that are derived from host or application supplied information such as job numbers. Any HSM that does not contain a particular session will reject any encrypted state bound to that session. The overall latency introduced by the key derivation step within the object-unwrapping process is negligible. Session identifiers are the mechanism used to track the originating jobs or host entities. They allow authentication to an HSM based on proof-of-possession for all authenticated host-resident state, including non-sensitive data such as public keys. State encryption combines the controlling wrapping key and the session identifier to derive an encryption key unique to the combination in such a way that keys corresponding to different sessions are cryptographically separated.

Since sessions themselves are small fixed size objects, the number of active sessions may grow to “sufficiently large” values, even if the actual number is in fact limited by HSM memory capacity. In the current release, the number is in the thousands, but can be increased if necessary. In practice, we expect virtual clients to combine image and job identification in the case where encrypted state needs to be non-portable across different images. Conversely, clients that may share jobs would benefit from more centralized management of session/identifier construction. Note that a typical cloud environment load-balancing scenario using multiple identical images would count as a single client since they are interchangeable from a cryptographic provider perspective.

Revoking access to HSM services

The capability of combining keys and sessions to restrict the use of a host-resident state also provides an important benefit to virtualized hosts: keys may be effectively and quickly revoked simply by terminating the session. Access to the keys themselves is not necessary, and session-management infrastructure code can be outside of the virtualized images. This allows keys to be safely stored within images themselves. It further allows the use of standard storage resiliency and scalability mechanisms offered by virtualized images without introducing security problems. In such environments, session-management infrastructure may easily implement policies in which the sessions of compromised hosts are terminated, thus preventing any future use of compromised keys. This is possible since sessions cannot be reconstructed without the originating passphrase, PIN (personal identification number), etc. Thus, even if an attacker is able to obtain a foreign session, the attacker cannot reactivate parts of a key database belonging to that session after its owner logs out. Administrators can log out arbitrary sessions, but not log them back in; thus, if the passphrase of the session ID is lost, those keys become useless.
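The session binding and revocation described in this section and the preceding one can be modeled in Python as follows. The sketch assumes an HMAC-SHA-256 derivation and a SHA-256 based session identifier; the paper does not name the derivation functions, and `ToyHsm`, `make_session_id`, and `session_key` are hypothetical names.

```python
import hashlib
import hmac


def make_session_id(job: str, secret: str) -> bytes:
    """Pseudo-random session identifier from host-supplied data (assumed form)."""
    return hashlib.sha256((job + ":" + secret).encode()).digest()


def session_key(wrapping_key: bytes, session_id: bytes) -> bytes:
    """Derive a per-session encryption key from wrapping key and session id."""
    return hmac.new(wrapping_key, b"session:" + session_id, hashlib.sha256).digest()


class ToyHsm:
    """Toy HSM that keeps only a set of registered session identifiers."""

    def __init__(self, wrapping_key: bytes):
        self._wrapping_key = wrapping_key
        self._sessions = set()

    def login(self, session_id: bytes) -> None:
        self._sessions.add(session_id)

    def logout(self, session_id: bytes) -> None:
        self._sessions.discard(session_id)      # revokes all state bound to it

    def key_for(self, session_id: bytes) -> bytes:
        """Return the key protecting state bound to a registered session."""
        if session_id not in self._sessions:
            raise PermissionError("unknown session: bound state is rejected")
        return session_key(self._wrapping_key, session_id)


hsm = ToyHsm(b"w" * 32)
sid_a = make_session_id("job-1", "passphrase-a")
sid_b = make_session_id("job-2", "passphrase-b")
hsm.login(sid_a)
hsm.login(sid_b)
key_a = hsm.key_for(sid_a)
key_b = hsm.key_for(sid_b)
hsm.logout(sid_a)                               # e.g. host A was compromised
```

After the logout, every blob encrypted under `key_a` is unusable on this HSM even though the blobs themselves were never touched, while state bound to the other session remains usable.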

Adaptive caching of sensitive state plaintext

Unlike PKCS #11, EP11 functions pass in objects and state externally. These objects need to be decrypted within the HSM, and the integrity and authenticity of their contents verified. While the additional processing overhead of requests is not negligible, an important performance enhancement is available to EP11, since the unwrapping of host-resident state may aggressively optimize the caching of decrypted objects. While a single component deals with all state-unwrapping functionality, the individual internal functions have full context information and augment unwrap requests with relevant metadata. In this scenario, hardware-assisted state unwrapping is a realistic option, since some of our firmware revisions include the necessary stream-processing primitives.

In the current HSM firmware implementation, caching decisions are static and not updated during runtime. A fixed internal hash table is used with a simple least recently used (LRU) replacement strategy for cache entries. The selection between 1) caching unwrapped objects and 2) letting them be added to the cache after unwrapping is mainly made based on the function. Operations typical for high-volume workloads, such as digital signing, benefit most from caching, whereas object unwrapping for key/attribute state modification does not and is flagged to skip caching.

In our scheme, objects subject to caching are all authenticated using object signature verification. A side effect of this authentication is that we are able to use parts of the HMAC (keyed-hash message authentication code) signatures as hashes for optimizing cache management. Since the cache is organized as a two-level hierarchy, the LRU replacement within each sub-group operates on a small number of elements and may be aggressively specialized. The LRU replacement for four to six entries is turned into an optimal set of comparisons by the gcc compiler version 4.1, the current embedded compiler for our HSM target.

Even the simple caching decisions that are currently used demonstrate a significant latency decrease across practical workloads. Our sample workload is a strongly PKI (public key infrastructure)-oriented digital signature application for signing electronic identity documents. We have observed that even a modest number of cache entries quickly adapts to the calling pattern. In our experiments, 256 or 512 cache entries already reach close to 90% hit rates, since the CA (certificate authority) issuing signatures is hierarchical. Under such conditions, the most frequently used private signing keys within the PKI hierarchy are clearly "hot" (i.e., available) in the cache. Experiments on realistic workload sizes indicate that the effective decrease in performance is slightly less than 10% compared to an implementation operating on fully HSM-resident keys, such as a regular PKCS #11 implementation on comparable hardware. Increasing the cache element count, on our current firmware and HSMs, is a trivial enhancement, since the total byte count of cached elements in our CA workload is negligible (megabyte range for 512 keys) compared with available HSM storage.

Since caching of sensitive state has no impact other than reduced latency, we may improve cache strategies based on flows present within the HSM. A natural extension would be the matching of performance statistics with object types. As an example, an overwhelmingly PKI-oriented workload could even treat symmetric keys as lower-priority candidates for caching once a long-term usage pattern emerges. While the necessary statistics are already available to the HSM, EP11 does not currently adapt caching based on such dynamic criteria.
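The two-level cache organization can be approximated by the following sketch. It assumes that the leading bytes of an object's HMAC tag select a small sub-group and that each sub-group performs its own LRU replacement; the class name `TagIndexedCache`, the default geometry (128 sub-groups of four entries, 512 entries in total), and the `cacheable` flag are illustrative choices, not details of the HSM firmware.

```python
from collections import OrderedDict


class TagIndexedCache:
    """Cache of unwrapped objects, indexed by their authentication tag."""

    def __init__(self, num_sets: int = 128, ways: int = 4):
        # First level: a fixed table of sub-groups.
        # Second level: a small LRU list inside each sub-group.
        self._sets = [OrderedDict() for _ in range(num_sets)]
        self._ways = ways
        self.hits = 0
        self.misses = 0

    def _bucket(self, tag: bytes) -> OrderedDict:
        # The tag is already uniformly distributed, so its leading bytes
        # can be reused as the hash without further computation.
        index = int.from_bytes(tag[:4], "big") % len(self._sets)
        return self._sets[index]

    def get(self, tag: bytes):
        bucket = self._bucket(tag)
        if tag in bucket:
            bucket.move_to_end(tag)        # mark as most recently used
            self.hits += 1
            return bucket[tag]
        self.misses += 1
        return None

    def put(self, tag: bytes, plaintext: bytes, cacheable: bool = True) -> None:
        if not cacheable:                  # e.g. attribute-modification requests
            return
        bucket = self._bucket(tag)
        bucket[tag] = plaintext
        bucket.move_to_end(tag)
        if len(bucket) > self._ways:
            bucket.popitem(last=False)     # evict the least recently used entry


# Tiny geometry to make the replacement behavior visible.
cache = TagIndexedCache(num_sets=1, ways=2)
cache.put(b"tag-A", b"key A")
cache.put(b"tag-B", b"key B")
cache.get(b"tag-A")                        # A becomes most recently used
cache.put(b"tag-C", b"key C")              # evicts B, the least recently used
```

Keeping each sub-group to a handful of entries is what makes the replacement step cheap enough to specialize, at the cost of occasional evictions when several hot objects fall into the same sub-group.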

Scalability across HSMs

Load-balancing cryptographic functions across multiple HSMs is a feature performed directly by libraries located on the host platform. These libraries maintain lists of active HSMs and their associated load and performance figures. The libraries are able to accurately determine system performance because they are aware of both the nature of requests and the corresponding keys used. This allows overall system performance to be efficiently managed. Load statistics may also be combined with the expected performance characteristics of requests, together with the past and current utilization history of the HSMs. The concept of attaching state to HSM requests enables the utilization of multiple HSMs. This allows us to trade a small fixed addition to latency for effectively unbounded system throughput.
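A host-side dispatcher of this kind might look like the following sketch. The per-operation cost table, the `HsmStatus` record, and the least-queued-work policy are all assumptions made for illustration; the text does not describe the actual selection policy or any performance figures.

```python
from dataclasses import dataclass

# Hypothetical expected cost per operation type, in milliseconds.
EXPECTED_COST_MS = {
    "rsa-2048-sign": 2.0,
    "ecdsa-p256-sign": 0.5,
    "aes-encrypt": 0.05,
}


@dataclass
class HsmStatus:
    name: str
    active: bool = True
    queued_ms: float = 0.0   # estimated work already assigned to this HSM


def dispatch(hsms: list, operation: str) -> HsmStatus:
    """Pick the active HSM with the least estimated outstanding work."""
    candidates = [h for h in hsms if h.active]
    if not candidates:
        raise RuntimeError("no active HSM available")
    target = min(candidates, key=lambda h: h.queued_ms)
    # The library knows the nature of the request, so it can update its
    # load estimate without querying the HSM.
    target.queued_ms += EXPECTED_COST_MS[operation]
    return target


hsms = [HsmStatus("hsm0"), HsmStatus("hsm1")]
first = dispatch(hsms, "rsa-2048-sign")
second = dispatch(hsms, "aes-encrypt")
third = dispatch(hsms, "aes-encrypt")
```

Since every request carries its own state, any active HSM holding the relevant wrapping key and session can serve it, which is what allows the choice to be made purely on load.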

HSM hardware and resilience

An HSM generally consists of two distinct types of computational elements: module processor units (MPUs) and special-purpose computing cores, such as task-specific accelerators. To ensure the high reliability and availability of cryptographic functionality, there are a number of alternatives. One approach is to replicate most of the computing elements within an HSM, including entire MPUs, in order to tolerate faults caused by failure of the underlying hardware. A second approach is to introduce error-detecting codes into the arithmetic units of the special-purpose cores and MPUs.

While replicating computing elements may provide higher RAS (reliability, availability, and serviceability), it has the drawback that the additional processing overhead limits achievable performance. Most HSM devices are less complex commodity devices that target high throughput. These devices are often termed cryptographic accelerators. Simpler accelerators typically contain multiple computing cores ("multi-core"), and generally tolerate low reliability in order to achieve better raw performance. While HSMs generally do not provide the reliability required in an enterprise environment, they may be augmented to provide error detection.

One approach is to use multiple HSMs in order to provide redundant computation. The limitation of this approach is that the host system requires multiple HSMs and at least a single checksum-calculating core. As shown before, this approach is not suitable for non-deterministic computations such as those involving random values.

An alternative method for adding resilience is to insert checksums directly into the processed stream [11, 12]. The latter approach allows host HSM infrastructure code to compare and remove checksums before the response is passed back to the host. In practice, a specialist HSM hashing accelerator calculates response hashes. As a result, the bandwidth of that particular hashing core available to host applications is reduced, since checksum calculations are dedicated to this core. Furthermore, the channel needs to store transient state in order to track requests and responses. This solution, however, is restricted to fully deterministic operations, since it requires identical inputs to produce identical outputs.

We have seen that the use of VM snapshots may not meet the cryptographic requirement for good-quality random data that is "fresh" and only used once [9]. An HSM can provide a solution to this problem in the case where an API allows access to the random number generator (RNG) located in the HSM. In such an environment, the problem described by Garfinkel and Rosenblum [9] no longer applies, since each and every request always obtains "fresh" random data.

The challenge of supplying a reliable random data service within the HSM is that the parallel MPUs, which handle the same requests for reliability reasons, cannot access the TRNG (entropy source) hardware in an asynchronous manner, since its output is not deterministic. A solution to the problem that allows the cross-checking of non-deterministic calculations is described in an IBM patent for a virtualized TRNG [11]. The virtualized TRNG may drive independent DRNGs (deterministic RNGs) in parallel, providing synchronized "random" states in uncoordinated requestors. The principle is based on a single real TRNG supplying entropy in the form of a stream, which is replicated for multiple readers. Assuming all non-deterministic reads are so virtualized, the behavior of readers should be identical.

Request-specific blocks of TRNG output are exposed to the processors only indirectly. The raw TRNG bytes are assigned to exactly one request, never sharing TRNG state between different requests. Every TRNG byte so used gets mixed into exactly one VTRNG (virtualized TRNG), externalized only to instances of the same request. Each VTRNG instance maintains a common seed buffer per request, and one counter per MPU for each request. Each MPU may maintain its own DRNG. Since DRNGs are seeded by the same seed, they will have identical state.

The concept of the VTRNG works only within a single HSM in providing reliable TRNG functionality; it does not work across multiple HSMs at the host level. Thus, to obtain a high-quality and reliable random function from an HSM, the reliability has to be already integrated.
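The VTRNG principle can be modeled in a few lines. This is a sketch under stated assumptions, not the patented design: `os.urandom` stands in for the hardware TRNG, SHA-256 over the seed and a counter stands in for the DRNG (folded into the read path here rather than kept per MPU), and the class and method names are hypothetical.

```python
import hashlib
import os


class VirtualizedTRNG:
    """One entropy stream, replicated so that redundant MPUs read identical values."""

    def __init__(self, entropy_source=os.urandom):
        self._entropy = entropy_source   # stand-in for the single real TRNG
        self._seeds = {}                 # request id -> common seed buffer
        self._counters = {}              # (request id, MPU id) -> read counter

    def read(self, request_id: int, mpu_id: int, nbytes: int = 32) -> bytes:
        if not 0 < nbytes <= 32:
            raise ValueError("this sketch returns at most one hash block")
        # Raw TRNG bytes are drawn once per request and never shared
        # between different requests.
        if request_id not in self._seeds:
            self._seeds[request_id] = self._entropy(32)
        # Each MPU advances its own counter, so MPUs need not read in lockstep.
        counter = self._counters.get((request_id, mpu_id), 0)
        self._counters[(request_id, mpu_id)] = counter + 1
        block = hashlib.sha256(
            self._seeds[request_id] + counter.to_bytes(8, "big")).digest()
        return block[:nbytes]

    def finish(self, request_id: int) -> None:
        # Discard all per-request state once the request completes.
        self._seeds.pop(request_id, None)
        for key in [k for k in self._counters if k[0] == request_id]:
            del self._counters[key]


vtrng = VirtualizedTRNG()
first_on_mpu0 = vtrng.read(request_id=1, mpu_id=0)
second_on_mpu0 = vtrng.read(request_id=1, mpu_id=0)
first_on_mpu1 = vtrng.read(request_id=1, mpu_id=1)   # arrives later, same value
```

Because both MPUs observe the same sequence for a given request, their results can be compared for error detection even though the underlying values are non-deterministic.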

Conclusion

In this paper, we presented the concept of a stateless interface to HSMs for virtual infrastructures. We have highlighted challenges with key management and with the additional security vulnerabilities that virtual environments incur. We have introduced the EP11 concept and shown how it can be used to address many of the challenges that we have highlighted. We do not claim that it addresses all of the challenges raised. So long as applications store security-relevant state that can be captured as a VM snapshot, there will be vulnerabilities. What EP11 can provide is a reliable and fresh source of random data that can be used by application developers. It is up to developers to ensure that this data is retained in host memory for the shortest amount of time possible, thus minimizing the risk that it is captured in a VM snapshot.

Our EP11 concept offers a number of additional characteristics that benefit virtual environments. Consider the following.

Security migration

Application developers are often challenged when implementing security. Ideally, the underlying infrastructure within a virtual environment should provide the appropriate security independent of the application. The concept of appropriate security depends on many factors, such as the type of data being processed, the sensitivity of the client involved, and the mix of other applications that may share resources with each other. The application developer cannot be aware of many of these factors. Our solution allows the level of security to be configured without involving the application. If, at a later time, a stronger security model is required, this can be implemented outside of the application at the EP11 provider level, thus enabling the concept of security migration.

Paying for crypto cycles

A basic characteristic of the cloud model is the concept of "paying for compute cycles." Our scheme has properties that simplify the introduction of this model in the cryptographic domain, namely paying for cryptographic cycles. The basis for such a feature is the capability to collect statistics for cryptographic providers. Our scheme can be configured to use global object handles to reference particular cryptographic objects. In order for a cryptographic object to be used, it must be loaded into an HSM. Collecting statistics using global handles is considered trivial in this context. EP11 is currently available on the IBM zEnterprise EC12 [13].

*Trademark, service mark, or registered trademark of International Business Machines Corporation in the United States, other countries, or both.

**Trademark, service mark, or registered trademark of PCI-SIG, in the United States, other countries, or both.

References

1. PKCS #11 Cryptographic Token Interface Standard (v2.20 28 June 2004), RSA Lab., Bedford, MA, USA, PKCS #11 v2.20. [Online]. Available: ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-11/v2-20/pkcs-11v2-20.pdf

2. Organization for the Advancement of Structured Information Standards. [Online]. Available: https://www.oasis-open.org/

3. R. Anderson, M. Bond, J. Clulow, and S. Skorobogatov, "Cryptographic processors - A survey," Proc. IEEE, vol. 94, no. 2, pp. 357–369, 2006.

4. J. Clulow, "On the Security of PKCS #11," in Proc. 5th Int. Workshop CHES, 2003, vol. 2779, pp. 411–425.

5. M. Bortolozzo, M. Centenaro, R. Focardi, and G. Steel, "Attacking and fixing PKCS #11 security tokens," in Proc. 17th ACM Conf. CCS, 2010, pp. 260–269.

6. Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, "Cross-VM side channels and their use to extract private keys," in Proc. ACM Conf. CCS, 2012, pp. 305–316.

7. T. Ristenpart and S. Yilek, "When good randomness goes bad: Virtual machine reset vulnerabilities and hedging deployed cryptography," in Proc. NDSS, 2010.

8. B. Brumley and N. Tuveri, "Remote timing attacks are still practical," in ESORICS. Berlin, Germany: Springer-Verlag, 2011, pp. 355–371.

9. T. Garfinkel and M. Rosenblum, "When virtual is harder than real: Security challenges in virtual machine based computing environments," in Proc. 10th Conf. HOTOS, 2005, vol. 10, p. 20.

10. P. Q. Nguyen and I. Shparlinski, "The insecurity of the digital signature algorithm with partially known nonces," J. Cryptology, vol. 15, no. 3, pp. 151–176, Jun. 2002.

11. V. Condorelli, T. J. Dewkett, M. D. Hocker, and T. Visegrady, "Communications Channel Interposer, Method and Program Product for Verifying Integrity of Untrusted Subsystem Responses to a Request," U.S. Patent 7 516 246, Apr. 7, 2009.

12. S. Dragone, T. Visegrady, and V. Condorelli, "System and a Method for Providing Nondeterministic Data, IBM," Patent Appl. No. US20110106870 A1, May 5, 2011.

13. IBM zEnterprise EC12 (zEC12) IBM Systems and Technology Data Sheet, IBM Corporation, IBM Systems and Technology Group 012, Somers, NY, USA. [Online]. Available: http://public.dhe.ibm.com/common/ssi/ecm/en/zsd03029usen/ZSD03029USEN.PDF

Received March 16, 2013; accepted for publication April 18, 2013

Tamas Visegrady IBM Research Division, Zurich Research Center, 8803 Ruschlikon, Switzerland ([email protected]). Dr. Visegrady received his Ph.D. degree in electrical engineering from the University of New Hampshire, specializing in design automation for microelectronics. After working as a support engineer for high-performance computing, security engineering, and then hardware development at IBM Poughkeepsie, Dr. Visegrady joined IBM Research in Zurich in 2003. His current work involves architecture design for cryptographic coprocessors, and operating system security infrastructure for IBM server platforms.

Silvio Dragone IBM Research Division, Zurich Research Center, 8803 Ruschlikon, Switzerland ([email protected]). Mr. Dragone received an M.Sc. degree in electrical engineering from the Swiss Federal Institute of Technology (ETH) Zurich in 2002. Mr. Dragone joined the Communication Systems department in IBM Research, Switzerland, in 2002 to work on network processors. Since 2007, he has worked on embedded security and cryptographic coprocessors. He is author or coauthor of nine patents.

Michael Osborne IBM Research Division, Zurich Research Center, 8803 Ruschlikon, Switzerland ([email protected]). Mr. Osborne received a B.Sc. degree in microelectronics and computing from the University of Wales in 1983 and an M.Sc. degree in business strategy from Strathclyde University in 2006. He spent 10 years in Germany, the last five as CTO (chief technology officer) of a software company. He joined the Communication Systems department in IBM Research, Switzerland, in 1998 to work on protocol stacks and network processors. He moved to the Computer Science department in 2002 to work on embedded security and security evaluations. He is author or coauthor of 16 patents.
