event horizon - operational risk taxonomy public loss data

February 2007

EVEN IN the fast-paced world of risk analytics, event types are a relatively new invention. The two-level data hierarchy we have all come to know – whether fondly or otherwise – as ‘Basel II event types’ originated with a special task force of the Working Group on Operational Risk (WGOR), formed by the Institute of International Finance. The task force’s recommendations, released in May 2001, were promptly incorporated into the original Basel II framework, where they remain today, virtually unchanged (see note 1).

Agreeing upon a set of standard event types, whether for a single organisation, a consortium or the industry as a whole, is no simple task. As a practitioner, I engaged in a number of such exercises, beginning in the late 1990s (primarily as head of loss data collection at JP Morgan, where the firm’s original model underwent detailed reviews after each of two mergers, and also as co-ordinator of the original WGOR task force, referred to above, and as a long-term member of the ORX Definitions Committee). The process seems less tortuous today than it did in 2001, as industry experience and sophistication have grown, but even now event-type category discussions can feel halting and inconclusive.

Commentators generally agree that the Basel event types are not perfect. A number of critics took aim at the framework upon its initial release. But industry commentary has gradually subsided, if only because the Basel Committee issued a series of interpretive papers without reopening the hierarchy for discussion or clarifying its ambiguities. Currently, event types remain something of an uneasy fixture on the landscape, neither actively vetted and embraced by the community, nor likely to disappear anytime soon.

Event horizon

Rick Cech takes a second look at what makes up operational risk event types, and asks if there is a more advanced way to define them

Best practice


This awkward posture, I would argue, is at once both more and less than event types deserve. On the one hand, the Basel II hierarchy is imperfect, in no small part because it is subject to definitional overload. Event types are one of only two structured hierarchies in the Basel framework (the other being business lines). And no one variable, in isolation, can express all the nuances that practitioners want to capture in ‘event space’ without it becoming something of a jumble. It is as if an artist tried to describe a painting with a single adjective, meant to describe colour, brightness, canvas texture, frame size, source of artistic inspiration, etc, all in a single word.

On the other hand, it is altogether too easy to dismiss event types pre-emptively, as arbitrary pigeonholes deficient in analytical rigour. In fact, event type definitions can be made quite rigorous and precise, and the discipline of ‘getting them right’ is important (even essential) in realising their value.

In light of all this, many firms have implemented ‘modified’ event type hierarchies – more or less mappable to Basel – for recording loss data. (Their tacit hope in doing so is that they will not be reined in by literal-minded regulators intent on Basel orthodoxy.) Even more firms have adopted separate frameworks for self-assessment, with risk categories that differ from those employed in loss recording. The latter development seems unfortunate, since a firm selecting this path forfeits a valuable opportunity to match its operational risk concerns (as represented in self-assessments) with its actual loss experiences (as reflected in event data).

One can conclude from this that the industry needs to take a serious second look at event types as an element of the operational risk data model.

Redefining event types

To do this topic justice, we need to revisit some fairly basic questions. These include: what in fact are event types? And what are they meant to measure, or to convey? And, having resolved these issues, what role do event types play in a rigorous risk management regime?

The basic principles of ‘event typing’ are not especially difficult to grasp, yet it is surprisingly hard to describe them succinctly. And this is because the concept of ‘event’ itself is rather indirect, requiring a roundabout sort of definition. Specifically, an ‘event’ is something that happens (or fails to happen) in executing an operational process, and which causes the final results of the process to differ from our original desires or expectations (see note 2). Operational risk events, in other words, are defined mainly in relation to some hypothetical ‘expected’ outcome that never actually happens! External events, it is true, can often be described more directly, as in ‘a tornado struck the data centre’, but this isolated exception does not resolve the general messiness of event linguistics.

How can we proceed from this basic realisation to a useful theory of event types? As before, the answer is not particularly difficult, but it does require a bit of set-up.

Let us start here: a primary reason for the existence of financial institutions is to execute a series of valuable, yet often routine, transactions for clients, accurately and consistently. Real-world examples range widely, from retail check processing to safekeeping of client assets, to wire transfers, to settlement of securities trades, to booking of over-the-counter derivatives, etc. (Discretionary activities, such as market and credit decision-making, also figure importantly in the industry’s value added, of course, but for operational risk purposes we focus on a firm’s non-discretionary or operational processes.)

Because we all want operational processes to unfold simply and predictably, there is a tendency to represent them diagrammatically using tight, directional process maps, sometimes nearly as simple as the illustration ‘Anatomy of an operational process’ (other process maps are hugely complex, of course, but all share the same core goal of espousing rationality and order).

What is missing from such streamlined diagrams is a full description of the roles played by key process participants. Even routine transactions typically involve several players, each with a well-defined part. ‘Regulars’ on the list include customers/clients, counterparties, exchanges, employees, providers of physical and systems support, etc.

The actions of individual transaction participants might not be worth singling out if we could always rely on them to complete their responsibilities correctly and on time. But in reality there are powerful forces at work against this harmonious outcome. “Simple and predictable” execution is an aspirational goal, not a matter of due course that we can take for granted.

Consider employees. All firms have policies that direct employees to perform their jobs – and only their jobs – promptly, loyally and with adequate skill. But in life employees may be distracted, overworked, avaricious, disloyal, angry, poorly trained, collusive, etc. And this may cause them not to perform as expected or desired, in any of several predictable ways. When an employee’s behaviour deviates from what we expect, a potential operational risk event looms (see note 3).

The same sort of scenario can be constructed for the other players in a financial services transaction. Clients may be naive, deceptive or confrontational. A counterparty may place competitive advantage above integrity, or be inept. Fraudsters are always on the lookout for avenues by which to sneak something of value away from the firm or its clients. And, finally, daily operations of any organisation rely on the absence of certain external disruptions, such as erupting volcanoes, outbreaks of avian flu, etc.

The creation of a hierarchy of event types involves nothing more – nor less – than the classification of recurring situations in which financial transactions are disrupted by the failure of their participants (either human or natural) to behave as expected. ‘Break-outs’ can take many forms, to be sure, but most flow in recurring channels. And this regularity is of use in classifying them, based on criteria such as: which player was responsible; what sort of rule or obligation was breached; what kind of impact was suffered, etc.

www.opriskandcompliance.com 36


[Figure: Anatomy of an operational process. Initiate transaction → Operational process → Expected result]

[Figure: Participants in operational processes. Initiate transaction → Defined business process → Expected result, with the process depending on employee performance, infrastructure and support, client participation and counterparty cooperation]


What standards or guidelines should we adopt for this task? A number of key principles suggest themselves (see the table ‘Event type category criteria’).

To demonstrate these principles at work, let us classify some common, recurring employee ‘break-outs’ into their corresponding event type buckets. We will restrict ourselves to established Basel categories (or, in some cases, variants that are widely used in the industry). With more time, we could start from scratch to generate a completely new hierarchy, and see if it differed from conventional wisdom.

When employees perform badly, managers have a number of questions. They may want to know, for one thing, if the action was ‘intentional’. But estimating ‘intent’ requires us to judge the mental state of the badly performing employee, often at some time in the not-so-recent past. And for categorisation purposes, such one-off, subjective judgements tend to produce inconsistent and unreliable results. So, instead, we begin with a more objective question: “Judging from available evidence, was the employee acting in furtherance of company business at the time of the event?” By focusing on what the employee was doing, rather than on his or her thought process, it is possible to arrive at specific, observable standards that we can agree upon as indicative of event status. As we will see, this works more consistently (see note 4).

In any case, if we determine that the employee was ‘on company business’ at the time of the event, then we are dealing with some kind of business mistake or misconduct. The exact category depends on the type of impact, and on the specific rule or standard of behaviour that was violated.

Case 1.a – Execution. Suppose an employee failed to use ordinary skill or diligence in processing a transaction, thus creating a financial loss. This event would be categorised as ‘execution, delivery and process management’. Or, if it took place in the data centre, it might be ‘business disruptions and system failures’ (see note 5). Note that we are not asking at this stage why the employee made a mistake. Our immediate task is purely one of classification.

And this is as far as we need to go here for our present purposes. Matters become more complicated if we attempt a Level 2 classification in the execution category, since Basel introduces a number of elements (vendor or counterparty involvement, customer account management, etc) that in hindsight may cloud the hierarchy at this high level more than advance it. And we do not sense clear industry standards at Level 2 for execution. But these are questions for another day.

Case 1.b – Accidents. If an employee mistake causes physical damage or injury, rather than financial loss, then our reaction is much different, and the Basel category would be ‘damage to physical assets’ (a poor name, since the activity examples listed under it include events that also may generate physical injury, etc). Some firms and industry associations have created a new, special Level 2 category ‘accidents and public safety’ to explicitly accommodate this type of incident.

Case 1.c – Business practices. Suppose the break-out occurred not because the employee’s performance lacked the required skill, but rather because it violated some legal or contractual requirement of the firm. Examples might include predatory or unfair lending, failure to disclose material information while underwriting securities, breach of contract, etc. Break-outs of this type fall into the Level 1 category ‘clients, products and business practices’.

Improper business practices are treated here as acts undertaken ‘in furtherance of company business’. This is because they typically do arise from acts that, when taken, qualified as authorised company business – no matter how inappropriate the actions may seem in hindsight, and no matter how much the company may later disavow them, and no matter how big a bonus the offending employees may have received. Following this basic, clear-cut rule avoids much potential for subjectivity and inconsistency.

Fiduciary violations constitute a special and important sub-category of improper business practices. If a financial services firm owes a fiduciary duty to its client, even small, clerical mistakes may have legal consequences (otherwise, a higher level of misconduct would be required for liability). The difficulty here, from a classification standpoint, is that ‘fiduciary duty’ is not a business concept, but a legal one. That is, a ‘fiduciary duty’ exists only where a statute or a judge says that it does, and the standards vary from location to location. Thus, it is possible for a particular event to be properly classified as ‘suitability, disclosure and fiduciary’ in one jurisdiction and ‘improper business or market practices’ in another. Such cases require a degree of specific local knowledge to categorise.

Event type category criteria

• The hierarchy should be of manageable size and scope

Everyday users must be able to comprehend the structure of the hierarchy, and navigate its branches to arrive at a correct selection.

• Labels should be as intuitive as possible

A basic requirement for business buy-in. Yet it is challenging at Levels 1 and 2, which are necessarily somewhat abstract. The ultimate solution seems to lie in adopting more concrete, Level 3-style categories that describe familiar, tangible incidents.

• Each event should fall (ideally) into one and only one category

Every event should have a single ‘home’. In reality, we sometimes see ‘compound’ events, in which several break-outs occur at once. But such exceptions needn’t disrupt the process of defining general rules.

• Category ‘buckets’ should reflect differences in the way that risk managers react to particular break-outs

Event types vary in (i) how important they are, (ii) what strategy is used to address them, (iii) who is asked to address them, etc. To the extent that event ‘buckets’ correspond to our intuitive sense of these differences, they will seem more ‘natural’.

• Category boundaries should be defined clearly and used consistently

Objectively verifiable criteria, consistently applied, are essential in differentiating event ‘buckets’. Objectivity is a key to consistent results.

• Category boundaries should be consistent with (and ‘mappable to’) industry standard classifications

Sooner or later, most firms will seek to employ external data in their risk management programmes. If a firm’s internal event categories cannot be mapped to the hierarchies used by external providers (consortia, etc), the task will be far more difficult.

• Category definitions should be based on event characteristics, not on impact types, causes or controls

An event type hierarchy that includes separate buckets for ‘property damage’ and ‘business disturbance’ cannot accommodate events that include both types of impact together, as many do. See box at end for further discussion of this critical point.

• High-level categories generally should not be defined in terms of specific business lines or products

As a rule, event types should be associated with particular lines of business, products, etc, only at ‘lower’ levels of the hierarchy. Higher-level categories should be limited to describing general features of the break-out that occurred.
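One of the criteria in the table, mappability to industry standard classifications, can be made concrete with a simple lookup. The sketch below is illustrative only: the internal category names are assumptions based on the variants mentioned in this article (‘accidents and public safety’, ‘malicious damage’), not any firm’s or consortium’s actual hierarchy, and the Level 1 labels follow the wording of the Basel II framework.

```python
# Basel II Level 1 event types (wording as in the Basel II framework).
BASEL_LEVEL_1 = {
    "Internal fraud",
    "External fraud",
    "Employment practices and workplace safety",
    "Clients, products and business practices",
    "Damage to physical assets",
    "Business disruption and system failures",
    "Execution, delivery and process management",
}

# Hypothetical internal hierarchy, mapped to Basel Level 1.
# The internal names are assumptions for illustration only.
INTERNAL_TO_BASEL = {
    "Execution": "Execution, delivery and process management",
    "Accidents and public safety": "Damage to physical assets",
    "Malicious damage": "Damage to physical assets",
    "Business practices": "Clients, products and business practices",
    "Internal fraud": "Internal fraud",
}


def unmapped_categories(mapping):
    """Return internal categories whose target is not a Basel Level 1 type."""
    return sorted(k for k, v in mapping.items() if v not in BASEL_LEVEL_1)


def to_basel(internal_category):
    """Translate an internal category into its Basel Level 1 equivalent."""
    return INTERNAL_TO_BASEL[internal_category]
```

A check such as `unmapped_categories(INTERNAL_TO_BASEL)` could be run whenever the internal hierarchy changes, so that a category with no Basel equivalent is caught before external data needs to be combined with internal records.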

Case 2.a – Fraud/Theft. Now let us turn to a second, quite different class of events, namely those in which the employee’s break-out was not in furtherance of company business, but rather served the employee’s personal interests at the firm’s expense. Embezzlements, misuse of client funds, participation in external theft rings, etc, all constitute internal fraud under Basel. The response of a risk manager to such cases is likely to be a phone call to the local police precinct, rather than preparation for commercial litigation or a regulatory citation (see note 6).

Financial institutions are subject to a particularly grievous form of internal fraud, namely unauthorised activity (commonly called rogue trading). Self-interested manipulations of books or markets by traders are among the most colourful and highly publicised of all operational risk events. But they are not inherently different from general internal fraud – except in their capacity to inflict immense damage on an institution in a very short time. It is this potential that has led to their separate treatment.

Sometimes employees harm the company by violating specific oaths they take as company workers. For instance, when leaving the firm an ex-employee (still subject to prior restrictions) may not solicit the firm’s clients, approach former co-workers to join the new firm, etc. Such cases would fall within Basel’s employment practices and workplace safety category, and specifically under employee relations, although this category is more commonly applied to cases in which the firm is accused of failing to meet its financial obligations to an employee.

Case 2.b – Malicious acts. Finally, some employee break-outs are attributable, not to a mistake, nor to actions taken for personal gain, but simply to spite or malice. Such events would fall under damage to physical assets for Basel purposes, but other hierarchies now in use include a specific Level 1 category of malicious damage. In addition to clarifying the nature of the event, this approach avoids Basel’s unfortunate Level 1 category label, which fails to refer to personal injury even though such damage is implicit in the Level 3 activity examples.

As the table of employee break-outs shows, we can reconstruct several Basel event types from a consistent, functional analysis of employee break-outs. Note that we have addressed only one kind of break-out (those by employees). Indeed, our quick review of this topic is not exhaustive, even for employees, and it does not begin to catalogue the incidents that can arise from ‘break-outs’ committed by other transaction participants.
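The walk through Cases 1.a to 2.b amounts to a short decision procedure, which can be sketched in code. This is a minimal illustration, not a definitive implementation: the enum labels and function name are hypothetical, only Basel Level 1 outcomes are returned, and the data-centre variant of Case 1.a (‘business disruptions and system failures’) is left out for brevity.

```python
from enum import Enum, auto


class BreakOut(Enum):
    """What the employee did wrong (illustrative labels, not Basel terms)."""
    LACK_OF_ORDINARY_CARE = auto()     # Cases 1.a and 1.b
    LEGAL_OR_CONTRACT_BREACH = auto()  # Case 1.c
    PERSONAL_GAIN = auto()             # Case 2.a
    BREACH_OF_EMPLOYEE_DUTY = auto()   # e.g. soliciting clients after leaving
    MALICE = auto()                    # Case 2.b


# Break-outs that, in the discussion above, occur while the employee is
# acting in furtherance of company business.
ON_COMPANY_BUSINESS = {
    BreakOut.LACK_OF_ORDINARY_CARE,
    BreakOut.LEGAL_OR_CONTRACT_BREACH,
}


def classify(on_company_business: bool, break_out: BreakOut,
             physical_impact: bool = False) -> str:
    """Return a Basel Level 1 event type for an employee break-out."""
    # First question: was the employee on company business?
    if on_company_business != (break_out in ON_COMPANY_BUSINESS):
        raise ValueError("break-out kind is inconsistent with the first question")
    # Second question: which rule or duty was breached, and with what impact?
    if break_out is BreakOut.LACK_OF_ORDINARY_CARE:
        if physical_impact:
            return "Damage to physical assets"
        return "Execution, delivery and process management"
    if break_out is BreakOut.LEGAL_OR_CONTRACT_BREACH:
        return "Clients, products and business practices"
    if break_out is BreakOut.PERSONAL_GAIN:
        return "Internal fraud"
    if break_out is BreakOut.BREACH_OF_EMPLOYEE_DUTY:
        return "Employment practices and workplace safety"
    return "Damage to physical assets"  # BreakOut.MALICE
```

Note that the function never asks about intent: it takes only observable answers (what the employee was doing, which duty was breached, what kind of impact followed), which is the point of the ‘on company business’ test.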

In recent years, many financial services firms, as well as major industry consortia such as The Operational Riskdata eXchange Association, have made significant investments in developing refined event type hierarchies. The quest is not yet complete, however, especially as regards the need to develop more detailed event type categories to bridge the gap between Basel’s generic, two-level structure and the more specialised and granular needs of individual business managers. But deepening the hierarchy is unlikely to succeed if the upper-level categories have not been fully vetted and accepted by industry participants in advance. OR&C


Rick Cech, RiskBusiness International

Employee break-outs and associated high-level event types

| Player | On company business? | Rule or duty violated | Impact | Generic event type (see note 7) |
|---|---|---|---|---|
| Employee | Yes | Duty to use ordinary care in business activity | Financial | Execution; Systems; Customer account management |
| | | | Physical | Accident |
| | No | Commercial law/contract provision | Non-fiduciary | Business practices; Advisory activities; Product flaw |
| | | | Employment-related | Employee relations |
| | | | Fiduciary | Suitability/Fiduciary |
| | | Criminal statute or company policy | Financial | Internal fraud |
| | | | Trading losses | Unauthorised trading |
| | | | Physical | Malicious acts |

Notes

1 See Bank for International Settlements, “QIS 2 – Operational Loss Data – May 2001,” http://www.bis.org/bcbs/qisoprisknote.pdf. The Basel Committee modified some of the categories proposed by the WGOR task force: theft/fraud was divided into external and internal, and Systems was removed from Execution and elevated to a separate Level 1 category (“Business Disruptions and System Failures”); utility outages also were moved to this category, from “Physical Asset & Infrastructure Events,” which was renamed “Damage to Physical Assets.”

2 In the very early days, operational risk was called “other” risk, and included everything not subsumed under market or credit risk regimes. This was the purest form of “definition by default,” effectively devoid of positive content. Basel adopted a more substantive, and by now highly familiar, definition: “the risk of direct or indirect loss resulting from inadequate or failed internal processes, people and systems or from external events.”

3 Note that “expectation” is used in a normative, or prescriptive sense, as in, “You are expected to keep your room clean,” rather than in the informed sense of “I expect that you may not have cleaned up your room today, after all.”

4 However objective we may wish to be, some cases will require subjective interpretation – even after full investigation – if only because some of the underlying facts remain unknown. In such cases, the choice is left up to (i) the internal investigator of the event, for non-litigation items (subject, in some institutions, to review by senior ORM officials), (ii) the courts, in final litigation matters, and (iii) the plaintiff’s allegations, in cases settled before trial.

5 The Basel Committee removed Systems events from Execution and placed them in a separate Level 1 category; the current analysis suggests they might have been better left where they were.

6 Whether or not an act is “criminal” is not a successful criterion for classification, however, since both “ordinary” theft/fraud and commercial law violations can carry criminal penalties. The defining issue is whether the employee has allegedly violated (1) a law governing commercial behavior or (2) an “ordinary” criminal law, of the type that governs individual behavior generally, outside the commercial arena.

7 Not all categories are Basel-based; the column includes a mix of Level 1 and Level 2 Basel categories.


The ‘Bow Tie’ Model of Operational Risk

Operational risks resemble a bow tie. On the wide, left-hand side of the tie are contributing factors (or causes). On the wide, right side, we have impacts (or effects). The knot in the middle of the tie represents the central event. (There is also a fourth factor, controls, but to represent this item visually we would need to skew the diagram a bit, for example, by adding down-hanging tails, as on a string-tie.) The four attributes just mentioned (events, causes, controls and impacts) correspond to four simple but important questions concerning operational losses: what happened? Why did it happen? What controls failed to keep it from happening? And what effects occurred as a result of its happening? Of these approaches to analysing risks, which is the best for classification?

Contributing factors (or “causes”). Generally speaking, these are conditions that exist in or around operational processes, and which specifically increase the likelihood that an operational break-out will occur. They are like dry pine needles on the floor of a drought-stricken forest. The list of possibilities is large: inadequate training; lack of clearly assigned roles and responsibilities; complexity of the products; bad timing (for example, it is the Friday before a major holiday and personnel are absent or distracted), etc. Contributing causes may not end along with a particular incident. Rather, a causal factor, such as inadequate training, may survive any particular loss incident to strike again another day. Such longevity is generally not a feature of event triggers – such as specific data entry errors – which tend to be uniquely associated with particular events.

Contributing causes are meaty fodder for root cause analysis, but as a classification tool they have serious drawbacks. First, any operational loss worth recording is likely to have a broad range of contributing causes, often spanning all four of the Basel definitional pillars (people, process, systems and external). Second, observers are prone to have very different subjective views as to what in fact causes events: process-orientated observers tend to select process-orientated explanations, people-orientated observers tend to select people-orientated ones. Behavioural psychologists may declare that external cues are critical in shaping behaviour. These are turbid waters.

Impacts (or ‘effects’). If contributing factors are too vague and subjective, then perhaps impacts will be more effective. After all, impact is the hard, objective stuff from which general ledger entries are forged. But as a classification principle, impacts actually fare poorly. In the first place, it ends up being more difficult than one initially might assume to assign impacts to simple, unique categories. More importantly, operational risks often have multiple impacts, of varying types, and thus, like causes, impacts fail the uniqueness test as a basis for classification. Finally (and most significant from a substantive standpoint), impacts are far removed from root cause and provide little insight into what instigated a loss.

Controls. Risk managers spend much of their time managing controls. Indeed, many jobs in a company are centred on the operation of specific controls. Thus, controls might seem a comfortable basis on which to construct a classification system. But once again, we find that operational losses regularly involve the failure of entire suites of controls, including individual controls of various types. Moreover, for some losses there are no associated controls at all. So here again, this perspective does not provide an effective basis for classification – even though a firm’s success in operational risk management is intimately linked to the quality and effectiveness of its control design and implementation, and to its skill in estimating the optimal number of controls to implement.

Events. This leaves events, and fortunately enough this alternative does provide a workable platform for classification, situated at the crux of the occurrence. By looking at an event as the first thing that happened (or failed to happen) as an operational risk materialised, we can uniquely assign occurrences to a set of unique event type categories. Moreover, event records in a database can include references to all associated causes, control failures and impacts (in one-to-many sub-tables), making it a neat platform for centralising these other, multiple-factored attributes. It is for these reasons that the event analysis is regarded as central to operational risk data modelling. Event classification is not problem-free, but at present it appears to be the best alternative we have.
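The data model described here, one event type per record with causes, control failures and impacts attached as one-to-many lists, can be sketched with a few Python dataclasses. The class names, field names and sample values below are hypothetical and chosen only for illustration; they are not drawn from any actual loss database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Impact:
    """One financial effect of an event (an event may have several)."""
    description: str
    amount: float


@dataclass
class LossEvent:
    """An event record: exactly one event type, many attached attributes."""
    event_id: str
    event_type: str  # the single 'home' category for the event
    causes: List[str] = field(default_factory=list)           # one-to-many
    failed_controls: List[str] = field(default_factory=list)  # one-to-many
    impacts: List[Impact] = field(default_factory=list)       # one-to-many

    def total_loss(self) -> float:
        """Sum all impacts so the whole incident can be analysed as one loss."""
        return sum(impact.amount for impact in self.impacts)


# Illustrative record with made-up values.
event = LossEvent(
    event_id="EV-001",
    event_type="Execution, delivery and process management",
    causes=["Inadequate training", "Pre-holiday staffing shortage"],
    failed_controls=["Second-person review of data entry"],
    impacts=[
        Impact("Compensation paid to client", 120000.0),
        Impact("Cost of correcting the transaction", 30000.0),
    ],
)
```

Because `event_type` is a single value while the other attributes are lists, the record reflects the argument of this box: events can be classified uniquely, whereas causes, controls and impacts cannot.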

[Figure: the ‘bow tie’ model. Contributing factors (= “causes”) on the left, the EVENT at the knot, impacts on the right]

| Classification base | Underlying question | Brief definition | How to identify | Advantages | Drawbacks |
|---|---|---|---|---|---|
| Contributing factors (or causes) | Why did it happen? | A pre-existing condition that makes an operational risk event more likely to occur (maybe inevitable) | Often persists after the event is over. For instance, lack of proper training can contribute to a specific loss, yet survive the immediate incident and go on to produce many more! | Intuitive; “gets to the point”; reflects root causes | Multiple causes possible per event/not unique; ratings subjective/inconsistent |
| Events | What happened? “What broke?” | The occurrence or omission “but for” which we would have nothing to record | A river may have many turns and tributaries, but only one source; the event begins with the first happening that initiated a particular incident. | Allows unique classification; allows the whole incident to be modelled statistically (incl. all impacts) | Somewhat less direct in root cause analysis |
| Controls | Why was it not prevented? | If participants in a business transaction perform perfectly, then no controls would be necessary, technically. Controls are ‘extra steps’ in a business process designed to detect or prevent “break-outs” | If a business transaction can be completed without completing a particular step, then that step is most likely a control. | Corresponds to the job responsibilities of many risk managers | Multiple controls failures possible per event/not unique; many events lack controls |
| Impacts (or effects) | What was the result? | ‘Impact’ is the difference between the expected outcome of an operational process and its actual outcome, given the occurrence of an operational incident. Incidents may have multiple impacts. | An impact is reflected in a dollar amount entered on a general ledger. Some impacts are non-economic (for example, reputation risk), and thus may be non-recordable | Can be retrieved from financial records (but may not be readily identifiable unless classified as “operational” when booked) | Remote from root cause; classification tricky; multiple impacts possible per event/not unique |
