
THE ARCHITECTURE FOR RAPID DECISIONS

The management of the architecture for business intelligence, and related information management solutions, has undergone a paradigm shift as business drivers have come to have an overriding influence on its design. In the past, much of architectural design was an arcane field focused on the logical and physical layers and was the preserve of CIOs. Increasingly, executives are looking for service-oriented architectures that are shorn of the complexity of IT systems and allow company strategists to respond flexibly to market opportunities. They want to use SOA, XML and web services to reduce the latencies in information processing and decision-making so that they can respond in real time.

Service-oriented architecture is the culmination of the progress towards reusable programming, which started with object-oriented programming and has been extended to building applications and processes on the fly. The progress continued with components and concluded with the development of services. Objects enable the reuse of programming code as long as the operating environment, such as Linux or Windows, does not change. Components extend this flexibility: software can be reused even when the process changes. Services are the most flexible, as they enable developers to cross both process and operating system boundaries.

Real time decision making is widely recognized as a potent source of competitive advantage. According to a survey conducted by Optimize magazine, 88% of senior executives want to see lower latencies in the availability of decision-relevant information. Currently, as much as 16% of executives' time, according to research conducted by Outsell, an analyst firm in the information processing industry, is lost on accessing, reviewing, and evaluating information, at a total cost of $107 billion, besides causing delays in decision-making.

The challenge before companies is to choose from the assortment of architectural options that vendors offer. At this point, most companies are wary of trying new technologies and would prefer to extract increasing value from the enormous investments in IT sunk in the late 1990s. Emerging technologies, such as web services and SOA, enable them to extract greater value from those existing investments.

Undoubtedly, incremental investments in architectural changes bring about disproportionate benefits for companies, and data integration and business intelligence tools play a critical role in reaping greater returns from existing investments. An example of how smart improvements in architecture raise the profitability of existing investments is Telus Corp., a Canadian telecommunications company based in Vancouver, British Columbia, which deployed an enterprise business-intelligence suite and integration software to rapidly route relevant information from its transactional databases to its field-service staff all over Canada and lowered its operational costs by $1 million per month.


Business intelligence software can help to lower latencies even in the infrastructure and extract higher value from existing investments. All too often, companies invest in an array of servers and storage devices to lower downtime and increase the speed of response; the excess capacity spreads the load so that there is no interruption. People's Bank, instead, decided to lower the downtime of its applications rather than incur extra costs on spare capacity. It installed a monitoring tool that allows it to compare the utilization rates of its infrastructure with response times. The software allows it to anticipate when demand exceeds installed capacity, as indicated by signals on dashboards, or to look into the root causes of underutilization and find the means to add capacity.

IT managers almost unanimously agree that business intelligence and integration technologies account for much of the gains in speed of response, while new investments in IT infrastructure are much less effective. In a survey of IT managers, only 16% of executives reported that they had achieved a faster response. On the other hand, 54% of respondents agreed that web services and data warehouses/data marts have accelerated responses much more effectively. By contrast, very few executives believe that the more mature products in the enterprise software industry are as effective; 22% credited CRM software, the corresponding figure for supply-chain management tools was 20%, and the figure for call-center software was similarly low.

The reality on the ground is that a relatively small proportion of companies have taken advantage of the opportunities available to lower the latencies in decision-making. Most companies (75%) are only moderately effective in achieving real-time decision-making, while 19% are very effective. The major reported barrier to achieving a high level of effectiveness is the absence of a business process architecture that can accommodate the software that lowers latencies.

The impact of business intelligence on real time response to events is illustrated by Chase-Pitkin's success in lowering the incidence of theft in its stores by using predictive analytics. It decided to conduct inventory checks on items that were more susceptible to theft at weekly intervals instead of waiting for the lean season to arrive. The data was fed into SPSS Predictive Analytics tools to anticipate the next case of shoplifting and take preemptive action to lower the losses from theft. This would not have been possible without the ability to rapidly aggregate data and feed it into its business intelligence software.

A framework for governance of information

Data mining for reporting purposes and strategic management is a generation behind the needs of a real time enterprise. The purpose of data mining was not operational management, and even less to steer business processes to respond to events in real time. Enterprises cannot react quickly unless information flows to a single point from all their far corners, is digested rapidly, and is communicated to operational staff just in time to deploy resources and master a situation. In the past, this was not possible because individual departments frequently had distinct data sources and applications. When consolidated information was available in a data warehouse, the data was not refreshed fast enough to respond to events, and business processes could not adjust in time to cope with contingencies.

An enterprise governance model is an aid to managing an enterprise cohesively. Any one of the information assets of a company has a functional purpose and an administrative component to manage resources. A database, for example, is a means to store data, and its properties govern the format, the allocation of addresses, and the extraction and presentation of the data. Similarly, an application such as a spreadsheet is a means to store data which can be manipulated with menu functions. The relational database and the spreadsheet can work together using middleware such as ODBC (Open Database Connectivity), which can mediate between the distinct administrative tools of the two information assets. A real time enterprise needs more than a point-to-point interconnection of information assets. It has to be able to link all its assets so that sequential activities can flow from one end to another without a break.

An enterprise governance model is the means to centralize the administrative tools of all information assets and business processes. It separates the management of information assets and processes from the physical activity of completing tasks; the individual functions are invoked, as a task is completed, from a remote center that has an overall picture of the objectives to be achieved. In technical terms, metadata is the information that governs individual databases, applications or business processes. While much of the metadata of the past was hard coded to use resources within an application or database, or monitored business processes selectively, a real time enterprise needs more liquid resources.

The enterprise governance model allows the metadata of distinct applications or databases to talk to each other. It is also a means to administer all information resources from a single point of control. An enterprise governance model maps the functional, analytical and resource flows of a company and interlinks the metadata of individual assets so that they work as a single entity.

Several benefits follow when the information assets operate as a continuous process instead of discrete units. For one, the assets do not have to be present in the single physical location where a task is being done. Also, the data required to conduct a task is not reproduced in several different applications, which eliminates the risk of inconsistent definitions. Finally, business processes operate independently of applications and can be invoked by several different applications.

Industries like health care and financial services have a plethora of systems to manage their varied information stores. The health sector, for example, has information on clinical trials, FDA regulations, academic literature and product information scattered across several repositories. When doctors use information, they need to be able to find all of it before they can begin to prescribe drugs. Companies supplying drugs have to package information in response to specific queries, which can often be extremely time consuming.


Aventis Pharmaceuticals deployed a medical information system that its call center staff could use to retrieve information and respond to queries from doctors. It needed to link its Siebel CRM system to the document databases holding its medical data. Aventis installed integration middleware from RWD Technologies, deployed on a server, with additional capabilities for writing rules to package information, so that inter-related data can be presented in a cohesive manner.

Fragmentation of data sources and applications also takes place when companies create their business intelligence infrastructure incrementally. A big bang approach, i.e., a centralized data warehouse, is risky and requires a large investment at the outset, so companies prefer data marts for some departments. When their data marts prove to be profitable, companies want to migrate to a data warehouse or to create a federation of data marts. When they decide to create a federation of data marts, companies need to find a way to integrate their data sources and applications. Once investments have been sunk in one infrastructure, companies prefer to continue using it rather than start all over again with a data warehouse.

An example of an integration product is IBM's WebSphere, which has business process management capabilities besides coordinating application-integration and workflow technologies. The software IBM acquired from Holosofx affords modeling of business processes, simulation of the outcomes of the chosen business processes and comparison of actual results with expected outcomes.

Pointers to information stores

In the past, metadata, the descriptions of data structures, was inconspicuous because it was incorporated into the applications that drove business intelligence functions. When the metadata was associated with a particular application, its definitions were also hard to generalize. For example, one database would hold information on a company's products, but those products would carry no information on the related applications, technologies or solutions. When information is used in a business context, information about products is rarely useful if it is not seen together with information on its uses. For example, consumers choosing between several cell phones would need to know which ones can access the internet, support pre-paid cards, and so on.

The available metadata, as long as it is embedded in particular applications, cannot help in the centralized management of an enterprise. In the new world of business intelligence, companies have to be able to associate categories such as products with customers in order to identify relationships. Consumers are interested in the attributes of products; they look for luxury cars or fuel efficient cars rather than any specific product. Companies have to move beyond metadata to taxonomies that provide information in the language people use in everyday life. The rub is that perspectives differ and definitions vary with them. As a result, the corporate world has been struggling to define categories in a manner that is acceptable to all.


A real time enterprise business intelligence infrastructure is centrally managed and needs cohesive administration to be able to manage the heterogeneous applications common in enterprises today. In order to work with existing applications and databases, companies have to migrate to open standards for metadata. The absence of such metadata means that companies spend about 35-40% of their programming budget on transferring data from legacy systems or from one type of information asset to another.

A data model underpins the metadata that governs a network of applications. The data model helps to remove any superfluous metadata that exists in the enterprise and adds definitions that help to govern information assets more effectively. The data model prepares the ground for sharing information across applications and databases and for collaboration based on agreed definitions.

Data warehouses are so yesterday

Data warehouses have been the widely accepted means to consolidate data for analytical purposes. While the data stored in a data warehouse is being analyzed, however, it is not possible to refresh it with new data. Updates to the data warehouse are done at intervals, outside working hours, which contributes to data latency. Transactional data stores, on the other hand, have been reserved for mission critical operational purposes, while analysis is accorded a lower priority, if any.

The conflicting goals of gaining access to current data and of preserving operational efficiency are reconciled by intermittently sending trickles of data from the source operational data store to a data warehouse. Not all data is required for analytical purposes; it has to be selected depending on the metrics chosen for performance management or another specific purpose. Any change to the data is transferred to a separate partition in the source database and enters a message queue. The data is then streamed into a separate "active" partition in the receiving database and transferred periodically. The steady outflow and inflow of data ensure that the performance degradation, if any, is not consequential.
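
A minimal sketch of this trickle-feed pattern, using only Python's standard library; the table names, the in-process queue and the selection rule are illustrative assumptions, not a description of any particular vendor's implementation.

import sqlite3
import queue

# Source operational store and receiving analytical store (illustrative, in-memory).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL, changed INTEGER)")
target.execute("CREATE TABLE orders_active (id INTEGER, amount REAL)")   # "active" partition
target.execute("CREATE TABLE orders_history (id INTEGER, amount REAL)")  # warehouse proper

change_queue = queue.Queue()  # stands in for the message queue between the stores

def capture_changes():
    """Move changed rows relevant to the chosen metric into the message queue."""
    for row in source.execute("SELECT id, amount FROM orders WHERE changed = 1"):
        change_queue.put(row)
    source.execute("UPDATE orders SET changed = 0")

def stream_into_active():
    """Drain the queue into the active partition without touching the history table."""
    while not change_queue.empty():
        target.execute("INSERT INTO orders_active VALUES (?, ?)", change_queue.get())

def merge_periodically():
    """Fold the active partition into the warehouse at a convenient interval."""
    target.execute("INSERT INTO orders_history SELECT * FROM orders_active")
    target.execute("DELETE FROM orders_active")

# Simulate one transaction followed by one trickle cycle.
source.execute("INSERT INTO orders VALUES (1, 99.5, 1)")
capture_changes(); stream_into_active(); merge_periodically()
print(target.execute("SELECT * FROM orders_history").fetchall())  # [(1, 99.5)]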

Data warehouses will continue to be useful for storing historical data that has been scrubbed of quality defects, inconsistencies in definitions and other errors. The emerging scenario is one of a federated information infrastructure which includes access to real time data. A typical case is that of the 17th Judicial Circuit Court of Florida, which acquired an events management system from Neon. Any updates to the data are stored in a real time warehouse which case managers can access for analysis. Eventually, the same data can be transferred to a data warehouse.

A heterogeneous and integrated infrastructure needs additional information assets to manage data access, queries and data storage. It needs an integration server which stores the master data to manage the conversion of information from one source to another. A heterogeneous architecture has to be able to manage queries both in the familiar SQL format and in the emerging XML standards. At a more advanced level, the integration can include the management of workflows across a variety of information systems.

A typical application of heterogeneous systems co-existing in a federated infrastructure is the patient information system that hospitals in Massachusetts are creating to give emergency rooms access to patient prescription information. The data is accessed from the databases of a variety of insurance companies and presented on a dashboard that doctors can use when patients are treated in emergency rooms. None of the latencies common in data warehouses inhibit the use of the data.

Extracting information from a variety of sources, including relational databases and documents such as text files, e-mails, instant messages and spreadsheets, requires a universal system of queries that is not available with the familiar SQL. The alternative XML-based system of queries, executed by XQuery, is able to access information from a wider range of sources.
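
A minimal sketch of the dual access a federated architecture has to support, using Python's standard library: the same customer figure is pulled from a relational store with SQL and from an XML document with a path expression, which stands in here for a full XQuery engine (Python does not ship one). The table, element names and values are illustrative assumptions.

import sqlite3
import xml.etree.ElementTree as ET

# Relational side: the familiar SQL query.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, revenue REAL)")
db.execute("INSERT INTO customers VALUES ('Acme Corp', 125000.0)")
sql_result = db.execute(
    "SELECT revenue FROM customers WHERE name = ?", ("Acme Corp",)
).fetchone()[0]

# XML side: the same fact held in a document, reached with a path expression;
# a real deployment would use an XQuery engine for richer queries.
doc = ET.fromstring("""
<customers>
  <customer><name>Acme Corp</name><revenue>125000.0</revenue></customer>
</customers>
""")
xml_result = float(doc.find("./customer[name='Acme Corp']/revenue").text)

# A federated layer reconciles the two answers before analysis.
assert sql_result == xml_result
print("Acme Corp revenue:", sql_result)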

A web of services

While a great deal of skepticism is routinely expressed about the maturity of service-oriented architecture and web services, early experience suggests a high level of satisfaction with these technologies, especially when it comes to integrating legacy infrastructure. According to a recent survey, about 18% of a thousand companies are "SOA-ready," i.e., their infrastructure uses standardized, loosely coupled services, while 35% expect to have a fully functional web services infrastructure in place soon. A majority of 55% of the respondents recognize that web services yield substantial gains in the integration of their IT infrastructure.

Minimalist integration, i.e., the inter-connection of a few applications, is not uncommon, but the real benefits of SOA are realized when a network is created. Individual applications or data repositories are reduced to services that can be accessed on the network, much like the way telecom devices, such as telephone instruments, DSL or hosted services, are plugged into a telecom network. The network then becomes instrumental in achieving business objectives; a group of companies might want to collaborate and use devices such as conferencing tools to talk to each other. Similarly, SOA creates a network in which individual applications provide services while an overarching network helps to achieve objectives such as providing real time analytics for coping with an emergency.

The lightweight integration made possible by SOA enables companies to evolve as their business reality changes without being hamstrung by their infrastructure. They can add new applications that are complementary to their existing applications and synchronize them with business processes which can be adapted for new needs. A typical example is that of Owens & Minor, a Glen Allen, Va., distributor of medical and surgical supplies, which needed to build a supply chain network and retain the functionality of its legacy infrastructure installed in the 1970s and 1980s. The core functionality of these systems was exposed and linked to an SOA network. In addition, the company added business process management to the network.
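
A minimal sketch of what "exposing core functionality as a service" can look like, using only Python's standard library: a hypothetical legacy inventory lookup is wrapped in a small HTTP endpoint that returns XML, so other applications on the network can call it without knowing anything about the system behind it. The function, port and response format are illustrative assumptions, not Owens & Minor's actual design.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def legacy_stock_lookup(item_id: str) -> int:
    """Stand-in for a routine buried in a legacy system."""
    return {"A100": 42, "B200": 7}.get(item_id, 0)

class StockService(BaseHTTPRequestHandler):
    def do_GET(self):
        # Service consumers see only a URL and an XML payload, not the legacy code.
        params = parse_qs(urlparse(self.path).query)
        item = params.get("item", [""])[0]
        body = f"<stock item='{item}'>{legacy_stock_lookup(item)}</stock>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # e.g. GET http://localhost:8080/?item=A100  ->  <stock item='A100'>42</stock>
    HTTPServer(("localhost", 8080), StockService).serve_forever()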

An audacious attempt at introducing SOA is Siebel's $100 million investment in the Siebel Universal Application Network (UAN). Siebel UAN departs from an application-oriented architecture to one that makes business processes the bedrock of an IT system. Individual units of business processes, such as quote-to-cash or campaign-to-lead, are defined using the BPEL4WS (Business Process Execution Language for Web Services) standard.

The levers of an agile enterprise

Real time responses to events will be feasible when enterprises are designed to be maneuverable and their flow of activity is not disrupted by a breakdown in any one component in the chain of business processes that enable the completion of an activity. An analogous situation is that of airports, which represent a network of processes that have to be completed before the activity of flying passengers from one end to another is complete. The completion of the activity involves applications such as the technology for flying airplanes, the communication technology that lets airplanes receive information about the safety of their flight path, and the technology for managing the landing of aircraft. All these processes are seamlessly interconnected, and the breakdown of a radar, defects in the control systems of an aircraft or the closure of an airport does not necessarily disrupt air transportation. Airlines have maneuverable systems in that they routinely cope with fluctuations in traffic and are able to route aircraft along alternative paths.

By contrast, software applications embed business processes that are rarely interconnected to complete the flow of an activity. These applications can complete some components of tasks in a value chain, analogous to flying an aircraft or managing communications. The real time enterprise, on the other hand, seeks to mimic a network such as the management of air traffic.

A business process oriented system helps to automate the management of business processes, integrates them with functionally relevant applications and creates a framework for collaboration within a team. In addition, it helps to organize workflows and leverages the existing IT infrastructure for the completion of tasks.

One example of business process oriented architecture is Lloyd's of London, which needs an integrated business process to inter-connect its many branches and activities spread over several countries. It installed a BI reporting tool to keep track of the money flowing in from premiums paid and out from the payment of claims, which can happen in any country or office. Typically, employees work with a network of banks, underwriters and other professional services companies or claims processors. Consequently, they need to report on data as well as collaborate and communicate with their partners in real time before they can come to decisions about risk management. A typical task is to reconcile data on transactions conducted in eight different currencies. Lloyd's needs data on all these dispersed transactions at a portal interface to be able to estimate its net position. This is only possible when its entire IT infrastructure, transactional databases, Windows XP operating systems and file servers, is inter-connected.

In the past, business processes or workflows were wrapped up with the monolithic applications that governed their operations. In an environment of heterogeneous applications, similar and often potentially inter-related business processes lie fragmented, divided by their disparate environments. The emerging design for business processes seeks to mimic the value chain of a business and constructs seamlessly integrated business processes that complete the flow of activity needed to achieve a task. A prerequisite for this goal is to separate the management of business processes from their specific functional applications and to manage the individual units cohesively in accordance with the desired workflow of the enterprise. The tool for centrally managing individual units of business processes is a platform, assisted by a graphical user interface, that spells out their semantics and executes their logic consistent with the desired workflow.

An example of the benefits of reducing business processes to their components and then interlinking and managing them as a single workflow is inventory management at Owens & Minor. In the past, the company had to manually complete a series of tasks before it could return inventory that was close to its expiry date to the manufacturers. The staff had to trace every item in the warehouses, check the return policy of the manufacturer, contact the manufacturer to obtain a return authorization, create return orders and then inform accounts receivable to expect a refund. This series of repetitive functions is represented by a well defined process, which can be completed with the help of business process management software. These functions are inter-linked with related applications, such as financial software, to ensure that all the related information is also available.
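
A minimal sketch of representing such a return process as explicit, interlinked steps rather than manual hand-offs; the step names mirror the description above, but the functions, identifiers and logging are illustrative assumptions, not Owens & Minor's actual system.

from dataclasses import dataclass, field

@dataclass
class ReturnCase:
    item_id: str
    log: list = field(default_factory=list)

# Each step is an ordinary function; the BPM layer's job is to chain them
# in the prescribed order and record what happened.
def trace_item(case):           case.log.append(f"located {case.item_id} in warehouse")
def check_return_policy(case):  case.log.append("manufacturer accepts near-expiry returns")
def obtain_authorization(case): case.log.append("return authorization obtained")
def create_return_order(case):  case.log.append("return order created")
def notify_receivables(case):   case.log.append("accounts receivable told to expect refund")

RETURN_PROCESS = [trace_item, check_return_policy, obtain_authorization,
                  create_return_order, notify_receivables]

def run_process(steps, case):
    for step in steps:
        step(case)   # a real engine would also handle failures and escalation
    return case

case = run_process(RETURN_PROCESS, ReturnCase("A100"))
print("\n".join(case.log))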

In addition, enterprises should be able to loosely couple their business processes with applications whenever they need to respond to unexpected events, which is possible in an SOA architecture. When business processes mirror the value chain of an enterprise, it is possible for management to take impromptu actions in response to contingencies or unexpected events that might roil their tactical and strategic plans.

The emergence of a variety of middleware enables companies to manage inter-related business processes and to couple them with applications. Message brokers, transactional queue managers, and publish/subscribe mechanisms are the means to automate processes; each of the inter-connected component applications has the ability to send alerts about events and to receive information about events that need its response. The platforms managing business processes invoke a message broker, transactional queue manager, or publish/subscribe middleware layer to tie applications together, detect business process related events, and ensure the routing of events and messages to applications.
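
A minimal sketch of the publish/subscribe pattern described here, in plain Python; real deployments would use a message broker, but the routing idea is the same. The topic names and handlers are illustrative assumptions.

from collections import defaultdict

class EventBus:
    """A tiny in-process stand-in for publish/subscribe middleware."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Route the event to every application that registered an interest.
        for handler in self.subscribers[topic]:
            handler(payload)

bus = EventBus()

# Two "applications" register for the events they need to respond to.
bus.subscribe("order.delayed", lambda e: print("BPM engine escalates order", e["order_id"]))
bus.subscribe("order.delayed", lambda e: print("Dashboard flags delay of", e["hours"], "hours"))

# A component application detects a business event and sends an alert.
bus.publish("order.delayed", {"order_id": "PO-7713", "hours": 6})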

One classic case of events triggering alerts and driving business processes is the application installed at Jet Travel Intelligence. It uses information on natural disasters, political disturbances, and a variety of other metrics to assess the risk of travel by executives in multinational companies and other organizations. The business process engine it acquired from Fujitsu embeds units of intelligence, or units of validated information, with metadata that associates them with the traveler's profile and itinerary data besides information sources and content. Customers receive alerts when their travel plans are affected. The task of verifying the information is divided into units of work; the data flows in to individuals with expertise in particular regions, subjects and legal matters, and each time one step of the work is done, it passes on to another stage.

One example of a product that integrates business processes with services is SAP's NetWeaver platform, which composes applications, called xApps, from several services. SAP's Enterprise Services Architecture creates enterprise services, synchronizes them with processes using the SAP Composite Application Framework, and provides a platform on the NetWeaver application server to execute them.

Measuring progress

The optimization of business processes would be impossible without the ability to collect intelligence about activities. Companies need to be able to estimate metrics of performance at each level of their business processes. For example, they need to be able to estimate the time taken to complete a task, or its cost, and compare it to the desired level of performance. The task of measurement is undertaken by Business Activity Monitoring (BAM) software, which measures, for example, the flow of traffic to a call center. The data feeds from BAM are fed into a real time data warehouse.
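
A minimal sketch of the measurement step, assuming the simplest possible BAM instrumentation: business events carry timestamps, and the monitor turns them into a cycle-time metric that could then be pushed to a real time store. The event fields and values are illustrative assumptions.

from datetime import datetime

# Events as a BAM layer might receive them from instrumented process steps.
events = [
    {"order": "PO-1", "step": "received",  "at": datetime(2024, 1, 5, 9, 0)},
    {"order": "PO-1", "step": "fulfilled", "at": datetime(2024, 1, 5, 13, 30)},
    {"order": "PO-2", "step": "received",  "at": datetime(2024, 1, 5, 9, 15)},
    {"order": "PO-2", "step": "fulfilled", "at": datetime(2024, 1, 6, 10, 0)},
]

def cycle_times(events):
    """Elapsed time per order between the 'received' and 'fulfilled' steps."""
    starts, metrics = {}, {}
    for e in sorted(events, key=lambda e: e["at"]):
        if e["step"] == "received":
            starts[e["order"]] = e["at"]
        elif e["step"] == "fulfilled":
            metrics[e["order"]] = e["at"] - starts[e["order"]]
    return metrics

# These durations are what a BAM tool would stream into a real time warehouse.
for order, elapsed in cycle_times(events).items():
    print(order, elapsed)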

The availability of data flowing from BAM enables corporate performance management, aligning a company's strategy with its resources. Enterprises have found in balanced scorecards a precise way to translate their strategies into measurable performance parameters. For example, the strategic goal of a company could be to increase its market share. After studying the market, the company could conclude that it can increase its market share if the quality of its products is improved and prices are lowered. In terms of operations, this would imply that the company has to reduce defects, change its technology, use less material, raise labor productivity, improve functionality and usability, improve training, and so on.

The log data of business process management software throws up a wealth of information about labor use, time spent on each process, materials consumed, and so on. The corporate performance management capabilities in business intelligence tools can pick up this information and analyze it to improve the metrics.

The data received from business processes would not be actionable unless it can be compared with the desired performance. Furthermore, any anomaly in performance has to be communicated to participants in the workforce so that decisions can be taken. This is the task of a rule engine, which compares the actual performance with the required standards and sends out an alert when there is a gap between the two.
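
A minimal sketch of such a rule engine, assuming the simplest form of rule: a metric, a target, and a tolerance; anything outside tolerance produces an alert for the people who own the process. The metric names and thresholds are illustrative assumptions.

# Each rule compares an observed metric against the required standard.
RULES = [
    {"metric": "call_wait_seconds", "target": 30, "tolerance": 10},
    {"metric": "order_cycle_hours", "target": 24, "tolerance": 4},
]

def evaluate(rules, observations):
    """Return an alert for every metric that strays outside its tolerance."""
    alerts = []
    for rule in rules:
        actual = observations.get(rule["metric"])
        if actual is None:
            continue
        gap = actual - rule["target"]
        if abs(gap) > rule["tolerance"]:
            alerts.append(f"{rule['metric']}: actual {actual}, "
                          f"target {rule['target']}, gap {gap:+}")
    return alerts

observed = {"call_wait_seconds": 55, "order_cycle_hours": 26}
for alert in evaluate(RULES, observed):
    print("ALERT:", alert)   # routed to the relevant staff in a real deployment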


One application of Business Activity Monitoring is the case of Motorola, which used the data from its business processes to reduce the time involved in order processing. It needed to reduce the time lapses between the occurrence of a problem and its detection, and other inefficiencies caused by human processing of data as well as by the use of call centers instead of on-line channels. The implementation of BAM helped Motorola to reduce the inefficiencies in problem identification and resolution by 85% and hold-ups of orders by 75%.

The more advanced versions of business process management offer modeling and simulation features that help to optimize processes and resource allocation. One instance of the use of business process data for optimization is Cannondale Bicycle Corp. of Bethel, CT. The company launches new products frequently, which affects expected revenue, the demand for raw materials, the scheduling of production processes and the costs of production. Simulating alternative scenarios helps to determine the optimal way to manage the uncertainties created by new products. The availability of real time data helps the company to respond to events faster and to explore alternative decisions as the data is received.

Warm ups for an agile enterprise

Service-oriented architecture disentangles the components that constitute enterprise software, including the separation of workflow management software from application software. The autonomous workflow management software is then linked to the network of resources available on a service-oriented architecture.

Performance improvement and optimization are best achieved when the resources invested in workflows can be modeled, measured and streamlined. Workflow management software begins where business process management software leaves off; it manages resources, applications and data after the process design has been spelt out. What is more, workflow management software allows companies to reallocate resources and change the associated applications and data sources as needs change. Without an audit of the resources expended as work is completed, it is hard to explore avenues for efficiency gains. Workflow management systems also record events such as the initiation and completion of the numerous activities in business processes; they keep a record of the outcome of an activity, the resources invested in it, and so on.

Workflow management systems begin with the modeling of workflows in a way that completes an entire process. This is followed by an effort to optimize the series of tasks that complete a process, much as project management techniques such as PERT/CPM do. Finally, workflow management software manages the resources required to complete tasks and monitors their use.

An example of the use of workflow software and process monitoring is the way Citigroup uses them to keep track of the value of a variety of assets after price information has been received from all its myriad sources. If marked changes in the values of assets are observed, the matter is escalated to a manager. The more advanced modules in the software have risk management options which prompt contingency plans to guard against grievous loss.

Cleaning the dirty numbers

The transformation of data from transactional databases involves several steps before it can be used for analysis; the tasks can include consolidating data from several sources, reconciling semantics and reconciling data structures. The volume of work can be enormous; the British Army, for example, had to extract data from 850 different information systems, and integrate three inventory management systems and 15 remote systems, in order to move supplies to the scene of war in Iraq.

One of the first tasks is to consolidate data that is scattered across several different sources. For example, a financial institution such as Citibank does business with a large corporation like IBM, which has subsidiaries and strategic business units in several geographical locations. As autonomous units, they make purchase decisions of their own accord, often without the knowledge of the parent company. For its business intelligence purposes, Citibank will need to aggregate the customer data from all the independent units so that it can determine its sales to all units of IBM.

When data is extracted from several different databases, it is very likely that duplicate records will be obtained. For example, individual units of a company record the names of customers in a variety of ways. Typically, some units record both the first and the last name of a person in a single field, while other units record the first name in one field followed by the last name in another, or vice versa. When the data is consolidated, the huge number of duplicate records cannot be corrected manually, and companies have to find a way to automate the task. Duplication also happens because a name has been misspelled, or because some records carry initials for the middle name while others carry the full name. Companies have to write rules for correcting the errors; for example, they could try to match addresses and e-mails to remove the superfluous records.
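
A minimal sketch of this kind of rule-based de-duplication; the matching rule follows the suggestion above (normalize the names, then use e-mail as the tie-breaker), and the records and field names are illustrative assumptions.

def normalize_name(record):
    """Collapse 'first last', 'last, first' and middle initials into one form."""
    name = record["name"].lower().replace(",", " ")
    parts = [p for p in name.split() if len(p.rstrip(".")) > 1]  # drop initials
    return " ".join(sorted(parts))

def deduplicate(records):
    """Keep one record per (normalized name, e-mail) pair."""
    seen, unique = set(), []
    for rec in records:
        key = (normalize_name(rec), rec["email"].lower())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"name": "John Smith",    "email": "jsmith@example.com"},
    {"name": "Smith, John",   "email": "JSMITH@example.com"},
    {"name": "John Q. Smith", "email": "jsmith@example.com"},
    {"name": "Jane Doe",      "email": "jdoe@example.com"},
]
print(len(deduplicate(records)), "unique customers")   # 2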

Similarly, supply databases have to be able to reconcile the identification numbers of a large variety of parts, products and materials, which are rarely standardized across the sources of supply. In the case of the British Army, for example, the identification numbers for a cold-climate ration pack and an electronic radio valve were identical, which would have upset many people if the data had not been reconciled.

Once the data has been collated, it has to be categorized in a consistent manner. A financial institution, for example, would like to distinguish its customers; a typical classification would be retail customers versus commercial or corporate customers. For most data, it is not hard to tell whether a customer belongs to the retail sector or the commercial sector. The rub is that some persons belong to both categories; a managing partner of a consulting company, for example, is both an individual and a commercial entity when the company is not a limited liability company. Often, individual stakeholders in the company come to their own conclusions about the definitions, or the metadata, when they consolidate the data. When the analysis is done, however, the company could find itself coming to invalid conclusions.

Finally, companies need to reconcile differences in data structures across several different applications. Data extracted from a CRM database is not likely to be consistent with similar information from a supply chain database. In the case of the British Army, for example, the supply database defined a unit of supply as a can of 250 liters, while on the demand side it was not uncommon to request one liter cans. If a unit decided to order 250 cans, the supply side would end up with a logistical spike it could not handle.
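
A minimal sketch of the kind of unit reconciliation this calls for: every requisition is converted to a canonical unit before it reaches the supply side, so that 250 one-liter cans are not read as 250 supply-side units of 250 liters each. The unit table and function are illustrative assumptions.

# Liters per pack size as defined on the supply side (illustrative values).
SUPPLY_UNIT_LITERS = {"can_250l": 250, "can_1l": 1}

def to_supply_units(quantity, unit, supply_unit="can_250l"):
    """Convert a demand-side requisition into the supply side's own unit."""
    liters = quantity * SUPPLY_UNIT_LITERS[unit]
    return liters / SUPPLY_UNIT_LITERS[supply_unit]

# A request for 250 one-liter cans is one supply-side can, not 250 of them.
print(to_supply_units(250, "can_1l"))   # 1.0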

Data, as it is stored in transactional databases, does not carry the related contextual information that helps in its analysis. For example, customer orders are gathered by CRM databases, but they are not adequate if related information on geographical region, demographic characteristics and the time of the order cannot be correlated with them.

Data profiling technology employs a range of analytical techniques to cross-check information across columns and rows and ferret out flawed data. For example, it can compare the name in one column with the gender in another to check the accuracy of the names. Similarly, it can scan the range of data values to check their accuracy: information on the employment status of people can be compared with their age, since people older than sixty-five have a low probability of being employed. The reported benefits from data profiling are high; British Telecommunications realized savings of £600 million over the last eight years.

The quality of data profiling depends on the number and range of relationships examined in the process of cleaning the data. For example, a vendor studying the procurement behavior of its clients could do a simple check of the size of an order against the dimensions of the package. The checks can be more advanced and correlate information from invoices, purchase frequency and the revenue base of the company to validate the data on the basis of its consistency.
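
A minimal sketch of cross-column profiling checks in the spirit of the examples above (a value-range check and a consistency check between age and employment status); the field names, thresholds and records are illustrative assumptions.

def profile(records):
    """Yield a description of every row that fails a profiling rule."""
    for i, rec in enumerate(records):
        # Range check on a single column.
        if not 0 <= rec["age"] <= 120:
            yield f"row {i}: age {rec['age']} outside the plausible range"
        # Cross-column check: employment status should be consistent with age.
        if rec["age"] > 65 and rec["employed"]:
            yield f"row {i}: employed at age {rec['age']} is unlikely, flag for review"

records = [
    {"age": 34,  "employed": True},
    {"age": 72,  "employed": True},    # flagged for review
    {"age": 430, "employed": False},   # out of range
]
for finding in profile(records):
    print(finding)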

Some of the tools that do data profiling are Trillium Discovery, SAS DataFlux and Oracle's Enterprise Data Hubs. One of the functions of Oracle's Data Hub is to help data managers reconcile differences among source systems with the help of an overarching master data management solution, which converts the data definitions from a variety of sources into a single repository of universally applicable descriptions of data.

The increasing adoption of EII and EAI requires real time data improvement that is built into business processes. When disparate applications and data sources are consolidated without the benefit of the transformation available with data warehouses, the combined data will bristle with inconsistencies and other inaccuracies.

Considerations about the future

XBRL and Textual Financial Information


Today, most companies largely process their internal data on customers, inventories, labor use and finances, while the potential for using external data, such as regulatory filings, has been mostly unexploited. This kind of data is extremely valuable for competitive intelligence, benchmarking, trend analysis and anticipating the impact of macroeconomic policy. The rub is that the data is voluminous and hard to categorize. With the advent of XML and its tags, it is much easier to index information at a microscopic level. The indexing of unstructured data paves the way for text mining and for extracting insights from unstructured data. XBRL (Extensible Business Reporting Language) is focused on using XML to index business documents.

One application of XBRL is the categorization of SEC reports, which can be used for financial comparisons. Edgar Online has a product, I-Metrix, which offers XBRL-tagged information from all 12,000 or so SEC filers, including widely held private companies and private firms with public debt instruments. Beginning with 80% of the line-item information contained in the 10-K annual and 10-Q quarterly filings, the company will cover all line-item information as well as the analytically critical footnotes and other textual information that helps to put the quantitative information in context. OneSource Applink is another company that has developed search engines to access information from its XBRL-tagged database of some 780,000 public and private companies.

XBRL saw a sluggish pace of adoption in its initial years, but lately business intelligence vendors have recognized the technology's potential. Business Objects and Ipedo have set the pace by announcing a partnership to use Edgar Online data to deliver XBRL-enabled BI capabilities. Companies need both external and internal corporate data to make comparisons. Ipedo provides the querying capability; its XQuery- and SQL-enabled XIP 4.0 information integration engine is used to bring in XBRL data from Edgar Online's Web-based I-Metrix service and put it together with internal corporate data on the Business Objects XI platform for comparisons with competitors. The standardization of the definitions helps in making valid comparisons.

Growing support from the government has improved the prospects for the adoption of XBRL. The Federal Deposit Insurance Corp. uses XBRL to collect, process and distribute the quarterly financial reports it receives from banks.

Visualization

Large datasets present intimidating challenges when it comes to extracting patterns in a way that is intelligible to those who have to take decisions in real time. The volume of data continues to grow exponentially as a variety of sensors, such as those placed to monitor traffic in cities, overwhelm familiar modes of analysis such as statistical analysis and data mining. An example would be the emergency response to the monstrous hurricanes that frequently strike the southern regions of the USA. Governments have to be able to absorb information about the movement of a hurricane, which often changes direction, velocity and intensity. A variety of variables about atmospheric pressure and temperature, topography and population have to be taken into account in assessing potential damage. All this would be an impossible task if the entire dataset were not visualized.

Business intelligence, with its multi-dimensional focus, pushes the envelope for the visualization of data. A plain SQL query can aggregate information along a few dimensions, while a cube helps to look at the data from several angles. When it is visualized in a multi-dimensional space, the data can bring into relief outliers, clusters and patterns that help to classify it. Auto theft data rendered on a map, with additional information on neighborhoods, demographics and the income status of the population, helps to extract patterns from colossal databases, to forewarn car owners about their risks and to deploy police in locations where occurrences are concentrated.

Geography lends itself well to 3D visualization, as location provides a context in which several different types of data can be overlaid on a map and their interactions simulated in graphical form. One application of this kind of advanced visualization of large data sets is the solution Vizible Corporation developed for a California municipality. In an emergency, several different resources of a city administration are deployed, such as the police, fire fighters and medical services, and they need information on neighborhoods, people, traffic and so on to come to impromptu decisions as the situation evolves. Vizible Corporation developed a virtual control center using GIS data to simulate graphically the situation on the ground so that an emergency response can be coordinated efficiently.

The major departure that next generation visualization tools make from the familiar world of Excel charts and graphs is the interactivity of the diagrams. When people look at diagrams, they want to test their hypotheses about how outcomes will change as the actionable variables are altered. They want to be able to see how the results change as some of the values are excluded. For example, people could be looking at the academic performance of men and women, and men are likely to be tickled if they find women are performing better. They may want to find out whether the academic performance of women is also higher in the more quantitative sciences like physics and chemistry. The emerging technology for visualization is more decision-oriented and enables graphical simulation. These visualization tools allow for querying, animation, dynamic data content and the linking of several diagrams.

The decline in decision latencies as a result of investment in data visualization is illustrated by the experience of Briggs and Stratton, a manufacturer of gasoline engines. After installing a business intelligence server and web enabled visualization tools, the company was able to find a way to predict the failure rate of its engines and the metrics influencing quality and operations. The chief benefit of the application was that it allowed the company to anticipate the costs incurred on warranties and to take pre-emptive action, rather than wait for several months until the weight of cumulative evidence of failures perforce led to a change in its processes. Real time monitoring of the metrics influencing the quality of its engines helped it to react to any adverse turn of events.

Predictive Modeling for operations

Data mining is not an activity normally synonymous with operations. Instead, it has been reserved for the strategists of companies, aided by statisticians, market research analysts and data mining specialists. The Predictive Model Markup Language (PMML) brings the benefits of data mining to operations. While model design remains the exclusive world of data mining specialists with advanced degrees, the model is executed on data received in real time.

In the first generation of applications of predictive analytics, companies used limited data from their CRM or other transactional databases to rate customers or detect fraud. All the complex analysis of statistical models is reduced to scores that operations staff can use to arrive at estimates of credit-worthiness or to rank customers for the level of service they should be provided.
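
A minimal sketch of what executing a pre-built model against operational data looks like, assuming the model has already been reduced to a simple additive scoring function; the coefficients, fields and cut-off are illustrative assumptions, not a real credit model, and in practice the model definition would arrive as PMML and be run by a scoring engine.

# Coefficients as they might be exported from an offline modeling exercise.
MODEL = {"intercept": 300, "income_thousands": 2.0, "late_payments": -45.0}
CUTOFF = 550   # operations staff only see the score and the resulting decision

def score(record):
    """Apply the pre-built additive model to one incoming operational record."""
    s = MODEL["intercept"]
    s += MODEL["income_thousands"] * record["income_thousands"]
    s += MODEL["late_payments"] * record["late_payments"]
    return s

applicants = [
    {"id": "C-101", "income_thousands": 180, "late_payments": 0},
    {"id": "C-102", "income_thousands": 60,  "late_payments": 4},
]
for a in applicants:
    decision = "approve" if score(a) >= CUTOFF else "refer to analyst"
    print(a["id"], round(score(a)), decision)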

In the future, applications will go further, profiling customers and providing the information to call center staff so that they can make offers. This second generation of CRM tools mines data and estimates a customer's lifetime value, credit risk or probability of defection. Some solutions are more sophisticated and help to differentiate good prospects from casual enquiries. Call center staff will receive enough information to engage customers and offer them compelling deals they find hard to refuse.

64 bit architecture

New hardware technologies, such as 64 bit processors, are expanding the technological ability to efficiently process colossal databases. While the currently popular 32-bit processor can only address 2^32 bytes, or 4 gigabytes, of data, 64-bit processors can address 2^64 bytes, or 16 exabytes, of data.

The superior processing capability of computers with 64-bit processors can help to run databases and other business applications faster, as they can access more data from cache rather than disk, manage larger data files and databases on fewer servers, allow more users to use applications concurrently, and reduce software-licensing fees because fewer processors are required for the same amount of data processing.

One example of the application of 64 bit architecture is in crunching large volumes of health data. Apex Management Group uses SAS Enterprise BI Server to understand the complex interplay of plan offerings, disability benefits, medical system utilization and cost. By using a 64-bit processing platform, Apex can run through 17 million rows and test hypotheses on the fly.

An increasing number of applications will become possible with 64-bit architecture. The advent of RFID will substantially increase the volume of data that needs to be crunched, which will be feasible with the added addressable memory. Similarly, city administrations are using sensors to keep track of crime, and this data too can be processed with 64 bit architecture.
