BP Data Modelling as a Service (DMaaS)
DESCRIPTION
BP Data Modelling as a Service (DMaaS), presented at DAMA USA (now Enterprise Data World), San Diego.
TRANSCRIPT
This document is the confidential property of BP plc. All rights are reserved. Copyright © 2006.
Data Modelling as a Service (DMaaS) at BP
DAMA International, San Diego, March 2008
Christopher Bradley & Ken Dunn
Contents
1. Information Architecture challenges at BP
2. Our solution
   a) Self service user administration & provisioning for BP users
   b) Automated model publishing
   c) Detailed reporting of ER/Studio and Repository usage for user tracking (and chargeback)
   d) Judicious automation
   e) Community of Interest
3. Next steps
1. Information Architecture challenges at BP
BP Overview
BP is an oil, gas, petrochemicals & renewables company
• We employ nearly 100,000 people
• …operations on 6 continents and in over 100 countries
• …market capitalisation of $250 billion
• …revenues of $270 billion in 2006
• …over 25,000 service stations worldwide
Areas of BP Business
• Exploration & Production
• Gas
• Refining
• Alternative Energy
• Chemicals
• Lubricants
• Fuels Marketing & Retail
DCT (IT) Landscape
Indicative Digital and Communication Technology statistics:
• 250 Data Centers, moving to 3 Mega Data Centers
• 80,000 Desktops – mainly Microsoft XP
• 6,000 Servers – Windows & Unix, some Linux
• 7,000 Applications – target to reduce significantly
• 33 instances of SAP (strategic ERP solution)
• 30 petabytes of spinning disk
• 26 major “data warehouses” (18 SAP BW, 3 Kalido)
• 150 applications independently maintaining Customer data
Information Architecture vision
VISION: Data and Information are effectively and efficiently managed as a shared corporate asset that is easily accessible.
Characteristics:
• Well-integrated, enterprise-wide global data where appropriate
• A single view of customer and product master data
Key Attributes:
• Real-time straight-through processing in areas of need
• Overt focus on Data Quality
• Business insights through greater data visibility
• Business ownership with Single Point of Accountability for data
• DCT role in providing leadership, coordination and verification
Information Architecture Framework
[Framework diagram: data types (Master Data, MI/BI Data, Transaction Data, Structured Technical Data, Digital Document); structure (Models / Taxonomy, Catalog / Metadata); capabilities (Integration and Access, Quality, Lifecycle Management, Process, Governance, Planning, People); foundations (Goals, Principles, Purpose); tooling: ER/Studio]
Challenges: Delivery environment
1. Decentralized management
   − IM tools optional, including ER/Studio (need to be persuasive)
   − no single subject area model (linking entities important)
   − few standards but many “guidelines” (modeling guidelines)
2. Project focused
   − documentation gets lost in the project repository (drive to put models in the ER/Studio corporate repository)
   − continuity of resources is difficult (strong community of interest)
   − much project work out-sourced (looking to an accreditation program for partners)
Challenges: Application environment
3. SAP
   − SAP teams may believe that they only need to configure the application, thus overlooking the importance of modeling
   − gaining value from the modeling that is done (cultural change to get SAP teams to actually use models)
4. SOA
   − demand for XML model management (ER/Studio used)
   − looking for a quick way to turn data into services (using Composite EII software)
5. Plus 5,000 other applications
   − many different overlapping physical models (important to map to logical)
   − much integration and ETL work (looking to establish canonical models)
Challenges: Modeling environment
6. Process modeling (ARIS)
   − integration with data models important (have completed the logical mapping of entities, still looking at the best way to do integration)
   − ensuring that “process modelers” don’t also develop the data models (quality has proven to be variable!)
7. Architectural modeling (System Architect)
   − need to integrate with data and process models (work in early stages)
   − confusing for modelers as to which tool to use
   − confusing for project teams as to where to find information
Challenges: Publishing
8. PDF
   − critical for inclusion in project documentation
   − still the major communication format
9. SharePoint
   − official repository for most project and architectural documentation
   − have automated publication of all models so that they are available to all project team members
   − need a good way to publish ER/Studio models, including zooming in and out!
10. Wiki environment
   − starting to be popular, especially for gathering definitions
   − need an easy way to keep definitions synchronized with models
Prior to 2006
2006 position:
• Data modelling undertaken to different degrees in different Segments & Functions
• Very wide variety of tools & techniques used to define data models
   − ARIS, ERwin, System Architect, KMDM, Enterprise Architect, PowerDesigner, Q_Designer, Rational, PowerPoint, Visio, … others?
• Most commonly used tool in BP for data modelling is PowerPoint / Visio
• Projects encounter common cross-business data concepts, but still create their own models & definitions
• No repository of data models, nor governance
• Result: data quality problems; inconsistent data definitions; duplicated data; difficulty in reconciling MI; models & knowledge lost after each project

Q4 2005
• Cross-BP data modelling study – representation from all Segments + Functions
• Developed an agreed requirements statement for data modelling @ BP
• Comprehensive evaluation study
• Established cross-BP licence agreements, MSLA & PSA
2. Our solution
DMaaS portal
A Service, not simply tools!
• 235 models; 50,529 entities
• Standards & guidelines; “how to” guides
• Web based; step by step guides
• BP courses: online & classroom; several video guides
• Active FAQ & discussion board
• Productivity, quality & standards macros; a macros wish list
• Active COI, highly attended & rated
a) Self service user administration & provisioning of users
• Self service user administration & provisioning for BP users to:
   − register for ER/Studio
   − gain repository permissions
   − change repository password
   − access the licence server
   − view registered users: managers (& members) of teams can see who’s registered

Self Service
• Lets managers know who has registered (or who has not) on their team
• Lets users verify they are registered correctly
• Lets users see other members of the data modelling community at BP
Self Service – Example: View users
Self Service – Example: Register New User
[Architecture diagram: Repository Server, Data Modelling Environment SharePoint, Email Client, Active Directory, ER/Studio Application, Repository Access Web Service, User Database, Firewall, Data Modelling Environment Support Application; Embarcadero components highlighted]
1. New user request submitted from SharePoint
2. Request received and validated against BP Active Directory
3. User created in database
4. User created in Repository
5. User given default permissions
6. Welcome email sent
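The six registration steps above can be sketched in miniature. This is an illustrative sketch only, not BP's actual web service: the Active Directory lookup is mocked with a dictionary, the user database is an in-memory SQLite store, and the permission names are hypothetical stand-ins for the real repository defaults.

```python
import sqlite3

# Hypothetical stand-ins for the BP Active Directory and the repository's
# default permission set; names are illustrative, not the real services.
ACTIVE_DIRECTORY = {"BRADC6": "Chris Bradley", "DUNNKB": "Ken Dunn"}
DEFAULT_PERMISSIONS = ["CheckOut", "CheckIn", "View"]

def provision_user(conn, user_id):
    """Steps 2-6 of the registration workflow, in miniature."""
    # 2. Validate the request against the directory
    if user_id not in ACTIVE_DIRECTORY:
        raise ValueError(f"{user_id} not found in Active Directory")
    # 3. Create the user in the support database
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)",
                 (user_id, ACTIVE_DIRECTORY[user_id]))
    # 4-5. Create the user "in the repository" with default permissions
    for perm in DEFAULT_PERMISSIONS:
        conn.execute("INSERT INTO permissions (user_id, perm) VALUES (?, ?)",
                     (user_id, perm))
    conn.commit()
    # 6. Compose the welcome email (returned here rather than sent)
    return f"Welcome to the Data Modelling Environment, {ACTIVE_DIRECTORY[user_id]}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE permissions (user_id TEXT, perm TEXT)")
message = provision_user(conn, "BRADC6")
```

The key design point the slides make is that validation happens against the corporate directory before any account is created, so rejected requests leave no partial state behind.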
b) Automated publishing of models to SharePoint
• Publishing of models from the repository to the BP Data Modelling Environment SharePoint:
   − Completely automatic generation of models in HTML (no need to hand-produce ER/Studio report settings files)
   − The usual approach is to use the report wizard, but that would be unworkable for BP’s large number of models
   − Automatically generate report settings files
   − Customise generated reports (layouts, title, etc.)
   − Automatic uploading to SharePoint: uploading thousands of files to SharePoint is very problematic, so restart is built into our upload jobs
   − Report home page in SharePoint mimics the repository structure
   − Highlights when repository models and SharePoint reports are not synchronised
   − Publishes metadata to inform users of status
Model Publishing - Example
[Architecture diagram: Repository Server, Data Modelling Environment SharePoint, ER/Studio Application, Repository Access Web Service, Firewall, Data Modelling Environment Support Application]
1. Query for updated models
2. Generate settings file
3. Generate HTML version of model
4. Upload HTML to SharePoint
5. Generate and update repository page
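The publishing job's restart behaviour can be sketched as a checkpointed loop. This is a minimal sketch under stated assumptions: model names, the output directory, and the JSON checkpoint file are all hypothetical, and "upload" is simulated by writing a local HTML file; the real job drives ER/Studio report generation and SharePoint.

```python
import json
import pathlib
import tempfile

def publish_models(models, outdir, state_file):
    """Publish each model as HTML, checkpointing after every file so an
    interrupted run can restart without re-uploading finished models."""
    outdir = pathlib.Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    state = pathlib.Path(state_file)
    # Restart support: load the set of models already published, if any
    done = set(json.loads(state.read_text())) if state.exists() else set()
    for name in models:                                  # 1. updated models
        if name in done:                                 # skip completed work
            continue
        html = f"<html><body>Report for {name}</body></html>"  # 2-3. generate
        (outdir / f"{name}.html").write_text(html)             # 4. "upload"
        done.add(name)
        state.write_text(json.dumps(sorted(done)))       # checkpoint
    return done

tmp = pathlib.Path(tempfile.mkdtemp())
published = publish_models(["Customer_LDM", "Product_LDM"],
                           tmp / "reports", tmp / "state.json")
```

Checkpointing after each file, rather than once at the end, is what makes bulk uploads of thousands of files to an unreliable endpoint restartable, which is the problem the slide calls out.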
Model publishing
Zoom inside the browser!
c) User & Usage Reporting
• Detailed reporting of ER/Studio and Repository usage for user tracking (and chargeback)
• Custom solution
• Database of users
   − User department & contact details
   − MAC address
   − Repository id
• Licence server usage
   − Peak number of concurrent users (are we approaching the licence limit?)
   − Number of unique users registered and using the DME (monitor take-up)
User & Usage Reporting
[Chart: Concurrent License Usage, 08 May – 01 Jan, plotting Max Usage and Unique Users per day; y-axis 0–35]
• Log files are copied from the server and parsed
• Usage graph shows peak concurrent license usage and the number of unique users for a given day
• Allows license purchasing decisions to be based on actual usage
• Allows Data Modelling Environment take-up to be monitored
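The log parsing described above reduces to tracking checkout/checkin events. The slides do not show the actual licence server log format, so this sketch assumes simplified "OUT user" / "IN user" event lines; the real parser would first extract those events from the vendor's log syntax.

```python
def usage_stats(log_lines):
    """Compute peak concurrent licence usage and the unique-user count
    from a stream of simplified checkout ("OUT") / checkin ("IN") events."""
    current = set()   # licences checked out right now
    users = set()     # everyone seen checking out at least once
    peak = 0
    for line in log_lines:
        event, user = line.split()
        if event == "OUT":
            current.add(user)
            users.add(user)
            peak = max(peak, len(current))  # record the high-water mark
        elif event == "IN":
            current.discard(user)
    return peak, len(users)

# alice and bob overlap (peak 2); carol checks out after alice returns
peak, unique = usage_stats(["OUT alice", "OUT bob", "IN alice", "OUT carol"])
# peak == 2, unique == 3
```

The two numbers answer the two questions on the slide: the peak tells you whether you are approaching the licence limit, and the unique count tracks take-up.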
d) Judicious automation
• Generic Import/Export: makes changes to the model, e.g. add entities and attributes
• Search Repository: double-click to get the diagram, then view the entity
• Copy Entity For Re-Use:
   − Select entities and run the Copy Entity macro
   − Run the Paste Entity macro in a new diagram
   − Run the Entity Re-use Report macro to see the list of re-used entities and their differences
Entity Mapping
• Define a mapping concept, then check the diagram into the Repository – this allows entities to be mapped to …
• Reference the mapping concept from the Manage Mapping Concepts macro – this creates list attachments to represent the mapping
• Generate a Mapping Report, which lists entities (or submodels) and where they are linked to

Render Stylesheet
• Apply the simple stylesheet – everything becomes white
• Change the stylesheet: all entities with an ‘EDM Business Domain’ attachment become red
• Change the stylesheet again: fill colour is based on attachment value; Customers become blue, Commodities become green
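The stylesheet logic in the steps above amounts to a rule lookup on attachment values. The real macros run inside ER/Studio; this is just an illustrative sketch of that rule, with hypothetical rule and colour names taken from the slide's example.

```python
# Illustrative colour rules matching the slide's example; the attachment
# name 'EDM Business Domain' is from the slide, the dict is an assumption.
RULES = {"Customer": "blue", "Commodity": "green"}

def fill_colour(attachments, default="white"):
    """Pick an entity fill colour from its 'EDM Business Domain' attachment:
    no attachment -> default (white); attachment with an unmatched value
    -> red; matched value -> the rule's colour."""
    domain = attachments.get("EDM Business Domain")
    if domain is None:
        return default
    return RULES.get(domain, "red")
```

Driving colour from attachment values rather than hand-colouring means the same diagram can be re-rendered for different audiences just by swapping the stylesheet.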
Validate Data Model
• Data modelling standards and guidelines have been developed.
• A large number of users are utilising ER/Studio (>300).
• No formal process or organisational function to check the quality of data models.
• An automated process (macro) provides a first level assessment of model quality (i.e. conformance to standards & good practices).
• This does NOT provide any assessment of content quality – that can only be accomplished by data model domain expert review of the model.
• The macro automatically populates the “Validation State” within the model status block.
• Option to run a “statistics only” report on models in specific project folders.

BP Model Status example:
Status: Approved
Type: Project
Validation State: Validated 25/12/2007 73%
Reviewed by: Chris Bradley (BRADC6)
Approved by: Ken Dunn (DUNNKB)
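A first-level validation like the one above can be sketched as a set of conformance checks scored as a percentage, which is presumably where a figure like "Validated 73%" comes from. The real macro runs inside ER/Studio against its model objects; the checks and the entity structure below are illustrative assumptions, not BP's actual standards.

```python
# Hypothetical conformance checks; BP's real standards are not listed in
# the slides, so these three are illustrative examples only.
CHECKS = [
    ("entity has a definition", lambda e: bool(e.get("definition"))),
    ("entity name is singular", lambda e: not e["name"].endswith("s")),
    ("entity has a primary key", lambda e: bool(e.get("pk"))),
]

def validate(entities):
    """Run every check against every entity and report the pass rate,
    in the style of the model status block's 'Validation State' line."""
    passed = total = 0
    for entity in entities:
        for _, check in CHECKS:
            total += 1
            passed += check(entity)  # True counts as 1
    score = round(100 * passed / total) if total else 0
    return f"Validated {score}%"
```

As the slide stresses, a score like this only measures conformance to standards; whether the content is right still needs a domain expert's review.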
e) Community of interest
Community of Interest (COI)
• Purpose:
   − This CoI is to share business cases, issues, best practices, guidance, and project experiences, and to propose domain directives for Data Modelling at BP.
• Why:
   − Data Modelling is undertaken at different levels across BP (Enterprise, Conceptual, Logical, Physical, Message).
   − ER/Studio is an accepted & supported tool that BP has adopted across the Enterprise.
   − Several projects are using ER/Studio at BP today, and even more will in the future.
   − Avoid project islands and re-inventing the wheel; gather project synergies.
   − Share “best practices”.
• Charter:
   https://wss2.bp.com/DCT/EA/teams/EAPublic/GIA/DME/Admin/Community%20of%20Interest/Data%20Modelling%20COI%20Group%20Charter_V02.doc
• Membership:
   − The Data Modelling COI is open to all interested BP staff.
   − Third parties such as consultants and offshore providers may also participate by invitation. Any consultants / contractors or other 3rd parties participating will have a current NDA with BP.
   − Primarily driven by technical demands.
• Involvement of Embarcadero:
   − Input from Embarcadero.
   − The COI can influence Embarcadero product development through our involvement in the PAC (4th–7th Feb). Key product requests to [email protected].
• Meeting frequency and length:
   − Monthly – last Tuesday of the month; 90 minutes; online & “real” meeting.
• Agenda items:
   − Product & DME news, “how to” sessions, user experiences, hot-topic issues.
User Survey: What benefits are you gaining from the Data service?
[Bar chart; response scale Strongly Agree / Agree / Disagree / Strongly Disagree; agreement bars at 79%, 77%, 70%, 60%, 55%, and 4%]
• We are not obtaining any benefits
• We are obtaining benefit through use of a common modelling tool
• We are obtaining benefit through utilisation of a common repository
• We are obtaining benefit through use of common standards, guidelines & processes
• We are obtaining benefit through re-use of models & artefacts
• We are obtaining benefit through provision of central support & help
2006 & 2007 - evangelise
Governance & management
• Best practices
• DM tools
• Notation
• DM Repository
• Common (core) set of data definitions, e.g. Master Data implementation guidelines
• 200+ users; 8,000+ viewers
• BP Enterprise model, conceptual models, logical models, physical models, industry standard models, template models
• 235 models; 50,529 entities
Top 10 BP reasons for developing a data model
1. Capturing Business Requirements
2. Promotes Reuse, Consistency, Quality
3. Bridge Between Business and Technology Personnel
4. Assessing Fit of Package Solutions
5. Identify and Manage Redundant Data
6. Sets Context for Project within the Enterprise
7. Interaction Analysis: Complements Process Model
8. Pictures Communicate Better than Words
9. Avoid Late Discovery of Missed Requirements
10. Critical in Managing Integration Between Systems
GET STARTED
• Register for an ER/Studio license
• Training
• List of users
• Sign up to the newsletter
• Change repository permissions
• Community of Interest
• Productivity Macros
• Web publication of models
3. Next steps
Challenges
• SAP Architects: “We don’t need to do Data Modelling”
• Selling / promoting the purpose of Data Modelling – it’s NOT just for bespoke database developments!
• Expanding the online community of interest
• Certification of internal AND supplier staff – an “approved” supplier doesn’t necessarily mean they know Data Modelling!
• Interactive training
• Web portal to interrogate the repository – develop & promote a Business Data Dictionary
• Drive re-use – linking model artefacts to drive re-use (e.g. Entities from Master Data Models)
Next steps: 2008 & onwards
SOA:
• Important in an SOA world: the definition of data, and consequently of calls to / results from services, is vital
• Straight-through processing can exacerbate the issue: what does the data mean? Which definition of X (e.g. “cost of goods”)?
• Need to utilise the logical model and ERP model definitions
Data Lineage:
• Repository-based data migration design – consistency
• Source to target mapping
• Reverse engineer & generate Informatica ETL
• Impact analysis
ERP:
• Model data requirements – aid configuration / fit-for-purpose evaluation
• Data integration
• Legacy data take-on
• Master Data integration
BI / DW:
• Model data requirements in a Dimensional Model
• Reverse engineer BW InfoCubes, BO Universes, etc.
• Generate Star / Snowflake / Starflake schemas
Message modelling:
• Hierarchic view of the data model; canonicals
• Utilise “sub-models” for each XML message
• Generate XSD; import WSDL
• Customise XSD via ER/Studio macros
• Very powerful XML features in new V7.5
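The "Generate XSD" step above can be illustrated outside ER/Studio with a minimal sketch: emitting an XML Schema element declaration for a single entity. The entity name, attributes, and types here are hypothetical examples, not BP models, and the real generation is done by ER/Studio's XML features.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

def entity_to_xsd(name, attributes):
    """Build a minimal xs:schema declaring one complex element whose
    sequence holds one xs:element per entity attribute."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    elem = ET.SubElement(schema, f"{{{XS}}}element", name=name)
    ctype = ET.SubElement(elem, f"{{{XS}}}complexType")
    seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for attr, xsd_type in attributes:
        ET.SubElement(seq, f"{{{XS}}}element", name=attr, type=f"xs:{xsd_type}")
    return ET.tostring(schema, encoding="unicode")

# Hypothetical entity from a message sub-model
xsd = entity_to_xsd("Customer", [("CustomerId", "string"), ("Name", "string")])
```

Deriving each message's XSD from a sub-model of the data model, rather than hand-writing it, is what keeps service payloads aligned with the agreed entity definitions.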
Governance:
• Approved status of models by Enterprise, Segment, Function
• Model validation service
• Promotion of “approved” models, e.g. master data models
• Promotion of industry standard models (e.g. PODS)
• Drive a quality model culture
• Cross-domain governance
• Modelling (data lineage) benefits for SOX compliance
Re-use:
• Reward re-use; demonstrate the benefits of reuse
• Make re-use the default behaviour
• Share BP benefits success stories (e.g. GOIL)
Questions?
Contact details
Chris Bradley
Head of Information Management
[email protected]
+44 1225 475000

Ken Dunn
Head of Information Architecture
[email protected]
+1 630 836 7805