Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

Page 1: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

All Contents © 2006 Burton Group. All rights reserved.

Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

Peter O’Kelly
Research Director
pokelly@burtongroup.com
www.burtongroup.com
pbokelly.blogspot.com

Thursday – November 30, 2006

Page 2: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Agenda

• Synopsis
• ~7-minute summary
• Discussion
• Extended-play overview (for reference)
  • Analysis
    • Market snapshot
    • Market trends
    • Market impact
  • Recommendations

Page 3: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Synopsis

• Data modeling used to be seen primarily as part of database analysis and design -- for DBMS-nerds only

• There is now growing appreciation for the value of logical data modeling in many domains, both technical and non-technical

• Historically, most data modeling techniques and tools have been inadequate, and often focused more on implementation details than on logical analysis and design

• Pervasive use of XML and broader exploitation of metadata, along with improved techniques and tools, are making data modeling more useful for all information workers (as well as data-nerds)

• Data modeling is a critical success factor for XML – in SOA and elsewhere
• Data modeling is now
  • A fundamental part of the back-to-basics trend in application development
  • Key to effective exploitation of emerging applications and tools
  • Essential to regulatory compliance (e.g., information disclosure tracking)

Page 4: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


~7-minute summary

• Logical data modeling is often misunderstood and underrated
  • Models of real-world things (entities), attributes, relationships, and identifiers
  • Logical => technology-independent (not implementation models)
  • Logical data modeling is not 1:1 with relational database design
    • It’s as much about building contextual consensus among people as it is capturing model design for software systems
    • It’s also exceptionally useful for database design, however
• Some of the historical issues
  • Costly, complex, and cumbersome tools/techniques
  • Disproportionate focus on physical database design

Page 5: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


~7-minute summary

• Logical data modeling is more relevant than ever before
  • Entities, attributes, relationships, and identifiers
    • None of the above are optional if you seek to
      • Respect and accommodate real-world complexity
      • Establish robust, shared context with other people
  • Revenge of the DBMS nerds
    • Not just for normalized “number-crunching” anymore…
    • Native DBMS XML data model management => fundamental changes
      • XQuery: relational calculus for XML
      • SQL and XQuery have very strong synergy
      • All of the capabilities that made DBMS useful in the first place apply to XML as well as traditional database models
    • DBMS price/performance and other equations have radically improved
  • Logical modeling tools/techniques are more powerful and intuitive
    • And less expensive

Page 6: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


~7-minute summary

• XML-based models are useful but insufficient
  • Document-centric meta-meta-models are not substitutes for techniques based on entities, attributes, relationships, and identifiers
  • Some XML-centric techniques have a lot in common with pre-relational data model types (hierarchical and network navigation) or mutant “object database” models
  • XML also has ambiguous aspects, much like the unfortunate “Entity-Relationship” (E-R) model
• Logical data modeling is not ideal for document-oriented scenarios (involving narrative, hierarchy, and sequence; optimized for human comprehension)
  • But a very large percentage of XML today is data-centric rather than document-centric
  • And increasingly pervasive beyond-the-basics hypertext (with compound and interactive document models) is often more data- than document-centric

Page 7: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


~7-minute summary

• Ontology is necessary but insufficient
  • Categorization is obviously a useful organizing construct
    • “Folksonomies” are also often very effective
  • But…
    • Categorization is just one facet of modeling
    • Many related techniques are conducive to insufficient model detail, creating ambiguity and unnecessary complexity, e.g., for model mapping
  • So…
    • We’re now seeing microformats and other new words
      • … that are fundamentally focused on logical data model concepts
    • It’d be a lot simpler and more effective to start with logical data models in the first place

Page 8: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Discussion

Page 9: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

[Extended-play version] Analysis

Market snapshot

• Data modeling concepts
• Data modeling benefits
• Data modeling in the broader analysis/design landscape
• Why data modeling hasn’t been used more pervasively

Page 10: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling concepts: the joy of sets

• Core concepts
  • Entity: a type of real-world thing of interest
    • Anything about which we wish to capture descriptions
    • More precisely, an entity is an arbitrarily defined but mutually agreed upon classification of things in the real world
    • Examples: customer, report, reservation, purchase
  • Attribute: a descriptor (characteristic) of an entity
    • A customer entity, for example, is likely to have attributes including customer name, address, …
  • Relationship: a bidirectional connection between two entities
    • Composed of two links, each with a link label/descriptor
    • Example: customer has dialogue; dialogue of customer
  • Identifier: one or more descriptors (attributes and/or relationship links) that together uniquely identify entity instances, e.g., CustomerID
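The four core concepts can be sketched as a tiny metamodel. This is only an illustration in Python, not a notation from the presentation: the class names, the `links` field, and the `identifier_is_well_formed` helper are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A type of real-world thing, with its attributes and its identifier."""
    name: str
    attributes: list[str]
    identifier: list[str]  # attribute names and/or relationship link labels
    links: list[str] = field(default_factory=list)  # labels of links leaving this entity


@dataclass
class Relationship:
    """A bidirectional connection: one labeled link in each direction."""
    first: Entity
    first_link: str   # read from first to second, e.g. "has dialogue"
    second: Entity
    second_link: str  # read from second to first, e.g. "of customer"

    def __post_init__(self):
        # Register each link on the entity it leaves from.
        self.first.links.append(self.first_link)
        self.second.links.append(self.second_link)


def identifier_is_well_formed(entity: Entity) -> bool:
    """Every identifier part must be a known attribute or relationship link."""
    known = set(entity.attributes) | set(entity.links)
    return bool(entity.identifier) and all(part in known for part in entity.identifier)


customer = Entity("Customer", ["CustomerID", "CustomerName"], ["CustomerID"])
dialogue = Entity("Dialogue", ["DialogueDate", "DialogueTopic"], ["of customer", "DialogueDate"])
has_dialogue = Relationship(customer, "has dialogue", dialogue, "of customer")
```

Note that the Dialogue identifier combines a relationship link with an attribute, which is why the definition above allows both kinds of descriptor.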

Page 11: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling concepts: example data model fragment diagram

[Diagram: two entities, Customer and Dialogue, joined by a relationship, with callouts labeling the entities, attributes, relationship, and identifiers]

Customer
• CustomerID (identifier)
• CustomerName
• CustomerIndustry
• CustomerAddress
• CustomerRenewalDate

Dialogue
• DialogueDate
• DialogueTopic
• DialogueAnalyst

Following Carlis/Maguire (from their data modeling book):
• About each customer, we can remember its name, industry, address, renewal date, and ID. Each customer is identified by its ID.
• About each dialogue, we can remember its customer and its date, topic, and analyst. Each dialogue is identified by its customer and its date.

[Note: this model fragment is an example and is not very well-formed]

Page 12: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling concepts: example data model instance

Customer

CustomerID (PK1) | CustomerName | CustomerIndustry   | CustomerAddress  | CustomerRenewalDate
017823           | Acme Widgets | Manufacturing      | 123 Main Street… | 2005/10/14
75912            | NewBank.com  | Financial services | 456 Central…     | 2006/05/28
91641            | Degrees 4U   | Education          | P.O. Box 1642…   | 2004/12/31

Dialogue

CustomerID (PK1, FK1) | DialogueDate (PK1) | DialogueTopic     | DialogueAnalyst
75912                 | 2005/06/18         | Data architecture | Peter O’Kelly
91641                 | 2003/12/13         | SIP/SIMPLE        | Mike Gotta
017823                | 2004/10/14         | Portal            | Craig Roth

PKn: participates in primary key
FKn: participates in foreign key

Bonus: it’s very simple to create instance models (and thus relational database designs) from well-formed logical data models
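As a rough illustration of that last point, the two tables map almost mechanically onto a relational schema. The sketch below uses Python's built-in `sqlite3` module; the choice of SQLite, the TEXT column types, and the `try_insert_dialogue` helper are assumptions made for the example, not part of the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (
    CustomerID          TEXT PRIMARY KEY,          -- PK1 (TEXT keeps the leading zero)
    CustomerName        TEXT NOT NULL,
    CustomerIndustry    TEXT,
    CustomerAddress     TEXT,
    CustomerRenewalDate TEXT
);
CREATE TABLE Dialogue (
    CustomerID      TEXT NOT NULL REFERENCES Customer (CustomerID),  -- PK1, FK1
    DialogueDate    TEXT NOT NULL,                                   -- PK1
    DialogueTopic   TEXT,
    DialogueAnalyst TEXT,
    PRIMARY KEY (CustomerID, DialogueDate)
);
""")

# Seed the instance data from the slide (addresses are elided in the source).
with conn:
    conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?, ?)", [
        ("017823", "Acme Widgets", "Manufacturing", "123 Main Street…", "2005/10/14"),
        ("75912", "NewBank.com", "Financial services", "456 Central…", "2006/05/28"),
        ("91641", "Degrees 4U", "Education", "P.O. Box 1642…", "2004/12/31"),
    ])
    conn.executemany("INSERT INTO Dialogue VALUES (?, ?, ?, ?)", [
        ("75912", "2005/06/18", "Data architecture", "Peter O’Kelly"),
        ("91641", "2003/12/13", "SIP/SIMPLE", "Mike Gotta"),
        ("017823", "2004/10/14", "Portal", "Craig Roth"),
    ])


def try_insert_dialogue(row):
    """Return True if the row satisfies the identifier and foreign key rules."""
    try:
        with conn:
            conn.execute("INSERT INTO Dialogue VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

The composite primary key mirrors "each dialogue is identified by its customer and its date", so a second dialogue for the same customer on the same date is rejected, as is a dialogue for an unknown customer.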

Page 13: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling benefits

• Precision and consistency
• High fidelity models
  • Which are easier to maintain in order to reflect real-world changes
• Improved
  • Ability to analyze, visualize, communicate, collaborate, and build consensus
  • Potential for data reuse
    • A fundamental DBMS goal
    • Easier to recognize common shapes and patterns
  • Impact analysis (e.g., “what if” assessments for proposed changes)
• Exploitation of tools, servers, and services
  • DBMSs and modern design tools/services assume well-formed data models
    • “Being normal is not enough”…
  • SOA, defined in terms of schemas, requires data model consensus

Page 14: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling in the broader analysis/design landscape

• Four dimensions to consider
  • Data, process, and events
  • Roles/concerns/views: strategic, operational, and technology
  • Logical and physical
  • Current/as-is and goal/to-be states

Page 15: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data, process, and events

• Think of nouns, verbs, and state transitions
  • Data: describes structure and state at a given point in time
  • Process: algorithm for accomplishing work and state changes
  • Event: trigger for data change and/or other action execution
• Integrated models are critically important
  • Data modeling, for example, is guided by process and event analyses
    • Otherwise scope creep is likely
  • There is no clear right/wrong in data modeling
    • Scope and detail are determined by the processes and events you wish to support, and they often change over time

Page 16: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Page 17: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Roles/concerns/views

• Three key dimensions
  • Strategic
    • Organization mission, vision, goals, and strategy
  • Operational
    • Data, process, and event details to support the strategic view
  • Technology
    • Systems (applications, databases, and services) to execute operations
• Again pivotal to maintain integrated models
  • Data modeling that’s not guided by higher-level goal modeling can suffer from scope creep and become an academic exercise

Page 18: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Logical and physical

• Another take on operational/technology
  • Logical: technology-independent data, process, and event models
    • Examples:
      • Entity-Relationship (ER) diagram
      • Data flow diagram (process model)
  • Physical: logical models defined in software
    • (Doesn’t imply illogical…)
    • Examples
      • Data definition language statements for database definition, including details such as indexing and table space management for performance and fault tolerance
      • Class and program modules in a specific programming language
• Integration and alignment between logical and physical are key
  • But are often far from ideal, in practice today

Page 19: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Current/as-is and goal/to-be states

• Combining as-is/to-be states and logical/physical

         | Current/as-is                                             | Goal state/to-be
Logical  | Technology-independent view of current systems            | Real-world model unconstrained by current systems
Physical | Systems already in place; the stuff we need to live with… | New system view with high-fidelity mapping to logical goal state

Page 20: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Why data modeling hasn’t been used more pervasively

• So, why isn’t everybody doing this?...
  • Data modeling is hard work
  • Historically
    • Disproportionate focus on physical modeling
    • Inadequate techniques and tools
    • Suboptimal “burden of knowledge” distribution
  • Reduced “green field” application development
  • Data modeling has a mixed reputation

Page 21: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling is hard work

• It’s straightforward to read well-formed data models, but it’s often very difficult to create them

• Key challenges
  • Capturing and accommodating real-world complexity
  • Dealing with existing applications and systems
  • Organizational issues
    • Collaboration and consensus-building
    • Role definitions and incentive systems that discourage designing for reuse and working with other project teams
    • Politics

Page 22: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Historically disproportionate focus on physical modeling

• Radical IT economic model shifts during recent years
  • Design used to be optimized for scarce computing resources including MIPS, disk space, and network bandwidth
    • The “Y2K crisis” is a classic example of the consequences of placing too much emphasis on physical modeling-related constraints
  • Relatively stand-alone systems discouraged designing for reuse
• Now
  • Applications are increasingly integrated, e.g., SOA
  • Hardware and networking resources are abundant and inexpensive
  • The ability to flexibly accommodate real-world changes is mission-critical
  • Logical modeling is more important than ever before

Page 23: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Historically inadequate techniques and tools

• Tendency to focus on physical, often product-specific (e.g., PeopleSoft or SAP) models

• Lack of robust repository offerings
  • Making it very difficult to discover, explore, and share/reuse models
• Entity-Relationship (ER) “model”
  • More of an ambiguous and incomplete diagramming technique, but still the de facto standard for data modeling

Page 24: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Tangent: ER, what’s the matter?

• Entity Relationship deficiencies
  • Per E. F. Codd [1990]
    • “Only the structural aspects were described; neither the operators upon those structures nor the integrity constraints were discussed. Therefore, it was not a data model.
    • The distinction between entities and relationships was not, and is still not, precisely defined. Consequently, one person’s entity is another person’s relationship.
    • Even if this distinction had been precisely defined, it would have added complexity without adding power.”

Source: Codd, The Relational Model for Database Management, Version 2

Page 25: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Tangent: ER, what’s the matter?

• Many vendors have addressed some original ER limitations, but the fact that ER is ambiguous and incomplete has led to considerable problems
• The Logical Data Structure (LDS) technique is much more consistent and concise, but it’s only supported by one tool vendor (Grandite)
  • It’s possible to use the ER-based features in many tools in an LDS-centric approach, however
• Ultimately, diagramming techniques are simply views atop an underlying meta-meta model
  • The most useful tools now include
    • Well designed and integrated meta-meta models
    • Options for multiple view types, including data, process, and event logical views, as well as assorted physical views

Page 26: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Historically inadequate techniques and tools

• Unfortunate detours such as overzealous object-oriented analysis and design
  • Class modeling is not a substitute for data modeling
  • “Everything is an object” and system-assigned identifiers often mean insufficient specificity and endless refactoring
  • Fine to capture entity behaviors and to highlight generalization, but you still need to be rigorous about entities, attributes, relationships, and identifiers
• No “Dummies’ Guide to Logical Data Modeling”
  • E.g., normalization: a useful set of heuristics for assessing and fixing poorly-formed data models
  • But there has been a shortage of useful resources for people who seek to develop data modeling skills – in order to create well-formed data models in the first place
• Result: often intimidating levels of complexity…
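To make the normalization point concrete, here is a minimal sketch of one such heuristic: checking whether a descriptor depends on only part of a table's identifier. The `violates_dependency` function, the flattened table, and its second row are hypothetical and introduced only for this example.

```python
def violates_dependency(rows, determinant, dependent):
    """Return True if one determinant value maps to more than one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for a key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return True
    return False


# A poorly formed table: customer facts are repeated on every dialogue row,
# so the two copies of the customer name have drifted apart.
flat_rows = [
    {"CustomerID": "75912", "CustomerName": "NewBank.com", "DialogueDate": "2005/06/18"},
    {"CustomerID": "75912", "CustomerName": "New Bank", "DialogueDate": "2006/01/09"},
]

name_conflict = violates_dependency(flat_rows, ["CustomerID"], ["CustomerName"])
```

A check like this only detects the symptom after the fact. In a model where the name is recorded once per customer, the conflict cannot arise at all, which is the argument for getting the model well-formed in the first place.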

Page 27: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Historically inadequate tools and techniques

• An Object Role Modeling (ORM) example
  • Consistent and concise
  • But also overwhelming
    • Doesn’t scale well for more complex modeling domains
  • Useful for some designers
    • But not as useful for collaborative modeling with subject matter experts who don’t seek to master the technique

Source: http://www.orm.net/pdf/ORMwhitePaper.pdf

Page 28: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Historically suboptimal “burden of knowledge” distribution

• Following Carlis: knowledge is generally captured in three places
  • Resource managers/systems such as DBMSs
  • Applications/programs
  • People’s heads
• Universally-applicable data, process, and event details are ideally captured in DBMSs
  • Applications can be circumvented and are often cruelly complex
  • People come and go (and take their knowledge with them)
• But in recent years, DBMSs have been relegated to reduced roles
  • Suboptimal in many data modeling-related respects
  • Often meant inappropriate distribution of the burden of knowledge
• DBMSs (and thus data modeling) are now resurgent, however
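One way to picture placing the burden of knowledge in the DBMS is a rule declared once in the schema, which every program then inherits. The sketch below uses SQLite through Python's `sqlite3` module. The specific CHECK rules and the `insert_customer` helper are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer (
        CustomerID          TEXT PRIMARY KEY,
        CustomerName        TEXT NOT NULL CHECK (length(CustomerName) > 0),
        CustomerRenewalDate TEXT NOT NULL
            CHECK (CustomerRenewalDate GLOB '[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]')
    )
""")


def insert_customer(customer_id, name, renewal_date):
    """Stand-in for any application: it cannot bypass the rules held by the DBMS."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Customer VALUES (?, ?, ?)",
                (customer_id, name, renewal_date),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

If the same rules lived only in application code, a second application (or an ad hoc import script) could skip them, which is the circumvention risk the slide describes.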

Page 29: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Reduced “green field” application development

• Following the enterprise shift toward purchased-and-customized applications such as ERP and CRM
  • Start with models supplied by vendor
  • Usually with major penalties for extensive customization
    • So we often see enterprises changing their operations to match purchased applications instead of the other way around
• In many cases, packaged applications
  • Follow least common denominator approaches in order to support multiple DBMS types
  • Capture universally-applicable data/process/event model facets at the application tier instead of in DBMSs
    • Far from ideal distribution of the burden of knowledge
  • Trade off increased complexity for increased generality
    • Good for application vendors; not always so good for customer organizations
• Overall, this has often resulted in
  • Reduced incentives and utility for data modeling
  • Many organizations deferring to application suppliers for data models, often with undesirable results such as “lock-in” and endless consulting

Page 30: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Recap: data modeling has a mixed reputation

• Because of the historical challenges
  • The return on data modeling time investment has been far from ideal because of
    • Lack of best practices, techniques, and tools
    • Environmental dimensions that reduced the utility of data modeling
  • Many enterprise data modeling projects became IT full-employment acts
    • With endless scope creep and unclear milestones, completion criteria, and return on investment
  • As a result, enterprise data modeling endeavors have become scarcer during recent years, with the relentless IT focus on ROI and TCO
• Obviously an untenable situation
  • Both IT people and information workers are increasingly making decisions when they literally don’t know what they’re talking about, due to the lack of high-quality, high-fidelity data models

Page 31: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Market trends

• Back to data basics
• Broader and deeper data modeling applicability
• Availability of more and better data models
• Simpler and more effective techniques and tools
• Increasing data modeling utility, requirements, and risks

Page 32: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Back to data basics

• Growing appreciation for
  • The reality that all bets are off if you’re not confident you have established consensus about goals, nouns, verbs, and events
  • Software development life cycle economic realities
    • It becomes much more disruptive and expensive to correct models as you progress through the analysis, design, implementation, and maintenance phases
    • Less expensive hardware and networking means the return on time investment for logical modeling is increasing while the return for physical modeling is decreasing
    • Indeed, emerging model-driven tools increasingly make it possible for the logical model to serve as the application specification, with penalties for developers who insist on endlessly tweaking the generated physical models (code)

Page 33: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Broader and deeper data modeling applicability

• SOA is one of the most significant data modeling-related developments during recent years
  • All about services, but with a deep data model prerequisite
    • Don Box: services share schemas and contract, not class
  • From a DBMS-centric world view, web services => pragmatic XML evolution
    • Parameterized queries, as in DBMS stored procedures
    • Structured and grouped query results
  • SOA has also driven the need for web services repository (WSR) products
• Increasingly powerful tools for information workers have also expanded the applicability of data modeling
  • An early example: Business Objects – focused on making data useful for more people through data model abstractions
  • Similar capabilities are now available throughout products such as Microsoft Office
  • Recent developments such as XQuery will dramatically advance the scope and power of applied set theory
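The DBMS-centric reading of web services (a parameterized query that returns a structured, grouped result) can be sketched without any web stack at all. In this Python example the `dialogues_since` operation, the table contents beyond the three rows shown earlier in the deck, and the shape of the returned dictionary are all assumptions for illustration.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dialogue (CustomerID TEXT, DialogueDate TEXT, DialogueTopic TEXT)")
conn.executemany("INSERT INTO Dialogue VALUES (?, ?, ?)", [
    ("75912", "2005/06/18", "Data architecture"),
    ("91641", "2003/12/13", "SIP/SIMPLE"),
    ("017823", "2004/10/14", "Portal"),
    ("75912", "2006/02/01", "XQuery"),  # hypothetical extra row
])


def dialogues_since(start_date):
    """A service-style operation: one parameter in, a grouped structure out."""
    rows = conn.execute(
        "SELECT CustomerID, DialogueDate, DialogueTopic FROM Dialogue "
        "WHERE DialogueDate >= ? ORDER BY CustomerID, DialogueDate",
        (start_date,),  # bound parameter, as with a stored procedure argument
    ).fetchall()
    # Group the flat rows by customer so the caller receives a nested result.
    return {
        customer_id: [{"date": date, "topic": topic} for _, date, topic in group]
        for customer_id, group in groupby(rows, key=lambda row: row[0])
    }
```

The caller sees only the parameter and the result shape. Both are schema decisions, which is why the services view still rests on data model consensus.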

Page 34: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Availability of more and better models

• Resources such as books focused on the topic area, e.g., Carlis/Maguire and David Hay’s Data Model Patterns

• Products that include expansive data models, ranging from ERP to recent data model-focused offerings such as
  • NCR Teradata’s logical data model-based solutions
  • “Universal model” resources from enterprise architecture tool vendors such as Visible Systems
    • Based on decades of in-market enterprise modeling experience

Page 35: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Availability of more and better models

• Standards groups and initiatives, such as
  • ACCORD
  • Open Application Group
  • OASIS Universal Business Language
• Models developed by enterprises and government agencies, e.g.,
  • Canada’s Integrated Justice Information (IJI) initiative
    • Provides a data model and context framework for all aspects of law enforcement
    • No magic: a multi-year effort with pragmatic hard work and governance
  • Similar initiatives are now under way in the United States and other countries

Page 36: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Simpler and more effective techniques and tools

• Most now include
  • Cleaner separation of concerns and more intuitive user experiences
    • For data modeling: ER subsets/refinements that reduce ambiguity and notational complexity
    • And support view preferences with variable levels of detail
  • Integrated meta-meta models and unified repositories
    • Supporting enterprise architecture models such as the Zachman Framework as navigational guides
    • Although there’s still a perplexing lack of repository-related standards

Page 37: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Data modeling in the enterprise architecture landscape

• Relative to the Zachman Framework

Source: http://www.zifa.com/

Page 38: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Simpler and more effective techniques and tools

• Most now include (continued)
  • Model-driven analysis and design tools
    • Building on virtualization and application frameworks with declarative services for transactions, security, and more
    • Even more incentive to focus more on logical models and less on physical models
  • More powerful and robust forward- and reverse-engineering capabilities
    • To transform physical => logical as well as logical => physical
  • Many are also available at much lower cost
    • And some open source modeling tools have emerged
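Reverse engineering (physical => logical) can be illustrated at toy scale by reading a database catalog back into entities, attributes, identifiers, and references. The sketch below relies on SQLite's `table_info` and `foreign_key_list` pragmas. The `reverse_engineer` function and its output shape are assumptions for this example and say nothing about how commercial modeling tools work.

```python
import sqlite3


def reverse_engineer(conn):
    """Recover a rough logical description from a physical SQLite schema."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        model[table] = {
            "attributes": [col[1] for col in columns],
            # pk is the 1-based position within the primary key, or 0 if not part of it
            "identifier": [col[1] for col in sorted(columns, key=lambda c: c[5]) if col[5] > 0],
            "references": sorted({(fk[3], fk[2]) for fk in foreign_keys}),
        }
    return model


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Dialogue (
    CustomerID   TEXT REFERENCES Customer (CustomerID),
    DialogueDate TEXT,
    DialogueTopic TEXT,
    PRIMARY KEY (CustomerID, DialogueDate)
);
""")
logical_view = reverse_engineer(conn)
```

What this cannot recover is anything the physical schema never recorded, such as link labels or the reasoning behind an identifier choice. Reverse engineering therefore complements logical modeling rather than replacing it.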

Page 39: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Increasing data modeling utility, requirements, and risks

• To recap: much more utility from effective data modeling
• Related trends and risks
  • Regulatory compliance requirements, especially concerning information disclosure
    • Impossible to track what’s been disclosed (both by and to whom) if you don’t know what you’re managing and who has access to it
  • Increasing demand for reverse-engineering tools in order to better understand existing systems and interactions
  • “Cognitive overreach” – the potential for information workers to create nonsensical queries based on poorly-designed data models
    • The queries will often execute and return arbitrary results
    • With which people will make equally arbitrary business decisions
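A small example of a query that executes without complaint yet answers the wrong question: summing a per-customer value across a join that repeats each customer once per dialogue. The `ContractValue` attribute and all figures below are hypothetical, chosen only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, ContractValue INTEGER);
CREATE TABLE Dialogue (CustomerID TEXT, DialogueDate TEXT);
INSERT INTO Customer VALUES ('75912', 100), ('91641', 50);
INSERT INTO Dialogue VALUES ('75912', '2005/06/18'), ('75912', '2006/02/01'),
                            ('91641', '2003/12/13');
""")

# Plausible-looking but nonsensical: the join yields one row per dialogue,
# so customer 75912's contract value is counted twice.
naive_total = conn.execute("""
    SELECT SUM(c.ContractValue)
    FROM Customer c JOIN Dialogue d ON d.CustomerID = c.CustomerID
""").fetchone()[0]

# Respecting the model: contract value is a fact about customers, not dialogues.
actual_total = conn.execute("SELECT SUM(ContractValue) FROM Customer").fetchone()[0]
```

Nothing in the DBMS flags the first query as wrong. Only an understanding of which entity each attribute describes tells the user that the first total is inflated.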

Page 40: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Market impact

• Pervasive data modeling and model-driven analysis/design
• Vendor consolidation and superplatform alignment
• Potentially disruptive market dynamics

Page 41: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Pervasive data modeling and model-driven analysis/design

• No longer optional (never really was)
• Most of today’s software products assume effective data modeling
  • Using a DBMS or an abstraction layer such as Microsoft’s ADO.NET with poorly-designed data models results in significant penalties
• Often implicit, e.g., in
  • Information worker-oriented tools such as the query and data manipulation tools included in Microsoft Office
    • Not a recent development – e.g., consider > $1B annual market for products such as Apple FileMaker Pro and Microsoft Access – but rapidly expanding
  • Future offerings such as Microsoft Vista and Microsoft Office 2007, which are deeply data model- and schema-based
    • For documents, messages, calendar entries, and more, all with extensible schemas and tools for direct information worker metadata manipulation actions

Page 42: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Vendor consolidation and superplatform alignment

• A familiar pattern – commoditization, standardization, and consolidation, resulting in
  • Significant merger/acquisition activity
  • Shifting product categories, in this context including
    • Specialized/focused modeling tools
      • Including widely-used products such as Microsoft Visio
    • Enterprise architecture/application lifecycle management tool suites
      • Essentially CASE++, with more and better integrated tools, deeper standards support, and often with support for strategic views
      • Examples: Borland, Embarcadero, Grandite, Telelogic, Visible
    • Superplatform-aligned tool suites
      • IBM, Microsoft, and Oracle, for example, all either now offer or plan to soon offer end-to-end model-driven tool suites
      • IBM currently has a significant market lead, through its Rational acquisition
  • Broader support for interoperability-focused standards initiatives such as XMI (OMG’s XML Metadata Interchange specification)

Page 43: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Vendor consolidation and superplatform alignment

[Diagram: some CASE and modeling tool vendor merger/acquisition activity]

• Bachman, Cadre => Cayenne => Sterling => Computer Associates
• KnowledgeWare => Sterling
• Texas Instruments IEF => Sterling
• LogicWorks ERWin => Platinum => Computer Associates
• SDP S-Designor => PowerSoft => Sybase PowerDesigner
• Popkin => Telelogic
• TogetherSoft => Borland
• Rational => IBM
• Visio => Microsoft

Page 44: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Potentially disruptive market dynamics

• Opportunities for new or refocused entrants, e.g.,
  • Adobe: a potential leader in WSR following its acquisition of Yellow Dragon Software
    • Adobe doesn’t offer data modeling tools, but it has a broad suite of tools that exploit XML and data models
    • The urgent need for WSR products could result in SOA-centric repository offerings expanding to encompass more traditional repository needs as well
  • Altova: expanding into UML modeling from its XML mapping/modeling franchise
  • Microsoft: Visual Studio Team System (VSTS) is Microsoft’s first direct foray into modeling tools
    • It used to offer Visual Modeler, a little-used OEM’d version of Rational Rose
    • VSTS won’t initially include data modeling tools, but they are part of the plan for future releases
  • MySQL AB: acquired an open source data modeling tool (DBDesigner 4) and is preparing to reintroduce an expanded version (which will remain open source)

Page 45: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Potentially disruptive market dynamics

• New challenges for UML, with significant implications
  • UML is the most widely-used set of diagramming techniques today, but it’s not particularly useful for data modeling, and it has some ambiguities and limitations
  • Microsoft and some other vendors believe domain-specific languages (DSLs) are more effective than UML for many needs
  • If UML falters, vendors that have placed strategic bets on UML (such as Borland, IBM, and Oracle) will face major challenges
• Open source modeling initiatives
  • Some examples
    • Argo UML
    • MySQL’s future Workbench tool set
    • MyEclipse: $29.95 annual subscription for multifaceted tools with modeling
  • These initiatives will accelerate modeling tool commoditization and standardization

Page 46: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


The U in UML stands for “unified,” not “universal”

• UML is in some ways ambiguous and is not a substitute for data modeling
  • Some tools include UML profiles for data modeling, however
    • UML profiles are similar to domain specific languages in many respects
  • It’s not clear that UML is ideal for meta-meta-meta models
• UML represents unification of three leading diagramming techniques, but it’s not universally applicable
• UML is much better than not using any modeling/diagramming tools, but it’s not a panacea
  • Although it’s getting more expressive and consistent, with UML v2

Page 47: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Recommendations

• Think and work in models
• Build and use model repositories
• Create high-fidelity modeling abstractions for SOA
• Revisit modeling tool vendor assumptions and alternatives
• Respect and accommodate inherent complexity

Page 48: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things


Think and work in models

• Develop skills and experience in

• Thinking at the type level of abstraction

• Using set-oriented query tools/services

• Data modeling utility now extends far beyond database analysis and design

• Information workers who have effective data modeling skills will be much more productive

• Use data modeling to analyze, visualize, communicate, and collaborate

• Provide guidance in

• Data modeling training and tools

• Selecting appropriate tools

• Don’t use ambiguous or incomplete diagramming techniques

• Making resources available in models
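As a rough illustration of type-level, set-oriented thinking, the sketch below asks one question about all people and all page views at once, rather than looping over individual records. The `person` and `page_view` tables, their columns, and the sample rows are hypothetical and are not taken from the presentation.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        last_name TEXT NOT NULL
    );
    CREATE TABLE page_view (
        person_id  INTEGER NOT NULL REFERENCES person(person_id),
        page_title TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Ng"), (2, "Diaz"), (3, "Okoro")])
conn.executemany("INSERT INTO page_view VALUES (?, ?)",
                 [(1, "Home"), (1, "FAQ"), (2, "Home")])


def views_per_person(db):
    """Describe the whole result set in one statement.

    The LEFT JOIN keeps people who have no page views, so the answer is
    stated in terms of the types (person, page view), not single rows.
    """
    cursor = db.execute("""
        SELECT p.last_name, COUNT(v.page_title)
        FROM person AS p
        LEFT JOIN page_view AS v ON v.person_id = p.person_id
        GROUP BY p.person_id, p.last_name
        ORDER BY p.last_name
    """)
    return cursor.fetchall()


result = views_per_person(conn)
```

The query names entity types and their relationship; which rows happen to exist is left to the engine.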

Page 49: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

49Recommendations

Build and use model repositories

• Do not

• Needlessly recreate/reinvent models

• Default to exclusively extrapolating models from existing XML schemas or query results

• Reality check: that’s how most XML-oriented modeling is done today, but it often propagates suboptimal designs and limits reuse

• This may seem familiar: it repeats an early DBMS pattern, when many developers simply moved earlier file designs into DBMSs rather than checking design assumptions/goals

• Ensure policies and incentive systems are in place to encourage and reward model sharing via repositories

• Add to data governance strategy

Page 50: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

50Recommendations

Create high-fidelity modeling abstractions for SOA

• SOA is rapidly becoming a primary means of facilitating inter-application integration

• Robust SOA schema design entails abstraction layers

• Exposing public interfaces to private systems otherwise often means propagating suboptimal data model design decisions

• Sharing services with users whom you may never actually meet

• Making unambiguous and robust models more important than ever

• WSR is likely to become a key part of enterprise model repository strategy

• Encompassing contexts and models that aren’t exclusively SOA-focused
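One way to picture the abstraction layer described on this slide is a small translation step at the service boundary. This is a minimal sketch under stated assumptions: `LegacyCustomerRow`, `CustomerView`, their fields, and the status codes are all hypothetical names invented for illustration, not part of any system the presentation describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyCustomerRow:
    """Hypothetical private record; names mirror an assumed legacy table."""
    cust_no: int
    nm_last: str
    nm_first: str
    stat_cd: str  # assumed codes: "A" = active, "I" = inactive


@dataclass(frozen=True)
class CustomerView:
    """Hypothetical public contract exposed to service consumers."""
    customer_id: int
    full_name: str
    is_active: bool


def to_public(row: LegacyCustomerRow) -> CustomerView:
    """Translate the private shape into the public model at the boundary."""
    return CustomerView(
        customer_id=row.cust_no,
        full_name=f"{row.nm_first} {row.nm_last}",
        is_active=(row.stat_cd == "A"),
    )


view = to_public(
    LegacyCustomerRow(cust_no=42, nm_last="Lovelace", nm_first="Ada", stat_cd="A")
)
```

Consumers see only `CustomerView`, so the cryptic column names and status codes of the private design do not leak into the public interface and can change without breaking users the provider may never meet.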

Page 51: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

51Recommendations

Revisit modeling tool vendor assumptions and alternatives

• Think form-follows-function

• White board and pencil & paper often suffice for information worker contexts, and are generally more conducive to productive modeling sessions

• Enterprise architecture-related modeling, in contrast, should be done with integrated and repository-based tool suites

• Align with superplatform commitments, e.g.,

• If IBM-focused, for instance, Rational is an obvious candidate

• Microsoft-focused customers need tactical plans until Microsoft delivers a more comprehensive VSTS

• Oracle customers should revisit Oracle Developer Suite 10g, which includes Oracle Designer

• Organizations using a mix of DBMSs can benefit from using tools from specialists such as Embarcadero, Telelogic, and Visible Systems

• Explore open source-related modeling initiatives

• And expect very rapid open source modeling initiative expansion/evolution

Page 52: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

52Recommendations

Respect and accommodate inherent complexity

• Modeling is and will remain hard work

• Modeling is simpler and more effective when people can work with common techniques, tools, repositories, and collections of high-fidelity data models

• But the real world is increasingly complex and dynamic, and effective models must reflect those realities

• Politics and other inter-personal communication challenges are also not going away, especially in “virtual” organizations

• Neither over-simplify nor over-reach

• Suboptimal modeling and design decisions can cause much more damage in today’s SOA-centric world

• Means sub-optimally shifting the burden of knowledge

• Information worker-oriented power tools mean the potential for cognitive overreach is rapidly rising for people who (directly or indirectly) work with ambiguous or otherwise poorly-designed models

Page 53: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

53Conclusion

Data modeling is not just for databases anymore

• Data modeling is pivotal for analysis, visualization, communication, and collaboration

• Organizations that do incomplete or otherwise inadequate data modeling

• Will fail to fully exploit today’s leading tools, servers, and services

• Will not be able to comply with regulatory requirements, especially for information disclosure

• Data modeling is not easy but it has a very strong return on time investment

• It’s not optional, so enterprises need to do it well

• The timing and tools have never been better

Page 54: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

54Resources

Burton Group Content

• Business Process Modeling: Adding Value or Overhead?

• http://www.burtongroup.com/Content/doc.aspx?cid=838

• Data Modeling: Not Just for Databases Anymore

• http://www.burtongroup.com/content/doc.aspx?cid=732

• XML Modeling and Mapping: Tumultuous Transformation in the Grand Schema Things

• http://www.burtongroup.com/Content/doc.aspx?cid=122

• Model-Driven Development: Rethinking the Development Process

• http://www.burtongroup.com/Content/doc.aspx?cid=121

Related Resources

• John Carlis, Joseph Maguire. Mastering Data Modeling: A User-Driven Approach. Addison-Wesley, 2001.

• Jack Greenfield, Keith Short. Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools. Wiley, 2004.

• David C. Hay. Data Model Patterns: Conventions of Thought. Dorset House Publishing, 1995.

• Martin Fowler. UML Distilled (3rd ed.). Addison-Wesley, 2004.

Page 55: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

55

Data Model Examples

Basic wiki model

• Workspace: Workspace ID, Workspace title, E-mail post ID, ...

• Page: Page title, ...

• Workspace person: Date/time added, ...

• Person: E-mail ID, Password, First name, Last name, Time zone, ...

• Page view: Date/time, ...

• Page version: Version number, Date/time, Version content, ...

• Backlink: From, To
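The entities of the basic wiki model can be sketched as plain data classes. The entity and attribute names follow the slide, but the relationships shown here (a workspace owning pages and members, a page owning its versions, a backlink joining two pages by title) are assumptions, since the connecting lines of the diagram did not survive in this transcript. Password and Page view are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Person:
    email_id: str
    first_name: str
    last_name: str
    time_zone: str


@dataclass
class PageVersion:
    version_number: int
    created_at: datetime
    content: str


@dataclass
class Page:
    title: str
    # Assumed relationship: a page owns an ordered list of its versions.
    versions: list[PageVersion] = field(default_factory=list)

    def add_version(self, content: str, when: datetime) -> PageVersion:
        """Append the next version, numbering from 1."""
        version = PageVersion(len(self.versions) + 1, when, content)
        self.versions.append(version)
        return version


@dataclass
class WorkspacePerson:
    """Assumed to link a Person to a Workspace (membership)."""
    person: Person
    added_at: datetime


@dataclass
class Workspace:
    workspace_id: str
    title: str
    members: list[WorkspacePerson] = field(default_factory=list)
    pages: dict[str, Page] = field(default_factory=dict)


@dataclass(frozen=True)
class Backlink:
    """Assumed to record a link from one page to another, by title."""
    from_title: str
    to_title: str


ws = Workspace("ws-1", "Team wiki")
ada = Person("ada@example.com", "Ada", "Lovelace", "UTC")
ws.members.append(WorkspacePerson(ada, datetime(2006, 11, 30, 9, 0)))
home = ws.pages.setdefault("Home", Page("Home"))
home.add_version("Welcome", datetime(2006, 11, 30, 9, 5))
home.add_version("Welcome, see FAQ", datetime(2006, 11, 30, 9, 10))
link = Backlink(from_title="Home", to_title="FAQ")
```

Keeping Workspace person as its own type, rather than a bare list of people, leaves room for the membership-specific attributes that the fuller Socialtext model attaches to it.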

Page 56: Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things

56

Data Model Examples

Socialtext wiki model

• Site: URL, ...

• Workspace: Workspace ID, Workspace title, Workspace name, Logo image file name, Logo image URL, E-mail post ID, Technorati key, E-mail notify setting, Web services proxy URL, Weblog sort order, Display in My Workspaces, ...

• Page: Page title, Deleted status, ...

• Workspace person: Display My Favorites flag, Side pane position, Hyperlinks underlined, E-mail notification frequency, E-mail notification sort sequence, E-mail notification change types, ...

• Person: E-mail ID, Admin, Force password reset, First name, Last name, Wikiwyg editor enabled, Password, E-mail notification preferences, Time zone, ...

• Attachment: File name, Size, Date uploaded, ...

• Page view: Date/time, ...

• Category: Category name

• Page category

• My Favorites page

• Page version: Version number, Date/time, Version content

• Backlink: From, To

• Team favorites page

• Workspace navigation page