the vietnam of computer science

Upload: gteodorescu

Post on 01-Jun-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/9/2019 The Vietnam of Computer Science

    1/21

    The Vietnam of Computer Science

    (Two years ago, at Microsoft's TechEd in San Diego, I was involved ina conversation at an after-conference event with Harry Pierson and

    Cleens !asters, and as is ty"ical when the three of #s gettogether, architect#ral to"ics were at the forefront of o#rdisc#ssions$ %n crowd gathered aro#nd #s, and it t#rned into ani"ro"t# &irds-of-a-feather session$ The s#&ect ofo&ectrelational a""ing technologies cae #", and it was thereand then that I )rst coined the "hrase, *+&ectrelational a""ing isthe !ietna of Co"#ter Science*$ In the intervening tie, I'vereceived n#ero#s re#ests to esh o#t the disc#ssion &ehind thatstateent, and given Microsoft's recent anno#nceent regarding*entity s#""ort* in %D+$.ET /$0 and the acce"tance of the 1avaPersistence %PI as a re"laceent for &oth E12 Entity 2eans and 1D+,

    it seeed tie to do e3actly that$4

    No armed confict in US history haunts the American military morethan $g(Vietnam). So many divergent elements coalesced to createthe most decisive turning point in modern American history that itdees any layman's attempt to tease them apart. And yet the storyo! Vietnam is !undamentally a simple one" #he United States egana military pro%ect &ith simple yet unclear and conficting goals anduicly ecame enmeshed in a uagmire that not only roughtdo&n t&o governments (one legally one through !orce o! arms) utalso deeply scarred American military doctrine !or the net !ourdecades (at least).

    Although it may seem trite to say it $g(Object/RelationalMapping) is the Vietnam of Computer Science. *t represents auagmire &hich starts &ell gets more complicated as time passesand e!ore long entraps its users in a commitment that has no cleardemarcation point no clear &in conditions and no clear eitstrategy.

    Histor

    +,S has a good synopsis o! the &ar ut !or those &ho are moreinterested in -omputer Science than +olitical/ilitary 0istory theshort version goes lie this"$g(South *ndochina) no& no&n as Vietnam #hailand 1aos and-amodia has a long history o! struggle !or autonomy. ,e!ore2rench colonial rule (&hich egan in the mid34566s) South*ndochina &restled !or regional independence !rom -hina. 7uring8orld 8ar #&o the 9apanese conuered the area only to e later:lierated: y the Allies leading 2rance to resume their colonial rule

    (as did the ,ritish in their colonial territories else&here in Asia and*ndia). 2ollo&ing 88** ho&ever the people o! South *ndochina

    http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspxhttp://blogs.tedneward.com/ct.ashx?id=33e0e84c-1a82-4362-bb15-eb18a1a1d91f&url=http%3A%2F%2Fwww.pbs.org%2Fbattlefieldvietnam%2Fhistory%2Findex.htmlhttp://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspxhttp://blogs.tedneward.com/ct.ashx?id=33e0e84c-1a82-4362-bb15-eb18a1a1d91f&url=http%3A%2F%2Fwww.pbs.org%2Fbattlefieldvietnam%2Fhistory%2Findex.html

  • 8/9/2019 The Vietnam of Computer Science

    2/21

    having thro&n o; one oppressor etended their anti3occupatione;orts to ght the 2rench instead o! the 9apanese and in 4 the2rench capitulated signing the $g(?eneva +eace Accords) to!ormally grant Vietnam its independence. Un!ortunately gloalpressures perverted the e;orts some&hat and instead o! a lasting

    peace agreement a temporary solution &as created dividing thenation at the 4@th parallel creating t&o nations &here !ormerly nosuch division eisted. lections &ere to e held in 4

  • 8/9/2019 The Vietnam of Computer Science

    3/21

    path &as leading to success and came to elieve that moreaggressive action &as needed. SeiEing on a duious incident in&hich Vietnamese patrol oats attaced American destroyers4 in the$g(?ul! o! #onin) 9ohnson used pro3&ar sentiment in -ongress topass a resolution that gave him po&ers to conduct military action

    &ithout an eplicit declaration o! &ar. #o put it simply 9ohnson&anted to ght this &ar :in cold lood:" :#his meant that America&ould go to &ar in Vietnam &ith the precision o! a surgeon &ith littlenoticeale impact on domestic culture. A limited &ar called !orlimited moiliEation o! resources material and human and causedlittle disruption in everyday li!e in America.: (source) *n essence it&ould e a &ar &hose only impact &ould e !elt y the Vietnamese33American li!e and society &ould go on &ithout any notice o! theevents in Vietnam thus leaving 9ohnson to pursue his rst greatlove his :?reat Society: a domestic agenda designed to many o! US society's ills such as povertyF. 0istory o! course no&s etterand33perhaps cruelly33calls the Vietnam confict :9ohnson's 8ar:.*nitially it must e noted that Vietnam3as3disaster is a more recentperceptionG Americans polled as late as 4

  • 8/9/2019 The Vietnam of Computer Science

    4/21

    the negotiating tale in the rst placeG NVAV- leadershiprecogniEed that the NVAV- !orces despite staggering militarylosses that nearly roe them (several times) could simply continueto do as they &ere doing and &ring concessions !rom theAmericans &ithout o;ering any in return. Hunning on a plat!orm that

    consisted mostly o! a promise to :?et America out o! Vietnam: 9ohnson's successor Hepulican $g(Hichard Nion) tried severaltactics to ring pressure to the NVAV- !orces to argain includingincreased air3comat presence (such as the $g(-hristmas omings)and $g(Cperation /enu) ) and regular violations o! neary 1aos and-amodia pursuing the line o! supplies !rom North Vietnam to cellsin South Vietnam. Nothing &ored ho&ever and in 4

  • 8/9/2019 The Vietnam of Computer Science

    5/21

    the history o! modern society has ever een entirely truth!ul &ith itspopulation aout its !ortunes o! &arG one such eample includes (utis not limited to) the same US government's care!ul censorship o!activities during 8orld 8ar #&o !ty years earlier no&n inAmerican history as :the last 'good' &ar:. *t's also tempting to point

    to the lac o! a military o%ective as the crucial !ailing point o!Vietnam ut other non3military o%ectives have een success!ullyeecuted y the US and other governments &ithout the ind o!colossal !ailure accompanying Vietnam's story. /oreover it'simportant to note that the US did in !act have a clear o%ective in&hat it &anted out o! the confict in South *ndochina" to stop the !allo! the South Vietnam government and arring that the cessationo! the :spread: o! -ommunism. 8as it the reluctance o! the USgovernment to unleash the military to its !ullest capailities as$g(?eneral 8illiam 8estmoreland) al&ays claimedJ -ertainly the!ailure in Vietnam &as not a military oneG the casualty gures maeit clear that the US y any other measure &as clearly &inning.So &hat &ere the principal !ailures in VietnamJ And moreimportantly &hat does all this have to do &ith CH /appingJ

    Vietnam an O/R mapping

    *n the case o! Vietnam the United States political and militaryapparatus &as !aced &ith a deadly !orm o! the $g(1a& o!7iminishing Heturns). *n the case o! automated C%ectHelational/apping it's the same concern33that early successes yield acommitment to use CH3/ in places &here success ecomes moreelusive and over time isn't a success at all due to the overhead o!time and energy reuired to support it through all possile use3cases. *n essence the iggest lesson o! Vietnam33!or any grouppolitical or other&ise33is to no& &hen to :cut ait and run: asshermen say. #oo o!ten as &as the case in Vietnam it is easy to

     %usti!y !urther investment in a particular course o! action ysuggestion that aandoning that course someho& invalidates all the&or33or in Vietnam's case the lives o! American soldiers33that havealready een paid. +hrases lie :8e've gone this !ar surely &e can

    see this thing through: and :#o ac out no& is to thro& a&ayeverything &e've sacriced up until this point: ecomecommonplace. At least during the later deeply itter years o! thesecond hal! o! Vietnam uestions o! patriotism came into uestion"i! you didn't support the &ar you &ere clearly a traitor a-ommunist oviously :unAmerican: disrespect!ul o! all Americanveterans o! any &ar !ought on any soil !or &hatever reason and youproaly iced your dog to oot. (*t didn't help the protestors'cause that they lamed the soldiers !or the &ar holding themaccountale33sometimes personally33!or the decisions made ymilitary and political leaders most o! &hom neither the soldiers nor

    the protestors had ever met.)

  • 8/9/2019 The Vietnam of Computer Science

    6/21

    HecogniEing that all analogies !ail eventually and that the su%ect o! Vietnam is deeper than this essay can eamine there are stilllessons to e learned here in an entirely di;erent arena. Cne o! theey lessons o! Vietnam &as the danger o! &hat's collouially called:the Slippery Slope:" that a given course o! action might yield some

    early success yet !urther investment into that action yieldsdecreasingly commensurate results and increasily dangerousostacles &hose only solution appears to e greater and greatercommitment o! resources andor action. Some have called this :the7rug #rap: a!ter the &ay pharmaceuticals (legal or illegal) can havediminished e;ect a!ter prolonged use reuiring upped dosage inorder to yield the same results. Cthers call this :the 1ast /ile+rolem:" that as one nears the end o! a prolem it ecomesincreasingly diKcult in cost terms (oth monetary and astract) tond a 466L complete solution. All are asically speaing o! thesame thing33the diKculty o! nding an ans&er that allo&s our heroto :nish o;: the prolem in uestion completely and satis!actorily.8e egin the analysis o! C%ectHelational /apping33and itsrelationship to the Second South *ndochina 8ar33y eamining thereasons !or it in the rst place. 8hat drives developers a&ay !romusing traditional relational tools to access a relational dataase andto pre!er instead tools such as CH3/'sJ

    The Object*Relational +mpeence Mismatch

     #o say that o%ects and relational data sets are someho&

    constructed di;erently is typically not a surprise to any developer&ho's ever used othG ecept in etremely simplistic situations itecomes !airly ovious to recogniEe that the &ay in &hich arelational data store is designed is sutly33and yet pro!oundly33di;erent than ho& an o%ect system is designed.C%ect systems are typically characteriEed y !our asiccomponents" identity  state &ehavior  and enca"s#lation. *dentity isan implicit concept in most C3C languages in that a given o%ecthas a uniue identity that is distinct !rom its state (the value o! itsinternal elds)33t&o o%ects &ith the same state are still separateand distinct o%ects despite eing it3!or3it mirrors o! one another.

     #his is the :identity vs. euivalence: discussion that occurs inlanguages lie -MM - or 9ava &here developers must distinguishet&een :a OO : and :a.euals():. #he ehavior o! an o%ect is!airly easy to see a collection o! operations clients can invoe tomanipulate eamine or interact &ith o%ects in some !ashion. (#hisis &hat distinguishes o%ects !rom passive data structures in aprocedural language lie -.) ncapsulation is a ey detailpreventing outside parties !rom manipulating internal o%ect detailsthus providing evolutionary capailities to the o%ect's inter!ace toclients.I. 2rom this &e can derive more interesting concepts such as

    ty"e a !ormal declaration o! o%ect state and ehavior associationallo&ing types to re!erence one another through a light&eight

  • 8/9/2019 The Vietnam of Computer Science

    7/21

    re!erence rather than complete y3value o&nership (sometimescalled composition) inheritance the aility to relate one type toanother such that the relating type incorporates all o! the relatedtype's state and ehavior as part o! its o&n and "olyor"his theaility to sustitute an o%ect in &here a di;erent type is epected.

    Helational systems descrie a !orm o! no&ledge storage andretrieval ased on predicate logic and truth statements. *n essenceeach ro& &ithin a tale is a declaration aout a !act in the &orldand SP1 allo&s !or operator3eKcient data retrieval o! those !actsusing predicate logic to create in!erences !rom those !acts. Q7ate6>Rand Q2ussellR dene the relational model as characteriEed yrelation attrite t#"le relation val#e and relation varia&le. Arelation is at its heart a truth predicate aout the &orld astatement o! !acts (attrites) that provide meaning to thepredicate. 2or eample &e may dene the relation :+HSCN: asSSN Name -ityT &hich states that :there eists a +HSCN &ith aSocial Security Numer SSN &ho lives in -ity and is called Name:.Note that in a relation attriute ordering is entirely unspecied. At#"le is a truth statement &ithin the contet o! a relation a set o!attriute values that match the reuired set o! attriutes in therelation such as :+HSCN SSNO'4FI3>=3B@5=3B@53=3B

  • 8/9/2019 The Vietnam of Computer Science

    8/21

    oriented and procedural programming ut that's another argument!or another day.) CH mappings can tae place in a variety o! !ormsthe easiest o! &hich to recogniEe is the automated CH mappingtool such as $g(#op1in) $g(0iernate) $g(N0iernate) or$g(?entle.N#). Another !orm o! mapping is the hand3coded one in

    &hich programmers use relational3oriented tools such as 97,- orA7C.N# to access relational data and etract it into a !orm morepleasing to o%ect3minded developers :y hand:. A third is to simplyaccept the shape o! the relational data as :the: model !rom &hich tooperate and slave the o%ects around it to this approachG this is alsono&n in the patterns leicon as #ale 7ata ?ate&ay Q+AA 4>>R orHo& 7ata ?ate&ay Q+AA 4=FRG many data3access layers in oth

     9ava and .N# use this approach and comine it &ith code3generation to simpli!y the development o! that layer. Sometimes &euild o%ects around the relationaltale model put some additionalehavior around it and call it Active Hecord Q+AA 4B6R.*n truth this asic approach33to slave one model into the terms andapproach o! the other33has een the traditional ans&er to theimpedance mismatch e;ectively :solving: the prolem y ignoringone hal! o! it. Un!ortunately most development e;orts lie theDennedy Administration aren't &illing to see this through to itslogical conclusion &ith a &holesale commitment to one approachover the other. 2or eample &hile most development teams &oulde happy to adopt an :o%ects3only: approach doing so at thestorage level implies the use o! an C%ect Criented 7ata,ase/anagement System (CC7,/S) a topic that !reuently has no

    traction &ithin upper management or the corporate datamanagement team. #he opposite approach33a :relational3only:approach33is almost nonsensical to consider given the technology o! the day at the time this &as &ritten>.?iven that it's impossile then to :unleash the o%ects to their!ullest capailities: as ?eneral 8estmoreland might call it &e'rele!t &ith some ind o! hyrid o%ect3to3relational mapping approachpre!eraly one that's automated as much as possile so thatdevelopers can !ocus on their 7omain /odel rather than on thedetails o! the o%ect3to3tale(s) mapping. And here un!ortunately is&here the potential uagmire egins.

    The Object*to*Table Mapping &roblem

    Cne o! the rst and most easily3recogniEale prolems in usingo%ects as a !ront3end to a relational data store is that o! ho& to mapclasses to tales. At rst it seems a !airly straight!or&ard eercise33tales map to types columns to elds. ven the eld types appearto line up directly against the relational column types at least to a!airly isomorphic degree" VAH-0AHs to Strings *N#?Hs to intsand so on. So it maes sense that !or any given class dened in the

    system a corresponding tale33liely to e o! the same or closelyrelated name33is dened to go &ith it. Cr perhaps i! the o%ect code

  • 8/9/2019 The Vietnam of Computer Science

    9/21

    is eing &ritten to an already eisting schema then the class mapsto the tale.,ut as time progresses it's only natural that a &ell3trained o%ect3oriented developer &ill see to leverage inheritance in the o%ectsystem and see &ays to do the same in the relational model.

    Un!ortunately the relational model does not support any sort o!polymorphism or *S3A ind o! relation and so developers eventuallynd themselves adopting one o! three possile options to mapinheritance into the relational &orld" tale3per3class tale3per3concrete3class or tale3per3class3!amily. ach o! them carriespotentially signicant dra&acs.

     #he tale3per3class approach is perhaps the most easily understood!or it sees to minimiEe the :distance: et&een the o%ect modeland the relational modelG each class in the inheritance hierarchygets its o&n relational tale and o%ects o! derived types arestitched together !rom relational 9C*Ns across the variousinheritance3ased tales. So !or eample i! an o%ect model hasthe ase class +erson &ith Student derived !rom +erson and?raduateState derived !rom Student then there &ill e three talesreuired to hold this model +HSCN S#U7N# and?HA7UA#S#U7N# each holding the elds corresponding to theclass o! the same name. Helating these tales together ho&everreuires each to have an independent primary ey (one &hose valueis not actually stored in the o%ect entity) so that each derived classcan have a !oreign ey relation to its superclass's tale. #he reason!or this is clear" a ?raduateStudent o%ect y virtue o! its *S3A

    relationship to Student and +erson is a collection o! all three sets o!state and the distinction et&een the classes is largely removed ythe time an o%ect o! this type is created33in oth 9ava and .N# !oreample the o%ect itsel! is a chun o! memory that holds theinstance elds dened in all o! its classes and superclasses along&ith a pointer to the tale o! methods dened y that samehierarchy. #his means that &hen uerying !or a particular instanceat the relational level at least three 9C*Ns must e made in order toring all o! the o%ect's state into the o%ect program's &oringmemory.Actually it gets &orse than that33i! the o%ect hierarchy continues to

    gro& say to include +ro!essor Sta; Undergrad (inherits !romStudent) and a &hole hierarchy o! Ad%unctmployees (inheriting!rom Sta;) and the program &ants to nd all +ersons &hose lastname is Smith then 9C*Ns must e done !or every derived class inthe system since the semantics o! :nd all +ersons: means that theuery must see data on the +HSCN tale ut then do anepensive set o! 9C*Ns to ring in the rest o! the data !rom acrossthe rest o! the dataase pulling in the +HC2SSCH tale to !etchthe rest o! the data not to mention the UN7H?HA7A79U-#/+1C S#A22 and other tales. -onsidering that 9C*Ns

    are among the most epensive epressions in H7,/S ueries thisis clearly not something to undertae lightly.

  • 8/9/2019 The Vietnam of Computer Science

    10/21

    As a result developers typically adopt one o! the other t&oapproaches more comple in outloo ut more eKcient &hendealing &ith relational storage" they either create a tale perconcrete (most3derived) class pre!erring to adopt denormaliEationand its costs or else they create a single tale !or the entire

    hierarchy o!ten in either case creating a discriminator column toindicate to &hich class each ro& in the tale elongs. (Varioushyrids o! these schemes are also possile ut typically don'tcreate results that are signicantly di;erent !rom these t&o.)Un!ortunately the denormaliEation costs are o!ten signicant !or alarge volume o! data andor the tale(s) &ill contain signicantamounts o! empty columns &hich &ill need NU11aility constraintson all columns eliminating the po&er!ul integrity constraints o;eredy an H7,/S.*nheritance mapping isn't the end o! itG associations et&eeno%ects the typical 4"n or m"n cardinality associations so commonlyused in oth SP1 andor U/1 are handled entirely di;erently" ino%ect systems association is unidirectional !rom the associator tothe associatee (meaning the associated o%ect(s) have no idea theyare in !act associated unless an eplicit idirectional association isestalished) &hereas in relational systems the association isactually reversed !rom the associatee to the associator (via !oreigney columns). #his turns out to e surprisingly important as itmeans that !or m"n associations a third tale must e used to storethe actual relationship et&een associator and associatee and even!or the simpler 4"n relationships the associator has no inherent

    no&ledge o! the relations to &hich it associates33discovering thatdata reuires a 9C*N against any or all associated tales at somepoint. (8hen to actually retrieve that data is a su%ect o! somedeate33see the 1oading +arado elo&).

    The Schema*O,nership Con-ict

    7iscussions o! inheritance3to3tale and association mappingschemes also reveals a asic fa&" At heart many o%ect3relationalmapping tools assume that the schema is something that can edened according to schemes that help optimiEe the CH3/'sueries against the relational data. ,ut this elies a asic prolemthat o!ten the dataase schema itsel! is not under the direct controlo! developers ut instead is o&ned y another group &ithin thecompany typically the dataase administration (7,A) group. #o&hom does responsiility !or designing the dataase33and deciding&hen schema changes are permissile33elongJ*n many cases developers egin a ne& pro%ect &ith a :clean slate:an empty relational dataase &hose schema is theirs to dene asthey see t. ,ut soon a!ter the pro%ect has shipped (sometimeseven earlier than that due to political andor :tur! &ar: issues) it

    ecomes apparent that the developers' o&nership o! the schema istemporary at est33various departments egin clamoring !or reports

  • 8/9/2019 The Vietnam of Computer Science

    11/21

    against the dataase 7,As are held accountale to theper!ormance o! the dataase therey giving them cause to call !or:re!actoring: and denormaliEation o! the data and otherdevelopment teams may start inuiring aout ho& they might maeuse o! the data stored therein. ,e!ore too long the schema must e

    :!roEen: therey potentially creating a arrier to o%ect modelre!actoring (see #he -oupling -oncern elo&). *n addition theseother teams &ill epect to see a relational model dened inrelational terms not one &hich supports an entirely orthogonal !ormo! persistence33!or eample the :discriminator: column !rom the*nheritance3to3#ale /apping +rolem &ill represent diKculties andargualy e all ut unusale to relational report generators such as-rystal Heports. Unless developers are &illing to &rite all reports(and their U*s and their printing code and their ad3hoccapailities...) y hand this is usually going to e an unacceptalestate o! a;airs.(#o e !air this is not so much a technical prolem as it is a politicalprolem ut it still represents a serious prolem regardless o! itssource33or solution. And as such it still represents an impediment toan o%ectrelational mapping solution.)

    The .ual*Schema &roblem

    A related issue to the uestion o! schema o&nership is that in anCH3/ solution the metadata to the system is held !undamentally int&o di;erent places" once in the dataase schema and once in the

    o%ect model (another schema i! you &ill epressed in 9ava or -instead o! 771). Updates or re!actorings to one &ill liely reuiresimilar updates or re!actorings to the other. He!actoring code tomatch dataase schema changes is &idely considered to e theeasier o! the t&o33re!actoring the dataase !reuently reuires someind o! migration andor adaptation o! data already &ithin thedataase &here code has no such reuirement. (C%ects at least inthis discussion are ephemeral in3memory instances that &illdisappear once the process holding them terminates. *! the o%ectsare stored in some ind o! o%ect !orm that can persist acrossprocess eecution33such as serialiEed o%ect instances stored todis33then re!actoring o%ects ecomes eually prolematic.)/ore importantly &hile it's not uncommon !or code to e deployedspecically to a single application !reuently dataase instancesare used y more than one application and it's !reuentlyunacceptale to usiness to trigger a company3&ide re!actoring o!code simply ecause a re!actoring on one application reuires asimilar dataase3driven re!actoring. As a result as the system gro&sover time there &ill e increasing pressure on the developers to :tieo;: the o%ect model !rom the dataase schema such that schemachanges &on't reuire similar o%ect model re!actorings and vice

    versa. *n some cases &here the CH3/ doesn't permit suchdisconnection an entirely private dataase instance may have to e

  • 8/9/2019 The Vietnam of Computer Science

    12/21

    deployed &ith the eact schema the CH3/3ased solution &as uiltagainst creating yet another silo o! data in an *# environment &herepressure is uilding to red#ce such silos.

    'ntit +entit +ssues

    As i! these prolems &eren't enough &e then &al into anotherprolem that o! identity  o! o%ects and relations. As noted aoveo%ect systems use an implicit sense o! identity typically ased onthe o%ect's location in memory (the uiuitous this pointer)Galternatively this is sometimes re!erred to as an +ID (C%ect*7entier) usually in systems &hich don't directly epose memorylocations such as the o%ect dataase (&here an in3memory pointeris pretty useless as an identier outside o! the dataase process). *na relational model ho&ever identity is implicit in the state itsel!33

    t&o ro&s &ith the eact same state are typically considered arelational data corruption as the same !act asserted t&ice isredundant and counterproductive. #o e !air &e should e a itmore eplicit hereG a relational system can in !act permit duplicatetuples (as descried aove) ut this is o!ten eplicitly disallo&ed yeplicit relational constraints such as +H*/AH D constraints. *nthose situations &here duplicate values are allo&ed there is no &ay!or a relational system to determine &hich o! the t&o duplicate ro&sare eing retrieved33there is no implicit sense o! identity to therelation ecept that o;ered y its attriutes. #he same is not true o!o%ect systems &here t&o o%ects that contain precisely identical

    it patterns in t&o di;erent locations o! memory are in !act separateo%ects. (#his is the reason !or the distinction et&een :OO: and:.euals(): in 9ava or -.) #he implication here is simple" i! the t&osystems are going to agree on the sense o! identity the relationalsystem must o;er some ind o! uniue identity concept (usually anauto3incrementing integer column) to match that o! the notion o!o%ect identity.

     #his causes some serious concerns regarding automated CHsystems ecause the sense o! identity is entirely di;erent33i! t&oseparate user sessions interact &ith the same relation in storagethe relational dataase system's concurrency systems ic in andensure some !orm o! concurrent access typically via thetransactional metaphor (A-*7). *! an CH system retrieves a relationout o! storage (essentially !orming a :vie&: over the data) &e no&have a second source o! data identity one in the dataase(protected y the a!orementioned transactional scheme) and one inthe in3memory o%ect representation o! that data &hich has noconsistent transactional support aside !rom that uilt into thelanguage (such as the monitors concept in 9ava and .N#) orliraries (such as System.#ransactions in .N# F.6) either o! &hichcan e33and un!ortuantely !reuently are33easily ignored y

    developers. /anaging isolation and concurrency is not an easyprolem to solve and un!ortunately the languages and plat!orms

  • 8/9/2019 The Vietnam of Computer Science

    13/21

    commonly availale to developers aren't yet as consistent or feileas the dataase transaction metaphor.8hat complicates this prolem !urther is that many CH systemsintroduce signicant caching support into the CH layer (usually inan attempt to improve per!ormance and avoid round3trips to the

    dataase) and this in turn presents some prolems particularly i!the caching system is not a &rite3through cache" &hen does theactual :fush: to the dataase tae place and &hat does this sayaout transactional integrity i! the application code elieves the&rite to have occurred &hen in !act it hasn'tJ #his prolem in turnonly compounds &hen the CH system runs in multiple processes in!ront o! the dataase engine commonly !ound in clustered or!armed application server scenarios. No& the data identity is spreadacross n56 locations n eing the numer o! application servernodes and 6 eing the dataase itsel!. ach node must someho&signal its intent to do an update to the other nodes in order tootain some ind o! concurrency construct to prevent simultaneousaccess (y another instance o! the same session or y an instanceo! a di;erent session accessing the same data) &hich taes timeilling per!ormance. ven in the case o! a read3only cache updatesto the data store must someho& e signaled to the caches runningin the application server nodes reuiring server3to3clientcommunication originating !rom the dataaseG support !or this is not&ell3understood or documented in the current crop o! modernrelational dataases.

    The .ata Retrieal Mechansim Concern

    So once the entity is stored &ithin the dataase ho& eactly do &eretrieve itJ *n all honesty a purely o%ect3oriented approach &ouldmae use o! o%ect approaches !or retrieval ideally usingconstructor3style synta identi!ying the o%ect(s) desired utun!ortunately constructor synta isn't generic enough to allo& !orsomething that feileG in particular it lacs the aility to initialiEe acollection o! o%ects and ueries !reuently need to return acollection rather than %ust a single entity. (/ultiple trips to thedataase to !etch entities individually is generally considered too&aste!ul in oth latency and and&idth to consider credily as analternative33see the 1oad3#ime +arado elo& !or more.) As aresult &e typically end up &ith one o! Puery3,y3ample (P,)Puery3,y3A+* (P,A) or Puery3,y31anguage (P,1) approaches.A P, approach states that you ll out an o%ect template o! thetype o! o%ect you're looing !or &ith elds in the o%ect set to aparticular value to use as part o! the uery3ltration process. So !oreample i! you're uerying the +erson o%ecttale !or people &iththe last name o! Smith you set up the uery lie so"+erson p O ne& +erson()G assumes all elds are set to null y de!aultp.1astName O :Smith:GC%ect-ollection oc O Pueryecutor.eecute(p)G

  • 8/9/2019 The Vietnam of Computer Science

    14/21

    The problem with the &+ approach is obvious# while it-s perfectly sufficient forsimple ueries, it-s not nearly epressive enough to support the more comple style ofuery that freuently we need to eecute/find all 0ersons named %mith orCromwell/ and /find all 0ersons 12T named %mith/ are two eamples. 3hile it-s notimpossible to build &+ approaches that handle this (and more comple scenarios), it

    definitely complicates the 405 significantly. 6ore importantly, it also forces thedomain ob$ects into an uncomfortable positionthey must  support nullablefieldsproperties, which may be a violation of the domain rules the ob$ect wouldotherwise see* to supporta 0erson without a name isn-t a very useful ob$ect, in manyscenarios, yet this is eactly what a &+ approach will demand of domain ob$ectsstored within it. (0ractitioners of &+ will often argue that it-s not unreasonable foran ob$ect-s implementation to ta*e this into account, but again this is neither easy norfreuently done.)As a result usually the second step is to have the o%ect systemsupport a :Puery3,y3A+*: approach in &hich ueries are constructedy uery o%ects usually something o! the !orm"Puery O ne& Puery()G.2rom(:+HSCN:).8here(  ne& uals-riteria(:+HSCN.1AS#NA/: :Smith:))GC%ect-ollection oc O Pueryecutor.eecute()G

    8ere, the uery is not based on an empty /template/ of the ob$ect to be retrieved, butoff of a set of /uery ob$ects/ that are used together to define a Commandstyle ob$ectfor eecuting against the database. 6ultiple criteria are connected using some *ind of

     binomial construct, usually /4nd/ and /2r/ ob$ects, each of which contain uniueCriteria ob$ects to test against. 4dditional filtrationmanipulation ob$ects can betagged onto the end, usually by appending calls such as /2rder+y( field-name)/ or/9roup+y( field-name)/. 5n some cases, these method calls are actually ob$ects

    constructed by the programmer and strung together eplicitly.7evelopers uicly note that the aove approach is (generally)much more verose than the traditional SP1 approach and certainstyles o! ueries (particularly the more unconventional %oins such asouter %oins) are much more diKcult33i! not impossile33to representin the P,A approach.Cn top o! this &e have a more sutle prolem that o! the relianceon developers' dicipline" oth the tale name (:+HSCN:) and thecolumn name in the criteria (:+HSCN.1AS#NA/:) are standardstrings taen as3is and !ed to the system at runtime &ith no sort o!validity3checing until then. #his presents a classic prolem in

    programming that o! the :!at3nger: error &here a developerdoesn't actually uery the :+HSCN: tale ut the :+HSCN: taleinstead. 8hile a uic unit3test against a live dataase instance &illreveal the error during unit3testing this presumes t&o !acts33thatthe developers are religious aout adopting unit3testing and thatthe unit3tests are run against dataase instances. 8hile the !ormeris slo&ly ecoming more o! a guarantee as more and moredevelopers ecome :test3in!ected: (orro&ing ?amma's and ,ec'schoice o! terminology) the latter is still entirely open to discussionand interpretation o&ing to the !act that setting3up and tearing3

    do&n the dataase instance appropriately !or unit tests is still

  • 8/9/2019 The Vietnam of Computer Science

    15/21

    diKcult to do in a dataase. (8hile there are a variety o! &ays tocircumvent this prolem !e& o! them seem to e in use.)8e're also !aced &ith the asic prolem that greater a&areness o!the logical33or physical33data representation is reuired on the parto! the developer33instead o! simply !ocusing on ho& the o%ects are

    related to one another (through simple associations such as arraysor collection instances) the developer must no& have greatera&areness o! the !orm in &hich the o%ects are stored leaving thesystem some&hat vulnerale to dataase schema changes. #his issometimes oviated y a hyrid approach et&een the t&o&herey the system &ill tae responsiility !or interpreting theassociations leaving the developer to &rite something lie this"Puery O ne& Puery()G2ield lastName2ield2rom+erson O +erson.class.get7eclared2ield(:lastName:)G.2rom(+erson.class).8here(ne& uals-riteria(lastName2ield2rom+erson:Smith:))G

    C%ect-ollection oc O Pueryecutor.eecute()G3hich solves part of the schemaawareness problem and the /fatfingering/ problem but still leaves the developer vulnerable to the concerns over verbosity and stilldoesn-t address the compleity of putting together a more comple uery, such as amultitable (or multiclass, if you will) uery $oined on several criteria in a variety ofways.So then the net tas is to create a :Puery3,y31anguage:approach in &hich a ne& language similar to SP1 ut :etter:someho& is &ritten to support the ind o! comple and po&er!ulueries normally supported y SP1G CP1 and 0P1 are t&o eampleso! this. #he prolem here is that !reuently these languages are a

    suset o! SP1 and thus don't o;er the !ull po&er o! SP1. /oreimportantly the CH layer has no& lost an important :selling point:that o! the :o%ects and only o%ects: mantra that egat it in therst placeG using a SP13lie language is almost %ust lie using SP1itsel! so ho& can it e more :o%ectish:J 8hile developers may notneed to e a&are o! the physical schema o! the data model (theuery language interpretereecutor can do the mapping discussedearlier) developers &ill need to e a&are o! ho& o%ect associationsand properties are represented &ithin the language and the suseto! the o%ect's capailities &ithin the uery language33!or eampleis it possile to &rite something lie thisJS1-# +erson p4 +erson pF2HC/ +erson80H p4.getSpouse() OO null

    AN7 pF.getSpouse() OO nullAN7 p4.is#hisAnAcceptaleSpouse(pF)AN7 pF.is#hisAnAcceptaleSpouse(p4)G

    5n other words, scan through the database and find all single people who find eachother acceptable. 3hile the /isThis4n4cceptable%pouse/ method is clearly a methodthat belongs on the 0erson class (each 0erson instance may have its own criteria bywhich to $udge the acceptability of another singleare they blonde, brunette, orredhead, are they ma*ing more than :;, a year, and so on), it-s not clear if

    eecuting this method is possible in the uery language, nor is it clear if it should be.ven for the most trivial implementations, a serious performance hit will be li*ely,

  • 8/9/2019 The Vietnam of Computer Science

    16/21

     particularly if the 2< layer must turn the relational column data into ob$ects in orderto eecute the uery. 5n addition, we have no guarantees that the developer wrote thismethod to be at all efficient, and no ways to enforce any sort of performanceawareimplementation.(-ritics &ill argue that this is a &orale prolem proposing t&o

    possile solutions. Cne is to encode the pre!erence data in aseparate tale and mae that part o! the ueryG this &ill result in ahideously complicated uery that &ill tae several pages in lengthand liely reuire a SP1 epert to untangle later &hen ne&pre!erential criteria &ant to e added. #he other is to encode this:acceptaility: implementation in a stored procedure &ithin thedataase &hich no& removes code entirely !rom the o%ect modeland leaves us &ithout an :o%ect:3ased solution &hatsoever33acceptale ut only i! you accept the premise that not allimplementation can rest inside the o%ect model itsel! &hich re%ectsthe :o%ects and nothing ut o%ects: premise &ith &hich many CHadvocates open their arguments.)

    The &artial*Object &roblem an the oa*Time&arao%

    *t has long een no&n that net&or traversal such as that done&hen maing a traditional SP1 reuest taes a signicant amounto! time to process. (Hough enchmars have placed this value atany&here !rom three to ve orders o! magnitude compared againsta simple method call on either the 9ava or .N# plat!orm=G roughly

    analogous i! it taes you t&enty minutes to drive to &or in themorning and &e call that the time reuired to eecute a localmethod call !our orders o! magnitude to that is roughly the time ittaes to travel to +luto or %ust shy o! !ourteen years one &ay.) #hiscost is clearly non3trivial so as a result developers loo !or &ays tominimiEe this cost y optimiEing the numer o! round trips and dataretrieved.*n SP1 this optimiEation is achieved y care!ully structuring the SP1reuest maing sure to retrieve only the columns andor talesdesired rather than entire tales or sets o! tales. 2or eample&hen constructing a traditional drill3do&n user inter!ace thedeveloper presents a summary display o! all the records !rom &hichthe user can select one and once selected the developer thendisplays the complete set o! data !or that particular record. ?iventhat &e &ish to do a drill3do&n o! the +ersons relational typedescried earlier !or eample the t&o ueries to do so &ould e inorder (assuming the rst one is selected)"S1-# id rstname lastname 2HC/ personG

    S1-# W 2HC/ person 80H id O 4G

    5n particular, ta*e notice that only the data desired at each stage of the process isretrievedin the first uery, the necessary summary information and identifier (for the

    subseuent uery, in case first and last name wouldn-t be sufficient to identify the person directly), and in the second, the remainder of the data to display. 5n fact, most

  • 8/9/2019 The Vietnam of Computer Science

    17/21

    %&' eperts will eschew the /=/ wildcard column synta, preferring instead to nameeach column in the uery, both for performance and maintenance reasons

     performance, since the database will better optimi>e the uery, and maintenance, because there will be less chance of unnecessary columns being returned as D+4s ordevelopers evolve andor refactor the database table(s) involved. This notion of being

    able to return a part of a table (though still in relational form, which is important forreasons of closure, described above) is fundamental to the ability to optimi>e theseueries this waymost ueries will, in fact, only reuire a portion of the completerelation.

     #his presents a prolem !or most i! not all o%ectrelationalmapping layers" the goal o! any CH is to enale the developer tosee :nothing ut o%ects: and yet the CH layer cannot tell !romone reuest to another ho& the o%ects returned y the uery &ille used. 2or eample it is entirely !easile that most developers &ill&ant to &rite something along the lines o!"+ersonQR all O Puery/anager.eecute(...)G

    +erson selected O 7isplay+ersons2orSelection(all)G7isplay+erson7ata(selected)G

    6eaning, in other words, that once the 0erson to be displayed has been chosen fromthe array of 0ersons, no further retrieval action is necessaryafter all, you have yourob$ect, what more should be necessary?

     #he prolem here is that the data to e displayed in the rst7isplay...() call is not  the complete +erson ut a suset o! that dataGhere &e !ace our rst prolem in that an o%ect3oriented system lie- or 9ava cannot return %ust :parts: o! an o%ect33an o%ect is ano%ect and i! the +erson o%ect consists o! 4F elds then all 4Felds &ill e present in every +erson returned. #his means that the

    system !aces one o! three uncom!ortale choices" one reuire that+erson o%ects must e ale to accomodate :nullale: eldsregardless o! the domain restrictions against thatG t&o return the+erson completely lled out &ith all the data comprising a +ersono%ectG or three provide some ind o! on3demand load that &illotain those elds i! and &hen the developer accesses those eldseven indirectly perhaps through a method call.(Note that some o%ect3ased languages such as -/AScript vie&o%ects di;erently than class3ased languages such as 9ava or -or -MM and as a result it is entirely possile to return o%ects &hich

    contain varying numers o! elds. #hat said ho&ever !e&languages possess such an approach not even everyody's !avoritedynamic3language poster child Huy and until such languagesecome &idespread such discussion remains outside the realm o!this essay.)2or most CH layers this means that o%ects andor elds o! o%ectsmust e retrieved in a laEy3loaded manner otaining the eld dataon demand ecause retrieving all o! the elds o! all o! the +ersono%ectsrelations &ould :clearly: e a huge &aste o! and&idth !orthis particular scenario. #ypically the o%ect's entire set o! elds &ille retrieved &hen any eld not3yet3returned is accessed. (#his

    approach is pre!erred to a eld3y3eld approach ecause there'sless chance o! the :NM4 uery prolem: in &hich retrieving all the

  • 8/9/2019 The Vietnam of Computer Science

    18/21

    data !rom an o%ect reuires 4 uery to retrieve the primary ey M Nueries to retrieve each eld !rom the tale as necessary. #hisminimiEes the and&idth consumed to retrieve data33no unaccessedeld &ill have its data retrieved33ut clearly !ails to minimiEenet&or round trips.)

    Un!ortunately elds &ithin the o%ect are only part o! the prolem33the other prolem &e !ace is that o%ects are !reuently associated&ith other o%ects in various cardinalities (one3to3one one3to3manymany3to3one many3to3many) and an CH mapping has to maesome up3!ront decisions aout &hen to retrieve these associatedo%ects and despite the est e;orts o! the CH3/'s developersthere &ill al&ays e common use3cases &here the decision made&ill e eactly the &rong thing to do. /ost CH3/'s o;er some indo! developer3driven decision3maing support usually some ind o!conguration or mapping le to identi!y eactly &hat ind o!retrieval policy &ill e ut this setting is gloal to the class and assuch can't e changed on a situational asis.

    Summar

    ?iven then that o%ects3to3relational mapping is a necessity in amodern enterprise system ho& can anyone proclaim it a uagmire!rom &hich there is no escapeJ Again Vietnam serves as a use!ulanalogy here33&hile the situation in South *ndochina reuired aresponse !rom the Americans there &ere a variety o! responsesavailale to the Dennedy and 9ohson Administrations including thesame ind o! response that the recent !all o! Suharto in /alaysiagenerated !rom the US &hich is to say none at all. (Hememerisenho&er and 7ulles didn't consider South *ndochina to e a parto! the 7omino #heory in the rst placeG they &ere !ar moreconcerned aout 9apan and urope.)Several possile solutions present themselves to the CH3/ prolemsome reuiring some ind o! :gloal: action y the community as a&hole some more approachale to development teams :in thetrenches:"

    ;. Abandonment. Developers simply give up on ob$ects entirely, and return to a

     programming model that doesn-t create the ob$ectrelational impedancemismatch. 3hile distasteful, in certain scenarios an ob$ectoriented approachcreates more overhead than it saves, and the

  • 8/9/2019 The Vietnam of Computer Science

    19/21

    serviceoriented world, which eschews the idea of direct data access butinstead reuires all access go through the service gateway thus encapsulatingthe storage mechanism away from prying eyes, it becomes entirely feasible toimagine developers storing data in a form that-s much easier for them to use,rather than D+4s.

    . Manual mapping. Developers simply accept that it-s not such a hard problemto solve manually after all, and write straight relationalaccess code to returnrelations to the language, access the tuples, and populate ob$ects as necessary.5n many cases, this code might even be automatically generated by a tooleamining database metadata, eliminating some of the principal criticism ofthis approach (that being, /5t-s too much code to write and maintain/).

    !. Acceptance of O/R-M limitations. Developers simply accept that there is noway to efficiently and easily close the loop on the 2< mismatch, and use an2

  • 8/9/2019 The Vietnam of Computer Science

    20/21

     1ote that this list is not presented in any particular orderB while some are moreattractive to others, which are /better/ is a value $udgment that every developer anddevelopment team must ma*e for themselves.

     9ust as it's conceivale that the US could have achieved somemeasure o! :success: in Vietnam had it ept to a clear strategy and

    understood a more clear relationship et&een commitment andresults (HC* i! you &ill) it's conceivale that the o%ectrelationalprolem can e :&on: through care!ul and %udicious application o! astrategy that is celarly a&are o! its o&n limitations. 7evelopers muste &illing to tae the :&ins: &here they can get them and not !allinto the trap o! the Slippery Slope y looing to create solutions thatincreasingly cost more and yield less. Un!ortunately as the historyo! the Vietnam 8ar sho&s even an a&areness o! the dangers o! theSlippery Slope is o!ten not enough to avoid getting ogged do&n ina uagmire. 8orse it is a uagmire that is simply too attractive topass up a Siren song that continues to dra& development teams!rom all siEes o! corporations (including those at /icroso!t *,/Cracle and Sun to name a !e&) against the rocs &ith spectacularresults. 1ash yoursel! to the mast i! you &ish to hear the song utlet the sailors ro&.

    'nnotes

    4 1ater analysis y the principals involved33including then3Secretaryo! 7e!ense Hoert /cNamara33concluded that hal! o! the attacnever even too place.F *t is perhaps the greatest irony o! the &ar that the man 2ateselected to lead during America's largest !oreign entanglement &asa leader &hose principal !ocus &as entirely aimed &ithin his o&nshores. 0ad circumstances not conspired other&ise the hippieschanting :0ey hey 1,9 ho& many oys did you ill today: outsidethe Cval CKce could very &ell have een 9ohnson's staunchestsupporters.I *ronically encapsulation !or purposes o! maintenance simplicityturns out to e a ma%or motivation !or almost all o! the ma%orinnovations in 1inguistic -omputer Science33procedural !unctional

    o%ect aspect even relational technologies (Q7ate6FR) and otherlanguages all cite :encapsulation: as ma%or driving !actors.> 8e could perhaps consider stored procedure languages lie #3SP1or +1SP1 to e :relational: programming languages ut even thenit's etremely diKcult to uild a U* in +1SP1.= *n this case * &as measuring 9ava H/* method calls against localmethod calls. Similar results are pretty easily otainale !or SP13ased data access y measuring out3o!3process calls against in3process calls using a dataase product that supports oth such as-loudscape7ery or 0SP1 (0ypersonic SP1).

    References

  • 8/9/2019 The Vietnam of Computer Science

    21/21

    01ussell2" 7o#ndations of +&ect 8elational Ma""ing y /ar 1.2ussell v6.F (ml!3