distributed computing - uio.no · distributed computing group 6: mathias hagen per Øyvind karlsen...

Distributed Computing

● Group 6:
● Mathias Hagen
● Per Øyvind Karlsen
● Cato Pakusch

Our presentation today

● Review of “A Note on Distributed Computing”
● Review of “Another Note on Distributed Computing”
● Historical discussion

A Note on Distributed Computing

● Article published in 1994 by Sun
● Discusses the contemporary (failed) approach to distributed systems
● The failure is rediscovered roughly every 10 years
● Fundamental error: Vision of Unified Objects
● Industry might lose heart

Vision of Unified Objects

● Treat everything as objects
● Treat everything as the same kind of objects
● Design your interfaces
● Concretize object location
● Test it: Object location does not alter correctness
● Conclusion: Object location irrelevant

Problems with the vision

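● Local and remote objects are not the same
● Latency: remote invocations are orders of magnitude slower (the paper says four to five)
● Memory access: a remote object lives in a different address space
● Partial failure: a local failure crashes the whole application, a remote failure may take down only part of it
● Concurrency: local control vs lack of control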
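The partial-failure point can be made concrete with a small sketch. It is an illustration only: `LocalCounter`, `FlakyRemoteCounter` and `increment_with_naive_retry` are hypothetical names, and the "remote" object simulates a lost reply in-process rather than using a real network.

```python
class LocalCounter:
    """Lives in the caller's address space: a call returns or the whole program fails."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


class FlakyRemoteCounter:
    """Stand-in for a remote object whose reply can be lost (hypothetical, no real network)."""

    def __init__(self, lose_reply_on_call):
        self.value = 0                       # state held by the "remote" side
        self._calls = 0
        self._lose_reply_on_call = lose_reply_on_call

    def increment(self):
        self._calls += 1
        self.value += 1                      # the remote side did the work...
        if self._calls == self._lose_reply_on_call:
            raise TimeoutError("no reply")   # ...but the caller never hears about it
        return self.value


def increment_with_naive_retry(counter):
    """Retry once on timeout, as code written with only local objects in mind might."""
    try:
        return counter.increment()
    except TimeoutError:
        return counter.increment()


local = LocalCounter()
remote = FlakyRemoteCounter(lose_reply_on_call=1)

local_result = increment_with_naive_retry(local)    # one increment, as intended
remote_result = increment_with_naive_retry(remote)  # the retry applies the increment twice
```

The caller asked for one increment in both cases, but the remote counter ends up at 2: after the timeout the caller cannot tell whether the work was done. A local call has no such failure mode.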

The solutions

● Solution 1: Treat all objects as remote, adds a layer of complexity to every object

● Solution 2: Treat all objects as local, results in unreliable distributed systems

● Solution 3: Accept the differences, mark objects as local or remote, provide better tools and frameworks

● Even better: Add a middle ground of “local-remote” objects, put them in a separate address space, provide a resource manager
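A minimal sketch of what Solution 3 could look like in code, assuming a marker base class and a dedicated exception type. All names here (`Remote`, `RemoteCallError`, `LocalCache`, `RemoteStore`, `lookup`) are hypothetical and not taken from the paper or from any framework.

```python
class RemoteCallError(Exception):
    """Failure mode that only remote-marked objects can raise (hypothetical)."""


class Remote:
    """Marker base class: a subclass declares in its type that it is remote."""


class LocalCache:
    """Plain local object: a lookup cannot fail independently of the caller."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class RemoteStore(Remote):
    """Stand-in for a remote object; `reachable` simulates the network."""

    def __init__(self, data, reachable=True):
        self._data = dict(data)
        self.reachable = reachable

    def get(self, key):
        if not self.reachable:
            raise RemoteCallError("store unreachable")
        return self._data.get(key)


def lookup(key, cache, store):
    """Try the local cache first, then the remote store, tolerating its failure."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    try:
        value = store.get(key)
    except RemoteCallError:
        return None          # degrade instead of crashing the whole application
    if value is not None:
        cache.put(key, value)
    return value


cache = LocalCache()
store = RemoteStore({"a": 1})
first = lookup("a", cache, store)     # fetched from the store, then cached
store.reachable = False
second = lookup("a", cache, store)    # served from the local cache
third = lookup("b", cache, store)     # store is down: handled, not fatal
```

Because remoteness is visible in the type, the caller knows which call needs failure handling and which does not, instead of treating every object as either remote (Solution 1) or local (Solution 2).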

Another Note on Distributed Computing

● Discussion triggered by the article: “A Note on Distributed Computing”

● Written 17 July 2008 by Michi Henning

What is right

● Agrees that the industry needs to accept that there are differences between local and distributed computing.

● Stating the obvious!

What is NOT right

● The three phases of writing a distributed application.

● Different APIs as the solution.

● “..unified view of objects is mistaken”

Conclusion

● “A Note on Distributed Computing” argues from false premises and arrives at conclusions that are not supported by these premises.

Our opinions on “A Note on Distributed Computing” - prologue

“Historically, the language approach has been the less influential of the two camps. Every ten years (approximately), members of the language camp notice that the number of distributed applications is relatively small. They look at the programming interfaces and decide that the problem is that the programming model is not close enough to whatever programming model is currently in vogue (messages in the 1970s [7], [8], procedure calls in the 1980s [9], [10], [11], and objects in the 1990s [1], [2]). A furious bout of language and protocol design takes place and a new distributed computing paradigm is announced that is compliant with the latest programming model. After several years, the percentage of distributed applications is discovered not to have increased significantly, and the cycle begins anew.”

Our opinions on “A Note on Distributed Computing” - epilogue

● Jim Waldo, Geoff Wyant, Ann Wollrath and Sam Kendall (of Sun) show profound insight in their paper

● The vision really starts to be realized with Java, released by Sun the following year (1995)

● Enterprise JavaBeans, initially developed by IBM in 1997, is where the full vision starts to be realized

● Rapid development
● Despite heavy initial criticism of increased complexity, the concept is not abandoned
● Sun in turn starts simplifying and reducing complexity

● DHT is introduced in 2001 and starts realizing decentralized resource management

● BitTorrent clients adopt DHT in 2005, leading to rapid deployment

● Spring 1.0 released in 2004
● Hibernate 2.1 wins Jolt Award in 2005
● EJB 3.0 is released in 2006, simplified further, influenced by Spring, with much of its functionality deprecated in favor of Hibernate

● EJB starts to reach maturity and gain acceptance

● Tribler BitTorrent client completes decentralization with DHT in 2008

Our conclusion on “A Note on Distributed Computing”

● Jim Waldo et al. show remarkable insight and foresight in their note, which is still fully relevant today

● Vision has been fully realized!
● The trend of abandoning concepts every tenth year in favor of new ones has been broken
● Object orientation and distributed computing have shown continuously increasing adoption and importance since, with new concepts and technologies built around them

● Distributed concepts are being picked up in other contexts and have rapidly grown in importance in recent years

● Correct road eventually found?

Another Note on Distributed Computing (or is it really?)

● The prologue giving the premise for the discussion has fundamental issues:

● Discussion takes place on a different level
● Focuses on very specific details, revolving around specific implementation(s)
● Biased
● Misleading rhetoric
● Draws poor conclusions
● Irrelevant (by its own criteria) given when it was published
● Lacks any visible, supportive discussion feedback
● In summary: It's severely FLAWED and off-topic!

Divide and conquer?

● Article starts out quoting and praising the paper:

● “Programming a distributed application will require the use of different techniques than those used for non-distributed applications. Programming a distributed application will require thinking about the problem in a different way than before it was thought about when the solution was a non-distributed application.”

● I could not agree more: I’ve been preaching for years that, if you go and design the APIs for a distributed application the same way as for a non-distributed one, you are likely to fall flat on your face.

● This seems like an attempt to break the ice by applauding what can be considered obvious rather than any real claim

● He continues to agree with other quotes that also state what are considered given facts rather than claims to dispute

● This all seems like rhetoric: the author gives the paper partial credit for insight and passes himself off as well balanced and credible, then tries to reduce the paper's credibility before actually discussing any subject.

“The authors do not make it clear what they mean by “share an address space”, and do not further explain what they mean by “local” and “remote”. To talk meaningfully about this vision of unified objects, we need to be clear about what kinds of objects there actually are:”

● Remotable objects. These are objects that can (but need not) reside in a different address space. If they do reside in a different address space, they can be reached only via inter-process communication, such as by sharing memory or sending messages over the backplane (for same-machine communication) or over a network (for communication with objects on other machines). If a remotable object is in the same address space, it offers the same interface as if it were remote. It just so happens that it can be reached via a more efficient communication mechanism.

● Local, language-native objects. These are the objects that come built into the programming language, such as C++ or Java objects. These objects have nothing to do with distribution.

● Partly right, but in the wrong context, so it misses the point. In the situation described, the method of transport, the payload type and the object definition would be handled by the appropriate framework and defined on a different layer, so that the developer can transparently treat both kinds as “regular” objects.

● For remotable objects, the implementation of operations is hidden behind their interface and, as far as the caller of an operation is concerned, the same API is used to invoke the operation, regardless of the actual location of the object (local or remote). However, that API is not necessarily the same as the API for a language-native object.

“The vision is that developers write their applications so that the objects within the application are joined using the same programmatic glue as objects between applications.”

● This suggests that the authors, when they talk about local objects, actually mean language-native objects. However, they also say that local objects have an interface declared in an interface definition language, which suggests collocated remotable objects.

● Here, as explained above, the intent is to abstract this away so that the developer does not have to worry about it. The developer's only concern in this regard is on a different level. If what the comment's author suggests were applied in the world of the original paper, the application's logic would turn into spaghetti across its layers.

● Now, while it is true that systems such as CORBA and Ice indeed strive to make distributed computing as frictionless as possible and to make a remote operation as easy to call as a local or language-native one, they do not try to paper over the difference between language-native objects and remotable objects. It just so happens that, if an object can be called remotely, it can be called the same way whether the object happens to be collocated in the same address space or not.

● False! Quoting from Wikipedia on CORBA: “CORBA's notion of location transparency has been criticized; that is, that objects residing in the same address space and accessible with a simple function call are treated the same as objects residing elsewhere (different processes on the same machine, or different machines). This notion is flawed if one requires all local accesses to be as complicated as the most complex remote scenario.”...

● Fully consistent with the original paper's wording.

● ...“However, CORBA does not place a restriction on the complexity of the calls. Many implementations provide for recursive thread/connection semantics. I.e. Obj A calls Obj B, which in turn calls Obj A back, before returning.”

● Michi Henning, among the lead developers of CORBA, might be referring to this?

● Ice, first released in 2003, is of course neither mentioned in nor relevant to what the original paper covers.

● This does not mean that a language-native object can be called the same way as a remotable one, or vice versa. In particular, in systems such as CORBA and Ice, remotable objects have a type that differs from the type of any language-native object, and pointers (or references) to remotable objects cannot be used interchangeably with language-native ones. In particular, invocations on remotable objects are made via proxy types (such as Ice’s Prx types), and these proxy types are not type compatible with a language-native pointer or reference.

● My understanding: Instead of unified objects with remote/local functions, there are different kinds of objects with different sets of functions.

● Goal and logic seem basically the same...
● Design is very different.
● In summary: irrelevant
● The paper mainly speaks of a unified vision based on human perception, not actual implementations.
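The distinction Henning draws between language-native objects and proxy types can be sketched roughly as follows. This is a hedged illustration, not Ice's or CORBA's API: `Printer`, `PrinterProxy`, `collocated_transport` and `loopback_transport` are hypothetical names, and the "remote" transport only marshals the call to bytes and back inside one process.

```python
import pickle


class Printer:
    """Language-native object: called directly, no distribution involved."""

    def print_line(self, text):
        return f"printed: {text}"


class PrinterProxy:
    """Hypothetical proxy: a distinct type that forwards calls through a transport."""

    def __init__(self, transport):
        self._transport = transport          # callable(operation, args) -> result

    def print_line(self, text):
        return self._transport("print_line", (text,))


def collocated_transport(servant):
    """Dispatch directly to a servant in the same address space."""
    def send(operation, args):
        return getattr(servant, operation)(*args)
    return send


def loopback_transport(servant):
    """Pretend-remote transport: marshal the call to bytes and back (no real network)."""
    def send(operation, args):
        wire = pickle.dumps((operation, args))
        op, unpacked = pickle.loads(wire)
        return getattr(servant, op)(*unpacked)
    return send


servant = Printer()
near = PrinterProxy(collocated_transport(servant))
far = PrinterProxy(loopback_transport(servant))

near_result = near.print_line("hi")          # same call API in both cases
far_result = far.print_line("hi")
proxy_is_native = isinstance(near, Printer)  # the proxy is not a Printer
```

The caller uses the same `print_line` call whether the servant is collocated or reached through the marshalling transport, yet the proxy remains a separate type from the native `Printer`. That is the design the comment describes, as opposed to one unified object type.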

● “Writing a distributed application in this model proceeds in three phases. The first phase is to write the application without worrying about where objects are located and how their communication is implemented.”

● What? Where on earth do they take this from? I cannot recall a single instance where anyone with even the least shred of credibility claimed such a thing, even in the dim-distant days of DCOM and CORBA. There is a big difference between making it easy to call a remote operation, and claiming that, because remote operations are easy to call, we can ignore object location when we design an application. If this is how people start out writing their applications, they are guaranteed to fail, and that fact has been well known and well documented for at least 15 years.

● The comment's author seems emotional and entangled in a world of his own
● His suggested model adds excessive complexity to basic application logic
● The paper authors' model is still based on the same vision
● Their model seems more logical and transparent to the developer
● Their model is more traditional and less complex (~KISS)
● Distributed aspects are abstracted more by the framework, so the developer needs to be concerned with them to a lesser extent, at appropriate and intuitive places.
● The models continue to differ fundamentally, making the comments on the paper more “emotional” and near erratic.

Different premises

● “A better approach is to accept that there are irreconcilable differences between local and distributed computing, and to be conscious of those differences at all stages of the design and implementation of distributed applications.”

● Yes! That is exactly how we should build distributed applications. We cannot forget—ever—when we are dealing with distribution and when we are not. That is true regardless of the technology we use for distribution, regardless of the specific APIs, and regardless of the underlying protocol. The differences are due to distribution itself, not due to any artifact of design or implementation.

● Comment author (Henning) has a dogmatic view
● Michi Henning: All parts of the application need to have distributed concerns
● The original paper author's (Waldo's) view is more visionary, hypothetically predictive
● Jim Waldo: Envisions more concerns being addressed by the framework
● Henning's comments illustrate that the premises differ in the design itself.

● As far as A Note on Distributed Computing is concerned, it argues from false premises and arrives at conclusions that are not supported by these premises. In fact, the paper is largely irrelevant to modern middleware such as Ice.

● This contradicts other modern middleware such as EJB, which supports EXACTLY this (a near divine prediction, given that it is supported fourteen years later!)

● As illustrated above, Henning seems to be the one arguing from false (or different) premises and arriving at unsupported (or different) conclusions.

● Distributed computing is hard enough as is, and Ice does its best to not make it harder still. But, ultimately, API style has little to do with the real reasons for why distributed computing is hard. What we need to accept, first and foremost, is that—regardless of APIs, technologies, whether interactions are synchronous or asynchronous, and whether we use objects or “services” (whatever those might be)—distributed computing is hard because it is distributed computing.

● Henning's dogma is strengthened further
● Nothing can be simplified
● One has to accept that it is hard and can only avoid making it harder
● const distributedComputing immutableTruth = hard;

Dogma reiterated

Now, does it matter whether I use CORBA, or Ice, or REST, or something else? You bet it does! But that I will make the topic of other posts

● Michi Henning makes it clear there are differences between implementations

● Reaffirms our position that Henning's comment is implementation-specific

● Strengthens the suggestion that Henning is biased toward his own product

● Our very subjective conclusion on “Another Note” is that it is rather an incoherent, erratic rant fueled by Michi Henning's ego as “Chief Scientist of ZeroC, co-author of several CORBA ORBs and the CORBA "bible" "Advanced CORBA Programming with C++"” (ref: http://zeroc.com) having been stepped on by Waldo's paper all the way back in 1994, with its predictions also coming true.
