Application and Data Portability in Cloud Computing
The Cirrocumulus Way
Ajith Ranabahu and Amit Sheth
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) http://knoesis.org
Wright State University, Dayton OH, USA
Agenda
• Issues in Cloud computing and its adoption
• Vendor lock-in problem
– Understanding cloud heterogeneity
• Using Domain Specific Languages to implement application portability
• Using RDFS based modeling to implement data portability
• Two Example applications
• Where are the semantics ?
What is Cloud computing ?
Cheap
No upfront cost
Utility style Resources
Service oriented provisioning over the Web
APIs enabling programmatic access
Numerous support services
Data access (Amazon RDS, Amazon SimpleDB, Google Bigtable)
Automatic scaling (Amazon Elastic Beanstalk)
Did it change the world ? Why not ?
Multiple Issues in Wide Adoption
Almost always heterogeneous platforms
Supported programming languages
Data representation
Resource provisioning and Management Workflow
Writing one application is not enough !
Amazon EC2 accepts almost any language
Windows Azure supports .NET (other languages need workarounds)
Google App Engine only supports Java and Python
Vendor Lock-in
Users locked into clouds
Proprietary APIs
Limited language support
Custom tools and workflows
We need a better way of doing things !
Write applications in cloud agnostic ways : "Write once - Run on any cloud"
Access cloud resources with a uniform workflow
Move applications and data across clouds when the need arises
Understanding Heterogeneity : Where does it exist ?
Vertical and Horizontal
Vertical
Within the same type of clouds - Say Infrastructure service providers
Horizontal
Across different types of clouds - Say Infrastructure clouds and platform clouds
Some examples
Amazon EC2 vs Rackspace
Both are infrastructure providers
Process of starting a VM in EC2 is very different from doing the same in Rackspace
Google App Engine vs Windows Azure
Both are platforms
Support different languages (Java/Python vs C#/.NET)
Require using different custom libraries
Require adhering to different data models (document-oriented vs relational)
How can Portable Applications be developed ?
DSLs to the rescue !
Domain Specific Languages (DSL) are specialized, mini languages that address problems in a limited domain.
• Matlab (Mathematics)
• SQL (Data definitions and manipulation)
• Ant / Make (build scripts)
What 'Domain' are we talking about ?
Many !
Each domain has to have its own DSL
Some example domains of Interest (in the Cloud context)
Data driven mobile applications
Enterprise data retrieval applications
Statistical Scientific Workflows
What is the catch ?
DSLs are not universally applicable : useful only in a supported domain
DSLs force a top-down (model driven) development method
The case of the "smallest common set of features"
When using an abstraction over multiple platforms, only the smallest common set of features can be effectively supported.
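The "smallest common set" can be read as a set intersection. The sketch below is only an illustration: the platform names and feature lists are hypothetical placeholders, not measurements of any real cloud.

```ruby
require "set"

# Hypothetical feature sets, for illustration only.
PLATFORM_FEATURES = {
  "platform_a" => Set[:key_value_store, :blob_storage, :task_queue, :sql_database],
  "platform_b" => Set[:key_value_store, :blob_storage, :sql_database],
  "platform_c" => Set[:key_value_store, :blob_storage, :task_queue]
}

# Features an abstraction can rely on everywhere: the intersection of all sets.
def common_features(platforms)
  platforms.values.reduce { |acc, features| acc & features }
end

# Features a platform offers that fall outside the common set.
def platform_specific(platforms, name)
  platforms.fetch(name) - common_features(platforms)
end
```

With these placeholder sets, only the key-value store and blob storage would be safe for a portable abstraction to expose; everything else has to be handled per platform.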
It's not as serious as it sounds !
The number of unique features across the major platforms is quite small
[quantification needed !]
In case platform specific features are needed
Use the DSLs as boiler-plate code generators
Use Bison-like conditional code additions to insert specific code fragments
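One way to picture conditional code additions is a generator that emits common boilerplate and splices in hand-written fragments for one target only. This is a minimal sketch, not the MobiCloud implementation: the class `BeanGenerator`, its methods, and the target names `:gae` and `:local` are all hypothetical.

```ruby
# Hypothetical generator: boilerplate plus per-target verbatim fragments.
class BeanGenerator
  def initialize
    # target name => list of verbatim code fragments
    @fragments = Hash.new { |hash, target| hash[target] = [] }
  end

  # Register a fragment that is emitted only when generating for `target`.
  def fragment(target, code)
    @fragments[target] << code
    self
  end

  # Emit a Java class: common boilerplate, then this target's fragments.
  def generate(class_name, target)
    lines = ["public class #{class_name} {"]
    lines << "    private String time;"
    @fragments[target].each { |code| lines << "    #{code}" }
    lines << "}"
    lines.join("\n")
  end
end

generator = BeanGenerator.new.fragment(:gae, "// GAE-only helper goes here")
puts generator.generate("Todoitem", :gae)
```

The fragment appears only in the `:gae` output; generating for `:local` yields the plain boilerplate.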
Two Examples
MobiCloud
Generating hybrid applications that have pieces running on clouds and mobile devices
http://mobicloud.knoesis.org
SCALE : Scalable Cloud AppLication gEnerator
Generating statistical workflows for biologists
http://metabolink.knoesis.org/SCALE
What kind of Heterogeneity are we talking about ?
An Example for a simple class [ Code generated by MobiCloud for task manager example ]

Code for Google App Engine version

    package bean;

    import com.google.appengine.api.datastore.Key;

    import javax.jdo.annotations.IdGeneratorStrategy;
    import javax.jdo.annotations.IdentityType;
    import javax.jdo.annotations.PersistenceCapable;
    import javax.jdo.annotations.Persistent;
    import javax.jdo.annotations.PrimaryKey;

    @PersistenceCapable(identityType = javax.jdo.annotations.IdentityType.APPLICATION)
    public class Todoitem {
        @PrimaryKey
        @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
        private Key key;

        public Key getKey() { return key; }

        @Persistent
        private java.lang.String time;

        public java.lang.String getTime() { return time; }

        public void setTime(java.lang.String local_time) { this.time = local_time; }
        ……..
    }

Code for Local / EC2 version

    package bean;

    public class Todoitem {
        /* PrimaryKey */
        private int key;

        public int getKey() { return key; }

        public void setKey(int localkey) { this.key = localkey; }

        private java.lang.String time;

        public java.lang.String getTime() { return time; }

        public void setTime(java.lang.String local_time) { this.time = local_time; }
        …
    }
model(:todoitem,{:time => :string,:location => :string,:description => :string,:name => :string})
The DSL code fragment relevant to the generated code
Adding the proper annotations and adhering to platform specific restrictions is taken care of by the generators !
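To make the generator idea concrete, here is a minimal sketch of how a `model(...)` call could be turned into the two class variants. It is an assumption-laden illustration, not MobiCloud's actual generator: `Model`, `generate_bean`, `JAVA_TYPES`, and the target names `:gae` and `:local` are hypothetical, and the package line, imports, and key accessors are left out for brevity.

```ruby
# Hypothetical type map; only :string appears in the example above.
JAVA_TYPES = { string: "java.lang.String" }

Model = Struct.new(:name, :fields)

# DSL entry point: records a model definition instead of generating right away.
def model(name, fields)
  Model.new(name, fields)
end

# Emit a Java bean for the given target (:gae adds the JDO annotations).
def generate_bean(model, target)
  gae = (target == :gae)
  out = []
  out << "@PersistenceCapable(identityType = IdentityType.APPLICATION)" if gae
  out << "public class #{model.name.to_s.capitalize} {"
  if gae
    out << "    @PrimaryKey"
    out << "    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)"
    out << "    private Key key;"
  else
    out << "    /* PrimaryKey */"
    out << "    private int key;"
  end
  model.fields.each do |field, type|
    java_type = JAVA_TYPES.fetch(type)
    accessor = field.to_s.capitalize
    out << "    @Persistent" if gae
    out << "    private #{java_type} #{field};"
    out << "    public #{java_type} get#{accessor}() { return #{field}; }"
    out << "    public void set#{accessor}(#{java_type} value) { this.#{field} = value; }"
  end
  out << "}"
  out.join("\n")
end

todo = model(:todoitem, { :time => :string, :location => :string })
puts generate_bean(todo, :gae)
puts generate_bean(todo, :local)
```

The developer writes the model once; the annotation and key-type differences live entirely in the generator.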
Porting data across clouds
Issues caused by the difference in data models
Often more crucial than porting the application code
We are in luck !
We already followed a model-driven development process
We get to define data at a higher level of abstraction
Data transformations can be generated along with the application code
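As a rough sketch of what a generated transformation might look like, the same field list can drive both a relational schema and a row-to-document conversion. The helper names, the `VARCHAR(255)` mapping, and the sample row are assumptions made for illustration; the field list is taken from the task manager model.

```ruby
require "json"

FIELDS = { time: :string, location: :string, description: :string, name: :string }

# Hypothetical mapping from model types to SQL column types.
SQL_TYPES = { string: "VARCHAR(255)" }

# Relational target: a CREATE TABLE statement derived from the field list.
def create_table_sql(table, fields)
  columns = fields.map { |name, type| "#{name} #{SQL_TYPES.fetch(type)}" }
  "CREATE TABLE #{table} (id INTEGER PRIMARY KEY, #{columns.join(', ')});"
end

# Document target: turn one relational row (id followed by the column
# values in field order) into a JSON document keyed by field name.
def row_to_document(fields, row)
  id, *values = row
  JSON.generate({ "id" => id }.merge(fields.keys.map(&:to_s).zip(values).to_h))
end

puts create_table_sql("todoitem", FIELDS)
puts row_to_document(FIELDS, [1, "09:00", "Dayton", "Prepare slides", "Talk"])
```

Because both outputs come from one definition, moving data between a relational store and a document store does not require a hand-written mapping.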
Our Choice for Data Definitions ?
RDF Schema (RDFS)
Developers need not know a lot of RDFS !
Graphical and textual abstractions can be provided to ease the learning curve
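A textual abstraction could be as small as a function that turns the model's field list into RDFS, so the developer never writes RDFS by hand. The sketch below emits Turtle; `model_to_rdfs` and the `http://example.org/model#` namespace are hypothetical, and mapping `:string` straight to `xsd:string` is a simplifying assumption.

```ruby
# Emit an RDFS description (Turtle syntax) for one model.
# The default namespace is a placeholder, not a real vocabulary.
def model_to_rdfs(name, fields, base = "http://example.org/model#")
  klass = name.to_s.capitalize
  lines = [
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    "@prefix ex: <#{base}> .",
    "",
    "ex:#{klass} a rdfs:Class ."
  ]
  # One property per field, with the model class as domain.
  fields.each do |field, type|
    lines << "ex:#{field} a rdf:Property ;"
    lines << "    rdfs:domain ex:#{klass} ;"
    lines << "    rdfs:range xsd:#{type} ."
  end
  lines.join("\n")
end

puts model_to_rdfs(:todoitem, { :time => :string, :location => :string })
```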
Where are the semantics ?
The four types of semantics for Clouds
In defining data structures
RDF/RDFS is already considered 'semantic'
Enhancing functional aspects
Add business policies via rules
Add non-functional enhancements
Security and reliability via policies
Describe System configurations
ECML, EDML, EMML by Elastra
Inspired by the 4 types of semantics in services by Shivashanmugam, Sheth
http://knoesis.org/library/resource.php?id=00186
Possible Semantic usage in MobiCloud
What is left to do ?
Hassle-free deployment is a problem
Need a comprehensive middleware platform to fix it
Conclusion
Portability (both application and data) in Cloud computing is an important problem to solve
Using DSLs and semantic abstractions is a viable solution to the portability problem in many domains
There is a long way to go but things seem promising !
Demonstration
Questions ?
Thank you