Background
• Java since 1995
• Consulting since 1999
• Four Java Books
  • Hibernate: A J2EE Developer’s Guide, 2004
• CTO, Dynacron Group, 2010-Present
• Several NoSQL PoCs
• Brian and Dustin on details
Trends & Context
• Hibernate
• MyBatis
• NoSQL
• Big Data & NoSQL = related, but not the same thing
  • Both are now famous!
Hibernate ORM & MyBatis
• Hibernate
  • Map classes to tables
  • Use HQL/Criteria queries
  • Pitch: Don’t have to learn SQL!
  • Reality: Learn Java, SQL, and HQL!
  • Debugging very difficult
• MyBatis
  • Write interfaces
  • Add SQL via annotations
• For more on Hibernate vs. MyBatis, see video of last talk on SeaJUG.org
Most interesting: decoupling application from RDBMS
MyBatis Decoupling?
• Contract with data/persistence is just an interface
  • Easy to mock
  • Easy to understand
  • Easy to… replace…
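The "just an interface" point can be sketched in plain Java. This is a minimal illustration, not code from the talk: `Order`, `OrderStore`, and `InMemoryOrderStore` are hypothetical names, and the SQL in the comment is an assumed example of what a MyBatis `@Select` annotation might carry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain type; the name and fields are illustrative only.
final class Order {
    private final long id;
    private final String customer;

    Order(long id, String customer) {
        this.id = id;
        this.customer = customer;
    }

    long id() { return id; }
    String customer() { return customer; }
}

// The persistence contract: callers depend on this interface and nothing else.
// In a MyBatis mapper, findById could carry an annotation such as
// @Select("SELECT id, customer FROM orders WHERE id = #{id}") (assumed SQL).
interface OrderStore {
    Optional<Order> findById(long id);
    void save(Order order);
}

// In-memory stand-in: usable as a test double, and the same seam where a
// non-relational implementation could later be swapped in.
class InMemoryOrderStore implements OrderStore {
    private final Map<Long, Order> orders = new HashMap<>();

    @Override
    public Optional<Order> findById(long id) {
        return Optional.ofNullable(orders.get(id));
    }

    @Override
    public void save(Order order) {
        orders.put(order.id(), order);
    }
}

class OrderStoreDemo {
    public static void main(String[] args) {
        OrderStore store = new InMemoryOrderStore();
        store.save(new Order(42L, "dustin"));
        System.out.println(store.findById(42L).map(Order::customer).orElse("not found"));
    }
}
```

Because calling code only sees `OrderStore`, replacing the backing store means writing one new implementation rather than touching every caller.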
Today’s Two Examples
• MongoDB
  • Document Store
  • Order Example
• Neo4j
  • Graph walking
• Different Tools
  • …very complementary
From http://www.neo4j.org/develop/example_data … and the BBC
Final Thoughts
Favorite (Personal) Stack Today?
Persistence: MongoDB
Plain Old Java + Maven
Interfaces in JSON, easy, fast
Jackson FTW
Add Ratpack (HTTP)
http://www.ratpack-framework.org/
Angular.js vs Ember.js?
FIGHT!
http://pastordonblog.blogspot.com/2010/09/stephen-hawking-is-probably-best-known.html
MongoDB: What is it?
• Document-oriented
• “a scalable, high-performance, open source NoSQL database”
• Auto-Sharding
  • Distributed writes
• Replica Sets
  • High Availability
  • Distributed reads
• Low cost of ownership
  • Commodity hardware
  • Minimal administration
MongoDB: Use Cases/Strengths
• Well-suited for large data sets
  • Craigslist (10TB, 5B records)
  • Wordnik (3.5TB, 20B records)
  • Disney: 1400 instances
• Always fetch an object with sub-objects
• High volume reads/writes
  • Latency as low as 0.1ms
• Map Reduce
• Aggregation
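To make "always fetch an object with sub-objects" concrete, here is a rough sketch of an order shaped as one document with its line items embedded. Plain Java maps and lists stand in for a stored document so the example runs without a database; the field names (`_id`, `customer`, `items`, `sku`, `qty`, `unitPriceCents`) and values are assumptions for illustration, not taken from the talk's order example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderDocumentSketch {
    // Builds one line item as a sub-object of the order document.
    static Map<String, Object> item(String sku, int qty, int unitPriceCents) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("sku", sku);
        item.put("qty", qty);
        item.put("unitPriceCents", unitPriceCents);
        return item;
    }

    // Builds the whole order: the items live inside the order itself,
    // so a single fetch returns everything needed to display or total it.
    static Map<String, Object> buildOrder() {
        Map<String, Object> order = new LinkedHashMap<>();
        order.put("_id", 1001);
        order.put("customer", "dustin");
        order.put("items", Arrays.asList(item("W-1", 2, 250), item("G-7", 1, 1200)));
        return order;
    }

    // Totals the order by walking the embedded items; no join or second query.
    static int totalCents(Map<String, Object> order) {
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> items = (List<Map<String, Object>>) order.get("items");
        int total = 0;
        for (Map<String, Object> item : items) {
            int qty = (Integer) item.get("qty");
            int unitPrice = (Integer) item.get("unitPriceCents");
            total += qty * unitPrice;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalCents(buildOrder())); // prints 1700
    }
}
```

In a relational schema the same read would typically join an orders table to a line-items table; here the nesting does that work at write time.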
MongoDB: Challenges
• Data modeling
  • What questions do I need to answer up front?
• Selecting a Shard Key
• Multi-datacenter replication
  • Time-delayed replicas
  • Hidden replicas
• Elections
  • Arbiter
Say What?
• GRAPH database
  • Nodes, relationships, properties
  • Shines with complex, highly connected data
    • Social networks
    • Recommendations
    • Path finding
• Graph DATABASE
  • Reliable: ACID Compliance, High availability
  • Scalable: 32B nodes and edges, 64B properties
  • Accessible: REST API, Embeddable on JVM
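As a rough picture of what "path finding" over nodes and relationships means, the sketch below does a breadth-first walk over an in-memory adjacency list. It does not use Neo4j; `GraphSketch`, `relate`, and `shortestPath` are hypothetical names, and a real graph database adds properties, indexes, and persistence on top of this idea.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class GraphSketch {
    // Adjacency list: node name -> nodes reachable over one outgoing relationship.
    private final Map<String, List<String>> edges = new HashMap<>();

    // Adds a directed relationship from one node to another.
    void relate(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first search: returns a path with the fewest hops from start to
    // goal, or an empty list when the goal cannot be reached.
    List<String> shortestPath(String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(start, null); // the start node has no predecessor
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                // Walk the predecessor links back to the start to rebuild the path.
                LinkedList<String> path = new LinkedList<>();
                for (String n = goal; n != null; n = cameFrom.get(n)) {
                    path.addFirst(n);
                }
                return path;
            }
            for (String next : edges.getOrDefault(current, Collections.<String>emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        GraphSketch g = new GraphSketch();
        g.relate("dustin", "brian");
        g.relate("brian", "carol");
        g.relate("dustin", "erik");
        System.out.println(g.shortestPath("dustin", "carol"));
    }
}
```

The same question in a relational schema needs one self-join per hop, which is why highly connected data is where a graph model tends to pay off.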
Querying
• Cypher Query Language
  • Best for ad-hoc querying
  • SQL-like language
  • REST interface
  • Easy to copy-paste in email
  • “Prepared” statements
• Traversal API
  • Best for high-performance querying
  • Custom JAX-RS plugin
  • Java code
  • More powerful
  • Lower latency
  • Clean REST interface
Cypher
• Start-Match-Where-Return
• START root=node(0)
  RETURN root
• START root=node(100)
  MATCH (root)-[:has]->(child)
  RETURN child
• START me=node:lookup("name=dustin")
  MATCH me-[f:friend]->(friend)
  WHERE friend.gender='M' AND f.date < '2012-01-01'
  RETURN friend.name, f.date
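For readers new to the Start-Match-Where-Return shape, the last query can be read as a filter over one node's outgoing `friend` relationships. The plain-Java sketch below mirrors that logic only; it is not how Neo4j executes Cypher. `Friendship` and `maleFriendsBefore` are hypothetical names, and the sample data is made up. It assumes dates are stored as ISO `yyyy-MM-dd` strings, as the quoted literal in the query suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FriendQuerySketch {
    // One outgoing "friend" relationship from the start node: the friend's
    // name and gender, plus the date property held on the relationship.
    static final class Friendship {
        final String name;
        final String gender;
        final String date;

        Friendship(String name, String gender, String date) {
            this.name = name;
            this.gender = gender;
            this.date = date;
        }
    }

    // Mirrors: WHERE friend.gender='M' AND f.date < cutoff
    //          RETURN friend.name, f.date
    static List<String> maleFriendsBefore(List<Friendship> friendships, String cutoff) {
        List<String> rows = new ArrayList<>();
        for (Friendship f : friendships) {
            // ISO yyyy-MM-dd strings order the same way as the dates they encode.
            if (f.gender.equals("M") && f.date.compareTo(cutoff) < 0) {
                rows.add(f.name + ", " + f.date);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Friendship> friendships = Arrays.asList(
                new Friendship("brian", "M", "2011-06-15"),
                new Friendship("carol", "F", "2010-02-01"),
                new Friendship("erik", "M", "2012-03-01"));
        System.out.println(maleFriendsBefore(friendships, "2012-01-01"));
    }
}
```

Note that the filter touches both a node property (`gender`) and a relationship property (`date`), which is the part a plain foreign-key model has no direct place for.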
Domain Layers
• Spring Data
  • Collaboration with SpringSource
  • Annotation/AspectJ-driven
• Qi4j, jo4neo, …
Performance
• > 1 billion nodes, > 1 billion relationships, > 3 billion properties
  • < 10ms query time on average
  • < 100ms query time, 99th percentile
  • 4000 req/sec on 3 beefy servers
    • 16-core, 256GB RAM, 1.1TB SSD in RAID 0+1
• Demands
  • Practically begs for SSD
  • Not horizontally scalable
    • Add more machines for read scaling
• Tuning is VERY important.
  • Order-of-magnitude speed increase from letting memory-mapped IO consume almost all system resources