Vinicius Carvalho - Advisory Platform Architect @Pivotal
@vccarvalho
http://github.com/viniciusccarvalho/
Schema Evolution for Data Microservices
Servlet JSP Struts JSF Spring GWT Angular
time
Customer
DB
JSON
XML
Java Serialization
Format evolution
Account
User
Product
Order
Common jar
<dependency>
  <groupId>com.acme</groupId>
  <artifactId>common-domain</artifactId>
  <version>1.1.3</version>
</dependency>
Enterprise Service Bus
RECOMMENDATION
SEARCH CATALOG
Canonical Message Format
Haven’t we solved this already?
• The majority of ESB systems use XML as the canonical model
• XML is good for structure, but it has no notion of evolution
• It’s heavy
And then there’s this µService thing
Bounded Contexts
Context Maps
Aggregates
Value Objects
Anti-corruption Layer
• Data evolution is a hard problem to grasp
• Even in known territory such as traditional RDBMSs, it is a hard problem to tackle
Services Evolution
Behavioral: new functions are added to the system
Structural: the information model changes over time
Forward compatibility
• An older version can read data written by a newer version
• Challenges:
  Field renaming
  Field removal
V1
V2
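The two challenges above can be sketched in plain Java. This is only an illustration under assumptions: the `UserV1Reader` class, the `id`/`name`/`email` fields, and the map-based record are hypothetical and not part of any library. A V1 reader tolerates a field appended in V2, but fails once a field it depends on is renamed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical V1 reader: it only knows the fields "id" and "name".
final class UserV1Reader {

    // Reads a record written by any version, keeping only the fields V1 knows.
    static Map<String, Object> read(Map<String, Object> record) {
        Map<String, Object> view = new LinkedHashMap<>();
        for (String field : new String[] {"id", "name"}) {
            if (!record.containsKey(field)) {
                throw new IllegalArgumentException("missing field: " + field);
            }
            view.put(field, record.get(field));
        }
        return view;
    }

    public static void main(String[] args) {
        Map<String, Object> appended = new LinkedHashMap<>();
        appended.put("id", 42);
        appended.put("name", "Ada");
        appended.put("email", "ada@example.com"); // added in V2, ignored by V1
        System.out.println(read(appended));

        Map<String, Object> renamed = new LinkedHashMap<>();
        renamed.put("id", 42);
        renamed.put("fullName", "Ada"); // "name" was renamed in V2
        try {
            read(renamed);
        } catch (IllegalArgumentException e) {
            System.out.println("V1 cannot read it: " + e.getMessage());
        }
    }
}
```

Appending is harmless because the old reader never looks at the new field; renaming and removal break it because the name it looks up is gone.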
▪ @EnableBinding(Source.class): one output
▪ @EnableBinding(Sink.class): one input
▪ @EnableBinding(Processor.class): one input and one output
▪ @EnableBinding(MyOrderHandler.class): custom interfaces with as many inputs and outputs as needed
▪ @EnableRxJavaProcessor: out-of-the-box support for RxJava with one input and one output
@Enable All the things
Structure: guarantees a contract between users of the model
Adaptability: how flexible the format is to changes in its structure
Benchmarking
Because … we love it
Format | Structure | Adaptability
CSV | Positional, no type definition | Possible if appending new columns
XML | Flexible, strongly typed | Append; remove only via version (no standard); supports defaults
JSON | Flexible, untyped | Append and remove are handled by the parser; no support for defaults
Avro | Flexible, strongly typed | Append and removal; supports defaults; versioning is built in
public class Sensor {
    private String id;
    private float temperature;
    private float velocity;
    private float acceleration;
    private float[] accelerometer;
    private float[] magneticField;
    private float[] orientation;
}
How much do you weigh?
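One rough way to see why payload weight differs between formats is to encode the Sensor fields twice by hand. This is a sketch under assumptions: `SensorWeight` and its two methods are hypothetical, the binary layout is a simple stand-in for a schema-based format (it is not Avro's wire format), and the JSON is built by string concatenation on the assumption that the id needs no escaping.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper comparing two hand-rolled encodings of the Sensor fields.
final class SensorWeight {

    // Values only, in a fixed field order: no field names travel with the data.
    static byte[] toBinary(String id, float temperature, float velocity, float acceleration,
                           float[] accelerometer, float[] magneticField, float[] orientation) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(id);
            out.writeFloat(temperature);
            out.writeFloat(velocity);
            out.writeFloat(acceleration);
            for (float[] vector : new float[][] {accelerometer, magneticField, orientation}) {
                out.writeInt(vector.length); // length prefix, then the values
                for (float value : vector) {
                    out.writeFloat(value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Self-describing text: every field name is repeated in every message.
    static byte[] toJson(String id, float temperature, float velocity, float acceleration,
                         float[] accelerometer, float[] magneticField, float[] orientation) {
        String json = "{\"id\":\"" + id + "\""
                + ",\"temperature\":" + temperature
                + ",\"velocity\":" + velocity
                + ",\"acceleration\":" + acceleration
                + ",\"accelerometer\":" + Arrays.toString(accelerometer)
                + ",\"magneticField\":" + Arrays.toString(magneticField)
                + ",\"orientation\":" + Arrays.toString(orientation)
                + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        float[] v = {0.1f, 0.2f, 0.3f};
        System.out.println("binary bytes: " + toBinary("s1", 21.5f, 3.0f, 0.5f, v, v, v).length);
        System.out.println("JSON bytes:   " + toJson("s1", 21.5f, 3.0f, 0.5f, v, v, v).length);
    }
}
```

The binary form stays small because field names and types live in the schema rather than in each message, which is the trade-off the table above describes for Avro.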
Source
spring:
  cloud:
    stream:
      bindings:
        output:
          destination: sensor-topic
          contentType: "avro/binary"
WORK IN PROGRESS
Activates the converter
Avro Converter
• Scans the classpath for *.avsc files and registers them
• During writes, infers the schema from the payload (SpecificDatum, GenericDatum, Reflection)
• During reads, uses message headers to discover the schema being used
Avro Converter
• Each component still needs the avsc file
• Avro versioning only works if both writer and reader schemas are available
• Transmitting the schema with the message adds overhead
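Why both schemas are needed can be sketched without the Avro library. The model below is an assumption made for illustration: `SchemaResolution` is hypothetical, a writer schema is reduced to an ordered list of field names, and a reader schema to a map of field name to default value. Because the payload carries values only, the reader cannot tell which value is which without the writer's field order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of resolving a writer's schema against a reader's schema.
final class SchemaResolution {

    // payload holds values only, in the order of writerFields.
    // readerFields maps each field the reader wants to its default value.
    static Map<String, Object> resolve(List<String> writerFields, List<Object> payload,
                                       Map<String, Object> readerFields) {
        // Step 1: use the writer's schema to give each positional value a name.
        Map<String, Object> written = new LinkedHashMap<>();
        for (int i = 0; i < writerFields.size(); i++) {
            written.put(writerFields.get(i), payload.get(i));
        }
        // Step 2: project onto the reader's schema, using defaults for missing fields
        // and silently dropping fields the reader does not declare.
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerFields.entrySet()) {
            String name = field.getKey();
            result.put(name, written.containsKey(name) ? written.get(name) : field.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> writerV1 = Arrays.asList("id", "temperature");
        List<Object> payload = Arrays.<Object>asList("s1", 21.5f);

        Map<String, Object> readerV2 = new LinkedHashMap<>();
        readerV2.put("id", "");
        readerV2.put("velocity", 0.0f); // added in V2 with a default

        System.out.println(resolve(writerV1, payload, readerV2));
    }
}
```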
Schema registry
• Centralized store for schemas
• Idempotent registration (the same schema payload always returns the same id)
• Compatibility test
• Schema utilization
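Idempotent registration can be sketched with an in-memory store. This is not the registry's actual implementation: `InMemorySchemaRegistry` and its methods are hypothetical, ids simply count up from 1, and two schemas are treated as "the same" only when their text is identical (a real registry would likely normalize the definition first).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory registry illustrating idempotent registration.
final class InMemorySchemaRegistry {
    private final Map<String, Integer> idsByDefinition = new HashMap<>();
    private final List<String> definitionsById = new ArrayList<>();

    // Returns the existing id when the same definition was registered before.
    synchronized int register(String schemaDefinition) {
        Integer existing = idsByDefinition.get(schemaDefinition);
        if (existing != null) {
            return existing;
        }
        definitionsById.add(schemaDefinition);
        int id = definitionsById.size(); // ids start at 1
        idsByDefinition.put(schemaDefinition, id);
        return id;
    }

    // Looks up a schema definition by the id handed out at registration.
    synchronized String fetch(int id) {
        if (id < 1 || id > definitionsById.size()) {
            throw new IllegalArgumentException("unknown schema id: " + id);
        }
        return definitionsById.get(id - 1);
    }

    public static void main(String[] args) {
        InMemorySchemaRegistry registry = new InMemorySchemaRegistry();
        String sensor = "{\"type\":\"record\",\"name\":\"Sensor\",\"fields\":[]}";
        int first = registry.register(sensor);
        int second = registry.register(sensor);
        System.out.println(first == second); // same payload, same id
    }
}
```

Idempotency lets every producer register its schema on startup without coordinating with the others.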
Schema registry
• Allows developers to check if new schemas can break existing ones in the registry
• BACKWARD: new schema can read old versions
• FORWARD: old schema can read new version
• FULL: BACKWARD + FORWARD
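The three modes can be expressed as one rule applied in two directions. The sketch below assumes a deliberately simplified schema model (field name mapped to whether the field declares a default) and ignores type changes; `CompatibilityCheck` and its methods are hypothetical, not a registry API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical compatibility check over a simplified schema model:
// field name -> true when the field declares a default value.
final class CompatibilityCheck {

    // A reader can read a writer's data when every reader field is either
    // present in the writer's schema or has a default to fall back on.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        for (Map.Entry<String, Boolean> field : reader.entrySet()) {
            boolean written = writer.containsKey(field.getKey());
            boolean hasDefault = Boolean.TRUE.equals(field.getValue());
            if (!written && !hasDefault) {
                return false;
            }
        }
        return true;
    }

    // BACKWARD: the new schema can read data written with the old one.
    static boolean backward(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(newSchema, oldSchema);
    }

    // FORWARD: the old schema can read data written with the new one.
    static boolean forward(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(oldSchema, newSchema);
    }

    // FULL: both directions hold.
    static boolean full(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return backward(newSchema, oldSchema) && forward(newSchema, oldSchema);
    }

    public static void main(String[] args) {
        Map<String, Boolean> v1 = new LinkedHashMap<>();
        v1.put("id", false);
        v1.put("name", false);

        Map<String, Boolean> v2 = new LinkedHashMap<>(v1);
        v2.put("email", true); // appended with a default

        System.out.println("FULL: " + full(v2, v1));
    }
}
```

Under this model, appending a field with a default keeps both directions intact, while appending one without a default breaks BACKWARD and removing a field without a default breaks FORWARD.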
Schema utilization
{
  "registrations" : [
    { "application-name" : "user-producer", "type" : "source" },
    { "application-name" : "user-enricher", "type" : "processor" },
    { "application-name" : "user-filter", "type" : "processor" }
  ]
}
GET /schemas/user/{version}
Sink
Content-Type: avro/binary
X-Schema-Id: 17
Headers
Writer’s schema
spring:
  cloud:
    stream:
      bindings:
        input:
          destination: sensor-topic
          schema: "org.acme.Sensor"
Reader’s schema
Source Processor Sink
1. Register and obtain schema id
Payload
2. Read headers and fetch the writer's schema
Schema Registry
Stream
Content-Type: avro/binary
X-Schema-Id: 17
Headers
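The two numbered steps in the diagram can be sketched end to end. Everything here is an assumption made for illustration: `SchemaHeaderFlow` is hypothetical, the registry is a plain map keyed by id, and only the header names `Content-Type` and `X-Schema-Id` come from the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the producer and consumer sides of the header-based flow.
final class SchemaHeaderFlow {

    // Producer side: after registering the schema and obtaining its id,
    // send the id in the headers instead of the schema itself.
    static Map<String, String> headersFor(int schemaId) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "avro/binary");
        headers.put("X-Schema-Id", Integer.toString(schemaId));
        return headers;
    }

    // Consumer side: read the id from the headers and fetch the writer's schema.
    static String writerSchemaFor(Map<String, String> headers, Map<Integer, String> registry) {
        String rawId = headers.get("X-Schema-Id");
        if (rawId == null) {
            throw new IllegalArgumentException("missing X-Schema-Id header");
        }
        String schema = registry.get(Integer.parseInt(rawId));
        if (schema == null) {
            throw new IllegalStateException("schema " + rawId + " is not registered");
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<Integer, String> registry = new HashMap<>();
        registry.put(17, "{\"type\":\"record\",\"name\":\"Sensor\",\"fields\":[]}");

        Map<String, String> headers = headersFor(17);
        System.out.println(headers);
        System.out.println(writerSchemaFor(headers, registry));
    }
}
```

Sending a small id rather than the full schema addresses the overhead concern raised earlier, at the cost of a registry lookup on the consumer side.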
References
• Martin Kleppmann, Schema evolution in Avro, Protocol Buffers and Thrift: https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html
• Martin Kleppmann, Designing Data-Intensive Applications: http://dataintensive.net/
• The CQRS Journey: https://msdn.microsoft.com/en-us/library/jj554200.aspx
• Oracle Datastore schema evolution : https://docs.oracle.com/cd/NOSQL/html/GettingStartedGuide/schemaevolution.html
• Building Microservices by Sam Newman: http://samnewman.io/books/building_microservices/
• Apache Avro: https://avro.apache.org/docs/1.7.7/gettingstartedjava.html
• https://github.com/viniciusccarvalho/schema-evolution-samples