Accumulo Summit 2015: Accumulo 2.0: A New Client API


Post on 15-Jul-2015






  • Accumulo 2.0.0: A New Client API (Christopher Tubbs)

  • Versions Overview

    Retired: 1.3 (last: 1.3.6), 1.4 (last: 1.4.5)

    Current: 1.5 (latest: 1.5.2)*, 1.6 (latest: 1.6.2), 1.7 ?

    Development: 1.8, 2.0

  • Version Philosophy


    x: major*

    y: minor/bugfixes*

    * habit of removing deprecated code arbitrarily

    New (1.6.2+): x.y.z

    x: major

    y: minor

    z: patch (bugfix)

    Semantic Versioning 2.0
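
    The x.y.z rule can be sketched in a few lines of Java. This is only an illustration: `SemVerSketch` and `isCompatibleUpgrade` are hypothetical names, not part of Accumulo, and the sketch assumes plain x.y.z strings (no pre-release tags such as -alpha-1).

```java
// Hypothetical helper for illustration; not part of Accumulo.
final class SemVerSketch {
    final int major;
    final int minor;
    final int patch;

    SemVerSketch(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // Parses plain "x.y.z" strings only; pre-release tags are out of scope here.
    static SemVerSketch parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected x.y.z, got: " + version);
        }
        return new SemVerSketch(Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    // True when moving from `from` to `to` should not break code that only
    // uses public API: same major version, and `to` is not older than `from`.
    static boolean isCompatibleUpgrade(SemVerSketch from, SemVerSketch to) {
        if (from.major != to.major) return false;
        if (from.minor != to.minor) return to.minor > from.minor;
        return to.patch >= from.patch;
    }

    public static void main(String[] args) {
        SemVerSketch current = parse("1.6.2");
        System.out.println(isCompatibleUpgrade(current, parse("1.7.0")));
        System.out.println(isCompatibleUpgrade(current, parse("2.0.0")));
    }
}
```

    Under this rule a move from 1.6.2 to 1.7.0 counts as compatible, while a move to 2.0.0 does not, because only a major bump may break public API.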

  • Background: 1.x API Focus (or lack thereof)

    Function > Usability

    Limited forethought for integration

    Current API is a gradual evolution; biggest redesign in 2009

    Instance / Connector

    Permissions / Authenticator

    Lots of feature additions, deprecations, and removals, but few fundamental design changes since

  • Background: 1.x API (cont.)

    Public API: public and protected

    in org.apache.accumulo.core.client: everything but impl packages

    in Key, Mutation, Value, Range; Condition and ConditionalMutation (1.6+)

    in org.apache.accumulo.minicluster: everything but impl packages

  • Background: 1.x API (cont.)

    Public API (1.7): public and protected

    org.apache.accumulo.core.client

    org.apache.accumulo.minicluster

    all but *impl*, *thrift*, *crypto*

  • Lessons Learned

    1. Confusing entry point

    Instance i = new ZooKeeperInstance();

    Connector c;

    // c = new Connector(i, user, pass);

    c = i.getConnector(user, pass);

    2. Too many overloaded methods

    BatchWriterConfig bwConf;

    bwConf = new BatchWriterConfig();

    bw = c.createBatchWriter(table, bwConf);

  • What's Better?

    Perhaps:

    AccumuloClient.Builder builder =



    AccumuloClient c =;

  • Better Yet...

    ...make it fluent:

    AccumuloClient client =


    1. More factories

    2. More configuration containers / builders

    3. Fluent
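
    The slide code is cut off in this transcript, so the following is only a generic sketch of the factory + builder + fluent style listed above. Every name here (`SketchClient`, `instance`, `zooKeepers`, `user`, the default ZooKeeper address) is hypothetical and should not be read as the proposed Accumulo 2.0 API.

```java
// Hypothetical fluent builder; none of these names come from the Accumulo API.
final class SketchClient {
    private final String instanceName;
    private final String zooKeepers;
    private final String user;

    private SketchClient(Builder b) {
        this.instanceName = b.instanceName;
        this.zooKeepers = b.zooKeepers;
        this.user = b.user;
    }

    // Single, obvious entry point (a static factory for the builder).
    static Builder builder() {
        return new Builder();
    }

    String describe() {
        return user + "@" + instanceName + " via " + zooKeepers;
    }

    static final class Builder {
        private String instanceName;
        private String zooKeepers = "localhost:2181"; // assumed default for the sketch
        private String user;

        // Each setter returns the builder so calls can be chained.
        Builder instance(String name) { this.instanceName = name; return this; }
        Builder zooKeepers(String hosts) { this.zooKeepers = hosts; return this; }
        Builder user(String name) { this.user = name; return this; }

        SketchClient build() {
            if (instanceName == null || user == null) {
                throw new IllegalStateException("instance and user are required");
            }
            return new SketchClient(this);
        }
    }

    public static void main(String[] args) {
        SketchClient client = SketchClient.builder().instance("test").user("root").build();
        System.out.println(client.describe());
    }
}
```

    One named method per option replaces a family of overloaded constructors or factory methods, and validation happens once, in build().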

  • Resource Management

    Current Problems:

    private static fields shared by clients

    no ability to close / clean up

    performance trade-offs

  • Opaque Resources

    Better:

    try (

    AccumuloClient.Resources r =


    AccumuloClient client =

    Accumulo.client().with(r).build()) {

    /* do work with the client */

    } catch (Exception e) { }
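
    A self-contained sketch of the same idea, assuming the opaque resources are something like a shared thread pool. `SketchResources` and `SketchResourceClient` are hypothetical stand-ins, not Accumulo classes; the point is only that both are AutoCloseable, and that closing a client leaves the shared resources usable for other clients.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical holder for shared, closeable resources (here: a thread pool).
final class SketchResources implements AutoCloseable {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    ExecutorService pool() { return pool; }

    boolean isClosed() { return pool.isShutdown(); }

    @Override
    public void close() { pool.shutdown(); }
}

// Hypothetical client that borrows the resources instead of using static fields.
final class SketchResourceClient implements AutoCloseable {
    private final SketchResources resources;
    private boolean open = true;

    SketchResourceClient(SketchResources resources) { this.resources = resources; }

    // Runs a trivial task on the shared pool and returns its result.
    int doubled(int x) {
        if (!open) throw new IllegalStateException("client is closed");
        try {
            return resources.pool().submit(() -> x * 2).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Closing the client does not close the shared resources.
    @Override
    public void close() { open = false; }
}

final class ResourceDemo {
    public static void main(String[] args) {
        // Resources are closed in reverse order: client first, then the pool.
        try (SketchResources r = new SketchResources();
             SketchResourceClient client = new SketchResourceClient(r)) {
            System.out.println(client.doubled(21));
        }
    }
}
```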

  • What About Exceptions?

    Current Problems:

    ... throws TableNotFoundException,

    AccumuloSecurityException, AccumuloException;

    With Java 7, this gets a little better:

    catch (AccumuloSecurityException | AccumuloException e) { }

  • Exception Hierarchy

    Better:

    public class TableNotFound

    extends AccumuloException {}


    try (

    AccumuloClient client =

    Accumulo.client().build()) {

    /* do work with the client */

    } catch (AccumuloException e) {

    } catch (YourCodeException e) {}
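
    A runnable sketch of why a single root exception helps. The classes are prefixed with Sketch to make clear they are hypothetical, and the sketch assumes the root type is a checked exception, which the slides do not state.

```java
// Hypothetical hierarchy: every specific failure extends one root type.
class SketchAccumuloException extends Exception {
    SketchAccumuloException(String message) { super(message); }
}

class SketchTableNotFound extends SketchAccumuloException {
    SketchTableNotFound(String message) { super(message); }
}

class SketchSecurityException extends SketchAccumuloException {
    SketchSecurityException(String message) { super(message); }
}

final class ExceptionHierarchyDemo {
    // Stand-in for a client call; only the root type appears in the signature.
    static void scan(String table, boolean authorized) throws SketchAccumuloException {
        if (!authorized) throw new SketchSecurityException("not authorized: " + table);
        if (!"known".equals(table)) throw new SketchTableNotFound("no such table: " + table);
    }

    static String attempt(String table, boolean authorized) {
        try {
            scan(table, authorized);
            return "ok";
        } catch (SketchTableNotFound e) {
            return "missing";                      // optional, more specific handler
        } catch (SketchAccumuloException e) {
            return "failed: " + e.getMessage();    // one catch covers everything else
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt("known", true));
        System.out.println(attempt("other", true));
        System.out.println(attempt("known", false));
    }
}
```

    Callers that do not care about the specific failure need one catch clause; callers that do can still catch the subclass first.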

  • Leaking

    Current Problems:

    Leaking non-public (implementation) classes

    apilyzer-maven-plugin

    Problem: requires users to instantiate, assign, or pass non-public classes in normal use

    Exposing too much implementation

    MapReduce classes

    Problem: makes it difficult to extend or evolve internals without affecting users.

  • Dependency Exposure

    Current Problems:

    Dependencies on unstable third-party classes

    Guava @Beta-annotated classes

    Hadoop @LimitedPrivate-annotated classes

    Dependencies with lots of transitive deps

    Hadoop Text, Writable for serialization

    RPC serialization library in public API

    Thrift

  • Parameter Problems

    Current Problems:

    Exposing implementation-specific classes

    log4j Level prevents using log4j2, slf4j, and logback

    Stringly typed objects

    parameters: table, tableName, tableId
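
    The stringly typed problem can be sketched as follows. `SketchTableName` and `SketchTableId` are hypothetical wrapper types invented for this example; the idea is that a name and an ID can no longer be swapped silently, because the compiler tells them apart.

```java
// Hypothetical wrapper for a table name.
final class SketchTableName {
    private final String value;

    SketchTableName(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("table name must not be empty");
        }
        this.value = value;
    }

    String value() { return value; }
}

// Hypothetical wrapper for an internal table ID.
final class SketchTableId {
    private final String value;

    SketchTableId(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("table id must not be empty");
        }
        this.value = value;
    }

    String value() { return value; }
}

final class TypedParamsDemo {
    // With plain String parameters these two overloads could not coexist,
    // and callers could pass an ID where a name was expected.
    static String describe(SketchTableName name) { return "name:" + name.value(); }

    static String describe(SketchTableId id) { return "id:" + id.value(); }

    public static void main(String[] args) {
        System.out.println(describe(new SketchTableName("trades")));
        System.out.println(describe(new SketchTableId("2b")));
    }
}
```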

  • Encoding Problems

    Current Problems:

    Fail to specify internal encoding

    serialize/deserialize mismatch

    UTF-8 or user-specified?

    Overloaded methods again

    Unexpected characters (Authorizations)

    The Accumulo shell (jline)
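
    The serialize/deserialize mismatch is easy to reproduce with plain Java strings. The sketch below is not Accumulo code; it only shows that bytes written as UTF-8 survive a round trip when the decoder also uses UTF-8, and are corrupted when the decoder assumes another charset (ISO-8859-1 is picked here purely as an example).

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Always name the charset explicitly instead of relying on the platform default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Simulates a reader that assumes a different encoding than the writer used.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String label = "caf\u00e9"; // contains one non-ASCII character
        byte[] bytes = encode(label);
        System.out.println(decode(bytes).equals(label));       // matching charsets
        System.out.println(decodeLatin1(bytes).equals(label)); // mismatched charsets
    }
}
```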

  • New Types

    Namespace: .getId(), .exists(), .tables(), .rename(String)

    Table: .getSplits(), .merge(Range), .scanner(ScanOptions), .compact()

  • API-only Artifact

    accumulo-api.jar (new!)

    org.apache.accumulo.api

    no dependencies on other accumulo jars

    use Java's ServiceLoader to bind to impl

    minimal dependencies on stable libraries: commons, guava

    Not in accumulo-core.jar
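
    A minimal sketch of binding to an implementation through java.util.ServiceLoader. The interface `SketchClientFactory` and the in-process fallback are assumptions made for this example, not part of the proposed API. A real provider would be registered by listing its class name in a META-INF/services file named after the interface; with no such file on the classpath, the loader finds nothing and the sketch falls back.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical service interface; a real API jar would define its own.
interface SketchClientFactory {
    String name();
}

final class ServiceBindingDemo {
    // Returns the first provider registered on the classpath, or a trivial
    // in-process fallback when none is registered.
    static SketchClientFactory bind() {
        Iterator<SketchClientFactory> providers =
                ServiceLoader.load(SketchClientFactory.class).iterator();
        if (providers.hasNext()) {
            return providers.next();
        }
        return () -> "fallback";
    }

    public static void main(String[] args) {
        System.out.println(bind().name());
    }
}
```

    Because callers only see the interface, the implementation jar can change, or be replaced by a mock, without touching code compiled against the API jar.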

  • 2.0.0 API Statement

    Public API (new!): public and protected


    Alternatively: public and protected


  • Goals: A Summary

    Improved API stability

    Compatibility (semver)

    Easy to check

    Easy to track changes

    Helps users manage dependencies

    Separate API from implementation

    Possible ability to swap out implementation (mock replacement? in-process impl?)

    Intuitive front-door

    Fluent usage

    Resource management

  • Release plan

    Steps:

    Finish implementation

    Initial reviews

    2.0.0-alpha-1: developer preview released to get feedback

    2.0.0-beta-1 ?: possibly another developer preview after stabilizing API changes

    2.0.0 final release (Summer?)

  • Contact

    Me:

    [email protected] Fingerprint: 8CC4 F8A2 B29C 2B04 0F2B 835D 6F0C DAE7 00B6 899D

    Us: [email protected]

    #accumulo on Freenode IRC

    Issue: