HBase CRUD Use Java API for Create, Read, Update, Delete operations

Upload: eugene-yushin

Post on 18-Jul-2015

Page 1: HBase CRUD Java API

HBase CRUD
Use Java API for Create, Read, Update, Delete operations

Page 2: HBase CRUD Java API

Agenda

• Intro
• Create
• Insert
• Update
• Delete
• Read – Table Scan
• Read – Get Field
• Conclusions

Page 3: HBase CRUD Java API

Intro

A rowkey uniquely identifies each row in an HBase table, whereas other keys such as column family, timestamp, and so on are used to locate a piece of data in an HBase table. The HBase API provides the following methods to support the CRUD operations:
• Put
• Get
• Delete
• Scan
• Increment

The source code for this presentation is available on GitHub: https://github.com/EugeneYushin/HBase-CRUD

Page 4: HBase CRUD Java API

Create

The table is created in the ‘Enabled’ state. Verify the table creation in Hue (Cloudera CDH 5.1.0) or in the hbase shell.
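The slide's own code is not part of this transcript, so the following is only a minimal sketch of table creation with the HBase 0.98-era client API. It assumes the HBase client library and an hbase-site.xml are on the classpath; the table name "users" and the column family "info" are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

class CreateTableSketch {
    public static void main(String[] args) throws IOException {
        // Reads hbase-site.xml from the classpath to locate the cluster.
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            TableName name = TableName.valueOf("users"); // hypothetical table name
            if (!admin.tableExists(name)) {
                HTableDescriptor table = new HTableDescriptor(name);
                table.addFamily(new HColumnDescriptor("info")); // hypothetical column family
                admin.createTable(table);
            }
            // Reports whether the table is currently in the enabled state.
            System.out.println("enabled: " + admin.isTableEnabled(name));
        } finally {
            admin.close();
        }
    }
}
```

In the hbase shell, `list` and `is_enabled 'users'` give the same confirmation.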

Page 5: HBase CRUD Java API

Insert

Use HConnection.getTable() instead of HTablePool, as the latter is deprecated in 0.94 and 0.95/0.96, and removed in 0.98.
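As an illustration of that advice, an insert through HConnection.getTable() might look like the sketch below. It assumes the HBase 0.98-era client library on the classpath and an existing table; the table name "users" and the family "info" are hypothetical, while the row key and qualifier follow the sample output later in the slides.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

class InsertSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // One connection can be shared; table handles obtained from it are lightweight.
        HConnection connection = HConnectionManager.createConnection(conf);
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                Put put = new Put(Bytes.toBytes("MikeRK"));
                put.add(Bytes.toBytes("info"),        // hypothetical column family
                        Bytes.toBytes("user_name"),
                        Bytes.toBytes("Mike"));
                table.put(put);
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```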

Page 6: HBase CRUD Java API

Insert

All table manipulation goes through HTableInterface. HTable represents a particular table in HBase.

The HTable class is not thread-safe, as concurrent modifications are not safe. Hence, each thread in an application should use its own HTable instance. Multiple HTable instances with the same configuration reference can share the same underlying HConnection instance.

The RowKey is the main point to consider when designing the table structure. Because HBase stores data sorted by RowKey, use a compound RowKey with a hash prefix (SHA-1 or MD5) and an additional reverse-timestamp part, so that sequential keys do not all land in the same region.
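One possible layout for such a compound RowKey is sketched below using only the Java standard library. The class name, method name and the exact layout (16-byte MD5 of an identifier followed by an 8-byte reverse timestamp) are assumptions for illustration, not the presentation's own scheme.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class RowKeys {
    private RowKeys() {}

    /**
     * Builds MD5(id) (16 bytes) followed by Long.MAX_VALUE - timestampMillis (8 bytes).
     * For non-negative timestamps, newer entries of the same id sort first.
     */
    static byte[] compoundKey(String id, long timestampMillis) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest(id.getBytes(StandardCharsets.UTF_8));
            long reverseTs = Long.MAX_VALUE - timestampMillis;
            // ByteBuffer is big-endian by default, so byte order matches numeric order.
            return ByteBuffer.allocate(hash.length + Long.BYTES)
                    .put(hash)
                    .putLong(reverseTs)
                    .array();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is not available", e);
        }
    }
}
```

The hash prefix spreads different identifiers across the key space, while the reverse timestamp keeps the most recent entry for one identifier at the top of its range. The resulting byte array can be used directly as the row argument of a Put or Get.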

Page 7: HBase CRUD Java API

Update

Data in HBase is versioned: by default, the last 3 values are stored per column (this was the default before HBase 0.96; in 0.96 and newer the default is 1). Use the HColumnDescriptor.setMaxVersions(n) method to override this value.
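In HBase an update is simply another Put to the same cell, which adds a new version. The sketch below shows that, together with reading back several versions. It assumes the HBase 0.98-era client library, a hypothetical table "users" with a family "info" configured to keep at least 3 versions, and a hypothetical new value "Michael".

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

class UpdateSketch {
    // Family descriptor keeping up to maxVersions values per cell; it would be
    // passed to HTableDescriptor.addFamily(...) when the table is created.
    static HColumnDescriptor versionedFamily(String family, int maxVersions) {
        return new HColumnDescriptor(family).setMaxVersions(maxVersions);
    }

    public static void main(String[] args) throws IOException {
        byte[] row = Bytes.toBytes("MikeRK");
        byte[] family = Bytes.toBytes("info");          // hypothetical family
        byte[] qualifier = Bytes.toBytes("user_name");

        HConnection connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                // "Update": write a new value to an existing cell.
                Put update = new Put(row);
                update.add(family, qualifier, Bytes.toBytes("Michael")); // hypothetical value
                table.put(update);

                // Ask for up to 3 versions instead of only the latest one.
                Get get = new Get(row);
                get.addColumn(family, qualifier);
                get.setMaxVersions(3);
                Result result = table.get(get);
                List<Cell> versions = result.getColumnCells(family, qualifier);
                for (Cell cell : versions) {
                    System.out.println(cell.getTimestamp() + " -> "
                            + Bytes.toString(CellUtil.cloneValue(cell)));
                }
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```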

Page 8: HBase CRUD Java API

Delete

After the delete, the value of the “user_name” qualifier reverts to its previous version.
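That behaviour matches deleting only the latest version of a cell. A minimal sketch with the HBase 0.98-era API follows; the table "users" and family "info" are hypothetical, and the slide's actual code may differ.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.util.Bytes;

class DeleteSketch {
    public static void main(String[] args) throws IOException {
        HConnection connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                Delete delete = new Delete(Bytes.toBytes("MikeRK"));
                // deleteColumn removes only the latest version of the cell, so an
                // older version (if one is kept) becomes visible again.
                // deleteColumns (plural) would remove all versions.
                delete.deleteColumn(Bytes.toBytes("info"), Bytes.toBytes("user_name"));
                table.delete(delete);
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```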

Page 9: HBase CRUD Java API

Read – Table Scan

Table Scan...
PaulRK Paul [email protected]
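Output of this shape could come from a full table scan along the lines of the sketch below. It assumes the HBase 0.98-era client library; the table "users" and family "info" are hypothetical, and the qualifiers are taken from the sample output in the slides.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

class TableScanSketch {
    public static void main(String[] args) throws IOException {
        byte[] family = Bytes.toBytes("info"); // hypothetical family
        HConnection connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                Scan scan = new Scan();
                scan.addFamily(family);
                ResultScanner scanner = table.getScanner(scan);
                try {
                    System.out.println("Table Scan...");
                    // Rows arrive sorted by row key.
                    for (Result row : scanner) {
                        System.out.println(Bytes.toString(row.getRow()) + " "
                                + Bytes.toString(row.getValue(family, Bytes.toBytes("user_name"))) + " "
                                + Bytes.toString(row.getValue(family, Bytes.toBytes("user_mail"))));
                    }
                } finally {
                    scanner.close(); // releases server-side scanner resources
                }
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```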

Page 10: HBase CRUD Java API

Read – Get Field

Get particular Field...
rowKey = MikeRK, user_name: Mike
rowKey = MikeRK, user_mail: [email protected]
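Reading individual fields of one row is done with a Get restricted to the wanted columns. The sketch below assumes the HBase 0.98-era client library, a hypothetical table "users" and a hypothetical family "info"; the row key and qualifiers come from the output above.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

class GetFieldSketch {
    public static void main(String[] args) throws IOException {
        byte[] family = Bytes.toBytes("info"); // hypothetical family
        String[] qualifiers = {"user_name", "user_mail"};

        HConnection connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                Get get = new Get(Bytes.toBytes("MikeRK"));
                // Restrict the Get to the columns of interest so that only
                // those cells are returned by the server.
                for (String q : qualifiers) {
                    get.addColumn(family, Bytes.toBytes(q));
                }
                Result result = table.get(get);
                if (result.isEmpty()) {
                    System.out.println("row not found");
                    return;
                }
                String rowKey = Bytes.toString(result.getRow());
                for (String q : qualifiers) {
                    byte[] value = result.getValue(family, Bytes.toBytes(q));
                    System.out.println("rowKey = " + rowKey + ", " + q + ": " + Bytes.toString(value));
                }
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```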

Page 11: HBase CRUD Java API

Conclusions

• HTable is expensive
Creating an HTable instance is a slow process, because each creation involves scanning the .META. table (hbase:meta since 0.96) to check whether the table actually exists. Hence, it is not recommended to create a new HTable instance for each request when the number of concurrent requests is very high.

• Scan caching
A scan can be configured to retrieve a batch of rows in every RPC call it makes to HBase. This can be configured per scanner by using the setCaching(int) API on the Scan object. It can also be set in the hbase-site.xml configuration file using the hbase.client.scanner.caching property.

• Increment
Increment Column Value (ICV) is exposed both as the Increment command object, like the other operations, and as a method on HTableInterface. This command allows you to change an integral value stored in an HBase cell without reading it back first. The data manipulation happens in HBase, not in your client application, which makes it fast. It also avoids a possible race condition where some other client is interacting with the same cell.

• Filter
A filter is a predicate that executes in HBase instead of on the client. When you specify a Filter in your Scan, HBase uses it to determine whether a record should be returned. This can avoid a lot of unnecessary data transfer. It also keeps the filtering on the server instead of placing that burden on the client. A filter is anything implementing the org.apache.hadoop.hbase.filter.Filter interface. HBase provides a number of filters, but it’s easy to implement your own.
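The last three points can be sketched together as follows. This is only an illustration against the HBase 0.98-era client API: the table "users", the family "info", the counter qualifier "visits" and the caching value 100 are all hypothetical choices.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

class ScanTuningSketch {
    public static void main(String[] args) throws IOException {
        byte[] family = Bytes.toBytes("info"); // hypothetical family
        HConnection connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        try {
            HTableInterface table = connection.getTable("users"); // hypothetical table
            try {
                // Increment: the counter is changed on the server, without a
                // read-modify-write round trip in the client.
                long visits = table.incrementColumnValue(
                        Bytes.toBytes("MikeRK"), family, Bytes.toBytes("visits"), 1L);
                System.out.println("visits = " + visits);

                // Filter: keep only rows whose user_name equals "Mike"; the
                // comparison runs on the region servers.
                SingleColumnValueFilter filter = new SingleColumnValueFilter(
                        family, Bytes.toBytes("user_name"), CompareOp.EQUAL, Bytes.toBytes("Mike"));
                filter.setFilterIfMissing(true); // skip rows that lack the column

                Scan scan = new Scan();
                scan.setCaching(100); // rows fetched per RPC; 100 is an arbitrary example
                scan.setFilter(filter);

                ResultScanner scanner = table.getScanner(scan);
                try {
                    for (Result row : scanner) {
                        System.out.println(Bytes.toString(row.getRow()));
                    }
                } finally {
                    scanner.close();
                }
            } finally {
                table.close();
            }
        } finally {
            connection.close();
        }
    }
}
```

A larger caching value means fewer RPC calls at the price of more memory per batch on both client and server, so the right number depends on row size.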

Page 12: HBase CRUD Java API

Thank you

ushin.evgenij
https://www.linkedin.com/in/yushyn