Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End-to-End Exactly-Once Processing in Apache Apex
Hitesh Kapoor, Priyanka Gugale Shah

Upload: datatorrent

Post on 15-Apr-2017

Category:

Technology


TRANSCRIPT

Page 1: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End to End Exactly once processing in Apache Apex

Hitesh Kapoor, Priyanka Gugale Shah

Page 2: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Agenda

● What is End-to-End Exactly-Once
● Fault Tolerance
● Recovery from Operator Failure
● Recovery Mechanisms
● Importance of End-to-End Exactly-Once
● Achieving End-to-End Exactly-Once in Apache Apex
● Example DAG achieving the desired goal
● Conclusion & Questions

Page 3: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Fault

Page 4: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Recovery Mechanisms

Page 5: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Replay data .. Big Data ..?

Image source https://www.pinterest.com/toysconcept/baby-fun-things/

Page 6: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Retrieve Operator State & Results

Page 7: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Recovery Mechanisms

At most once

● Subscribes to data from the start of the next window.
● Ignores the lost windows and continues to process incoming data normally.
● No duplicates
● Possible missing data

Page 8: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Recovery Mechanisms

At most once

● Subscribes to data from the start of the next window.
● Ignores the lost windows and continues to process incoming data normally.
● No duplicates
● Possible missing data

At least once

● Operator is brought back to its latest checkpointed state and the upstream buffer server replays all subsequent windows.
● Lost windows are recomputed and the application catches up with live incoming data.
● Likely duplicates
● No missing data

Page 9: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Recovery Mechanisms

At most once

● Subscribes to data from the start of the next window.
● Ignores the lost windows and continues to process incoming data normally.
● No duplicates, no recomputation
● Possible missing data

At least once

● Operator is brought back to its latest checkpointed state and the upstream buffer server replays all subsequent windows.
● Lost windows are recomputed and the application catches up with live incoming data.
● Likely duplicates and recomputation
● No missing data

Exactly once

● Operator is brought back to its latest checkpointed state and the upstream buffer server replays all subsequent windows.
● Lost windows are recomputed in a logical way, so that the effect is as if the computation had been done exactly once.
● Duplicates/recomputation?
● No missing data
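The difference between the first two modes can be illustrated with a toy model. This is only a sketch in plain Java, not Apex code: windows are reduced to integer ids, and the class and parameter names (RecoveryModes, lastProcessed, resumeAt, checkpoint) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: each window is identified by its index; the returned list is the
// sequence of window ids whose results reach the downstream consumer.
class RecoveryModes {

    // At most once: after the failure the operator resumes at `resumeAt`,
    // so windows between lastProcessed and resumeAt are never processed.
    static List<Integer> atMostOnce(int totalWindows, int lastProcessed, int resumeAt) {
        List<Integer> seen = new ArrayList<>();
        for (int w = 0; w <= lastProcessed; w++) seen.add(w);
        for (int w = resumeAt; w < totalWindows; w++) seen.add(w);
        return seen;
    }

    // At least once: the operator is reset to its checkpoint and every window
    // after the checkpoint is replayed, including ones already emitted.
    static List<Integer> atLeastOnce(int totalWindows, int checkpoint, int lastProcessed) {
        List<Integer> seen = new ArrayList<>();
        for (int w = 0; w <= lastProcessed; w++) seen.add(w);
        for (int w = checkpoint + 1; w < totalWindows; w++) seen.add(w);
        return seen;
    }
}
```

With six windows, atMostOnce(6, 2, 5) yields [0, 1, 2, 5] (windows 3 and 4 are lost), while atLeastOnce(6, 1, 2) yields [0, 1, 2, 2, 3, 4, 5] (window 2 is seen twice).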

Page 10: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

If a window is recomputed, then how is it “exactly” once?

Image source https://www.pinterest.com/toysconcept/baby-fun-things/

Page 11: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End-to-End Exactly Once

Page 12: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Idempotency

Page 13: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune
Page 14: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End-to-End Exactly-Once

[Diagram: Kafka → Words → Aggregate → Counts → Database]

● Input
  ○ Uses com.datatorrent.contrib.kafka.KafkaSinglePortStringInputOperator
  ○ Emits words as a stream
  ○ Operator is idempotent
● Counter
  ○ com.datatorrent.lib.algo.UniqueCounter
  ○ Aggregates over a window, retains the aggregates as state for the duration of the window, emits them at the end of the window and clears the state.
● Store
  ○ Uses CountStoreOperator
  ○ Inserts into JDBC
  ○ Exactly-once results (End-to-End Exactly-Once = At-least-once + Idempotency + Consistent State)
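The counter's per-window behaviour (aggregate, emit at the end of the window, clear) can be sketched in plain Java. This is a stand-in written for illustration and not the UniqueCounter implementation; the class name WindowedWordCounter and its method names are assumptions that only loosely mirror Apex's window callbacks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for a windowed unique counter; not the Apex UniqueCounter API.
class WindowedWordCounter {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Called at the start of each streaming window: start from empty state.
    void beginWindow(long windowId) {
        counts.clear();
    }

    // Called once per incoming word within the window.
    void process(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Called at the end of the window: hand back the aggregates and clear the state.
    Map<String, Integer> endWindow() {
        Map<String, Integer> emitted = new LinkedHashMap<>(counts);
        counts.clear();
        return emitted;
    }
}
```

Because no counts survive past the end of a window, replaying a window with the same input produces the same output, which is what the downstream store relies on.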

https://github.com/DataTorrent/examples/blob/master/tutorials/exactly-once
https://www.datatorrent.com/blog/end-to-end-exactly-once-with-apache-apex/

Page 15: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End-to-End Exactly-Once

[Diagram: Kafka → Words → Aggregate → Counts → Database]

● Apex connectors retrieving data from external systems need to ensure recovery.

● Involves rewinding the stream and replaying unprocessed data from the source.

● The capabilities of the external system determine complexity.

● Kafka handles message persistence, which allows the message stream to be replayed directly.

● Apex input operator needs to remember the offsets.
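One way to picture "remembering the offsets" is a reader that records, per window, the offset range it consumed and reuses that range when the window is replayed. The sketch below is an in-memory model under stated assumptions: the log is a plain list standing in for a Kafka partition, the offset map stands in for durable storage that survives an operator restart, and all names (OffsetTrackingReader, emitWindow, maxTuples) are hypothetical rather than taken from the Apex Kafka operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of an idempotent input: the same window id always yields
// the same tuples, because the offset range of each window is remembered.
class OffsetTrackingReader {
    private final List<String> log;                  // replayable source (stand-in for Kafka)
    private final Map<Long, long[]> windowOffsets;   // durable: windowId -> [start, end)
    private long nextOffset;                         // restored from the checkpoint

    OffsetTrackingReader(List<String> log, Map<Long, long[]> windowOffsets, long checkpointedOffset) {
        this.log = log;
        this.windowOffsets = windowOffsets;
        this.nextOffset = checkpointedOffset;
    }

    // Emits one window. A window seen before is replayed from its recorded range;
    // a new window reads up to maxTuples fresh messages and records the range.
    List<String> emitWindow(long windowId, int maxTuples) {
        long[] range = windowOffsets.get(windowId);
        if (range == null) {
            long end = Math.min(nextOffset + maxTuples, log.size());
            range = new long[] {nextOffset, end};
            windowOffsets.put(windowId, range);
        }
        nextOffset = range[1];
        return new ArrayList<>(log.subList((int) range[0], (int) range[1]));
    }
}
```

After a restart, a new reader built on the same offset map replays each earlier window with exactly the tuples it had the first time, regardless of how many messages are available at replay time.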

Page 16: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Idempotency - Apex Kafka operator

Page 17: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Exactly Once Strategy

Page 18: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Exactly Once Strategy

[Diagram: a Data Table with rows d11 d12 d13, d21 d22 d23, lwn1 lwn2 lwn3 and lwn+1,1 lwn+1,2 lwn+1,3, and a Meta Table whose row (op-id, wn) becomes (op-id, wn+1); a timeline marks chk, wn and wn+1.]

• Data in a window is written out in a single transaction

• Window id is also written to a meta table as part of the same transaction

• Operator reads the window id from meta table on recovery

• Ignores data for windows up to and including the recovered window id and writes new data

• Partial window data before failure will not appear in data table as transaction was not committed

• Assumes idempotency for replay
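The strategy above can be sketched with an in-memory stand-in for the database. This is a simplified model, not the Apex JDBC operator: WordCountStore and ExactlyOnceWriter are hypothetical names, and the single commit method stands in for a real transaction that writes the data rows and the window id together.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a database with a data table and a meta table.
class WordCountStore {
    final Map<String, Integer> dataTable = new HashMap<>();
    long committedWindowId = -1;   // meta table: id of the last committed window

    // Models one transaction: the window's data and its id become visible together.
    void commit(long windowId, Map<String, Integer> pending) {
        pending.forEach((word, count) -> dataTable.merge(word, count, Integer::sum));
        committedWindowId = windowId;
    }
}

class ExactlyOnceWriter {
    private final WordCountStore store;
    private final Map<String, Integer> pending = new HashMap<>();
    private long committedWindowId;
    private long currentWindowId;

    ExactlyOnceWriter(WordCountStore store) {
        this.store = store;
        // On (re)start, read the last committed window id from the meta table.
        this.committedWindowId = store.committedWindowId;
    }

    void beginWindow(long windowId) {
        currentWindowId = windowId;
        pending.clear();
    }

    void process(String word, int count) {
        // Replayed windows that were already committed are ignored.
        if (currentWindowId > committedWindowId) {
            pending.merge(word, count, Integer::sum);
        }
    }

    void endWindow() {
        if (currentWindowId > committedWindowId) {
            store.commit(currentWindowId, pending);
            committedWindowId = currentWindowId;
        }
        pending.clear();
    }
}
```

If the writer fails in the middle of a window, nothing from that window has reached the data table. A restarted writer reads the committed id, skips replayed windows it has already stored, and applies the interrupted window once. This only gives correct totals if the replayed windows carry the same data as before, which is the idempotency assumption stated above.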

Page 19: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

End-to-End Exactly-Once (Contd.)

[Diagram: Kafka → Words → Aggregate → Counts → Database]

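// Imports needed at the top of the enclosing file.
import java.sql.PreparedStatement;
import java.sql.SQLException;

import com.datatorrent.lib.db.jdbc.AbstractJdbcTransactionableOutputOperator;
import com.datatorrent.lib.util.KeyValPair;

// Writes each (word, count) pair to the "words" table via JDBC.
public static class CountStoreOperator extends AbstractJdbcTransactionableOutputOperator<KeyValPair<String, Integer>>
{
  // Upsert: add the count to an existing row, or insert a new row for the word.
  public static final String SQL =
      "MERGE INTO words USING (VALUES ?, ?) I (word, wcount)"
      + " ON (words.word=I.word)"
      + " WHEN MATCHED THEN UPDATE SET words.wcount = words.wcount + I.wcount"
      + " WHEN NOT MATCHED THEN INSERT (word, wcount) VALUES (I.word, I.wcount)";

  @Override
  protected String getUpdateCommand()
  {
    return SQL;
  }

  // Binds the word and its count to the two placeholders of the MERGE statement.
  @Override
  protected void setStatementParameters(PreparedStatement statement, KeyValPair<String, Integer> tuple) throws SQLException
  {
    statement.setString(1, tuple.getKey());
    statement.setInt(2, tuple.getValue());
  }
}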

Page 20: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Everything Tailored Together

Page 21: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Conclusion

Page 22: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune

Conclusion contd.

[Diagram: Kafka → Words → Aggregate → Counts → Database]

Page 24: Lightning Talks & Integrations Track - End-to-end Exactly-Once with Apache Apex @ ABDW17, Pune