Punch clock for debugging Apache Storm
Punch clock for Apache Storm
<just an idea>
Punch clock (a.k.a. time clock)
● You have a card per person.
● The person punches IN with the card when he/she enters the office.
● The person punches OUT with the card when he/she leaves the office.
● The punch clock records the time of entry/exit on the card.
Motivation
To find out …
1. When did the person enter/exit the office?
2. Who is still in the office?
Change of Context …
“Apache Storm”
Tuples going in and out of spouts/bolts
Motivation
Debugging Apache Storm*
* Debugging Storm transactional topologies
Debugging Transactional Topologies
1. The spout emits a batch of data (tuples), which forms a transaction.
2. Every bolt in the topology processes that batch of data (tuples).
Motivation
To find out …
1. When did the batch enter/exit the spout/bolt?
2. Which batch is still in the spout/bolt, i.e. are any batches STUCK?
   a. On which host are they stuck?
   b. In which spout/bolt are they stuck?
Possible Solution(s): Add a log statement before and after the critical section.
log.info("Inserting data into database ...");  // ← entering
datasource.insert(table, tuples);              // ← the real work
log.info("Inserted data into database.");      // ← exiting
------------------------------------------------------------------
Cons: The logs are distributed over multiple hosts, so they need to be aggregated. That needs a bit of work. Perhaps Elasticsearch + Kibana?
Possible Solution(s):
Use Riemann: http://riemann.io/index.html
This was suggested by my friend Angad. I have not looked at it, though.
My Idea
A batch of tuples punches IN and punches OUT in a bolt/spout.
Punch In - put into a hashmap (or any other suitable data structure)
Punch Out - remove from the hashmap (or any other suitable data structure)
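A minimal sketch of what such a punch clock could look like, assuming one singleton per JVM backed by a ConcurrentHashMap. The class body, the stillIn method and the choice of putIfAbsent are illustrative assumptions, not the code from the linked repository; only the names PunchClock, getInstance, punchIn and punchOut come from the slides.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-JVM punch clock; a sketch, not the repository's implementation. */
final class PunchClock {
    private static final PunchClock INSTANCE = new PunchClock();

    // punch card id -> time (epoch millis) at which the batch punched in
    private final Map<String, Long> punchedIn = new ConcurrentHashMap<>();

    private PunchClock() { }

    public static PunchClock getInstance() {
        return INSTANCE;
    }

    /** Punch In: record the entry time. Repeated calls keep the first time. */
    public void punchIn(String punchCardId) {
        punchedIn.putIfAbsent(punchCardId, System.currentTimeMillis());
    }

    /** Punch Out: drop the card once the batch has left. */
    public void punchOut(String punchCardId) {
        punchedIn.remove(punchCardId);
    }

    /** Snapshot of cards that punched in but have not punched out yet (assumed helper). */
    public Map<String, Long> stillIn() {
        return Collections.unmodifiableMap(new HashMap<>(punchedIn));
    }

    public static void main(String[] args) {
        PunchClock clock = PunchClock.getInstance();
        clock.punchIn("batch-1");
        clock.punchIn("batch-2");
        clock.punchOut("batch-1");
        // batch-1 has left, so only batch-2 remains in the snapshot
        System.out.println(clock.stillIn().keySet());
    }
}
```

Anything left in the map is a batch that entered and has not left, which is the "who is still in the office?" question. putIfAbsent is used here so that punching in several times with the same card keeps the earliest entry time.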
My Idea: A batch of tuples punches In and punches Out in a spout.
In the emitBatch method of the transactional spout:
PunchClock.getInstance().punchIn(punchCardId);   // ← Punch In
collector.emit(tuples);                          // ← Emit tuple(s)
PunchClock.getInstance().punchOut(punchCardId);  // ← Punch Out
My Idea: A batch of tuples punches In and punches Out in a bolt.
In the prepare method of the transactional bolt:
punchCardId = "Bolt__" + Thread.currentThread().getId() + "__" + System.currentTimeMillis();  // ← Create punch card for txn
In the execute method of the transactional bolt:
PunchClock.getInstance().punchIn(punchCardId);   // ← Punch In
In the finishBatch method of the transactional bolt:
PunchClock.getInstance().punchOut(punchCardId);  // ← Punch Out
Is it intrusive?
Yes, but it's a simple put/remove call to a hashmap.
Compared to logging, it's cheaper.
Punch Clocks
● Spouts/bolts are housed in a Storm worker JVM.
● One Punch Clock per JVM.
● Since we have multiple JVMs, we have multiple Punch Clocks.
● Batches move across Storm workers and we have multiple JVMs, so:
○ We need to aggregate the data across Punch Clocks.
○ Expose the Punch Clock via JMX.
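One way the JMX exposure could look, as a rough sketch using only the standard javax.management API. The outer class PunchClockJmxSketch, the attribute names and the object name "punchclock.sketch:type=PunchClock" are assumptions made for illustration; the linked repository may do this differently. Aggregation across JVMs is not shown: a collector would connect to each worker's JMX endpoint and read these attributes.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class PunchClockJmxSketch {

    /** Standard MBean interface: must be public and named <ImplementationClass>MBean. */
    public interface PunchClockMBean {
        int getPunchedInCount();
        String[] getPunchedInCards();
    }

    /** Simplified punch clock (non-singleton here, to keep the sketch short). */
    public static class PunchClock implements PunchClockMBean {
        private final Map<String, Long> punchedIn = new ConcurrentHashMap<>();

        public void punchIn(String id)  { punchedIn.putIfAbsent(id, System.currentTimeMillis()); }
        public void punchOut(String id) { punchedIn.remove(id); }

        @Override
        public int getPunchedInCount() { return punchedIn.size(); }

        @Override
        public String[] getPunchedInCards() { return punchedIn.keySet().toArray(new String[0]); }
    }

    /** Registers the clock with this JVM's platform MBean server (object name is hypothetical). */
    static ObjectName register(PunchClock clock) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("punchclock.sketch:type=PunchClock");
        if (server.isRegistered(name)) {
            server.unregisterMBean(name); // allow re-registration in the same JVM
        }
        server.registerMBean(clock, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        PunchClock clock = new PunchClock();
        ObjectName name = register(clock);
        clock.punchIn("Bolt__1__1000");
        // Read the attribute back the way a JMX client would
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println(server.getAttribute(name, "PunchedInCount"));
    }
}
```

With the MBean registered, a tool such as JConsole attached to a worker JVM can list the cards that are still punched in on that host.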
demo:
thank you
https://github.com/jaihind213/storm-punch-clock
sweetweet213@twitter