![Page 1: NanoLog: A Nanosecond Scale Logging System · Overview • Implemented a fast C++ Logging System • 12.5ns median latency at 60M log msgs/sec • 10-100x faster than existing systems](https://reader030.vdocuments.net/reader030/viewer/2022041120/5f341302be508a52ce2d6bee/html5/thumbnails/1.jpg)
NanoLog: A Nanosecond Scale Logging System
Stephen Yang, John Ousterhout
February 9th, 2017, Platform Lab Review 2017
Overview
• Implemented a fast C++ Logging System
  • 12.5 ns median latency at 60M log msgs/sec
  • 10-100x faster than existing systems such as Log4j2 and spdlog
  • Maintains printf-like semantics
• Shifts work out of the runtime hot-path
  • Extraction of static information at compile-time
  • Compacted binary output at runtime
  • Defers formatting to an offline process
• Benefits and Costs
  • Allows detailed logs in low latency systems
  • Comes at the cost of 1MB of RAM per thread, one core, and disk bandwidth
Why Fast Logging?
• Cornerstone of debugging
  • Affords visibility into application state
  • Helps in root cause analysis after execution
• Problem: Logging is slow
  • Application response times are getting faster (microseconds)
  • Logging is not (100s to 1000s of nanoseconds)
  • Example: RAMCloud response time = 5 µs, but log time = 1 µs
What makes logging slow?
1473057128.133777014 src/LogCleaner.cc:826 in TombstoneRatioBalancer NOTICE: Using tombstone ratio balancer with ratio = 0.400000
• Compute: Complex Formatting
  • Loggers need to provide context (e.g. file location, time, severity)
  • The message above has 7 arguments and takes 850 ns to compute
• Output Bandwidth: Disk IO
  • On a 250 MB/s disk, the 129-byte message above takes about 500 ns to output!
Solutions
• Compute: Raw Data Output
  • Most logs in production are not consumed by humans
  • Save computation by deferring formatting to an offline process
  • Side benefit: more efficient for analysis engines
• IO: Extracting Static Information
  • Static info in message: file location, line #, function, severity, format string
  • Replace with an identifier and compact the remaining dynamic information
1473057128.133777014 src/LogCleaner.cc:826 in TombstoneRatioBalancer NOTICE: Using tombstone ratio balancer with ratio = 0.400000
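One way to picture the static/dynamic split is a catalog of static records indexed by an identifier, plus a small runtime record that carries only the identifier and the dynamic values. This is a minimal sketch under stated assumptions, not NanoLog's actual data layout: `StaticLogInfo`, `StaticCatalog`, and `DynamicRecord` are hypothetical names, and the catalog is filled by hand here rather than by a preprocessor.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Static portion of a log statement: identical on every invocation.
struct StaticLogInfo {
    std::string file;
    uint32_t line;
    std::string function;
    std::string severity;
    std::string format;  // printf-style format string
};

// Hypothetical catalog: the index in the vector serves as the unique ID.
class StaticCatalog {
public:
    uint32_t add(StaticLogInfo info) {
        entries_.push_back(std::move(info));
        return static_cast<uint32_t>(entries_.size() - 1);
    }
    const StaticLogInfo& lookup(uint32_t id) const { return entries_.at(id); }

private:
    std::vector<StaticLogInfo> entries_;
};

// Dynamic portion of the example message: only this needs to be
// recorded at runtime (before any further compaction).
struct DynamicRecord {
    uint32_t id;           // refers to a StaticLogInfo entry
    uint64_t timestampNs;  // time of the log call
    double ratio;          // the one runtime argument in the example
};
```

For the example message, the file name, line number, function, severity, and format string would live once in the catalog, and each invocation would record only a `DynamicRecord`, which is far smaller than the 129-byte formatted line.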
NanoLog System Architecture
[Figure: system architecture diagram in three stages. Compilation-Time: User Sources, NanoLog Preprocessor, Processed User Sources, Library Sources, GCC, Application Executable, Decompressor/Aggregator. Runtime: Application Executable with User Threads, per-thread Buffers, and the NanoLog Runtime, producing a Compact Log. Offline: Decompressor/Aggregator producing a Human Readable Log.]
Compile-time Optimizations
[Figure: diagram with boxes User Source (main.cc), Post-Processed User Source (main.ii), NanoLog Library (StaticInfo.cc), Application Executable, and Decompressor Executable; steps labeled (a) Extract static log info and (b) Inject optimized log code; arrows labeled "compile".]
Fast Runtime Architecture
• Isolate the Threads
  • Use per-thread buffers to lower synchronization
  • Don’t notify the background thread; let it poll for data
• Minimize Output Cost
  • Caller pushes data uncompressed to save on compute
  • IO Thread needs to save on both IO and compute times
  • Use only rudimentary compaction (deltas + smallest byte representations)
[Figure: Runtime diagram with three User Threads, each with its own Buffer, the NanoLog Background Thread, and the Output Log File.]
Output Log File: [1 byte Header][1-4 byte Unique Id][1-8 byte Time diff][0-4 bytes size][0-n bytes arguments] ....
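The per-thread buffer with a polling consumer can be sketched as a single-producer/single-consumer ring buffer. This is an illustrative sketch only, not NanoLog's actual buffer: `StagingBuffer`, `tryPush`, and `tryPop` are hypothetical names, and a fixed-capacity ring of whole records is assumed for simplicity.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer. Exactly one user thread
// calls tryPush and exactly one background thread calls tryPop, so no
// locks or notifications are needed. Holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class StagingBuffer {
public:
    // Producer side (owning user thread). Returns false if the buffer is full.
    bool tryPush(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;  // full: consumer has not caught up yet
        }
        slots_[head] = item;
        head_.store(next, std::memory_order_release);  // publish the item
        return true;
    }

    // Consumer side (background thread). Returns false if nothing is pending.
    bool tryPop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;  // empty: caller simply polls again later
        }
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

In this sketch the background thread would loop over every thread's `StagingBuffer`, calling `tryPop` until it returns false, then move on to the next buffer. The producer never signals the consumer, which keeps the logging call short at the cost of a core spent polling.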
Decompressor/Aggregator
• Offline process to decompress log
  • Recombines the static + dynamic data to produce a human-readable file
• Future Work
  • Query/Aggregate in compacted format
[Figure: Compact Log File, Decompressor/Aggregator, Human Readable Log File]
Compact Log File: [1 byte Header][1-4 byte Unique Id][1-8 byte Time diff][0-4 bytes size][0-n bytes arguments] ....
Human Readable Log File: 2/9/17 12:45:24 [main]: Hello World 21
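The "deltas + smallest byte representations" compaction and its reversal in the decompressor can be sketched as follows. This is a hedged illustration, not the actual on-disk encoding: `bytesNeeded`, `appendCompact`, and `readCompact` are hypothetical helpers, little-endian byte order is assumed, and the byte count is assumed to be stored elsewhere (for example in the record header).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes (1-8) needed to hold `value` once leading zero bytes
// are dropped.
inline unsigned bytesNeeded(uint64_t value) {
    unsigned n = 1;
    while (n < 8 && (value >> (8 * n)) != 0) {
        ++n;
    }
    return n;
}

// Appends the low bytesNeeded(value) bytes of `value`, little-endian,
// and returns how many bytes were written.
inline unsigned appendCompact(std::vector<uint8_t>& out, uint64_t value) {
    const unsigned n = bytesNeeded(value);
    for (unsigned i = 0; i < n; ++i) {
        out.push_back(static_cast<uint8_t>(value >> (8 * i)));
    }
    return n;
}

// Decompressor side: rebuilds a value from `n` bytes starting at `offset`.
inline uint64_t readCompact(const std::vector<uint8_t>& in,
                            std::size_t offset, unsigned n) {
    uint64_t value = 0;
    for (unsigned i = 0; i < n; ++i) {
        value |= static_cast<uint64_t>(in[offset + i]) << (8 * i);
    }
    return value;
}
```

A full nanosecond timestamp such as 1473057128133777014 needs all 8 bytes, while the difference between two log calls 250 ns apart fits in 1 byte. That is the motivation for storing a time difference rather than an absolute time; the decompressor adds each decoded difference to the previous timestamp to recover the original.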
Benchmarks
• System Setup
  • Processor: [email protected]
  • Memory: 24 GB DDR3 @ 1333 MHz
  • Disk: 120 GB Crucial M4 over SATA II (~250 MB/s)
• Test Setup
  • 100M iterations of log messages, back to back
  • Log Message: “{time}{severity}:{56-byte message}”
• Overall Results

| Zero Arguments | Boost v1.55 | Log4j2 | spdlog | NanoLog |
|---|---|---|---|---|
| Throughput (Log/s) | 0.82M | 1.43M | 1.50M | 60.1M |
| Average Latency (ns) | 1110 | 697 | 668 | 16.5 |
[Figure: bar chart "Throughput vs. System"; y-axis: Throughput (Millions Logs/sec), 0 to 60; bars: Boost Log 0.82, Log4j2 1.43, spdlog 1.5, NanoLog 60.1.]
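The benchmark numbers can be related to each other with a few lines of arithmetic. The sketch below only restates figures from the slide; the helper names are hypothetical, 1 MB is taken as 10^6 bytes, and the byte budget assumes the disk is the only bottleneck, which the slide does not claim.

```cpp
// Throughput in millions of log messages per second, copied from the
// Overall Results table (zero-argument log message).
const double kBoost = 0.82;
const double kLog4j2 = 1.43;
const double kSpdlog = 1.50;
const double kNanoLog = 60.1;

// Disk bandwidth from System Setup, assuming 1 MB = 10^6 bytes.
const double kDiskBytesPerSec = 250e6;

// How many times higher one throughput is than another.
double speedup(double fast, double slow) { return fast / slow; }

// Mean time between back-to-back messages, in ns, at a given throughput.
double meanIntervalNs(double millionLogsPerSec) {
    return 1000.0 / millionLogsPerSec;
}

// Average bytes per message the disk can absorb at a given throughput,
// assuming disk bandwidth is the only limit (an assumption).
double byteBudgetPerMessage(double diskBytesPerSec, double millionLogsPerSec) {
    return diskBytesPerSec / (millionLogsPerSec * 1e6);
}
```

With these inputs the throughput ratios come out to roughly 73x over Boost, 42x over Log4j2, and 40x over spdlog, within the 10-100x range stated in the overview. `meanIntervalNs(kNanoLog)` is about 16.6 ns, close to the 16.5 ns average latency in the table, and `byteBudgetPerMessage(kDiskBytesPerSec, kNanoLog)` is about 4.2 bytes, which shows why each record has to be compacted to a handful of bytes to keep up with the disk.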
Tail Latencies
[Figure: log-log plot; x-axis: Latency (ns), 10^0 to 10^9; y-axis: Fraction of Logs, 10^-8 to 10^0; curves: Boost, Log4j2, spdlog, NanoLog; annotation: Kernel Interference.]
Tail Latency (+ NanoLog Compute)
[Figure: log-log plot; x-axis: Latency (ns), 10^0 to 10^9; y-axis: Fraction of Logs, 10^-8 to 10^0; curves: Boost, Log4j2, spdlog, NanoLog, NanoLog with 10ns Compute; annotation: Kernel Interference.]
Increasing Parameters
[Figure: bar chart "Throughput with increasing “%d” parameters"; y-axis: Millions of Log Messages/second, 0 to 70; groups: Small Integers (~1 byte) and Large Integers (~4 bytes); series: 0 Params through 5 Params. Bar labels in extraction order: 60.1, 60, 37, 21, 25, 13.7, 21.6, 9.7, 17.6, 8, 14.5, 6.41.]
Limitations/Future Work
• Better Compression?
  • Is there a better way to compact the output, but in a performant way?
• Fully featured decompressor/aggregator
  • Operating on the compact representation is more efficient.
  • Iterating over a compact log message takes about 100 ns vs. 1.3 µs to output
• Resource Utilization
  • Currently the system requires 1MB per user thread, a full core to compact, and the full bandwidth of a SATA SSD to maintain low latency. How does this change with new hardware?
NanoLog System Summary
• Compile-Time Preprocessor
  • Extract static information from log messages at compile time
    • File name, line #, function name, etc.
  • Catalogs static info and assigns a unique ID to each log statement
  • Code injection to record only an identifier + parameter arguments
• Runtime Library
  • Producer/Consumer log output
  • Simple compaction (taking deltas/compacting integers)
• Offline Decompressor/Aggregator
  • Recombine static information for human consumption (if necessary)
  • Offline Search/Grep/Aggregate in compressed format
Questions