A18 – DEDUPING PACKET CAPTURE FILES AT LAYER 3
Using Wiretap to Create a Custom De-duping Application
Robert Bullen, Blue Cross Blue Shield MN
Agenda
Of General Interest
  Define Layer 2 and Layer 3 duplicates
  Explain when and why duplicates are captured
  Cover existing deduping solutions
  Introduce a new solution
  Show demos throughout
Of Interest to Developers
  Outline the new solution’s architecture
  Explain how Wiretap was leveraged
  Show code samples
  Model the new solution’s duplicate detection algorithm
DEALING WITH DUPLICATE PACKETS
Definitions
Layer 2 duplicates are packets that are byte-for-byte identical
Layer 3 duplicates are packets that have variations in Layer 2 encapsulation, and even slight variations in Layer 3 headers, as they are routed through a network
  MAC addresses and VLAN tags will change
  The IP TTL field will decrement
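As a rough illustration of the two definitions, the sketch below reduces an Ethernet frame to a Layer 3 comparison key. It skips the MAC addresses and any VLAN tags, then zeroes the IPv4 TTL and header checksum. The function name `layer3_key` is hypothetical, and masking the checksum is an assumption made here because the checksum has to change whenever the TTL does. It is not taken from any particular tool.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
VLAN_ETHERTYPES = {0x8100, 0x88A8}  # 802.1Q and 802.1ad tags


def layer3_key(frame: bytes):
    """Return the bytes to compare for Layer 3 deduplication, or None if not IPv4."""
    offset = 12  # skip destination and source MAC addresses
    if len(frame) < offset + 2:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    # Skip stacked VLAN tags: 2 bytes of tag control, then the inner EtherType.
    while ethertype in VLAN_ETHERTYPES:
        if len(frame) < offset + 4:
            return None
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if ethertype != ETHERTYPE_IPV4 or len(frame) < offset + 20:
        return None
    packet = bytearray(frame[offset:])
    packet[8] = 0                 # TTL decrements at every routed hop
    packet[10:12] = b"\x00\x00"   # header checksum changes along with the TTL
    return bytes(packet)
```

Two frames are Layer 2 duplicates when the raw bytes are equal, and (under the assumptions above) Layer 3 duplicates when their `layer3_key` values are equal and not `None`.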
Duplicate packets can be found in trace files in the following scenarios
  When SPANing VLANs
  When SPAN bugs exist in routers/switches
  When aggregating multiple capture points by way of
    Packet brokers
    Mining multiple ports of a sniffer
    Manually merging trace files
Analyzing trace files containing duplicate packets is difficult for both tools and their users
  Tools inflate error counts, which can cause incorrect or misleading diagnoses; e.g. Wireshark will flag TCP duplicate segments as retransmissions or out-of-order
  Tool features can fail in the face of duplicates; e.g. Wireshark’s SSL decryption
  Users can get confused following a packet flow; it’s hard enough as it is!
Problem Statement
Scenario 1: Intermittent Packet Loss Between Firewall and Load Balancer
[Network diagram; labels: Firewall, Web App Firewall, Load Balancer, Web Server, Tap (x2), Mirror, Packet Broker, Sniffer, GET]
Demo 1—Wireshark
Avoid duplicates by crafting packet captures carefully
  Isolate packet sources (not possible in retrospective analysis)
  Be precise with packet broker/sniffer filters on MAC addresses and VLAN tags in addition to IP addresses
Sniffer on-board deduplication
  Layer 2 deduplication only (byte-for-byte)
  Potentially a bad idea for continuous capture jobs because those captures would no longer accurately reflect the real world
Packet broker on-board deduplication
  Some have configurable Layer 3 deduplication
  Potentially an extra license cost
  More of a point solution, not general purpose
  Probably a good idea for monitoring appliances (APM or security)
  Potentially a bad idea for continuous capture jobs on sniffers because those captures would no longer accurately reflect the real world
Editcap
  Part of Wireshark tool suite
  Layer 2 deduplication only (byte-for-byte)
Existing Solutions
Demo 2—Editcap
Frame Queuing
  Pro: Fixed frame count means static RAM usage
  Con: Duplicate detection depends on burstiness
Time Queuing
  Pro: Duplicate detection unaffected by burstiness
  Con: Variable frame count means dynamic RAM usage
Duplicate Detection Queuing Modes
Time queuing often yields more intuitive results
Frame queuing can be handy when it is desirable to restrict deduplication to just one or two packets away
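The difference between the two modes can be sketched as two eviction rules over the same queue. This is a minimal illustration, not the tool's code. The function names, the `(timestamp, payload)` record shape, and the sample values are all assumptions.

```python
from collections import deque


def evict_by_frames(queue: deque, max_frames: int) -> None:
    """Frame queuing: keep at most max_frames earlier frames, regardless of timing."""
    while len(queue) > max_frames:
        queue.popleft()


def evict_by_time(queue: deque, now: float, max_delta: float) -> None:
    """Time queuing: keep every earlier frame no older than max_delta seconds."""
    while queue and now - queue[0][0] > max_delta:
        queue.popleft()


def dedupe(frames, evict):
    """frames: iterable of (timestamp, payload); evict(queue, timestamp) trims the window."""
    queue, output = deque(), []
    for timestamp, payload in frames:
        evict(queue, timestamp)
        if not any(payload == earlier for _, earlier in queue):
            output.append((timestamp, payload))
        queue.append((timestamp, payload))
    return output


# A short burst: the duplicate of b"A" arrives three frames, but only 3 ms, later.
burst = [(0.000, b"A"), (0.001, b"B"), (0.002, b"C"), (0.003, b"A")]
by_frames = dedupe(burst, lambda q, t: evict_by_frames(q, 2))
by_time = dedupe(burst, lambda q, t: evict_by_time(q, t, 0.010))
print(len(by_frames), len(by_time))  # prints "4 3"
```

With a two-frame window the burst pushes the original out of the queue before its duplicate arrives, so the duplicate survives. The 10 ms time window catches it, at the cost of a queue whose length depends on traffic rate.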
Scenario 2: Layer 2 & Layer 3 Duplicates with Differing MAC Addresses and VLAN Tags
[Network diagram; labels: VLAN 200, VLAN 203, Firewall, Web App Firewall, Load Balancer, Web Server (x2), Tap (x3), Mirror, Packet Broker, Sniffer]
Demo 3—Wireshark, Editcap, Wireshark
Super Deduper’s Features
Presents a GUI that reads and writes all Ethernet trace file formats supported by Wireshark
Accepts files via drag & drop
Propagates the input file’s format to, or forces a chosen format on, the output file
Detects duplicates in two queuing modes
  Time queuing: detects duplicates up to a maximum delta time
  Frame queuing: detects duplicates up to a maximum separation by frame count
Deduplicates IPv4 packets by ignoring layer 2 differences in frames (e.g. MAC addresses, VLAN tags)
Supports optionally deduping non-IPv4 frames byte-for-byte at layer 2, just like editcap
Processes multiple files in two modes
  Batch mode dedupes each input file independently to its own output file
  Merge mode aggregates all input files while deduping and merges them into a single output file
Shifts timestamps on-the-fly on a per-file basis
Generates an HTML summary with detailed logs of per-packet handling
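One way to picture merge mode combined with per-file timestamp shifting is a lazy chronological merge of already-sorted inputs. The sketch below is an assumption about how such a step could look, not Super Deduper's implementation. The names `shifted` and `merge_files` and the `(timestamp, frame)` record shape are hypothetical.

```python
import heapq


def shifted(records, shift_seconds):
    """Yield (timestamp, frame) records with a per-file time shift applied."""
    for timestamp, frame in records:
        yield (timestamp + shift_seconds, frame)


def merge_files(files):
    """files: list of (records, shift_seconds); each records list is sorted by time.

    Returns one chronologically ordered stream, ready to be fed to a deduper.
    """
    streams = [shifted(records, shift) for records, shift in files]
    return heapq.merge(*streams, key=lambda record: record[0])


file_a = [(1.0, b"a1"), (3.0, b"a2")]
file_b = [(0.5, b"b1"), (1.5, b"b2")]   # this capture's clock runs 1 s behind
merged = list(merge_files([(file_a, 0.0), (file_b, 1.0)]))
print([frame for _, frame in merged])   # prints "[b'a1', b'b1', b'b2', b'a2']"
```

Shifting before merging matters for deduplication: time queuing can only catch duplicates from two capture points if their clocks have first been brought into line.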
Demo 4—Super Deduper
Deduping can mask problems like packet storms
Deduping can mask legitimate retransmissions that are byte-for-byte identical to the original segment (but shouldn’t be)
Deduping packets captured on network segments with nontrivial latency between them can result in misleading timings
When packets have been sliced (truncated), layer 3 duplicates whose layer 2 header lengths differ, and whose frames are longer than the slice limit, will not be detected
Use Caution When Deduping
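The slicing caveat is easiest to see with numbers. In the sketch below, the slice limit, the header lengths, and the helper name `captured_layer3` are all made up for illustration. The same packet is captured once untagged and once with a VLAN tag, and both frames are truncated to the same length.

```python
SNAPLEN = 64  # hypothetical slice limit in bytes


def captured_layer3(frame: bytes, layer2_len: int, snaplen: int = SNAPLEN) -> bytes:
    """Return the Layer 3 bytes that survive slicing the frame to snaplen bytes."""
    return frame[:snaplen][layer2_len:]


packet = bytes(range(100))       # stand-in for a 100-byte IP packet
untagged = bytes(14) + packet    # 14-byte Ethernet header
tagged = bytes(18) + packet      # Ethernet header plus one 4-byte VLAN tag

a = captured_layer3(untagged, 14)   # 50 bytes of the packet survive
b = captured_layer3(tagged, 18)     # only 46 bytes survive
print(len(a), len(b), a == b)       # prints "50 46 False"
```

The longer layer 2 header eats into the slice budget, so the two copies no longer carry the same layer 3 bytes and a straight comparison misses the duplicate. Here the shorter capture is a prefix of the longer one (`a[:len(b)] == b`), though whether a prefix comparison is an acceptable fix is a separate question.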
INFORMATION FOR DEVELOPERS
Wiretap: C API
Wiretap .NET Wrapper: Object-oriented Interop, Timeshifting & Merging
Model: Deduplication Logic
View Models: UI Interaction Logic
WPF Views: UI Presentation
Super Deduper’s Architecture
wiretap-2.0.0
libglib-2.0.0
libintl-8
libgmodule-2.0.0
libwsutil
libgcrypt-20
libgpg-error6-2.0.0
zlib1
msvcrt
Wiretap Library Dependency Tree
Improve API documentation
Design Wiretap as a reusable module
  Lessen dependencies on external modules, mainly the libwsutil branch
Improve API consistency
  Some functions return strings that are dynamic and must be freed or a memory leak will occur, while others return strings that are static and must not be freed. There is no way to tell how a string should be handled without diving into the code.
  Use GLib types everywhere
Improve source code organization
  Nearly the entire API is declared in wtap.h, but some is implemented in wtap.c and some in file_access.c
  The API should also be organized into logical regions with their own headers and implementation files, much like how each file format is treated
Wiretap Wish List
Wiretap API Interop Declarations
Sample C# Method Using Wiretap .NET Wrapper
How Duplicate Detection Works
[Diagram; labels: IPv4 Packet Hash Table (hash function = IP ID) with slots 0 through 65535; Frame Circular Queue (ring buffer); Packet A, Packet B, Frame C, Packet D, Packet A’; Output File]
1. Frame C is read.
2. Frame C is added to the queue but not the hash table because it isn’t an IPv4 packet with an IP ID.
3. The queue is searched linearly for a duplicate to Frame C.
4. Frame C is found to be unique, so it is written to the output file.
5. Packet D is read and added to the queue.
6. Packet D is also inserted into the hash table.
7. There are no other packets having the same IP ID so it must be unique and is written to the output file.
8. Packet A’ is read and added to the queue.
9. Packet A’ collides with Packet A in the hash table, and furthermore is a duplicate of Packet A. It is not written to the output file.
10. The hash table always refers to the most recent instance, so it is updated to point to Packet A’.
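The walkthrough above can be modeled in a few lines. This is a simplified sketch, not Super Deduper's code. It uses a Python dict of per-ID lists where the slide shows a 65,536-slot table, it implements only a frame-count window, and the class name, IP ID values, and comparison keys are invented for the example. In practice the key would be whatever bytes the deduper compares, for instance the packet with layer 2 stripped.

```python
from collections import deque


class Deduper:
    def __init__(self, max_frames=1000):
        self.queue = deque()   # recent frames as (ip_id, key), oldest first
        self.by_ip_id = {}     # IP ID -> keys of queued IPv4 packets with that ID
        self.max_frames = max_frames

    def is_duplicate(self, ip_id, key):
        """ip_id is None for non-IPv4 frames; key holds the bytes to compare."""
        if ip_id is None:
            # Non-IPv4 frames are not in the hash table: search the queue linearly.
            duplicate = any(i is None and k == key for i, k in self.queue)
        else:
            # Only packets that collide on the IP ID need to be compared.
            bucket = self.by_ip_id.setdefault(ip_id, [])
            duplicate = key in bucket
            bucket.append(key)  # the newest instance is tracked even if it is a duplicate
        self.queue.append((ip_id, key))
        if len(self.queue) > self.max_frames:
            self._evict()
        return duplicate

    def _evict(self):
        old_id, _ = self.queue.popleft()
        if old_id is not None:
            bucket = self.by_ip_id[old_id]
            bucket.pop(0)      # the oldest key in a bucket is the one leaving the queue
            if not bucket:
                del self.by_ip_id[old_id]


deduper = Deduper()
trace = [("Packet A", 7, b"a"), ("Packet B", 8, b"b"), ("Frame C", None, b"c"),
         ("Packet D", 9, b"d"), ("Packet A'", 7, b"a")]
written = [name for name, ip_id, key in trace if not deduper.is_duplicate(ip_id, key)]
print(written)  # prints "['Packet A', 'Packet B', 'Frame C', 'Packet D']"
```

Keying on the IP ID means an IPv4 packet is compared only against the few queued packets that share its ID, while non-IPv4 frames fall back to the linear scan described in step 3.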
Finish the BETA
Make a Super Deduper command line variant
Fix the slicing limitation
Implement set operations (union, intersection, difference)
Add an editcap mode
Revamp the GUI
Future Plans/Ideas
THE END