Mart Haitjema
SPP Version 1 NAT Daemon (natd)
2 - Mart Haitjema - 04/22/23
NATD Overview
Manages NAT connections for a Linecard (LC) in SPP
  » Creates NAT connections:
      - Manages UDP and TCP ports and ICMP IDs on a per-interface basis
      - Translates a board’s (GPE or CP) UDP/TCP port number or ICMP ID to an interface’s externally visible port or ICMP ID
      - Enables a connection by installing an ingress and an egress filter in the LC’s TCAM
  » Tracks connection state:
      - UDP/ICMP: by hardware activity monitoring using TCAM aging bits (see Aging)
      - TCP: by tracking connection state (see TCP State Machine)
  » Removes connections:
      - Removes inactive UDP/ICMP connections whose filters have timed out
      - Removes stale TCP connections that have timed out in a particular state
      - Disables a connection by removing its ingress and egress filters
Supported NAT connections:
  » Connections initiated from a board in SPP
      - UDP: identified by a 2-tuple, maps to a public UDP port
          [board MAC, board port] -> public port
      - TCP: identified by a 4-tuple, maps to a public TCP port
          [board MAC, board port, remote IP, remote port] -> public port
      - ICMP echo-request (ping): identified by a 2-tuple, maps to a public ICMP ID
          [board MAC, board ICMP ID] -> public ID
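
The tuple-to-public-port mappings listed above could be modelled roughly as follows. This is only a sketch: the names Mac, UdpKey, TcpKey, IcmpKey, and NatMap are hypothetical and are not taken from the natd sources, and the sequential port assignment is an assumption made to keep the example short.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical key types built from the tuples listed on the slide.
using Mac = std::array<std::uint8_t, 6>;
using UdpKey = std::tuple<Mac, std::uint16_t>;    // [board MAC, board port]
using TcpKey = std::tuple<Mac, std::uint16_t,     // [board MAC, board port,
                          std::uint32_t,          //  remote IP,
                          std::uint16_t>;         //  remote port]
using IcmpKey = std::tuple<Mac, std::uint16_t>;   // [board MAC, board ICMP ID]

// Hypothetical per-interface map from a connection key to its public port/ID.
template <typename Key>
class NatMap {
public:
    NatMap(std::uint32_t first, std::uint32_t last) : next_(first), last_(last) {}

    // Returns the public port already mapped to `key`, or assigns the next
    // unused one from the range. Returns false when the range is exhausted.
    bool lookupOrAssign(const Key& key, std::uint16_t& publicPort) {
        auto it = map_.find(key);
        if (it != map_.end()) {
            publicPort = it->second;
            return true;
        }
        if (next_ > last_) return false;
        publicPort = static_cast<std::uint16_t>(next_++);
        map_.emplace(key, publicPort);
        return true;
    }

private:
    std::map<Key, std::uint16_t> map_;
    std::uint32_t next_;  // next unassigned public port (assumed sequential)
    std::uint32_t last_;  // last public port in the reserved range
};
```

With these keys, two UDP packets from the same board port share one public port regardless of destination, while two TCP connections from the same board port to different remote hosts get separate public ports.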

NATD Overview
Daemon can reside anywhere
  » Intended to run on LC Ingress XScale for performance
Interacts with:
  » SCD
      - Sends packet meta-data for NAT from datapath to natd
      - natd sends back updated meta-data and instructs SCD to forward, drop, or ignore packet
      - Receives write and remove filter instructions from natd
      - Ingress SCD:
          Polls TCAM for filters that have timed out (see “Aging”)
          Informs natd of timed out filters
  » SRM
      - Determines queue/scheduler for NAT to use on each link (board-interface mapping); see “Links”
      - natd queries for this information at startup
  » Flow stats
      - natd informs flow stats of new/removed NAT connections

NAT Message Exchange
Egress SCD to NATD:
  nat_egress: process packet requiring NAT from egress
Ingress SCD to NATD:
  nat_ingress: process packet requiring NAT from ingress
  timed_out_filters: the following filter IDs have timed out through aging
NATD to Ingress SCD:
  nat_filters: tells SCD which filter IDs to use aging with
  write_fltr: install a filter for NAT in LC’s TCAM
  rem_fltr_by_fid: remove a NAT filter from LC’s TCAM
NATD to SRM:
  get_sched_map: get queue/scheduler information for use by NAT connections
[Figure: message-exchange diagram. Labels: Line card, Ingress XScale, Egress XScale, SCD (ingress and egress), TCAM, NATD, PCI bus, Control Processor (CP), SRM.]

NATD Interface
result egress_natd(meta-data)
    valBuf_t meta-data {
        dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
        dw4_t retCode;    // code to scd to drop, forward, or ignore packet
        dw4_t words[7];   // updated meta-data as defined on meta-data slides
    }
  » Sends packet meta-data to natd so natd can manage state for the packet’s connection. natd returns updated meta-data with an instruction for SCD to drop, forward, or ignore the packet
result ingress_natd(meta-data)
    valBuf_t meta-data {
        dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
        dw4_t retCode;    // code to scd to drop, forward, or ignore packet
        dw4_t words[6];   // updated meta-data as defined on meta-data slides
    }
  » Sends packet meta-data to natd so natd can manage state for the packet’s connection. natd returns updated meta-data with an instruction for SCD to drop, forward, or ignore the packet

NATD Interface
status timed_out_filters(ingStartFid, numIngFids, egrStartFid, numEgrFids, ingFids, egrFids)
    dw4_t ingStartFid     // start of range of ingress filters
    dw4_t egrStartFid     // start of range of egress filters
    dw4_t numIngFids      // number of filters polled in ingress DB
    dw4_t numEgrFids      // number of filters polled in egress DB
    valBuf_t ingFids {
        dw4_t fids[]      // list of timed out filter IDs in ingress DB
    }
    valBuf_t egrFids {
        dw4_t fids[]      // list of timed out filter IDs in egress DB
    }
  » Sets/clears the timeout flag for all the filters that natd has state for in the range of filters specified for each database
  » See “Aging” for how the call is used
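
A minimal sketch of the set/clear behaviour described for timed_out_filters(...), applied to one database. The FilterFlags type and the applyTimedOut function are hypothetical names for illustration; natd's actual data structures are not shown in these slides.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical per-database state: filter ID -> "timed out" flag,
// holding only the filters natd currently has state for.
using FilterFlags = std::unordered_map<std::uint32_t, bool>;

// For every tracked filter in [startFid, startFid + numFids), set the flag
// if the filter is in the reported list and clear it otherwise.
// Filters outside the polled range are left untouched.
void applyTimedOut(FilterFlags& flags,
                   std::uint32_t startFid,
                   std::uint32_t numFids,
                   const std::vector<std::uint32_t>& timedOutFids) {
    std::unordered_set<std::uint32_t> timedOut(timedOutFids.begin(),
                                               timedOutFids.end());
    for (auto& entry : flags) {
        const std::uint32_t fid = entry.first;
        if (fid < startFid || fid - startFid >= numFids) continue;
        entry.second = timedOut.count(fid) != 0;
    }
}
```

The same function would be called once with the ingress arguments and once with the egress arguments of the call.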

Links
NAT traffic:
  » Routed across links
  » One link between each SPP board and LC interface
  » Link specifies which queue manager, scheduler, queue, and VLAN should be used to route traffic both in and out of the LC
  » Mappings are retrieved at startup by querying the SRM using the get_sched_map(...) call
      - See http://www.arl.wustl.edu/projects/TeN/ppt/srm.ppt

SCD Changes
Both SCDs:
  » New thread
      - Periodically (10ms) polls for packets in datapath to XScale scratch ring
      - Sends packet meta-data to natd to process
          nat_ingress(...) call for ingress
          nat_egress(...) call for egress
          natd returns
            - updated meta-data if packet needs to be forwarded
            - instruction to drop, forward, or ignore packet
      - If hit bit is not set, XScale has a copy of the packet and must either drop or forward the packet
Ingress only:
  » Starts when natd calls nat_filters(…) on ingress SCD
  » Periodically checks TCAM activity bits for nat filters (see Aging)
  » Uses timed_out_filters(...) to inform natd which filters have timed out and which have not

SCD to NATD: Packet meta-data
Egress:
  Buf Handle (24b)
  IP Pkt Length (16b)
  Eth Hdr Len (8b)
  Flags (8b)
  IP_SAddr (32b)
  SrcMAC (8b)
  TCP/UDP SPort or ICMP ID (16b)
  IP Proto (8b)
  ICMP Type (8b)
  IP_DAddr (32b)
  TCP/UDP DPort (16b)
  TCAM Hit Index (32b)
  IP Hdr 1st Word (32b)
  IP Hdr Top 16 bits of 2nd Word (16b)
Ingress:
  Buf Handle (24b)
  IP Pkt Length (16b)
  Eth Hdr Len (8b)
  Reserved (8b)
  Flags (8b)
  IP DAddr (32b)
  Intf (4b)
  TCP/UDP DPort or ICMP ID (16b)
  Protocol (8b)
  ICMP Type (8b)
  Rsv (4b)
  IP_SAddr (32b)
  TCP/UDP SPort (16b)
  TCAM Hit Index (32b)
  IP Hdr 1st Word (32b)
  IP Hdr Top 16 bits of 2nd Word (16b)
Flags (8b), egress and ingress: TCP Flags (6b), H (1b, Hit), and reserved bits
TCP Flags (6b): S (SYN), R (RST), P (PSH), A (ACK), F (FIN), U (URG), 1b each
Notes:
  TCP state on XScale uses full 5-tuple
  TCP state updates include TCAM Hit Index
From: http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt

NATD to SCD: updated meta-data
Egress:
  Buf Handle (24b)
  IP DAddr (32b)
  IP Pkt Length (16b)
  Reserved (8b)
  Eth Hdr Len (8b)
  IP Hdr 1st Word (32b)
  Flags (8b)
  Translated SPort (16b)
  Stats Index (16b)
  VLAN (12b), PerSchedQID (15b), Sch (3b), QM (2b)
  IP Hdr Top 16 bits of 2nd Word (16b)
  Reserved (16b)
Ingress:
  Reserved (8b)
  Buf Handle (24b)
  IP Pkt Length (16b)
  Translated DPort/ID (16b)
  Stats Index (16b)
  Eth Hdr Len (8b)
  IP Hdr 1st Word (32b)
  Flags (8b)
  VLAN (12b), PerSchedQID (15b), Sch (3b), QM (2b)
  IP Hdr Top 16 bits of 2nd Word (16b)
  Reserved (16b)
Flags (8b), egress and ingress: Reserved (3b), N (1b), H (1b), I (1b), U (1b), T (1b)
natd updates the fields shown in dark blue on the original slide
Flags:
  » H: HIT - Lookup was a valid hit
  » N: NAT - NAT translation is required
  » I: ICMP - ICMP pkt
  » U: UDP - UDP pkt
  » T: TCP - TCP pkt
      - At most one of I/U/T should be set at any time
      - If N is 0, then I/U/T will be ignored
  » HF does not need to do any protocol specific operations for packets that do not require NAT translation
      - No need to send any H=0 pkts to HF
From: http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
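
The flag rules above (at most one of I/U/T set, I/U/T ignored when N is 0) can be sketched as a small decoder. The bit positions are an assumption: the slide lists the fields as Reserved (3b), N, H, I, U, T, and this sketch reads that order from the most significant bit down. The names NatFlags, NatProto, and decodeFlags are hypothetical, and the treatment of N=1 with no protocol bit set is not specified in the slides.

```cpp
#include <cstdint>

// ASSUMPTION: the fields sit MSB-first in the order listed on the slide:
// Reserved[7:5], N[4], H[3], I[2], U[1], T[0].
constexpr std::uint8_t kFlagN = 0x10;  // NAT translation required
constexpr std::uint8_t kFlagH = 0x08;  // lookup was a valid hit
constexpr std::uint8_t kFlagI = 0x04;  // ICMP packet
constexpr std::uint8_t kFlagU = 0x02;  // UDP packet
constexpr std::uint8_t kFlagT = 0x01;  // TCP packet

enum class NatProto { None, Icmp, Udp, Tcp, Invalid };

struct NatFlags {
    bool hit;
    bool nat;
    NatProto proto;
};

NatFlags decodeFlags(std::uint8_t flags) {
    NatFlags out = {(flags & kFlagH) != 0, (flags & kFlagN) != 0, NatProto::None};
    if (!out.nat) return out;  // N == 0: I/U/T are ignored
    switch (flags & (kFlagI | kFlagU | kFlagT)) {
        case kFlagI: out.proto = NatProto::Icmp; break;
        case kFlagU: out.proto = NatProto::Udp; break;
        case kFlagT: out.proto = NatProto::Tcp; break;
        case 0:      out.proto = NatProto::None; break;     // unspecified in the slides
        default:     out.proto = NatProto::Invalid; break;  // more than one of I/U/T
    }
    return out;
}
```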

NATD – Top Level
Single threaded, uses event queue for timed events
On startup, retrieves scheduler information for board/interface mappings from SRM using the get_sched_map(...) call
Main loop:
  » Processes messages from SCDs until next scheduled timeout event
      - i.e. nat_ingress(...), nat_egress(...), and timed_out_filters(...)
      - Installs and removes connections by calling write_fltr(...) and rem_fltr_by_fid(...) on Ingress SCD
  » Services timeout events
      - Events to remove UDP/ICMP connections with timed out filters
      - Events to remove stale TCP connections
      - See slides on Timeout Events
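
One possible shape for a single-threaded timed event queue like the one described above, sketched with a logical clock in seconds. EventQueue, schedule, cancel, nextDeadline, and runDue are hypothetical names; this is not natd's eventManager, and cancellation by lazy deletion is just one way to let queued events be removed.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <unordered_set>
#include <utility>
#include <vector>

class EventQueue {
public:
    using Id = std::uint64_t;

    // Queue `action` to run once the clock reaches `fireAt`.
    Id schedule(std::uint64_t fireAt, std::function<void()> action) {
        assert(static_cast<bool>(action));
        Id id = nextId_++;
        heap_.push(Event{fireAt, id, std::move(action)});
        live_.insert(id);
        return id;
    }

    // Cancelled events stay in the heap and are skipped when they surface.
    void cancel(Id id) { live_.erase(id); }

    // Reports when the next live event fires; the main loop would process
    // SCD messages until then. Returns false if nothing is scheduled.
    bool nextDeadline(std::uint64_t& fireAt) {
        dropCancelled();
        if (heap_.empty()) return false;
        fireAt = heap_.top().fireAt;
        return true;
    }

    // Runs every live event due at or before `now`; returns how many ran.
    int runDue(std::uint64_t now) {
        int ran = 0;
        for (;;) {
            dropCancelled();
            if (heap_.empty() || heap_.top().fireAt > now) break;
            Event ev = heap_.top();
            heap_.pop();
            live_.erase(ev.id);
            ev.action();
            ++ran;
        }
        return ran;
    }

private:
    struct Event {
        std::uint64_t fireAt;
        Id id;
        std::function<void()> action;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.fireAt > b.fireAt;  // earliest deadline on top
        }
    };
    void dropCancelled() {
        while (!heap_.empty() && live_.count(heap_.top().id) == 0) heap_.pop();
    }

    std::priority_queue<Event, std::vector<Event>, Later> heap_;
    std::unordered_set<Id> live_;
    Id nextId_ = 1;
};
```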

New NAT connection example
[Figure: sequence for a new NAT connection. Labels: Datapath (Lookup, TCAM, HdrFormat, SCR), XScale (SCD, NATD). SCD polls for packets, passes the packet meta-data to NATD via nat_ingress/nat_egress, and receives NATD's drop/forward/ignore response with updated meta-data. NATD installs the ingress filter and the egress filter with write_fltr(...).]

Table Structure
[Figure: a natTable (IP Address: XXX.XXX.XXX.XXX, Ifn: X) with its connection tables (labelled tcpTable, tcpTable, and icmpTable on the slide) holding tcpConnection, udpConnection, and icmpConnection entries; each connection is drawn with an ingressFilter and an egressFilter from the filterTable.]
One NAT table per interface
All NAT tables share a pool of filters from the filterTable

TCP State Machine
[Figure: state diagram. States: NULL, SYN-WAIT, ESTABLISHED, INGRESS CLOSED, EGRESS CLOSED, FIN-WAIT. Edge labels: syn, syn ack, fin (ingress), fin (egress), rst, each annotated with one of the numbered actions below; every rst edge carries action 5.]
Transition: Action:
  1  create connection instance, install filters, add tcpSynTout event
  2  remove tcpSynTout event, add tcpIdleTout event
  3  remove tcpIdleTout event, add tcpFinTout event
  4  remove tcpFinTout/tcpIdleTout, re-add tcpIdleTout event
  5  remove connection, filters, & all timeout events
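
A partial sketch of the state machine as a transition function, returning the next state and the numbered action from the list above (0 means no action). Several points are assumptions: that a fin seen on ingress leads to INGRESS CLOSED (and likewise for egress), that the second fin leads to FIN-WAIT with action 3, and that rst applies from every state in which a connection exists. The syn transitions tied to action 4 are left out because their source and target states are not given in the text. TcpState, TcpEvent, Step, and step are hypothetical names.

```cpp
enum class TcpState { Null, SynWait, Established, IngressClosed,
                      EgressClosed, FinWait, Removed };
enum class TcpEvent { Syn, SynAck, FinIngress, FinEgress, Rst };

struct Step {
    TcpState next;
    int action;  // numbered action from the slide, 0 = none
};

Step step(TcpState s, TcpEvent e) {
    // ASSUMPTION: rst tears down the connection from any live state.
    if (e == TcpEvent::Rst && s != TcpState::Null && s != TcpState::Removed) {
        return {TcpState::Removed, 5};
    }
    switch (s) {
        case TcpState::Null:
            if (e == TcpEvent::Syn) return {TcpState::SynWait, 1};
            break;
        case TcpState::SynWait:
            if (e == TcpEvent::SynAck) return {TcpState::Established, 2};
            break;
        case TcpState::Established:
            if (e == TcpEvent::FinIngress) return {TcpState::IngressClosed, 0};
            if (e == TcpEvent::FinEgress) return {TcpState::EgressClosed, 0};
            break;
        case TcpState::IngressClosed:
            if (e == TcpEvent::FinEgress) return {TcpState::FinWait, 3};
            break;
        case TcpState::EgressClosed:
            if (e == TcpEvent::FinIngress) return {TcpState::FinWait, 3};
            break;
        case TcpState::FinWait:
        case TcpState::Removed:
            break;
    }
    return {s, 0};  // event does not change this state in the sketch
}
```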

Timeout Events TCP
TCP timeouts
  » All timeouts remove connection when they fire
  » tcpSynTout:
      - Period: 5 minutes
      - Installed when connection transitions to SYN-WAIT state
      - Removed when connection transitions to ESTABLISHED state
  » tcpIdleTout:
      - Period: 24 hours
      - Installed when connection transitions to ESTABLISHED state
      - Removed when connection transitions to FIN-WAIT state
  » tcpFinTout:
      - Period: 5 minutes
      - Installed when connection transitions to FIN-WAIT state
      - Removed if connection is closed

Timeout Events UDP/ICMP
UDP & ICMP timeouts
  » udpAgeTout / icmpAgeTout
      - Period: 5 minutes
      - Removes connection if both ingress & egress filter for connection have timed out

Aging
Hardware aging:
  » Uses TCAM’s hardware activity bits
  » See “TCAM and Aging” in http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
Algorithm:
  » SCD
      - Polls TCAM for filters that have timed out
          Uses the range of filter IDs specified by nat_filters(…) call. Range must be a multiple of 32
          Calls IdtSearchDatabaseSwAgeAndGetAgedEntries(...) to get timed out filters in subset of range of filter IDs in each database
          Checks entire range of nat filters every 5 minutes
          Checks the same range of filter IDs in ingress & egress database at the same time
      - Informs natd which filters have timed out in each range via timed_out_filters(…) call
  » natd
      - Updates state of each filter in range of filters specified in timed_out_filters(...)
          For each filter in specified range that natd has state for:
            Sets timed out flag if SCD reported the filter as timed out
            Clears timed out flag otherwise
      - Each UDP/ICMP connection has a timeout event that fires every 5 minutes
          If both filters have timed out, connection is removed
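
The removal rule for UDP/ICMP connections (remove only when both the ingress and the egress filter are flagged as timed out) can be sketched as below. AgedConn, TimedOutFlags, and expiredConnections are hypothetical names, and keying connections by public port is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical connection record: the two filter IDs installed for it.
struct AgedConn {
    std::uint32_t ingressFid;
    std::uint32_t egressFid;
};

// Hypothetical per-database flag table: filter ID -> timed out?
using TimedOutFlags = std::unordered_map<std::uint32_t, bool>;

static bool isTimedOut(const TimedOutFlags& flags, std::uint32_t fid) {
    auto it = flags.find(fid);
    return it != flags.end() && it->second;
}

// Returns the public ports of connections whose ingress AND egress filters
// are both flagged as timed out, in ascending order.
std::vector<std::uint16_t> expiredConnections(
        const std::unordered_map<std::uint16_t, AgedConn>& conns,
        const TimedOutFlags& ingress,
        const TimedOutFlags& egress) {
    std::vector<std::uint16_t> expired;
    for (const auto& kv : conns) {
        if (isTimedOut(ingress, kv.second.ingressFid) &&
            isTimedOut(egress, kv.second.egressFid)) {
            expired.push_back(kv.first);
        }
    }
    std::sort(expired.begin(), expired.end());
    return expired;
}
```

A connection with traffic in only one direction keeps one flag clear and therefore survives the check.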

Status
To do:
  » Finish TCAM aging – need to debug IDT call - FINISHED
  » Fix eventManager to allow events on queue to be removed
  » Send connection information to flow stats
  » Implement hash functions for faster connection state lookup
Open issues:
  » Burst of UDP packets not handled well

File Structure
techX repository: wu_arl/dnet/npe/natd
Files:
  » bitmap.{cc,h}
      - bitmap/portmap class used for managing freelist of available ports/IDs
  » boards.{cc,h}
      - defines board & link classes
  » connections.{cc,h}
      - defines ICMP, UDP, and TCP connection data structures
  » events.{cc,h}
      - all timeout events
  » filters.{cc,h}
      - filter code and filter table
      - includes calls to SCD to install/uninstall filters
  » natd.{cc,h}
      - reads configuration file, gets scheduler mappings from SRM, includes main processing loop
  » statOp.{cc,h}
      - code for natd interface calls [egress_nat(...), ingress_nat(...), and timed_out_filters(...)]
  » tables.{cc,h}
      - defines all table data structures [natTable, icmpTable, udpTable, and tcpTable]
      - manages all connection state (e.g. open/close connection, TCP state transitions, etc.)
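
As an illustration of the bitmap/portmap idea mentioned for bitmap.{cc,h}, a freelist over a reserved port range might look like the sketch below. The PortMap class here is hypothetical and is not the class from the repository; the lowest-free-first policy is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical freelist of ports/IDs in [first, last], one bit per port.
class PortMap {
public:
    PortMap(std::uint32_t first, std::uint32_t last)
        : first_(first), used_(rangeSize(first, last), false) {}

    // Hands out the lowest free port; returns false when none is left.
    bool allocate(std::uint16_t& port) {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) {
                used_[i] = true;
                port = static_cast<std::uint16_t>(first_ + i);
                return true;
            }
        }
        return false;
    }

    // Returns a port to the freelist when its connection is removed.
    void release(std::uint16_t port) {
        assert(port >= first_ && port - first_ < used_.size());
        used_[port - first_] = false;
    }

private:
    static std::size_t rangeSize(std::uint32_t first, std::uint32_t last) {
        assert(first <= last && last <= 65535u);
        return static_cast<std::size_t>(last - first) + 1;
    }

    std::uint32_t first_;
    std::vector<bool> used_;  // true = port currently in use
};
```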

Configuration File Format
myAddr = 0                          natd’s address
myPort = 5050                       natd’s port
scdAddr = 0                         scd’s address
scdPort = 7070                      scd’s port
srmAddr = 192.168.32.2              srm’s address
srmPort = 6060                      srm’s port
loglvl = Loud                       logging verbosity

[GeneralParameters]
tcpSynTimeOut = 300                 timeout in syn-wait state
tcpFinTimeOut = 300                 timeout in fin-wait state
tcpIdleTimeOut = 86400              timeout in established state
agingPollInterval = 300             period for udp/icmp timeout
ingressStartFid = 0                 first filter ID reserved for nat in ingress DB
ingressEndFid = 8191                last filter ID reserved for nat in ingress DB (range must be a multiple of 32)
egressStartFid = 0                  first filter ID reserved for nat in egress DB
egressEndFid = 8191                 last filter ID reserved for nat in egress DB (currently range must be same as ingress)

[ Interface ]                       defined for each interface
# Link name drn05
ifn = 0                             interface number
IPAddress = 0x80fc99d1              interface’s IP address
udpStartPort = 30000                first udp port reserved for nat
udpEndPort = 30499                  last udp port reserved for nat
tcpStartPort = 30000                first tcp port reserved for nat
tcpEndPort = 30499                  last tcp port reserved for nat
icmpStartID = 0                     first icmp ID reserved for nat
icmpEndID = 65535                   last icmp ID reserved for nat

[ Board ]                           defined for each board
# cp1, Slot 0
type=cp                             CP or GPE (not currently used)
MACAddress = 00:1E:C9:FE:76:23      board’s MAC address
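
The two constraints stated for the filter ID ranges (size a multiple of 32, egress range equal to ingress range) and the hexadecimal IPAddress format can be checked with a few helpers. This is a sketch under the assumption that the start and end IDs are inclusive and that IPAddress is a host-order 32-bit value; FidRange, validFidRange, validAgingConfig, and dottedQuad are hypothetical names, not natd functions.

```cpp
#include <cstdint>
#include <string>

// Hypothetical view of a [start, end] filter ID range (both inclusive).
struct FidRange {
    std::uint32_t start;
    std::uint32_t end;
};

// The range size must be a multiple of 32.
bool validFidRange(const FidRange& r) {
    return r.start <= r.end && (r.end - r.start + 1) % 32 == 0;
}

// The egress range currently has to match the ingress range.
bool validAgingConfig(const FidRange& ingress, const FidRange& egress) {
    return validFidRange(ingress) && validFidRange(egress) &&
           ingress.start == egress.start && ingress.end == egress.end;
}

// Renders a 32-bit IPAddress value as dotted-quad text.
std::string dottedQuad(std::uint32_t ip) {
    return std::to_string((ip >> 24) & 0xFF) + "." +
           std::to_string((ip >> 16) & 0xFF) + "." +
           std::to_string((ip >> 8) & 0xFF) + "." +
           std::to_string(ip & 0xFF);
}
```

With the sample values above, the range 0 to 8191 holds 8192 filter IDs (256 groups of 32), and 0x80fc99d1 corresponds to 128.252.153.209.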