A tool I rely on heavily for network debugging is tcpdump. This tool naturally comes to mind when I run into issues, but it may not for others. I thought I’d take a moment and describe when I reach for tcpdump and give a quick primer on how to use it.
If you’re using Windows, WinDump and Wireshark are the CLI and GUI equivalents. I’ll stick to tcpdump in this article, but many of the CLI options are the same and the filters are very similar, if not identical.
When should you use tcpdump?
Whenever you’re troubleshooting an application, you hopefully have some sort of application-level logging to help you figure out what’s going on. Sometimes, you don’t have that – or what does exist provides inadequate detail or appears to be lying to you. You may also not have access to a device that you think is affecting the traffic, and you need to ensure that the traffic flow meets your expectations. As long as your application talks on the network, even locally, tcpdump may be able to help you!
Suppose some users on the internet cannot reach your application and only receive a timeout, while other users have no issues. You look in your web server logs and find entries for the users who are not complaining, but none for the user who is. You can use tcpdump to listen on the web server’s port for the customer’s IP and see whether the connection attempts arrive at all. You can also view the packet contents as text rather than raw binary, if that helps diagnose the issue. Encrypted content is not decrypted; it is just easier to look at.
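As a rough sketch of that scenario, the snippet below watches the web server's port for a single client. The address 203.0.113.25, port 80, interface eth0, and the names client_filter and watch_client are assumptions made for illustration; the -nn and -i flags are explained later in this article, -A prints payloads as ASCII, and -c stops after the given number of packets.

```shell
# Hypothetical customer address (from a documentation range) and web port.
client_ip="203.0.113.25"
web_port=80

# Match traffic in either direction between that client and the web port.
client_filter="host ${client_ip} and tcp port ${web_port}"

# Needs root. -A shows packet payloads as ASCII; -c 50 stops after 50
# matching packets instead of running until interrupted.
watch_client() {
    sudo tcpdump -nn -i eth0 -A -c 50 "${client_filter}"
}
```

If watch_client prints nothing while the customer retries, the packets are probably not reaching the server at all, which points at something upstream rather than at the application.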
Many applications also rely on local connections, typically on the loopback interface, and may be affected by the local firewall (iptables or the Windows Firewall Service, for example). Using tcpdump, you can see whether the packets are immediately rejected, which is likely to be the firewall service, or whether the connection completes a three-way handshake before closing. In almost all cases, if a three-way handshake is observed, the application has received the connection.
Despite the name tcpdump, it’s worth noting that you can see almost anything on the wire, not just TCP packets. UDP, GRE, even IPX are visible with the right filters.
How do you use tcpdump?
Let’s look at how you use tcpdump. In the examples below, I’m using a Linux VM with one interface, eth0, and the address 10.0.0.8. It has ssh, apache, and postfix services running. Tcpdump requires root access to see the raw packets on the wire, which I will gain with sudo. Be extremely careful who you grant this access to, for two reasons: 1) zombied tcpdump sessions can gobble all the CPU, and 2) since packet contents can be inspected, anyone with permission to run tcpdump can see sensitive information. That is a security risk, and a particular concern when you must meet PCI-DSS audit requirements. I’ll be using my unprivileged user, rnelson0.
Tcpdump by default will try to resolve IP addresses and service names. This can be slow, as it relies on DNS and file lookups, and confusing, as most people search by IP address. We can disable these lookups with the n flag, given once for IPs and once for services: -nn. We also want to specify the interface with -i <interface>, even on a single-NIC node, as tcpdump may otherwise default to the loopback instead of the ethernet interface. This gives us a default argument string of -nni eth0 or -nni lo, depending on which interface we are looking at.
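One way to avoid retyping those defaults is a small wrapper. This is only a sketch: the names iface, base_args, and dump are made up for illustration, and the -c 10 packet limit is my own addition so that a forgotten session ends once ten matching packets have been seen.

```shell
# Interface used in this article's examples; change to "lo" for loopback.
iface="eth0"

# -nn disables name lookups for addresses and ports, -i selects the interface.
base_args="-nn -i ${iface}"

# Needs root. Everything passed to dump is treated as the filter expression.
# base_args is intentionally unquoted so it splits into separate arguments.
dump() {
    sudo tcpdump ${base_args} -c 10 "$@"
}

# Example (commented out because it needs root):
# dump port 80
```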
Next, we need to build a filter to select the traffic we care about. The tcpdump man page provides a lengthy list of filter components. One of the most common is src|dst|host <scope>, which matches packets from, to, or in either direction for the specified IP or network. Others are port <portnumber> and <protocol>, like icmp or gre. We can combine individual components with the standard logical operators and, or, and not. For example, to filter for non-ssh traffic to or from 10.0.0.200, use host 10.0.0.200 and not port 22.
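To make the filter components concrete, here are a few filters written as shell strings. The variable names and the host_except_port helper are hypothetical, and the net keyword is an addition beyond the components named above; treat this as a sketch and check the man page for anything more involved.

```shell
# Filters are plain strings, so they can be kept in variables and reused.
src_only="src 10.0.0.200"                # packets sent by 10.0.0.200
dst_web="dst 10.0.0.8 and tcp port 80"   # packets to the web server's port 80
subnet="net 10.0.0.0/24"                 # anything to or from the subnet
ping_only="icmp"                         # ICMP traffic only

# Hypothetical helper: traffic to or from a host, excluding one port.
host_except_port() {
    printf 'host %s and not port %s' "$1" "$2"
}

# Example (commented out because it needs root):
# sudo tcpdump -nni eth0 "$(host_except_port 10.0.0.200 22)"
```

When a filter groups terms with parentheses, quote the whole expression so the shell does not try to interpret them.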
As a “bonus”, when you run tcpdump with a bad filter, it will exit immediately. It doesn’t offer hints on how to fix the error, but it does let you know right away.
We put this together with the full command tcpdump -nni eth0 host 10.0.0.200 and not port 22. If we ssh to our node and just run this, we won’t see anything happen right away, but we’ll eventually see some ARP packets:
[rnelson0@kickstart ~]$ sudo tcpdump -nni eth0 host 10.0.0.200 and not port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
22:05:50.309737 ARP, Request who-has 10.0.0.1 tell 10.0.0.200, length 46
22:05:50.589052 ARP, Request who-has 10.0.0.253 tell 10.0.0.200, length 46
22:05:59.934464 ARP, Request who-has 10.0.0.200 tell 10.0.0.253, length 46
22:06:51.315637 ARP, Request who-has 10.0.0.1 tell 10.0.0.200, length 46
22:06:51.519754 ARP, Request who-has 10.0.0.8 tell 10.0.0.200, length 46
22:06:51.519807 ARP, Reply 10.0.0.8 is-at 00:50:56:ac:f2:f7, length 28
Now if we view a file on the web server, we’ll see a three-way handshake, a few PSH packets carrying the request and response, and the FIN exchange that closes the connection:
22:17:29.686840 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [S], seq 1113320281, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:17:29.687041 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [S.], seq 741099373, ack 1113320282, win 14600, options [mss 1460,nop,nop,sackOK,nop,wscale 5], length 0
22:17:29.690439 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [.], ack 1, win 256, length 0
22:17:29.690475 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [P.], seq 1:412, ack 1, win 256, length 411
22:17:29.690540 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [.], ack 412, win 490, length 0
22:17:29.693772 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [P.], seq 1:151, ack 412, win 490, length 150
22:17:29.694090 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [F.], seq 151, ack 412, win 490, length 0
22:17:29.696030 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [.], ack 152, win 256, length 0
22:17:29.700858 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [F.], seq 412, ack 152, win 256, length 0
22:17:29.700893 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [.], ack 413, win 490, length 0
For comparison, here’s what an HTTPS attempt looks like when HTTPS is not enabled on the server. You see the SYN packet from the client, and the OS answers with an RST packet since there’s no service listening on port 443. The client retries twice and gets the same answer:
22:18:50.057972 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.058088 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 825112120, win 0, length 0
22:18:50.558200 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.558264 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 1, win 0, length 0
22:18:51.060995 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,nop,sackOK], length 0
22:18:51.061065 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 1, win 0, length 0
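When a capture is long, it can help to tally the flags instead of reading every line. The function below is a rough sketch, not a tcpdump feature: summarize_flags is a hypothetical name, and it assumes the default one-line text format shown above, where the seventh whitespace-separated field holds the flags. It counts bare SYNs, SYN-ACKs, and resets.

```shell
# Read tcpdump text output on stdin and count connection-related flags.
summarize_flags() {
    awk '
        # Only TCP lines have the word "Flags" in the sixth field;
        # ARP lines and the tcpdump banner are skipped.
        $6 == "Flags" {
            if ($7 == "[S],")        syn++      # client SYN
            else if ($7 == "[S.],")  synack++   # server SYN-ACK
            else if ($7 ~ /^\[R/)    rst++      # any reset
        }
        END {
            printf "syn=%d synack=%d rst=%d\n", syn, synack, rst
        }
    '
}

# Example (commented out because it needs root):
# sudo tcpdump -nni eth0 -c 20 host 10.0.0.200 and port 443 | summarize_flags
```

SYNs answered by SYN-ACKs suggest the service accepted the connections, while SYNs answered only by resets match the refused HTTPS attempt above. SYNs with no answer at all would point toward something silently dropping the traffic.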
I hope this short tutorial helps you figure out when and how to use tcpdump. If you have specific questions, post them in a comment or ask on Twitter and I’ll respond.