Puppet 4 Lessons Learned

I’ve been working recently on migrating to Puppet 4. All the modules I maintain have supported it for a little bit but my master and controlrepo were still on Puppet 3. I slowly hacked at this over the past month and a half when time presented itself and I learned a few things. This post is an assortment of lessons learned, more than a tutorial, but hopefully it will help others going through this effort themselves.

At a high level the process consists of:

  • Make sure your code is Puppet 4 compatible via Continuous Integration.
  • Find a module that manages Puppet 4 masters and agents. If your current module works with 4, this step is much easier.
  • Build a new template or base image that runs Puppet 4.
  • Update your controlrepo to work with a Puppet 4 master, preferably with the new puppetserver instead of apache+passenger, again using CI for testing.
  • Deploy the new master and then start adding agents.

The biggest lesson I can give you is to perform these changes as 5 separate steps! I combined the last three into a single step and I paid for it. I know better and my shortcut didn’t turn out so well. Especially, do not take your Puppet 3 master down until your Puppet 4 master build tests out okay! Alright, let’s go through some of the steps.


Tcpdump: When and How?

A tool I rely on heavily for network debugging is tcpdump. This tool naturally comes to mind when I run into issues, but it may not for others. I thought I’d take a moment and describe when I reach for tcpdump and give a quick primer on how to use it.

If you’re using Windows, WinDump/Wireshark are the CLI/GUI equivalents. I’ll stick to tcpdump in this article, but many of the CLI options are the same and the filters are pretty similar if not the same.

When should you use tcpdump?

Whenever you’re troubleshooting an application, you hopefully have some sort of application-level logging to help you figure out what’s going on. Sometimes, you don’t have that – or what does exist provides inadequate detail or appears to be lying to you. You may also not have access to a device that you think is affecting the traffic, and you need to ensure that the traffic flow meets your expectations. As long as your application talks on the network, even locally, tcpdump may be able to help you!

You may have users from the internet who need to reach your application who are not able to, and they’re only receiving a timeout, but other users have no issues. You look in your web server logs and you don’t see any logs for the user complaining. There are log entries for the users who are not complaining. You can use tcpdump to listen on the webserver’s port for the customer’s IP and see if the connection attempts are seen. You can also see the packet contents in cleartext (as opposed to binary format – encrypted content is not decrypted, it’s just more easily visible) if that helps diagnose the issue.

Many applications also rely on local connections, typically on the loopback interface, and may be affected by the local firewall (iptables or the Windows Firewall Service, for example). Using tcpdump, you can see if the packets are immediately rejected, which is likely to be the firewall service, or if the connection completes a three-way handshake before closing. In almost all cases, if a three-way handshake is observed, the application has received the connection.

Given the name tcpdump, it’s worth noting that you can see almost anything on the wire, not just TCP packets. UDP, GRE, even IPX are visible with the right filters.

How do you use tcpdump?

Let’s look at how you use tcpdump. In the examples below, I’m using a Linux VM with one interface, eth0, and the address 10.0.0.8. It has ssh, apache, and postfix services running. Tcpdump requires root access to see the raw packets on the wire, which I will gain with sudo. Be extremely careful who you grant this access to for two reasons. 1) Zombied tcpdump sessions can gobble all the CPU. 2) Since packet contents can be inspected, sensitive information can be seen by anyone with the permission to run tcpdump. This is a security risk, particularly when you must meet PCI-DSS audit requirements. I’ll be using my unprivileged user rnelson0.

Tcpdump by default will try and resolve IP and service names. This can be slow, as it relies on DNS and file lookups, and confusing as most people will search by the IP addresses. We can disable these lookups by adding the n flag to the CLI, adding one instance for IPs and one for services, -nn. We also want to specify the interface, even on a single-NIC node, as it may default to the loopback instead of the ethernet interface, using -i <interface>. This gives us a default argument string of: -nni eth0 or -nni lo, depending on which we are looking for.
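As a minimal sketch of that default argument string, the Python helper below assembles the command line as a list without running anything. The function name base_command is hypothetical and chosen only for illustration; the -nn and -i flags are the ones described above, written separately rather than bundled as -nni.

```python
def base_command(interface):
    """Return a tcpdump argv with name resolution disabled.

    -nn : do not resolve IP addresses or port numbers to names
    -i  : capture on the named interface
    base_command is a hypothetical helper name, not part of tcpdump.
    """
    if not interface:
        raise ValueError("an interface name such as 'eth0' or 'lo' is required")
    return ["sudo", "tcpdump", "-nn", "-i", interface]


# Show the command for the ethernet interface used in this article.
print(" ".join(base_command("eth0")))
```

Passing the interface explicitly every time avoids relying on whichever interface tcpdump would pick by default.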

Next, we need to generate a filter to look at traffic. The tcpdump man page provides a lengthy list of filter components. One of the most common components is src|dst|host <scope>, which filters for packets from, to, or bi-directionally for the specified IP or network. Others are port <portnumber> and <protocol>, like icmp or gre. We can combine individual components with standard logical operators like and, or, and not: filter for non-ssh traffic to/from 10.0.0.200 with host 10.0.0.200 and not port 22.
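Composing filters by hand is where typos creep in, so here is a rough sketch of building one programmatically. The helper name host_filter is hypothetical, and it assumes you only filter on literal IP addresses (tcpdump itself also accepts names and networks); the address check uses Python's standard ipaddress module.

```python
import ipaddress


def host_filter(address, exclude_ports=()):
    """Build a filter such as 'host 10.0.0.200 and not port 22'.

    host_filter is a hypothetical helper. It only accepts literal IP
    addresses, which is stricter than tcpdump's own filter syntax.
    """
    # Raises ValueError for malformed input such as '10.0.0.0.200'.
    ipaddress.ip_address(address)
    parts = ["host " + address]
    for port in exclude_ports:
        parts.append("not port " + str(int(port)))
    return " and ".join(parts)


print(host_filter("10.0.0.200", exclude_ports=[22]))
```

The resulting string can be appended to the tcpdump command line as the filter expression.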

As a “bonus”, when you run tcpdump with a bad filter, it will exit immediately. It doesn’t offer hints on how to fix the error, but it does let you know right away.

We put this together with the full command tcpdump -nni eth0 host 10.0.0.200 and not port 22. If we ssh to our node and just run this, we won’t see anything happen right away, but we’ll eventually see some ARP packets:

[rnelson0@kickstart ~]$ sudo tcpdump -nni eth0 host 10.0.0.200 and not port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
22:05:50.309737 ARP, Request who-has 10.0.0.1 tell 10.0.0.200, length 46
22:05:50.589052 ARP, Request who-has 10.0.0.253 tell 10.0.0.200, length 46
22:05:59.934464 ARP, Request who-has 10.0.0.200 tell 10.0.0.253, length 46
22:06:51.315637 ARP, Request who-has 10.0.0.1 tell 10.0.0.200, length 46
22:06:51.519754 ARP, Request who-has 10.0.0.8 tell 10.0.0.200, length 46
22:06:51.519807 ARP, Reply 10.0.0.8 is-at 00:50:56:ac:f2:f7, length 28

Now if we view a file on the web server, we’ll see a three-way handshake followed by a few PSH packets:

22:17:29.686840 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [S], seq 1113320281, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:17:29.687041 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [S.], seq 741099373, ack 1113320282, win 14600, options [mss 1460,nop,nop,sackOK,nop,wscale 5], length 0
22:17:29.690439 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [.], ack 1, win 256, length 0
22:17:29.690475 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [P.], seq 1:412, ack 1, win 256, length 411
22:17:29.690540 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [.], ack 412, win 490, length 0
22:17:29.693772 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [P.], seq 1:151, ack 412, win 490, length 150
22:17:29.694090 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [F.], seq 151, ack 412, win 490, length 0
22:17:29.696030 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [.], ack 152, win 256, length 0
22:17:29.700858 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [F.], seq 412, ack 152, win 256, length 0
22:17:29.700893 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [.], ack 413, win 490, length 0
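If you want to check a capture for the handshake without reading every line, something like the following Python sketch could help. It assumes the one-line output format shown above, treats only the first three IP packets, and the names LINE, parse, and saw_handshake are my own for illustration.

```python
import re

# Matches lines like: "<time> IP <src>.<port> > <dst>.<port>: Flags [S.], ..."
LINE = re.compile(r"^\S+ IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]")


def parse(line):
    """Return (src, dst, flags) for an IP line, or None for anything else."""
    match = LINE.match(line)
    if match is None:
        return None
    return match.group("src"), match.group("dst"), match.group("flags")


def saw_handshake(lines):
    """True if the first three IP packets are SYN, SYN-ACK, ACK on one connection."""
    packets = [p for p in map(parse, lines) if p is not None]
    if len(packets) < 3:
        return False
    (c1, s1, f1), (s2, c2, f2), (c3, s3, f3) = packets[:3]
    same_connection = (c1, s1) == (c2, s2) == (c3, s3)
    return same_connection and (f1, f2, f3) == ("S", "S.", ".")


SAMPLE = [
    "22:17:29.686840 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [S], seq 1113320281, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0",
    "22:17:29.687041 IP 10.0.0.8.80 > 10.0.0.200.59916: Flags [S.], seq 741099373, ack 1113320282, win 14600, options [mss 1460,nop,nop,sackOK,nop,wscale 5], length 0",
    "22:17:29.690439 IP 10.0.0.200.59916 > 10.0.0.8.80: Flags [.], ack 1, win 256, length 0",
]

print(saw_handshake(SAMPLE))
```

A real capture interleaves many connections, so a fuller version would group packets by endpoint pair first; this sketch only covers the single-connection case shown here.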

For comparison, here’s what HTTPS looks like when HTTPS is not enabled. You see the SYN packet from the client, and the RST packet comes from the OS since there’s no service listening there:

22:18:50.057972 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.058088 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 825112120, win 0, length 0
22:18:50.558200 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.558264 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 1, win 0, length 0
22:18:51.060995 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,nop,sackOK], length 0
22:18:51.061065 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 1, win 0, length 0
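The refused pattern is just as easy to spot mechanically. This is a small sketch, again assuming the output format above, that counts bare SYN packets and any packet carrying the RST flag; count_refusals and FLAGS are hypothetical names.

```python
import re

# Pulls the flag string out of a tcpdump line, e.g. "S", "S.", "R.", "P."
FLAGS = re.compile(r"Flags \[([^\]]+)\]")


def count_refusals(lines):
    """Return (syn_count, rst_count) for the given tcpdump output lines."""
    syns = resets = 0
    for line in lines:
        match = FLAGS.search(line)
        if match is None:
            continue  # not a TCP line, e.g. ARP
        flags = match.group(1)
        if flags == "S":
            syns += 1  # connection attempt with no ACK set
        elif "R" in flags:
            resets += 1  # reset, here sent by the OS for a closed port
    return syns, resets


SAMPLE = [
    "22:18:50.057972 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0",
    "22:18:50.058088 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 825112120, win 0, length 0",
    "22:18:50.558200 IP 10.0.0.200.59917 > 10.0.0.8.443: Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0",
    "22:18:50.558264 IP 10.0.0.8.443 > 10.0.0.200.59917: Flags [R.], seq 0, ack 1, win 0, length 0",
]

print(count_refusals(SAMPLE))
```

Equal SYN and RST counts with no SYN-ACK in between suggest the port is closed or rejected rather than silently dropped.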

Summary

I hope this short tutorial helps you figure out when and how to use tcpdump. If you have specific questions, post them in a comment or ask on twitter and I’ll respond.

A Full Stack What?

You’ve probably heard a lot of talk about the term “Full Stack Engineer”. You may even hear that everyone’s looking for one, so you probably want to be one to help your career. A Full Stack Engineer (hereafter FSE) is someone who doesn’t just know their one area deeply, but knows a bit about the rest of the stack. That depth of knowledge varies from very shallow to deep expertise, with the idea that the FSE knows how the different levels of the stack work together so they can make decisions that benefit the entire stack, rather than a local optimization that may harm the rest of the stack. You don’t want someone making an application decision that blows up the storage stack, or vice versa, so this kind of wide knowledge rather than deep knowledge is definitely helpful.

There’s a huge challenge to becoming an FSE: the sheer number of layers in the stack to learn about these days. There’s so much to learn that it’s not actually feasible for any one person to learn all those layers deeply enough to really know the full stack. There’s absolutely nothing wrong with taking the journey toward Full Stack Engineer, but I think there’s another worthwhile goal out there:

I quipped about a “Full Stack Human,” a little bit of a tongue-in-cheek response to the overuse of the FSE term, but there’s some seriousness behind it. What it really means is that you should try and be a well-rounded person. In a sentence: Be more than a job.

A job is (hopefully) only 40 hours out of each 168 hour week and 2000 hours out of 8766 hours a year, less than 25% of what you do in a year. Sleeping should take up about 30% more – and it really should, we have to work very hard to not have a perpetual sleep deficit. Many of us will spend some of that remaining time trying to advance our work and careers, which is perfectly fine. This still leaves a lot of time, time in which we can find some hobbies and activities to enjoy so we’re more than just a working machine.

For exercise, I really like playing flag football. There’s a very diverse assortment of players out there and it’s far more entertaining than a treadmill or machine. When I feel creative, I enjoy woodworking. It requires deliberation, planning, and care in ways that my day job doesn’t – well, since I like having all of my fingers, anyway. I really like my sci-fi and fantasy novels, but I also make sure I fit some classics like War & Peace in between them. My wife and I don’t do anything truly adventurous, but we have been fortunate to visit a number of countries and enjoy their different cultures.

These activities give you depth and add dimensions to your character. (I realize I’m starting to sound like your parents did when you were filling out college applications, but bear with me a bit longer!) You meet other people and cultures and gain new viewpoints from which to perceive life. For example, in a decade of flag football, I’ve learned so many different ways to inspire teammates – and which ones don’t work! – and how to calm people down so they don’t lose the game.

I’d never get those experiences just by focusing on working my way up the stack at work, and those experiences help me out just as much at work. We talk a lot about encouraging diversity in tech, and in my opinion, it has to start in your personal life. A well-rounded person, a Full Stack Human, has those diverse experiences and can bring that diversity back into tech.

Your hobbies also give you a healthy escape from work. You aren’t just the project you released last week, and you shouldn’t kill yourself over work (figuratively or literally!). Identification and burnout can be a significant problem for everyone. If you don’t think so, you either aren’t there yet, or you’re there and you don’t know it! When you get too wrapped up in work – the deadlines are pressing down on you, politics got heated, you missed a family event because you were working late and didn’t even realize it – you need a safety valve to relieve that pressure and your personal time should help with that. PSA: If you’re struggling with burnout, please reach out to someone. We’re here to help!

Be a Full Stack Human. I guarantee it will be rewarding on its own, and it’s a huge step toward becoming a Full Stack Engineer!

2015 Recap: How did I do?

Just like I did at this time last year, it’s time to take a look at my goals for the previous year and see how I did.

Learn Ruby

I’d like to think I grok Ruby at a more advanced level, now. I’ve written my first gem (and documented the ordeal) and contributed a number of patches to Ruby-ish projects here and there, mostly based around Puppet. I’ve also started writing “throw away” code in Ruby when possible, furthering my transition away from a bash-everywhere mentality. Grade: Pass

Blog more about Security

I started incorporating more security elements into my writing, but I haven’t really done a lot of security-focused writing. I only added one item to the Security category in 2015. I’m sitting on a bunch of drafts about security but am too timid to finish and publish them. Grade: Fail

Home Network

  1. I got my new home network up and running in the late spring, thanks to my partner-in-crime Mike SoRelle. I wrote an article about it as well.
  2. I made some progress here but not in the anticipated direction. I have all of my home network running Linux in Puppet and am working toward the same on the few Windows boxes. There was a lot of turmoil on the VMware side of things (5.5 updates, 6.0, 6.0 updates, changes to VCSA) and it slowed the work there. No IPAM, but I’m not feeling the burn very much because DNS at least is in Puppet.

Grade: Pass. But barely.

Expand PuppetInABox

I’ve learned a lot about software development in the past year. I’ve not only expanded and revamped PuppetInABox (support for Puppet 4 coming soon!), but I’m maintaining a few puppet modules, a puppet-related gem, and am actively participating in VoxPupuli (previously Puppet Community) and contributing features and fixes to Puppet itself. I think I’m making progress here, but still have a ways to go. Grade: Pass.

Propose a PuppetConf Talk

This was originally a goal to propose a VMworld talk, but I changed that as I didn’t have good subject matter for it before the CFP ended. I did submit a CFP to PuppetConf and was accepted! I presented in October and you can catch the video and slides online. I enjoyed the hell out of the conference and I dare say my talk did well, too! Grade: Pass.

VCAP-DCA

I have made zero progress here. It was a busy year! I have until April to get this or renew the VCP and I’m not sure which it will be. Grade: Fail

Read War & Peace

This wasn’t on the list, but it was a personal goal. I’ve read a lot of Barnes & Noble Classics and I love Russian literature (Crime and Punishment in particular!), but at ~1100 pages of translated mid-1800s Russian, War & Peace was intimidating. I started this in the latter part of the year and I underestimated the time required to chew on it. I’m around 450 pages in after a few months. It’s been slow but very rewarding. You’ll be happy to know that in 1810, the Russians had meetings about having a meeting. What’s old is new again! Grade: C

I’d say I had a successful year. I didn’t hit all my goals, but like New Year’s Resolutions, I knew some goals would change and others wouldn’t be as important. The list was more a guide for the year. I will be posting some new goals for 2016 shortly, though, so I can stay grounded this year as well!