What is a backdoor?

Last month, a significant finding in Fortinet devices was discovered and published. When I say significant, I mean it’s huge: the Multiple Products SSH Undocumented Login Vulnerability. In other words, there’s a username/password combination that works on all devices running the affected firmware versions. If you are still running an affected version, you NEED to upgrade now! This is bad in so many ways, especially following similar issues with Juniper and everything we’ve seen from Snowden’s data dumps. Fortinet responded by saying ‘This was not a “backdoor” vulnerability issue but rather a management authentication issue.’

Is that right? What is a “backdoor” and what is “management authentication”? Is there an actual difference between the two, or is it just a vendor trying to save their butt? I got into a discussion about that on Twitter:


Ethan challenged me to think about the terminology and I think I’ve come around a bit. Here’s what I now believe the two terms mean.

  • Management Authentication: A login available for tech support. The user/password should vary between devices, maybe predicated on a serial number or some other attribute. A key attribute is that the device owner has control of whether or not the login is enabled.
  • Backdoor: Any sort of authentication mechanism to a system in which the owner has no control over whether or not it’s enabled.

I would originally have said that a vendor purposefully providing a backdoor, even if it were varied between devices, was a management authentication system. But that’s legacy speaking. I remember calling Novell for support with NetWare when we were locked out and having a recovery path. In 1999, that was somewhat acceptable. Users weren’t as savvy, hacks weren’t as automated, and systems weren’t as connected. Regardless, it was still a backdoor; it was just more convenient and less likely to backfire.

I’ll add a third category using some more modern terminology:

  • Advanced Persistent Backdoor: Similar to a backdoor, but purposefully malicious. Attempts to deactivate or mitigate the backdoor (e.g., disabling SSH) will not be honored; the backdoor will ignore the settings or return in some other way.

I’d like to thank Ethan for the assistance in challenging my preconceptions and evolving my understanding.

Root Cause Analysis: It’s Still Valid

You’ve probably heard it before: Root Cause Analysis (RCA) doesn’t exist, there’s always something under the root cause. Or, there’s no root cause, only contributing factors. This isn’t exactly untrue, of course. Rarely in our entire life will we find some cause and effect so simple that we can reduce a problematic effect to a single cause. Such arguments against RCA may be grounded in truth but smooth over the subtleties and complexities of the actual process of analysis. They also focus on the singular, though nothing in the phrase “Root Cause Analysis” actually implies the singular. Let’s take a look at how RCA works and analyze it for ourselves.

Root Cause Analysis is the analysis of the underlying causes related to an outage. We should emphasize that “causes” is plural. The primary goal is to differentiate the symptoms from the causes. This is a transformative and iterative process. You start with a symptom, such as the common “the internet is down!” In a series of analytical steps, you narrow it down as many times as needed. That progression may look like:

  • “DNS resolutions failed”
  • “DNS server bind72 failed to restart after the configuration was updated”
  • “A DNS configuration was changed but not verified and it made its way into production”
  • “Some nodes had two resolvers, one of which was bind72 and the other was the name of a decommissioned DNS node.”

Each iteration gets us closer to a root cause. We may identify multiple root causes: in this case, lack of config validation and bad settings on some nodes. Not only are these causes root causes, they are also actionable. Validation can be added to DNS configuration changes. Bad settings can be updated. Perhaps there’s even a cause underneath (WHY the nodes had bad settings), because RCA is an iterative process. We can also extrapolate upward to imagine what other problems could be prevented. DNS configurations surely aren’t the only configurations that need to be validated.

Multiple causes and findings don’t invalidate Root Cause Analysis; they only strengthen the case for it. If it makes it easier to share the concept, we can even call it Root Causes Analysis, to help others understand that we’re not looking for a singular cause. Regardless of what we call it, I believe it is absolutely vital that we continue such analysis, and that we don’t throw away the practice because some people have focused on the singular. Be an advocate of proper RCA, of iterative analytical processes, and of identifying and addressing the multiple causes at hand.

Puppet 4 Lessons Learned

I’ve been working recently on migrating to Puppet 4. All the modules I maintain have supported it for a little bit but my master and controlrepo were still on Puppet 3. I slowly hacked at this over the past month and a half when time presented itself and I learned a few things. This post is an assortment of lessons learned, more than a tutorial, but hopefully it will help others going through this effort themselves.

At a high level the process consists of:

  • Make sure your code is Puppet 4 compatible via Continuous Integration.
  • Find a module that manages Puppet 4 masters and agents. If your current module works with 4, this step is much easier.
  • Build a new template or base image that runs Puppet 4.
  • Update your controlrepo to work with a Puppet 4 master, preferably with the new puppetserver instead of apache+passenger, again using CI for testing.
  • Deploy the new master and then start adding agents.

The biggest lesson I can give you is to perform these changes as 5 separate steps! I combined the last three into a single step and I paid for it. I know better and my shortcut didn’t turn out so well. Especially, do not take your Puppet 3 master down until your Puppet 4 master build tests out okay! Alright, let’s go through some of the steps.

Puppet 4 Compatibility

Puppet Labs has a good guide on Updating 3.x Manifests for Puppet 4.x. Add some Travis CI to your controlrepo first (and make sure you get the current code passing before adding anything else!), then start updating your manifests. You’ll also need to make sure all the modules you use support Puppet 4. Hopefully this is as easy as updating to the latest version, but you may need to submit upstream PRs or use your own fork if the author is running a bit behind. This step may take a while, but is fairly easy. CI feedback is your guide here; don’t move to the next step until everything’s green.
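As a rough sketch of the kind of check CI can run, the shell snippet below finds every manifest in a repo and syntax-checks it with puppet parser validate. The function names are made up for illustration, and the snippet assumes the puppet CLI is on the PATH. On a Puppet 3 install you would probably also want to add --parser future so the manifests are checked against the newer parser. It only catches syntax errors, so treat it as a complement to your rspec tests, not a replacement.

```shell
# List every Puppet manifest (*.pp) under a directory (default: current dir).
list_manifests() {
  find "${1:-.}" -type f -name '*.pp' | sort
}

# Syntax-check every manifest; find returns non-zero if any check fails.
validate_manifests() {
  find "${1:-.}" -type f -name '*.pp' -exec puppet parser validate {} +
}

# Only attempt validation when the puppet CLI is actually installed.
if command -v puppet >/dev/null 2>&1; then
  validate_manifests .
else
  echo "puppet not found; manifests that would be checked:"
  list_manifests .
fi
```

Run it from the root of the controlrepo; a non-zero exit status from validate_manifests is what you would use to fail the build.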

Find a Puppet 4 Module

I’ve been using stephenrjohnson/puppet for a while, but v4 compatibility is still just an open issue for now. After looking at a lot of modules, I settled on jlambert121/puppet. It works fairly well with puppetlabs/puppetdb (PR pending) with the right settings. This may work for you, it may not; you need to do your own research to ensure the module you choose meets your needs.

If you end up using jlambert121/puppet, here are some settings I found helpful to apply via Hiera (the firewall parameter means you can remove any firewall rule you have for port 8140 from your manifests). Adjust as necessary:

# puppet_master.yaml
puppet::server: true
puppet::server_version: 'latest'
  - 'puppet'
puppet::puppetdb_server: 'puppet.example.com'
puppet::puppetdb: true
puppet::manage_puppetdb: false
puppet::manage_hiera: false
puppet::firewall: true
puppet::runmode: service

I also needed to modify my puppet master profile and rspec tests.

# spec/classes/puppet_master_spec.rb
  # after let (:facts) and before the first context
  let(:pre_condition) {
    "package{'puppetdb': ensure => present, }"
  }

Check your role/profile metadata.json file. If you are listing dependencies here (I was, but have since removed them because why would I?) be sure to replace your old module name with the new one.

Though you’ve found the new module, don’t swap it out in production, use a feature branch. We still need some changes that will hit production. Make sure you run some CI against this feature branch so you can iron out issues in your module usage before proceeding.

Build a new Template

I documented one way to build a new template here. If you use kickstart, you can also now deploy this from a module very easily, too, with danzilio/kickstart and puppet/community_kickstarts. Add them to your Puppetfile and create a profile that looks like this:

class profile::kickstart {
  include ::apache

  #::kickstart::ks_file{'el6-dhcp.ks': }
  #  vmwaretools_location => '',

  $el7_packages = [


  firewall { '100 HTTP/S inbound':
    dport  => [80, 443],
    proto  => tcp,
    action => accept,
  }
}

Create a kickstart server that receives this profile and you can then create a new template that includes Puppet 4.x from Puppet Collection 1. The default partition layout assumes 100GB and cpu/mem is up to you. It’s very minimal, but you can add to $el7_packages to get what you need, or create your own kickstart template if you need something completely different.

Update your Controlrepo

You’ll need to grab your new puppet module and the settings for the module, which was covered above. I also found some other issues I needed to resolve.

Hiera Issues

HI-494: Sometime between puppet3/hiera1 and puppet4/hiera3, the old trick of using %%{}{realvar} in hiera data to generate %{realvar} and prevent interpolation of realvar broke. You can replace that with %%{::}{realvar}.

DOCUMENT-491: I also saw that automatic parameter lookup didn’t like %{environment} in the datadir anymore and required %{::environment}.

To address this, make these changes in your bootstrap hiera.yaml and your puppet master’s hiera data:

# hiera.yaml
:backends:
  - yaml

:logger: console

:hierarchy:
  - "clientcert/%{clientcert}"
  - "puppet_role/%{puppet_role}"
  - global

:yaml:
  :datadir: /etc/puppetlabs/code/environments/%{::environment}/hiera

# puppet_master.yaml
  - 'clientcert/%%{::}{clientcert}'
  - 'puppet_role/%%{::}{puppet_role}'
  - 'global'
hiera::datadir: '/etc/puppetlabs/code/environments/%%{::}{::environment}/hiera'

You can do this work on your puppet 4 feature branch.
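If you have a lot of hiera data, a small filter can do the escape rewrite for you. This is only a sketch: fix_hiera_escape is a hypothetical helper name, the hieradata/ path in the comment is an assumption about your layout, and you should review the resulting diff before committing it.

```shell
# Rewrite the old escape "%%{}{var}" to the newer "%%{::}{var}" form.
# Reads hiera data on stdin and writes the converted data to stdout.
fix_hiera_escape() {
  sed 's/%%{}{/%%{::}{/g'
}

# Example: one hierarchy entry written in the old style
printf '%s\n' "- 'clientcert/%%{}{clientcert}'" | fix_hiera_escape
# prints: - 'clientcert/%%{::}{clientcert}'

# To list files that still use the old form (path is an assumption):
#   grep -rnF '%%{}{' hieradata/
```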

mcollective and R10k

There are a few changes that needed to be made with respect to r10k. In your bootstrap file (r10k_installation.pp), bring the gem up to the latest version and change the basedir, as Puppet 4 has a different default path.

class { 'r10k':
  version => '2.1.1',
  sources => {
    'puppet' => {
      'remote'  => 'git@github.com:username/controlrepo.git',
      'basedir' => $::settings::environmentpath,
      'prefix'  => false,
    },
  },
  manage_modulepath => false,
}

Next, we need to find some new certificates to use with the webhook! There’s a request to have zack/r10k generate certificates itself, but until then, we can use some from PuppetDB. We just need to point to the new location in our module parameters, and add a relationship in our manifest to ensure the webhook doesn’t start until after the certs are there.

# puppet_master.pp
class profile::puppet_master {
  # After the r10k includes
  Package['puppetdb'] -> Service['webhook']
  # Everything else
}

# puppet.yaml
r10k::webhook::config::public_key_path: '/etc/puppetlabs/puppetdb/ssl/public.pem'
r10k::webhook::config::private_key_path: '/etc/puppetlabs/puppetdb/ssl/private.pem'

You can do this work on your puppet 4 feature branch.

Version Pinning Refresh

Review your Puppetfile and bring all the modules you can up to the latest. You may have done this already with some older versions that were not Puppet 4 compatible, but you should do this with the rest as well. Only keep an older version if it’s actually necessary. While you don’t absolutely have to do this, now is a convenient time, before you get to the new master. Again, develop on a feature branch and use tests to ensure everything’s okay before merging it into production. There are many tools to help bring your Puppetfile up to date.

If you do this, perform the work in a separate feature branch from your puppet 4 feature branch and merge it prior to the next step.

Deploy the new Master

Now you’re ready to take the feature branch where you started to work on the new puppet module and rebase it so you get the updated Puppetfile. This branch should have your new puppet module, r10k, and hiera changes ready to go. Your CI should tell you if you need other tweaks.

I then came up with a simple Bootstrap.md that I added to my controlrepo. Use your own controlrepo URI and this should work as a good framework for most people. During tests, checkout your feature branch and use it as the environment arg to puppet agent -t. Once everything is good, you can merge it and use no environment argument if you need to rebuild the master.

# Run ssh-keygen and attach the new key to your GitHub account

systemctl stop firewalld
mkdir /root/bootstrap
puppet module install zack/r10k --modulepath=/root/bootstrap
git clone git@github.com:username/controlrepo.git
cd controlrepo
git checkout <branch> # optional
puppet apply r10k_installation.pp --modulepath=/root/bootstrap
rm -f /etc/hiera.yaml /etc/puppetlabs/code/hiera.yaml
cp hiera.yaml /etc
cp hiera.yaml /etc/puppetlabs/code
r10k deploy environment -pv
yum install -y puppetserver
systemctl start puppetserver
puppet agent -t --environment=<branch>

It should be that simple! I would perform a snapshot before running puppet at the end, so if you need to revert for some reason, you’re a little closer to the end.

Next Steps

After building the master, the next step is to upgrade all your nodes. That’s a little more complicated and depends greatly on your environment. If you use your existing nodes, you may have to clean up some SSL certificates to talk to the new master. If you are using an immutable infrastructure model, or just rebuilding the whole lab at once, build new nodes using the updated template and they should be able to talk to the master once you sign the certs on the master, or enable autosigning.


As I said at the top, this isn’t a tutorial by any means, just a collection of lessons learned. I hope it helps you in some way! I’ve also created a Puppet 4 Lessons Learned gist. If you have comments or see room for improvement, feel free to drop a PR or comment there, and of course here or on twitter as well. Thanks!

Tcpdump: When and How?

A tool I rely on heavily for network debugging is tcpdump. This tool naturally comes to mind when I run into issues, but it may not for others. I thought I’d take a moment and describe when I reach for tcpdump and give a quick primer on how to use it.

If you’re using Windows, WinDump and Wireshark are the CLI and GUI equivalents. I’ll stick to tcpdump in this article, but many of the CLI options are the same and the filters are pretty similar, if not identical.

When should you use tcpdump?

Whenever you’re troubleshooting an application, you hopefully have some sort of application-level logging to help you figure out what’s going on. Sometimes, you don’t have that – or what does exist provides inadequate detail or appears to be lying to you. You may also not have access to a device that you think is affecting the traffic, and you need to ensure that the traffic flow meets your expectations. As long as your application talks on the network, even locally, tcpdump may be able to help you!

You may have users from the internet who need to reach your application who are not able to, and they’re only receiving a timeout, but other users have no issues. You look in your web server logs and you don’t see any logs for the user complaining. There are log entries for the users who are not complaining. You can use tcpdump to listen on the webserver’s port for the customer’s IP and see if the connection attempts are seen. You can also see the packet contents in cleartext (as opposed to binary format – encrypted content is not decrypted, it’s just more easily visible) if that helps diagnose the issue.
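Here’s a minimal sketch of what that capture could look like. The helper name client_capture_cmd, the interface eth0, the 50-packet limit, and the documentation address 203.0.113.25 are all assumptions for illustration. The -nni flags are explained in the how-to section below; -A prints each packet’s payload as ASCII and -c stops after the given number of packets. The function only prints the command, since the capture itself needs root.

```shell
# Print a capture command for one client IP on the web server's port.
# $1 = client IP address, $2 = TCP port the web server listens on
client_capture_cmd() {
  printf 'tcpdump -nni eth0 -A -c 50 host %s and port %s\n' "$1" "$2"
}

client_capture_cmd 203.0.113.25 80
# prints: tcpdump -nni eth0 -A -c 50 host 203.0.113.25 and port 80
```

If the customer retries and nothing shows up in the capture, the traffic is being dropped before it reaches the server, which is why nothing shows up in the web server logs either.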

Many applications also rely on local connections, typically on the loopback interface, and may be affected by the local firewall (iptables or the Windows Firewall Service, for example). Using tcpdump, you can see if the packets are immediately rejected, which is likely to be the firewall service, or if the connection completes a three-way handshake before closing. In almost all cases, if a three-way handshake is observed, the application has received the connection.

Given the name tcpdump, it’s worth noting that you can see almost anything on the wire, not just TCP packets. UDP, GRE, even IPX are visible with the right filters.

How do you use tcpdump?

Let’s look at how you use tcpdump. In the examples below, I’m using a Linux VM with one interface, eth0, and the address It has ssh, apache, and postfix services running. Tcpdump requires root access to see the raw packets on the wire, which I will gain with sudo from my unprivileged user, rnelson0. Be extremely careful who you grant this access to, for two reasons: 1) Zombied tcpdump sessions can gobble all the CPU. 2) Since packet contents can be inspected, sensitive information can be seen by anyone with the permission to run tcpdump. This is a security risk, especially when you must meet PCI-DSS audit requirements.

Tcpdump by default will try and resolve IP and service names. This can be slow, as it relies on DNS and file lookups, and confusing as most people will search by the IP addresses. We can disable these lookups by adding the n flag to the CLI, adding one instance for IPs and one for services, -nn. We also want to specify the interface, even on a single-NIC node, as it may default to the loopback instead of the ethernet interface, using -i <interface>. This gives us a default argument string of: -nni eth0 or -nni lo, depending on which we are looking for.

Next, we need to generate a filter to look at traffic. The tcpdump man page provides a lengthy list of filter components. One of the most common components is src|dst|host <scope>, which filters for packets from, to, or bi-directionally for the specified IP or network. Others are port <portnumber> and <protocol>, like icmp or gre. We can combine individual components with standard logical operators like and, or, and not: filter for non-ssh traffic to/from with host and not port 22.
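To make the composition concrete, here’s a small sketch that builds a filter from its components and then the full command line. The helper names host_not_port and tcpdump_cmd are hypothetical, and 192.0.2.10 is a documentation address standing in for your own node’s IP.

```shell
# Build a filter that matches one host but excludes one port.
# $1 = host IP address, $2 = port to exclude
host_not_port() {
  printf 'host %s and not port %s\n' "$1" "$2"
}

# Assemble the full command line: default arguments, interface, then filter.
# $1 = interface name, remaining arguments = filter expression
tcpdump_cmd() {
  iface="$1"
  shift
  printf 'tcpdump -nni %s %s\n' "$iface" "$*"
}

filter="$(host_not_port 192.0.2.10 22)"
tcpdump_cmd eth0 "$filter"
# prints: tcpdump -nni eth0 host 192.0.2.10 and not port 22
```

If you group components with parentheses, quote the whole filter so the shell doesn’t try to interpret them.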

As a “bonus”, when you run tcpdump with a bad filter, it will exit immediately. It doesn’t offer hints on how to fix the error, but it does let you know right away.

We put this together with the full command tcpdump -nni eth0 host and not port 22. If we ssh to our node and just run this, we won’t see anything happen right away, but we’ll eventually see some ARP packets:

[rnelson0@kickstart ~]$ sudo tcpdump -nni eth0 host and not port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
22:05:50.309737 ARP, Request who-has tell, length 46
22:05:50.589052 ARP, Request who-has tell, length 46
22:05:59.934464 ARP, Request who-has tell, length 46
22:06:51.315637 ARP, Request who-has tell, length 46
22:06:51.519754 ARP, Request who-has tell, length 46
22:06:51.519807 ARP, Reply is-at 00:50:56:ac:f2:f7, length 28

Now if we view a file on the web server, we’ll see a three way handshake followed by a few PSH packets:

22:17:29.686840 IP > Flags [S], seq 1113320281, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:17:29.687041 IP > Flags [S.], seq 741099373, ack 1113320282, win 14600, options [mss 1460,nop,nop,sackOK,nop,wscale 5], length 0
22:17:29.690439 IP > Flags [.], ack 1, win 256, length 0
22:17:29.690475 IP > Flags [P.], seq 1:412, ack 1, win 256, length 411
22:17:29.690540 IP > Flags [.], ack 412, win 490, length 0
22:17:29.693772 IP > Flags [P.], seq 1:151, ack 412, win 490, length 150
22:17:29.694090 IP > Flags [F.], seq 151, ack 412, win 490, length 0
22:17:29.696030 IP > Flags [.], ack 152, win 256, length 0
22:17:29.700858 IP > Flags [F.], seq 412, ack 152, win 256, length 0
22:17:29.700893 IP > Flags [.], ack 413, win 490, length 0

For comparison, here’s what an HTTPS request looks like when HTTPS is not enabled. You see the SYN packet from the client, and the RST packet comes from the OS since there’s no service listening there:

22:18:50.057972 IP > Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.058088 IP > Flags [R.], seq 0, ack 825112120, win 0, length 0
22:18:50.558200 IP > Flags [S], seq 825112119, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
22:18:50.558264 IP > Flags [R.], seq 0, ack 1, win 0, length 0
22:18:51.060995 IP > Flags [S], seq 825112119, win 8192, options [mss 1460,nop,nop,sackOK], length 0
22:18:51.061065 IP > Flags [R.], seq 0, ack 1, win 0, length 0
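When a capture gets long, it can help to boil each line down to its flags. The sketch below labels tcpdump output lines so that a SYN answered by an RST (nothing listening, or an active reject) stands out from a SYN answered by a SYN-ACK (a service picked up). classify_flags is a hypothetical helper, and the sample lines use made-up documentation addresses, not the addresses from the captures above.

```shell
# Label each tcpdump output line on stdin by its TCP flags.
classify_flags() {
  awk '
    /Flags \[S\],/    { print "SYN";     next }  # connection attempt
    /Flags \[S\.\],/  { print "SYN-ACK"; next }  # a service answered
    /Flags \[R\.?\],/ { print "RST";     next }  # connection refused or reset
                      { print "other" }
  '
}

# Sample lines with made-up documentation addresses (not a real capture)
classify_flags <<'EOF'
22:18:50.057972 IP 198.51.100.7.51234 > 192.0.2.10.443: Flags [S], seq 825112119, win 8192, length 0
22:18:50.058088 IP 192.0.2.10.443 > 198.51.100.7.51234: Flags [R.], seq 0, ack 825112120, win 0, length 0
EOF
```

For the two sample lines this prints SYN and then RST, the same pattern as the refused HTTPS attempt.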


I hope this short tutorial helps you figure out when and how to use tcpdump. If you have specific questions, post them in a comment or ask on twitter and I’ll respond.

A Full Stack What?

You’ve probably heard a lot of talk about the term “Full Stack Engineer”. You may even hear that everyone’s looking for one, so you probably want to be one to help your career. A Full Stack Engineer (hereafter FSE) is someone who doesn’t just know their one area deeply, but knows a bit about the rest of the stack. That depth of knowledge varies from very shallow to deep expertise, with the idea that the FSE knows how the different levels of the stack work together so they can make decisions that benefit the entire stack, rather than a local optimization that may harm the rest of the stack. You don’t want someone making an application decision that blows up the storage stack, or vice versa, so this kind of wide knowledge rather than deep knowledge is definitely helpful.

There are huge challenges to becoming an FSE, one of which is the sheer number of layers in the stack to learn about these days. There’s so much to learn that it’s not actually feasible for any one person to learn all those layers deeply enough to really know the full stack. There’s absolutely nothing wrong with taking the journey toward Full Stack Engineer, but I think there’s another worthwhile goal out there:

“Full Stack Human” is a little bit of a tongue-in-cheek response to the overuse of the FSE term, but there’s some seriousness behind it. What it really means is that you should try to be a well-rounded person. In a sentence: Be more than a job.

A job is (hopefully) only 40 hours out of each 168-hour week and 2000 hours out of 8766 hours a year, less than 25% of what you do in a year. Sleeping should take up about 30% more, and it really should; we have to work very hard to not have a perpetual sleep deficit. Many of us will spend some of that remaining time trying to advance our work and careers, which is perfectly fine. This still leaves a lot of time, time in which we can find some hobbies and activities to enjoy so we’re more than just a working machine.

For exercise, I really like playing flag football. There’s a very diverse assortment of players out there and it’s far more entertaining than a treadmill or machine. When I feel creative, I enjoy woodworking. It requires deliberation, planning, and care in ways that my day job doesn’t – well, since I like having all of my fingers, anyway. I really like my sci-fi and fantasy novels, but I also make sure I fit some classics like War & Peace in between them. My wife and I don’t do anything truly adventurous, but we have been fortunate to visit a number of countries and enjoy their different cultures.

These activities give you depth and add dimensions to your character. (I realize I’m starting to sound like your parents did when you were filling out college applications, but bear with me a bit longer!) You meet other people and cultures and gain new viewpoints from which to perceive life. For example, in a decade of flag football, I’ve learned so many different ways to inspire teammates (and which ones don’t work!) and how to calm people down so they don’t lose the game.

I’d never get those experiences just by focusing on working my way up the stack at work, and those experiences help me out just as much at work. We talk a lot about encouraging diversity in tech, and in my opinion, it has to start in your personal life. A well-rounded person, a Full Stack Human, has those diverse experiences and can bring that diversity back into tech.

Your hobbies also give you a healthy escape from work. You aren’t just the project you released last week, and you shouldn’t kill yourself over work (figuratively or literally!). Identification and burnout can be a significant problem for everyone. If you don’t think so, you either aren’t there yet, or you’re there and you don’t know it! When you get too wrapped up in work – the deadlines are pressing down on you, politics got heated, you missed a family event because you were working late and didn’t even realize it – you need a safety valve to relieve that pressure and your personal time should help with that. PSA: If you’re struggling with burnout, please reach out to someone. We’re here to help!

Be a Full Stack Human. I guarantee it will be rewarding on its own, and it’s a huge step toward becoming a Full Stack Engineer!

2015 Recap: How did I do?

Just like I did at this time last year, it’s time to take a look at my goals for the previous year and see how I did.

Learn Ruby

I’d like to think I grok ruby at a more advanced level, now. I’ve written my first gem (and documented the ordeal) and contributed a number of patches to Ruby-ish projects here and there, mostly based around Puppet. I’ve also started writing “throw away” code in Ruby when possible, furthering my transition away from a bash-everywhere mentality. Grade: Pass

Blog more about Security

I started incorporating more security elements into my writing, but I haven’t really done a lot of security-focused writing. I only added one item to the Security category in 2015. I’m sitting on a bunch of drafts about security but am too timid to finish and publish them. Grade: Fail

Home Network

  1. I got my new home network up and running in the late spring, thanks to my partner-in-crime Mike SoRelle. I wrote an article about it as well.
  2. I made some progress here but not in the anticipated direction. I have all of the Linux machines on my home network in Puppet and am working toward the same on the few Windows boxes. There was a lot of turmoil on the VMware side of things (5.5 updates, 6.0, 6.0 updates, changes to VCSA) and it slowed the work there. No IPAM, but I’m not feeling the burn very much because DNS at least is in Puppet.

Grade: Pass. But barely.

Expand PuppetInABox

I’ve learned a lot about software development in the past year. I’ve not only expanded and revamped PuppetInABox (support for Puppet 4 coming soon!), but I’m maintaining a few puppet modules, a puppet-related gem, and am actively participating in VoxPupuli (previously Puppet Community) and contributing features and fixes to Puppet itself. I think I’m making progress here, but still have a ways to go. Grade: Pass.

Propose a PuppetConf Talk

This was originally a goal to propose a VMworld talk, but I changed that as I didn’t have good subject matter for it before the CFP ended. I did submit a CFP to PuppetConf and was accepted! I presented in October and you can catch the video and slides online. I enjoyed the hell out of the conference and I dare say my talk did well, too! Grade: Pass.


I have made zero progress here. It was a busy year! I have until April to get this or renew the VCP and I’m not sure which it will be. Grade: Fail

Read War & Peace

This wasn’t on the list, but it was a personal goal. I’ve read a lot of Barnes & Noble Classics and I love Russian literature (Crime and Punishment in particular!), but at ~1100 pages of translated mid-1800s Russian, War & Peace was intimidating. I started this in the latter part of the year and I underestimated the time required to chew on it. I’m around 450 pages in after a few months. It’s been slow but very rewarding. You’ll be happy to know that in 1810, the Russians had meetings about having a meeting. What’s old is new again! Grade: C

I’d say I had a successful year. I didn’t hit all my goals, but like New Year’s Resolutions, I knew some goals would change and others wouldn’t be as important, it was more a guide for the year. I will be posting some new goals for 2016 shortly, though, so I can stay grounded this year as well!