Alternatively: Common mistakes made when adopting Puppet.
I love me some Puppet, and anyone who knows me will tell you I’ll talk about it and configuration management as long as you let me. However, sometimes it’s not the answer people expect it to be. Is it even the right tool? As a counterpoint to Why Puppet?, let’s look at some potential use cases and see whether they are a good fit. These use cases have been gathered from my own usage, ask.puppetlabs.com, #puppet on IRC, and user stories recounted to me; they are presented in no particular order. Special thanks to Ryan McKern for some additional stories and editing.
Is it possible to run something only if the file/user/package/whatever is present? (IRC, nearly every day)
The situation is often presented as, “$Thing won’t install without me answering some questions or providing an answer file, can I get Puppet to manage it only if the package is installed?” Yes, but also no.
The goal of any CM platform is to describe a desired state and enforce it. Whether you use an imperative tool (shell scripts, Chef) or a declarative tool (Puppet, Ansible), you are saying, “This is what I want node X to look like.” This sort of conditional turns your CM platform into a waffler. “Here’s what I want node X to look like – unless…” Desired state becomes an uncertain, conditional state.
I understand the pattern: you’re going to install the application manually and configure it before Puppet runs, and presumably you’ll then manage some additional resources related to that software. But what happens if the software is uninstalled, or merely appears uninstalled to Puppet (not all software registers with a package manager)? Will Puppet then attempt to reinstall the software and enforce your desired state? Nope! You told Puppet that if the software isn’t installed, it shouldn’t worry about it. Will Puppet send an alert about the missing software? Perhaps, if you used a notify in the else branch – but the exit code will look normal, since Puppet considers it a successful run. You’ve broken your ability to set and enforce the state of your infrastructure through your configuration management system.
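The pattern in question usually looks something like this sketch, where the custom fact and resource names are hypothetical:

```puppet
# Anti-pattern: only manage myapp's config if a custom fact reports
# the package as installed. If the app vanishes, Puppet silently
# stops enforcing this part of the node's state.
if $::myapp_installed == 'true' {
  file { '/etc/myapp/myapp.conf':
    ensure => file,
    source => 'puppet:///modules/myapp/myapp.conf',
  }
} else {
  # Run still exits successfully; nothing is enforced or repaired.
  notify { 'myapp not installed; skipping config management': }
}
```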
There are numerous ways to fix this, depending upon the root cause. Some packages (more frequently with .deb than .rpm) are interactive installs, but a preseed file can be provided to non-interactively complete the install. This may require managing the file resource and ensuring it is present before the package resource. In other cases, the magic happens when the service is started for the first time and no configuration is available. You can provide the configuration file through ERB templates, file resources, or even through a user-created package containing the config bundle. Ensure the config is present before the package itself, or before the service is started, and the issue should be resolved.
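As a minimal sketch of the preseed approach (package name and paths are illustrative; the responsefile attribute is used by the apt provider):

```puppet
# Ship the debconf answers file before the package is installed, so
# the install can complete non-interactively.
file { '/var/cache/local/preseed/mypackage.preseed':
  ensure => file,
  source => 'puppet:///modules/mypackage/mypackage.preseed',
  mode   => '0600',
}

package { 'mypackage':
  ensure       => installed,
  responsefile => '/var/cache/local/preseed/mypackage.preseed',
  require      => File['/var/cache/local/preseed/mypackage.preseed'],
}
```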
Bottom Line: Configuration management is about specifying a state and letting the tool converge to that desired state. Conditional state is no state at all.
A black box device only shows its config via a web page. Can I wget the config as a file, saving it in the same location every time, and have Puppet notify me when the content changes? (IRC)
This was an interesting question from IRC, and a tough one, too. Many of us have to deal with software that has less-than-desirable traits, including having no API to determine when changes are made. In this case, the device at least lets the user grab a copy of the config via http(s), but it apparently doesn’t say when there are changes.
As described, this isn’t a great use case for Puppet. It would require an Exec resource that computes the md5 checksum of the current file, another Exec to wget the config again, a third Exec to compute the checksum of the new copy, and then perhaps a Notify resource if the checksums differ. Every Exec has to run on every Puppet run, which can introduce delay, especially if the black box’s web server takes a while to respond. The output ends up in a Notify, which is visible in syslog, meaning you also need a watchdog of some sort (local or on the log collector) that looks for the Notify string in order to take action. And if you run the agent interactively, you get to wait for the Execs to complete every time. You made a tweak to a package version and want to force the agent to run now? You get to wait. You messed it up and need to tweak it some more? You get to wait, again.
On the whole, this isn’t quite what configuration management is for. It’s easy to view it as managing the configuration of the black box, but from Puppet’s perspective the goal is to manage the state of the node that’s running wget. The black box’s config file isn’t really part of that node’s state, and hence Puppet is being forced to act like a circus contortionist to make things work.
The Exec-heavy setup can be improved to something that fits Puppet’s model, however. The process of checking for a difference in a file was defined above, so make THAT part of the state of the node. A shell script that does the heavy lifting – md5, wget, md5, generate syslog/ticket – can be managed as a File resource or, preferably, a Package (FPM to the rescue!). A Cron resource specifies the interval at which to run the script. Puppet then ensures that the script file exists with the correct contents, permissions, etc., and that the cron job is present. You could then combine this script with a tool like Tripwire, which is purpose-built to monitor and alert on changed files, or a monitoring system like Nagios, which accepts custom plugins that could generate an alert when the checksum changes.
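A sketch of what this looks like in the DSL, assuming a hypothetical blackbox module that ships the script:

```puppet
# Manage the watchdog script itself as part of the node's state...
file { '/usr/local/bin/check_blackbox_config.sh':
  ensure => file,
  source => 'puppet:///modules/blackbox/check_blackbox_config.sh',
  owner  => 'root',
  mode   => '0755',
}

# ...and the schedule it runs on. Puppet enforces that both exist;
# cron does the actual periodic checking, not the agent run.
cron { 'blackbox-config-watch':
  ensure  => present,
  command => '/usr/local/bin/check_blackbox_config.sh',
  user    => 'root',
  minute  => '*/30',
  require => File['/usr/local/bin/check_blackbox_config.sh'],
}
```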
Additionally, wget is often not part of a minimal install. Do you really want to manage wget with yet another Package resource? Consider curl instead: it is arguably the better tool, and it is usually present even on minimal installs.
Bottom Line: Puppet does not function well as a data integrity or monitoring system. Use purpose-built tools in conjunction with Puppet.
Can I run some commands but only once? (IRC, multiple Ask questions)
This is the opposite of how Exec resources are meant to work: “Any command in an exec resource must be able to run multiple times without causing harm — that is, it must be idempotent.” You can guard an Exec with onlyif, unless, or creates, though the first two run their check command on every checkin, and the third only skips the command when the specified file already exists. Puppet won’t create that file – something else would need to do that (typically, but not always, the Exec command itself during its first run). And this pattern can be dangerous! If your exec command is not idempotent and someone or something deletes the creates file, or the onlyif/unless check suddenly fails, the command will run again. Is the lack of idempotency benign (unlikely), or could it inadvertently cause a service outage or impairment?
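For reference, the guard attributes discussed above look like this (commands and paths are purely illustrative):

```puppet
# 'creates' skips the command once the sentinel file exists. The
# command itself must create that file, or the Exec runs every time.
exec { 'initialize-app':
  command => '/usr/local/bin/app-init --seed',
  creates => '/var/lib/app/.initialized',
}

# 'unless' runs its check on every agent run; the command fires only
# when the check returns non-zero.
exec { 'enable-feature':
  command => '/usr/local/sbin/app-ctl enable foo',
  unless  => '/usr/local/sbin/app-ctl status foo',
}
```

If the sentinel file is deleted or the check misbehaves, both commands run again, which is exactly the hazard described above.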
In addition, Execs can be fragile because they run outside the scope of the Puppet DSL. They do not always work well with --noop mode, as Puppet cannot determine the resultant state.
If you have commands that need to run only once, investigate the alternatives. Perhaps your provisioning system (you do have one, right?) can run the one-time command during the initial provisioning effort, or even as part of the template it creates your node from. If the command is associated with a package install, a post-install script should probably run it. If it really is a one-time thing, and the node itself is a one-time thing, it might even be appropriate to run it by hand (spoiler alert: it’s never a one-time thing!).
Bottom Line: Enforce one-time commands before or during provisioning and do not leave yourself vulnerable to an outage if automation ends up running the command out-of-band or at the wrong time.
I want to verify I’m patched against the latest CVEs! (IRC)
This is a really murky use case. It’s important to know that you have the right versions of OpenSSL and bash (Heartbleed, Shellshock, GHOST, FREAK, POODLE – you really do want to know that you are not vulnerable to any of these!). There will be other packages with known vulnerabilities, and Puppet can help you with that: your state definitions can pin those packages to specific versions, allowing you to query PuppetDB for outliers, or you can use MCollective to gather information from your managed nodes in real time. Believe it or not, there are modules that claim to do just this! Unfortunately, so far I’ve only seen modules that look at the version of a package or the size of a file to determine whether it matches the known patched version. Read that sentence again. They only verify the desired version of a package or the size of a file. Is there any actual verification that the system is not vulnerable? Nope – no exploit code is tested.
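The legitimate half of this – pinning known-patched versions so you can spot outliers via PuppetDB – can be as simple as this sketch (the version string is hypothetical; check your distro’s errata):

```puppet
# Pin openssl to a known-patched build. Nodes that drift from this
# version show up as corrective changes, and PuppetDB can report
# what is actually installed across the fleet.
package { 'openssl':
  ensure => '1.0.1g-1',
}
```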
Version checking alone is a problem because it is only half of compliance checking. Your systems say they are not vulnerable, but you need to verify that they actually are not. Many (most?) package managers only validate package payloads at install/upgrade time, ensuring the installed file matches the md5 checksum specified by the package. Some package managers maintain file checksums in the package metadata and can tell you if the checksums no longer match. Do you know how your package managers behave? If someone modifies or deletes a file later, the package manager may not notice. A report which tells you that version 1.2.3 is installed may have only verified the contents of the package manager’s database. Proper compliance auditing should involve some form of penetration testing. But penetration testing has the potential to trigger an exploit and cause an outage, so it is typically performed only at certain intervals throughout the year, and often against a few “representative” systems rather than all systems (not everyone has, is ready for, or even desires Chaos Monkey). You do not want Puppet running exploit code every 30 minutes against your entire infrastructure. Not only is it dangerous, it’s almost certainly needless, senseless, destructive, and irresponsible.
Your (professionally paranoid) security compliance officers usually want to run their own checks, because if you were failing the checks, then you could just fake the data. They’ll still want to run an audit even if you do it automatically every 30 minutes. Please do not use any configuration management tools as a method to ensure your systems are not vulnerable by using exploit code. Your career will thank you.
Bottom Line: Rely on your security team’s audit tools, tripwire, and other CVE-centric tools to enforce and police your security posture.
Edit: Shortly after I posted this, I was pointed to Ben Ford’s File/mcollective/report method. This is exactly how you could audit for CVEs using non-destructive tests on demand! Destructive tests still need not apply.
I want to copy a file from one agent to another. (ask)
Sure! You could do this with an Exec resource that calls scp. But should you? Ask yourself, is there a valid use-case that leads to fetching a file from NodeA and copying it to NodeB?
What you should be thinking about is how you’re evaluating state. The state of NodeB involves a bunch of resources from Puppet, plus one resource sourced from NodeA. If NodeA disappears or is unresponsive, can the state of NodeB still be enforced? Can the contents of that file be recovered if both nodes are lost?
Furthermore, this is a potential security risk. NodeA has a file on it and always provides it to NodeB. If the file on NodeA is deleted or modified to include a malicious payload, NodeB then ends up without the required file or with the same malicious payload. Again, the desired state becomes unenforceable.
If two nodes need the same file, have Puppet provide the file as a File resource. The file can be part of a module and live in version control, ensuring that its content is versioned and auditable.
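A minimal sketch, with hypothetical module and path names – both nodes get the same resource from the module, not from each other:

```puppet
# Serve the shared file from a module under version control instead
# of scp-ing it between nodes. Neither node's state depends on the
# availability (or integrity) of the other.
file { '/etc/myapp/shared.conf':
  ensure => file,
  source => 'puppet:///modules/myapp/shared.conf',
  owner  => 'root',
  mode   => '0644',
}
```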
Bottom Line: Don’t try to force CM systems to run scp for you; it will just leave you with inter-dependent nodes and no useful version control for these resources.
My app/system uses some shell scripts – I need Puppet to do that! (IRC, just about every week)
Although we have covered most aspects of this above, it is such a common request, and it encapsulates the most serious transitional issues people encounter, that it is worth analyzing more closely. Here are some of the issues you may run into.
- We already covered Exec resources and idempotence above. You’re going to have issues. Exec resources are bad, m’kay?
- Puppet is about enforcing a desired state through the Puppet DSL, a declarative language. If you try hard enough, you can make it act like an imperative language. That would be akin to ripping the bottom out of your car, dressing in a leopard-print toga and a ratty blue tie, and driving to work with your feet. Is that something you really want to do, or just watch on Boomerang?
- The code will almost definitely end up littered with comments like “X run may fail at first, just run Puppet again.” Congratulations! Your class now fails at idempotence. By using the wrong resource types (or the right types incorrectly) and trying to imperatively define state, you created a failure domain.
- Future iterations of Puppet may not respect the goal of your code once you have broken the declarative model. This was a very common problem in the transition from Puppet 2 to 3, and Puppet 4 is right around the corner. Changes in how Puppet orders resources (see Introducing Manifest-Ordered Resources by Eric Sorenson), evaluates return codes, etc., could cause failures when upgrading. You will incur some technical debt for your troubles. You may have to pay off that debt with a rewrite (hope you didn’t have too much code, like 10,000 lines of Puppet or anything), or carry the debt by staying tied to an older version of Puppet – and Ruby. Who doesn’t love Ruby v1.8.6?
- Sharing your code with others becomes more difficult. The complexities of forcing an imperative model through declaration means that even your future self may have difficulty reading your code. Will someone used to Puppet’s declarative model be able to understand your code? Will they even want to? Will you scare them away during an interview, ensuring your team remains small?
I think these points boil down to a simple concept: the word need is overused. You need to provide services, but rarely is the method of providing them something you “need”. This isn’t homework; you don’t need to show your work. Let Puppet do its thing and figure out how to get from A to B to Q to ∀ for you. The sooner you learn to “Think Declaratively”, the sooner you can transition from the imperative model you already knew to be insufficient to a declarative model that lets the computer do the work for you. Take a moment and read Luke Kanies’s article on “Why Puppet has its own configuration language” and Mike English’s “From Imperative to Declarative System Configuration with Puppet.”
Bottom Line: A change in mindset must accompany new tooling in order to fully benefit from the changing model. If you persist with your old model, you only add complexity.
We’ve explored some common requests seen from Puppet users and how to adjust those requests to conform to Puppet’s model and strengths. Adjusting your mental and workflow processes to align with a chosen tool is always important if you want to get the most out of that tool. Forcing the tool to bend itself to your will often incurs technical debt and higher maintenance costs. There may be some cases where you cannot change the process at all or where bending it incurs other costs. In such situations, perhaps using Puppet is not appropriate. There is no one tool that can solve all problems! It’s okay to admit that there are other tools better suited to the process at hand, either in conjunction or as outright replacements. Many people use a combination of Puppet and Salt/Ansible/Chef, and that’s okay! I hope this article helps you to analyze and make those decisions.