Linux OS Patching with Puppet Tasks

One of the biggest gaps in most IT security policies is a very basic feature, patching. Specific numbers vary, but most surveys show a majority of hacks are due to unpatched vulnerabilities. Sadly, in 2018, automatic patching on servers is still out of the grasp of many, especially those running older OSes.

While there are a number of solutions out there from OS vendors (WSUS for Microsoft, Satellite for RHEL, etc.), I manage a number of OSes and the one commonality is that they are all managed by Puppet. A single solution with central reporting of success and failure sounds like a plan. I took a look at Puppet solutions and found a module called os_patching by Tony Green. I really like this module and what it has to offer, even though it doesn’t address all my concerns at this time. It shows a lot of promise and I suspect I will be working with Tony on some features I’d like to see in the future.

Currently, os_patching only supports Red Hat/Debian-based Linux distributions. Support is planned for Windows, and I know someone is looking at contributing to provide SuSE support. The module will collect information on patching that can be used for reporting, and patching is performed through a Task, either at the CLI or using the PE console’s Task pane.

Setup

Configuring your system to use the module is pretty easy. Add the module to your Puppetfile / .fixtures.yml, add a feature flag to your profile, and include os_patching behind the feature flag. Implement your tests and you’re good to go. Your only real decision is whether you default the feature flag to enabled or disabled. In my home network, I will enable it, but a production environment may want to disable it by default and enable it as an override through hiera. Because the fact collects data from the node, it will add a few seconds to each agent’s runtime, so be sure to include that in your calculation.

Adding the module is pretty simple, Here are theĀ Puppetfile / .fixtures.yml diffs:

# Puppetfile
mod 'albatrossflavour/os_patching', '0.3.5'

# .fixtures.yml
fixtures:
  forge_modules:
    os_patching:
      repo: "albatrossflavour/os_patching"
      ref: "0.3.5"

Next, we need an update to our tests. I will be adding this to my profile::base, so I modify that spec file. Add a test for the default feature flag setting, and one for the non-default setting. Flip the to and not_to if you default the feature flag to disabled. If you run the tests now, you’ll get a failure, which is expected since there is no supporting code in the class yet.(there is more to the test, I have only included the framework plus the next tests):

require 'spec_helper'
describe 'profile::base', :type => :class do
  on_supported_os.each do |os, facts|
    let (:facts) {
      facts
    }

    context 'with defaults for all parameters' do
      it { is_expected.to contain_class('os_patching') }
    end

    context 'with manage_os_patching enabled' do
      let (:params) do {
        manage_os_patching: false,
      }
      end

      # Disabled feature flags
      it { is_expected.not_to contain_class('os_patching') }
    end
  end
end

Finally, add the feature flag and feature to profile::base (the additions are in italics):

class profile::base (
  Hash    $sudo_confs = {},
  Boolean $manage_puppet_agent = true,
  Boolean $manage_firewall = true,
  Boolean $manage_syslog = true,
  Boolean $manage_os_patching = true,
) {
  if $manage_firewall {
    include profile::linuxfw
  }

  if $manage_puppet_agent {
    include puppet_agent
  }
  if $manage_syslog {
    include rsyslog::client
  }
  if $manage_os_patching {
    include os_patching
  }
  ...
}

Your tests will pass now. That’s all it takes! For any nodes where it is enabled, you will see a new fact and some scripts pushed down on the next run:

[rnelson0@build03 controlrepo:production]$ sudo puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Notice: /File[/opt/puppetlabs/puppet/cache/lib/facter/os_patching.rb]/ensure: defined content as '{md5}af52580c4d1fb188061e0c51593cf80f'
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for build03.nelson.va
Info: Applying configuration version '1535052836'
Notice: /Stage[main]/Os_patching/File[/etc/os_patching]/ensure: created
Info: /Stage[main]/Os_patching/File[/etc/os_patching]: Scheduling refresh of Exec[/usr/local/bin/os_patching_fact_generation.sh]
Notice: /Stage[main]/Os_patching/File[/usr/local/bin/os_patching_fact_generation.sh]/ensure: defined content as '{md5}af4ff2dd24111a4ff532504c806c0dde'
Info: /Stage[main]/Os_patching/File[/usr/local/bin/os_patching_fact_generation.sh]: Scheduling refresh of Exec[/usr/local/bin/os_patching_fact_generation.sh]
Notice: /Stage[main]/Os_patching/Exec[/usr/local/bin/os_patching_fact_generation.sh]: Triggered 'refresh' from 2 events
Notice: /Stage[main]/Os_patching/Cron[Cache patching data]/ensure: created
Notice: /Stage[main]/Os_patching/Cron[Cache patching data at reboot]/ensure: created
Notice: Applied catalog in 54.18 seconds

You can now examine a new fact, os_patching, which will shows tons of information including the pending package updates, the number of packages, which ones are security patches, whether the node is blocked (explained in a bit), and whether a reboot is required:

[rnelson0@build03 controlrepo:production]$ sudo facter -p os_patching
{
  package_updates => [
    "acl.x86_64",
    "audit.x86_64",
    "audit-libs.x86_64",
    "audit-libs-python.x86_64",
    "augeas-devel.x86_64",
    "augeas-libs.x86_64",
    ...
  ],
  package_update_count => 300,
  security_package_updates => [
    "epel-release.noarch",
    "kexec-tools.x86_64",
    "libmspack.x86_64"
  ],
  security_package_update_count => 3,
  blocked => false,
  blocked_reasons => [],
  blackouts => {},
  pinned_packages => [],
  last_run => {},
  patch_window => "",
  reboots => {
    reboot_required => "unknown"
  }
}

Additional Configuration

There are a number of other settings you can configure if you’d like.

  • patch_window: a string descriptor used to “tag” a group of machines, i.e. Week3 or Group2
  • blackout_windows: a hash of datetime start/end dates during which updates are blocked
  • security_only: boolean, when enabled only the security_package_updates packages and dependencies are updated
  • reboot_override: boolean, overrides the task’s reboot flag (default: false)
  • dpkg_options/yum_options: a string of additional flags/options to dpkg or yum, respectively

You can set these in hiera. For instance, my global config has some blackout windows for the next few years:

os_patching::blackout_windows:
  'End of year 2018 change freeze':
    'start': '2018-12-15T00:00:00+1000'
    'end':   '2019-01-05T23:59:59+1000'
  'End of year 2019 change freeze':
    'start': '2019-12-15T00:00:00+1000'
    'end':   '2020-01-05T23:59:59+1000'
  'End of year 2020 change freeze':
    'start': '2020-12-15T00:00:00+1000'
    'end':   '2021-01-05T23:59:59+1000'
  'End of year 2021 change freeze':
    'start': '2021-12-15T00:00:00+1000'
    'end':   '2022-01-05T23:59:59+1000'

Patching Tasks

Once the module is installed and all of your agents have picked up the new config, they will start reporting their patch status. You can query nodes with outstanding patches using PQL. A search like inventory[certname] {facts.os_patching.package_update_count > 0 and facts.clientcert !~ 'puppet'} can find all your agents that have outstanding patches (except puppet – kernel patches require reboots and puppet will have a hard time talking to itself across a reboot). You can also select against a patch_window selection with and facts.os_patching.patch_window = "Week3" or similar. You can then provide that query to the command line task:

puppet task run os_patching::patch_server --query="inventory[certname] {facts.os_patching.package_update_count > 0 and facts.clientcert !~ 'puppet'}"

Or use the Console’s Task view to run the task against the PQL selection:

Add any other parameters you want in the dialog/CLI args, like setting rebootto true, then run the task. An individual job will be created for each node, all run in parallel. If you are selecting too many nodes for simultaneous runs, use additional filters, like the aforementioned patch_window or other facts (EL6 vs EL7, Debian vs Red Hat), etc. to narrow the node selection [I blew up my home lab, which couldn’t handle the CPU/IO lab, when I ran it against all systems the first time, whooops!]. When the job is complete, you will get your status back for each node as a hash of status elements and the corresponding values, including return (success or failure), reboot, packages_updated, etc. You can extract the logs from the Console or pipe CLI logs directly to jq (json query) to analyze as necessary.

Summary

Patching for many of us requires additional automation and reporting. The relatively new puppet module os_patching provides helpful auditing and compliance information alongside orchestration tasks for patching. Applying a little Puppet Query Language allows you to update the appropriate agents on your schedule, or to pull the compliance information for any reporting needs, always in the same format regardless of the (supported) OS. Currently, this is restricted to Red Hat/Debian-based Linux distributions, but there are plans to expand support to other OSes soon. Many thanks to Tony Green for his efforts in creating this module!

Powershell in a Post-TLS1.1 World

I was trying to install PowerCLI on a new server in a new environment today and I encountered all sorts of error messages when PowerShell tried to install the required NuGet provider:

PS C:\Windows\system32> Find-Module -Name VMware.PowerCLI
WARNING: Unable to download from URI 'https://go.microsoft.com/fwlink/?LinkID=627338&clcid=0x409' to ''.
WARNING: Unable to download the list of available providers. Check your internet connection.
PackageManagement\Install-PackageProvider : No match was found for the specified search criteria for the provider 'NuGet'. The package provider 
requires 'PackageManagement' and 'Provider' tags. Please check if the specified package has the tags.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7405 char:21
+ ... $null = PackageManagement\Install-PackageProvider -Name $script:N ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (Microsoft.Power...PackageProvider:InstallPackageProvider) [Install-PackageProvider], Exception
+ FullyQualifiedErrorId : NoMatchFoundForProvider,Microsoft.PowerShell.PackageManagement.Cmdlets.InstallPackageProvider

PackageManagement\Import-PackageProvider : No match was found for the specified search criteria and provider name 'NuGet'. Try 
'Get-PackageProvider -ListAvailable' to see if the provider exists on the system.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7411 char:21
+ ... $null = PackageManagement\Import-PackageProvider -Name $script:Nu ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (NuGet:String) [Import-PackageProvider], Exception
+ FullyQualifiedErrorId : NoMatchFoundForCriteria,Microsoft.PowerShell.PackageManagement.Cmdlets.ImportPackageProvider

WARNING: Unable to download from URI 'https://go.microsoft.com/fwlink/?LinkID=627338&clcid=0x409' to ''.
WARNING: Unable to download the list of available providers. Check your internet connection.
PackageManagement\Get-PackageProvider : Unable to find package provider 'NuGet'. It may not be imported yet. Try 'Get-PackageProvider 
-ListAvailable'.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7415 char:30
+ ... tProvider = PackageManagement\Get-PackageProvider -Name $script:NuGet ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (Microsoft.Power...PackageProvider:GetPackageProvider) [Get-PackageProvider], Exception
+ FullyQualifiedErrorId : UnknownProviderFromActivatedList,Microsoft.PowerShell.PackageManagement.Cmdlets.GetPackageProvider

Find-Module : NuGet provider is required to interact with NuGet-based repositories. Please ensure that '2.8.5.201' or newer version of NuGet 
provider is installed.
At line:1 char:1
+ Find-Module -Name VMware.PowerCLI
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (:) [Find-Module], InvalidOperationException
+ FullyQualifiedErrorId : CouldNotInstallNuGetProvider,Find-Module

I made it very angry, and I didn’t know why! After some searching, I stumbled on a solution on the Microsoft Community site. The issue is that PowerShell 5.1 defaults to only enabling SSL3 and TLS 1.0 for secure HTTP connections. You have probably noticed a lot of recent warnings on various websites about services removing support for TLS 1.0 and 1.1, and SSL3 has been disabled for many for years. Microsoft is no slacker here, and go.microsoft.com has dropped support for SSL3 and TLS 1.0 (probably TLS 1.1, too, but I didn’t check). Thus the Provider list at the URL cannot be accessed and the NuGet install fails.

PS C:\ProgramData\Documents> [Net.ServicePointManager]::SecurityProtocol
Ssl3, Tls

You can fix this by specifying Tls12 as the SecurityProtocol, but it only persists in this session, for this user. Thankfully, PowerShell has a well documented series of profile loads, so you can make the change once for all users on the server. You can choose whichever level works best for you. I chose $PsHome\Profile.ps1 which affects All Users, All Hosts. If you choose a global file like that, launch a PowerShell session as administrator (if you weren’t aware, there’s a Ctrl-modifier to avoid right-clicking!) so that you have the rights to edit the target file. If not, just substitute the file below with your choice.

This snippet will check for the existence of the file and create it if needed, then populate it with our one line change and comment telling us why. Finally, it opens the file so you can inspect it and adjust if you need to. Note that running it again will append the same lines, which isn’t harmful but may result in a little confusion for the next person to peek at it. Hello, future self!

$ProfileFile = "${PsHome}\Profile.ps1"

if (! (Test-Path $ProfileFile)) {
New-Item -Path $ProfileFile -Type file -Force
}
''                                                                                | Out-File -FilePath $ProfileFile -Encoding ascii -Append
'# It is 2018, SSL3 and TLS 1.0 are no good anymore'                              | Out-File -FilePath $ProfileFile -Encoding ascii -Append
'[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12' | Out-File -FilePath $ProfileFile -Encoding ascii -Append

notepad $ProfileFile

If you enter [Net.ServicePointManager]::SecurityProtocol in the current window, you’ll get the same Ssl3, Tls result you saw before. The profile is only loaded at startup. Open a new powershell instance on the server – as any user, even – and run it again. You should see the new setting:

PS C:\windows\system32> [Net.ServicePointManager]::SecurityProtocol
Tls12

Now you are ready to use PowerShell to connect to modern web servers, whether it’s to install NuGet, use Invoke-WebRequest, or any other HTTPS connection. Enjoy!

vRealize Orchestrator Workflows for Puppet Enterprise

Over the past three years, my Puppet for vSphere Admins series has meandered through a number of topics, mostly involved on the Puppet side and somewhat light on the vSphere side. That changed a bit with my article Make the Puppet vRealize Automation plugin work with vRealize Orchestrator, describing how to use the plugin’s built-in workflows to perform some actions on your VMs. However, you had to invoke the workflows one by one, and they only worked on existing VMs. That is not good enough for automation! Today, we will start to look at how to integrate the Puppet Enterprise plugin into other workflows to provide end-to-end lifecycle management for your VMs.

What is the lifecycle of a VM? This can vary quite a bit, so the lifecycle we will work with today is made to be generic enough for everyone to use, but flexible enough that everyone can expand on it. It consists of:

  • Provisioning
    • Updating ancillary systems prior to VM creation (IPAM, DNS, etc)
    • Deploying a Virtual Machine
    • Installing Puppet Enterprise on the VM
    • Using Puppet Enterprise to provision services on and configure the VM
    • Add the new VM to a vCenter tag-based backup system
  • Decommission
    • Delete the VM (removes from backups)
    • Purge the record from PE
    • Update ancillary systems after VM removal (IPAM, DNS, etc)

Continue reading

Automating Puppet tests with a Jenkins Job, version 1.1

Today, let’s build on version 1.0 of our Jenkins job. We are running builds against every commit, but when someone opens a pull request, they don’t get automated builds or feedback. If the PR submitter even knows about Jenkins, and has network access and a login, they can look at it to find out how the tests went, but most people aren’t going to have that visibility (especially if your Jenkins server is private, as in this example setup). We need to make sure Jenkins is aware of the pull request and that it updates the PR with the status. Our end goal is for each PR to start a Jenkins build and update the PR with a successful check when done:

To get there, we will install and configure a new plugin and configure our job to use the plugin.

Continue reading

Automating Puppet tests with a Jenkins Job, version 1.0

As I’ve worked through setting up Jenkins and Puppet (and remembering my password!), I created a job to automate rspec tests on my puppet controlrepo. I am sure I will go through many iterations of this as I learn more, so we’ll just call this version 1.0. The goal is that when I push a branch to my controlrepo on GitHub, Jenkins automagically runs the test. Today, we will ensure that Jenkins is notified of activity on a GitHub repo, that it spins up a clean test environment without any left over files that may inadvertently assist, and run the tests. What it will NOT do is notify anyone – it won’t work off a Pull Request and provide feedback like Travis CI does, for instance. Hopefully, I will figure that out soon.

The example below is using GitHub. You can certainly make this work with BitBucket, GitLab, Mercurial, and tons of other source control systems and platforms, but you might need some additional Jenkins Plugins. It should be pretty apparent where to change Git/GitHub to the system/platform you chose.

Creating A Job

From the main view of your Jenkins instance, click New Item. Call it whatever you want, choose Freestyle project as the type, and click OK. The next page is going to be where we set up all the parameters for the job. There are tabs across the top AND you can scroll down; you’ll see the same selection items either way. Going from the top to the bottom, the settings that we want:

Continue reading

Jenkins Tricks – Password Recovery and Job Exports

I’m finally getting back to Jenkins, which I started waaaay back in November (here and here). Unfortunately, I kind of forgot my password. Well, that’s embarrassing! I also want to redo the manifest using maestrodev/rvm which means starting over, so I need to back things up. The manual for Jenkins and the results on Google can be overwhelming sometimes, so I thought I’d share what I learned to hopefully save someone else.

Password Recovery

There’s a few ways I found to recover your password. One suggestion is to disable all security, delete your user, re-enable security and allow signups, and then recreate the same user and things should just “work”. Part of the reason you have to do this is that once you disable security, you can’t change the password for your user; only the user can. That’s … frustrating.

Disable security by editing $JENKINS_HOME/config.xml, /var/lib/jenkins/config.xml on my instance. I was able to get away with disabling it by changing <useSecurity>true</useSecurity> to false, though the article suggests removing two other lines. Restart the service with systemctl restart jenkins or equivalent and now you’re able to get in and recreate some users.

Continue reading

Scheduling regular Travis CI builds with Cron Jobs

If you have a project of any complexity that is using Travis CI for testing, you’ve inevitably run into an issue where you make a minor change – maybe even just to a markdown file – and the tests that were passing before completely bomb. If this hasn’t happened to you, just wait, it will! This typically happens because some dependency, such as a newer rubygem you depend on indirectly, is no longer compatible with your test environment. As an example, I updated my .travis.yml in a puppet module project, opened PR16, and suddenly all the previously-green tests went red. The error NoMethodError: undefined method 'last_comment' for #<Rake::Application:0x000000015849f8> was observed because the dependent gem rake was updated from v11 to v12, conflicting with the pinned version of rspec. Until this problem is fixed, every PR, no matter how simple or complex, is going to fail its test.

This puts a lot of burden on both the person submitting the PR, who just wants their simple change to be approved, and the project maintainer, who wants their build status to work. Neither is satisfied until the problem is tracked down and remediated, often in a separate PR. By that time, the contributor may not have the interest to update their PR and the maintainer has to decide if they want to fix it up or discard it. No-one is happy. All because some upstream dependency changed and we weren’t aware of it until the moment a PR was submitted, even though it may have happened days or months ago.

Continue reading