Adding, extending, and removing Linux disks and partitions in 2019

Managing disks and partitions in Linux has changed quite a bit over time. Unfortunately, as Jonathan Frappier points out, a lot of advice is either wrong, dated, or makes some poor assumptions along the way.

I know I almost always go back to the search results drawing board when I have to manage a disk, since it’s pretty infrequent, and I have the same problem of sifting through documentation to find what will work and what won’t. I hope this article can be used as a source of modern, tested documentation for some common use cases.

Tools and File systems

First, a quick mention of the tools and file systems available.

fdisk

A venerable tool that continues to work fully as long as your disks are under 2 TiB in size, which is the ceiling of the DOS (MBR) partition table it has traditionally written. Once you’re over 2 TiB, I’d treat it as read-only and not use it to make edits. You can use fdisk -l to display information on all disks it sees or fdisk <devicepath> to interact with a specific disk:

[rnelson0@build03 ~]$ sudo fdisk -l

Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b06ec

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   209715199   104344576   8e  Linux LVM

Disk /dev/mapper/VolGroup00-lv_root: 12.6 GB, 12582912000 bytes, 24576000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/VolGroup00-lv_swap: 4294 MB, 4294967296 bytes, 8388608 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/VolGroup00-lv_home: 83.9 GB, 83886080000 bytes, 163840000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/loop0: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/loop1: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/docker-253:0-263061-pool: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes

[rnelson0@build03 ~]$ sudo fdisk /dev/sda
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   g   create a new empty GPT partition table
   G   create an IRIX (SGI) partition table
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Command (m for help): p

Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b06ec

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   209715199   104344576   8e  Linux LVM

Command (m for help): q
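
If you want a quick sanity check before reaching for fdisk, the 2 TiB ceiling can be sketched as simple arithmetic on the sector count. This is a minimal sketch, not part of fdisk: fits_in_mbr is a hypothetical helper name, and it assumes 512-byte logical sectors like the disk shown above (a DOS table stores sector numbers in 32 bits).

```shell
# Sketch: does a disk fit within what a DOS (MBR) partition table can address?
# fits_in_mbr is a hypothetical helper, not an fdisk or parted feature.
# Assumes the sector count is in logical sectors (512 bytes on this system).
fits_in_mbr() {
  sectors=$1
  max_sectors=4294967296   # 2^32 sectors; at 512 bytes each this is 2 TiB
  if [ "$sectors" -lt "$max_sectors" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Sector count of /dev/sda taken from the fdisk -l output above.
fits_in_mbr 209715200
```

For the 100 GiB /dev/sda in my lab this prints yes, so fdisk remains a safe choice there.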

parted

A more modern tool that supports disks over 2 TiB. I prefer this, though there’s nothing wrong with fdisk other than the size limit. Similarly, you can use -l to show all disks or a device name to interact with one. One nice thing is that it defaults to human-readable sizes instead of a jumble of long numbers without commas.

[rnelson0@build03 ~]$ sudo parted -l
Model: VMware Virtual disk (scsi)
Disk /dev/sda: 107GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End    Size   Type     File system  Flags
 1      1049kB  525MB  524MB  primary  ext4         boot
 2      525MB   107GB  107GB  primary               lvm


Model: Linux device-mapper (thin-pool) (dm)
Disk /dev/mapper/docker-253:0-263061-pool: 107GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End    Size   File system  Flags
 1      0.00B  107GB  107GB  xfs


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/VolGroup00-lv_home: 83.9GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  83.9GB  83.9GB  ext4


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/VolGroup00-lv_swap: 4295MB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system     Flags
 1      0.00B  4295MB  4295MB  linux-swap(v1)


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/VolGroup00-lv_root: 12.6GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  12.6GB  12.6GB  ext4


[rnelson0@build03 ~]$ sudo parted /dev/sda
GNU Parted 3.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) help
  align-check TYPE N                        check partition N for TYPE(min|opt) alignment
  help [COMMAND]                           print general help, or help on COMMAND
  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)
  mkpart PART-TYPE [FS-TYPE] START END     make a partition
  name NUMBER NAME                         name partition NUMBER as NAME
  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a particular partition
  quit                                     exit program
  rescue START END                         rescue a lost partition near START and END

  resizepart NUMBER END                    resize partition NUMBER
  rm NUMBER                                delete partition NUMBER
  select DEVICE                            choose the device to edit
  disk_set FLAG STATE                      change the FLAG on selected device
  disk_toggle [FLAG]                       toggle the state of FLAG on selected device
  set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
  toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition NUMBER
  unit UNIT                                set the default unit to UNIT
  version                                  display the version number and copyright information of GNU Parted
(parted) print
Model: VMware Virtual disk (scsi)
Disk /dev/sda: 107GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End    Size   Type     File system  Flags
 1      1049kB  525MB  524MB  primary  ext4         boot
 2      525MB   107GB  107GB  primary               lvm

(parted) quit

ext4

The “extended” family of filesystems (currently ext4, but possibly ext3 or ext2 if you work on some really old systems) has been used by Linux for a long time. It’s still a default in a number of distros, especially for smaller partitions. It’s a very competent journaled filesystem and the maximums have been massively extended over earlier versions (for example, 1 EiB max volume and 16 TiB max file size), but it’s still considered yesterday’s technology. The biggest fault I find with ext4 is that inodes – what tracks the metadata for files – are set at filesystem creation time and cannot be changed. This means that when you extend a volume, the inode count is not increased. Thus, after an expansion, you could wind up horribly undersized in inodes while still having free space available. New files cannot be created when the inodes are exhausted. Most people are familiar with measuring disk usage by space capacity, but you can also view your inode capacity:

[rnelson0@build03 ~]$ df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-lv_root   12G  6.7G  4.3G  62% /
devtmpfs                        908M     0  908M   0% /dev
tmpfs                           920M     0  920M   0% /dev/shm
tmpfs                           920M   98M  822M  11% /run
tmpfs                           920M     0  920M   0% /sys/fs/cgroup
/dev/sda1                       477M  185M  263M  42% /boot
/dev/mapper/VolGroup00-lv_home   77G  5.8G   68G   8% /home
tmpfs                           184M     0  184M   0% /run/user/1000
[rnelson0@build03 ~]$ df -i
Filesystem                      Inodes  IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-lv_root  768544 202858  565686   27% /
devtmpfs                        232260    360  231900    1% /dev
tmpfs                           235373      1  235372    1% /dev/shm
tmpfs                           235373    850  234523    1% /run
tmpfs                           235373     16  235357    1% /sys/fs/cgroup
/dev/sda1                       128016    354  127662    1% /boot
/dev/mapper/VolGroup00-lv_home 5120000 634404 4485596   13% /home
tmpfs                           235373      1  235372    1% /run/user/1000

This limitation leads me to recommend other filesystems, especially when you expect the filesystem to grow in the future.
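
Since inode exhaustion is easy to miss, it can help to flag busy filesystems automatically. The following is a minimal sketch that reads df -i style output on stdin; inode_report is a hypothetical helper name and the default threshold of 80% is my own assumption, not a recommendation from any tool.

```shell
# Sketch: list mount points whose inode usage is at or above a threshold.
# inode_report is a hypothetical helper; it reads `df -i` style text on stdin
# so it can be tried on saved output as well as on a live system.
inode_report() {
  threshold=${1:-80}
  awk -v limit="$threshold" '
    NR == 1 { next }                  # skip the header row
    {
      pct = $5; sub(/%/, "", pct)     # IUse% column, e.g. "27%" becomes 27
      if (pct != "-" && pct + 0 >= limit) print $6, $5
    }'
}

# Live usage would look like: df -i | inode_report 90
```

Anything it prints deserves a look before the next large batch of small files lands there.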

btrfs

Designed at Oracle for Linux, btrfs uses a copy-on-write B-tree data structure and provides advanced pooling, snapshots, and checksum features. Initially developed in 2008, it took until 2013 to be marked as stable and even longer for Linux distros to mark it as supported – even now, only Oracle Linux 7, SuSE 15, and Synology DSM v6 do so. Plenty of people do love it in spite of its support status, and it is a viable solution; I’m just not as familiar with it personally.

xfs

Originally designed by SGI, xfs was ported to Linux in 2001. This, too, took a long while to become supported by distros, but it is now supported by almost every distro, many of which use it as the default file system. It supports dynamic inode allocation, volumes AND files of up to 8 EiB minus 1 byte, online resizing, and many other features. This is the default in Red Hat Enterprise Linux and my preferred FS, which I’ll be using later – but it is roughly equivalent to btrfs in general, if you prefer that.

LVM

Not quite a file system or a tool, the Logical Volume Manager is a device mapper that provides an extra layer of abstraction between the raw disks and the file systems the OS uses. All of ext4, btrfs, and xfs can be used separately or with the LVM. Among other things, it lets you easily create device maps that span partitions and disks, resize them, and export/import them (for instance, to migrate disks to another system and retain the FS). I prefer using the LVM for everything; if you never modify the partitions it causes no harm, but if you ever need to and have not used LVM, it’s far more painful.

Most LVM commands begin with pv, vg, or lv and you can use tab-completion to discover the utilities we do not cover today. I regularly use pvs/vgs/lvs to get short summaries and pvdisplay/vgdisplay/lvdisplay for detailed status. All others are very situational.

Adding a new volume

With some understanding of the tools and filesystems at our fingertips, let’s take a look at a common scenario: adding a new volume. I’m only focused on the OS steps here, so we will assume that you have added a new physical disk, attached a VMDK or EC2 volume, etc. as your implementation requires. We will also assume that you want an xfs filesystem at /mnt/newdisk, the system had a single disk /dev/sda that was automatically partitioned by the installer, and the new disk receives the moniker /dev/sdb. Not all OSes and systems will provide the same name; for instance, Jonathan’s AWS Linux system’s second disk was called /dev/nvme1n1. In such a case, just swap in the correct device name.

Most modern distros will automatically detect new disks while running. If yours does not, or fails to detect the disk, you can force a rescan with the following command (change the host number as appropriate):

echo "- - -" > /sys/class/scsi_host/host0/scan
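
If you do not know which host number the new disk hangs off, one option is to rescan every host. This is a minimal sketch under that assumption: rescan_scsi_hosts is a hypothetical helper name, and its optional argument (an alternate sysfs root) exists only so the loop can be tried against a scratch directory. It needs root on a real system.

```shell
# Sketch: ask every SCSI host to rescan instead of guessing the host number.
# rescan_scsi_hosts is a hypothetical helper. The optional first argument
# overrides the sysfs root and is only there for dry runs; run as root.
rescan_scsi_hosts() {
  sysfs_root=${1:-/sys}
  for scan_file in "${sysfs_root}"/class/scsi_host/host*/scan; do
    [ -e "${scan_file}" ] || continue   # the glob matched nothing
    echo "- - -" > "${scan_file}"
    echo "rescanned ${scan_file}"
  done
}

# Usage on a real system: rescan_scsi_hosts
```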

Make sure the new device shows up before continuing:

[rnelson0@build03 ~]$ ls /dev/sd?
/dev/sda
[rnelson0@build03 ~]$ ls /dev/sd?
/dev/sda /dev/sdb

You can use fdisk or parted to get information on the device. There will be no label, since we haven’t touched it, but the reported size should be accurate (be sure to account for the difference between, say, gigabytes and gibibytes). I was surprised to learn that when I specify 16 GB in my vSphere system, it apparently means gibibytes, where the proper label is actually GiB. By comparison, parted’s GB does mean gigabytes.

[rnelson0@build03 ~]$ sudo parted /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Error: /dev/sdb: unrecognised disk label
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:
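
The jump from the 16 I asked for to the 17.2GB parted reports is just the GiB to GB conversion. A minimal sketch of the arithmetic follows; gib_to_gb is a hypothetical helper name.

```shell
# Sketch: convert binary gibibytes (GiB) to decimal gigabytes (GB).
# gib_to_gb is a hypothetical helper name.
gib_to_gb() {
  awk -v gib="$1" 'BEGIN { printf "%.1f\n", gib * 1024 * 1024 * 1024 / 1e9 }'
}

gib_to_gb 16   # prints 17.2, matching the size parted shows for /dev/sdb
```

The same conversion explains any other mismatch between what the hypervisor was told and what parted displays.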

If we plan to have multiple partitions on a single disk, this is where we would create them using mkpart. However, you do not need to create a partition and can use the whole disk. I generally prefer using the whole disk; it’s fairly cheap to add a new disk and I believe it is cognitively easier to manage than extending disks and partitions. For instance, if you add a 100 GB disk with a single partition and later grow the disk to 200 GB, you can just extend the partition easily. If you create two 50 GB partitions and later grow the disk, only the 2nd partition can be extended in place, because the new space sits after it; to give the 1st partition more room you would have to create a new partition and join it to an existing LVM mapping to (effectively) increase its space. Alternatively, you can just always add entire disks to LVM groups, assuming that any extension is via an additional disk, which is what we will do here.

Though we are not creating partitions on the disk, we do want to leverage the LVM to create a mapping that we can manipulate semi-independently of the disks. Each LVM mapping has 3 layers – the physical volume (PV), the volume group (VG), and the logical volume (LV). The physical volume refers to the disk itself (even if it’s a virtualized system, we pretend it’s a physical disk). The logical volume is where we create the end mapping that receives the file system. The volume group sits in the middle, sort of akin to the partitions we skipped, and presents the physical volumes’ space to the logical volumes. In our case, we simply add the entire disk as a physical volume in a new volume group, then assign 100% of the free space of the volume group to the new logical volume.

Once that is done, we format the result with our filesystem of choice, in this case xfs by using mkfs.xfs, and if it doesn’t exist, create the mount point. I normally put some of the information in variables, since the LVM commands are almost always the same. You’ll also notice I switched to root to save a few keystrokes over using sudo constantly, as all the LVM commands require elevated permissions.

#COMMANDS
DEVICE=/dev/sdb
NAME=newdisk
MOUNT=/mnt/newdisk

pvcreate ${DEVICE}
vgcreate vg_xfs_${NAME} ${DEVICE}
lvcreate -n lv_${NAME} -l 100%FREE vg_xfs_${NAME}
mkfs.xfs /dev/vg_xfs_${NAME}/lv_${NAME}
mkdir -p "${MOUNT}"

#OUTPUT
[root@build03 ~]# DEVICE=/dev/sdb
[root@build03 ~]# NAME=newdisk
[root@build03 ~]# MOUNT=/mnt/newdisk
[root@build03 ~]#
[root@build03 ~]# pvcreate ${DEVICE}
  Physical volume "/dev/sdb" successfully created.
[root@build03 ~]# vgcreate vg_xfs_${NAME} ${DEVICE}
  Volume group "vg_xfs_newdisk" successfully created
[root@build03 ~]# lvcreate -n lv_${NAME} -l 100%FREE vg_xfs_${NAME}
  Logical volume "lv_newdisk" created.
[root@build03 ~]# mkfs.xfs /dev/vg_xfs_${NAME}/lv_${NAME}
meta-data=/dev/vg_xfs_newdisk/lv_newdisk isize=512    agcount=4, agsize=1048320 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=4193280, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@build03 ~]# parted -l
...
Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/vg_xfs_newdisk-lv_newdisk: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  17.2GB  17.2GB  xfs

Finally, we have to mount the new filesystem. How your system manages the fstab can vary wildly, but generally just appending a line to /etc/fstab works (some distros automagically manage it). Mount it and now we can start using it!

[root@build03 ~]# echo "/dev/vg_xfs_${NAME}/lv_${NAME} ${MOUNT}          xfs    defaults        1 2" >> /etc/fstab
[root@build03 ~]# mount /mnt/newdisk
[root@build03 ~]# df -h /mnt/newdisk
Filesystem                             Size  Used Avail Use% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk   16G   33M   16G   1% /mnt/newdisk
[root@build03 ~]# df -i /mnt/newdisk
Filesystem                             Inodes IUsed   IFree IUse% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk 8386560     3 8386557    1% /mnt/newdisk
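
One thing to watch with a plain append is that running it twice leaves two entries for the same mount point. Here is a minimal sketch of a guarded append; add_fstab_entry is a hypothetical helper name, the optional last argument (the file to edit) exists only so it can be tried on a scratch copy, and the "defaults 1 2" fields simply mirror the line used above.

```shell
# Sketch: append an fstab entry only if the mount point is not already listed.
# add_fstab_entry is a hypothetical helper. The 4th argument defaults to
# /etc/fstab and is there so the function can be tried on a scratch copy.
add_fstab_entry() {
  device=$1
  mount_point=$2
  fstype=$3
  fstab=${4:-/etc/fstab}
  # Field 2 of a non-comment line is the mount point.
  if awk -v mp="${mount_point}" \
       '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "${fstab}"; then
    echo "${mount_point} already present in ${fstab}"
    return 0
  fi
  printf '%s %s %s defaults 1 2\n' "${device}" "${mount_point}" "${fstype}" >> "${fstab}"
}

# Usage: add_fstab_entry "/dev/vg_xfs_${NAME}/lv_${NAME}" "${MOUNT}" xfs
```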

Extending an LVM device with an existing disk

The next most common case is to extend the disk that an LVM device already uses. As I mentioned before, I prefer to just use a new disk as they’re frequently cheap, but some systems do make them more expensive. In my lab, I increased the size of my /dev/sdb to 32 GiB. Because the disk has already been detected, I have to force the OS to rescan and see the new space using echo 1 > /sys/class/block/<DEVICE>/device/rescan (mind the space after the 1; without it, the shell reads 1> as a redirect of standard output and echo writes only an empty line):

[root@build03 ~]# parted -l
...
Error: /dev/sdb: unrecognised disk label
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:

Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/vg_xfs_newdisk-lv_newdisk: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  17.2GB  17.2GB  xfs
...
[root@build03 ~]# echo 1 > /sys/class/block/sdb/device/rescan
[root@build03 ~]# parted -l
...
Error: /dev/sdb: unrecognised disk label
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 34.4GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:

Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/vg_xfs_newdisk-lv_newdisk: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  17.2GB  17.2GB  xfs

With the physical size increase visible, we need to expand the LVM device. First, pvresize so the physical volume sees the change, then vgscan to pick up the changes, and finally lvextend to have the logical volume allocate the volume group’s free (unassigned) space. The + in +100%FREE means to ADD the free space to the current size; 100%FREE without the plus sets the total size to the amount of free space, counted from zero rather than from the current size – I don’t make the rules, I just get run over by them like everyone else! Extend the FS with xfs_growfs and then look at the space and inodes. We can use the same variables as before to make it a little easier:

#COMMANDS
DEVICE=/dev/sdb
NAME=newdisk
MOUNT=/mnt/newdisk

pvresize ${DEVICE}
vgscan
lvextend -l +100%FREE /dev/vg_xfs_${NAME}/lv_${NAME}
xfs_growfs ${MOUNT}

#OUTPUT
[root@build03 ~]# DEVICE=/dev/sdb
[root@build03 ~]# NAME=newdisk
[root@build03 ~]# MOUNT=/mnt/newdisk
[root@build03 ~]#
[root@build03 ~]# pvresize ${DEVICE}
  Physical volume "/dev/sdb" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
[root@build03 ~]# vgscan
  Reading volume groups from cache.
  Found volume group "vg_xfs_newdisk" using metadata type lvm2
  Found volume group "VolGroup00" using metadata type lvm2
[root@build03 ~]# lvextend -l +100%FREE /dev/vg_xfs_${NAME}/lv_${NAME}
  Size of logical volume vg_xfs_newdisk/lv_newdisk changed from 16.00 GiB (4096 extents) to <32.00 GiB (8191 extents).
  Logical volume vg_xfs_newdisk/lv_newdisk successfully resized.
[root@build03 ~]# xfs_growfs ${MOUNT}
meta-data=/dev/mapper/vg_xfs_newdisk-lv_newdisk isize=512    agcount=4, agsize=1048320 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=4193280, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 4193280 to 8387584

[root@build03 ~]# df -h /mnt/newdisk
Filesystem                             Size  Used Avail Use% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk   32G   33M   32G   1% /mnt/newdisk
[root@build03 ~]# df -i /mnt/newdisk
Filesystem                              Inodes IUsed    IFree IUse% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk 16775168     3 16775165    1% /mnt/newdisk

You can see that the inode count is roughly twice what it was before, so you are in no danger of running out any time soon, even with a ton of tiny files!
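
The difference between +100%FREE and 100%FREE is easier to see with numbers. This is a minimal sketch of the arithmetic only; lv_target_extents is a hypothetical helper name and it does not call LVM. The extent counts are taken from the lvextend output in this section, where the volume went from 4096 to 8191 extents, so 4095 extents were free.

```shell
# Sketch: the target size, in extents, that each -l spelling asks for.
# lv_target_extents is a hypothetical helper; it never touches LVM.
lv_target_extents() {
  current=$1
  free=$2
  spec=$3
  case "${spec}" in
    +100%FREE) echo $(( current + free )) ;;  # relative: current size plus all free extents
    100%FREE)  echo "${free}" ;;              # absolute: total size equals the free extents
    *) echo "unsupported spec: ${spec}" >&2; return 1 ;;
  esac
}

lv_target_extents 4096 4095 +100%FREE   # prints 8191
lv_target_extents 4096 4095 100%FREE    # prints 4095
```

The second form asks for a volume smaller than the one we already have, which is clearly not what we want when extending.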

Extending an LVM device with a new disk

We can also extend an LVM device using a new disk. After attaching the new disk, you need to pvcreate the new physical volume, vgextend the volume group onto the new disk, and then lvextend the logical volume and xfs_growfs the filesystem. In this example, the new disk is /dev/sdc and is 16 GiB:

#COMMANDS
DEVICE=/dev/sdc
NAME=newdisk
MOUNT=/mnt/newdisk

pvcreate ${DEVICE}
vgextend vg_xfs_${NAME} ${DEVICE}
lvextend -l +100%FREE /dev/vg_xfs_${NAME}/lv_${NAME}
xfs_growfs ${MOUNT}

#OUTPUT
[root@build03 ~]# DEVICE=/dev/sdc
[root@build03 ~]# NAME=newdisk
[root@build03 ~]# MOUNT=/mnt/newdisk
[root@build03 ~]#
[root@build03 ~]# pvcreate ${DEVICE}
  Physical volume "/dev/sdc" successfully created.
[root@build03 ~]# vgextend vg_xfs_${NAME} ${DEVICE}
  Volume group "vg_xfs_newdisk" successfully extended
[root@build03 ~]# vgs
  VG             #PV #LV #SN Attr   VSize   VFree
  VolGroup00       1   3   0 wz--n- <99.51g   5.66g
  vg_xfs_newdisk   2   1   0 wz--n-  47.99g <16.00g
[root@build03 ~]# lvextend -n lv_${NAME} -l +100%FREE /dev/vg_xfs_${NAME}/lv_${NAME}
  Please specify a logical volume path.
  Run `lvextend --help' for more information.
[root@build03 ~]# lvextend -l +100%FREE /dev/vg_xfs_${NAME}/lv_${NAME}
  Size of logical volume vg_xfs_newdisk/lv_newdisk changed from <32.00 GiB (8191 extents) to 47.99 GiB (12286 extents).
  Logical volume vg_xfs_newdisk/lv_newdisk successfully resized.
[root@build03 ~]# xfs_growfs ${MOUNT}
meta-data=/dev/mapper/vg_xfs_newdisk-lv_newdisk isize=512    agcount=9, agsize=1048320 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=8387584, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 8387584 to 12580864
[root@build03 ~]# df -h /mnt/newdisk
Filesystem                             Size  Used Avail Use% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk   48G   33M   48G   1% /mnt/newdisk
[root@build03 ~]# df -i /mnt/newdisk
Filesystem                              Inodes IUsed    IFree IUse% Mounted on
/dev/mapper/vg_xfs_newdisk-lv_newdisk 25161728     3 25161725    1% /mnt/newdisk

Once again, the space and the inodes have increased!

Removing an LVM device

Finally, there are times when you will remove a mount point entirely. You can just “yank” the disks and you’ll probably be okay, but rather than cross our fingers and hope, we can manually remove the configuration to ensure no auto-detection goes wrong. The process includes unmounting the filesystem, removing the logical volume, volume group, and physical volumes, then yanking the disks. We will undo our previous examples, where /dev/sdb and /dev/sdc provide vg_xfs_newdisk and lv_newdisk. Just start at the end and work our way back:

[root@build03 ~]# umount /mnt/newdisk
[root@build03 ~]# lvremove lv_${NAME}
  Volume group "lv_newdisk" not found
  Cannot process volume group lv_newdisk
[root@build03 ~]# lvremove /dev/vg_xfs_newdisk/lv_newdisk
Do you really want to remove active logical volume vg_xfs_newdisk/lv_newdisk? [y/n]: y
  Logical volume "lv_newdisk" successfully removed
[root@build03 ~]# vgremove vg_xfs_newdisk
  Volume group "vg_xfs_newdisk" successfully removed
[root@build03 ~]# pvremove /dev/sdc
  Labels on physical volume "/dev/sdc" successfully wiped.
[root@build03 ~]# pvremove /dev/sdb
  Labels on physical volume "/dev/sdb" successfully wiped.
[root@build03 ~]#
[root@build03 ~]# lvs
  LV      VG         Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_home VolGroup00 -wi-ao----  78.12g
  lv_root VolGroup00 -wi-ao---- <11.72g
  lv_swap VolGroup00 -wi-ao----   4.00g
[root@build03 ~]# vgs
  VG         #PV #LV #SN Attr   VSize   VFree
  VolGroup00   1   3   0 wz--n- <99.51g 5.66g
[root@build03 ~]# pvs
  PV         VG         Fmt  Attr PSize   PFree
  /dev/sda2  VolGroup00 lvm2 a--  <99.51g 5.66g

Do not forget to modify /etc/fstab to remove the mapping, or the next boot may complain about the missing disk or even fail outright. Make sure to follow your distro guide for this, as sometimes systems like anaconda can undo your edits silently. While this can all be done live, I do recommend a reboot at the end, so that if something went wrong that affects the boot cycle, it is found immediately.
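
If your distro lets you edit /etc/fstab directly, the cleanup can be scripted. This is a minimal sketch under that assumption; remove_fstab_entry is a hypothetical helper name, and the optional second argument (the file to edit) exists only so it can be tried on a scratch copy first. It leaves a .bak copy next to the file.

```shell
# Sketch: drop the fstab line for a mount point, keeping a backup copy.
# remove_fstab_entry is a hypothetical helper. The 2nd argument defaults to
# /etc/fstab and is there so the function can be tried on a scratch copy.
remove_fstab_entry() {
  mount_point=$1
  fstab=${2:-/etc/fstab}
  cp -p "${fstab}" "${fstab}.bak" || return 1
  # Keep comments and every line whose second field is a different mount point.
  awk -v mp="${mount_point}" '$1 ~ /^#/ || $2 != mp' "${fstab}.bak" > "${fstab}"
}

# Usage: remove_fstab_entry /mnt/newdisk
```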

Summary

I covered common tasks – adding, extending, and removing partitions on the same system – but there’s a lot more you can do. As mentioned, parted replaces fdisk, though we glossed over that in favor of using entire disks. At least you know that any documentation using fdisk is pretty aged. I also briefly mentioned vgexport/vgimport. If you plan to detach disks in LVM – even a single disk – and re-attach them to another system, you want to export the volume group on the old system and import it on the new system. This will help ensure that any mismatch in device naming by the OS – say sdc and sde on the old system and sdc and sdd on the new system – does not result in any data loss.

I hope this article is a good reference for fundamental filesystem tasks for novice and expert Linux administrators alike. Please let me know on Twitter or in the comments of anything else you would like to see and I’ll keep this updated. Thanks!

Linux OS Patching with Puppet Tasks

One of the biggest gaps in most IT security policies is a very basic feature: patching. Specific numbers vary, but most surveys show a majority of hacks are due to unpatched vulnerabilities. Sadly, in 2018, automatic patching on servers is still out of the grasp of many, especially those running older OSes.

While there are a number of solutions out there from OS vendors (WSUS for Microsoft, Satellite for RHEL, etc.), I manage a number of OSes and the one commonality is that they are all managed by Puppet. A single solution with central reporting of success and failure sounds like a plan. I took a look at Puppet solutions and found a module called os_patching by Tony Green. I really like this module and what it has to offer, even though it doesn’t address all my concerns at this time. It shows a lot of promise and I suspect I will be working with Tony on some features I’d like to see in the future.

Currently, os_patching only supports Red Hat/Debian-based Linux distributions. Support is planned for Windows, and I know someone is looking at contributing to provide SuSE support. The module will collect information on patching that can be used for reporting, and patching is performed through a Task, either at the CLI or using the PE console’s Task pane.

Setup

Configuring your system to use the module is pretty easy. Add the module to your Puppetfile / .fixtures.yml, add a feature flag to your profile, and include os_patching behind the feature flag. Implement your tests and you’re good to go. Your only real decision is whether you default the feature flag to enabled or disabled. In my home network, I will enable it, but a production environment may want to disable it by default and enable it as an override through hiera. Because the fact collects data from the node, it will add a few seconds to each agent’s runtime, so be sure to include that in your calculation.

Adding the module is pretty simple. Here are the Puppetfile / .fixtures.yml diffs:

# Puppetfile
mod 'albatrossflavour/os_patching', '0.3.5'

# .fixtures.yml
fixtures:
  forge_modules:
    os_patching:
      repo: "albatrossflavour/os_patching"
      ref: "0.3.5"

Next, we need an update to our tests. I will be adding this to my profile::base, so I modify that spec file. Add a test for the default feature flag setting, and one for the non-default setting. Flip the to and not_to if you default the feature flag to disabled. If you run the tests now, you’ll get a failure, which is expected since there is no supporting code in the class yet. (There is more to the test; I have only included the framework plus the new tests.)

require 'spec_helper'
describe 'profile::base', :type => :class do
  on_supported_os.each do |os, facts|
    context "on #{os}" do
      let(:facts) { facts }

      context 'with defaults for all parameters' do
        # Feature flag defaults to enabled
        it { is_expected.to contain_class('os_patching') }
      end

      context 'with manage_os_patching disabled' do
        let(:params) do
          { manage_os_patching: false }
        end

        # Disabled feature flags
        it { is_expected.not_to contain_class('os_patching') }
      end
    end
  end
end

Finally, add the feature flag and feature to profile::base (the additions are the $manage_os_patching parameter and the matching if block):

class profile::base (
  Hash    $sudo_confs = {},
  Boolean $manage_puppet_agent = true,
  Boolean $manage_firewall = true,
  Boolean $manage_syslog = true,
  Boolean $manage_os_patching = true,
) {
  if $manage_firewall {
    include profile::linuxfw
  }

  if $manage_puppet_agent {
    include puppet_agent
  }
  if $manage_syslog {
    include rsyslog::client
  }
  if $manage_os_patching {
    include os_patching
  }
  ...
}

Your tests will pass now. That’s all it takes! For any nodes where it is enabled, you will see a new fact and some scripts pushed down on the next run:

[rnelson0@build03 controlrepo:production]$ sudo puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Notice: /File[/opt/puppetlabs/puppet/cache/lib/facter/os_patching.rb]/ensure: defined content as '{md5}af52580c4d1fb188061e0c51593cf80f'
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for build03.nelson.va
Info: Applying configuration version '1535052836'
Notice: /Stage[main]/Os_patching/File[/etc/os_patching]/ensure: created
Info: /Stage[main]/Os_patching/File[/etc/os_patching]: Scheduling refresh of Exec[/usr/local/bin/os_patching_fact_generation.sh]
Notice: /Stage[main]/Os_patching/File[/usr/local/bin/os_patching_fact_generation.sh]/ensure: defined content as '{md5}af4ff2dd24111a4ff532504c806c0dde'
Info: /Stage[main]/Os_patching/File[/usr/local/bin/os_patching_fact_generation.sh]: Scheduling refresh of Exec[/usr/local/bin/os_patching_fact_generation.sh]
Notice: /Stage[main]/Os_patching/Exec[/usr/local/bin/os_patching_fact_generation.sh]: Triggered 'refresh' from 2 events
Notice: /Stage[main]/Os_patching/Cron[Cache patching data]/ensure: created
Notice: /Stage[main]/Os_patching/Cron[Cache patching data at reboot]/ensure: created
Notice: Applied catalog in 54.18 seconds

You can now examine a new fact, os_patching, which shows tons of information including the pending package updates, the number of packages, which ones are security patches, whether the node is blocked (explained in a bit), and whether a reboot is required:

[rnelson0@build03 controlrepo:production]$ sudo facter -p os_patching
{
  package_updates => [
    "acl.x86_64",
    "audit.x86_64",
    "audit-libs.x86_64",
    "audit-libs-python.x86_64",
    "augeas-devel.x86_64",
    "augeas-libs.x86_64",
    ...
  ],
  package_update_count => 300,
  security_package_updates => [
    "epel-release.noarch",
    "kexec-tools.x86_64",
    "libmspack.x86_64"
  ],
  security_package_update_count => 3,
  blocked => false,
  blocked_reasons => [],
  blackouts => {},
  pinned_packages => [],
  last_run => {},
  patch_window => "",
  reboots => {
    reboot_required => "unknown"
  }
}
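
If you want to act on the fact from a script rather than read it by eye, the hash is easy to condense. Here is a minimal Ruby sketch, assuming the fact has already been loaded into a hash with the keys shown above (for instance by parsing JSON output from facter). The patch_summary helper is a hypothetical name of my own, not something the module ships:

```ruby
# Hypothetical helper (not part of os_patching): condenses the fact hash
# into a one-line status string. Missing keys fall back to safe defaults.
def patch_summary(fact)
  total    = fact.fetch('package_update_count', 0)
  security = fact.fetch('security_package_update_count', 0)
  reboot   = fact.fetch('reboots', {}).fetch('reboot_required', 'unknown')
  state    = fact.fetch('blocked', false) ? 'blocked' : 'not blocked'

  "#{total} updates pending (#{security} security), #{state}, reboot required: #{reboot}"
end

# Sample data trimmed from the facter output above.
fact = {
  'package_update_count'          => 300,
  'security_package_update_count' => 3,
  'blocked'                       => false,
  'reboots'                       => { 'reboot_required' => 'unknown' },
}

puts patch_summary(fact)
```

A one-line summary like this is handy in a login banner or a quick report, while the full fact stays available for anything more detailed.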

Additional Configuration

There are a number of other settings you can configure if you’d like.

  • patch_window: a string descriptor used to “tag” a group of machines, e.g. Week3 or Group2
  • blackout_windows: a hash of datetime start/end dates during which updates are blocked
  • security_only: boolean, when enabled only the security_package_updates packages and dependencies are updated
  • reboot_override: boolean, overrides the task’s reboot flag (default: false)
  • dpkg_options/yum_options: a string of additional flags/options to dpkg or yum, respectively

You can set these in hiera. For instance, my global config has some blackout windows for the next few years:

os_patching::blackout_windows:
  'End of year 2018 change freeze':
    'start': '2018-12-15T00:00:00+1000'
    'end':   '2019-01-05T23:59:59+1000'
  'End of year 2019 change freeze':
    'start': '2019-12-15T00:00:00+1000'
    'end':   '2020-01-05T23:59:59+1000'
  'End of year 2020 change freeze':
    'start': '2020-12-15T00:00:00+1000'
    'end':   '2021-01-05T23:59:59+1000'
  'End of year 2021 change freeze':
    'start': '2021-12-15T00:00:00+1000'
    'end':   '2022-01-05T23:59:59+1000'
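
To make the semantics concrete, here is a rough Ruby sketch of how a blackout_windows hash like the one above could be evaluated against a point in time. This is only an illustration under my own assumptions, not the module’s actual implementation; active_blackouts and BLACKOUT_FORMAT are hypothetical names:

```ruby
require 'time'

# Timestamp layout used in the hiera data above, e.g. 2018-12-15T00:00:00+1000
BLACKOUT_FORMAT = '%Y-%m-%dT%H:%M:%S%z'

# Hypothetical helper (not part of os_patching): returns the names of all
# blackout windows that contain the given time.
def active_blackouts(windows, now)
  matching = windows.select do |_name, window|
    start_at = Time.strptime(window['start'], BLACKOUT_FORMAT)
    end_at   = Time.strptime(window['end'], BLACKOUT_FORMAT)
    now.between?(start_at, end_at)
  end
  matching.keys
end

windows = {
  'End of year 2018 change freeze' => {
    'start' => '2018-12-15T00:00:00+1000',
    'end'   => '2019-01-05T23:59:59+1000',
  },
  'End of year 2019 change freeze' => {
    'start' => '2019-12-15T00:00:00+1000',
    'end'   => '2020-01-05T23:59:59+1000',
  },
}

christmas = Time.strptime('2018-12-25T09:00:00+1000', BLACKOUT_FORMAT)
puts active_blackouts(windows, christmas).inspect
```

Because each timestamp carries its own UTC offset, the comparison is made on absolute time and does not depend on the timezone of the machine running the check.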

Patching Tasks

Once the module is installed and all of your agents have picked up the new config, they will start reporting their patch status. You can query nodes with outstanding patches using PQL. A search like inventory[certname] {facts.os_patching.package_update_count > 0 and facts.clientcert !~ 'puppet'} finds all of your agents that have outstanding patches, except puppet itself (kernel patches require reboots, and puppet will have a hard time talking to itself across a reboot). You can also filter on patch_window by appending and facts.os_patching.patch_window = "Week3" or similar. You can then provide that query to the command line task:

puppet task run os_patching::patch_server --query="inventory[certname] {facts.os_patching.package_update_count > 0 and facts.clientcert !~ 'puppet'}"
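
If you find yourself retyping the query with different filters, it can help to generate it. Here is a minimal Ruby sketch; patching_query and its parameters are hypothetical names of my own, not part of the module. It wraps the patch_window value in single quotes so the result can sit inside the double-quoted --query argument, on the assumption that PQL accepts either quote style:

```ruby
# Hypothetical helper: builds the PQL query used above, optionally
# narrowed to a single patch_window.
def patching_query(patch_window: nil, exclude_pattern: 'puppet')
  clauses = [
    'facts.os_patching.package_update_count > 0',
    "facts.clientcert !~ '#{exclude_pattern}'",
  ]
  # Only add the patch_window filter when one was requested
  clauses << "facts.os_patching.patch_window = '#{patch_window}'" if patch_window

  "inventory[certname] {#{clauses.join(' and ')}}"
end

puts patching_query
puts patching_query(patch_window: 'Week3')
```

The first call reproduces the query above as-is; the second adds the Week3 filter, ready to paste into the task command or the Console.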

Or use the Console’s Task view to run the task against the PQL selection:

Add any other parameters you want in the dialog/CLI args, like setting reboot to true, then run the task. An individual job will be created for each node, all run in parallel. If you are selecting too many nodes for simultaneous runs, use additional filters, like the aforementioned patch_window or other facts (EL6 vs EL7, Debian vs Red Hat, etc.), to narrow the node selection. [I blew up my home lab, which couldn’t handle the CPU/IO load, when I ran it against all systems the first time. Whooops!] When the job is complete, you will get your status back for each node as a hash of status elements and the corresponding values, including return (success or failure), reboot, packages_updated, etc. You can extract the logs from the Console or pipe CLI logs directly to jq (json query) to analyze as necessary.
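
If jq isn’t your thing, the same kind of analysis is only a few lines of Ruby. The sketch below is built on a loud assumption: that you have already reshaped the results into a JSON object keyed by certname, with each value holding the status elements mentioned above. The real task output layout may well differ, and the node names, sample values, and failed_nodes helper are all made up for illustration:

```ruby
require 'json'

# ASSUMPTION: results are keyed by certname and each value contains a
# 'return' element. Adjust to match the JSON you actually get back.
def failed_nodes(results)
  results.reject { |_node, status| status['return'].to_s.casecmp?('success') }.keys
end

# Made-up sample data, for illustration only.
raw = <<~JSON
  {
    "web01.example.com": { "return": "Success", "packages_updated": ["acl.x86_64"] },
    "db01.example.com":  { "return": "Error",   "packages_updated": [] }
  }
JSON

results = JSON.parse(raw)
puts "Needs attention: #{failed_nodes(results).join(', ')}"
```

Treating anything that isn’t an explicit success as a failure (including a missing return element) errs on the side of flagging a node for a second look.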

Summary

Patching for many of us requires additional automation and reporting. The relatively new puppet module os_patching provides helpful auditing and compliance information alongside orchestration tasks for patching. Applying a little Puppet Query Language allows you to update the appropriate agents on your schedule, or to pull the compliance information for any reporting needs, always in the same format regardless of the (supported) OS. Currently, this is restricted to Red Hat/Debian-based Linux distributions, but there are plans to expand support to other OSes soon. Many thanks to Tony Green for his efforts in creating this module!

Powershell in a Post-TLS1.1 World

I was trying to install PowerCLI on a new server in a new environment today and I encountered all sorts of error messages when PowerShell tried to install the required NuGet provider:

PS C:\Windows\system32> Find-Module -Name VMware.PowerCLI
WARNING: Unable to download from URI 'https://go.microsoft.com/fwlink/?LinkID=627338&clcid=0x409' to ''.
WARNING: Unable to download the list of available providers. Check your internet connection.
PackageManagement\Install-PackageProvider : No match was found for the specified search criteria for the provider 'NuGet'. The package provider 
requires 'PackageManagement' and 'Provider' tags. Please check if the specified package has the tags.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7405 char:21
+ ... $null = PackageManagement\Install-PackageProvider -Name $script:N ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (Microsoft.Power...PackageProvider:InstallPackageProvider) [Install-PackageProvider], Exception
+ FullyQualifiedErrorId : NoMatchFoundForProvider,Microsoft.PowerShell.PackageManagement.Cmdlets.InstallPackageProvider

PackageManagement\Import-PackageProvider : No match was found for the specified search criteria and provider name 'NuGet'. Try 
'Get-PackageProvider -ListAvailable' to see if the provider exists on the system.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7411 char:21
+ ... $null = PackageManagement\Import-PackageProvider -Name $script:Nu ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (NuGet:String) [Import-PackageProvider], Exception
+ FullyQualifiedErrorId : NoMatchFoundForCriteria,Microsoft.PowerShell.PackageManagement.Cmdlets.ImportPackageProvider

WARNING: Unable to download from URI 'https://go.microsoft.com/fwlink/?LinkID=627338&clcid=0x409' to ''.
WARNING: Unable to download the list of available providers. Check your internet connection.
PackageManagement\Get-PackageProvider : Unable to find package provider 'NuGet'. It may not be imported yet. Try 'Get-PackageProvider 
-ListAvailable'.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.0.0.1\PSModule.psm1:7415 char:30
+ ... tProvider = PackageManagement\Get-PackageProvider -Name $script:NuGet ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (Microsoft.Power...PackageProvider:GetPackageProvider) [Get-PackageProvider], Exception
+ FullyQualifiedErrorId : UnknownProviderFromActivatedList,Microsoft.PowerShell.PackageManagement.Cmdlets.GetPackageProvider

Find-Module : NuGet provider is required to interact with NuGet-based repositories. Please ensure that '2.8.5.201' or newer version of NuGet 
provider is installed.
At line:1 char:1
+ Find-Module -Name VMware.PowerCLI
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (:) [Find-Module], InvalidOperationException
+ FullyQualifiedErrorId : CouldNotInstallNuGetProvider,Find-Module

I made it very angry, and I didn’t know why! After some searching, I stumbled on a solution on the Microsoft Community site. The issue is that PowerShell 5.1 defaults to only enabling SSL3 and TLS 1.0 for secure HTTP connections. You have probably noticed a lot of recent warnings on various websites about services removing support for TLS 1.0 and 1.1, and SSL3 has been disabled by many for years. Microsoft is no slacker here, and go.microsoft.com has dropped support for SSL3 and TLS 1.0 (probably TLS 1.1, too, but I didn’t check). Thus the Provider list at the URL cannot be accessed and the NuGet install fails.

PS C:\ProgramData\Documents> [Net.ServicePointManager]::SecurityProtocol
Ssl3, Tls

You can fix this by specifying Tls12 as the SecurityProtocol, but it only persists in this session, for this user. Thankfully, PowerShell has a well documented series of profile loads, so you can make the change once for all users on the server. You can choose whichever level works best for you. I chose $PsHome\Profile.ps1 which affects All Users, All Hosts. If you choose a global file like that, launch a PowerShell session as administrator (if you weren’t aware, there’s a Ctrl-modifier to avoid right-clicking!) so that you have the rights to edit the target file. If not, just substitute the file below with your choice.

This snippet will check for the existence of the file and create it if needed, then populate it with our one line change and comment telling us why. Finally, it opens the file so you can inspect it and adjust if you need to. Note that running it again will append the same lines, which isn’t harmful but may result in a little confusion for the next person to peek at it. Hello, future self!

$ProfileFile = "${PsHome}\Profile.ps1"

if (! (Test-Path $ProfileFile)) {
    New-Item -Path $ProfileFile -Type file -Force
}
''                                                                                | Out-File -FilePath $ProfileFile -Encoding ascii -Append
'# It is 2018, SSL3 and TLS 1.0 are no good anymore'                              | Out-File -FilePath $ProfileFile -Encoding ascii -Append
'[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12' | Out-File -FilePath $ProfileFile -Encoding ascii -Append

notepad $ProfileFile

If you enter [Net.ServicePointManager]::SecurityProtocol in the current window, you’ll get the same Ssl3, Tls result you saw before, because the profile is only loaded at startup. Open a new PowerShell instance on the server – as any user, even – and run it again. You should see the new setting:

PS C:\windows\system32> [Net.ServicePointManager]::SecurityProtocol
Tls12

Now you are ready to use PowerShell to connect to modern web servers, whether it’s to install NuGet, use Invoke-WebRequest, or any other HTTPS connection. Enjoy!

vRealize Orchestrator Workflows for Puppet Enterprise

Over the past three years, my Puppet for vSphere Admins series has meandered through a number of topics, mostly involved on the Puppet side and somewhat light on the vSphere side. That changed a bit with my article Make the Puppet vRealize Automation plugin work with vRealize Orchestrator, describing how to use the plugin’s built-in workflows to perform some actions on your VMs. However, you had to invoke the workflows one by one, and they only worked on existing VMs. That is not good enough for automation! Today, we will start to look at how to integrate the Puppet Enterprise plugin into other workflows to provide end-to-end lifecycle management for your VMs.

What is the lifecycle of a VM? This can vary quite a bit, so the lifecycle we will work with today is made to be generic enough for everyone to use, but flexible enough that everyone can expand on it. It consists of:

  • Provisioning
    • Updating ancillary systems prior to VM creation (IPAM, DNS, etc)
    • Deploying a Virtual Machine
    • Installing Puppet Enterprise on the VM
    • Using Puppet Enterprise to provision services on and configure the VM
    • Adding the new VM to a vCenter tag-based backup system
  • Decommission
    • Delete the VM (removes from backups)
    • Purge the record from PE
    • Update ancillary systems after VM removal (IPAM, DNS, etc)

Continue reading

Automating Puppet tests with a Jenkins Job, version 1.1

Today, let’s build on version 1.0 of our Jenkins job. We are running builds against every commit, but when someone opens a pull request, they don’t get automated builds or feedback. If the PR submitter even knows about Jenkins, and has network access and a login, they can look at it to find out how the tests went, but most people aren’t going to have that visibility (especially if your Jenkins server is private, as in this example setup). We need to make sure Jenkins is aware of the pull request and that it updates the PR with the status. Our end goal is for each PR to start a Jenkins build and update the PR with a successful check when done:

To get there, we will install and configure a new plugin and configure our job to use the plugin.

Continue reading

Automating Puppet tests with a Jenkins Job, version 1.0

As I’ve worked through setting up Jenkins and Puppet (and remembering my password!), I created a job to automate rspec tests on my puppet controlrepo. I am sure I will go through many iterations of this as I learn more, so we’ll just call this version 1.0. The goal is that when I push a branch to my controlrepo on GitHub, Jenkins automagically runs the tests. Today, we will ensure that Jenkins is notified of activity on a GitHub repo, that it spins up a clean test environment without any leftover files that may inadvertently assist, and that it runs the tests. What it will NOT do is notify anyone – it won’t work off a Pull Request and provide feedback like Travis CI does, for instance. Hopefully, I will figure that out soon.

The example below is using GitHub. You can certainly make this work with BitBucket, GitLab, Mercurial, and tons of other source control systems and platforms, but you might need some additional Jenkins Plugins. It should be pretty apparent where to change Git/GitHub to the system/platform you chose.

Creating A Job

From the main view of your Jenkins instance, click New Item. Call it whatever you want, choose Freestyle project as the type, and click OK. The next page is going to be where we set up all the parameters for the job. There are tabs across the top AND you can scroll down; you’ll see the same selection items either way. Going from the top to the bottom, the settings that we want:

Continue reading

Jenkins Tricks – Password Recovery and Job Exports

I’m finally getting back to Jenkins, which I started waaaay back in November (here and here). Unfortunately, I kind of forgot my password. Well, that’s embarrassing! I also want to redo the manifest using maestrodev/rvm, which means starting over, so I need to back things up. The manual for Jenkins and the results on Google can be overwhelming sometimes, so I thought I’d share what I learned to hopefully save someone else the trouble.

Password Recovery

I found a few ways to recover your password. One suggestion is to disable all security, delete your user, re-enable security and allow signups, and then recreate the same user and things should just “work”. Part of the reason you have to do this is that once you disable security, you can’t change the password for your user; only the user can. That’s … frustrating.

Disable security by editing $JENKINS_HOME/config.xml (/var/lib/jenkins/config.xml on my instance). I was able to get away with disabling it by changing <useSecurity>true</useSecurity> to false, though the article suggests removing two other lines. Restart the service with systemctl restart jenkins or equivalent and now you’re able to get in and recreate some users.
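
If you’d rather script the edit than open the file by hand, the change is a single substitution. Here is a small Ruby sketch; disable_security is a hypothetical helper of my own and the sample XML is a trimmed stand-in for a real config.xml, so treat it as an illustration rather than a tested recovery tool:

```ruby
# Hypothetical helper: flips the useSecurity flag in the text of config.xml.
# Text that does not contain the enabled flag is returned unchanged.
def disable_security(config_xml)
  config_xml.sub('<useSecurity>true</useSecurity>',
                 '<useSecurity>false</useSecurity>')
end

# Trimmed, illustrative stand-in for the real file.
sample = <<~XML
  <hudson>
    <useSecurity>true</useSecurity>
  </hudson>
XML

puts disable_security(sample)

# To apply it to the real file (back it up first!), something like:
#   path = '/var/lib/jenkins/config.xml'
#   File.write(path, disable_security(File.read(path)))
```

Keeping the substitution in a function that works on a string means you can try it against a copy of the file before touching the live one.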

Continue reading