How to move from a linear disk to an LVM disk and join the two disks into an LVM-like RAID-0

I recently needed to add a disk to an existing Ubuntu installation to make the / filesystem bigger. In such a case there are two possibilities: move the whole system to a new, bigger disk (and e.g. dispose of the original one), or convert the disk to an LVM volume and add a second disk so that the volume can grow. The first case was the subject of a previous post, but this time I learned…

How to move from a linear disk to an LVM disk and join the two disks into an LVM-like RAID-0

The starting point is simple:

  • I have one 14 GB disk (/dev/vda) with a single main partition that is mounted in / (the disk has a GPT table and boots via UEFI, so it has a couple of extra partitions that we’ll keep as they are).
  • I have a brand new 80 GB disk (/dev/vdb).
  • I want to end up with one 94 GB volume built from the two disks.
root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part /
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk /mnt
vdc 252:32 0 4G 0 disk [SWAP]

The steps are the following:

  1. Create a boot partition in /dev/vdb (this is needed because GRUB cannot boot from LVM and needs an ext or VFAT partition).
  2. Format the boot partition and copy the contents of the current /boot folder into it.
  3. Create an LVM volume using the remaining space in /dev/vdb and initialize it with an ext4 filesystem.
  4. Copy the contents of the current / folder into the new volume.
  5. Update GRUB to boot from the new disk.
  6. Update the mount points for our system.
  7. Reboot (and check).
  8. Add the previous disk to the LVM volume.

Let’s start…

Separate the /boot partition

When installing a system on LVM, you need a /boot partition in a plain format (e.g. ext2 or ext4), because GRUB cannot boot directly from LVM. GRUB reads the contents of that partition and loads the modules needed to access the LVM volumes.

So we need to create the /boot partition. In our case we are using the ext2 format, because it has no journal (we do not need one for the contents of /boot) and it is faster. We are using 1 GB for the /boot partition, but 512 MB would probably be enough:

root@somove:~# fdisk /dev/vdb

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (1-4, default 1):
First sector (2048-167772159, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-167772159, default 167772159): +1G

Created a new partition 1 of type 'Linux' and of size 1 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

root@somove:~# mkfs.ext2 /dev/vdb1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 24618637-d2d4-45fe-bf83-d69d37f769d0
Superblock backups stored on blocks:
32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

Now we’ll make a mount point for this partition, mount the partition and copy the contents of the current /boot folder to that partition:

root@somove:~# mkdir /mnt/boot
root@somove:~# mount /dev/vdb1 /mnt/boot/
root@somove:~# cp -ax /boot/* /mnt/boot/
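
If you want to double-check the copy, a recursive diff should report no differences (apart from the lost+found folder that mkfs created in the new filesystem); this is just a sanity check, not strictly needed:

root@somove:~# diff -r /boot /mnt/boot/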

Create an LVM volume in the extra space of /dev/vdb

First we will create a new partition for our LVM system, using all of the remaining free space:

root@somove:~# fdisk /dev/vdb

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): n
Partition type
p primary (1 primary, 0 extended, 3 free)
e extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (2-4, default 2):
First sector (2099200-167772159, default 2099200):
Last sector, +sectors or +size{K,M,G,T,P} (2099200-167772159, default 167772159):

Created a new partition 2 of type 'Linux' and of size 79 GiB.

Command (m for help): w
The partition table has been altered.
Syncing disks.

Now we will create a Physical Volume, a Volume Group and the Logical Volume for our root filesystem, using the new partition:

root@somove:~# pvcreate /dev/vdb2
Physical volume "/dev/vdb2" successfully created.
root@somove:~# vgcreate rootvg /dev/vdb2
Volume group "rootvg" successfully created
root@somove:~# lvcreate -l +100%free -n rootfs rootvg
Logical volume "rootfs" created.

If you want to learn about LVM to better understand what we are doing, you can read my previous post.

Now we initialize the filesystem of the new /dev/rootvg/rootfs volume with ext4, and then we copy the existing filesystem except for the special folders and the /boot folder (which we have moved to its own partition):

root@somove:~# mkfs.ext4 /dev/rootvg/rootfs
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 20708352 4k blocks and 5177344 inodes
Filesystem UUID: 47b4b698-4b63-4933-98d9-f8904ad36b2e
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000

Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done

root@somove:~# mkdir /mnt/rootfs
root@somove:~# mount /dev/rootvg/rootfs /mnt/rootfs/
root@somove:~# rsync -aHAXx --delete --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/boot/*,/lost+found} / /mnt/rootfs/

Update the system to boot from the new /boot partition and the LVM volume

At this point we have our /boot partition (/dev/vdb1) and the / filesystem (/dev/rootvg/rootfs). Now we need to prepare GRUB to boot using these new resources. And here comes the magic…

root@somove:~# mount --bind /dev /mnt/rootfs/dev/
root@somove:~# mount --bind /sys /mnt/rootfs/sys/
root@somove:~# mount -t proc /proc /mnt/rootfs/proc/
root@somove:~# chroot /mnt/rootfs/

We are binding the special mount points /dev and /sys to the same folders in the new filesystem mounted in /mnt/rootfs. We are also mounting /proc, which holds the information about the running processes. You can find some more information about why this is needed in my previous post on chroot and containers.

Intuitively, we are somehow “in the new filesystem” and now we can update things as if we had already booted into it.

At this point we need to update the mount points in /etc/fstab so that the proper devices are mounted once the system boots. So we get the UUIDs of our partitions:

root@somove:/# blkid
/dev/vda1: LABEL="cloudimg-rootfs" UUID="135ecb53-0b91-4a6d-8068-899705b8e046" TYPE="ext4" PARTUUID="b27490c5-04b3-4475-a92b-53807f0e1431"
/dev/vda14: PARTUUID="14ad2c62-0a5e-4026-a37f-0e958da56fd1"
/dev/vda15: LABEL="UEFI" UUID="BF99-DB4C" TYPE="vfat" PARTUUID="9c37d9c9-69de-4613-9966-609073fba1d3"
/dev/vdb1: UUID="24618637-d2d4-45fe-bf83-d69d37f769d0" TYPE="ext2"
/dev/vdb2: UUID="Uzt1px-ANds-tXYj-Xwyp-gLYj-SDU3-pRz3ed" TYPE="LVM2_member"
/dev/mapper/rootvg-rootfs: UUID="47b4b698-4b63-4933-98d9-f8904ad36b2e" TYPE="ext4"
/dev/vdc: UUID="3377ec47-a0c9-4544-b01b-7267ea48577d" TYPE="swap"

Now we update /etc/fstab to mount /dev/mapper/rootvg-rootfs as the / folder and partition /dev/vdb1 as /boot. In our example, the /etc/fstab file will look like this:

UUID="47b4b698-4b63-4933-98d9-f8904ad36b2e" / ext4 defaults 0 0
UUID="24618637-d2d4-45fe-bf83-d69d37f769d0" /boot ext2 defaults 0 0
LABEL=UEFI /boot/efi vfat defaults 0 0
UUID="3377ec47-a0c9-4544-b01b-7267ea48577d" none swap sw,comment=cloudconfig 0 0

We use the UUIDs to mount the / and /boot filesystems because the device names or locations may change, and that could break our system.
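
By the way, if you only need the UUID of a single device, blkid can print just that value; for example, for the new /boot partition:

root@somove:/# blkid -s UUID -o value /dev/vdb1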

And now we are ready to mount our new /boot partition, update GRUB, and install it on the /dev/vda disk (because we are keeping both disks):

root@somove:/# mount /boot
root@somove:/# update-grub
Generating grub configuration file ...
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found linux image: /boot/vmlinuz-4.15.0-43-generic
Found initrd image: /boot/initrd.img-4.15.0-43-generic
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found Ubuntu 18.04.1 LTS (18.04) on /dev/vda1
done
root@somove:/# grub-install /dev/vda
Installing for i386-pc platform.
Installation finished. No error reported.

Reboot and check

We are almost done; now we exit the chroot and reboot:

root@somove:/# exit
root@somove:~# reboot

And the result should look like this:

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk
├─vdb1 252:17 0 1G 0 part /boot
└─vdb2 252:18 0 79G 0 part
└─rootvg-rootfs 253:0 0 79G 0 lvm /
vdc 252:32 0 4G 0 disk [SWAP]

root@somove:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 0 2.0G 0% /dev
tmpfs 395M 676K 394M 1% /run
/dev/mapper/rootvg-rootfs 78G 993M 73G 2% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/vdb1 1008M 43M 915M 5% /boot
/dev/vda15 105M 3.6M 101M 4% /boot/efi
tmpfs 395M 0 395M 0% /run/user/1000

We have our / filesystem mounted from the new LVM Logical Volume /dev/rootvg/rootfs, the /boot partition from /dev/vdb1, and /boot/efi from the existing partition (just in case we need it).

Add the previous disk to the LVM volume

Now comes the easier part: integrating the original /dev/vda1 partition into the LVM volume.

Once we have double-checked that every file from the original / filesystem in /dev/vda1 has been copied, we can initialize the partition for use in LVM:

WARNING: This step wipes the content of /dev/vda1.

root@somove:~# pvcreate /dev/vda1
WARNING: ext4 signature detected on /dev/vda1 at offset 1080. Wipe it? [y/n]: y
Wiping ext4 signature on /dev/vda1.
Physical volume "/dev/vda1" successfully created.

Finally, we can integrate the new partition in our volume group and extend the logical volume to use the free space:

root@somove:~# vgextend rootvg /dev/vda1
Volume group "rootvg" successfully extended
root@somove:~# lvextend -l +100%free /dev/rootvg/rootfs
Size of logical volume rootvg/rootfs changed from <79.00 GiB (20223 extents) to 92.88 GiB (23778 extents).
Logical volume rootvg/rootfs successfully resized.
root@somove:~# resize2fs /dev/rootvg/rootfs
resize2fs 1.44.1 (24-Mar-2018)
Filesystem at /dev/rootvg/rootfs is mounted on /; on-line resizing required
old_desc_blocks = 10, new_desc_blocks = 12
The filesystem on /dev/rootvg/rootfs is now 24348672 (4k) blocks long.

And now we have the new 94 GB / filesystem, built from /dev/vda1 and /dev/vdb2:

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part
│ └─rootvg-rootfs 253:0 0 92.9G 0 lvm /
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk
├─vdb1 252:17 0 1G 0 part /boot
└─vdb2 252:18 0 79G 0 part
└─rootvg-rootfs 253:0 0 92.9G 0 lvm /
vdc 252:32 0 4G 0 disk [SWAP]
root@somove:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 0 2.0G 0% /dev
tmpfs 395M 676K 394M 1% /run
/dev/mapper/rootvg-rootfs 91G 997M 86G 2% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/vdb1 1008M 43M 915M 5% /boot
/dev/vda15 105M 3.6M 101M 4% /boot/efi
tmpfs 395M 0 395M 0% /run/user/1000

(optional) Moving the /boot partition to /dev/vda

If we wanted to have the /boot partition on /dev/vda, the procedure would be a bit different:

  1. Instead of creating the LVM volume on /dev/vdb, create a single ext4 partition /dev/vdb1 that holds the whole filesystem (no separation of /boot and /).
  2. Once /dev/vdb1 is created, copy the filesystem from /dev/vda1 to /dev/vdb1 and prepare to boot from it (chroot, adjust mount points, update-grub, grub-install…).
  3. Boot from the new partition and wipe the original /dev/vda1 partition.
  4. Create a new /dev/vda1 partition for /boot, initialize it with ext2 and copy the contents of /boot, following the instructions in this post.
  5. Create a partition /dev/vda2, create the LVM volume on it, initialize it and copy the contents of /dev/vdb1 except for /boot.
  6. Prepare to boot from /dev/vda (chroot, adjust mount points, mount /boot, update-grub, grub-install…).
  7. Boot from the new /boot + LVM / layout and decide whether you want to add /dev/vdb to the LVM volume or not.

Using this procedure, you can go from a linear disk to LVM with a single disk, and then decide whether to grow the LVM volume or not. You could even create an LVM RAID (1, 5, …) with the new or other disks.
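
As a rough sketch of this alternative procedure (device names follow the example above, and the chroot, /etc/fstab, update-grub and grub-install steps are exactly the ones described earlier in this post), the command sequence could look like this:

root@somove:~# fdisk /dev/vdb        # create a single partition /dev/vdb1
root@somove:~# mkfs.ext4 /dev/vdb1
root@somove:~# mkdir -p /mnt/rootfs
root@somove:~# mount /dev/vdb1 /mnt/rootfs
root@somove:~# rsync -aHAXx --delete --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/lost+found} / /mnt/rootfs/
(chroot into /mnt/rootfs, adjust /etc/fstab, update-grub, grub-install /dev/vdb, reboot)
root@somove:~# fdisk /dev/vda        # recreate /dev/vda1 (small, for /boot) and /dev/vda2 (the rest, for LVM)
root@somove:~# mkfs.ext2 /dev/vda1
root@somove:~# pvcreate /dev/vda2
root@somove:~# vgcreate rootvg /dev/vda2
root@somove:~# lvcreate -l +100%free -n rootfs rootvg
root@somove:~# mkfs.ext4 /dev/rootvg/rootfs
(copy /boot to /dev/vda1 and / to the LVM volume, chroot, mount /boot, update-grub, grub-install /dev/vda, reboot)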

How to run Docker containers using common Linux tools (without Docker)

Containers are a current virtualization and application-delivery trend, and I am working with them. If you search Google for them, you can find tons of how-tos, information, tech guides, etc. As with anything in IT, there are several flavors of containers. In this case the players are Docker, LXC/LXD (on which Docker was once based), CoreOS rkt, OpenVZ, etc. If you have a look at Google Trends, you’ll notice that Docker is undoubtedly the winner of the hype and “the others” try to fight against it.

(Google Trends comparison of the different container technologies)

But as there are several alternatives, I wanted to learn about the underlying technology, and it turns out that all of them are based on the same set of kernel features: mainly Linux namespaces and cgroups. The most important differences are the utilities each one provides to automate the procedures (the repository of images, container management and the other parts of the ecosystem of a particular product).

Disclaimer: This is not a research blog, so I am not going in depth on when namespaces were introduced in the kernel, which namespaces exist, how they work, what copy-on-write is, what cgroups are, etc. The purpose of this post is simply “fun with containers” 🙂

In the end, the “hard work” (i.e. the execution of a containerized environment) is done by the Linux kernel. And so this time I learned…

How to run Docker containers using common Linux tools (without Docker).

We start from a scenario in which we have one container running in Docker, and we want to run it using standard Linux tools. We will mainly act as a regular user with permission to run Docker containers (e.g. in the case of Ubuntu, my user calfonso is in the ‘docker’ group), to show that we can run containers in user space.


TL;DR

To run a contained environment with its own namespaces using standard Linux tools, you can follow this procedure:

calfonso:handcontainer$ docker export blissful_goldstine -o dockercontainer.tar
calfonso:handcontainer$ mkdir rootfs
calfonso:handcontainer$ tar xf dockercontainer.tar --ignore-command-error -C rootfs/
calfonso:handcontainer$ unshare --mount --uts --ipc --net --pid --fork --user --map-root-user chroot $PWD/rootfs ash
root:# mount -t proc none /proc
root:# mount -t sysfs none /sys
root:# mount -t tmpfs none /tmp

At this point you need to set up the network devices (from outside the container) and deal with the cgroups (if you need to).

First, we prepare a folder for our tests (handcontainer) and then dump the filesystem of the container:

calfonso:~$ mkdir handcontainer
calfonso:~$ cd handcontainer
calfonso:handcontainer$ docker export blissful_goldstine -o dockercontainer.tar

If we check the tar file produced, we’ll see that it contains the whole filesystem of the container.

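For instance, listing the first entries of the archive (the exact contents depend on the image) shows the typical root filesystem layout:

calfonso:handcontainer$ tar tf dockercontainer.tar | head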

Let’s extract it into a new folder (called rootfs):

calfonso:handcontainer$ mkdir rootfs
calfonso:handcontainer$ tar xf dockercontainer.tar --ignore-command-error -C rootfs/

This action will raise an error, because only the root user can use mknod, which is needed to recreate the device nodes in /dev; this is fine for us because we are not dealing with devices.


If we check the contents of rootfs, the filesystem is there, and we can chroot into it to verify that we can use it (more or less) as if it were the actual system.

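A quick way to try it (ash is the shell shipped in this particular image; as a regular user we need sudo to chroot) would be:

calfonso:handcontainer$ sudo chroot rootfs ash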

The chroot technique is well known and was enough in the early days, but it provides no isolation. This is exposed by the following commands:

/ # ip link
/ # mount -t proc proc /proc && ps -ef
/ # hostname

In these cases, we can manipulate the network of the host, interact with the processes of the host or manipulate the hostname.

This is because using chroot only changes the root filesystem for the current session, but it takes no other action.

Some words on namespaces

A big part of the “magic” of containers is namespaces (you can read more on this in this link). Namespaces give a process its own particular view of “the things” in several areas. The namespaces currently available in Linux are the following:

  • Mounts namespace: mount points.
  • PID namespace: process number.
  • IPC namespace: Inter Process Communication resources.
  • UTS namespace: hostname and domain name.
  • Network namespace: network resources.
  • User namespace: User and Group ID numbers.

Namespaces are handled by the Linux kernel, and any process is already in one namespace of each kind (i.e. the root namespace). So changing the namespaces of one particular process does not introduce additional complexity for the processes.

Creating particular namespaces for particular processes means that a process will have its own view of the resources in those namespaces. As an example, if a process is started with its own PID namespace, it will get PID 1 and its children will get the following PID numbers. Or if a process is started with its own NET namespace, it will have its own stack of network devices.

The parent namespace of a namespace is able to manipulate the nested namespaces… That is a dense sentence, but what it means is that the root namespace is always able to manipulate the resources in the nested namespaces. So the root user of the host has the whole picture of all the namespaces.
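
For instance, from the root namespace, the lsns tool (part of util-linux) lists the namespaces that exist on the host and the processes attached to them; running it as root shows the whole picture:

calfonso:~$ sudo lsns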

Using namespaces

Now that we know about namespaces, we want to use them 😉

We can think of a container as one process (e.g. a /bin/bash shell) that has its own root filesystem, its own network, its own hostname, its own PIDs and users, etc. And this can be achieved by creating all these namespaces and spawning the /bin/bash process inside them.

The Linux kernel provides the system calls clone, setns and unshare, which make it easy to manipulate the namespaces of processes. The common Linux distributions also provide the commands unshare and nsenter, which allow manipulating the namespaces of processes and applications from the command line.
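
As an illustration (the PID here is hypothetical), nsenter can attach a new shell to the namespaces of an already running process:

calfonso:~$ sudo nsenter --target 12345 --mount --uts --ipc --net --pid /bin/bash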

If we get back to the main host, we can use the command unshare to create a process with its own namespaces:

calfonso:handcontainer$ unshare --mount --uts --ipc --net --pid --fork --user --map-root-user /bin/bash

It seems that nothing has happened, except that we are now “root”, but if we start using commands that manipulate features of the host, we’ll see what actually happened.

If we echo the PID of the current process ($$) we can see that it is 1 (the main process), the current user has UID and GID 0 (i.e. root), we do not have any network devices, and we can manipulate the hostname…
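
These checks look roughly like this inside the unshared environment (the hostname value is just an example):

root:handcontainer# echo $$
root:handcontainer# id
root:handcontainer# ip link
root:handcontainer# hostname my-container && hostname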

If we check the processes from another terminal in the host, we’ll see that even though we appear as ‘root’ inside, outside the namespace our process is running under the credentials of our regular user.

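Something along these lines shows it (the exact PIDs will differ):

calfonso:~$ ps -ef | grep unshare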

This is the magic of the PID namespace, which makes the same process have different PID numbers depending on the namespace from which it is observed.

Back in our “unshared” environment, if we list the processes that are currently running, we still get the view of the processes of the host.

This is because of how Linux works: the process information is exposed through the /proc mount point and, in our environment, we still see the mount points of the host. But since we have our own mount namespace, we can mount our own /proc filesystem.

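Something like this (mirroring the TL;DR above) does the trick:

root:handcontainer# mount -t proc none /proc
root:handcontainer# ps -ef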

From outside the container, we can create a network device and move it into the namespace.

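A minimal way to do this from the host (as root; the PID of the unshared shell is hypothetical and the interface names are just examples) is to create a veth pair and move one end into the namespace of our process; once inside, the peer can be renamed to eth0 with ip link set veth-cont name eth0:

calfonso:~$ sudo ip link add veth-host type veth peer name veth-cont
calfonso:~$ sudo ip link set veth-cont netns 12345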

And if we get back to our “unshared” environment, ip link will show that we have a new network device.


The network setup is still incomplete, and we cannot reach anything yet (the peer of our eth0 is not connected to any network). A full setup falls out of the scope of this post, but the main idea is that you will need to connect the peer to some bridge, set an IP address for eth0 inside the unshared environment, set up NAT in the host, etc.
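
Just to give an idea, a minimal sketch that skips the bridge and simply NATs the veth pair (addresses, interface names and the outgoing host interface are illustrative) could look like this. On the host:

calfonso:~$ sudo ip addr add 10.200.1.1/24 dev veth-host
calfonso:~$ sudo ip link set veth-host up
calfonso:~$ sudo sysctl -w net.ipv4.ip_forward=1
calfonso:~$ sudo iptables -t nat -A POSTROUTING -s 10.200.1.0/24 -o ens3 -j MASQUERADE

And inside the unshared environment:

root:handcontainer# ip addr add 10.200.1.2/24 dev eth0
root:handcontainer# ip link set eth0 up
root:handcontainer# ip route add default via 10.200.1.1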

Obtaining the filesystem of the container

Now that we are in an “isolated” environment, we want to have the filesystem, utilities, etc. from the container that we started. And this can be done with our old friend “chroot” and some mounts:

root:handcontainer# chroot rootfs ash
root:# mount -t proc none /proc
root:# mount -t sysfs none /sys
root:# mount -t tmpfs none /tmp

After the chroot, the root filesystem changes and we can use all the mount points, commands, etc. of the container’s filesystem. So now we really have the feeling of being inside an isolated environment with an isolated filesystem.

Now we have finished setting up a “hand made container” from an existing Docker container.

Further work

Apart from the “contained environment”, Docker containers are also managed inside cgroups. Cgroups make it possible to account for and to limit the resources that processes are able to use (i.e. CPU, I/O, memory and devices), which is interesting for better controlling which resources the processes are allowed to use (and how).

It is possible to explore the cgroups under the path /sys/fs/cgroup. In that folder you will find the different cgroups that are managed in the system. Dealing with cgroups is a bit “obscure” (creating subfolders, adding PIDs to files, etc.), and will be left for an eventual future post.
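
Just as a tiny taste (this assumes the cgroup v1 layout used by Ubuntu 18.04; the cgroup name and the PID are hypothetical), limiting the memory of our contained shell would look roughly like this:

root:~# mkdir /sys/fs/cgroup/memory/handcontainer
root:~# echo $((256*1024*1024)) > /sys/fs/cgroup/memory/handcontainer/memory.limit_in_bytes
root:~# echo 12345 > /sys/fs/cgroup/memory/handcontainer/cgroup.procs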

Another feature that Docker offers is layered filesystems. A layered filesystem is used in Docker basically to share a common base filesystem and only track the modifications. So there is a set of common layers shared by different containers (which will not be modified), and each container has its own layer that makes its filesystem different from the others.

In our case, we used a simple flat filesystem for the container, which we used as the root filesystem for our contained environment. Dealing with layered filesystems will be the subject of a new post 😉
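
Just as a hint of what that will cover, an overlay filesystem (the mechanism behind Docker’s overlay2 storage driver) can be assembled by hand with something like this (the directories are illustrative; lower holds the read-only base and upper receives the modifications):

root:~# mkdir lower upper work merged
root:~# mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merged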

And now…

Well, in this post we tried to understand how containers work and saw that they are built on a relatively simple set of features offered by the kernel. But it takes quite a few steps to get a properly configured container (remember that we left the cgroups out of this post).

We did these steps just because we could… just to better understand containers.

My advice is to use the existing technologies (e.g. Docker) in order to work with well-built containers.

Further reading

As in other posts, I wrote this just to arrange my concepts following a very simple step-by-step procedure. But you can find a lot of resources about containers using your favourite search engine. The most useful resources that I found are: