How to use a Dell PS Equallogic as a backend for OpenStack Cinder

I have written several posts on installing OpenStack Rocky from scratch. They all have the tag #openstack. In the previous posts we…

  1. Installed OpenStack Rocky (part 1, part 2, and part 3).
  2. Installed the Horizon Dashboard and upgraded noVNC (install horizon).
  3. Installed Cinder and integrated it with Glance (in this post).

And now that I have a Dell PS Equallogic, I learned…

How to use a Dell PS Equallogic as a backend for OpenStack Cinder

In the post in which we installed Cinder, we used a single disk and LVM as the backend to store and serve the Cinder volumes. But in my lab we own a Dell PS Equallogic, which is a far better SAN than a plain Linux server, so I’d prefer to use it as the backend for Cinder.

In the last post we did “the hard work” of installing Cinder, so setting up the new backend is now easier. We’ll follow the official documentation of the plugin for the Dell EMC PS Series in this link.

Prior to replacing the storage backend, it is advisable to remove any volume that was stored in the old backend, and also the images that were volume-backed. Otherwise, Cinder will assume that the old volumes are stored in the new backend, and your installation will end up in a weird state. If you plan to keep both storage backends (e.g. because you still have running volumes), you can add the new storage backend and set it as the default.
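A quick way to do that cleanup from the controller is something like the following (just a sketch; double-check every ID before deleting anything):

# openstack volume list --all-projects
# openstack volume delete <volume-id>
# openstack image list
# openstack image delete <image-id>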

In this guide, I will assume that every volume and image has been removed from the OpenStack deployment. Moreover, I assume that the PS Equallogic is up and running.

The SAN is in a separate data network. The IP address of the SAN is 192.168.1.11, and every node and the controller have IP addresses in that range. In my particular case, the controller has the IP address 192.168.1.49.

Create a user in the SAN

We need a user for OpenStack to access the SAN. So I have created user “osuser” with password “SAN_PASS”.

osuser is restricted to use only some features

It is important to check the connectivity via ssh. We can check it from the front-end:

# ssh osuser@192.168.1.11
osuser@192.168.1.11's password:
Last login: Wed May 6 15:45:56 2020 from 192.168.1.49 on tty??


Welcome to Group Manager

Copyright 2001-2016 Dell Inc.

group1>

Add the new backend to Cinder

Now that we have the user, we can just add the new backend to Cinder. So we’ll add the following lines to /etc/cinder/cinder.conf

[dell]
volume_driver = cinder.volume.drivers.dell_emc.ps.PSSeriesISCSIDriver
san_ip = 192.168.1.11
san_login = osuser
san_password = SAN_PASS
eqlx_group_name = group1
eqlx_pool = default

# Optional settings
san_thin_provision = true
use_chap_auth = false
eqlx_cli_max_retries = 3
san_ssh_port = 22
ssh_conn_timeout = 15
ssh_min_pool_conn = 1
ssh_max_pool_conn = 5

# Enable the volume-backed image cache
image_volume_cache_enabled = True

These lines just configure the access to your SAN. Please adjust the parameters to your settings and preferences.

Pay attention to the variable “image_volume_cache_enabled”, which I have also set to True for this backend. This enables the creation of the images by means of volume cloning, instead of uploading the image to the volume each time (you can read more about this in the section on integrating Cinder and Glance, in the previous post). In the end, this mechanism speeds up the boot process of the VMs.
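If you are worried about this cache growing indefinitely, Cinder also allows bounding it per backend; these are optional settings that I am not using here, and the values below are just made-up examples:

image_volume_cache_max_size_gb = 200
image_volume_cache_max_count = 50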

Now, we have to update the value of enabled_backends in the [DEFAULT] section of the file /etc/cinder/cinder.conf. In my case, I have disabled the other backends (i.e. LVM):

enabled_backends = dell
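If you prefer to keep the LVM backend alive next to the new one (as mentioned at the beginning), the usual multi-backend sketch is something like the following; the volume_backend_name values and the type names are my own choices, so adapt them to your deployment:

enabled_backends = lvm,dell
default_volume_type = dell

Then add volume_backend_name = dell to the [dell] section and volume_backend_name = lvm to the [lvm] section, and create one volume type per backend:

# openstack volume type create --property volume_backend_name=dell dell
# openstack volume type create --property volume_backend_name=lvm lvm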

Finally, you have to restart the cinder services:

# service cinder-volume restart
# service cinder-scheduler restart

Testing the integration

If the integration went fine, you will find messages like the following in /var/log/cinder/cinder-volume.log:

2020-05-06 14:17:40.688 17855 INFO cinder.volume.manager [req-74aabdaa-5f88-4576-801e-cf923265d23e - - - - -] Starting volume driver PSSeriesISCSIDriver (1.4.6)
2020-05-06 14:17:40.931 17855 INFO paramiko.transport [-] Connected (version 1.99, client OpenSSH_5.0)
2020-05-06 14:17:42.096 17855 INFO paramiko.transport [-] Authentication (password) successful!
2020-05-06 14:17:42.099 17855 INFO cinder.volume.drivers.dell_emc.ps [req-74aabdaa-5f88-4576-801e-cf923265d23e - - - - -] PS-driver: executing "cli-settings confirmation off".
...

As osuser has its own restricted view of the SAN, when I log into the SAN via SSH I see no volumes yet:

osuser@192.168.1.11's password:
Last login: Wed May 6 16:02:01 2020 from 192.168.1.49 on tty??


Welcome to Group Manager

Copyright 2001-2016 Dell Inc.

group1> volume show
Name Size Snapshots Status Permission Connections T
--------------- ---------- --------- ------- ---------- ----------- -
group1>

Now I am creating a new volume:

# openstack volume create --size 2 testvol
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
(...)
| id | 7844286d-869d-49dc-9c91-d7af6c2123ab |
(...)

And now, if I get to the SAN CLI, I will see a new volume whose name coincides with the ID of the just created volume:

group1> volume show
Name Size Snapshots Status Permission Connections T
--------------- ---------- --------- ------- ---------- ----------- -
volume-7844286d 2GB 0 online read-write 0 Y
-869d-49dc-9c
91-d7af6c2123
ab

Verifying the usage as image cache for volume-backed instances

We are creating a new image, and it should be stored in the SAN, because we set Cinder as the default storage backend for Glance, in the previous post.

# openstack image create --public --container-format bare --disk-format qcow2 --file ./bionic-server-cloudimg-amd64.img "Ubuntu 18.04"
...
# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| 41caf4ad-2bbd-4311-9003-00d39d009a9f | image-c8f3d5c5-5d4a-47de-bac0-bfd3f673c713 | available | 1 | |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+

And the SAN’s CLI shows the newly created volume:

group1> volume show
Name Size Snapshots Status Permission Connections T
--------------- ---------- --------- ------- ---------- ----------- -
volume-41caf4ad 1GB 0 online read-write 0 Y
-2bbd-4311-90
03-00d39d009a
9f

At this point, we will boot a new volume-backed VM that makes use of that image.
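A roughly equivalent way to do this from the CLI would be something like the following (just a sketch: the flavor and network names are placeholders, and pre-creating the boot volume by hand is only one of several ways to do it):

# openstack volume create --image "Ubuntu 18.04" --size 4 eql1-boot
# openstack server create --flavor <flavor> --network <network> --volume eql1-boot eql1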


First, we will see that a new volume is being created, and the volume that corresponds to the image is connected to “None”.

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+----------+------+-----------------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+----------+------+-----------------------------------+
| dd4071bf-48e9-484c-80cd-89f52a4fa442 | | creating | 4 | |
| 41caf4ad-2bbd-4311-9003-00d39d009a9f | image-c8f3d5c5-5d4a-47de-bac0-bfd3f673c713 | in-use | 1 | Attached to None on glance_store |
+--------------------------------------+--------------------------------------------+----------+------+-----------------------------------+

This is because Cinder is grabbing the image to be able to prepare it and upload it to the storage backend as a special image to clone from. We can check that folder /var/lib/cinder/conversion/ contains a temporary file, which corresponds to the image.

root@menoscloud:~# ls -l /var/lib/cinder/conversion/
total 337728
-rw------- 1 cinder cinder 0 may 6 14:32 tmp3wcPzD
-rw------- 1 cinder cinder 345833472 may 6 14:31 tmpDsKUzxmenoscloud@dell

Once the image has been obtained, Cinder converts it to raw format (using qemu-img) directly into the volume that it has just created.

root@menoscloud:~# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
| dd4071bf-48e9-484c-80cd-89f52a4fa442 | | downloading | 3 | |
| 41caf4ad-2bbd-4311-9003-00d39d009a9f | image-c8f3d5c5-5d4a-47de-bac0-bfd3f673c713 | available | 1 | |
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
root@menoscloud:~# ps -ef | grep qemu
root 18571 17855 0 14:32 ? 00:00:00 sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpDsKUzxmenoscloud@dell /dev/sda
root 18573 18571 2 14:32 ? 00:00:00 /usr/bin/python2.7 /usr/bin/cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpDsKUzxmenoscloud@dell /dev/sda
root 18575 18573 17 14:32 ? 00:00:02 /usr/bin/qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpDsKUzxmenoscloud@dell /dev/sda

And once the conversion procedure has finished, we’ll see that a new volume appears that stores the image (in raw format) and is ready to be cloned for the next volume-backed instances:

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-------------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-------------------------------+
| dd4071bf-48e9-484c-80cd-89f52a4fa442 | | in-use | 4 | Attached to eql1 on /dev/vda |
| 61903a21-02ad-4dc4-8709-a039e7a65815 | image-c8f3d5c5-5d4a-47de-bac0-bfd3f673c713 | available | 3 | |
| 41caf4ad-2bbd-4311-9003-00d39d009a9f | image-c8f3d5c5-5d4a-47de-bac0-bfd3f673c713 | available | 1 | |
+--------------------------------------+--------------------------------------------+-----------+------+-------------------------------+

If we start a new volume-backed instance, we can check that it boots much faster than the first one. This happens because, in this case, Cinder skips the “qemu-img convert” phase and just clones the volume. The commands can be checked in the file /var/log/cinder/cinder-volume.log:

(...)
2020-05-06 14:39:39.181 17855 INFO cinder.volume.drivers.dell_emc.ps [req-0a0a625f-eb5d-4826-82b2-f1bce93535cd 22a4facfd9794df1b8db1b4b074ae6db 50ab438534cd4c04b9ad341b803a1587 - - -] PS-driver: executing "volume select volume-61903a21-02ad-4dc4-8709-a039e7a65815 clone volume-f3226260-7790-427c-b3f0-da6ab1b2291b".
2020-05-06 14:39:40.333 17855 INFO cinder.volume.drivers.dell_emc.ps [req-0a0a625f-eb5d-4826-82b2-f1bce93535cd 22a4facfd9794df1b8db1b4b074ae6db 50ab438534cd4c04b9ad341b803a1587 - - -] PS-driver: executing "volume select volume-f3226260-7790-427c-b3f0-da6ab1b2291b size 4G no-snap".
(...)

Obviously, you can check that the volumes have been created in the backend, by using the SAN’s CLI:

group1> volume show
Name Size Snapshots Status Permission Connections T
--------------- ---------- --------- ------- ---------- ----------- -
volume-41caf4ad 1GB 0 online read-write 0 Y
-2bbd-4311-90
03-00d39d009a
9f
volume-dd4071bf 4GB 0 online read-write 1 Y
-48e9-484c-80
cd-89f52a4fa4
42
volume-61903a21 3GB 0 online read-write 0 Y
-02ad-4dc4-87
09-a039e7a658
15
volume-f3226260 4GB 0 online read-write 1 Y
-7790-427c-b3
f0-da6ab1b229
1b

How to install Cinder in OpenStack Rocky and make it work with Glance

I have written several posts on installing OpenStack Rocky from scratch. They all have the tag #openstack. In the previous posts we…

  1. Installed OpenStack Rocky (part 1, part 2, and part 3).
  2. Installed the Horizon Dashboard and upgraded noVNC (install horizon).

So we have a working installation of the basic services (keystone, glance, neutron, compute, etc.). And now it is time to learn

How to install Cinder in OpenStack Rocky and make it work with Glance

Cinder is very straightforward to install using the basic mechanism: having a standard Linux server that will serve block devices as a SAN, by providing iSCSI endpoints. This server will use tgtadm and iscsiadm as the basic tools, and a backend for the block devices.

The harder part is to integrate the cinder server with an external SAN device, such as a Dell Equallogic SAN: Cinder has several plugins for this, and each of them has its own particularities.

In this post, we are following the standard cinder installation guide for Ubuntu (in this link), and what we’ll get is the standard SAN server with an LVM back-end for the block devices. Then we will integrate it with Glance (to be able to use Cinder as a storage for the OpenStack images) and we’ll learn a bit about how they work.

Installing Cinder

In the first place we are creating a database for Cinder:

mysql -u root -p <<< "CREATE DATABASE cinder;\
GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'localhost' IDENTIFIED BY 'CINDER_DBPASS';\
GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'%' IDENTIFIED BY 'CINDER_DBPASS';"

Now we create a user for Cinder and register the service in OpenStack (we create both v2 and v3):

$ openstack user create --domain default --password "CINDER_PASS" cinder
$ openstack role add --project service --user cinder admin
$ openstack service create --name cinderv2 --description "OpenStack Block Storage" volumev2
$ openstack service create --name cinderv3 --description "OpenStack Block Storage" volumev3

Once we have the user and the service, we create the proper endpoints for both v2 and v3:

$ openstack endpoint create --region RegionOne volumev2 public http://controller:8776/v2/%\(project_id\)s
$ openstack endpoint create --region RegionOne volumev2 internal http://controller:8776/v2/%\(project_id\)s
$ openstack endpoint create --region RegionOne volumev2 admin http://controller:8776/v2/%\(project_id\)s
$ openstack endpoint create --region RegionOne volumev3 public http://controller:8776/v3/%\(project_id\)s
$ openstack endpoint create --region RegionOne volumev3 internal http://controller:8776/v3/%\(project_id\)s
$ openstack endpoint create --region RegionOne volumev3 admin http://controller:8776/v3/%\(project_id\)s

And now we are ready to install the cinder packages

$ apt install -y cinder-api cinder-scheduler

Once the packages are installed, we need to update the configuration file /etc/cinder/cinder.conf. The content will be something like the following:

[DEFAULT]
rootwrap_config = /etc/cinder/rootwrap.conf
api_paste_config = /etc/cinder/api-paste.ini
iscsi_helper = tgtadm
volume_name_template = volume-%s
volume_group = cinder-volumes
verbose = True
auth_strategy = keystone
state_path = /var/lib/cinder
lock_path = /var/lock/cinder
volumes_dir = /var/lib/cinder/volumes
enabled_backends = lvm
transport_url = rabbit://openstack:RABBIT_PASS@controller
my_ip = 192.168.1.241
glance_api_servers = http://controller:9292
[database]
connection = mysql+pymysql://cinder:CINDER_DBPASS@controller/cinder
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_id = default
user_domain_id = default
project_name = service
username = cinder
password = CINDER_PASS
[oslo_concurrency]
lock_path = /var/lib/cinder/tmp
[lvm]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-volumes
iscsi_protocol = iscsi
iscsi_helper = tgtadm
image_volume_cache_enabled = True

You must adapt this file to your configuration; in particular, the passwords of rabbit, the cinder database and the cinder service, and the IP address of the cinder server (which is stored in the my_ip variable). In my case, I am using the same server as in the previous posts.

With this configuration, cinder will use tgtadm to create the iSCSI endpoints, and the backend will be LVM.

Now we just have to add the following lines to file /etc/nova/nova.conf to enable cinder in OpenStack via nova-api:

[cinder]
os_region_name=RegionOne

Then, sync the cinder database by executing the following command:

$ su -s /bin/sh -c "cinder-manage db sync" cinder

And restart the related services:

$ service nova-api restart
$ service cinder-scheduler restart
$ service apache2 restart

Preparing the LVM backend

Now that we have configured cinder, we need a backend for the block devices. In our case, it is LVM. If you want to know a bit more about the concepts that we are using at this point and what we are doing, you can check my previous post in this link.

Now we are installing the LVM tools:

$ apt install lvm2 thin-provisioning-tools

LVM needs a partition or a whole disk to work. You can use any partition or disk (or even a file that can be used for testing purposes, as described in the section “testlab” in this link). In our case, we are using the whole disk /dev/vdb.

According to our settings, OpenStack expects to be able to use an existing LVM volume group with the name “cinder-volumes”. So we need to create it

$ pvcreate /dev/vdb
$ vgcreate cinder-volumes /dev/vdb

Once we have our volume group ready, we can install the cinder-volume service.

$ apt install cinder-volume

And that’s all about the installation of cinder. The last part will work because we included section [lvm] in /etc/cinder/cinder.conf and “enabled_backends = lvm”.
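Before creating any volume, it does not hurt to check that both services are registered and up (the volume service should appear as something like <host>@lvm):

# openstack volume service list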

Verifying that cinder works

To verify that cinder works, we’ll just create one volume:

# openstack volume create --size 2 checkvol
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| consistencygroup_id | None                                 |
| created_at          | 2020-05-05T09:52:47.000000           |
| description         | None                                 |
| encrypted           | False                                |
| id                  | aadc24eb-ec1c-4b84-b2b2-8ea894b50417 |
| migration_status    | None                                 |
| multiattach         | False                                |
| name                | checkvol                             |
| properties          |                                      |
| replication_status  | None                                 |
| size                | 2                                    |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| type                | None                                 |
| updated_at          | None                                 |
| user_id             | 8c67fb57d70646d9b57beb83cc04a892     |
+---------------------+--------------------------------------+

After a while, we can check that the volume has been properly created and it is available.

# openstack volume list
+--------------------------------------+----------+-----------+------+-----------------------------+
| ID                                   | Name     | Status    | Size | Attached to                 |
+--------------------------------------+----------+-----------+------+-----------------------------+
| aadc24eb-ec1c-4b84-b2b2-8ea894b50417 | checkvol | available |    2 |                             |
+--------------------------------------+----------+-----------+------+-----------------------------+

If we are curious, we can check what happened in the backend:

# lvs -o name,lv_size,data_percent,thin_count
  LV                                          LSize  Data%  #Thins
  cinder-volumes-pool                         19,00g 0,00        1
  volume-aadc24eb-ec1c-4b84-b2b2-8ea894b50417  2,00g 0,00

We can see that we have a volume with the name volume-aadc24eb-ec1c-4b84-b2b2-8ea894b50417, with the ID that coincides with the ID of the volume that we have just created. Moreover, we can see that it has occupied 0% of space because it is thin-provisioned (i.e. it will only use the effective stored data like in qcow2 or vmdk virtual disk formats).

Integrating Cinder with Glance

The integration of Cinder with Glance can be made in two different parts:

  1. Using Cinder as a storage backend for the Images.
  2. Using Cinder as a cache for the Images of the VMs that are volume-based.

It may seem that it is the same, but it is not. To be able to identify what feature we want, we need to know how OpenStack works, and also acknowledge that Cinder and Glance are independent services.

Using Cinder as a backend for Glance

In OpenStack, when a VM is image-based (i.e. it does not create a volume), nova-compute transfers the image to the host in which it has to be used. This happens regardless of whether the image comes from a filesystem backend (i.e. stored in /var/lib/glance/images/), from swift (transferred using HTTP), or from cinder (transferred using iSCSI). So using Cinder as a storage backend for the images avoids the need for extra storage in the controller, but it will not be useful for anything more.

Booting an image-based VM, which does not create a volume.

If you start a volume-backed VM from an image, OpenStack will create a volume for your new VM (using cinder). In this case, cinder is very inefficient, because it connects to the existing volume, downloads it, converts it to raw format and dumps it into the new volume (i.e. using qemu-img convert -O raw -f qcow2 …). So the creation of the volume is extremely slow.

There is one way to boost this procedure by using efficient tools: if the image is stored in raw format and the owner is the same user that tries to use it (check image_upload_use_internal_tenant), and the allowed_direct_url_schemes option is properly set, the new volume will be created by cloning the volume that contains the image and resizing it using the backend tools (i.e. lvm cloning and resizing capabilities). That means that the new volume will be almost instantly created, and we’ll try to use this mechanism, if possible.

To enable cinder as a backend for Glance, you need to add the following lines to file /etc/glance/glance-api.conf

[glance_store]
stores = file,http,cinder
default_store = cinder
filesystem_store_datadir = /var/lib/glance/images/
cinder_store_auth_address = http://controller:5000/v3
cinder_store_user_name = cinder
cinder_store_password = CINDER_PASS
cinder_store_project_name = service

We are just adding “Cinder” as one of the mechanisms for Glance (apart from the others, like file or HTTP). In our example, we are setting Cinder as the default storage backend, because the horizon dashboard does not offer any way to select where to store the images.

It is possible to set any other storage backend as the default, but then you’ll need to create the volumes by hand and execute low-level Glance commands such as “glance location-add <image-uuid> --url cinder://<volume-uuid>”. The mechanism can be seen in the official guide.

The variables cinder_store_user_name, cinder_store_password, and cinder_store_project_name are used to set the owner of the images that are uploaded to Cinder via Glance, and they are used only if image_upload_use_internal_tenant is set to True in the Cinder configuration.

And now we need to add the next lines to section [DEFAULT] in /etc/cinder/cinder.conf:

allowed_direct_url_schemes = cinder
image_upload_use_internal_tenant = True

Finally, you need to restart the services:

# service cinder-volume restart
# service cinder-scheduler restart
# service glance-api restart

It may seem a bit messy, but this is how Cinder and Glance are configured. If you use the configuration that I propose in this post, you should get the integration working as expected.

Verifying the integration

We are storing a new image, but we’ll store it as a volume this time. Moreover, we will store it in raw format, to be able to use the “direct_url” method to clone the volumes instead of downloading them:

# wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
(...)
# qemu-img convert -O raw bionic-server-cloudimg-amd64.img bionic-server-cloudimg-amd64.raw
# openstack image create --public --container-format bare --disk-format raw --file ./bionic-server-cloudimg-amd64.raw "Ubuntu 18.04"
+------------------+--------------------------------------+
| Field            | Value                                |
+------------------+--------------------------------------+
(...)
| id               | 7fd1c4b4-783e-41cb-800d-4e259c22d1ab |
| name             | Ubuntu 18.04                         |
(...)
+------------------+--------------------------------------+

And now we can check what happened under the hood:

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| ID                                   | Name                                       | Status    | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| 13721f57-c706-47c9-9114-f4b011f32ea2 | image-7fd1c4b4-783e-41cb-800d-4e259c22d1ab | available |    3 |             |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
# lvs
  LV                                          VG             Attr       LSize  Pool                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  cinder-volumes-pool                         cinder-volumes twi-aotz-- 19,00g                            11,57  16,11
  volume-13721f57-c706-47c9-9114-f4b011f32ea2 cinder-volumes Vwi-a-tz--  3,00g cinder-volumes-pool        73,31

We can see that a new volume has been created with the name “image-7fd1c4b4…”, which corresponds to the just created image ID. The volume has an ID 13721f57…, and LVM has a new logical volume with the name volume-13721f57 that corresponds to that new volume.

Now if we create a new VM that uses that image, we will notice that the creation of the VM is very quick (and this is because we used the “allowed_direct_url_schemes” method).

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
| 219d9f92-ce17-4da6-96fa-86a04e460eb2 | | in-use | 4 | Attached to u1 on /dev/vda |
| 13721f57-c706-47c9-9114-f4b011f32ea2 | image-7fd1c4b4-783e-41cb-800d-4e259c22d1ab | available | 3 | |
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
cinder-volumes-pool cinder-volumes twi-aotz-- 19,00g 11,57 16,11
volume-13721f57-c706-47c9-9114-f4b011f32ea2 cinder-volumes Vwi-a-tz-- 3,00g cinder-volumes-pool 73,31
volume-219d9f92-ce17-4da6-96fa-86a04e460eb2 cinder-volumes Vwi-aotz-- 4,00g cinder-volumes-pool volume-13721f57-c706-47c9-9114-f4b011f32ea2 54,98

Under the hood, we can see that the volume created for the instance (id 219d9f92…) has a reference in LVM to the original volume (id 13721f57…) that corresponds to the volume of the image.

Cinder as an image-volume storage cache

If you do not need Cinder as a storage backend (either because you are happy with the filesystem backend or you are storing the images in swift, etc.), it is also possible to use it to boost the boot process of VMs that create a volume as the main disk.

Cinder enables a mechanism to be used as an image-volume storage cache. It means that when an image is used for a volume-backed VM, it will be stored in a special volume, regardless of whether it was originally stored in cinder or not. Then the volume that contains the image will be cloned and resized (using the backend tools; i.e. lvm cloning and resizing capabilities) for subsequent VMs that use that image.

During the first use of the image, it will be downloaded (either from the filesystem, cinder, swift, or wherever the image is stored), converted to raw format, and stored as a volume. The next uses of the image will work as using the “direct_url” method (i.e. cloning the volume).

To enable this mechanism, you need to get the id of the project “service” and the id of the user “cinder”:

# openstack project list
+----------------------------------+---------+
| ID | Name |
+----------------------------------+---------+
| 50ab438534cd4c04b9ad341b803a1587 | service |
(...)
+----------------------------------+---------+
# openstack user list
+----------------------------------+-----------+
| ID | Name |
+----------------------------------+-----------+
| 22a4facfd9794df1b8db1b4b074ae6db | cinder |
(...)
+----------------------------------+-----------+
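If you prefer to grab just the two ids, the client’s output filters do the trick:

# openstack project show service -f value -c id
# openstack user show cinder -f value -c id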

Then you need to add the following lines to the [DEFAULT] section in file /etc/cinder/cinder.conf (configuring your ids):

cinder_internal_tenant_project_id = 50ab438534cd4c04b9ad341b803a1587
cinder_internal_tenant_user_id = 22a4facfd9794df1b8db1b4b074ae6db

And add the following line to the section of the backend that will act as a cache (in our case [lvm])

[lvm]
...
image_volume_cache_enabled = True

Then you just need to restart the cinder services:

root@menoscloud:~# service cinder-volume restart
root@menoscloud:~# service cinder-scheduler restart

Testing the cache

In this case, I am creating a new image, which is in qcow2 format:

# wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
(...)
# openstack image create --public --container-format bare --disk-format qcow2 --file ./bionic-server-cloudimg-amd64.img "Ubuntu 18.04 - qcow2"

Under the hood, OpenStack created a volume (id c36de566…) for the corresponding image (id 50b84eb0…), that can be seen in LVM:

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| c36de566-d538-4b43-b2b3-d000f9b4162f | image-50b84eb0-9de5-45ba-8004-f1f1c7a0c00c | available | 1 | |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
# lvs
  LV                                          VG             Attr       LSize  Pool                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  cinder-volumes-pool                         cinder-volumes twi-aotz-- 19,00g                            1,70   11,31
  volume-c36de566-d538-4b43-b2b3-d000f9b4162f cinder-volumes Vwi-a-tz--  1,00g cinder-volumes-pool        32,21

Now we create a VM (which is volume-based). And during the “block device mapping” phase, we can inspect what is happening under the hood:

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
| ID                                   | Name                                       | Status      | Size | Attached to |
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
| 60c19e3c-3960-4fe7-9895-0426070b3e88 |                                            | downloading |    3 |             |
| c36de566-d538-4b43-b2b3-d000f9b4162f | image-50b84eb0-9de5-45ba-8004-f1f1c7a0c00c | available   |    1 |             |
+--------------------------------------+--------------------------------------------+-------------+------+-------------+
# lvs
  LV                                          VG             Attr       LSize  Pool                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  cinder-volumes-pool                         cinder-volumes twi-aotz-- 19,00g                            3,83   12,46
  volume-60c19e3c-3960-4fe7-9895-0426070b3e88 cinder-volumes Vwi-aotz--  3,00g cinder-volumes-pool        13,54
  volume-c36de566-d538-4b43-b2b3-d000f9b4162f cinder-volumes Vwi-a-tz--  1,00g cinder-volumes-pool        32,21
# ps -ef | grep qemu
root      9681  9169  0 09:42 ?        00:00:00 sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpWqlJD5menoscloud@lvm /dev/mapper/cinder--volumes-volume--60c19e3c--3960--4fe7--9895--0426070b3e88
root      9682  9681  0 09:42 ?        00:00:00 /usr/bin/python2.7 /usr/bin/cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpWqlJD5menoscloud@lvm /dev/mapper/cinder--volumes-volume--60c19e3c--3960--4fe7--9895--0426070b3e88
root      9684  9682 29 09:42 ?        00:00:13 /usr/bin/qemu-img convert -O raw -t none -f qcow2 /var/lib/cinder/conversion/tmpWqlJD5menoscloud@lvm /dev/mapper/cinder--volumes-volume--60c19e3c--3960--4fe7--9895--0426070b3e88

Cinder created a new volume (id 60c19e3c…) whose size does not yet correspond to the flavor I used (4 Gb), and it is converting the image into that new volume. That image was previously downloaded from Cinder (from volume c36de566…) to the folder /var/lib/cinder/conversion, by mapping the iSCSI device and dumping its contents. If the image were not Cinder-backed, it would have been downloaded using the appropriate mechanism (e.g. HTTP, or a file copy from /var/lib/glance/images).

After a while (depending on the conversion process), the VM will start and we can inspect the backend…

# openstack volume list --all-projects
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
| 91d51bc2-e33b-4b97-b91d-3a8655f88d0f | image-50b84eb0-9de5-45ba-8004-f1f1c7a0c00c | available | 3 | |
| 60c19e3c-3960-4fe7-9895-0426070b3e88 | | in-use | 4 | Attached to q1 on /dev/vda |
| c36de566-d538-4b43-b2b3-d000f9b4162f | image-50b84eb0-9de5-45ba-8004-f1f1c7a0c00c | available | 1 | |
+--------------------------------------+--------------------------------------------+-----------+------+-----------------------------+
# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
cinder-volumes-pool cinder-volumes twi-aotz-- 19,00g 13,69 17,60
volume-60c19e3c-3960-4fe7-9895-0426070b3e88 cinder-volumes Vwi-aotz-- 4,00g cinder-volumes-pool 56,48
volume-91d51bc2-e33b-4b97-b91d-3a8655f88d0f cinder-volumes Vwi-a-tz-- 3,00g cinder-volumes-pool volume-60c19e3c-3960-4fe7-9895-0426070b3e88 73,31
volume-c36de566-d538-4b43-b2b3-d000f9b4162f cinder-volumes Vwi-a-tz-- 1,00g cinder-volumes-pool 32,21

Now we can see that there is a new volume (id 91d51bc2…) which has been associated with the image (id 50b84eb0…). That volume will be cloned using the LVM mechanisms in the next uses of the image for volume-backed instances. Now if you start new instances, they will boot much faster.

How to move from a linear disk to an LVM disk and join the two disks into an LVM-like RAID-0

I had the recent need for adding a disk to an existing installation of Ubuntu, to make the / folder bigger. In such a case, I have two possibilities: to move my whole system to a new bigger disk (and e.g. dispose of the original disk) or to convert my disk to an LVM volume and add a second disk to enable the volume to grow. The first case was the subject of a previous post, but this time I learned…

How to move from a linear disk to an LVM disk and join the two disks into an LVM-like RAID-0

The starting point is simple:

  • I have one 14 Gb. disk (/dev/vda) with a single partition that is mounted in / (The disk has a GPT table and UEFI format and so it has extra partitions that we’ll keep as they are).
  • I have an 80 Gb. brand new disk (/dev/vdb)
  • I want to have one 94 Gb. volume built from the two disks

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part /
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk /mnt
vdc 252:32 0 4G 0 disk [SWAP]

The steps are the following:

  1. Create a boot partition in /dev/vdb (this is needed because Grub cannot boot from LVM and needs an ext or VFAT partition)
  2. Format the boot partition and put the content of the current /boot folder
  3. Create an LVM volume using the extra space in /dev/vdb and initialize it using an ext4 filesystem
  4. Put the contents of the current / folder into the new partition
  5. Update grub to boot from the new disk
  6. Update the mount point for our system
  7. Reboot (and check)
  8. Add the previous disk to the LVM volume.

Let’s start…

Separate the /boot partition

When installing an LVM system, it is needed to have a /boot partition in a common format (e.g. ext2 or ext4), because GRUB cannot read from LVM. Then GRUB reads the contents of that partition and starts the proper modules to read the LVM volumes.

So we need to create the /boot partition. In our case, we are using the ext2 format, because it has no journal (we do not need one for the content of /boot) and it is faster. We are using 1 Gb. for the /boot partition, but 512 Mb. will probably be enough:

root@somove:~# fdisk /dev/vdb

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (1-4, default 1):
First sector (2048-167772159, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-167772159, default 167772159): +1G

Created a new partition 1 of type 'Linux' and of size 1 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

root@somove:~# mkfs.ext2 /dev/vdb1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 24618637-d2d4-45fe-bf83-d69d37f769d0
Superblock backups stored on blocks:
32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

Now we’ll make a mount point for this partition, mount the partition and copy the contents of the current /boot folder to that partition:

root@somove:~# mkdir /mnt/boot
root@somove:~# mount /dev/vdb1 /mnt/boot/
root@somove:~# cp -ax /boot/* /mnt/boot/

Create an LVM volume in the extra space of /dev/vdb

First, we will create a new partition for our LVM system, and we’ll get the whole free space:

root@somove:~# fdisk /dev/vdb

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): n
Partition type
p primary (1 primary, 0 extended, 3 free)
e extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (2-4, default 2):
First sector (2099200-167772159, default 2099200):
Last sector, +sectors or +size{K,M,G,T,P} (2099200-167772159, default 167772159):

Created a new partition 2 of type 'Linux' and of size 79 GiB.

Command (m for help): w
The partition table has been altered.
Syncing disks.

Now we will create a Physical Volume, a Volume Group and the Logical Volume for our root filesystem, using the new partition:

root@somove:~# pvcreate /dev/vdb2
Physical volume "/dev/vdb2" successfully created.
root@somove:~# vgcreate rootvg /dev/vdb2
Volume group "rootvg" successfully created
root@somove:~# lvcreate -l +100%free -n rootfs rootvg
Logical volume "rootfs" created.

If you want to learn about LVM to better understand what we are doing, you can read my previous post.

Now we are initializing the filesystem of the new /dev/rootvg/rootfs volume using ext4, and then we’ll copy the existing filesystem except for the special folders and the /boot folder (which we have separated in the other partition):

root@somove:~# mkfs.ext4 /dev/rootvg/rootfs
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 20708352 4k blocks and 5177344 inodes
Filesystem UUID: 47b4b698-4b63-4933-98d9-f8904ad36b2e
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000

Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done

root@somove:~# mkdir /mnt/rootfs
root@somove:~# mount /dev/rootvg/rootfs /mnt/rootfs/
root@somove:~# rsync -aHAXx --delete --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/boot/*,/lost+found} / /mnt/rootfs/

Update the system to boot from the new /boot partition and the LVM volume

At this point we have our /boot partition (/dev/vdb1) and the / filesystem (/dev/rootvg/rootfs). Now we need to prepare GRUB to boot using these new resources. And here comes the magic…

root@somove:~# mount --bind /dev /mnt/rootfs/dev/
root@somove:~# mount --bind /sys /mnt/rootfs/sys/
root@somove:~# mount -t proc /proc /mnt/rootfs/proc/
root@somove:~# chroot /mnt/rootfs/

We are binding the special mount points /dev and /sys to the same folders in the new filesystem which is mounted in /mnt/rootfs. We are also creating the /proc mount point which holds the information about the processes. You can find some more information about why this is needed in my previous post on chroot and containers.

Intuitively, we are somehow “in the new filesystem” and now we can update things as if we had already booted into it.

At this point, we need to update the mount point in /etc/fstab to mount the proper disks once the system boots. So we are getting the UUIDs for our partitions:

root@somove:/# blkid
/dev/vda1: LABEL="cloudimg-rootfs" UUID="135ecb53-0b91-4a6d-8068-899705b8e046" TYPE="ext4" PARTUUID="b27490c5-04b3-4475-a92b-53807f0e1431"
/dev/vda14: PARTUUID="14ad2c62-0a5e-4026-a37f-0e958da56fd1"
/dev/vda15: LABEL="UEFI" UUID="BF99-DB4C" TYPE="vfat" PARTUUID="9c37d9c9-69de-4613-9966-609073fba1d3"
/dev/vdb1: UUID="24618637-d2d4-45fe-bf83-d69d37f769d0" TYPE="ext2"
/dev/vdb2: UUID="Uzt1px-ANds-tXYj-Xwyp-gLYj-SDU3-pRz3ed" TYPE="LVM2_member"
/dev/mapper/rootvg-rootfs: UUID="47b4b698-4b63-4933-98d9-f8904ad36b2e" TYPE="ext4"
/dev/vdc: UUID="3377ec47-a0c9-4544-b01b-7267ea48577d" TYPE="swap"

Now we update /etc/fstab to mount /dev/mapper/rootvg-rootfs as the / folder, and partition /dev/vdb1 in /boot. Using our example, the /etc/fstab file will look like this:

UUID="47b4b698-4b63-4933-98d9-f8904ad36b2e" / ext4 defaults 0 0
UUID="24618637-d2d4-45fe-bf83-d69d37f769d0" /boot ext2 defaults 0 0
LABEL=UEFI /boot/efi vfat defaults 0 0
UUID="3377ec47-a0c9-4544-b01b-7267ea48577d" none swap sw,comment=cloudconfig 0 0

We are using the UUIDs to mount the / and /boot folders because the devices may change their names or location, and that could break our system.
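Before going on, a quick read-only sanity check of the new fstab does not hurt (optional; findmnt --verify is available in the util-linux version shipped with Ubuntu 18.04):

root@somove:/# findmnt --verify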

And now we are ready to mount our /boot partition, update grub, and install it on the /dev/vda disk (because we are keeping both disks).

root@somove:/# mount /boot
root@somove:/# update-grub
Generating grub configuration file ...
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found linux image: /boot/vmlinuz-4.15.0-43-generic
Found initrd image: /boot/initrd.img-4.15.0-43-generic
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found Ubuntu 18.04.1 LTS (18.04) on /dev/vda1
done
root@somove:/# grub-install /dev/vda
Installing for i386-pc platform.
Installation finished. No error reported.

Reboot and check

We are almost done, and now we are exiting the chroot and rebooting

root@somove:/# exit
root@somove:~# reboot

And the result should be the following:

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk
├─vdb1 252:17 0 1G 0 part /boot
└─vdb2 252:18 0 79G 0 part
└─rootvg-rootfs 253:0 0 79G 0 lvm /
vdc 252:32 0 4G 0 disk [SWAP]

root@somove:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 0 2.0G 0% /dev
tmpfs 395M 676K 394M 1% /run
/dev/mapper/rootvg-rootfs 78G 993M 73G 2% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/vdb1 1008M 43M 915M 5% /boot
/dev/vda15 105M 3.6M 101M 4% /boot/efi
tmpfs 395M 0 395M 0% /run/user/1000

We have our / system mounted from the new LVM Logical Volume /dev/rootvg/rootfs, the /boot partition from /dev/vdb1, and the /boot/efi from the existing partition (just in case that we need it).

Add the previous disk to the LVM volume

Now comes the easy part: integrating the original /dev/vda1 partition into the LVM volume.

Once we have double-checked that we copied every file from the original / folder in /dev/vda1, we can initialize it for use in LVM:

WARNING: This step wipes the content of /dev/vda1.

root@somove:~# pvcreate /dev/vda1
WARNING: ext4 signature detected on /dev/vda1 at offset 1080. Wipe it? [y/n]: y
Wiping ext4 signature on /dev/vda1.
Physical volume "/dev/vda1" successfully created.

Finally, we can integrate the new partition in our volume group and extend the logical volume to use the free space:

root@somove:~# vgextend rootvg /dev/vda1
Volume group "rootvg" successfully extended
root@somove:~# lvextend -l +100%free /dev/rootvg/rootfs
Size of logical volume rootvg/rootfs changed from <79.00 GiB (20223 extents) to 92.88 GiB (23778 extents).
Logical volume rootvg/rootfs successfully resized.
root@somove:~# resize2fs /dev/rootvg/rootfs
resize2fs 1.44.1 (24-Mar-2018)
Filesystem at /dev/rootvg/rootfs is mounted on /; on-line resizing required
old_desc_blocks = 10, new_desc_blocks = 12
The filesystem on /dev/rootvg/rootfs is now 24348672 (4k) blocks long.

And now we have the new 94 Gb. / folder which is made from /dev/vda1 and /dev/vdb2:

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
├─vda1 252:1 0 13.9G 0 part
│ └─rootvg-rootfs 253:0 0 92.9G 0 lvm /
├─vda14 252:14 0 4M 0 part
└─vda15 252:15 0 106M 0 part /boot/efi
vdb 252:16 0 80G 0 disk
├─vdb1 252:17 0 1G 0 part /boot
└─vdb2 252:18 0 79G 0 part
└─rootvg-rootfs 253:0 0 92.9G 0 lvm /
vdc 252:32 0 4G 0 disk [SWAP]
root@somove:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 0 2.0G 0% /dev
tmpfs 395M 676K 394M 1% /run
/dev/mapper/rootvg-rootfs 91G 997M 86G 2% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/vdb1 1008M 43M 915M 5% /boot
/dev/vda15 105M 3.6M 101M 4% /boot/efi
tmpfs 395M 0 395M 0% /run/user/1000

(optional) Moving the /boot partition to /dev/vda

In case we wanted to have the /boot partition in /dev/vda, the procedure would be a bit different:

  1. Instead of splitting /dev/vdb into a /boot partition and an LVM partition, create a single ext4 partition /dev/vdb1, without separating /boot and / yet.
  2. Once /dev/vdb1 is created, copy the filesystem in /dev/vda1 to /dev/vdb1 and prepare to boot from /dev/vdb1 (chroot, adjust mount points, update-grub, grub-install…).
  3. Boot from the new partition and wipe the original /dev/vda1 partition.
  4. Create a partition /dev/vda1 for the new /boot and initialize it using ext2, copy the contents of /boot according to the instructions in this post.
  5. Create a partition /dev/vda2, create the LVM volume, initialize it and copy the contents of /dev/vdb1 except from /boot
  6. Prepare to boot from /dev/vda (chroot, adjust mount points, mount /boot, update-grub, grub-install…)
  7. Boot from the new /boot + LVM layout and decide whether you want to add /dev/vdb to the LVM volume or not.

Using this procedure, you will get from linear to LVM with a single disk. Then you can decide whether to make the LVM volume grow or not. Moreover, you may decide whether to create an LVM RAID (1, 5, …) with the new or other disks.
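For reference, once both disks are physical volumes of rootvg, turning the root logical volume into a mirrored (RAID-1-like) volume is a single command. This is only a sketch: it needs enough free extents on the second physical volume, so it would not work right after the +100%free extension above.

root@somove:~# lvconvert --type raid1 -m 1 rootvg/rootfs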

How to move an existing installation of Ubuntu to another disk

Under some circumstances, we may have the need of moving a working installation of Ubuntu to another disk. The most common case is when your current disk runs out of space and you want to move it to a bigger one. But you could also want to move to an SSD disk or to create an LVM raid…

So this time I learned…

How to move an existing installation of Ubuntu to another disk

I have a 14Gb disk that contains my / partition (vda), and I want to move to a new 80Gb disk (vdb).

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
└─vda1 252:1 0 13.9G 0 part /
vdb 252:16 0 80G 0 disk
vdc 252:32 0 4G 0 disk [SWAP]

First of all, I will create a partition for my / system in /dev/vdb.

root@somove:~# fdisk /dev/vdb

Welcome to fdisk (util-linux 2.31.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-167772159, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-167772159, default 167772159):

Created a new partition 1 of type 'Linux' and of size 80 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

NOTE: The inputs from the user are: n for the new partition and the defaults (i.e. return) for any setting to get the whole disk. Then w to write the partition table.

Now that I have the new partition, we’ll create the filesystem (ext4):

root@somove:~# mkfs.ext4 /dev/vdb1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 20971264 4k blocks and 5242880 inodes
Filesystem UUID: ea7ee2f5-749e-4e74-bcc3-2785297291a4
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000

Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done

We have to transfer the content of the running filesystem to the new disk. But first, we’ll make sure that any other mount point except for / is unmounted (to avoid copying files from other disks):

root@somove:~# umount -a
umount: /run/user/1000: target is busy.
umount: /sys/fs/cgroup/unified: target is busy.
umount: /sys/fs/cgroup: target is busy.
umount: /: target is busy.
umount: /run: target is busy.
umount: /dev: target is busy.
root@somove:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=2006900k,nr_inodes=501725,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=403912k,mode=755)
/dev/vda1 on / type ext4 (rw,relatime,data=ordered)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=403908k,mode=700,uid=1000,gid=1000)
root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 14G 0 disk
└─vda1 252:1 0 13.9G 0 part /
vdb 252:16 0 80G 0 disk
└─vdb1 252:17 0 80G 0 part
vdc 252:32 0 4G 0 disk [SWAP]

Now we will create a mount point for the new filesystem and we’ll copy everything from / to it, except for the special folders (i.e. /tmp, /sys, /dev, etc.). Once completed, we’ll create the Linux special folders:

root@somove:~# mkdir /mnt/vdb1
root@somove:~# mount /dev/vdb1 /mnt/vdb1/
root@somove:~# rsync -aHAXx --delete --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/lost+found} / /mnt/vdb1/

Instead of using rsync, we could use cp -ax /bin /etc /home /lib /lib64 …, but you need to make sure that all folders and files are copied. You also need to make sure that the special folders are created, by running mkdir /mnt/vdb1/{boot,mnt,proc,run,tmp,dev,sys}. The rsync version is easier to control and to understand.

Now that we have the same directory tree, we just need to make the magic of chroot to prepare the new disk:

root@somove:~# mount --bind /dev /mnt/vdb1/dev
root@somove:~# mount --bind /sys /mnt/vdb1/sys
root@somove:~# mount -t proc /proc /mnt/vdb1/proc
root@somove:~# chroot /mnt/vdb1/

We need to make sure that the new system will mount the new partition (i.e. /dev/vdb1) as /, but we cannot use the /dev/vdb1 device name, because if we remove the other disk its name will change to /dev/vda1. So we are using the UUID of the partition instead. To get it, we can use blkid:

root@somove:/# blkid
/dev/vda1: UUID="135ecb53-0b91-4a6d-8068-899705b8e046" TYPE="ext4"
/dev/vdb1: UUID="eb8d215e-d186-46b8-bd37-4b244cbb8768" TYPE="ext4"

And now we have to update the file /etc/fstab to mount the proper UUID in the / folder. The new /etc/fstab for our example is the following:

UUID="eb8d215e-d186-46b8-bd37-4b244cbb8768" / ext4 defaults 0 0

At this point, we need to update grub to match our disks (it will get the UUID or labels), and install it in the new disk:

root@somove:/# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.15.0-43-generic
Found initrd image: /boot/initrd.img-4.15.0-43-generic
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found Ubuntu 18.04.1 LTS (18.04) on /dev/vda1
done
root@somove:/# grub-install /dev/vdb
Installing for i386-pc platform.
Installation finished. No error reported.

WARNING: In case we get error “error: will not proceed with blocklists.”, please go to the end part of this post.

WARNING: If you plan to keep the original disk in its place (e.g. a Virtual Machine in Amazon or OpenStack), you must install grub in /dev/vda. Otherwise, it will boot the previous system.
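That is, from inside the chroot:

root@somove:/# grub-install /dev/vda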

Finally, you can exit from the chroot, power off, remove the old disk, and boot using the new one. The result will be the following:

root@somove:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:16 0 80G 0 disk
└─vda1 252:17 0 80G 0 part /
vdc 252:32 0 4G 0 disk [SWAP]

What if we get error “error: will not proceed with blocklists.”

If (and only if) we get this error, we’ll need to wipe the gap between the start of the disk and the first partition, and then we’ll be able to install grub on the disk.

WARNING: make sure that you know what you are doing, or that the disk is new, because this can potentially erase the data on /dev/vdb.

$ grub-install /dev/vdb
Installing for i386-pc platform.
grub-install: warning: Attempting to install GRUB to a disk with multiple partition labels. This is not supported yet..
grub-install: warning: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
grub-install: error: will not proceed with blocklists.

In this case, we need to check the partition table of /dev/vdb

root@somove:/# fdisk -l /dev/vdb
Disk /dev/vdb: 80 GiB, 85899345920 bytes, 167772160 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device Boot Start End Sectors Size Id Type
/dev/vdb1 2048 167772159 167770112 80G 83 Linux

And now we will write zeros to /dev/vdb (skipping the first sector, where the partition table is stored), up to the sector just before where our partition starts (in our case, partition /dev/vdb1 starts at sector 2048, so we will zero 2047 sectors):

root@somove:/# dd if=/dev/zero of=/dev/vdb seek=1 count=2047
2047+0 records in
2047+0 records out
1048064 bytes (1.0 MB, 1.0 MiB) copied, 0.0245413 s, 42.7 MB/s

If this was the problem, now you should be able to install grub:

root@somove:/# grub-install /dev/vdb
Installing for i386-pc platform.
Installation finished. No error reported.

How to use LVM

LVM stands for Logical Volume Manager, and it provides logical volume management for Linux kernels. It enables us to manage multiple physical disks from a single manager and to create logical volumes that take profit from having multiple disks (e.g. RAID, thin provisioning, volumes that span across disks, etc.).

I have needed LVM multiple times, and it is of special interest when dealing with an LVM-backed Cinder in OpenStack.

So this time I learned…

How to use LVM

LVM comes installed in most Linux distros, and they are usually LVM-aware, so they can even boot from LVM volumes.

For the purpose of this post, we’ll consider LVM as a mechanism to manage multiple physical disks and to create logical volumes on them. Then LVM will show the operating system the logical volumes as if they were disks.

Testlab

LVM is intended for physical disks (e.g. /dev/sda, /dev/sdb, etc.), but we are creating a test lab to avoid the need to buy physical disks.

We are creating 4 fake disks of size 256Mb each. To create each of them we simply create a file of the proper size (that will store the data), and then we attach that file to a loop device:

root@s1:/tmp# dd if=/dev/zero of=/tmp/fake-disk-256.0 bs=1M count=256
...
root@s1:/tmp# dd if=/dev/zero of=/tmp/fake-disk-256.1 bs=1M count=256
...
root@s1:/tmp# dd if=/dev/zero of=/tmp/fake-disk-256.2 bs=1M count=256
...
root@s1:/tmp# dd if=/dev/zero of=/tmp/fake-disk-256.3 bs=1M count=256
...
root@s1:/tmp# losetup /dev/loop0 ./fake-disk-256.0
root@s1:/tmp# losetup /dev/loop1 ./fake-disk-256.1
root@s1:/tmp# losetup /dev/loop2 ./fake-disk-256.2
root@s1:/tmp# losetup /dev/loop3 ./fake-disk-256.3

And now we have 4 working disks for our tests:

root@s1:/tmp# fdisk -l
Disk /dev/loop0: 256 MiB, 268435456 bytes, 524288 sectors
...
Disk /dev/loop1: 256 MiB, 268435456 bytes, 524288 sectors
...
Disk /dev/loop2: 256 MiB, 268435456 bytes, 524288 sectors
...
Disk /dev/loop3: 256 MiB, 268435456 bytes, 524288 sectors
...
Disk /dev/sda: (...)

For the system, these devices can be used as regular disks (e.g. format them, mount, etc.):

root@s1:/tmp# mkfs.ext4 /dev/loop0
mke2fs 1.44.1 (24-Mar-2018)
...
Writing superblocks and filesystem accounting information: done

root@s1:/tmp# mkdir -p /tmp/mnt/disk0
root@s1:/tmp# mount /dev/loop0 /tmp/mnt/disk0/
root@s1:/tmp# cd /tmp/mnt/disk0/
root@s1:/tmp/mnt/disk0# touch this-is-a-file
root@s1:/tmp/mnt/disk0# ls -l
total 16
drwx------ 2 root root 16384 Apr 16 12:35 lost+found
-rw-r--r-- 1 root root 0 Apr 16 12:38 this-is-a-file
root@s1:/tmp/mnt/disk0# cd /tmp/
root@s1:/tmp# umount /tmp/mnt/disk0

Concepts of LVM

LVM has simple actors:

  • Physical volume: which is a physical disk.
  • Volume group: which is a set of physical disks managed together.
  • Logical volume: which is a block device.

Logical Volumes (LV) are stored in Volume Groups (VG), which are backed by Physical Volumes (PV).

PVs are managed using pv* commands (e.g. pvscan, pvs, pvcreate, etc.). VGs are managed using vg* commands (e.g. vgs, vgdisplay, vgextend, etc.). LVs are managed using lv* commands (e.g. lvdisplay, lvs, lvextend, etc.).

Simple workflow with LVM

To have an LVM system, we have to first initialize a physical volume. That is somehow “initializing a disk in LVM format”, and that wipes the content of the disk:

root@s1:/tmp# pvcreate /dev/loop0
WARNING: ext4 signature detected on /dev/loop0 at offset 1080. Wipe it? [y/n]: y
Wiping ext4 signature on /dev/loop0.
Physical volume "/dev/loop0" successfully created.

Now we have to create a volume group (we’ll call it test-vg):

root@s1:/tmp# vgcreate test-vg /dev/loop0
Volume group "test-vg" successfully created

And finally, we can create a logical volume

root@s1:/tmp# lvcreate -l 100%vg --name test-vol test-vg
Logical volume "test-vol" created.

And now we have a simple LVM system that is built from one single physical disk (/dev/loop0) that contains one single volume group (test-vg) that holds a single logical volume (test-vol).

Examining things in LVM

  • The commands to examine PVs: pvs and pvdisplay. Each of them offers different information. pvscan also exists, but it is not needed in current versions of LVM.
  • The commands to examine VGs: vgs and vgdisplay. Each of them offers different information. vgscan also exists, but it is not needed in current versions of LVM.
  • The commands to examine LVs: lvs and lvdisplay. Each of them offers different information. lvscan also exists, but it is not needed in current versions of LVM.

Each command has several options, which we are not exploring here. We are just using the commands and we’ll present some options in the next examples.

At this time we should have a PV, a VG, and an LV, and we can see them by using pvs, vgs and lvs:

root@s1:/tmp# pvs
PV VG Fmt Attr PSize PFree
/dev/loop0 test-vg lvm2 a-- 252.00m 0
root@s1:/tmp# vgs
VG #PV #LV #SN Attr VSize VFree
test-vg 1 1 0 wz--n- 252.00m 0
root@s1:/tmp# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
test-vol test-vg -wi-a----- 252.00m

Now we can use test-vol as if it were a partition:

root@s1:/tmp# mkfs.ext4 /dev/mapper/test--vg-test--vol
mke2fs 1.44.1 (24-Mar-2018)
...
Writing superblocks and filesystem accounting information: done

root@s1:/tmp# mount /dev/mapper/test--vg-test--vol /tmp/mnt/disk0/
root@s1:/tmp# cd /tmp/mnt/disk0/
root@s1:/tmp/mnt/disk0# touch this-is-my-file
root@s1:/tmp/mnt/disk0# df -h
Filesystem Size Used Avail Use% Mounted on
(...)
/dev/mapper/test--vg-test--vol 241M 2.1M 222M 1% /tmp/mnt/disk0

Adding another disk to grow the filesystem

Imagine that we have filled our 241Mb volume on our 256Mb disk (/dev/loop0) and we need some more storage space. We could buy an extra disk (in our lab, /dev/loop1) and add it to the volume group using the command vgextend:

root@s1:/tmp# pvcreate /dev/loop1
Physical volume "/dev/loop1" successfully created.
root@s1:/tmp# vgextend test-vg /dev/loop1
Volume group "test-vg" successfully extended

And now we have two physical volumes added to a single volume group. The VG is of size 504Mb and there are 252Mb free.

root@s1:/tmp# pvs
PV VG Fmt Attr PSize PFree
/dev/loop0 test-vg lvm2 a-- 252.00m 0
/dev/loop1 test-vg lvm2 a-- 252.00m 252.00m
root@s1:/tmp# vgs
VG #PV #LV #SN Attr VSize VFree
test-vg 2 1 0 wz--n- 504.00m 252.00m

We can think of the VG as if it were a disk and of the LVs as its partitions. So we can grow the LV within the VG and then grow the filesystem:

root@s1:/tmp/mnt/disk0# lvscan
ACTIVE '/dev/test-vg/test-vol' [252.00 MiB] inherit

root@s1:/tmp/mnt/disk0# lvextend -l +100%free /dev/test-vg/test-vol
Size of logical volume test-vg/test-vol changed from 252.00 MiB (63 extents) to 504.00 MiB (126 extents).
Logical volume test-vg/test-vol successfully resized.

root@s1:/tmp/mnt/disk0# resize2fs /dev/test-vg/test-vol
resize2fs 1.44.1 (24-Mar-2018)
Filesystem at /dev/test-vg/test-vol is mounted on /tmp/mnt/disk0; on-line resizing required
old_desc_blocks = 2, new_desc_blocks = 4
The filesystem on /dev/test-vg/test-vol is now 516096 (1k) blocks long.

root@s1:/tmp/mnt/disk0# df -h
Filesystem Size Used Avail Use% Mounted on
(...)
/dev/mapper/test--vg-test--vol 485M 2.3M 456M 1% /tmp/mnt/disk0

root@s1:/tmp/mnt/disk0# ls -l
total 12
drwx------ 2 root root 12288 Apr 22 16:46 lost+found
-rw-r--r-- 1 root root 0 Apr 22 16:46 this-is-my-file

Now we have the new LV with double size.

Downsize the LV

Now that we have obtained some free space, imagine that we want to keep only 1 disk (e.g. /dev/loop0). We can downsize the filesystem (e.g. to 200Mb) and then downsize the LV.

This method requires unmounting the filesystem. So if you want to resize the root partition, you would need to use a live system or to pivot root to an unused filesystem, as described in this answer: https://unix.stackexchange.com/a/227318

First, unmount the filesystem and check it:

root@s1:/tmp# umount /tmp/mnt/disk0
root@s1:/tmp# e2fsck -ff /dev/mapper/test--vg-test--vol
e2fsck 1.44.1 (24-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/test--vg-test--vol: 12/127008 files (0.0% non-contiguous), 22444/516096 blocks

Then change the size of the filesystem to the desired size:

root@s1:/tmp# resize2fs /dev/test-vg/test-vol 200M
resize2fs 1.44.1 (24-Mar-2018)
Resizing the filesystem on /dev/test-vg/test-vol to 204800 (1k) blocks.
The filesystem on /dev/test-vg/test-vol is now 204800 (1k) blocks long.

And now, we’ll reduce the logical volume to the new size and re-check the filesystem:

root@s1:/tmp# lvreduce -L 200M /dev/test-vg/test-vol
WARNING: Reducing active logical volume to 200.00 MiB.
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce test-vg/test-vol? [y/n]: y
Size of logical volume test-vg/test-vol changed from 504.00 MiB (126 extents) to 200.00 MiB (50 extents).
Logical volume test-vg/test-vol successfully resized.
root@s1:/tmp# lvdisplay
--- Logical volume ---
LV Path /dev/test-vg/test-vol
LV Name test-vol
VG Name test-vg
LV UUID xGh4cd-R93l-UpAL-LGTV-qnxq-vvx2-obSubY
LV Write Access read/write
LV Creation host, time s1, 2020-04-22 16:26:48 +0200
LV Status available
# open 0
LV Size 200.00 MiB
Current LE 50
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:0

root@s1:/tmp# resize2fs /dev/test-vg/test-vol
resize2fs 1.44.1 (24-Mar-2018)
The filesystem is already 204800 (1k) blocks long. Nothing to do!

And now we are ready to use the disk with the new size:

root@s1:/tmp# mount /dev/test-vg/test-vol /tmp/mnt/disk0/
root@s1:/tmp# df -h
Filesystem Size Used Avail Use% Mounted on
(...)
/dev/mapper/test--vg-test--vol 190M 1.6M 176M 1% /tmp/mnt/disk0
root@s1:/tmp# cd /tmp/mnt/disk0/
root@s1:/tmp/mnt/disk0# ls -l
total 12
drwx------ 2 root root 12288 Apr 22 16:46 lost+found
-rw-r--r-- 1 root root 0 Apr 22 16:46 this-is-my-file

Removing a PV

Now we want to remove /dev/loop0 (which was our original disk) and keep the replacement (/dev/loop1).

We just need to move the data off /dev/loop0 and remove it from the VG, so that we can safely detach it from the system. First, we check the PVs:

root@s1:/tmp/mnt/disk0# pvs -o+pv_used
PV VG Fmt Attr PSize PFree Used
/dev/loop0 test-vg lvm2 a-- 252.00m 52.00m 200.00m
/dev/loop1 test-vg lvm2 a-- 252.00m 252.00m 0

We can see that /dev/loop0 is used, so we need to move its data to another PV:

root@s1:/tmp/mnt/disk0# pvmove /dev/loop0
/dev/loop0: Moved: 100.00%
root@s1:/tmp/mnt/disk0# pvs -o+pv_used
PV VG Fmt Attr PSize PFree Used
/dev/loop0 test-vg lvm2 a-- 252.00m 252.00m 0
/dev/loop1 test-vg lvm2 a-- 252.00m 52.00m 200.00m

Now /dev/loop0 is 100% free and we can remove it from the VG:

root@s1:/tmp/mnt/disk0# vgreduce test-vg /dev/loop0
Removed "/dev/loop0" from volume group "test-vg"
root@s1:/tmp/mnt/disk0# pvremove /dev/loop0
Labels on physical volume "/dev/loop0" successfully wiped.

Thin provisioning with LVM

Thin provisioning consists of giving the user the illusion of having a certain amount of storage space, while only storing the data that is actually used. It is similar to the qcow2 or vmdk disk formats for virtual machines: if you have 1Gb of data, the backend will only store that data, even if the volume is 10Gb. The real storage space is consumed as it is requested.

First, you need to reserve the effective storage space in the form of a “thin pool”. The next example reserves 200M (-L 200M) as a thin pool (-T) named thinpool in the VG test-vg:

root@s1:/tmp/mnt# lvcreate -L 200M -T test-vg/thinpool
Using default stripesize 64.00 KiB.
Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
Logical volume "thinpool" created.

The result is that we'll have a 200M volume that is marked as a thin pool.

Now we can create two thin-provisioned volumes of the same size:

root@s1:/tmp/mnt# lvcreate -V 200M -T test-vg/thinpool -n thin-vol-1
Using default stripesize 64.00 KiB.
Logical volume "thin-vol-1" created.
root@s1:/tmp/mnt# lvcreate -V 200M -T test-vg/thinpool -n thin-vol-2
Using default stripesize 64.00 KiB.
WARNING: Sum of all thin volume sizes (400.00 MiB) exceeds the size of thin pool test-vg/thinpool and the amount of free space in volume group (296.00 MiB).
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
Logical volume "thin-vol-2" created.
root@s1:/tmp/mnt# lvs -o name,lv_size,data_percent,thin_count
LV LSize Data% #Thins
thin-vol-1 200.00m 0.00
thin-vol-2 200.00m 0.00
thinpool 200.00m 0.00 2

The result is that we have 2 volumes of 200Mb each, while actually having only 200Mb of backing storage. Each of the volumes is empty so far; as we use them, the space will be consumed:

root@s1:/tmp/mnt# mkfs.ext4 /dev/test-vg/thin-vol-1
(...)
Writing superblocks and filesystem accounting information: done

root@s1:/tmp/mnt# mount /dev/test-vg/thin-vol-1 /tmp/mnt/disk0/
root@s1:/tmp/mnt# dd if=/dev/random of=/tmp/mnt/disk0/randfile bs=1K count=1024
dd: warning: partial read (94 bytes); suggest iflag=fullblock
0+1024 records in
0+1024 records out
48623 bytes (49 kB, 47 KiB) copied, 0.0701102 s, 694 kB/s
root@s1:/tmp/mnt# lvs -o name,lv_size,data_percent,thin_count
LV LSize Data% #Thins
thin-vol-1 200.00m 5.56
thin-vol-2 200.00m 0.00
thinpool 200.00m 5.56 2
root@s1:/tmp/mnt# df -h
Filesystem Size Used Avail Use% Mounted on
(...)
/dev/mapper/test--vg-thin--vol--1 190M 1.6M 175M 1% /tmp/mnt/disk0

In this example, writing the filesystem and the 49 kB file has consumed 5.56% of the pool. If we repeat the process for the other volume:

root@s1:/tmp/mnt# mkfs.ext4 /dev/test-vg/thin-vol-2
(...)
Writing superblocks and filesystem accounting information: done

root@s1:/tmp/mnt# mkdir -p /tmp/mnt/disk1
root@s1:/tmp/mnt# mount /dev/test-vg/thin-vol-2 /tmp/mnt/disk1/
root@s1:/tmp/mnt# dd if=/dev/random of=/tmp/mnt/disk1/randfile bs=1K count=1024
dd: warning: partial read (86 bytes); suggest iflag=fullblock
0+1024 records in
0+1024 records out
38821 bytes (39 kB, 38 KiB) copied, 0.0473561 s, 820 kB/s
root@s1:/tmp/mnt# lvs -o name,lv_size,data_percent,thin_count
LV LSize Data% #Thins
thin-vol-1 200.00m 5.56
thin-vol-2 200.00m 5.56
thinpool 200.00m 11.12 2
root@s1:/tmp/mnt# df -h
Filesystem Size Used Avail Use% Mounted on
(...)
/dev/mapper/test--vg-thin--vol--1 190M 1.6M 175M 1% /tmp/mnt/disk0
/dev/mapper/test--vg-thin--vol--2 190M 1.6M 175M 1% /tmp/mnt/disk1

We can see that the pool space is being consumed, but we still see 200Mb for each of the volumes.

Using RAID with LVM

Apart from using LVM as RAID-0 (i.e. striping an LV across multiple physical devices), it is possible to create other types of RAID using LVM.

Some of the most popular types of raids are RAID-1, RAID-10, and RAID-5. In the context of LVM, they intuitively mean:

  • RAID-1: mirror an LV across multiple PVs.
  • RAID-10: mirror an LV across multiple PVs, while also striping parts of the volume across different PVs.
  • RAID-5: distribute the LV across multiple PVs, using parity data to be able to keep working if a PV fails.

You can check more specific information on RAIDs in this link.

You can use other RAID utilities (such as on-board RAID controllers or software like mdadm), but by using LVM you will also benefit from the LVM features.

Mirroring an LV with LVM

The first use case is to get a volume that is mirrored across multiple PVs. We'll go back to the 2-PV setup:

root@s1:/tmp/mnt# pvs
PV VG Fmt Attr PSize PFree
/dev/loop0 test-vg lvm2 a-- 252.00m 252.00m
/dev/loop1 test-vg lvm2 a-- 252.00m 252.00m
root@s1:/tmp/mnt# vgs
VG #PV #LV #SN Attr VSize VFree
test-vg 2 0 0 wz--n- 504.00m 504.00m

And now we can create an LV in RAID1. We can also set the number of extra copies that we want for the volume (using the flag -m). In this case, we'll create the LV lv-mirror (-n lv-mirror), mirrored once (-m 1), with a size of 100Mb (-L 100M):

root@s1:/tmp/mnt# lvcreate --type raid1 -m 1 -L 100M -n lv-mirror test-vg
Logical volume "lv-mirror" created.
root@s1:/tmp/mnt# lvs -a -o name,copy_percent,devices
LV Cpy%Sync Devices
lv-mirror 100.00 lv-mirror_rimage_0(0),lv-mirror_rimage_1(0)
[lv-mirror_rimage_0] /dev/loop1(1)
[lv-mirror_rimage_1] /dev/loop0(1)
[lv-mirror_rmeta_0] /dev/loop1(0)
[lv-mirror_rmeta_1] /dev/loop0(0)

As you can see, the LV lv-mirror is built from the “devices” lv-mirror_rimage_0, lv-mirror_rmeta_0, lv-mirror_rimage_1 and lv-mirror_rmeta_1, and the output shows in which PV each part is located.

For the case of RAID1, you can convert it to and from linear volumes. This feature is not (yet) implemented for other types of raid such as RAID10 or RAID5.

You can change the number of mirror copies for an LV (even getting the volume to linear), by using command lvconvert:

root@s1:/tmp/mnt# lvconvert -m 0 /dev/test-vg/lv-mirror
Are you sure you want to convert raid1 LV test-vg/lv-mirror to type linear losing all resilience? [y/n]: y
Logical volume test-vg/lv-mirror successfully converted.
root@s1:/tmp/mnt# lvs -a -o name,copy_percent,devices
LV Cpy%Sync Devices
lv-mirror /dev/loop1(1)
root@s1:/tmp/mnt# pvs
PV VG Fmt Attr PSize PFree
/dev/loop0 test-vg lvm2 a-- 252.00m 252.00m
/dev/loop1 test-vg lvm2 a-- 252.00m 152.00m

In this case, we have converted the volume to linear (i.e. zero copies).

But you can also get mirror capabilities for a linear volume:

root@s1:/tmp/mnt# lvconvert -m 1 /dev/test-vg/lv-mirror
Are you sure you want to convert linear LV test-vg/lv-mirror to raid1 with 2 images enhancing resilience? [y/n]: y
Logical volume test-vg/lv-mirror successfully converted.
root@s1:/tmp/mnt# lvs -a -o name,copy_percent,devices
LV Cpy%Sync Devices
lv-mirror 100.00 lv-mirror_rimage_0(0),lv-mirror_rimage_1(0)
[lv-mirror_rimage_0] /dev/loop1(1)
[lv-mirror_rimage_1] /dev/loop0(1)
[lv-mirror_rmeta_0] /dev/loop1(0)
[lv-mirror_rmeta_1] /dev/loop0(0)

More on RAIDs

Using the command lvcreate it is also possible to create other types of RAID (e.g. RAID10, RAID5, RAID6, etc.). The only requirement is to have enough PVs for the type of RAID. But keep in mind that it is not possible to convert these other RAID types to or from linear LVs.
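
As a hedged sketch (assuming that the VG test-vg had at least three PVs, which is not the case in the two-disk lab of this post), a RAID-5 LV with two data stripes could be created like this:

root@s1:/tmp/mnt# lvcreate --type raid5 -i 2 -L 100M -n lv-raid5 test-vg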

You can find much more information on LVM and RAIDs in this link.

More on LVM

There are a lot of features and tweaks to adjust in LVM, but this post shows the basics (and a bit more) of dealing with LVM. You are advised to check the file /etc/lvm/lvm.conf and the lvm man page.
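
If you want to tear down the test lab used in this post, a possible cleanup sketch is the next one (it assumes that nothing from test-vg is still mounted; lvremove and vgremove will ask for confirmation before destroying anything):

root@s1:/tmp# lvremove test-vg
root@s1:/tmp# vgremove test-vg
root@s1:/tmp# pvremove /dev/loop0 /dev/loop1
root@s1:/tmp# losetup -d /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
root@s1:/tmp# rm /tmp/fake-disk-256.*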

How to install Horizon Dashboard in OpenStack Rocky and upgrade noVNC

Some time ago I wrote a series of posts on installing OpenStack Rocky: Part 1, Part 2 and Part 3. That installation was usable by the command line, but now I learned…

How to install Horizon Dashboard in OpenStack Rocky and upgrade noVNC

In this post, I start from the working installation of OpenStack Rocky in Ubuntu created in the previous posts in this series.

If you have used the configuration settings that I suggested, installing the Horizon Dashboard is very simple (I'm following the official documentation). You just need to install the dashboard package and its dependencies by issuing the next command:

$ apt install openstack-dashboard

And now you need to configure the dashboard settings in the file /etc/openstack-dashboard/local_settings.py. The basic configuration can be made with the next lines:

$ sed -i 's/OPENSTACK_HOST = "[^"]*"/OPENSTACK_HOST = "controller"/g' /etc/openstack-dashboard/local_settings.py
$ sed -i 's/^\(CACHES = {\)/SESSION_ENGINE = "django.contrib.sessions.backends.cache"\n\1/' /etc/openstack-dashboard/local_settings.py
$ sed -i "s/'LOCATION': '127\.0\.0\.1:11211',/'LOCATION': 'controller:11211',/" /etc/openstack-dashboard/local_settings.py
$ sed -i 's/^\(#OPENSTACK_API_VERSIONS = {\)/OPENSTACK_API_VERSIONS = {\n"identity": 3,\n"image": 2,\n"volume": 2,\n}\n\1/' /etc/openstack-dashboard/local_settings.py
$ sed -i 's/^OPENSTACK_KEYSTONE_DEFAULT_ROLE = "[^"]*"/OPENSTACK_KEYSTONE_DEFAULT_ROLE = "user"/' /etc/openstack-dashboard/local_settings.py
$ sed -i "/^#OPENSTACK_KEYSTONE_DEFAULT_DOMAIN =.*$/aOPENSTACK_KEYSTONE_DEFAULT_DOMAIN='Default'" /etc/openstack-dashboard/local_settings.py
$ sed -i 's/^TIME_ZONE = "UTC"/TIME_ZONE = "Europe\/Madrid"/' /etc/openstack-dashboard/local_settings.py

Each of these lines does the following:

  1. Setting the IP address of the controller (we set it in the /etc/hosts file).
  2. Setting the session engine so that sessions are stored in the cache (memcached).
  3. Setting the memcached server (we installed it in the controller).
  4. Set the version of the APIs that we installed.
  5. Setting the default role for the users that log into the portal to “user” instead of the default one (which is “member”).
  6. Set the default domain to “Default” to avoid asking users for their domain.
  7. Set the timezone of the site (you can check the code for your timezone here)

In our installation, we used the self-service option for the networks. If you changed that, please make sure that the variable OPENSTACK_NEUTRON_NETWORK matches your platform.
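
For reference only: if you had deployed the provider networks option instead of self-service, the official guide suggests disabling the layer-3 related services in that variable, roughly like this (keep the rest of the dictionary as shipped by the package):

OPENSTACK_NEUTRON_NETWORK = {
    ...
    'enable_router': False,
    'enable_quotas': False,
    'enable_distributed_router': False,
    'enable_ha_router': False,
    'enable_lb': False,
    'enable_firewall': False,
    'enable_vpn': False,
    'enable_fip_topology_check': False,
}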

And that’s all on the basic configuration of Horizon. Now we just need to check that file /etc/apache2/conf-available/openstack-dashboard.conf contains the next line:

WSGIApplicationGroup %{GLOBAL}

And finally, you need to restart apache:

$ service apache2 restart

Now you should be able to log in to the horizon portal by using the address https://controller.my.server/horizon in your web browser.

Please take into account that controller.my.server corresponds to the routable IP address of your server (if you followed the previous posts, it is 158.42.1.1).

Configuring VNC

One of the most common complaints about the Horizon dashboard is that VNC does not work. The effect is that you can reach the “console” tab of the instances, but you cannot see the console. And if you try to open the console in a new tab, you will probably get a “Not Found” web page.

The most common cause is that you have not configured the noVNC settings in the /etc/nova/nova.conf file on the compute nodes. So please check the [vnc] section in that file. It should look like the following:

[vnc]
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller.my.server:6080/vnc_auto.html

There are two keys in this configuration:

  • Make sure that controller.my.server corresponds to the routable IP address of your server (the same that you used in the web browser).
  • Make sure that file vnc_auto.html exists in folder /usr/share/novnc in the host where horizon is installed.

Upgrading noVNC

OpenStack Rocky comes with a very old version of noVNC (0.4), while at the moment of writing this post, noVNC has already released version 1.1.0 (see here).

Updating noVNC is as easy as getting the release tarball and putting it in /usr/share/novnc on the front-end:

$ cd /tmp
$ wget https://github.com/novnc/noVNC/archive/v1.1.0.tar.gz -O novnc-v1.1.0.tar.gz
$ tar xfz novnc-v1.1.0.tar.gz
$ mv /tmp/noVNC-1.1.0 /usr/share
$ cd /usr/share
$ mv novnc novnc-0.4
$ ln -s noVNC-1.1.0 novnc

Now we need to configure the new settings in the compute nodes, so we have to update the file /etc/nova/nova.conf on each compute node and modify the [vnc] section to match the new version of noVNC. The section will look like the next one:

[vnc]
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller.my.server:6080/vnc_lite.html

Finally, you will need to restart nova-compute on each compute node:

$ service nova-compute restart

and to restart apache2 in the server in which horizon is installed:

$ service apache2 restart

*WARNING* It is not guaranteed that the changes will be applied to the running instances, but they will be applied to the new ones.

How to use SSH with Proxies and Port Forwarding to get access to my Private Network

In my work, I have one single host with public access to ssh port (i.e. 22). The rest of my hosts have the ssh port filtered. Moreover, I have one cloud platform in which I can create virtual machines (VM) with private IP addresses (e.g. 10.0.0.1). I want to start one web server in a VM and have access to that web server from my machine at home.

The structure of hosts is in the next picture:

[Image: ssh-magic-2 — diagram of the host structure]

So this time I learned…

How to use SSH with Proxies and Port Forwarding to get access to my Private Network

First of all, I have to mention that this usage of proxy jumping is described as a real use case, and it is intended for legal purposes only.

TL;DR

$ ssh -L 10080:localhost:80 -J main-ui.myorg.com,cloud-ui.myorg.com root@web.internal

My first attempt to achieve this was to chain ssh calls (let's forget about port forwarding for the moment):

$ ssh -t user@main-ui.myorg.com ssh -t user@cloud-ui.myorg.com ssh root@web.internal

And this works, but has some problems…

  1. I am asked for a password at every stage (for main-ui, cloud-ui and web.internal). I want to use passwordless access (using my private key), but if I try to use ‘-i <private key>’ flag, the path to the files is always relative to the specific machine. So I would need to copy my private key to every machine (weird).
  2. I need to chain port forwarding by chaining individual port forwards.
  3. It will not work for scp.

I tried the ssh ProxyCommand option, but I found it easier to use the ProxyJump option (i.e. -J), which accepts a comma-separated list of jump hosts. So my attempt was:

$ ssh -J user@main-ui.myorg.com,user@cloud-ui.myorg.com root@web.internal

And this works, but I have not found any way to provide my private key on the command line except for the target host (web.internal).

Then I figured out how to configure this by using the ssh config file (i.e. $HOME/.ssh/config). I wrote the next entries in that file:

Host main-ui.myorg.com
IdentityFile ~/.ssh/key-for-main-ui.key

Host cloud-ui.myorg.com
IdentityFile ~/.ssh/key-for-cloud-ui.key
ProxyJump main-ui.myorg.com

Host *.internal
ProxyJump cloud-ui.myorg.com

Using that configuration, I can access web.internal by issuing the next simple command:

$ ssh -i keyfor-web.key root@web.internal

And each identity key file referenced in the .ssh/config file is relative to my laptop, so I do not need to distribute my private key (nor create artificial intermediate keys).

The config file also applies to the scp command, so I can issue commands like the next one:

$ scp -i keyfor-web.key  ./myfile root@web.internal:.

And ssh will make the magic!

Port forwarding

But remember that I also wanted to access my web server in web.internal. The easiest way is to forward port 80 to a port (e.g. 10080) on my laptop.

So using the previous configuration I can issue a command like the next one:

$ ssh -i keyfor-web.key -L 10080:localhost:80 root@web.internal

And now I will be able to open another terminal and issue the next command:

$ curl localhost:10080

and I will get the contents from web.internal.

Flag -L (in the previous ssh call) can be read as: connect using ssh to root@web.internal and forward any traffic received on port 10080 of the client host to localhost:80, as seen from web.internal.

If I had no ssh access to web.internal, I could issue the next alternative command:

$ ssh -L 10080:web.internal:80 user@cloud-ui.myorg.com

In this case, the -L flag reads as: connect using ssh to user@cloud-ui.myorg.com and forward any traffic received on port 10080 of the client host to web.internal:80.

Final words

Using these commands, I need to keep the ssh session open. If I want to forget about that session and run the port forwarding in the background, I can use the -f flag:

$ ssh -i keyfor-web.key -f -L 10080:localhost:80 root@web.internal 'while true; do sleep 60; done'
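
If you do not need to run any remote command at all, a slightly cleaner variant (a suggestion, not part of the original recipe) is to combine -f with -N, which tells ssh not to execute a remote command and just keep the forwarding:

$ ssh -i keyfor-web.key -fN -L 10080:localhost:80 root@web.internal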

How to install OpenStack Rocky – part 3 (final)

This is the third post on the installation of OpenStack Rocky in an Ubuntu-based deployment.

In this post, I am explaining how to install the compute nodes. Please check the previous posts How to install OpenStack Rocky – part 1 and How to install OpenStack Rocky – part 2 to learn on how to prepare the servers to host our OpenStack Rocky platform, and the controller.

Recap

In the first post, we prepared the network for the compute elements. The description is in the next figure:

[Figure: horsemen — network layout of the controller and compute nodes]

And in the second post, we installed the controller, where we configured neutron to be able to create on-demand networks, and we also configured nova to discover new compute elements (value discover_hosts_in_cells_interval in nova.conf).

Installation of the compute elements

In this section, we are installing nova-compute and neutron in the compute elements.

Remember… we prepared one interface connected to the provider network without an IP address (enp1s0f0), and another interface connected to the management network with an IP address and the ability to access the internet via NAT (enp1s0f1). We also disabled IPv6, and we are able to ping the controller using the management network (we configured the /etc/hosts file).

Dependencies

We need to install chrony:

$ apt install chrony

And now we have to update the file /etc/chrony/chrony.conf to use the controller as the NTP server and to disable the other NTP servers. The modified fragment of the file should look like the next one:

server controller iburst
#pool ntp.ubuntu.com iburst maxsources 4
#pool 0.ubuntu.pool.ntp.org iburst maxsources 1
#pool 1.ubuntu.pool.ntp.org iburst maxsources 1
#pool 2.ubuntu.pool.ntp.org iburst maxsources 2

At this point, we need to restart chrony and we’ll have all the prerequisites installed:

$ service chrony restart
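
As an optional check, we can verify that the node is actually synchronizing its clock from the controller (it should show up as the only source):

$ chronyc sources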

Activate the OpenStack packages

We have to add the OpenStack repository and install the basic command-line client:

apt install software-properties-common
add-apt-repository cloud-archive:rocky
apt update && apt -y dist-upgrade
apt install -y python-openstackclient

Installing nova

Now we’ll install the nova-compute elements (we are installing the KVM subsystem):

apt install -y nova-compute nova-compute-kvm

And now we are creating the configuration file /etc/nova/nova.conf with the next content:

[DEFAULT]
# BUG: log_dir = /var/log/nova
lock_path = /var/lock/nova
state_path = /var/lib/nova
transport_url = rabbit://openstack:RABBIT_PASS@controller
my_ip = 192.168.1.241
use_neutron = true
firewall_driver = nova.virt.firewall.NoopFirewallDriver
[api]
auth_strategy = keystone
[api_database]
connection = sqlite:////var/lib/nova/nova_api.sqlite
[cells]
enable = False
[cinder]
os_region_name = RegionOne
[database]
connection = sqlite:////var/lib/nova/nova.sqlite
[glance]
api_servers = http://controller:9292
[keystone_authtoken]
auth_url = http://controller:5000/v3
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = NOVA_PASS
[neutron]
url = http://controller:9696
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = NEUTRON_PASS
[oslo_concurrency]
lock_path = /var/lib/nova/tmp
[placement]
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
[vnc]
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://158.42.1.1:6080/vnc_auto.html

This is the basic content for our installation. All the variables will take the default values.

In this configuration file we MUST configure 2 important values:

  • my_ip, to match the management IP address of the compute element (in this case, it is fh01).
  • novncproxy_base_url, to match the URL that includes the public address of your horizon server. It is important not to use (e.g.) http://controller:6080 … because that is an internal IP address that will probably not be routable from the browser with which you will access horizon.

Once customized these values, we just need to restart nova:

service nova-compute restart

In the guides you will probably see that you are told to execute a command like this one (su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova) on the controller but, again, we configured the controller to discover the compute elements periodically. Executing the command is not needed, but it will not break our deployment either.
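
In any case, once nova-compute is running, you can check from the controller that the new compute node has been registered (assuming that you have sourced admin-openrc there):

# openstack compute service list --service nova-compute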

Installing neutron

Finally, we need to install and configure neutron, to be able to manage the OpenStack network. We just need to issue the next command:

apt install neutron-linuxbridge-agent

Once the components are installed, we have to update the file /etc/neutron/neutron.conf with the following content:

[DEFAULT]
lock_path = /var/lock/neutron
core_plugin = ml2
transport_url = rabbit://openstack:RABBIT_PASS@controller
auth_strategy = keystone
[agent]
root_helper = "sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf"
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = NEUTRON_PASS

And also update the file /etc/neutron/plugins/ml2/linuxbridge_agent.ini with the next content:

[linux_bridge]
physical_interface_mappings = provider:enp1s0f0
[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
enable_security_group = true
[vxlan]
enable_vxlan = true
local_ip = 192.168.1.241
l2_population = true

In this second file, we must configure the interface that is connected to the provider network (in my case, enp1s0f0), and set the proper IP address of the compute element (in my case, 192.168.1.241).

At this point we just need to restart nova-compute and the neutron agent:

# service nova-compute restart
# service neutron-linuxbridge-agent restart
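
As an optional check, you can verify from the controller (with admin-openrc sourced) that the linuxbridge agent of this compute node shows up and is alive:

# openstack network agent list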

And that’s all, folks.

What’s next

Now you should be able to execute commands using OpenStack, create virtual machines, etc. Now you can install horizon, create networks, etc. You can find information on using OpenStack in the official documentation.

In this blog, I am writing some other posts that cover related issues.

Moreover, I will try to write some posts on debugging the OpenStack network and understanding how the VM images and volumes are connected and used in the VMs.

How to install OpenStack Rocky – part 2

This is the second post on the installation of OpenStack Rocky in an Ubuntu based deployment.

In this post I am explaining how to install the essential OpenStack services in the controller. Please check the previous post How to install OpenStack Rocky – part 1 to learn how to prepare the servers to host our OpenStack Rocky platform.

Recap

In the last post, we prepared the network for both the node controller and the compute elements. The description is in the next figure:

[Figure: horsemen — network layout of the controller and compute nodes]

We also installed the prerequisites for the controller.

Installation of the controller

In this section we are installing keystone, glance, nova, neutron and the dashboard, in the controller.

Repositories

First, we need to install the OpenStack repositories:

# apt install software-properties-common
# add-apt-repository cloud-archive:rocky
# apt update && apt -y dist-upgrade
# apt install -y python-openstackclient

Keystone

To install keystone, first we need to create the database:

# mysql -u root -p <<< "CREATE DATABASE keystone;
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost' IDENTIFIED BY 'KEYSTONE_DBPASS';
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' IDENTIFIED BY 'KEYSTONE_DBPASS';"

And now, we’ll install keystone

# apt install -y keystone apache2 libapache2-mod-wsgi

We are creating the minimal keystone.conf configuration, according to the basic deployment:

# cat > /etc/keystone/keystone.conf  <<EOT
[DEFAULT]
log_dir = /var/log/keystone
[database]
connection = mysql+pymysql://keystone:KEYSTONE_DBPASS@controller/keystone
[extra_headers]
Distribution = Ubuntu
[token]
provider = fernet
EOT

Now we need to execute some commands to prepare the keystone service

# su keystone -s /bin/sh -c 'keystone-manage db_sync'
# keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
# keystone-manage credential_setup --keystone-user keystone --keystone-group keystone
# keystone-manage bootstrap --bootstrap-password "ADMIN_PASS" --bootstrap-admin-url http://controller:5000/v3/ --bootstrap-internal-url http://controller:5000/v3/ --bootstrap-public-url http://controller:5000/v3/ --bootstrap-region-id RegionOne

At this moment, we have to configure apache2, because it is the HTTP server that serves the keystone API.

# echo "ServerName controller" >> /etc/apache2/apache2.conf
# service apache2 restart

Finally, we'll prepare a file that contains the set of variables that will be used to access OpenStack. This file will be called admin-openrc and its content is the following:

# cat > admin-openrc <<EOT
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=ADMIN_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2
EOT

And now we are almost ready to operate keystone. Now we need to source that file:

# source admin-openrc

And now we are ready to issue commands in OpenStack. We'll test it by creating the project that will host the OpenStack services:

# openstack project create --domain default --description "Service Project" service
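
As an optional sanity check that authentication is working end to end, you can also request a token:

# openstack token issue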

Demo Project

In the OpenStack installation guide, a demo project is created. We are including the creation of this demo project, although it is not needed:

# openstack project create --domain default --description "Demo Project" myproject
# openstack user create --domain default --password "MYUSER_PASS" myuser
# openstack role create myrole
# openstack role add --project myproject --user myuser myrole

We are also creating the set of variables needed in the system to execute commands in OpenStack, using this demo user

# cat > demo-openrc << EOT
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=myproject
export OS_USERNAME=myuser
export OS_PASSWORD=MYUSER_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2
EOT

In case you want to use this demo user and project, you can either log in to the horizon portal (once it is installed in further steps) using the myuser/MYUSER_PASS credentials, or source the file demo-openrc to use the command line.

Glance

Glance is the OpenStack service dedicated to managing the VM images. Using these steps, we will make a basic installation where the images are stored in the filesystem of the controller.

First we need to create a database and user in mysql:

# mysql -u root -p <<< "CREATE DATABASE glance;
GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'localhost' IDENTIFIED BY 'GLANCE_DBPASS';
GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'%' IDENTIFIED BY 'GLANCE_DBPASS';"

Now we need to create the user dedicated to run the service and the endpoints in keystone, but first we’ll make sure that we have the proper env variables by sourcing the admin credentials:

# source admin-openrc
# openstack user create --domain default --password "GLANCE_PASS" glance
# openstack role add --project service --user glance admin
# openstack service create --name glance --description "OpenStack Image" image
# openstack endpoint create --region RegionOne image public http://controller:9292
# openstack endpoint create --region RegionOne image internal http://controller:9292
# openstack endpoint create --region RegionOne image admin http://controller:9292

Now we are ready to install the components:

# apt install -y glance

At the time of writing this post, there is an error in the glance package in the OpenStack repositories that makes (e.g.) the integration with cinder fail. The problem is that the file /etc/glance/rootwrap.conf and the folder /etc/glance/rootwrap.d are placed inside the folder /etc/glance/glance. So the patch simply consists of executing:

$ mv /etc/glance/glance/rootwrap.* /etc/glance/

And now we are creating the basic configuration files, needed to run glance as in the basic installation:

# cat > /etc/glance/glance-api.conf  << EOT
[database]
connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance
backend = sqlalchemy
[image_format]
disk_formats = ami,ari,aki,vhd,vhdx,vmdk,raw,qcow2,vdi,iso,ploop.root-tar
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS
[paste_deploy]
flavor = keystone
[glance_store]
stores = file,http
default_store = file
filesystem_store_datadir = /var/lib/glance/images/
EOT

And

# cat > /etc/glance/glance-registry.conf  << EOT
[database]
connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance
backend = sqlalchemy
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS
[paste_deploy]
flavor = keystone
EOT

The backend to store the files is the folder /var/lib/glance/images/ in the controller node. If you want to change this folder, please update the variable filesystem_store_datadir in the file glance-api.conf

We have created the files that result from following the official documentation, and now we are ready to start glance. First, we'll prepare the database:

# su -s /bin/sh -c "glance-manage db_sync" glance

And finally we will restart the services

# service glance-registry restart
# service glance-api restart

At this point, we are creating our first image (the common cirros image):

# wget -q http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img -O /tmp/cirros-0.4.0-x86_64-disk.img
# openstack image create "cirros" --file /tmp/cirros-0.4.0-x86_64-disk.img --disk-format qcow2 --container-format bare --public
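
We can verify that the image was uploaded correctly and is active:

# openstack image list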

Nova (i.e. compute)

Nova is the set of services dedicated to compute. As we are installing the controller, this server will not run any VM; instead, it will coordinate the creation of the VMs in the compute nodes.

First we need to create the databases and users in mysql:

# mysql -u root -p <<< "CREATE DATABASE nova_api;
CREATE DATABASE nova;
CREATE DATABASE nova_cell0;
CREATE DATABASE placement;
GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'localhost' IDENTIFIED BY 'PLACEMENT_DBPASS';
GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'%' IDENTIFIED BY 'PLACEMENT_DBPASS';"

And now we will create the users that will manage the services and the endpoints in keystone. But first we’ll make sure that we have the proper env variables by sourcing the admin credentials:

# source admin-openrc
# openstack user create --domain default --password "NOVA_PASS" nova
# openstack role add --project service --user nova admin
# openstack service create --name nova --description "OpenStack Compute" compute
# openstack endpoint create --region RegionOne compute public http://controller:8774/v2.1
# openstack endpoint create --region RegionOne compute internal http://controller:8774/v2.1
# openstack endpoint create --region RegionOne compute admin http://controller:8774/v2.1
# openstack user create --domain default --password "PLACEMENT_PASS" placement
# openstack role add --project service --user placement admin
# openstack service create --name placement --description "Placement API" placement
# openstack endpoint create --region RegionOne placement public http://controller:8778
# openstack endpoint create --region RegionOne placement internal http://controller:8778
# openstack endpoint create --region RegionOne placement admin http://controller:8778

Now we’ll install the services

# apt -y install nova-api nova-conductor nova-consoleauth nova-novncproxy nova-scheduler nova-placement-api

Once the services have been installed, we are creating the basic configuration file

# cat > /etc/nova/nova.conf  <<\EOT
[DEFAULT]
lock_path = /var/lock/nova
state_path = /var/lib/nova
transport_url = rabbit://openstack:RABBIT_PASS@controller
my_ip = 192.168.1.240
use_neutron = true
firewall_driver = nova.virt.firewall.NoopFirewallDriver
[api]
auth_strategy = keystone
[api_database]
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api
[cells]
enable = False
[database]
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
[glance]
api_servers = http://controller:9292
[keystone_authtoken]
auth_url = http://controller:5000/v3
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = NOVA_PASS
[neutron]
url = http://controller:9696
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = NEUTRON_PASS
service_metadata_proxy = true
metadata_proxy_shared_secret = METADATA_SECRET
[oslo_concurrency]
lock_path = /var/lib/nova/tmp
[placement]
os_region_name = openstack
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
[placement_database]
connection = mysql+pymysql://placement:PLACEMENT_DBPASS@controller/placement
[scheduler]
discover_hosts_in_cells_interval = 300
[vnc]
enabled = true
server_listen = $my_ip
server_proxyclient_address = $my_ip
EOT

In this file, the most important value to tweak is “my_ip”, which corresponds to the internal IP address of the controller.

Also remember that we are using simple passwords to make them easy to follow in this guide. If you need to make the deployment more secure, please set stronger passwords.
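
If you prefer random passwords, you can generate them as the official guide suggests:

# openssl rand -hex 10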

At this point we need to synchronize the databases and create the openstack cells

# su -s /bin/sh -c "nova-manage api_db sync" nova
# su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova
# su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova
# su -s /bin/sh -c "nova-manage db sync" nova

Finally we need to restart the compute services.

# service nova-api restart
# service nova-consoleauth restart
# service nova-scheduler restart
# service nova-conductor restart
# service nova-novncproxy restart

We have to take into account that this is the controller node, and will not host any virtual machine.

Neutron

Neutron is the networking service in OpenStack. In this post we are installing the “self-service networks” option, so that the users will be able to create their own isolated networks.

First, we create the database for the neutron service:

# mysql -u root -p <<< "CREATE DATABASE neutron;
GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'localhost' IDENTIFIED BY 'NEUTRON_DBPASS';
GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' IDENTIFIED BY 'NEUTRON_DBPASS'
"

Now we will create the openstack user and endpoints, but first we need to ensure that we set the env variables:

# source admin-openrc 
# openstack user create --domain default --password "NEUTRON_PASS" neutron
# openstack role add --project service --user neutron admin
# openstack service create --name neutron --description "OpenStack Networking" network
# openstack endpoint create --region RegionOne network public http://controller:9696
# openstack endpoint create --region RegionOne network internal http://controller:9696
# openstack endpoint create --region RegionOne network admin http://controller:9696

Now we are ready to install the packages related to neutron:

# apt install -y neutron-server neutron-plugin-ml2 neutron-linuxbridge-agent neutron-l3-agent neutron-dhcp-agent neutron-metadata-agent

And now we need to create the configuration files for neutron. In first place, the general file /etc/neutron/neutron.conf

# cat > /etc/neutron/neutron.conf <<\EOT
[DEFAULT]
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
transport_url = rabbit://openstack:RABBIT_PASS@controller
auth_strategy = keystone
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true
[agent]
root_helper = "sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf"
[database]
connection = mysql+pymysql://neutron:NEUTRON_DBPASS@controller/neutron
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = NEUTRON_PASS
[nova]
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = nova
password = NOVA_PASS
[oslo_concurrency]
lock_path = /var/lock/neutron
EOT

Now the file /etc/neutron/plugins/ml2/ml2_conf.ini, that will be used to instruct neutron how to create the LANs:

# cat > /etc/neutron/plugins/ml2/ml2_conf.ini <<\EOT
[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = linuxbridge,l2population
extension_drivers = port_security
[ml2_type_flat]
flat_networks = provider
[ml2_type_vxlan]
vni_ranges = 1:1000
[securitygroup]
enable_ipset = true
EOT

Now the file /etc/neutron/plugins/ml2/linuxbridge_agent.ini, because we are using linux bridges in this setup:

# cat > /etc/neutron/plugins/ml2/linuxbridge_agent.ini <<\EOT
[linux_bridge]
physical_interface_mappings = provider:eno3
[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
enable_security_group = true
[vxlan]
enable_vxlan = true
local_ip = 192.168.1.240
l2_population = true
EOT

In this file, it is important to tweak the value “eno3” in physical_interface_mappings, so that it matches the physical interface that has access to the provider (i.e. public) network. It is also essential to set the proper value for “local_ip”, which is the IP address of the internal interface used to communicate with the compute hosts.

Now we have to create the files corresponding to the l3_agent and the dhcp agent:

# cat > /etc/neutron/l3_agent.ini <<EOT
[DEFAULT]
interface_driver = linuxbridge
EOT
# cat > /etc/neutron/dhcp_agent.ini <<EOT
[DEFAULT]
interface_driver = linuxbridge
dhcp_driver = neutron.agent.linux.dhcp.Dnsmasq
enable_isolated_metadata = true
dnsmasq_dns_servers = 8.8.8.8
EOT

Finally we need to create the file /etc/neutron/metadata_agent.ini

# cat > /etc/neutron/metadata_agent.ini <<EOT
[DEFAULT]
nova_metadata_host = controller
metadata_proxy_shared_secret = METADATA_SECRET
EOT

Once the configuration files have been created, we are synchronizing the database and restarting the services related to neutron.

# su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron
# service nova-api restart
# service neutron-server restart
# service neutron-linuxbridge-agent restart
# service neutron-dhcp-agent restart
# service neutron-metadata-agent restart
# service neutron-l3-agent restart
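
Once the services are up, an optional check is to list the agents and verify that the linuxbridge, DHCP, metadata and L3 agents are alive:

# openstack network agent list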

And that’s all.

At this point we have the controller node installed according to the OpenStack documentation. It is possible to issue any command, but it will not be possible to start any VM, because we have not installed any compute node, yet.