How to use SSH with Proxies and Port Forwarding to get access to my Private Network

At work, I have a single host whose SSH port (i.e. 22) is publicly accessible. The rest of my hosts have the SSH port filtered. Moreover, I have a cloud platform on which I can create virtual machines (VMs) with private IP addresses (e.g. 10.0.0.1). I want to start a web server in one of those VMs and have access to that web server from my machine at home.

The structure of hosts is in the next picture:

ssh-magic-2

So this time I learned…

How to use SSH with Proxies and Port Forwarding to get access to my Private Network

First of all, I have to mention that this usage of proxy jumping is described as a real use case, and it is intended for legitimate purposes only.

TL;DR

$ ssh -L 10080:localhost:80 -J main-ui.myorg.com,cloud-ui.myorg.com root@web.internal

My first attempt to achieve this was to chain ssh calls (let's forget about port forwarding for the moment):

$ ssh -t user@main-ui.myorg.com ssh -t user@cloud-ui.myorg.com ssh root@web.internal

And this works, but has some problems…

  1. I am asked for a password at every stage (for main-ui, cloud-ui and web.internal). I want passwordless access using my private key, but if I try to use the ‘-i <private key>’ flag, the path to the key file is always relative to the specific machine. So I would need to copy my private key to every machine (weird).
  2. I would need to chain port forwarding by chaining individual port forwards.
  3. It will not work for scp.

I tried the SSH ProxyCommand option, but I found it easier to use the ProxyJump option (i.e. -J). So my attempt was:

$ ssh -J user@main-ui.myorg.com,user@cloud-ui.myorg.com root@web.internal

And this works, but I have not found any way to provide my private keys on the command line, except for the target host (web.internal).

Then I figured out how to configure this by using the ssh config file (i.e. $HOME/.ssh/config). I wrote the next entries in that file:

Host main-ui.myorg.com
IdentityFile ~/.ssh/key-for-main-ui.key

Host cloud-ui.myorg.com
IdentityFile ~/.ssh/key-for-cloud-ui.key
ProxyJump main-ui.myorg.com

Host *.internal
ProxyJump cloud-ui.myorg.com

Using that configuration, I can access web.internal by issuing the next simple command:

$ ssh -i keyfor-web.key root@web.internal

And each identity key file referenced in the .ssh/config file is relative to my laptop, so I do not need to distribute my private keys (nor create artificial intermediate keys).
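As a side note, the key for the internal hosts can also go into the config file, so that the -i flag is not needed at all. A minimal sketch, extending the previous Host *.internal entry with the keyfor-web.key file used in the command above:

Host *.internal
ProxyJump cloud-ui.myorg.com
IdentityFile ~/.ssh/keyfor-web.key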

The config file is also used by the scp command, so I can issue commands like the next one:

$ scp -i keyfor-web.key  ./myfile root@web.internal:.

And ssh will make the magic!

Port forwarding

But remember that I also wanted to access my web server in web.internal. The easiest way is to forward its port 80 to a port (e.g. 10080) on my laptop.

So using the previous configuration I can issue a command like the next one:

$ ssh -i keyfor-web.key -L 10080:localhost:80 root@web.internal

And now I will be able to open another terminal and issue the next command:

$ curl localhost:10080

and I will get the contents from web.internal.

The -L flag can be interpreted as follows (using the previous ssh call): connect using ssh to root@web.internal and forward to ‘localhost:80’ (as resolved from web.internal) any traffic received on port 10080 of the client host.

If I had no ssh access to web.internal, I could issue the next alternative command:

$ ssh -L 10080:web.internal:80 user@cloud-ui.myorg.com

In this case, the -L flag is interpreted as: connect using ssh to user@cloud-ui.myorg.com and forward to ‘web.internal:80’ any traffic received on port 10080 of the client host.

Final words

Using these commands, I need to keep the ssh session open. If I want to forget about that session and run the port forwarding in the background, I can use the -f flag:

$ ssh -i keyfor-web.key -f -L 10080:localhost:80 root@web.internal 'while true; do sleep 60; done'
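An alternative, if no remote command is needed, is the -N flag, which asks ssh not to execute anything remotely and just keep the forwarding alive; a sketch of the equivalent call:

$ ssh -i keyfor-web.key -f -N -L 10080:localhost:80 root@web.internal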

How to install OpenStack Rocky – part 2

This is the second post on the installation of OpenStack Rocky in an Ubuntu based deployment.

In this post I am explaining how to install the essential OpenStack services in the controller. Please check the previous post, How to install OpenStack Rocky – part 1, to learn how to prepare the servers to host our OpenStack Rocky platform.

Recap

In the last post, we prepared the network for both the controller node and the compute nodes. The layout is shown in the next figure:

horsemen

We also installed the prerequisites for the controller.

Installation of the controller

In this section we are installing keystone, glance, nova, neutron and the dashboard, in the controller.

Repositories

First of all, we need to install the OpenStack repositories:

# apt install software-properties-common
# add-apt-repository cloud-archive:rocky
# apt update && apt -y dist-upgrade
# apt install -y python-openstackclient

Keystone

To install keystone, first we need to create the database:

# mysql -u root -p <<< "CREATE DATABASE keystone;
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost' IDENTIFIED BY 'KEYSTONE_DBPASS';
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' IDENTIFIED BY 'KEYSTONE_DBPASS';"

And now, we’ll install keystone

# apt install -y keystone apache2 libapache2-mod-wsgi

We are creating the minimal keystone.conf configuration, according to the basic deployment:

# cat > /etc/keystone/keystone.conf  <<EOT
[DEFAULT]
log_dir = /var/log/keystone
[database]
connection = mysql+pymysql://keystone:KEYSTONE_DBPASS@controller/keystone
[extra_headers]
Distribution = Ubuntu
[token]
provider = fernet"
fi
EOT

Now we need to execute some commands to prepare the keystone service

# su keystone -s /bin/sh -c 'keystone-manage db_sync'
# keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
# keystone-manage credential_setup --keystone-user keystone --keystone-group keystone
# keystone-manage bootstrap --bootstrap-password "ADMIN_PASS" --bootstrap-admin-url http://controller:5000/v3/ --bootstrap-internal-url http://controller:5000/v3/ --bootstrap-public-url http://controller:5000/v3/ --bootstrap-region-id RegionOne

At this point, we have to configure apache2, because it is used as the HTTP server for the keystone service.

# echo "ServerName controller" >> /etc/apache2/apache2.conf
# service apache2 restart

Finally, we'll prepare a file that contains the set of variables that will be used to access OpenStack. This file will be called admin-openrc and its content is the next one:

# cat > admin-openrc <<EOT
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=ADMIN_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2
EOT

And now we are almost ready to operate keystone: we just need to source that file:

# source admin-openrc

And now we are ready to issue OpenStack commands. We will test it by creating the project that will host the OpenStack services:

# openstack project create --domain default --description "Service Project" service

Demo Project

In the OpenStack installation guide, a demo project is created. We are including the creation of this demo project, although it is not needed:

# openstack project create --domain default --description "Demo Project" myproject
# openstack user create --domain default --password "MYUSER_PASS" myuser
# openstack role create myrole
# openstack role add --project myproject --user myuser myrole

We are also creating the set of variables needed to execute OpenStack commands using this demo user:

# cat > demo-openrc << EOT
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=myproject
export OS_USERNAME=myuser
export OS_PASSWORD=MYUSER_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2"
EOT

If you want to use this demo user and project, you can either log in to the horizon portal (once it is installed in further steps) using the myuser/MYUSER_PASS credentials, or source the file demo-openrc to use the command line.

Glance

Glance is the OpenStack service dedicated to managing VM images. Using these steps, we will make a basic installation in which the images are stored in the filesystem of the controller.

First we need to create a database and user in mysql:

# mysql -u root -p <<< "CREATE DATABASE glance;
GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'localhost' IDENTIFIED BY 'GLANCE_DBPASS';
GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'%' IDENTIFIED BY 'GLANCE_DBPASS';"

Now we need to create the user dedicated to running the service, as well as the endpoints in keystone. But first we'll make sure that we have the proper environment variables by sourcing the admin credentials:

# source admin-openrc
# openstack user create --domain default --password "GLANCE_PASS" glance
# openstack role add --project service --user glance admin
# openstack service create --name glance --description "OpenStack Image" image
# openstack endpoint create --region RegionOne image public http://controller:9292
# openstack endpoint create --region RegionOne image internal http://controller:9292
# openstack endpoint create --region RegionOne image admin http://controller:9292

Now we are ready to install the components:

# apt install -y glance

At the time of writing this post, there is an error in the glance package in the OpenStack repositories that makes the integration with (e.g.) cinder fail. The problem is that the file /etc/glance/rootwrap.conf and the folder /etc/glance/rootwrap.d are placed inside the folder /etc/glance/glance. So the fix simply consists of executing:

# mv /etc/glance/glance/rootwrap.* /etc/glance/


And now we are creating the basic configuration files, needed to run glance as in the basic installation:

# cat > /etc/glance/glance-api.conf  << EOT
[database]
connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance
backend = sqlalchemy
[image_format]
disk_formats = ami,ari,aki,vhd,vhdx,vmdk,raw,qcow2,vdi,iso,ploop.root-tar
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS
[paste_deploy]
flavor = keystone
[glance_store]
stores = file,http
default_store = file
filesystem_store_datadir = /var/lib/glance/images/
EOT

And the registry configuration:

# cat > /etc/glance/glance-registry.conf  << EOT
[database]
connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance
backend = sqlalchemy
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS
[paste_deploy]
flavor = keystone
EOT

The backend to store the files is the folder /var/lib/glance/images/ in the controller node. If you want to change this folder, please update the variable filesystem_store_datadir in the file glance-api.conf

These are the configuration files that result from following the official documentation, and now we are ready to start glance. First we'll prepare the database:

# su -s /bin/sh -c "glance-manage db_sync" glance

And finally we will restart the services

# service glance-registry restart
# service glance-api restart

At this point, we are creating our first image (the common cirros image):

# wget -q http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img -O /tmp/cirros-0.4.0-x86_64-disk.img
# openstack image create "cirros" --file /tmp/cirros-0.4.0-x86_64-disk.img --disk-format qcow2 --container-format bare --public

Nova (i.e. compute)

Nova is the set of services dedicated to compute. As we are installing the controller, this server will not run any VM. Instead, it will coordinate the creation of the VMs in the compute nodes.

First we need to create the databases and users in mysql:

# mysql -u root -p <<< "CREATE DATABASE nova_api;
CREATE DATABASE nova;
CREATE DATABASE nova_cell0;
CREATE DATABASE placement;
GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'localhost' IDENTIFIED BY 'PLACEMENT_DBPASS';
GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'%' IDENTIFIED BY 'PLACEMENT_DBPASS';"

And now we will create the users that will manage the services, as well as the endpoints in keystone. But first we'll make sure that we have the proper environment variables by sourcing the admin credentials:

# source admin-openrc
# openstack user create --domain default --password "NOVA_PASS" nova
# openstack role add --project service --user nova admin
# openstack service create --name nova --description "OpenStack Compute" compute
# openstack endpoint create --region RegionOne compute public http://controller:8774/v2.1
# openstack endpoint create --region RegionOne compute internal http://controller:8774/v2.1
# openstack endpoint create --region RegionOne compute admin http://controller:8774/v2.1
# openstack user create --domain default --password "PLACEMENT_PASS" placement
# openstack role add --project service --user placement admin
# openstack service create --name placement --description "Placement API" placement
# openstack endpoint create --region RegionOne placement public http://controller:8778
# openstack endpoint create --region RegionOne placement internal http://controller:8778
# openstack endpoint create --region RegionOne placement admin http://controller:8778

Now we’ll install the services

# apt -y install nova-api nova-conductor nova-consoleauth nova-novncproxy nova-scheduler nova-placement-api

Once the services have been installed, we are creating the basic configuration file

# cat > /etc/nova/nova.conf  <<\EOT
[DEFAULT]
lock_path = /var/lock/nova
state_path = /var/lib/nova
transport_url = rabbit://openstack:RABBIT_PASS@controller
my_ip = 192.168.1.240
use_neutron = true
firewall_driver = nova.virt.firewall.NoopFirewallDriver
[api]
auth_strategy = keystone
[api_database]
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api
[cells]
enable = False
[database]
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
[glance]
api_servers = http://controller:9292
[keystone_authtoken]
auth_url = http://controller:5000/v3
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = NOVA_PASS
[neutron]
url = http://controller:9696
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = NEUTRON_PASS
service_metadata_proxy = true
metadata_proxy_shared_secret = METADATA_SECRET
[oslo_concurrency]
lock_path = /var/lib/nova/tmp
[placement]
os_region_name = openstack
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
[placement_database]
connection = mysql+pymysql://placement:PLACEMENT_DBPASS@controller/placement
[scheduler]
discover_hosts_in_cells_interval = 300
[vnc]
enabled = true
server_listen = $my_ip
server_proxyclient_address = $my_ip
EOT

In this file, the most important value to tweak is “my_ip”, which corresponds to the internal (management) IP address of the controller.

Also remember that we are using simple passwords to make them easy to track. If you need to make the deployment more secure, please set proper secure passwords.

At this point we need to synchronize the databases and create the OpenStack cells:

# su -s /bin/sh -c "nova-manage api_db sync" nova
# su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova
# su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova
# su -s /bin/sh -c "nova-manage db sync" nova

Finally we need to restart the compute services.

# service nova-api restart
# service nova-consoleauth restart
# service nova-scheduler restart
# service nova-conductor restart
# service nova-novncproxy restart

We have to take into account that this is the controller node, and it will not host any virtual machines.

Neutron

Neutron is the networking service in OpenStack. In this post we are installing the “self-service networks” option, so that users will be able to create their own isolated networks.

First, we are creating the database for the neutron service:

# mysql -u root -p <<< "CREATE DATABASE neutron;
GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'localhost' IDENTIFIED BY 'NEUTRON_DBPASS';
GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' IDENTIFIED BY 'NEUTRON_DBPASS'
"

Now we will create the OpenStack user and endpoints, but first we need to ensure that we have set the environment variables:

# source admin-openrc 
# openstack user create --domain default --password "NEUTRON_PASS" neutron
# openstack role add --project service --user neutron admin
# openstack service create --name neutron --description "OpenStack Networking" network
# openstack endpoint create --region RegionOne network public http://controller:9696
# openstack endpoint create --region RegionOne network internal http://controller:9696
# openstack endpoint create --region RegionOne network admin http://controller:9696

Now we are ready to install the packages related to neutron:

# apt install -y neutron-server neutron-plugin-ml2 neutron-linuxbridge-agent neutron-l3-agent neutron-dhcp-agent neutron-metadata-agent

And now we need to create the configuration files for neutron. First, the general file /etc/neutron/neutron.conf:

# cat > /etc/neutron/neutron.conf <<\EOT
[DEFAULT]
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
transport_url = rabbit://openstack:RABBIT_PASS@controller
auth_strategy = keystone
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true
[agent]
root_helper = "sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf"
[database]
connection = mysql+pymysql://neutron:NEUTRON_DBPASS@controller/neutron
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = NEUTRON_PASS
[nova]
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = nova
password = NOVA_PASS
[oslo_concurrency]
lock_path = /var/lock/neutron
EOT

Now the file /etc/neutron/plugins/ml2/ml2_conf.ini, which will be used to instruct neutron on how to create the LANs:

# cat > /etc/neutron/plugins/ml2/ml2_conf.ini <<\EOT
[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = linuxbridge,l2population
extension_drivers = port_security
[ml2_type_flat]
flat_networks = provider
[ml2_type_vxlan]
vni_ranges = 1:1000
[securitygroup]
enable_ipset = true
EOT

Now the file /etc/neutron/plugins/ml2/linuxbridge_agent.ini, because we are using linux bridges in this setup:

# cat > /etc/neutron/plugins/ml2/linuxbridge_agent.ini <<\EOT
[linux_bridge]
physical_interface_mappings = provider:eno3
[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
enable_security_group = true
[vxlan]
enable_vxlan = true
local_ip = 192.168.1.240
l2_population = true
EOT

In this file, it is important to tweak the value “eno3” in physical_interface_mappings, to match the physical interface that has access to the provider network (i.e. the public network). It is also essential to set the proper value for “local_ip“, which is the IP address of the internal interface used to communicate with the compute hosts.

Now we have to create the files corresponding to the L3 agent and the DHCP agent:

# cat > /etc/neutron/l3_agent.ini <<EOT
[DEFAULT]
interface_driver = linuxbridge
EOT
# cat > /etc/neutron/dhcp_agent.ini <<EOT
[DEFAULT]
interface_driver = linuxbridge
dhcp_driver = neutron.agent.linux.dhcp.Dnsmasq
enable_isolated_metadata = true
dnsmasq_dns_servers = 8.8.8.8
EOT

Finally we need to create the file /etc/neutron/metadata_agent.ini

# cat > /etc/neutron/metadata_agent.ini <<EOT
[DEFAULT]
nova_metadata_host = controller
metadata_proxy_shared_secret = METADATA_SECRET
EOT

Once the configuration files have been created, we are synchronizing the database and restarting the services related to neutron.

# su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron
# service nova-api restart
# service neutron-server restart
# service neutron-linuxbridge-agent restart
# service neutron-dhcp-agent restart
# service neutron-metadata-agent restart
# service neutron-l3-agent restart
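As a quick verification, we can list the network agents; after a few seconds, the linuxbridge, DHCP, metadata and L3 agents should be reported as alive:

# openstack network agent list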

And that’s all.

At this point we have the controller node installed according to the OpenStack documentation. It is possible to issue any command, but it will not be possible to start any VM, because we have not installed any compute node yet.


How to dynamically create on-demand services to respond to incoming TCP connections

Some time ago I had the problem of dynamically starting virtual machines when an incoming connection was received on a port. The exact problem was having a VM that was powered off, starting it whenever an incoming ssh connection was received, and then forwarding the network traffic to that VM to serve the ssh request. This way, I could have a server in a cloud provider (e.g. Amazon) and not spend money while I was not using it.

This problem has been named “the sleeping beauty”, because of the tale: it is like having a sleeping virtual infrastructure (i.e. the sleeping beauty) that is awakened when an incoming connection (i.e. the kiss) is received from the user (i.e. the prince).

Now I have figured out how to solve that problem, and that is why this time I learned

How to dynamically create on-demand services to respond to incoming TCP connections

The way to solve it is very straightforward, as it is fully based on the socat application.

socat is “a relay for bidirectional data transfer between two independent data channels”, and it can be used to forward the traffic received on a port to another IP:PORT pair.

A simple example is:

$ socat tcp-listen:10000 tcp:localhost:22 &

And now we can SSH to localhost in the following way:

$ ssh localhost -p 10000

The interesting thing is that socat is able to execute a command upon receiving a connection (using the address types EXEC or SYSTEM as the destination of the relay). But the most important thing is that socat establishes the communication using stdin and stdout.

So it is possible to make this funny thing:

$ socat tcp-listen:10000 SYSTEM:'echo "hello world"' &
[1] 11136
$ wget -q -O- http://localhost:10000
hello world
$
[1]+ Done socat tcp-listen:10000 SYSTEM:'echo "hello world"'

Now that we know that the communication is established using stdin and stdout, we can somehow abuse socat and try this even funnier thing:

$ socat tcp-listen:10000 SYSTEM:'echo "$(date)" >> /tmp/sshtrack.log; socat - "TCP:localhost:22"' &
[1] 27421
$ ssh localhost -p 10000 cat /tmp/sshtrack.log
mié feb 27 14:36:45 CET 2019
$
[1]+ Done socat tcp-listen:10000 SYSTEM:'echo "$(date)" >> /tmp/sshtrack.log; socat - "
TCP:localhost:22"'

The effect is that we can execute commands and redirect the connection to an arbitrary IP:PORT.

Now it is easy to figure out how to dynamically spawn servers to serve incoming TCP requests. An example that spawns a one-shot web server on port 8080 to serve requests received on port 10000 is the next one:

$ socat tcp-listen:10000 SYSTEM:'(echo "hello world" | nc -l -p 8080 -q 1 > /dev/null &) ; socat - "TCP:localhost:8080"' &
[1] 31586
$ wget -q -O- http://localhost:10000
hello world
$
[1]+ Done socat tcp-listen:10000 SYSTEM:'(echo "hello world" | nc -l -p 8080 -q 1 > /dev/null &) ; socat - "TCP:localhost:8080"'

And now you can customize your scripts to create the effective servers on demand.

The sleeping beauty application

I have used these proofs of concept to create the sleeping-beauty application. It is open source, and you can get it on GitHub.

The sleeping beauty is a system that helps to implement serverless infrastructures: you have the servers asleep (or not even created), and they are awakened (or created) as they are needed. Later, they go back to sleep (or they are disposed of).

In sleeping-beauty, you can configure services that listen on a port, together with the commands that should be used to start, check the status of, or stop the effective services. Moreover, it implements an idle-detection mechanism that checks whether the effective service is idle and, if it has been idle for a period of time, stops it to save resources.

Example: in the use case described above, the command used to start the service would contact Amazon AWS and start a VM; the command to stop the service would contact Amazon AWS to stop the VM; and the command to check whether the service is idle would ssh into the VM and execute the command ‘who’.
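As a hand-made illustration of the idea (this is not the actual sleeping-beauty code), the next sketch uses a hypothetical start-vm.sh script and a hypothetical backend host vm.internal: each incoming connection on port 2222 triggers the script and is then relayed to the ssh port of the VM:

$ socat tcp-listen:2222,fork,reuseaddr SYSTEM:'./start-vm.sh >> /tmp/start-vm.log 2>&1; socat - "TCP:vm.internal:22"' &   # start-vm.sh and vm.internal are placeholders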

How to install OpenStack Rocky – part 1

This is the first post of a series in which I am describing the installation of an OpenStack site using the latest distribution at the time of writing: Rocky.

My project is quite ambitious: I have 2 virtualization nodes (each with a different GPU), 10GbE, a lot of memory and disk, and I want to offer the GPUs to the VMs. The front-end is a 24-core server with 32 GB of RAM and 6 TB of disk, with 4 network ports (2x10GbE + 2x1GbE), and it will also act as the block device server.

We’ll be using Ubuntu 18.04 LTS for all the nodes, and I’ll try to follow the official documentation. But I will try to be very straightforward in the configuration… I want to make it work, and I will try to explain how things work instead of tuning the configuration.

How to install OpenStack Rocky – part 1

My setup for the OpenStack installation is the next one:

horsemen

In the figure I have annotated the most relevant data to identify the servers: the IP addresses of each interface, which server is the volume server, and which virtualization nodes will share their GPUs.

In the end, the server horsemen will host the next services: keystone, glance, cinder, neutron and horizon. On the other side, fh01 and fh02 will host the compute and neutron-related services.

In each of the servers we need a network interface (eno1, enp1s0f1 and enp1f0f1) which is intended for administration purposes (i.e. the network 192.168.1.1/24). That network has a gateway (192.168.1.220) that enables access to the internet via NAT. From now on, we’ll call these interfaces the “private interfaces“.

We need an additional interface that is connected to the provider network (i.e. to the internet). That network will hold the publicly routable IP addresses. In my case, I have the network 158.42.1.1/24, that is publicly routable. It is a flat network with its own network services (e.g. gateway, nameservers, dhcp servers, etc.). From now on, we’ll call these interfaces the “public interfaces“.

One note on the “eno4” interface in horsemen: I am using this interface for accessing horizon. If you do not have a spare interface, you can use interface aliasing or provide the IP address in the ifupdown “up” script.

An extra note on the “eno2” interface in horsemen: it is an extra interface in the node. It will be left unused during this installation, but it will eventually be configured to work in bonding mode with “eno1”.

IMPORTANT: in the whole series of posts, I am using the passwords as they appear: RABBIT_PASS, NOVA_PASS, NOVADB_PASS, etc. You should change them according to a secure password policy; they are set as-is only to make the installation easier to follow. Anyway, most of them will be fine if you have an isolated network and the services listen only on the management network (e.g. mysql will only be configured to listen on the management interface).

Some words on the OpenStack network (concepts)

The basic installation of OpenStack considers two networks: the provider network and the management network. The provider network means “the network that is attached to the provider”, i.e. the network where the VMs can have publicly routable IP addresses. On the other side, the management network is a private network that is (probably) isolated from the provider one. The computers in that network have private IP addresses (e.g. 192.168.0.0/16, 10.0.0.0/8, 172.16.0.0/12).

The basic deployment of OpenStack assumes that the controller node does not need a routable IP address; instead, it can be accessed by the admin through the management network. That is why the “eno3” interface does not have an IP address.

In the OpenStack architecture, horizon is a separate piece, so horizon is the one that will need a routable IP address. As I want to install horizon in the controller as well, I need a routable IP address there, and that is why I put a publicly routable IP address on “eno4” (158.42.1.1).

In my case, I had a spare network interface (eno4), but if you do not have one, you can create a bridge, add your “interface connected to the provider network” (i.e. “eno3”) to that bridge, and then add a publicly routable IP address to the bridge.

IMPORTANT: this is not part of my installation, although it may be part of your installation.

brctl addbr br-public
brctl addif br-public eno3
ip link set dev br-public up
ip addr add 158.42.1.1/16 dev br-public

Configuring the network

One of the first things that we need to set up is to configure the network for the different servers.

Ubuntu 18.04 has moved to netplan but, at the time of writing this text, I have not found any mechanism in netplan to bring an interface UP without providing an IP address for it. Moreover, when trying to use ifupdown, netplan is not totally disabled and interferes with options such as dns-nameservers for the static addresses. In the end, I needed to install ifupdown and mix the configuration, using both netplan and ifupdown.

It is very important to disable IPv6 on all of the servers because, if you do not, you will probably face problems when using the public IP addresses. You can read more in this link.

To disable IPv6, we need to execute the following lines in all the servers (as root):

# sysctl -w net.ipv6.conf.all.disable_ipv6=1
# sysctl -w net.ipv6.conf.default.disable_ipv6=1
# cat >> /etc/default/grub << EOT
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1"
GRUB_CMDLINE_LINUX="ipv6.disable=1"
EOT
# update-grub

This disables IPv6 for the current session, and persists the change by also disabling it at boot time. If you have customized your grub configuration, you should double-check the options that we are setting.
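A quick way to verify that IPv6 is effectively disabled (both values should be 1):

# sysctl net.ipv6.conf.all.disable_ipv6
# sysctl net.ipv6.conf.default.disable_ipv6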

Configuring the network in “horsemen”

You need to install ifupdown to be able to bring up an interface that has no IP address (the one connected to the provider network), which will later be used by the neutron-related services:

# apt update && apt install -y ifupdown

Edit the file /etc/network/interfaces and adjust it with a content like the next one:

auto eno3
iface eno3 inet manual
up ip link set dev $IFACE up
down ip link set dev $IFACE down

Now edit the file /etc/netplan/50-cloud-init.yaml to set the private IP address:

network:
  ethernets:
    eno4:
      dhcp4: true
    eno1:
      addresses:
        - 192.168.1.240/24
      gateway4: 192.168.1.221
      nameservers:
        addresses: [ 192.168.1.220, 8.8.8.8 ]
  version: 2

When you save these settings, you can issue the next commands:

# netplan generate
# netplan apply

Now we’ll edit the file /etc/hosts and add the addresses of each server. My file is the next one:

127.0.0.1       localhost.localdomain   localhost
192.168.1.240   horsemen controller
192.168.1.241   fh01
192.168.1.242   fh02

I have removed the 127.0.1.1 entry because I read that it may interfere, and I also removed all the IPv6 entries because I disabled IPv6.

Configuring the network in “fh01” and “fh02”

Here is the short version of the configuration of fh01:

# apt install -y ifupdown
# cat >> /etc/network/interfaces << EOT
auto enp1s0f0
iface enp1s0f0 inet manual
up ip link set dev $IFACE up
down ip link set dev $IFACE down
EOT

Here is my file /etc/netplan/50-cloud-init.yaml for fh01:

network:
  ethernets:
    enp1s0f1:
      addresses:
        - 192.168.1.241/24
      gateway4: 192.168.1.221
      nameservers:
        addresses: [ 192.168.1.220, 8.8.8.8 ]
  version: 2

Here is the file /etc/hosts for fh01:

127.0.0.1 localhost.localdomain localhost
192.168.1.240 horsemen controller
192.168.1.241 fh01
192.168.1.242 fh02

You can replicate this configuration in fh02 by adjusting the IP address in the /etc/netplan/50-cloud-init.yaml file.

Reboot and test

Now it is a good moment to reboot your systems and test that the network is properly configured. If it is not, please fix it before continuing; otherwise, the next steps will make no sense.

From each of the hosts you should be able to ping the outside world and to ping the other hosts. These are the tests from horsemen, but you need to be able to repeat them from each of the servers.

root@horsemen# ping -c 2 www.google.es
PING www.google.es (172.217.17.3) 56(84) bytes of data.
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=1 ttl=54 time=7.26 ms
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=2 ttl=54 time=7.26 ms

--- www.google.es ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 7.262/7.264/7.266/0.002 ms
root@horsemen# ping -c 2 fh01
PING fh01 (192.168.1.241) 56(84) bytes of data.
64 bytes from fh01 (192.168.1.241): icmp_seq=1 ttl=64 time=0.180 ms
64 bytes from fh01 (192.168.1.241): icmp_seq=2 ttl=64 time=0.113 ms

--- fh01 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1008ms
rtt min/avg/max/mdev = 0.113/0.146/0.180/0.035 ms
root@horsemen# ping -c 2 fh02
PING fh02 (192.168.1.242) 56(84) bytes of data.
64 bytes from fh02 (192.168.1.242): icmp_seq=1 ttl=64 time=0.223 ms
64 bytes from fh02 (192.168.1.242): icmp_seq=2 ttl=64 time=0.188 ms

--- fh02 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1027ms
rtt min/avg/max/mdev = 0.188/0.205/0.223/0.022 ms

Prerequisites for OpenStack in the server (horsemen)

Remember: for simplicity, I will use obvious passwords like SERVICE_PASS or SERVICEDB_PASS (e.g. RABBIT_PASS). You should change these passwords, although most of them will be fine if you have an isolated network and the services listen only on the management network.

First of all, we are installing the prerequisites. We will start with the NTP server, which will keep the clocks synchronized between the controller (horsemen) and the virtualization servers (fh01 and fh02). We’ll install chrony (recommended in the OpenStack documentation) and allow any computer in our private network to connect to this new NTP server:

# apt install -y chrony
# cat >> /etc/chrony/chrony.conf << EOT
allow 192.168.1.0/24
EOT
# service chrony restart
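To check that chrony is working and synchronizing, we can inspect its time sources (a plain sanity check; it should list at least one reachable NTP source):

# chronyc sources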

Now we are installing and configuring the database server (we’ll use mariadb as it is used in the basic installation):

# apt install mariadb-server python-pymysql
# cat > /etc/mysql/mariadb.conf.d/99-openstack.cnf << EOT
[mysqld]
bind-address = 192.168.1.240

default-storage-engine = innodb
innodb_file_per_table = on
max_connections = 4096
collation-server = utf8_general_ci
character-set-server = utf8
EOT
# service mysql restart

Now we are installing rabbitmq, which will be used to orchestrate message exchange between the services (please change RABBIT_PASS):

# apt install rabbitmq-server
# rabbitmqctl add_user openstack "RABBIT_PASS"
# rabbitmqctl set_permissions openstack ".*" ".*" ".*"
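We can verify that the openstack user and its permissions were created (a quick check):

# rabbitmqctl list_users
# rabbitmqctl list_permissions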

At this moment, we have to install memcached and configure it to listen on the management interface:

# apt install memcached

# echo "-l 192.168.1.240" >> /etc/memcached.conf

# service memcached restart

Finally, we need to install etcd and configure it to be accessible by OpenStack:

# apt install etcd
# cat >> /etc/default/etcd << EOT
ETCD_NAME="controller"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_CLUSTER="controller=http://192.168.1.240:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.240:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.240:2379"
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.240:2379"
EOT
# systemctl enable etcd
# systemctl start etcd
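A simple way to check that etcd is listening on the management address is to query its health endpoint (assuming curl is available); it should report a healthy state:

# curl http://192.168.1.240:2379/health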

Now we are ready to continue with the installation of the OpenStack Rocky packages… (continue to part 2)

How to create overlay networks using Linux Bridges and VXLANs

Some time ago, I learned How to create a overlay network using Open vSwitch in order to connect LXC containers. Digging into the topic of overlay networks, I saw that Linux bridges had included VXLAN capabilities, and I also saw how some people were using them to create overlay networks over a LAN. As an example, this is the way the OpenStack Linux bridges plugin does it. So I decided to work on this topic by hand (in order to better understand how OpenStack works) and learned…

How to create overlay networks using Linux Bridges and VXLANs

Well, I do not know when OpenStack started to use overlay networks based on Linux bridges, but as I started to search for information on how to do it by hand, I realized that it is a widespread topic. As an example, I found this post from VMware that is a must-read if you want to better understand what is going on here.

Scenario

I have a single LAN, and I want to have several overlay networks with multiple VMs in each of them. I want one set of VMs to be able to communicate between themselves, but I don’t want the other set of VMs to even know about the first set: I want to isolate the networks of multiple tenants.

The next figure shows what I want to do:

overlay-1

The left-hand part of the image shows what will actually happen, and the right-hand side shows what the users in the hosts will perceive: the hosts whose names end in “1” will see that they are alone in their LAN, and the hosts whose names end in “2” will see that they are alone in their LAN.

Set up

We will create a “poor man’s setup” in which we will have two VMs that simulate the hosts, and we will use LXC containers that will act as the “guests”.

The next figure shows what we are creating

overlay-2

node01 and node02 are the hosts that will host the containers. Each of them has a physical interface named ens3, with IPs 10.0.0.28 and 10.0.0.31. On each of them we will create a bridge named br-vxlan-<ID> to which we will be able to bridge our containers. And these containers will have an interface (eth0) with an IP address in the range 192.168.1.0/24.

To isolate the networks, we are using VXLANs with different VXLAN Network Identifiers (VNI): in our case, 10 and 20.

Starting point

We have 2 hosts (node01 and node02) that can ping each other.

root@node01:~# ping -c 2 node02
PING node02 (10.0.0.31) 56(84) bytes of data.
64 bytes from node02 (10.0.0.31): icmp_seq=1 ttl=64 time=1.17 ms
64 bytes from node02 (10.0.0.31): icmp_seq=2 ttl=64 time=0.806 ms

and

root@node02:~# ping -c 2 node01
PING node01 (10.0.0.28) 56(84) bytes of data.
64 bytes from node01 (10.0.0.28): icmp_seq=1 ttl=64 time=0.740 ms
64 bytes from node01 (10.0.0.28): icmp_seq=2 ttl=64 time=0.774 ms

On each of them I will make sure that the package iproute2 (i.e. the command “ip”) is installed.

In order to verify that everything works properly, we will install on each node the latest version of lxd according to this (in my case, I have lxd version 2.8). The one shipped with Ubuntu 16.04.1 is 2.0 and is not useful for us, because we need it to be able to manage networks.

Anyway, I will offer an alternative for non-Ubuntu users that consists in creating an extra interface bridged to the br-vxlan interface.

Let’s begin

The VXLAN implementation for Linux bridges works by encapsulating the traffic in multicast UDP messages that are distributed using IGMP.

In order to have the TCP/IP traffic encapsulated through these interfaces, we will create a bridge and attach the vxlan interface to it. In the end, a bridge works like a network hub and forwards the traffic to all the ports that are connected to it. So the traffic that appears in the bridge will be encapsulated into the UDP multicast messages.

For the creation of the first VXLAN (with VNI 10), we will need to issue the next commands (in each of the nodes):

ip link add vxlan10 type vxlan id 10 group 239.1.1.1 dstport 0 dev ens3
ip link add br-vxlan10 type bridge
ip link set vxlan10 master br-vxlan10
ip link set vxlan10 up
ip link set br-vxlan10 up

In these lines…

  1. First we create a vxlan port with VNI 10 that will use the device ens3 to multicast the UDP traffic using group 239.1.1.1 (using dstport 0 makes use of the default port).
  2. Then we will create a bridge named br-vxlan10 to which we will bridge the previously created vxlan port.
  3. Finally we will set both ports up.

Now that we have the first VXLAN, we will proceed with the second:

ip link add vxlan20 type vxlan id 20 group 239.1.1.1 dstport 0 dev ens3
ip link add br-vxlan20 type bridge
ip link set vxlan20 master br-vxlan20
ip link set vxlan20 up
ip link set br-vxlan20 up

Both VXLANs must be created in both nodes, node01 and node02.

Tests and verification

At this point, we have the VXLANs ready to be used, and the traffic of any port that we bridge to br-vxlan10 or br-vxlan20 will be multicast over UDP to the network. As we have several nodes in the LAN, we will have VXLANs that span multiple nodes.

In practice, the br-vxlan10 bridges of the different nodes will be in the same LAN (every port included in that bridge on any of the nodes will be in the same LAN). The same occurs for br-vxlan20.

And the traffic of br-vxlan10 will not be visible in br-vxlan20 and vice-versa.
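Before testing with containers, a quick way to double-check that the vxlan ports were created with the expected VNI and multicast group is the -d (details) option of the ip command:

ip -d link show vxlan10
ip -d link show vxlan20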

Verify using LXD containers

This is the test that is the simplest to understand, as it is conceptually what we want. The only difference is that we will create containers instead of VMs.

In order to verify that it works, we will create the containers lhs1 (in node01) and rhs1 (in node02), which will be attached to br-vxlan10. In node01 we will execute the following commands:

lxc profile create vxlan10
lxc network attach-profile br-vxlan10 vxlan10
lxc launch images:alpine/3.4 lhs1 -p vxlan10
sleep 10 # to wait for the container to be up and ready
lxc exec lhs1 ip addr add 192.168.1.1/24 dev eth0

What we are doing is the following:

  1. Creating an LXC profile, to make sure that it does not include any network interface.
  2. Making the profile use the bridge that we created for the VXLAN.
  3. Creating a container that uses the profile (and so will be attached to the VXLAN).
  4. Assigning the IP address 192.168.1.1 to the container.

In node02, we will create another container (rhs1) with IP 192.168.1.2:

lxc profile create vxlan10
lxc network attach-profile br-vxlan10 vxlan10
lxc launch images:alpine/3.4 rhs1 -p vxlan10
sleep 10 # to wait for the container to be up and ready
lxc exec rhs1 ip addr add 192.168.1.2/24 dev eth0

And now we have one container in each node, and each of them feels as if it were in the same LAN as the other one.

In order to verify it, we will use a simple server that echoes the information sent. So in node01, inside lhs1, we will start netcat listening on port 9999:

root@node01:~# lxc exec lhs1 -- nc -l -p 9999

And in node02, inside rhs1, we will start netcat connected to the lhs1 IP and port (192.168.1.1:9999):

root@node02:~# lxc exec rhs1 -- nc 192.168.1.1 9999

Anything that we write on one side will be output on the other one, as shown in the image:

lxc-over

Now we can create the other containers and see what happens.

In node01 we will create the container lhs2, connected to vxlan20 and with the same IP address as lhs1 (i.e. 192.168.1.1):

lxc profile create vxlan20
lxc network attach-profile br-vxlan20 vxlan20
lxc launch images:alpine/3.4 lhs2 -p vxlan20
sleep 10 # to wait for the container to be up and ready
lxc exec lhs2 ip addr add 192.168.1.1/24 dev eth0

At this point, if we try to ping the IP address 192.168.1.2 (which is assigned to rhs1), it should not work, as it is in the other VXLAN:

root@node01:~# lxc exec lhs2 -- ping -c 2 192.168.1.2
PING 192.168.1.2 (192.168.1.2): 56 data bytes

--- 192.168.1.2 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

Finally, in node02, we will create the container rhs2, attached to vxlan20 and with the IP address 192.168.1.2:

lxc profile create vxlan20
lxc network attach-profile br-vxlan20 vxlan20
lxc launch images:alpine/3.4 rhs2 -p vxlan20
sleep 10 # to wait for the container to be up and ready
lxc exec rhs2 -- ip addr add 192.168.1.2/24 dev eth0

And now we can verify that each pair of containers can communicate between themselves, while the traffic of the other pair does not arrive. The next figure shows that it works!

test02

Now you could have fun capturing the traffic in the hosts and get things like this:

traffic.png

If you ping a host in vxlan20 and dump the traffic from ens3, you will get the top-left traffic (the traffic in “instance 20”, i.e. VNI 20), but there is no traffic in br-vxlan10.

I suggest having fun with wireshark to look in depth at what is happening (watch the UDP traffic, how it is encapsulated using the VXLAN protocol, etc.).

Verify using other devices

If you cannot manage to use VMs or LXD containers, you can create a veth device, assign the IP addresses to it, and then ping through that interface to generate traffic.

ip link add eth10 type veth peer name eth10p
ip link set eth10p master br-vxlan10
ip link set eth10 up
ip link set eth10p up

And we will create the other interface too:

ip link add eth20 type veth peer name eth20p
ip link set eth20p master br-vxlan20
ip link set eth20 up
ip link set eth20p up

Now we will assign the IP 192.168.1.1 to eth10 and 192.168.2.1 to eth20, and will try to ping from one to the other:

# ip addr add 192.168.1.1/24 dev eth10
# ip addr add 192.168.2.1/24 dev eth20
# ping 192.168.2.1 -c 2 -I eth10
PING 192.168.2.1 (192.168.2.1) from 192.168.1.1 eth10: 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Destination Host Unreachable
From 192.168.1.1 icmp_seq=2 Destination Host Unreachable

--- 192.168.2.1 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms
pipe 2

Here we see that it does not work.

Note that I had to set the IP addresses in different ranges; otherwise, the interfaces do not work properly.

Now, in node02, we will create the interfaces and assign IP addresses to them (192.168.1.2 to eth10 and 192.168.2.2 to eth20).

ip link add eth10 type veth peer name eth10p
ip link set eth10p master br-vxlan10
ip link set eth10 up
ip link set eth10p up
ip link add eth20 type veth peer name eth20p
ip link set eth20p master br-vxlan20
ip link set eth20 up
ip link set eth20p up
ip addr add 192.168.1.2/24 dev eth10
ip addr add 192.168.2.2/24 dev eth20

And now we can try to ping the interfaces in the corresponding VXLAN.

root@node01:~# ping 192.168.1.2 -c 2 -I eth10
PING 192.168.1.2 (192.168.1.2) from 192.168.1.1 eth10: 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=10.1 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=4.53 ms

--- 192.168.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 4.539/7.364/10.189/2.825 ms
root@node01:~# ping 192.168.2.2 -c 2 -I eth10
PING 192.168.2.2 (192.168.2.2) from 192.168.1.1 eth10: 56(84) bytes of data.

--- 192.168.2.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1007ms

If we inspect what is happening using tcpdump, we’ll see that the traffic arrives at one interface and not at the other, as shown in the next figure:

dump.png

What we got here…

In the end, we have achieved a situation in which we have multiple isolated LANs over a single physical LAN. The traffic of one LAN is not seen in the other LANs.

This makes it possible to create multi-tenant networks for cloud datacenters.

Troubleshooting

During the tests, I created a bridge in which the traffic was not forwarded from one port to the others. I tried to debug what was happening (whether it was affected by ebtables, iptables, etc.) and at first I found no reason.

I was able to solve it by following the advice in this post. In fact, I did not trust it and rebooted and, although some of the settings were set to 1, it worked from then on.

$ cd /proc/sys/net/bridge
$ ls
 bridge-nf-call-arptables bridge-nf-call-iptables
 bridge-nf-call-ip6tables bridge-nf-filter-vlan-tagged
$ for f in bridge-nf-*; do echo 0 > $f; done

The machine on which I was doing the tests is not usually powered off, so it had probably been on for at least 2 months. Maybe some previous tests led me to that problem.

I have faced this problem again, and I was not comfortable with a solution “based on faith”. So I searched a bit more and found this post. Now I know what these files in /proc/sys/net/bridge mean, and I know that the problem was about iptables. The files bridge-nf-call-iptables, etc. state whether the traffic should go through iptables/arptables… before being forwarded to the ports in the bridge. So if you write a zero to these files, you will not have any iptables-related problem.

If you find that the traffic is not forwarded to the ports, you should double-check iptables and so on. In my case, the “problem” was that forwarding was prevented by default. An “easy to check” solution is to inspect the filter table of iptables:

# iptables -t filter -S
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
...

In my case, the FORWARD policy dropped the traffic. If I want the traffic to be forwarded, I must explicitly accept it by adding a rule such as:

# iptables -t filter -A FORWARD -i br-vxlan20 -j ACCEPT


How to connect complex networking infrastructures with Open vSwitch and LXC containers

Some days ago, I learned How to create a overlay network using Open vSwitch in order to connect LXC containers. In order to extend the features of the set-up that I built there, I wanted to introduce some services (a DHCP server, a router, etc.) to create a more complex infrastructure. And so this time I learned…

How to connect complex networking infrastructures with Open vSwitch and LXC containers

My setup is based on the previous one, extended with common services for networked environments. In particular, I am going to create a router and a DHCP server. So I will have two nodes that will host LXC containers, and the setup will have the following features:

  • Any container in any node will get an IP address from the single DHCP server.
  • Any container will have access to the internet through the single router.
  • The containers will be able to connect between them using their private IP addresses.

We had the set-up in the next figure:

ovs

And now we want to get to the following set-up:

ovs2

Well… we are not doing anything new, because we have worked with this before in How to create a multi-LXC infrastructure using custom NAT and DHCP server. But we can see this post as an integration post.

Update of the previous setup

On each of the nodes we have to create the bridge br-cont0 and the containers that we want. Moreover, we have to create the virtual switch ovsbr0 and connect it to the other node.

ovsnode01:~# brctl addbr br-cont0
ovsnode01:~# ip link set dev br-cont0 up
ovsnode01:~# cat > ./internal-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f ./internal-network.tmpl -n node01c01 -t ubuntu
ovsnode01:~# lxc-create -f ./internal-network.tmpl -n node01c02 -t ubuntu
ovsnode01:~# apt-get install openvswitch-switch
ovsnode01:~# ovs-vsctl add-br ovsbr0
ovsnode01:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode01:~# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22

Warning: we are not starting the containers, because we want them to get the IP address from our dhcp server.

Preparing a bridge to the outern world (NAT bridge)

We need a bridge that will act as the gateway to the external world for the router in our LAN. This is because we only have two known IP addresses (the one of ovsnode01 and the one of ovsnode02). So we’ll provide access to the Internet through one of them (according to the figure, it will be ovsnode01).

So we will create the bridge and will give it a local IP address:

ovsnode01:~# brctl addbr br-out
ovsnode01:~# ip addr add dev br-out 10.0.1.1/24

And now we will provide access via NAT to the containers that connect to that bridge. So let’s create the following script and execute it:

ovsnode01:~# cat > enable_nat.sh <<\EOF
#!/bin/bash
IFACE_WAN=eth0
IFACE_LAN=br-out
NETWORK_LAN=10.0.1.0/24

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -A FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
EOF
ovsnode01:~# chmod +x enable_nat.sh
ovsnode01:~# ./enable_nat.sh

And that’s all. Now ovsnode01 will act as a router for IP addresses in the range 10.0.1.0/24.
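We can double-check that forwarding is enabled and that the NAT rule is in place (a simple sanity check):

ovsnode01:~# sysctl net.ipv4.ip_forward
ovsnode01:~# iptables -t nat -S POSTROUTING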

DHCP server

Creating a DHCP server is as easy as creating a new container, installing dnsmasq and configuring it.

ovsnode01:~# cat > ./nat-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-out 
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f nat-network.tmpl -n dhcpserver -t ubuntu
ovsnode01:~# lxc-start -dn dhcpserver
ovsnode01:~# lxc-attach -n dhcpserver -- bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf
ip addr add 10.0.1.2/24 dev eth0
route add default gw 10.0.1.1'
ovsnode01:~# lxc-attach -n dhcpserver

WARNING: we created the container attached to br-out because we want it to have internet access, in order to be able to install dnsmasq. Moreover, we needed to give it an IP address and set the nameserver to Google’s. Once the dhcpserver is configured, we’ll change the configuration to attach it to br-cont0, because the dhcpserver only needs access to the internal network.

Now we have to install dnsmasq:

apt-get update
apt-get install -y dnsmasq

Now we’ll configure the static network interface (172.16.1.202) by modifying the file /etc/network/interfaces:

cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 172.16.1.202
netmask 255.255.255.0
EOF

And finally, we’ll configure dnsmasq

cat > /etc/dnsmasq.conf << EOF
interface=eth0
except-interface=lo
listen-address=172.16.1.202
bind-interfaces
dhcp-range=172.16.1.1,172.16.1.200,1h
dhcp-option=26,1400
dhcp-option=option:router,172.16.1.201
EOF

In this configuration we have created our range of IP addresses (from 172.16.1.1 to 172.16.1.200). We have stated that our router will have the IP address 172.16.1.201 and one important thing: we have set the MTU to 1400 (remember that when using OVS we had to set the MTU to a lower size).

Now we are ready to connect the container to br-cont0. In order to do so, we have to modify the file /var/lib/lxc/dhcpserver/config. In particular, we have to change the value of the attribute lxc.network.link from br-out to br-cont0. Once I modified it, my network configuration in that file is as follows:

# Network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-cont0
lxc.network.hwaddr = 00:16:3e:9f:ae:3f

Finally, we can restart our container:

ovsnode01:~# lxc-stop -n dhcpserver 
ovsnode01:~# lxc-start -dn dhcpserver

And we can check that our server gets the proper IP address:

root@ovsnode01:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 
dhcpserver RUNNING 0 - 172.16.1.202 -

We could also check that it is connected to the bridge:

ovsnode01:~# ip addr
...
83: vethGUV3HB: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-cont0 state UP group default qlen 1000
 link/ether fe:86:05:f6:f4:55 brd ff:ff:ff:ff:ff:ff
ovsnode01:~# brctl show br-cont0
bridge name bridge id STP enabled interfaces
br-cont0 8000.fe3b968e0937 no vethGUV3HB
ovsnode01:~# lxc-attach -n dhcpserver -- ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
82: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
 link/ether 00:16:3e:9f:ae:3f brd ff:ff:ff:ff:ff:ff
ovsnode01:~# ethtool -S vethGUV3HB
NIC statistics:
 peer_ifindex: 82

Do not worry if you do not fully understand this output… it is quite an advanced topic for this post. The important thing is that bridge br-cont0 has the device vethGUV3HB, whose interface index is 83 and whose peer is interface 82 which, in fact, is the eth0 device inside the container.

Installing the router

Now that we have our dhcpserver ready, we are going to create a container that will act as a router for our network. It is very easy (in fact, we have already created a router). And this raises a question: why are we creating another router?

We create a new router because it needs an IP address inside the private network and another interface in the network to which we want to provide access from the internal network.

Once this is clear, let's create the router, which has an IP address in the internal network bridge (br-cont0):

ovsnode01:~# cat > ./router-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
lxc.network.type = veth
lxc.network.link = br-out
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -t ubuntu -f router-network.tmpl -n router

WARNING: I don't know why, but for some reason lxc 2.0.3 on Ubuntu 14.04 sometimes fails to start containers that were created with two NICs.

Now we can start the container and start to work with it:

ovsnode01:~# lxc-start -dn router
ovsnode01:~# lxc-attach -n router

Now we simply have to configure the IP addresses for the router (eth0 is the interface in the internal network, bridged to br-cont0, and eth1 is bridged to br-out):

cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 172.16.1.201
netmask 255.255.255.0

auto eth1
iface eth1 inet static
address 10.0.1.2
netmask 255.255.255.0
gateway 10.0.1.1
EOF

And finally, enable NAT on the router by using a script similar to the previous one:

router:~# apt-get install -y iptables
router:~# cat > enable_nat.sh <<\EOF
#!/bin/bash
IFACE_WAN=eth1
IFACE_LAN=eth0
NETWORK_LAN=172.16.1.0/24

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -A FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
EOF
router:~# chmod +x enable_nat.sh
router:~# ./enable_nat.sh

Now we have our router ready to be used.
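As a quick, optional check that the router can actually reach the outside world through ovsnode01 (8.8.8.8 is just an arbitrary public address used for the test):

router:~# ip route
router:~# ping -c 3 8.8.8.8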

Starting the containers

Now we can simply start the containers that we created before, and we can check that they get an IP address by DHCP:

ovsnode01:~# lxc-start -n node01c01
ovsnode01:~# lxc-start -n node01c02
ovsnode01:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
dhcpserver RUNNING 0 - 172.16.1.202 -
node01c01 RUNNING 0 - 172.16.1.39 -
node01c02 RUNNING 0 - 172.16.1.90 -
router RUNNING 0 - 10.0.1.2, 172.16.1.201 -

We can also check all the hops in our network, to verify that it is properly configured:

ovsnode01:~# lxc-attach -n node01c01 -- apt-get install traceroute
(...)
ovsnode01:~# lxc-attach -n node01c01 -- traceroute -n www.google.es
traceroute to www.google.es (216.58.210.131), 30 hops max, 60 byte packets
 1 172.16.1.201 0.085 ms 0.040 ms 0.041 ms
 2 10.0.1.1 0.079 ms 0.144 ms 0.067 ms
 3 10.10.2.201 0.423 ms 0.517 ms 0.514 ms
...
12 216.58.210.131 8.247 ms 8.096 ms 8.195 ms

Now we can go to the other host and create the bridges, the virtual switch and the containers, as we did in the previous post.

WARNING: Just as a reminder, I leave this snippet of code here:

ovsnode02:~# brctl addbr br-cont0
ovsnode02:~# ip link set dev br-cont0 up
ovsnode02:~# cat > ./internal-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode02:~# lxc-create -f ./internal-network.tmpl -n node02c01 -t ubuntu
ovsnode02:~# lxc-create -f ./internal-network.tmpl -n node02c02 -t ubuntu
ovsnode02:~# apt-get install openvswitch-switch
ovsnode02:~# ovs-vsctl add-br ovsbr0
ovsnode02:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode02:~# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.21

And finally, we can start the containers and check that they get IP addresses from the DHCP server, and that they have connectivity to the internet using the routers that we have created:

ovsnode02:~# lxc-start -n node02c01
ovsnode02:~# lxc-start -n node02c02
ovsnode02:~# lxc-ls -f
NAME STATE IPV4 IPV6 AUTOSTART
-------------------------------------------------
node02c01 RUNNING 172.16.1.50 - NO
node02c02 RUNNING 172.16.1.133 - NO
ovsnode02:~# lxc-attach -n node02c01 -- apt-get install traceroute
(...)
ovsnode02:~# lxc-attach -n node02c01 -- traceroute -n www.google.es
traceroute to www.google.es (216.58.210.131), 30 hops max, 60 byte packets
 1 172.16.1.201 0.904 ms 0.722 ms 0.679 ms
 2 10.0.1.1 0.853 ms 0.759 ms 0.918 ms
 3 10.10.2.201 1.774 ms 1.496 ms 1.603 ms
...
12 216.58.210.131 8.849 ms 8.773 ms 9.062 ms

What is next?

Well, you'd probably want to persist the settings. For example, you can install the iptables rules (i.e. the enable_nat.sh script) as a start-up script in /etc/init.d.
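A minimal sketch of one way to do it (assuming a script that accepts start and stop arguments, such as the enable_nat script shown in the last section of this document):

# install the script as an init script and register it to run at boot
cp enable_nat /etc/init.d/enable_nat
chmod +x /etc/init.d/enable_nat
update-rc.d enable_nat defaults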

As further work, you can try VLAN tagging in OVS and so on, to replicate the networks using the same components while keeping the different networks isolated.

You can also try to include new services (e.g. a private DNS server, a reverse NAT, etc.).

How to create an overlay network using Open vSwitch in order to connect LXC containers.

Open vSwitch (OVS) is a virtual switch implementation that can be used as a tool for Software-Defined Networking (SDN). The concepts managed by OVS are the same as those managed by hardware switches (ports, routes, etc.). The most important features of OVS (for me) are:

  • It makes it possible to have a virtual switch inside a host to which virtual machines, containers, etc. can be connected.
  • It enables connecting switches on different hosts using network tunnels (GRE or VXLAN).
  • It is possible to program the switch using OpenFlow (see the brief example after this list).
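As a small illustration of that last point (a hedged sketch, not needed for the rest of this how-to; it assumes a bridge called ovsbr0 like the one created later on), flows can be inspected and installed with ovs-ofctl:

# list the OpenFlow flows currently installed in the switch
ovs-ofctl dump-flows ovsbr0
# example: drop all the traffic that enters through port 1
ovs-ofctl add-flow ovsbr0 "priority=100,in_port=1,actions=drop"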

In order to start working with OVS, this time…

I learned how to create an overlay network using Open vSwitch in order to connect LXC containers.

Well, first of all, I have to say that I have used LXC instead of VMs because containers are lightweight and very straightforward to use in an Ubuntu distribution. But if you understand the idea, you should be able to follow this how-to and use VMs instead of containers.

What we want (Our set-up)

What we are creating is shown in the next figure:

ovs

We have two nodes (ovsnodeXX) and several containers deployed on them (nodeXXcYY). The result is that all the containers can connect to each other, but their traffic is not seen in the LAN 10.10.2.x apart from the connection between the OVS switches (because it is tunnelled).

Starting point

  • I have 2 ubuntu-based nodes (ovsnode01 and ovsnode02), with 1 network interface each, and they are able to ping each other:
root@ovsnode01:~# ping ovsnode01 -c 3
PING ovsnode01 (10.10.2.21) 56(84) bytes of data.
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=3 ttl=64 time=0.033 ms

--- ovsnode01 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.022/0.027/0.033/0.007 ms
root@ovsnode01:~# ping ovsnode02 -c 3
PING ovsnode02 (10.10.2.22) 56(84) bytes of data.
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=1 ttl=64 time=1.45 ms
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=2 ttl=64 time=0.683 ms
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=3 ttl=64 time=0.756 ms

--- ovsnode02 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.683/0.963/1.451/0.347 ms
  • I am able to create lxc containers, by issuing commands such as:
lxc-create -n node1 -t ubuntu
  • I am able to create linux bridges (i.e. I have installed the bridge-utils package).

Spoiler (or quick setup)

If you just want the solution, here it is (later I will explain all the steps). These are the steps for ovsnode01:

ovsnode01:~# brctl addbr br-cont0
ovsnode01:~# ip link set dev br-cont0 up
ovsnode01:~# cat > ./container-template.lxc << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f ./container-template.lxc -n node01c01 -t ubuntu
ovsnode01:~# lxc-start -dn node01c01
ovsnode01:~# lxc-create -f ./container-template.lxc -n node01c02 -t ubuntu
ovsnode01:~# lxc-start -dn node01c02
ovsnode01:~# lxc-attach -n node01c01 -- ip addr add 192.168.1.11/24 dev eth0
ovsnode01:~# lxc-attach -n node01c01 -- ifconfig eth0 mtu 1400
ovsnode01:~# lxc-attach -n node01c02 -- ip addr add 192.168.1.12/24 dev eth0
ovsnode01:~# lxc-attach -n node01c02 -- ifconfig eth0 mtu 1400
ovsnode01:~# apt-get install openvswitch-switch
ovsnode01:~# ovs-vsctl add-br ovsbr0
ovsnode01:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode01:~# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22

You will need to follow the same steps for ovsnode02, taking into account the names of the containers and the IP addresses.

Preparing the network

First we are going to create a bridge (br-cont0) to which the containers will be bridged.

We are not using lxcbr0 because it may have other services attached (such as dnsmasq) that we would have to disable first. Moreover, creating our own bridge makes it easier to understand what we are doing.

We will issue this command on both ovsnode01 and ovsnode02 nodes.

brctl addbr br-cont0

As this bridge has no IP address, it is down (you can check it using the ip command). Now we are going to bring it up (also on both nodes):

ip link set dev br-cont0 up
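To double-check it (an optional verification), we can inspect the bridge with the ip command; the UP flag should now appear between the angle brackets (the operational state may still show DOWN until something is attached to the bridge):

ovsnode01:~# ip link show dev br-cont0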

Creating the containers

Now we need a template to associate the network to the containers. So we have to create the file container-template.lxc:

cat > ./container-template.lxc << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF

In this file we are saying that containers should be automatically bridged to the bridge br-cont0 (remember that we created it before) and that they will have an interface whose hardware address follows that template (LXC fills in the xx:xx:xx part with random values). We could also modify the file /etc/lxc/default.conf instead of creating a new one.

Now, we can create the containers on node ovsnode01:

ovsnode01# lxc-create -f ./container-template.lxc -n node01c01 -t ubuntu
ovsnode01# lxc-start -dn node01c01
ovsnode01# lxc-create -f ./container-template.lxc -n node01c02 -t ubuntu
ovsnode01# lxc-start -dn node01c02

And also in ovsnode02:

ovsnode02# lxc-create -f ./container-template.lxc -n node02c01 -t ubuntu
ovsnode02# lxc-start -dn node02c01
ovsnode02# lxc-create -f ./container-template.lxc -n node02c02 -t ubuntu
ovsnode02# lxc-start -dn node02c02

If you followed my steps, the containers will not have any IP address. This is because there is no DHCP server and the containers do not have static IP addresses. That is what we are going to fix now.

We are setting the IP addresses 192.168.1.11 and 192.168.1.12 for node01c01 and node01c02, and 192.168.1.21 and 192.168.1.22 for node02c01 and node02c02, respectively. In order to do so, we have to issue the following commands (each of them on the corresponding node):

ovsnode01# lxc-attach -n node01c01 -- ip addr add 192.168.1.11/24 dev eth0
ovsnode01# lxc-attach -n node01c02 -- ip addr add 192.168.1.12/24 dev eth0

You need to configure the containers in ovsnode02 too:

ovsnode02# lxc-attach -n node02c01 -- ip addr add 192.168.1.21/24 dev eth0
ovsnode02# lxc-attach -n node02c02 -- ip addr add 192.168.1.22/24 dev eth0

Making connections

At this point, you should be able to connect between the containers in the same node (between node01c01 and node01c02, and between node02c01 and node02c02), but not between containers in different nodes.

ovsnode01# lxc-attach -n node01c01 -- ping 192.168.1.11 -c3
PING 192.168.1.11 (192.168.1.11) 56(84) bytes of data.
64 bytes from 192.168.1.11: icmp_seq=1 ttl=64 time=0.136 ms
64 bytes from 192.168.1.11: icmp_seq=2 ttl=64 time=0.051 ms
64 bytes from 192.168.1.11: icmp_seq=3 ttl=64 time=0.055 ms
ovsnode01# lxc-attach -n node01c01 -- ping 192.168.1.21 -c3
PING 192.168.1.21 (192.168.1.21) 56(84) bytes of data.
From 192.168.1.11 icmp_seq=1 Destination Host Unreachable
From 192.168.1.11 icmp_seq=2 Destination Host Unreachable
From 192.168.1.11 icmp_seq=3 Destination Host Unreachable

This is because br-cont0 behaves somewhat like a classic “network hub”, in which all the traffic can be heard by all the devices attached to it. So the containers bridged to the same bridge are in the same LAN (on the same cable, indeed). But there is no connection between the hubs yet, and we will create it using Open vSwitch.

In the case of Ubuntu, installing OVS is as easy as issuing the next command:

apt-get install openvswitch-switch

Simple, isn’t it? But now we have to prepare the network and we are going to create a virtual switch (on both ovsnode01 and ovsnode02):

ovs-vsctl add-br ovsbr0

Open vSwitch works like a physical switch, with ports that can be connected and so on… And we are going to connect our hub to our switch (i.e. our bridge to the virtual switch):

ovs-vsctl add-port ovsbr0 br-cont0

We'll do this on both ovsnode01 and ovsnode02.
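At this point (an optional check), we can list the ports attached to the virtual switch; br-cont0 should appear in the output:

ovsnode01:~# ovs-vsctl list-ports ovsbr0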

Finally, we’ll connect the ovs switches between them, using a vxlan tunnel:

ovsnode01# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22
ovsnode02# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.21

We'll run each of the two commands above on the corresponding node. Take care that the remote IP address points to the other node 😉

We can check the final configuration of the nodes (let’s show only ovsnode01, but the other is very similar):

ovsnode01# brctl show
bridge name bridge id STP enabled interfaces
br-cont0 8000.fe502f26ea2d no veth3BUL7S
                              vethYLRPM2

ovsnode01# ovs-vsctl show
2096d83a-c7b9-47a8-8fff-d38c6d5ab04d
 Bridge "ovsbr0"
     Port "ovsbr0"
         Interface "ovsbr0"
             type: internal
     Port "vxlan0"
         Interface "vxlan0"
             type: vxlan
             options: {remote_ip="10.10.2.22"}
 ovs_version: "2.0.2"

Warning

Using this set-up as is, you will get ping working, but probably no other traffic. This is because the traffic is encapsulated in a transport network. Have you ever heard about the MTU?

If we check the eth0 interface from one container we’ll get this:

# lxc-attach -n node01c01 -- ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:16:3e:52:42:2f 
 inet addr:192.168.1.11 Bcast:0.0.0.0 Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 ...

Pay attention to the MTU value (it is 1500). And if we check the MTU of eth0 from the node, we’ll get this:

# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 60:00:00:00:20:15 
 inet addr:10.10.2.21 Bcast:10.10.2.255 Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 ...

Summarizing, the MTU is the maximum size of the payload of an Ethernet frame, which usually is 1500. But we are sending messages inside messages (the VXLAN encapsulation), so if we use it as is we are trying to fit something of size “1500 + some overhead” into a room of size “1500” (we are consciously omitting the units). And “1500 + some overhead” is bigger than “1500”, and that is why it will not work.

We have to change the MTU of the containers to a lower size. It is as simple as:

ovsnode01:~# lxc-attach -n node01c01 -- ifconfig eth0 mtu 1400
ovsnode01:~# lxc-attach -n node01c02 -- ifconfig eth0 mtu 1400

ovsnode02:~# lxc-attach -n node02c01 -- ifconfig eth0 mtu 1400
ovsnode02:~# lxc-attach -n node02c02 -- ifconfig eth0 mtu 1400
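We can verify the effect (an optional test; 192.168.1.21 is the container on the other node) by sending pings with fragmentation disabled. The payload size plus 28 bytes of ICMP and IP headers gives the packet size:

# 1372 + 28 = 1400 bytes: fits in the new MTU and goes through the tunnel
ovsnode01:~# lxc-attach -n node01c01 -- ping -M do -s 1372 -c 3 192.168.1.21
# 1472 + 28 = 1500 bytes: larger than the interface MTU, so ping reports a local error
ovsnode01:~# lxc-attach -n node01c01 -- ping -M do -s 1472 -c 3 192.168.1.21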

This method is not persistent and will be lost if the container is rebooted. In order to persist it, you can set it in the DHCP server (in case you are using one), or in the network device set-up. In the case of Ubuntu it is as simple as adding a line with ‘mtu 1400’ to the proper device in /etc/network/interfaces. As an example, for container node01c01:

auto eth0
iface eth0 inet static
address 192.168.1.11
netmask 255.255.255.0
mtu 1400

Some words on Virtual Machines

If you have some expertise with Virtual Machines and Linux (I suppose that if you are following this how-to, this is your case), you should be able to do all this set-up for your VMs. When you create your VM, you simply need to bridge its interfaces to the bridge that we have created (br-cont0) and that's all.

And now, what?

Well, now we have what we wanted to have. And now you can play with it. I suggest creating a DHCP server (so as not to have to set the MTU and the IP addresses by hand), a router, a DNS server, etc.

And as an advanced option, you can play with traffic tagging, in order to create multiple overlay networks and isolate them from each other.

How to set the hostname from DHCP in Ubuntu 14.04

I have a lot of virtual servers, and I like preparing a “golden image” and instantiating it many times. One of the steps is to set the hostname for the host from my private DHCP server. It usually works, but sometimes it fails and I did not know why. So I got tired of this unpredictability and this time…

I learned how to set the hostname from DHCP in Ubuntu 14.04

Well, I have my dnsmasq server, which acts both as a DNS server and as a DHCP server. I investigated how a host gets its hostname from DHCP, and it seems that it happens when the DHCP server sends it (the hostname option).
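For reference (a hedged sketch; the MAC address, name and IP are made up for the example), dnsmasq can hand out a fixed name and address to a specific host with a dhcp-host entry in /etc/dnsmasq.conf:

# hand out the name "node01" and the address 10.0.0.10 to this MAC address
dhcp-host=00:16:3e:01:02:03,node01,10.0.0.10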

I debugged the DHCP messages using this command from another server:

# tcpdump -i eth1 -vvv port 67 or port 68

And I saw that my DHCP server was properly sending the hostname (dnsmasq sends it by default), so I realized that the problem was in the dhclient script. I googled a lot and found some clever scripts that got the name from a nameserver and were launched as hooks from the dhclient script. But if the DHCP protocol already sets the hostname, why should I have to create another script to set it?

Finally I found this blog entry and realized that it was a bug in the dhclient script: if an old DHCP entry exists in /var/lib/dhcp/dhclient.eth0.leases (or eth1, etc.), it does not set the hostname.

At this point you have two options:

  • The easy one: include a line in /etc/rc.local that removes that file (see the sketch after this list).
  • The clever one (suggested in the blog entry): include a hook script that unsets the hostname:
echo unset old_host_name >/etc/dhcp/dhclient-enter-hooks.d/unset_old_hostname
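For the easy option, a minimal sketch would be to add something like this to /etc/rc.local (before the exit 0 line), so that the stale lease files are removed and the next boot starts clean; the wildcard is meant to cover eth0, eth1, etc.:

# remove old dhclient leases so that the hostname is set again on the next boot
rm -f /var/lib/dhcp/dhclient.*.leases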

How to configure a simple router with iptables in Ubuntu

If you have a server with two network cards, you can set up a simple router that NATs a private range to a public one, by simply installing iptables and configuring it. This time…

I learned how to configure a simple router with iptables in Ubuntu

Scenario

  1. I have one server with two NICs: eth0 and eth1, and several servers that have at least one NIC (e.g. eth1).
  2. The eth0 NIC is connected to a public network, with IP 111.22.33.44
  3. I want the other servers to have access to the public network through the main server.

How to do it

I will do it using iptables, so I need to install it:

$ apt-get install iptables

My main server has the IP address 10.0.0.1 in eth1. The other servers also have their IPs in the range of 10.0.0.x (either using static IPs or DHCP).
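Just for completeness (an optional sketch, in case eth1 is not configured yet on the main server), the 10.0.0.1 address can be set statically in /etc/network/interfaces:

auto eth1
iface eth1 inet static
address 10.0.0.1
netmask 255.255.255.0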

Now I will create some iptables rules in the server, by adding these lines to the /etc/rc.local file just before the exit 0 line:

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 ! -d 10.0.0.0/24 -j MASQUERADE
iptables -A FORWARD -d 10.0.0.0/24 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s 10.0.0.0/24 -i eth1 -j ACCEPT

These rules mean that:

  1. We want the kernel to forward traffic.
  2. The traffic that comes from the network 10.0.0.x and is not directed to that same network gains access to the internet through NAT (masquerading).
  3. We forward back the return traffic of the connections initiated from the internal network.
  4. We accept any traffic that comes from the internal network (see the client-side example after this list).
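For the other servers to actually use this router, they need their default gateway pointing at 10.0.0.1 (a minimal sketch, assuming the internal NIC is eth1 and that no DHCP server is handing out these settings; a public DNS resolver is used as an example):

# on each internal server: use the main server as default gateway
ip route add default via 10.0.0.1 dev eth1
# and make sure some nameserver is configured
echo "nameserver 8.8.8.8" > /etc/resolv.conf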

Easier to modify

Here is a script that you can use to customize the NAT for your site:

ovsnode01:~# cat > enable_nat <<\EOF
#!/bin/bash
IFACE_WAN=eth0
IFACE_LAN=eth1
NETWORK_LAN=10.0.0.0/24

case "$1" in
start)
echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -A FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
exit 0;;
stop)
iptables -t nat -D POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -D FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -D FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
exit 0;;
esac
exit 1
EOF
ovsnode01:~# chmod +x enable_nat
ovsnode01:~# ./enable_nat start

Now you can use this script at Ubuntu start-up. Just move it to /etc/init.d and issue the next command:

update-rc.d enable_nat defaults 99 00