How to connect complex networking infrastructures with Open vSwitch and LXC containers

Some days ago, I learned How to create a overlay network using Open vSwitch in order to connect LXC containers. In order to extend the features of the set-up that I did there, I wanted to introduce some services: a DHCP server, a router, etc. to create a more complex infrastructure. And so this time I learned…

How to connect complex networking infrastructures with Open vSwitch and LXC containers

My setup is based on the previous one, to introduce common services for networked environments. In particular, I am going to create a router and a DHCP server. So I will have two nodes that will host LXC containers and they will have the following features:

  • Any container in any node will get an IP address from the single DHCP server.
  • Any container will have access to the internet through the single router.
  • The containers will be able to connect between them using their private IP addresses.

We had the set-up in the next figure:

ovs

And now we want to get to the following set-up:

ovs2

Well… we are not making anything new, because we have worked with this before in How to create a multi-LXC infrastructure using custom NAT and DHCP server. But we can see this post as an integration post.

Update of the previous setup

On each of the nodes we have to create the bridge br-cont0 and the containers that we want. Moreover, we have to create the virtual swithc ovsbr0 and to connect it to the other node.

ovsnode01:~# brctl addbr br-cont0
ovsnode01:~# ip link set dev br-cont0 up
ovsnode01:~# cat > ./internal-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f ./internal-network.tmpl -n node01c01 -t ubuntu
ovsnode01:~# lxc-create -f ./internal-network.tmpl -n node01c02 -t ubuntu
ovsnode01:~# apt-get install openvswitch-switch
ovsnode01:~# ovs-vsctl add-br ovsbr0
ovsnode01:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode01:~# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22

Warning: we are not starting the containers, because we want them to get the IP address from our dhcp server.

Preparing a bridge to the outern world (NAT bridge)

We need a bridge that will act as a router to the external world for the router in our LAN. This is because we only have two known IP addresses (the one for ovsnode01 and the one for ovsnode02). So we’ll provide access to the Internet through one of them (according to the figure, it will be ovsnode01).

So we will create the bridge and will give it a local IP address:

ovsnode01:~# brctl addbr br-out
ovsnode01:~# ip addr add dev br-out 10.0.1.1/24

And now we will provide access to the containers that connect to that bridge through NAT. So let’s create the following script and execute it:

ovsnode01:~# cat > enable_nat.sh <<\EOF
#!/bin/bash
IFACE_WAN=eth0
IFACE_LAN=br-out
NETWORK_LAN=10.0.1.0/24

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -A FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
EOF
ovsnode01:~# chmod +x enable_nat.sh
ovsnode01:~# ./enable_nat.sh

And that’s all. Now ovsnode01 will act as a router for IP addresses in the range 10.0.1.0/24.

DHCP server

Creating a DHCP server is as easy as creating a new container, installing dnsmasq and configuring it.

ovsnode01:~# cat > ./nat-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-out 
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f nat-network.tmpl -n dhcpserver -t ubuntu
ovsnode01:~# lxc-start -dn dhcpserver
ovsnode01:~# lxc-attach -n dhcpserver -- bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf
ip addr add 10.0.1.2/24 dev eth0
route add default gw 10.0.1.1'
ovsnode01:~# lxc-attach -n dhcpserver

WARNING: we created the container attached to br-out, because we want it to have internet access to be able to install dnsmasq. Moreover we needed to give it an IP address and set the nameserver to the one from google. Once the dhcpserver is configured, we’ll change the configuration to attach to br-cont0, because the dhcpserver only needs to access to the internal network.

Now we have to install dnsmasq:

apt-get update
apt-get install -y dnsmasq

Now we’ll configure the static network interface (172.16.0.202), by modifying the file /etc/network/interfaces

cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 172.16.1.202
netmask 255.255.255.0
EOF

And finally, we’ll configure dnsmasq

cat > /etc/dnsmasq.conf << EOF
interface=eth0
except-interface=lo
listen-address=172.16.1.202
bind-interfaces
dhcp-range=172.16.1.1,172.16.1.200,1h
dhcp-option=26,1400
dhcp-option=option:router,172.16.1.201
EOF

In this configuration we have created our range of IP addresses (from 172.16.1.1 to 172.16.1.200). We have stated that our router will have the IP address 172.16.1.201 and one important thing: we have set the MTU to 1400 (remember that when using OVS we had to set the MTU to a lower size).

Now we are ready to connect the container to br-cont0. In order to make it, we have to modify the file /var/lib/lxc/dhcpserver/config. In particular, we have to change the value of the attribute lxc.network.link from br-out to br-cont0. Once I modified it, my network configuration in that file is as follows:

# Network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-cont0
lxc.network.hwaddr = 00:16:3e:9f:ae:3f

Finally we can reboot our container

ovsnode01:~# lxc-stop -n dhcpserver 
ovsnode01:~# lxc-start -dn dhcpserver

And we can check that our server gets the proper IP address:

root@ovsnode01:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 
dhcpserver RUNNING 0 - 172.16.1.202 -

We could also check that it is connected to the bridge:

ovsnode01:~# ip addr
...
83: vethGUV3HB: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-cont0 state UP group default qlen 1000
 link/ether fe:86:05:f6:f4:55 brd ff:ff:ff:ff:ff:ff
ovsnode01:~# brctl show br-cont0
bridge name bridge id STP enabled interfaces
br-cont0 8000.fe3b968e0937 no vethGUV3HB
ovsnode01:~# lxc-attach -n dhcpserver -- ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
82: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
 link/ether 00:16:3e:9f:ae:3f brd ff:ff:ff:ff:ff:ff
ovsnode01:~# ethtool -S vethGUV3HB
NIC statistics:
 peer_ifindex: 82

No matter if you do not understand this… it is a very advanced issue for this post. The important thing is that bridge br-cont0 has the device vethGUV3HB, whose number is 83 and its peer interface is the 82 that, in fact, is the eth0 device from inside the container.

Installing the router

Now that we have our dhcpserver ready, we are going to create a container that will act as a router for our network. It is very easy (in fact, we have already created a router). And… this fact arises a question: why are we creating another router?

We create a new router because it has to have an IP address inside the private network and other interface in the network to which we want to provide acess from the internal network.

Once we have this issue clear, let’s create the router, which as an IP in the bridge in the internal network (br-cont0):

ovsnode01:~# cat > ./router-network.tmpl << EOF
 lxc.network.type = veth
 lxc.network.link = br-cont0
 lxc.network.flags = up
 lxc.network.hwaddr = 00:16:3e:xx:xx:xx
 lxc.network.type = veth
lxc.network.link = br-out
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
 ovsnode01:~# lxc-create  -t ubuntu -f router-network.tmpl -n router

WARNING: I don’t know why, but for some reason sometimes lxc 2.0.3 fails in Ubuntu 14.04 when starting containers if they are created using two NICs.

Now we can start the container and start to work with it:

ovsnode01:~# lxc-start -dn router
ovsnode01:~# lxc-attach -n router

Now we simply have to configure the IP addresses for the router (eth0 is the interface in the internal network, bridged to br-cont0, and eth1 is bridged to br-out)

cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 172.16.1.201
netmask 255.255.255.0

auto eth1
iface eth1 inet static
address 10.0.1.2
netmask 255.255.255.0
gateway 10.0.1.1
EOF

And finally create the router by using a script which is similar to the previous one:

router:~# apt-get install -y iptables
router:~# cat > enable_nat.sh <<\EOF
#!/bin/bash
IFACE_WAN=eth1
IFACE_LAN=eth0
NETWORK_LAN=172.16.1.201/24

echo "1" > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $NETWORK_LAN ! -d $NETWORK_LAN -j MASQUERADE
iptables -A FORWARD -d $NETWORK_LAN -i $IFACE_WAN -o $IFACE_LAN -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s $NETWORK_LAN -i $IFACE_LAN -j ACCEPT
EOF
router:~# chmod +x enable_nat.sh
router:~# ./enable_nat.sh

Now we have our router ready to be used.

Starting the containers

Now we can simply start the containers that we created before, and we can check that they get an IP address by DHCP:

ovsnode01:~# lxc-start -n node01c01
ovsnode01:~# lxc-start -n node01c02
ovsnode01:~# lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
dhcpserver RUNNING 0 - 172.16.1.202 -
node01c01 RUNNING 0 - 172.16.1.39 -
node01c02 RUNNING 0 - 172.16.1.90 -
router RUNNING 0 - 10.0.1.2, 172.16.1.201 -

And also we can check all the hops in our network, to check that it is properly configured:

ovsnode01:~# lxc-attach -n node01c01 -- apt-get install traceroute
(...)
ovsnode01:~# lxc-attach -n node01c01 -- traceroute -n www.google.es
traceroute to www.google.es (216.58.210.131), 30 hops max, 60 byte packets
 1 172.16.1.201 0.085 ms 0.040 ms 0.041 ms
 2 10.0.1.1 0.079 ms 0.144 ms 0.067 ms
 3 10.10.2.201 0.423 ms 0.517 ms 0.514 ms
...
12 216.58.210.131 8.247 ms 8.096 ms 8.195 ms

Now we can go to the other host and create the bridges, the virtual switch and the containers, as we did in the previous post.

WARNING: Just to remember, I leave this snip of code here:

ovsnode02:~# brctl addbr br-cont0
ovsnode02:~# ip link set dev br-cont0 up
ovsnode02:~# cat > ./internal-network.tmpl << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode02:~# lxc-create -f ./internal-network.tmpl -n node01c01 -t ubuntu
ovsnode02:~# lxc-create -f ./internal-network.tmpl -n node01c02 -t ubuntu
ovsnode02:~# apt-get install openvswitch-switch
ovsnode02:~# ovs-vsctl add-br ovsbr0
ovsnode02:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode02:~# ovs-vsctl add-port ovsbr0 vxlan0 — set interface vxlan0 type=vxlan options:remote_ip=10.10.2.21

And finally, we can start the containers and check that they get IP addresses from the DHCP server, and that they have connectivity to the internet using the routers that we have created:

ovsnode02:~# lxc-start -n node02c01
ovsnode02:~# lxc-start -n node02c02
ovsnode02:~# lxc-ls -f
NAME STATE IPV4 IPV6 AUTOSTART
-------------------------------------------------
node02c01 RUNNING 172.16.1.50 - NO
node02c02 RUNNING 172.16.1.133 - NO
ovsnode02:~# lxc-attach -n node02c01 -- apt-get install traceroute
(...)
ovsnode02:~# lxc-attach -n node02c01 -- traceroute -n www.google.es
traceroute to www.google.es (216.58.210.131), 30 hops max, 60 byte packets
 1 172.16.1.201 0.904 ms 0.722 ms 0.679 ms
 2 10.0.1.1 0.853 ms 0.759 ms 0.918 ms
 3 10.10.2.201 1.774 ms 1.496 ms 1.603 ms
...
12 216.58.210.131 8.849 ms 8.773 ms 9.062 ms

What is next?

Well, you’d probably want to persist the settings. Maybe you can set the iptables rules (aka the enable_nat.sh script) as a start script in /etc/init.d

As a further work, you can try VLAN tagging in OVS and so on, to duplicate the networks using the same components, but isolating the different networks.

You can also try to include new services (e.g. a private DNS server, a reverse NAT, etc.).

How to create a overlay network using Open vSwitch in order to connect LXC containers.

Open vSwitch (OVS) is a virtual switch implementation that can be used as a tool for Software Defined Network (SDN). The concepts managed by OVS are the same as the concepts managed by hardware switches (ports, routes, etc.). The most important features (for me) of OVS are

  • It enables to have a virtual switch inside a host in which to connect virtual machines, containers, etc.
  • It enables connect switches in different hosts using network tunnels (gre or vxlan).
  • It is possible to program the switch using OpenFlow.

In order to start working with OVS, this time…

I learned how to create a overlay network using Open vSwitch in order to connect LXC containers.

Well, first of all, I have to say that I have used LXC instead of VM because they are lightweight and very straightforward to use in a Ubuntu distribution. But if you understand this, you should be able to follow this how-to and use VMs instead of containers.

What we want (Our set-up)

What we are creating is shown in the next figure:

ovs

We have two nodes (ovsnodeXX) and several containers deployed on them (nodeXXcYY). The result is that all the containers can connect between them, but the traffic is not seen the LAN 10.10.2.x appart from the connection bewteen the OVS switches (because it is tunnelled).

Starting point

  • I have 2 ubuntu-based nodes (ovsnode01 and ovsnode02), with 1 network interface each, and they are able to ping one to each other:
root@ovsnode01:~# ping ovsnode01 -c 3
PING ovsnode01 (10.10.2.21) 56(84) bytes of data.
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from ovsnode01 (10.10.2.21): icmp_seq=3 ttl=64 time=0.033 ms

--- ovsnode01 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.022/0.027/0.033/0.007 ms
root@ovsnode01:~# ping ovsnode02 -c 3
PING ovsnode02 (10.10.2.22) 56(84) bytes of data.
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=1 ttl=64 time=1.45 ms
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=2 ttl=64 time=0.683 ms
64 bytes from ovsnode02 (10.10.2.22): icmp_seq=3 ttl=64 time=0.756 ms

--- ovsnode02 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.683/0.963/1.451/0.347 ms
  • I am able to create lxc containers, by issuing commands such as:
lxc-create -n node1 -t ubuntu
  • I am able to create linux bridges (i.e. I have installed the bridge-utils package).

Spoiler (or quick setup)

If you just want the solution, here you have (later I will explain all the steps). On each node you can follow the next steps in ovsnode01:

ovsnode01:~# brctl addbr br-cont0
ovsnode01:~# ip link set dev br-cont0 up
ovsnode01:~# cat > ./container-template.lxc << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF
ovsnode01:~# lxc-create -f ./container-template.lxc -n node01c01 -t ubuntu
ovsnode01:~# lxc-start -dn node01c01
ovsnode01:~# lxc-create -f ./container-template.lxc -n node01c02 -t ubuntu
ovsnode01:~# lxc-start -dn node01c02
ovsnode01:~# lxc-attach -n node01c01 -- ip addr add 192.168.1.11/24 dev eth0
ovsnode01:~# lxc-attach -n node01c01 -- ifconfig eth0 mtu 1400
ovsnode01:~# lxc-attach -n node01c02 -- ip addr add 192.168.1.12/24 dev eth0
ovsnode01:~# lxc-attach -n node01c02 -- ifconfig eth0 mtu 1400
ovsnode01:~# apt-get install openvswitch-switch
ovsnode01:~# ovs-vsctl add-port ovsbr0 br-cont0
ovsnode01:~# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22

You will need to follow the same steps for ovsnode02, taking into account the names of the containers and the IP addresses.

Preparing the newtork

First we are going to create a bridge (br-cont0) to which the containers are being bridged

We are not using lxcbr0 because it may have other services such as dnsmasq that we should disable before. Moreover, creating our bridge will be more interesting to understand what we are doing

We will issue this command on both ovsnode01 and ovsnode02 nodes.

brctl addbr br-cont0

As this bridge has no IP, it is down (you can see it using ip command). Now we are going to set it up (also in both nodes):

ip link set dev br-cont0 up

Creating the containers

Now we need a template to associate the network to the containers. So we have to create the file container-template.lxc:

cat > ./container-template.lxc << EOF
lxc.network.type = veth
lxc.network.link = br-cont0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOF

In this file we are saying that containers should be automatically bridged to bridge br-cont0 (remember that we created it before) and they will hace an interface with a hardware address that will follow the template. We could also modify the file /etc/lxc/default.conf file instead of creating a new one.

Now, we can create the containers on node ovsnode01:

ovsnode01# lxc-create -f ./container-template.lxc -n node01c01 -t ubuntu
ovsnode01# lxc-start -dn node01c01
ovsnode01# lxc-create -f ./container-template.lxc -n node01c02 -t ubuntu
ovsnode01# lxc-start -dn node01c02

And also in ovsnode02:

ovsnode02# lxc-create-f ./container-template.lxc -n node02c01 -t ubuntu
ovsnode02# lxc-start -dn node02c01
ovsnode02# lxc-create-f ./container-template.lxc -n node02c02 -t ubuntu
ovsnode02# lxc-start -dn node02c02

If you followed my steps, the containers will not have any IP address. This is because you do not have any dhcp server and the containers do not have static IP addresses. And that is what we are going to do now.

We are setting the IP addresses 192.168.1.11 and 192.168.1.21 to node01c01 and node0201 respetively. In order to do so, we have to issue these two commands (each of them in the corresponding node):

ovsnode01# lxc-attach -n node01c01 -- ip addr add 192.168.1.11/24 dev eth0
ovsnode01# lxc-attach -n node01c02 -- ip addr add 192.168.1.12/24 dev eth0

You need to configure the containers in ovsnode02 too:

ovsnode02# lxc-attach -n node02c01 -- ip addr add 192.168.1.21/24 dev eth0
ovsnode02# lxc-attach -n node02c02 -- ip addr add 192.168.1.22/24 dev eth0

Making connections

At this point, you should be able to connect between the containers in the same node: between node01c01 and node01c02, and between node02c01 and node02c02. But not between containers in different nodes.

ovsnode01# lxc-attach -n node01c01 -- ping 192.168.1.11 -c3
PING 192.168.1.11 (192.168.1.11) 56(84) bytes of data.
64 bytes from 192.168.1.11: icmp_seq=1 ttl=64 time=0.136 ms
64 bytes from 192.168.1.11: icmp_seq=2 ttl=64 time=0.051 ms
64 bytes from 192.168.1.11: icmp_seq=3 ttl=64 time=0.055 ms
ovsnode01# lxc-attach -n node01c01 -- ping 192.168.1.21 -c3
PING 192.168.1.21 (192.168.1.21) 56(84) bytes of data.
From 192.168.1.11 icmp_seq=1 Destination Host Unreachable
From 192.168.1.11 icmp_seq=2 Destination Host Unreachable
From 192.168.1.11 icmp_seq=3 Destination Host Unreachable

This is because br-cont0 is somehow a classic “network hub”, in which all the traffic can be listened by all the devices in it. So the containers in the same bridge are in the same LAN (in the same cable, indeed). But there is no connection between the hubs, and we will make it using Open vSwitch.

In the case of Ubuntu, installing OVS is as easy as issuing the next command:

apt-get install openvswitch-switch

Simple, isn’t it? But now we have to prepare the network and we are going to create a virtual switch (on both ovsnode01 and ovsnode02):

ovs-vsctl add-br ovsbr0

Open vSwitch works like a physical switch, with ports that can be connected and so on… And we are going to connect our hub to our switch (i.e. our bridge to the virtual switch):

ovs-vsctl add-port ovsbr0 br-cont0

We’ll make it in both ovsnode01 and ovsnode02

Finally, we’ll connect the ovs switches between them, using a vxlan tunnel:

ovsnode01# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.22
ovsnode02# ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.10.2.21

We’ll make each of the two commands above on the corresponding node. Take care that the remote IP addresses are set to the other node 😉

We can check the final configuration of the nodes (let’s show only ovsnode01, but the other is very similar):

ovsnode01# brctl show
bridge name bridge id STP enabled interfaces
br-cont0 8000.fe502f26ea2d no veth3BUL7S
                              vethYLRPM2

ovsnode01# ovs-vsctl show
2096d83a-c7b9-47a8-8fff-d38c6d5ab04d
 Bridge "ovsbr0"
     Port "ovsbr0"
         Interface "ovsbr0"
             type: internal
     Port "vxlan0"
         Interface "vxlan0"
             type: vxlan
             options: {remote_ip="10.10.2.22"}
 ovs_version: "2.0.2"

Warning

Using this set up as is, you will get ping working, but probably no other traffic. This is because the traffic is encapsulated in a transport network. Did you know about MTU?

If we check the eth0 interface from one container we’ll get this:

# lxc-attach -n node01c01 -- ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:16:3e:52:42:2f 
 inet addr:192.168.1.11 Bcast:0.0.0.0 Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 ...

Pay attention to the MTU value (it is 1500). And if we check the MTU of eth0 from the node, we’ll get this:

# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 60:00:00:00:20:15 
 inet addr:10.10.2.21 Bcast:10.10.2.255 Mask:255.255.255.0
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 ...

Summarizing, MTU is the size of the message for ethernet, which usually is 1500. But we are sending messages into messages and if we try to use it as is, we are trying to send things that have a size of “1500 + some overhead” in a room of “1500” (we are conciously omiting the units). And “1500 + some overhead” is bigger than “1500” and that is why it will not work.

We have to change the MTU of the containers to a lower size. It is as simple as:

ovsnode01:~# lxc-attach -n node01c01 -- ifconfig eth0 mtu 1400
ovsnode01:~# lxc-attach -n node01c02 -- ifconfig eth0 mtu 1400

ovsnode02:~# lxc-attach -n node02c01 -- ifconfig eth0 mtu 1400
ovsnode02:~# lxc-attach -n node02c02 -- ifconfig eth0 mtu 1400

This method is not persistent, and will be lost in case of rebooting the container. In order to persist it, you can set it in the DHCP server (in case that you are using it), or in the network device set up. In the case of ubuntu it is as simple as adding a line with ‘mtu 1400’ to the proper device in /etc/network/interfaces. As an example for container node01c01:

auto eth0
iface eth0 inet static
address 192.168.1.11
netmask 255.255.255.0
mtu 1400

Some words on Virtual Machines

If you have some expertise on Virtual Machines and Linux (I suppose that if you are following this how-to, this is your case), you should be able to make all the set-up for your VMs. When you create your VM, you simply need to bridge the interfaces of the VM to the bridge that we have created (br-cont0) and that’s all.

And now, what?

Well, now we have what we wanted to have. And now you can play with it. I suggest to create a DHCP server (to not to have to set the MTU and the IP addresses), a router, a DNS server, etc.

And as an advanced option, you can play with traffic tagging, in order to create multiple overlay networks and isolating them from each other.

 

 

How to create a multi-LXC infrastructure using custom NAT and DHCP server

I am creating complex infrastructures with LXC. This time I wanted to create a realistic infrastructure that would simulate a real multi computer infrastructure, and I wanted to control all the components. I know that LXC will provide IP addresses through its DHCP server, and it will NAT the internal IPs, but I want to use my own DHCP server to provide multiple IP ranges. So this time…

I learned how to create a multi-LXC infrastructure using custom NAT router and DHCP server with multiple IP ranges

My virtual container infrastructure will consist of:

  • 1 container that will have 2 network interfaces: 1 to a “public” network (the LXC network) and 1 to a “private” network (to a private bridge).
  • 1 container in a “private network” (with IP in the range of 10.0.0.x)
  • 1 container in other “private network” (with IP in the range of 10.1.0.x)

The containers in the private network will be able to NAT to the outern world through the router.

I am using unprivileged containers (to learn how to create them, please read my previous post), but it is easy to execute all of this using the privileged containers (i.e. sudo lxc-* commands).

Creating the “private network”

First I create a bridge that will act as the switch for the private network:

calfonso@mmlin:~$ sudo su -
root@mmlin:~# cat >> /etc/network/interfaces << EOT
auto privbr
iface privbr inet manual
bridge_ports none
EOT
root@mmlin:~# ifup privbr

Then I will give permissions for my user to be able to add devices to that bridge

(* this is a unprivileged container specific step)

calfonso@mmlin:~$ sudo bash -c 'echo "calfonso veth privbr 100" >> /etc/lxc/lxc-usernet'

Setting the router

Now I will create a container named “router”

$ lxc-create -t download -n router -- -d ubuntu -r xenial -a amd64

And I will edit the configuration to set the proper network configuration. Edit the file $HOME/.local/share/lxc/router/config and modify it to leave it like this one:

# Distribution configuration
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.include = /usr/share/lxc/config/ubuntu.userns.conf
lxc.arch = x86_64

# Container specific configuration
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
lxc.rootfs = /home/calfonso/.local/share/lxc/router/rootfs
lxc.rootfs.backend = dir
lxc.utsname = router

# Network configuration
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:9b:1b:70

# Private interface for the first range
lxc.network.type = veth
lxc.network.link = privbr
lxc.network.flags = up
lxc.network.ipv4 = 10.0.0.1/24
lxc.network.hwaddr = 20:00:00:9b:1b:70

# Additional interface for the other private range
lxc.network.type = veth
lxc.network.link = privbr
lxc.network.flags = up
lxc.network.ipv4 = 10.1.0.1/24
lxc.network.hwaddr =20:20:00:9b:1b:70

Now you can start your container, and check that you have your network devices:

calfonso@mmlin:~$ lxc-start -n router 
calfonso@mmlin:~$ lxc-attach -n router 
root@router:/# ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:16:3e:9b:1b:70 
 inet addr:10.0.3.87 Bcast:10.0.3.255 Mask:255.255.255.0
 ...
eth1 Link encap:Ethernet HWaddr 20:00:00:9b:1b:70 
 inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0
 ...
eth2 Link encap:Ethernet HWaddr 20:20:00:9b:1b:70 
 inet addr:10.1.0.1 Bcast:10.1.0.255 Mask:255.255.255.0
...
lo Link encap:Local Loopback 
...

DHCP Server

And now, install DNSMASQ to be used as the nameserver and the DHCP server:

root@router:/# apt-get update
root@router:/# apt-get -y dist-upgrade
root@router:/# apt-get install -y dnsmasq
root@router:/# cat >> /etc/dnsmasq.d/priv-network.conf << EOT
except-interface=eth0
bind-interfaces
dhcp-range=tag:if1,10.0.0.2,10.0.0.100,1h
dhcp-range=tag:if2,10.1.0.2,10.1.0.100,1h
dhcp-host=20:00:00:*:*:*,set:if1
dhcp-host=20:20:00:*:*:*,set:if2
EOT
root@router:/# service dnsmasq restart

Setting up as router

I will use iptables to NAT the internal networks. So we’ll need to install iptables:

root@router:/# apt-get install iptables

Then I have to add the following lines to the /etc/rc.local file (just before the exit 0 line)

echo "1" > /proc/sys/net/ipv4/ip_forward

iptables -t nat -A POSTROUTING -s 10.0.0.0/24 ! -d 10.0.0.0/24 -j MASQUERADE
iptables -A FORWARD -d 10.0.0.0/24 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s 10.0.0.0/24 -i eth1 -j ACCEPT

iptables -t nat -A POSTROUTING -s 10.1.0.0/24 ! -d 10.1.0.0/24 -j MASQUERADE
iptables -A FORWARD -d 10.1.0.0/24 -o eth2 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -s 10.1.0.0/24 -i eth2 -j ACCEPT

And now it is time to reboot the container

root@router:/# exit
calfonso@mmlin:~$ lxc-stop -n router 
calfonso@mmlin:~$ lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 
myunprivilegedcont STOPPED 0 - - - 
router STOPPED 0 - - - 
calfonso@mmlin:~$ lxc-start -n router 
calfonso@mmlin:~$ lxc-attach -n router 
root@router:/# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-A POSTROUTING -s 10.0.0.0/24 ! -d 10.0.0.0/24 -j MASQUERADE
-A POSTROUTING -s 10.1.0.0/24 ! -d 10.1.0.0/24 -j MASQUERADE
root@router:/#

And that’s all for our router.

(*) All these steps would also work for a common server (either physical o virtual) that will act as a router and DHCP server.

Creating the other containers

Now we are ready to create containers in our subnet. We’ll create a container in network 10.0.0.x (which will be named node_in_0) and other container in network 10.1.0.x (which will be named node_in_1).

Creating a container in network 10.0.0.x

First we create a container (as we did with the router)

calfonso@mmlin:~$ lxc-create -t download -n node_in_0 -- -d ubuntu -r xenial -a amd64

And now we’ll edit its configuration file .local/share/lxc/node_in_0/config to set it like this

# Distribution configuration
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.include = /usr/share/lxc/config/ubuntu.userns.conf
lxc.arch = x86_64

# Container specific configuration
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
lxc.rootfs = /home/calfonso/.local/share/lxc/node_in_0/rootfs
lxc.rootfs.backend = dir
lxc.utsname = node_in_0

# Network configuration
lxc.network.type = veth
lxc.network.link = privbr
lxc.network.flags = up
lxc.network.hwaddr = 20:00:00:e9:79:10

(*) The most noticeable modifications are related to the network device: set the private intarface in lxc.network.link and set the first octects to the hwaddr to the mask set in the DNSMASQ server (I left the other as those that LXC generated by itself).

Now you can start your container and it will get an IP in the range 10.0.0.x.

calfonso@mmlin:~$ lxc-start -n node_in_0 
calfonso@mmlin:~$ lxc-attach -n node_in_0 
root@node_in_0:/# ifconfig
eth0 Link encap:Ethernet HWaddr 20:00:00:e9:79:10 
 inet addr:10.0.0.53 Bcast:10.0.0.255 Mask:255.255.255.0
 ...

lo Link encap:Local Loopback 
...

Creating a container in network 10.1.0.x

Again, we have to create a container

calfonso@mmlin:~$ lxc-create -t download -n node_in_1 -- -d ubuntu -r xenial -a amd64

And now we’ll edit its configuration file .local/share/lxc/node_in_1/config to set it like this

# Distribution configuration
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.include = /usr/share/lxc/config/ubuntu.userns.conf
lxc.arch = x86_64

# Container specific configuration
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
lxc.rootfs = /home/calfonso/.local/share/lxc/node_in_0/rootfs
lxc.rootfs.backend = dir
lxc.utsname = node_in_0

# Network configuration
lxc.network.type = veth
lxc.network.link = privbr
lxc.network.flags = up
lxc.network.hwaddr = 20:20:00:92:45:86

And finally, start the container and check that it has the expected IP address

calfonso@mmlin:~$ lxc-start -n node_in_1 
calfonso@mmlin:~$ lxc-attach -n node_in_1 
root@node_in_1:/# ifconfig
eth0 Link encap:Ethernet HWaddr 20:20:00:92:45:86 
 inet addr:10.1.0.12 Bcast:10.1.0.255 Mask:255.255.255.0
...
lo Link encap:Local Loopback 
...

Ah, you can check that these containers are able to access the internet 😉

root@node_in_1:/# ping -c 3 www.google.es
PING www.google.es (216.58.201.131) 56(84) bytes of data.
64 bytes from mad06s25-in-f3.1e100.net (216.58.201.131): icmp_seq=1 ttl=54 time=8.14 ms
64 bytes from mad06s25-in-f3.1e100.net (216.58.201.131): icmp_seq=2 ttl=54 time=7.91 ms
64 bytes from mad06s25-in-f3.1e100.net (216.58.201.131): icmp_seq=3 ttl=54 time=7.86 ms

--- www.google.es ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 7.869/7.973/8.140/0.119 ms

(*) If you can check that our router is effectively the router (I you don’t trust me 😉 ), you can check it

root@node_in_1:/# apt-get install traceroute
root@node_in_1:/# traceroute www.google.es
traceroute to www.google.es (216.58.201.131), 30 hops max, 60 byte packets
 1 10.1.0.1 (10.1.0.1) 0.101 ms 0.047 ms 0.054 ms
 2 10.0.3.1 (10.0.3.1) 0.102 ms 0.053 ms 0.045 ms
...
8 google-router.red.rediris.es (130.206.255.2) 7.694 ms 7.561 ms 7.659 ms
 9 72.14.235.18 (72.14.235.18) 8.100 ms 7.973 ms 11.170 ms
10 216.239.40.217 (216.239.40.217) 8.077 ms 8.114 ms 8.040 ms
11 mad06s25-in-f3.1e100.net (216.58.201.131) 7.976 ms 7.910 ms 8.106 ms

Last words on this…

Wow, there are a lot of learned things here…

  1. Creating a basic DHCP and DNS server with DNSMASQ
  2. Creating a NAT router with iptables
  3. Creating a bridge without a device attached to it

Each topic would could have be a post in this blog, but the important thing in here is that all these tasks are made inside LXC containers 🙂

How to create unprivileged LXC containers in Ubuntu 14.04

I am used to use LXC containers in Ubuntu. But I usually run them under the root user. I think that this is a bad practise, and I want to run them using my user. The answer is: unprivileged containers and this time…

I learned how to create unprivileged LXC containers in Ubuntu 14.04

My prayers were answered very soon and I found a very useful resource from Stéphane Graber (which is the LXC and LXD project leader at Canonical). I am summarizing the steps here, as they have been made by me, but his post is plenty of useful information, and I really recommend reading it to know what is going on.

Starting point…

I had a working Ubuntu 14.04.4 LTS installation, and a working LXC 2.0.0 installation. I was already able to run privileged containers by issuing commands like

$ sudo lxc-start -n mycontainer

 

If you need info about installing LXC 2, please go to this link, but it is possible that you do not need my post, yet.

First we set up the uid and gid mappings

calfonso@mmlin:~$ sudo usermod --add-subuids 100000-165536 calfonso
calfonso@mmlin:~$ sudo usermod --add-subgids 100000-165536 calfonso

Now we authorize our user to bridge devices to lxcbr0 (with a maximum quota of 10 bridges; you can set the number that best fits to you)

calfonso@mmlin:~$ sudo bash -c 'echo "calfonso veth lxcbr0 10" >> /etc/lxc/lxc-usernet'

Now let lxc access our home folder, because it needs to read the configuration file

calfonso@mmlin:~$ sudo chmod +x $HOME

And create the configuration file (you can set your values for the bridges to which you have allowed the access in the configuration file, mac addresses, etc.)

calfonso@mmlin:~$ mkdir -p $HOME/.config/lxc
calfonso@mmlin:~$ cat >> ~/.config/lxc/default.conf << EOT
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
EOT

And finally we are ready to create the unprivileged container

calfonso@mmlin:~$ lxc-create -t download -n myunprivilegedcont -- -d ubuntu -r xenial -a amd64

And that’s all 🙂

(*) Remember that you can still use privileged containers by issuing lxc-* commands using “sudo”

Recommended information

If you want to know more about the uidmap, gidmap, how the bridging permissions work, etc., I recommend you to read this post.