An Overview and Cheat Sheet for systemd networkd configurations

2023-09-23

This is a short cheat sheet for things that I often need to do with systemd-networkd, the network configuration component of systemd. This post assumes some basic knowledge about systemd-networkd and is primarily a collection of examples.

CLI overview

The CLI tool for systemd-networkd is networkctl.

networkctl shows a short status of all interfaces.

networkctl reload reloads and applies the configuration of all interfaces.

networkctl status shows a general status as well as the logs. Check the logs if something does not work as intended. Syntax errors and the like are logged here.

The logs can also be accessed via the journal, usually with journalctl -xef -u systemd-networkd
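A typical edit-and-check cycle with the commands above could look like the following sketch. The function name check_networkd and the default interface eth0 are placeholders of mine and not part of systemd; the journalctl options only limit the output to the current boot and the last 30 lines.

```shell
# Hypothetical helper: reload networkd and show what happened to one interface.
check_networkd() {
    iface="${1:-eth0}"                       # placeholder interface name
    networkctl reload                        # re-read the configuration files
    networkctl --no-pager status "$iface"    # state, addresses and recent log lines
    # last log lines of the current boot, without a pager
    journalctl -b -u systemd-networkd --no-pager -n 30
}
```

Run it as root after editing a file, e.g. check_networkd eth1.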

Configuration Files

systemd-networkd is configured via files usually located in /etc/systemd/network/. There are three kinds of files:

  • .link-files that are used for renaming interfaces (Documentation)
  • .netdev-files that are creating interfaces (Documentation)
  • .network-files that handle the L3-behaviour of those interfaces (Documentation)

The files are loaded sorted by their name, so it is a good practice to name them starting with numbers to define their order.
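The order matters mostly for .network files: the first file in that order that matches an interface is applied, and later matching files are ignored. The following sketch only demonstrates the sorting itself in a throwaway directory; the file names are examples taken from this post.

```shell
# Throwaway directory with three example file names, created in "wrong" order.
tmpdir=$(mktemp -d)
touch "$tmpdir/99-vrf.network" "$tmpdir/10-eth1.network" "$tmpdir/20-eth2.network"

# Byte-wise (C locale) sorting shows the order in which the files are considered.
order=$(cd "$tmpdir" && printf '%s\n' *.network | LC_ALL=C sort | tr '\n' ' ')
echo "$order"

rm -r "$tmpdir"
```

A catch-all file such as 99-vrf.network therefore only applies to interfaces that no lower-numbered file has matched.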

Interface renaming

To rename an interface we first match on some parameters and then give it a new name:

# 00-iface42.link
[Match]
MACAddress=11:22:33:44:55:66
Type=ether
[Link]
Name=iface42

The Type=ether is important when you want to create VLAN interfaces on the renamed interface. By default the VLAN interfaces have the same MAC address as the main interface. If only the MAC address is matched, the main interface and the VLAN interfaces would all be renamed, leading to hard-to-find errors. Type=ether matches only the Ethernet interface, because the VLAN interfaces have vlan as their type. Source: systemd-devel mailing list

VLAN Interfaces

First we have to create the VLAN Interface eth1.102 and assign it the VLAN-ID 102

# 10-eth1.102.netdev
[NetDev]
Name=eth1.102
Kind=vlan

[VLAN]
Id=102

Note that the name of the interface can be chosen arbitrarily and does not have to contain the vlan id.

Then we can assign the VLAN to another interface:

# 10-eth1.network 
[Match]
Name=eth1

[Network]
VLAN=eth1.102
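If several VLANs are needed on the same parent, the .netdev files can be generated instead of written by hand. This is only a sketch: vlan_netdev is a made-up helper, and it follows the parent.id naming used above although the name can be chosen freely.

```shell
# Hypothetical helper: print a .netdev file for one VLAN.
# $1 = parent interface (only used for the name), $2 = VLAN ID
vlan_netdev() {
    cat <<EOF
[NetDev]
Name=$1.$2
Kind=vlan

[VLAN]
Id=$2
EOF
}

# Example: print the files for VLANs 102 and 103 on eth1.
for id in 102 103; do
    echo "# 10-eth1.$id.netdev"
    vlan_netdev eth1 "$id"
done
```

The .network file of the parent interface then needs one VLAN= line per generated interface.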

VRFs

Each VRF needs a routing table assigned to it. In this example it is table 102.

# 10-vrf-uplink-1.netdev
[NetDev]
Name=vrf-uplink-1
Kind=vrf

[VRF]
TableId=102

To bring the VRFs up you need a network file for them. You can use the same file for all your VRFs. An easy example would be

# 99-vrf.network
[Match]
Kind=vrf

[Link]
ActivationPolicy=up
RequiredForOnline=no

Some manuals suggest matching on the name here, e.g. Name=vrf-*. However, this requires all your VRF names to start with vrf-, which might not be the case.

To assign an interface to a VRF, create a network file for that interface:

# 20-eth2.network
[Match]
Name=eth2

[Network]
VRF=vrf-uplink-1
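To check the result, iproute2 can list the VRFs, their member interfaces and their routes. A small sketch; show_vrf is a made-up wrapper and the default name is the VRF from the example above.

```shell
# Hypothetical helper: show what is attached to a VRF and which routes it has.
show_vrf() {
    vrf="${1:-vrf-uplink-1}"
    ip vrf show                  # all VRFs with their table IDs
    ip link show vrf "$vrf"      # interfaces assigned to this VRF
    ip route show vrf "$vrf"     # routes in the table of this VRF
}
```

After a reload, show_vrf vrf-uplink-1 should list eth2 as a member and show the routes of table 102.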

Assigning IP Addresses

SLAAC + DHCP

To run SLAAC for IPv6 and DHCP for legacy IP a network file is added:

[Match]
Name=eth0

[Network]
DHCP=yes
IPv6AcceptRA=yes
IPv6PrivacyExtensions=yes

The handling of IPv6 RAs is a bit complicated; check the documentation for IPv6AcceptRA for details. IPv6PrivacyExtensions=yes enables IPv6 privacy extensions.

To set the route metric (this is relevant if you have multiple interfaces and e.g. want to prefer your wired over your wireless interface) add the following lines:

...
[DHCPv4]
RouteMetric=200

[IPv6AcceptRA]
RouteMetric=200

Static Addresses

[Match]
Name=wg-1

[Network]
Address=2a0a:4587:2001:1015::/127

Routes

Static routes can be added with the [Route] config section in the network files. To add more routes, simply add more [Route] config sections.

[Route]
Destination=10.42.123.0/28
Gateway=192.0.2.1

[Route]
Destination=2001:db8:101::/46
Gateway=2001:db8::1

Wireguard Tunnels

For Wireguard a keypair is needed which can be generated with

mkdir -p /etc/wireguard
cd /etc/wireguard
wg genkey | tee privatekey | wg pubkey > publickey

Then the netdev looks like this

[NetDev]
Name=wg-1
Kind=wireguard

[WireGuard]
PrivateKeyFile=/etc/wireguard/privatekey
ListenPort=51820

# for a client that we only listen for but do not actively connect to
[WireGuardPeer]
PublicKey=<peers public key here>
AllowedIPs=::/0, 0.0.0.0/0

# for a peer that we want to connect to
[WireGuardPeer]
PublicKey=<peers public key here>
AllowedIPs=::/0, 0.0.0.0/0
Endpoint=[2001:db8::1234]:51821

Additionally a network file is needed:

[Match]
Name=wg-1

[Network]
Address=172.16.44.1/31
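One detail that is easy to miss: systemd-networkd usually runs as the unprivileged user systemd-network, so it has to be able to read the private key. The systemd.netdev documentation suggests ownership root:systemd-network with mode 0640. A sketch, assuming the key was generated as privatekey in /etc/wireguard as above; the function name is made up.

```shell
# Hypothetical helper: make the private key readable for root and systemd-networkd only.
secure_wg_key() {
    key="${1:-/etc/wireguard/privatekey}"
    chown root:systemd-network "$key"
    chmod 0640 "$key"
}
```

After a reload, wg show wg-1 tells you whether the interface exists and whether a handshake with the peer has taken place.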

Zyxel NR7101 Performance Testing

2023-05-11

Introduction

The Zyxel NR7101 is a PoE-powered 5G router. It is often used to gain internet connectivity for LAN parties. This leads to large volumes of traffic going through the device. Some events reported reliability issues with this device. So we will investigate where the limitations of this device are.

Test-Setup

We connect our test client (in this case a computer running Linux) via Ethernet to the Zyxel router. That router then connects via LTE or 5G to the Broadband ISP (in this case Deutsche Telekom). For some of the tests we require a test server with internet connectivity to send packets from the internet to our test device.

It must be acknowledged that some of the tests might be influenced by the network of the broadband ISP. However, in some cases we can clearly show that the limitation is on the Zyxel router.

Accessing the Web-UI

The device can be configured via a Web UI. This can be done via the wireless network that this device creates or via an Ethernet connection.

From there we can configure many settings. Especially interesting for our purposes is setting the APN under Network Setting -> Broadband -> Cellular APN to an APN which gives us a public IPv4 address, so that we do not have Carrier Grade NAT. Another important option is to enable IP Passthrough, so that our test client actually gets the public IPv4 address and the router does not introduce an additional layer of NAT.

The web UI also offers an overview of many of the signal characteristics. An explanation of these values can be found in the Zyxel support portal.

Finally the device can also be accessed via SSH. To do so one has to activate SSH via the Maintenance -> Remote Management menu. Then we can log into the device via SSH with the user admin and the same password that is also used for the web UI. If your SSH client is fairly new, you might have to enable ssh-rsa as the host key algorithm to log in.

ssh root@192.168.1.1 -o HostKeyAlgorithms=+ssh-rsa

However, the SSH login only gives you access to a very minimalistic, proprietary CLI.

Gaining root access

Since the regular SSH login is very limited, we want a root login instead. The router generates its root password from its serial number. More details on how to generate that password can be found at the OpenWrt Wiki.

Equipped with this root password we have access to the Linux OS running on the device.

BusyBox v1.20.1 (2022-11-29 09:28:07 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

  _______         ___ ___  _______  _____   
 |__     |.--.--.|   |   ||    ___||     |_ 
 |     __||  |  ||-     -||    ___||       |
 |_______||___  ||___|___||_______||_______|
          |_____|  
 -------------------------------------------
    Product: NR7101
    Version: 1.00(ABUV.7)C0
 Build Date: 2022/11/29
 -------------------------------------------
root@NR7101:~# 

It is a small Linux system that runs the standard Linux networking stack, so you will find netfilter, iptables, iproute2, conntrack, tcpdump, etc.

General commands

Here is an overview of the available commands:

root@NR7101:~# 
8021xd             date               gunzip             mailsend           pure-ftpd          syslog-ng          wan
8021xdi            dbclient           gzip               makedevlinks.sh    pwd                syslogd            watch
AutoSpeedTest      dd                 halt               md5sum             qimeitool          sysupgrade         watchdog
Ethctl             depmod             head               microcom           qmi-network        tail               wc
QFirehose          df                 hexdump            mii_mgr            qmicli             tar                wget
SpeedTest          dhcp6c             hostid             mkdir              quectel-CM         taskset            which
[                  dhcp6relay         hostname           mkfifo             quectel-DTool      tc                 wifi
[[                 dhcp6s             hotplug-call       mknod              quectel_qlog       tcpdump            wifi_led.sh
ac                 diag_start.dat     httpdiag           mktemp             radvd              tee                wifi_off_timer.sh
acl                diff               hwclock            modprobe           ramonitor          telnet             wlan
agetty             dirname            hwnat              more               readlink           telnetd            wlan_wps
arping             dmesg              hwnat-disable.sh   mosquitto_pub      reboot             test               xargs
ash                dns                hwnat-enable.sh    mosquitto_sub      redirect_console   tftp               yes
atcmd              dnsdomainname      id                 mount              reg                time               zcat
ated               dnsmasq            ifconfig           mpstat             reset              top                zcmd
atftp              drop_caches.sh     init               mtd_write          restoredefault     touch              zebra
awk                dropbear           insmod             mtr                rilcmd             tr                 zhttpd
basename           dropbearkey        ip                 mv                 rilcmd.sh          traceroute         zhttpput
bcm_erp.sh         du                 ip6tables          nc                 ripd               traceroute6        zpublishcmd
blkid              ebtables           ip6tables-restore  ndppd              rm                 true               zstun
brctl              echo               ip6tables-save     netstat            rmdir              tty2tcp            zsuptr69
btnd               egrep              ipcalc.sh          nice               rmmod              tty_log_echo       zsuptr69cmd
bunzip2            env                iperf3             nslookup           route              ubiattach          ztr369cmd
busybox            esmd               iptables           ntpclient          rs6                ubiblock           ztr69
bzcat              eth_mac            iptables-restore   ntpd               scp                ubicrc32           ztr69cli
cat                ethwanctl          iptables-save      nuttcp             sed                ubidetach          ztr69cmd
cfg                expr               iwconfig           nvram              self_check.sh      ubiformat          ztzu
chgrp              ez-ipupdate        iwlist             nvram-factory      sendarp            ubimkvol           zupnp
chmod              false              iwpriv             obuspa             seq                ubinfo             zupnp.sh
chown              fdisk              kill               obuspa.sh          setsmp.sh          ubinize            zybtnchk
chpasswd           fgrep              killall            openssl            sh                 ubirename          zycfgfilter
chroot             find               klogd              opkg               sleep              ubirmvol           zycli
clear              firstboot          led.sh             passwd             smp.sh             ubirsvol           zyecho
cmp                flash_mtd          less               pgrep              snmpd              ubiupdatevol       zyecho_client
config.sh          free               ln                 pidof              sort               udhcpc             zyledctl
conntrack          fsync              logger             ping               speedtest          udpst              zysh
conntrackd         ftpget             login              ping6              ss                 umount             zywifid
cp                 ftpput             login.sh           pings              start-stop-daemon  uname              zywifid_run.sh
crond              fuser              logread            pingsvrs           switch             uniq               zywlctl
crontab            fwwatcher          logrotate          pivot_root         switch_root        updatedd
curl               genXML             ls                 poweroff           swversion          uptime
cut                getty              lsmod              pppoectl           sync               vcautohuntctl
dalcmd             gpio               lsof               printf             sys                vconfig
dat2uci            grep               lte_srv_diag       ps                 sysctl             vi

Interfaces

The interface configuration looks like this:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ifb0: <BROADCAST,NOARP> mtu 2048 qdisc noop state DOWN group default qlen 32
    link/ether 2e:c1:bf:ac:dd:3a brd ff:ff:ff:ff:ff:ff
3: ifb1: <BROADCAST,NOARP> mtu 2048 qdisc noop state DOWN group default qlen 32
    link/ether c6:b6:cb:33:53:e7 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UNKNOWN group default qlen 1000
    link/ether 4c:c5:3e:a3:79:e2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::4ec5:3eff:fea3:79e2/64 scope link 
       valid_lft forever preferred_lft forever
5: usb0: <NOARP,UP,LOWER_UP> mtu 2048 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::50:f4ff:fe00:0/64 scope link 
       valid_lft forever preferred_lft forever
6: wwan0: <NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::50:f4ff:fe00:0/64 scope link 
       valid_lft forever preferred_lft forever
7: wwan1: <NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
8: wwan2: <NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
9: wwan3: <NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
10: wwan4: <NOARP> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
11: wwan5: <NOARP> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
12: wwan6: <NOARP> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
13: wwan7: <NOARP> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 02:50:f4:00:00:00 brd ff:ff:ff:ff:ff:ff
14: teql0: <NOARP> mtu 1500 qdisc noop state DOWN group default qlen 100
    link/void 
15: ra0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
    link/ether 4c:c5:3e:a3:79:e3 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::4ec5:3eff:fea3:79e3/64 scope link 
       valid_lft forever preferred_lft forever
16: eth3: <BROADCAST,MULTICAST> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 00:0c:43:28:80:03 brd ff:ff:ff:ff:ff:ff
17: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 4c:c5:3e:a3:79:e2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global br0
       valid_lft forever preferred_lft forever
    inet6 fe80::4ec5:3eff:fea3:79e2/64 scope link 
       valid_lft forever preferred_lft forever
18: apcli0: <BROADCAST,MULTICAST> mtu 2048 qdisc noop state DOWN group default qlen 1000
    link/ether 4e:c5:3e:03:79:e3 brd ff:ff:ff:ff:ff:ff
19: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 4c:c5:3e:a3:79:e2 brd ff:ff:ff:ff:ff:ff
    inet 37.83.118.29/30 brd 37.83.118.31 scope global br1
       valid_lft forever preferred_lft forever
    inet6 fe80::4ec5:3eff:fea3:79e2/64 scope link 
       valid_lft forever preferred_lft forever

And since there are also bridges, here is the bridge configuration.

root@NR7101:/usr/sbin# brctl show
bridge name	bridge id		STP enabled	interfaces
br0		8000.4cc53ea379e2	no		ra0
br1		8000.4cc53ea379e2	no		eth2

The important parts here are:

  • wwan0: The first WWAN interface.
  • eth2: The Ethernet link that we are using to connect to the device. This is connected to br1.
  • br1: The bridge interface over which we access the device via eth2.
  • br0: The bridge for the wireless LAN network of the device.
  • ra0: the actual wireless interface that is connected to br0.

Routing Table

The routing table is also rather straightforward:

root@NR7101:/usr/sbin# ip route
default dev wwan0  scope link  src 37.83.118.29 
37.83.118.28/30 dev br1  proto kernel  scope link  src 37.83.118.29 
127.0.0.0/16 dev lo  scope link 
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1 
239.0.0.0/8 dev br0  scope link 

First we have the default route as an on-link route. Then come the local networks on the bridges br1 and br0 and the loopback interface. For some reason there is also a multicast network routed to the bridge for the wireless network.

Connection Tracking

Regarding connection tracking there is conntrack running with support for 20480 connections and 4096 buckets for the hash-table.

The other settings for conntrack appear to be reasonable at first glance:

root@NR7101:/usr/sbin# sysctl -a | grep net.netfilter.nf_conntrack
sysctl: error reading key 'net.ipv4.route.flush': Permission denied
sysctl: error reading key 'net.ipv6.route.flush': Permission denied
net.netfilter.nf_conntrack_acct = 0
net.netfilter.nf_conntrack_buckets = 4096
net.netfilter.nf_conntrack_checksum = 1
net.netfilter.nf_conntrack_count = 17
net.netfilter.nf_conntrack_expect_max = 60
net.netfilter.nf_conntrack_frag6_high_thresh = 4194304
net.netfilter.nf_conntrack_frag6_low_thresh = 3145728
net.netfilter.nf_conntrack_frag6_timeout = 60
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_helper = 1
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_icmpv6_timeout = 30
net.netfilter.nf_conntrack_log_invalid = 0
net.netfilter.nf_conntrack_max = 20480
net.netfilter.nf_conntrack_tcp_be_liberal = 0
net.netfilter.nf_conntrack_tcp_loose = 1
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180

The conntrack tool is also available. conntrack -L lists the tracked sessions while conntrack -C returns the number of tracked sessions.
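To see how close the table is to its limit, the current count can be put in relation to the maximum. A minimal sketch; conntrack_usage is a made-up helper that uses integer arithmetic only, so it should also work in the BusyBox ash on the device.

```shell
# Hypothetical helper: print how full the conntrack table is.
# $1 = current entries, $2 = maximum entries
conntrack_usage() {
    echo "$1 of $2 entries in use ($(( $1 * 100 / $2 ))%)"
}

# With the values from the sysctl output above:
conntrack_usage 17 20480

# On the device itself the live values can be read from /proc:
# conntrack_usage "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" \
#                 "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"
```

With only 17 tracked sessions the idle device is nowhere near the limit; the percentage rounds down to 0.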

IPTables

The router is running iptables as a firewall. The ruleset is very lengthy; a dump of the rules can be found in a GitHub repository.

In general there are a lot of tables that are empty by default. In IP-Passthrough mode the FORWARD chain simply accepts everything. The INPUT chain allows the protocols defined for remote management and everything that has a RELATED/ESTABLISHED state.

Benchmarks

Bandwidth-Test

We will start with a simple bandwidth test to get an estimate for the maximum bandwidth available.

For this we are running iperf3 for 60 seconds: once with the client as sender, once as receiver and once in bidirectional mode. We repeat the tests with 10 parallel sessions to check that we are not limited by a single flow.

The throughput depends a lot on the signal strength and the general usage of the cell we are in. During daytime the tests were in the range of 60-120Mbit/s download speed, however during the early morning hours on a Sunday we were able to achieve the 350Mbit/s that are advertised by the ISP in the area we tested in.

During this test the CPU utilization never got beyond 10%.
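The exact invocations are not listed in this post, so the following is a reconstruction with standard iperf3 options and not the original test script. run_bandwidth_tests is a made-up name, the server argument is whatever iperf3 server you have on the internet side, and --bidir needs a reasonably recent iperf3 (3.7 or newer as far as I know).

```shell
# Hypothetical helper; $1 is the iperf3 server on the internet side.
run_bandwidth_tests() {
    server="$1"
    for streams in 1 10; do
        iperf3 -c "$server" -t 60 -P "$streams"          # client sends
        iperf3 -c "$server" -t 60 -P "$streams" -R       # client receives
        iperf3 -c "$server" -t 60 -P "$streams" --bidir  # both directions at once
    done
}
```

This runs six tests of 60 seconds each, first with a single stream and then with 10 parallel streams.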

New-sessions-per-second test

To test how many new sessions the device can handle we generate them via trafgen and this packet description:

{
  # --- ethernet header ---
  eth(sa=aa:bb:cc:dd:ee:ff, da=gg:hh:ii:jj:kk:ll)
  # --- ip header ---
  ipv4(id=drnd(), ttl=64, sa=A.B.C.D, da=E.F.G.H)
  # --- UDP  header ---
  udp(sport=48054, dport=dinc(10000, 50000), csum=0)
  # payload
  'A',  fill(0x41, 11),
}

The MAC and IP addresses have to be adjusted accordingly. This generates a new session by increasing the UDP destination port. The test is then executed with the following command.

trafgen -o enp0s31f6 -i packet.dsc -P 4 -b 300pps

In this case we are testing with 300 packets per second (which is equivalent to 300 new sessions per second here).

Somewhere between 250 and 300 new sessions per second the Zyxel device starts sending pause frames and has one of its four cores completely utilized with interrupt handling. This is because all interrupts are handled on one core.

root@NR7101:~# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  3:          0    2115968          0          0  MIPS GIC  eth2

Since irqbalance is not installed we can't easily redistribute these interrupts.
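The imbalance is easy to quantify from the counters. The sketch below works on the sample line copied from the output above; irq_share is a made-up helper, and fields 2 to 5 are assumed to be the per-CPU counters of a four-core system.

```shell
# Sample line copied from the output above; on the device use:
#   grep eth2 /proc/interrupts | irq_share
line='  3:          0    2115968          0          0  MIPS GIC  eth2'

# Print the share of interrupts that each of the four cores has handled.
irq_share() {
    awk '{
        total = $2 + $3 + $4 + $5
        for (i = 2; i <= 5; i++)
            printf "CPU%d: %d%%\n", i - 2, (total ? 100 * $i / total : 0)
    }'
}

echo "$line" | irq_share
```

For the sample line this reports 100% on CPU1 and 0% on the other three cores.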

Reducing the amount of allowed conntrack sessions

When we reduce the number of conntrack sessions to something very low we can test what happens when the conntrack table is full.

We set it to 80 with echo "80" > /proc/sys/net/netfilter/nf_conntrack_max. Then we ran an iperf test with 100 sessions in parallel.

iperf could not open all of these sessions, because the conntrack session table was full and the router started dropping packets and informed us in dmesg:

...
[14008.628000] nf_conntrack: table full, dropping packet
[14009.636000] nf_conntrack: table full, dropping packet
[14011.904000] nf_conntrack: table full, dropping packet
[14012.916000] nf_conntrack: table full, dropping packet
...

There are no stateful matches in the iptables rules for the path these packets take, but they are dropped regardless.

Disabling conntrack for sessions that only go through the device

We tried to disable connection tracking for all flows that only go through the device, however the firmware that is running on the device does not allow for that. The NOTRACK target and the --notrack option for the CT target are unknown to the device:

root@NR7101:/# iptables -t raw -A PREROUTING -d 1.2.3.4 -j CT --notrack
iptables v1.4.16.3: unknown option "--notrack"
Try `iptables -h' or 'iptables --help' for more information.
root@NR7101:/# iptables -t raw -A PREROUTING -d 1.2.3.4 -j NOTRACK
iptables v1.4.16.3: Couldn't find target `NOTRACK'

Try `iptables -h' or 'iptables --help' for more information.

So disabling connection tracking is not easily possible.

Increasing the amount of conntrack sessions

We can adjust the maximum amount of conntrack sessions and the conntrack bucket size, if we run into the packet-drop problem.

echo "40960" > /proc/sys/net/netfilter/nf_conntrack_max
echo "8192" > /sys/module/nf_conntrack/parameters/hashsize

For a detailed discussion of those values take a look at this Conntrack Parameter Tuning wiki page.
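Both values in the commands above were doubled together. A quick check with the numbers from this post shows that the average number of entries per hash bucket stays the same as with the stock settings:

```shell
# Average number of conntrack entries per hash bucket (max / buckets).
stock=$((20480 / 4096))
raised=$((40960 / 8192))
echo "stock:  $stock entries per bucket"
echo "raised: $raised entries per bucket"
```

If you raise only nf_conntrack_max, the table can hold more sessions but the hash chains get longer, so lookups take more time.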

Conclusion

The NR7101 has some limitations regarding the number of sessions it can handle. There is no direct solution available, however increasing nf_conntrack_max can help a bit. There is no easy workaround for the limited number of new sessions per second. Another option could be to use the OpenWrt firmware instead.

Remarks regarding the NR7102

The NR7102 comes from the same series of devices as the NR7101. From the outside and the Web UI this device looks very similar to the NR7101. However the method to gain a root password does not work here. So we can't take a deeper look into this device.

Building a simple Linux multipath router

2022-05-03

I am occasionally running LAN parties. The available internet bandwidth is often an issue. I wanted a simple way to balance traffic over multiple internet uplinks. It is assumed that there will be NAT on or after the egress interface of the router.

This first version tries to be simple, while not being optimal in all cases.

The basic feature we will be using is multipath routing. Instead of a simple default route our router will have a default route with multiple nexthops. This can be added via iproute2:

ip route add 0.0.0.0/0 \
	nexthop via 172.21.0.1 dev bond0.300 weight 3 \
	nexthop via 172.21.101.1 dev eno1 weight 1

This route will balance the traffic to the nexthops 172.21.0.1 and 172.21.101.1 over the specified interfaces. The weight setting will balance the flows (statistically) 3 to 1 between the links. You can adjust that according to your available bandwidth.
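The expected split follows directly from the weights. A small calculation with the values from the route above:

```shell
# Expected share of flows per nexthop for the weights used above.
w1=3   # weight of 172.21.0.1 via bond0.300
w2=1   # weight of 172.21.101.1 via eno1
total=$((w1 + w2))
share1=$((100 * w1 / total))
share2=$((100 * w2 / total))
echo "172.21.0.1:   ${share1}% of flows"
echo "172.21.101.1: ${share2}% of flows"
```

This is a statistical expectation over many flows; a handful of flows can deviate a lot from 75% and 25%.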

You also might want to check that there is no other default route with a lower metric, because that one will be preferred.

L4 Hashing

By default the load balancing is only done on L3 (IP) headers. A hash is calculated over these fields and that is then used to assign the flow to a nexthop. So given a big enough number of flows you should see a flow distribution according to your weights. To use the L4 headers (which will make it possible to balance flows of the same client over different uplinks) we have to set the sysctl

net.ipv4.fib_multipath_hash_policy = 1
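The difference between the two policies can be illustrated with a toy model. This is explicitly not the kernel's hash function: pick_nexthop is a made-up helper that uses cksum as a stand-in hash, and the client address 10.0.0.5 is invented. It only shows which header fields influence the decision.

```shell
# Toy model only: the kernel uses its own flow hash, not cksum.
pick_nexthop() {
    # $1 = string of header fields that go into the hash
    h=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    # weights 3:1 -> four slots, three of them for the first nexthop
    if [ $((h % 4)) -lt 3 ]; then echo 172.21.0.1; else echo 172.21.101.1; fi
}

src=10.0.0.5
dst=9.9.9.9
for sport in 40001 40002 40003 40004; do
    l3=$(pick_nexthop "$src $dst")              # L3: addresses only
    l4=$(pick_nexthop "$src $dst $sport 443")   # L4: addresses and ports
    echo "sport=$sport  L3 policy: $l3  L4 policy: $l4"
done
```

With the L3 policy every connection between the same two hosts gets the same nexthop, while with the L4 policy the result can change with the source port. The sysctl itself can be set at runtime with sysctl -w and made permanent with a file in /etc/sysctl.d/.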

Caveats

There are some issues with this setup:

  1. It is not tested in production yet.
  2. There might be issues with protocols that use multiple ports in parallel (e.g. ftp) or services that need to track users by IP address. This might be "solved" by sticking to L3 header hashing only.
  3. This setup in this naive form does not consider link latency or special traffic classification.
  4. This does not handle traffic shaping. If you want to rate limit the traffic, you have to do this via tc, nft or whatever you prefer.
  5. In the case that you are not running NAT but have a public address space that is reachable via all of your uplinks you can only control the way packets are leaving your network. The ingress way is out of your control.
  6. Keep in mind that different flows might cause different amounts of traffic. If you have 100 flows and a 1:1 split, then you will see ~50 flows per nexthop. If you are unlucky, the 50 flows for one nexthop might need a lot of traffic and fill up that link while the other 50 are not enough to fill the other link, although this scenario is rather unlikely.

Testing the setup

To test the setup you have to send traffic from an address that is not on the transit networks to your nexthops, because otherwise the route selection algorithm might influence the route decision.

You can use mtr -I $iface 9.9.9.9 to check which nexthop you get. Just select an appropriate interface or use a second computer behind the router. In the latter case you can skip the -I $iface parameter. When you use several different destination addresses you should see the different nexthops you specified.

Alternatively ip route get $target from $address should return different nexthops for different targets.
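To try a handful of destinations in one go, the lookup can be wrapped in a loop. A sketch with a made-up helper name; the source address and the destinations are whatever fits your setup.

```shell
# Hypothetical helper: ask the kernel which nexthop it would pick.
# $1 = source address to test with, remaining arguments = destinations
show_nexthops() {
    address="$1"
    shift
    for target in "$@"; do
        ip route get "$target" from "$address"
    done
}
```

For example show_nexthops "$address" 9.9.9.9 1.1.1.1 8.8.8.8 prints one lookup result per destination.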

Booting Cisco 3560E switches with IOS 15.2 does not work

2021-09-01

TL;DR: You cannot boot IOS 15.2 from the 3560X switches on 3560E switches.

The why and what happens if you try it

I have a couple of 3560E switches. Sadly they only run IOS 15.0. I wanted to have IPv4 address families over OSPFv3, which is not supported in IOS 15.0 but is in IOS 15.2. The IOS 15.2 image for the successor model of the 3560E, the 3560X, is called c3560e-universalk9-mz.152-4.E10.bin, and the IOS 15.0 firmware for the 3560X switches is the same file as for the 3560E. So I tried to boot the 15.2(4)E10 image. The switch boots the image but hits a malloc error during the boot process, crashes and reboots. So no OSPFv3 with IPv4 address family on those switches.

Here is the output of the boot process:

<...snip...>
POST: Thermal, Fan Tests : Begin
POST: Thermal, Fan Tests : End, Status Passed

POST: PortASIC Port Loopback Tests : Begin
POST: PortASIC Port Loopback Tests : End, Status Passed

POST: EMAC Loopback Tests : Begin
POST: EMAC Loopback Tests : End, Status Passed

Waiting for Port download...Complete
SYSTEM INIT: INSUFFICIENT MEMORY TO BOOT THE IMAGE!



%Software-forced reload


 00:01:01 UTC Mon Jan 2 2006: Unexpected exception to CPUvector 2000, PC = 36828B0
-Traceback= 0x36828B0z 0x2AFD98Cz 0x2B05F5Cz 0x3678A24z 0x367DF60z 0x2B2A9E0z 0x31D2AE0z 0x3118FDCz 0x3206400z 0x3209A2Cz 0x2AE5D30z 0x2AE5EFCz 0x2AE604Cz 0x6D921Cz 0x6D9454z 0x3683BA0z

Writing crashinfo to flash:/crashinfo_ext/crashinfo_ext_4

=== Flushing messages (00:01:03 UTC Mon Jan 2 2006) ===

Buffered messages:
Queued messages:
*Jan  2 00:01:03.929: %SYS-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.

*Mar  1 00:00:07.012: Read env variable - LICENSE_BOOT_LEVEL =
*Jan  2 00:00:03.456: %IOS_LICENSE_IMAGE_APPLICATION-6-LICENSE_LEVEL: Module name = c3560e Next reboot level = ipservices and License = ipservices
*Jan  2 00:01:01.329: %SYS-2-MALLOCFAIL: Memory allocation of 60000 bytes failed from 0x3678A20, alignment 0  <<< This is the relevant part.
Pool: Processor  Free: 38104  Cause: Not enough free memory
Alternate Pool: None  Free: 0  Cause: No Alternate pool
 -Process= "Init", ipl= 0, pid= 3
-Traceback= 6C9738z 2AFD8F8z 2B05F5Cz 3678A24z 367DF60z 2B2A9E0z 31D2AE0z 3118FDCz 3206400z 3209A2Cz 2AE5D30z 2AE5EFCz 2AE604Cz 6D921Cz 6D9454z 3683BA0z
Cisco IOS Software, C3560E Software (C3560E-UNIVERSALK9-M), Version 15.2(4)E10, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Tue 31-Mar-20 21:44 by prod_rel_team

Debug Exception (Could be NULL pointer dereference) Exception (0x2000)!
SRR0 = 0x031BB534  SRR1 = 0x00029230  SRR2 = 0x036828B0  SRR3 = 0x00029230
ESR = 0x00000000  DEAR = 0x00000000  TSR = 0x84000000  DBSR = 0x10000000

CPU Register Context:
Vector = 0x00002000  PC = 0x036828B0  MSR = 0x00029230  CR = 0x35000053
LR = 0x0368284C  CTR = 0x006B4068  XER = 0xE0000075
R0 = 0x0368284C  R1 = 0x06357288  R2 = 0x00000000  R3 = 0x04F897A8
R4 = 0x00000000  R5 = 0x00000000  R6 = 0x06357258  R7 = 0x05850000
R8 = 0x00029230  R9 = 0x05850000  R10 = 0x00008000  R11 = 0x00000000
R12 = 0x35000059  R13 = 0x079817A8  R14 = 0x053AFDD8  R15 = 0x00000000
R16 = 0x038B2550  R17 = 0x00000004  R18 = 0x00000020  R19 = 0x05850000
R20 = 0x00000000  R21 = 0x00000000  R22 = 0x0000EA60  R23 = 0x00000000
R24 = 0x00000000  R25 = 0x053AFDD8  R26 = 0x03678A20  R27 = 0x0000EA60
R28 = 0x00000000  R29 = 0x05EE80C0  R30 = 0x05BA0978  R31 = 0x00000000

Stack trace:
PC = 0x036828B0, SP = 0x06357288
Frame 00: SP = 0x06357298    PC = 0x0368284C
Frame 01: SP = 0x063572C8    PC = 0x02AFD98C
Frame 02: SP = 0x06357360    PC = 0x02B05F5C
Frame 03: SP = 0x06357378    PC = 0x03678A24
Frame 04: SP = 0x06357390    PC = 0x0367DF60
Frame 05: SP = 0x063573C0    PC = 0x02B2A9E0
Frame 06: SP = 0x063573E8    PC = 0x031D2AE0
Frame 07: SP = 0x06357400    PC = 0x03118FDC
Frame 08: SP = 0x06357418    PC = 0x03206400
Frame 09: SP = 0x06357458    PC = 0x03209A2C
Frame 10: SP = 0x06357468    PC = 0x02AE5D30
Frame 11: SP = 0x06357488    PC = 0x02AE5EFC
Frame 12: SP = 0x063574A0    PC = 0x02AE604C
Frame 13: SP = 0x063574D0    PC = 0x006D921C
Frame 14: SP = 0x063575D8    PC = 0x006D9454
Frame 15: SP = 0x063575E0    PC = 0x03683BA0

<...switch reboots...>

The other interesting problem is now: How to reset the switch? Normally the switch would try the next image on disk if the first one fails, but this image works "good enough" for that mechanism not to kick in. This means that the switch is stuck in an infinite boot loop.

Resetting the switch to the old image

Disconnect the power, connect to the console port, press and hold the Mode button. Then plug the power cord back in. After 10-20 seconds release the Mode button. After a short time the switch: prompt should appear and you are in the bootloader.

To boot the old image you have to initialize the filesystem (flash_init), find the new, broken image with dir flash: and delete it with delete flash:/c3560e-universalk9-mz.152-4.E10.bin. After that boot the switch with boot. The switch tries to boot the image that is named in the config file, fails, and then tries the first available image on the device, which is probably your old working image. If there is no working image on the flash, have fun with xmodem or tftp.


Using driver version 1 for media type 2
Base ethernet MAC Address: 08:1f:f3:39:8c:80
Xmodem file system is available.
The password-recovery mechanism is enabled.

The system has been interrupted prior to initializing the
flash filesystem.  The following commands will initialize
the flash filesystem, and finish loading the operating
system software:

    flash_init
    boot

switch: ?
           ? -- Present list of available commands
        boot -- Load and boot an executable image
         cat -- Concatenate (type) file(s)
        copy -- Copy a file
      delete -- Delete file(s)
         dir -- List files in directories
  flash_init -- Initialize flash filesystem(s)
      format -- Format a filesystem
        fsck -- Check filesystem consistency
        help -- Present list of available commands
      memory -- Present memory heap utilization information
       mkdir -- Create dir(s)
        more -- Concatenate (display) file(s)
      rename -- Rename a file
       reset -- Reset the system
       rmdir -- Delete empty dir(s)
         set -- Set or display environment variables
      set_bs -- Set attributes on a boot sector filesystem
   set_param -- Set system parameters in flash
       sleep -- Pause (sleep) for a specified number of seconds
        type -- Concatenate (type) file(s)
       unset -- Unset one or more environment variables
     version -- Display boot loader version

switch: flash_init
Initializing Flash...
mifs[2]: 10 files, 1 directories
mifs[2]: Total bytes     :    2097152
mifs[2]: Bytes used      :     614400
mifs[2]: Bytes available :    1482752
mifs[2]: mifs fsck took 2 seconds.
mifs[3]: 3 files, 1 directories
mifs[3]: Total bytes     :    4194304
mifs[3]: Bytes used      :     949248
mifs[3]: Bytes available :    3245056
mifs[3]: mifs fsck took 2 seconds.
mifs[4]: 5 files, 1 directories
mifs[4]: Total bytes     :     524288
mifs[4]: Bytes used      :       9216
mifs[4]: Bytes available :     515072
mifs[4]: mifs fsck took 0 seconds.
mifs[5]: 5 files, 1 directories
mifs[5]: Total bytes     :     524288
mifs[5]: Bytes used      :       9216
mifs[5]: Bytes available :     515072
mifs[5]: mifs fsck took 1 seconds.
 -- MORE --
mifs[6]: 15 files, 3 directories
mifs[6]: Total bytes     :   57671680
mifs[6]: Bytes used      :   49069056
mifs[6]: Bytes available :    8602624
mifs[6]: mifs fsck took 26 seconds.
...done Initializing Flash.

switch: dir

List of filesystems currently registered:

                  bs[0]: (read-only)
               flash[6]: (read-write)
              xmodem[7]: (read-only)
                null[8]: (read-write)

switch: dir flash:
Directory of flash:/

    2  -rwx  20310016  <date>               c3560e-universalk9-mz.150-2.SE11.bin
    3  drwx  512       <date>               crashinfo_ext
    8  -rwx  1560      <date>               express_setup.debug
    9  -rwx  916       <date>               vlan.dat
   10  -rwx  64        <date>               ztp.py
   11  -rwx  26771456  <date>               c3560e-universalk9-mz.152-4.E10.bin
   12  -rwx  1920      <date>               private-config.text
   13  -rwx  5144      <date>               multiple-fs
   14  -rwx  6223      <date>               config.text
   15  drwx  512       <date>               crashinfo

8602624 bytes available (49069056 bytes used)

switch: delete flash:c3560e-universalk9-mz.152-4.E10.bin
Are you sure you want to delete "flash:c3560e-universalk9-mz.152-4.E10.bin" (y/n)?y
File "flash:c3560e-universalk9-mz.152-4.E10.bin" deleted

switch: boot
Loading "flash:c3560e-universalk9-mz.152-4.E10.bin"...flash:c3560e-universalk9-mz.152-4.E10.bin: no such file or directory

Error loading "flash:c3560e-universalk9-mz.152-4.E10.bin"

Interrupt within 5 seconds to abort boot process.
Loading "flash:/c3560e-universalk9-mz.150-2.SE11.bin"...@@@@@@@@@@@@@@@<... switch booting image as usual ...>

Compressing IPv6 Addresses with Regular Expressions

2021-03-06

This post was lying around for too long, time to finish it and get it off my todo list.

The Goal

Convert from a full IPv6 address like 2001:0db8:0000:0000:0000:0023:4200:0123 to a compressed one like 2001:db8::23:4200:123 with a regular expression. Why? Because I can. And because some Prometheus exporters only give you the uncompressed addresses.

Details regarding IPv6 address compression can be found in RFC 5952 Section 4.

Preprocessing

The first step is to get from the full form to a form where the leading zeros are removed/reduced to just a single zero per block.

This could be done like this:

^0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*:)0{0,3}(.*)$

and $1$2$3$4$5$6$7$8 as a replacement string. This results in 2001:db8:0:0:0:23:4200:123
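
A minimal Python sketch of this preprocessing step, assuming the input is a full-form address with eight four-digit blocks. The function name strip_leading_zeros is made up for this example. Python's re module writes the backreferences as \1 to \8 instead of $1 to $8.

```python
import re

# One capture group per block. 0{0,3} swallows up to three leading zeros,
# so a four-digit block always keeps at least its last digit.
STRIP_ZEROS = re.compile(r"^" + r"0{0,3}(.*:)" * 7 + r"0{0,3}(.*)$")


def strip_leading_zeros(full_address: str) -> str:
    """Reduce every block of a full-form IPv6 address to its shortest form."""
    return STRIP_ZEROS.sub(r"\1\2\3\4\5\6\7\8", full_address)


print(strip_leading_zeros("2001:0db8:0000:0000:0000:0023:4200:0123"))
# prints 2001:db8:0:0:0:23:4200:123
```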

The Common Case

Now the fun part begins. Basically we have to find the first longest sequence of zero-blocks that is longer than 1 block. If we make one group with everything to the left of that sequence and one with everything to the right (each including the adjacent : ) and combine the 2 groups, we get the result. There are some corner cases, those will be handled later.

To find a sequence of length N we can build a very simple expression like

0:0:< in total N zeros >:0:0

Everything to the left of that must have fewer than N consecutive zero blocks. An expression to match that could be:

((0:){0,$N-1}[1-9a-f][0-9a-f]{0,3}:)*

Everything to the right of the zero block sequence can have at most N consecutive zero blocks. The expression for that looks like this.

(:[1-9a-f][0-9a-f]{0,3}(:0){0,$N})*

Combined they result in this expression:

(((0:){0,$N-1}[1-9a-f][0-9a-f]{0,3}:)*)0:0:< in total N zeros >:0:0((:[1-9a-f][0-9a-f]{0,3}(:0){0,$N})*)

Combining the first and fourth group of that expression results in the compressed representation of the address, if the longest zero block sequence is N blocks long. We can build the expression for all possible block lengths.
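
To make the group numbering concrete, here is a small Python sketch that builds the common-case expression for a given N and combines groups 1 and 4. The helper names common_case_pattern and compress_common_case are invented for this example, and the input is assumed to be lowercase and already preprocessed.

```python
import re

NON_ZERO_BLOCK = "[1-9a-f][0-9a-f]{0,3}"


def common_case_pattern(n: int) -> str:
    """Expression for a longest zero sequence of exactly n blocks in the middle."""
    left = f"(((0:){{0,{n - 1}}}{NON_ZERO_BLOCK}:)*)"   # groups 1 to 3
    zeros = ":".join("0" * n)                           # 0:0:...:0 with n zeros
    right = f"((:{NON_ZERO_BLOCK}(:0){{0,{n}}})*)"      # groups 4 to 6
    return f"^{left}{zeros}{right}$"


def compress_common_case(address: str, n: int):
    """Return the compressed address, or None if this N does not apply."""
    match = re.match(common_case_pattern(n), address)
    if match is None:
        return None
    return match.group(1) + match.group(4)


print(compress_common_case("2001:db8:0:0:0:23:4200:123", 3))
# prints 2001:db8::23:4200:123
print(compress_common_case("2001:db8:0:0:0:23:4200:123", 2))
# prints None, because the zero sequence is longer than 2 blocks
```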

The Corner Cases

As mentioned earlier there are several corner cases.

Compression at the left or right side

If the longest sequence is at the left or right end (e.g. 0:0:0:0:0:0:1:2 and 2:1:0:0:0:0:0:0) then we need a special expression. For the left side it looks like this:

0(:)0:< N-1 times 0 in total >:0:0((:[1-9a-f][0-9a-f]{0,3}(:0){0,$N})*)

This solves 2 problems:

  1. The group that would be on the left side with the expression for the common case needs to be removed, because there is nothing to match there.
  2. We have to find a : to build the :: in the compressed representation. This is done by taking one from the N zero blocks of the longest sequence.

The right side is constructed the same way.
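
A short Python sketch of both side cases, using the two example addresses from above. The function names compress_left and compress_right are made up for this example, and lowercase preprocessed input is assumed.

```python
import re

NON_ZERO_BLOCK = "[1-9a-f][0-9a-f]{0,3}"


def compress_left(address: str, n: int):
    # n zero blocks at the left end; the first one donates its ":" (group 1).
    rest = ":".join("0" * (n - 1))
    right = f"((:{NON_ZERO_BLOCK}(:0){{0,{n}}})*)"
    match = re.match(f"^0(:){rest}{right}$", address)
    return match.group(1) + match.group(2) if match else None


def compress_right(address: str, n: int):
    # Mirror image: the last zero block donates its ":" (group 4).
    left = f"(((0:){{0,{n - 1}}}{NON_ZERO_BLOCK}:)*)"
    rest = ":".join("0" * (n - 1))
    match = re.match(f"^{left}{rest}(:)0$", address)
    return match.group(1) + match.group(4) if match else None


print(compress_left("0:0:0:0:0:0:1:2", 6))   # prints ::1:2
print(compress_right("2:1:0:0:0:0:0:0", 6))  # prints 2:1::
```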

Compressing 7 consecutive zero blocks

When there are 7 consecutive zero blocks, the compression happens at either the left or the right side, because there is only one non-zero block left. For 7 blocks only the two expressions for the sides are needed, not the common one.

Compressing 8 zero blocks

All the other expressions don't work for the one case of 8 consecutive zero blocks. We still have to get 2 : for the :: from somewhere. This could look like this:

0(:)0(:)0:0:0:0:0:0
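
As a quick check, the all-zero case in Python (the function name compress_all_zero is only for illustration). The two captured colons are the whole result.

```python
import re


def compress_all_zero(address: str) -> str:
    # Groups 1 and 2 each capture one ":"; together they form "::".
    return re.sub(r"^0(:)0(:)0:0:0:0:0:0$", r"\1\2", address)


print(compress_all_zero("0:0:0:0:0:0:0:0"))  # prints ::
```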

Combining it all

In total we get 3 expressions each for 2, 3, 4, 5 and 6 consecutive zero blocks, 2 for the 7 consecutive zero blocks and 1 for the 8 zero blocks.

Those 18 expressions can be combined like this:

^(($EXPR1)|($EXPR2)|...)$

Building all of this by hand is shitty and annoying. Here is some code to do it.

zero = "0"
non_zero = "[1-9a-f]"
all_chars = "[0-9a-f]"
non_zero_chunk = f"{non_zero}{all_chars}{{0,3}}"

def max_n_zero_block_left(n: int) -> str:
    return f"((({ zero }:){{0,{n}}}{ non_zero_chunk }:)*)"

def max_n_zero_block_right(n: int) -> str:
    return f"((:{ non_zero_chunk }(:{ zero }){{0,{n}}})*)"

def n_length_zero_block(n: int) -> str:
    return ":".join(zero*n)

left_zero_prefix = f"0(:)"
right_zero_suffix = f"(:)0"

patterns = list()
replacement_positions = list()
next_pattern_group_start = 1

for n in range(7,1,-1):
        # special case for the longest continuous zero sequence at the left
        pattern = f"({ left_zero_prefix }{ n_length_zero_block(n - 1) }({ max_n_zero_block_right(n) }))"
        patterns.append(pattern)
        replacement_positions.append(next_pattern_group_start + 2)
        replacement_positions.append(next_pattern_group_start + 3)
        next_pattern_group_start += 6

        # special case for ending with 0
        pattern = f"(({ max_n_zero_block_left(n-1) }){ n_length_zero_block(n - 1) }{ right_zero_suffix })"
        patterns.append(pattern)
        replacement_positions.append(next_pattern_group_start + 2)
        replacement_positions.append(next_pattern_group_start + 6)
        next_pattern_group_start += 6

        if n == 7:
            continue  # the regular case does not exist for n=7

        # regular case
        pattern = f"(({ max_n_zero_block_left(n-1) }){ n_length_zero_block(n) }({ max_n_zero_block_right(n) }))"
        patterns.append(pattern)
        replacement_positions.append(next_pattern_group_start + 2)
        replacement_positions.append(next_pattern_group_start + 6)
        next_pattern_group_start += 9

patterns.append("(0(:)0(:)0:0:0:0:0:0)")
replacement_positions.append(next_pattern_group_start + 2)
replacement_positions.append(next_pattern_group_start + 3)

all_patterns = "|".join(patterns)
all_patterns = "^(" + all_patterns + ")$"
print("Expression:")
print(all_patterns)

replacement = "".join(f"${{{i}}}" for i in replacement_positions)
print("Replacement:")
print(replacement)

And the output of the script:

Expression:
^((0(:)0:0:0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,7})*)))|(((((0:){0,6}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0:0:0(:)0)|(0(:)0:0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,6})*)))|(((((0:){0,5}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0:0(:)0)|(((((0:){0,5}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,6})*)))|(0(:)0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,5})*)))|(((((0:){0,4}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0(:)0)|(((((0:){0,4}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,5})*)))|(0(:)0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,4})*)))|(((((0:){0,3}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0(:)0)|(((((0:){0,3}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,4})*)))|(0(:)0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,3})*)))|(((((0:){0,2}[1-9a-f][0-9a-f]{0,3}:)*))0:0(:)0)|(((((0:){0,2}[1-9a-f][0-9a-f]{0,3}:)*))0:0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,3})*)))|(0(:)0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,2})*)))|(((((0:){0,1}[1-9a-f][0-9a-f]{0,3}:)*))0(:)0)|(((((0:){0,1}[1-9a-f][0-9a-f]{0,3}:)*))0:0(((:[1-9a-f][0-9a-f]{0,3}(:0){0,2})*)))|(0(:)0(:)0:0:0:0:0:0))$
Replacement:
${3}${4}${9}${13}${15}${16}${21}${25}${27}${31}${36}${37}${42}${46}${48}${52}${57}${58}${63}${67}${69}${73}${78}${79}${84}${88}${90}${94}${99}${100}${105}${109}${111}${115}${120}${121}

Please keep in mind that I am just an idiot on the internet, don't use this expression to burn your production environment down.
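
If Python is available anyway, the standard library can serve as a reference to test the expression against. This is only a cross-check sketch, not part of the regex approach. The function name reference_compression is made up here.

```python
import ipaddress


def reference_compression(full_address: str) -> str:
    """Compressed form as produced by Python's ipaddress module."""
    return ipaddress.IPv6Address(full_address).compressed


print(reference_compression("2001:0db8:0000:0000:0000:0023:4200:0123"))
# prints 2001:db8::23:4200:123
```

Feeding a list of test addresses through both the expression and this function should quickly show where they disagree.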

Interesting Observations of IOS(-XE) ACL CLI and Command Syntax

2020-11-26

My initial contact with the things shown in this article came from trying to parse and generate ACLs. Many of the things here may not bother you if you are a pure CLI user and are not generating configs. This article does not claim to be the ultimate source for all the weird details of ACL syntax behaviour. It is based on my experience, which primarily comes from IOS 15.7, IOS-XE 03.16, IOS-XE 16.09 and 16.12. IOS-XE 16.12 changed a lot with regard to sequence numbers, so several sections are divided into a pre IOS-XE 16.12 part and a post IOS-XE 16.12 part. Regular IOS can be considered part of the pre IOS-XE 16.12 sections, because I have not found differences between those.

General things

An Access Control List is a series of entries. Each entry matches some parts of the packet headers. Each entry is either a remark or a permit or deny action. (I will ignore reflexive ACLs and the like here.)

Sequence Numbers

Each entry in an ACL has a sequence number. By default the first entry starts at 10 and every further entry has a number that is 10 higher.

When modifying entries you can specify the sequence number of the position at which you want to add something. So if you want to insert something between entry 10 and 20 you could choose an unused number between 10 and 20, e.g. 15.

Resequencing

But what happens if you want to insert between two entries where there is no free sequence number? For Legacy IP ACLs you can do an ip access-list resequence <name> <start> <increment>, which renumbers the entries so that they are evenly spaced again (10 numbers apart with an increment of 10).

But that is not implemented for IPv6... A workaround is to recreate the ACL. But that is annoying.
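
If you generate configs anyway, the recreation can be scripted. A minimal Python sketch, assuming the entries are already available as strings without sequence numbers. The function name recreate_ipv6_acl and its start/step parameters are invented for this example. Keep in mind that the ACL does not exist for a short moment between the no command and the new entries.

```python
def recreate_ipv6_acl(name: str, entries: list[str],
                      start: int = 10, step: int = 10) -> list[str]:
    """Build the commands that delete an IPv6 ACL and re-add it renumbered."""
    commands = [f"no ipv6 access-list {name}", f"ipv6 access-list {name}"]
    for index, entry in enumerate(entries):
        # Every entry gets an explicit, evenly spaced sequence number.
        commands.append(f" sequence {start + index * step} {entry}")
    return commands


for line in recreate_ipv6_acl("test", ["permit ipv6 host 2001:db8::1 any",
                                       "permit ipv6 host 2001:db8::2 any"]):
    print(line)
```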

IPv4

Pre IOS-XE 16.12

On IOS and IOS-XE the IPv4 sequence numbers are not part of the config. This means that you have to do show ip access-list ... to see the sequence numbers. They are generated at runtime. This also means that the sequence numbers change after a reboot.

Post IOS-XE 16.12

Sequence numbers are now a part of the configuration. The implications will be discussed later.

IPv6

Pre IOS-XE 16.12

IPv6 sequence numbers can be a part of the config. They are shown on a per entry basis if the sequence number is not exactly 10 bigger than the previous one.

If you enter the commands:

ipv6 access-list test
sequence 10 permit ipv6 host 2001:db8::1 any
sequence 15 permit ipv6 host 2001:db8::2 any
sequence 20 permit ipv6 host 2001:db8::3 any
sequence 25 permit ipv6 host 2001:db8::4 any
sequence 35 permit ipv6 host 2001:db8::5 any
sequence 40 permit ipv6 host 2001:db8::6 any

Then the ACL in the config is this:

ipv6 access-list test
 permit ipv6 host 2001:DB8::1 any
 sequence 15 permit ipv6 host 2001:DB8::2 any
 sequence 20 permit ipv6 host 2001:DB8::3 any
 sequence 25 permit ipv6 host 2001:DB8::4 any
 permit ipv6 host 2001:DB8::5 any
 sequence 40 permit ipv6 host 2001:DB8::6 any

The 2001:db8::2, 2001:db8::3 and 2001:db8::4 have numbers because their distance to the sequence number of the previous entry is 5. 2001:db8::5 has no number because the distance is 10, and 2001:db8::6 has a distance of 5 and therefore has a sequence number.
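
The rule can be written down as a few lines of Python, which is handy when you want to predict what the config will look like. This is a sketch of the behaviour observed above, not of how IOS implements it. The function name render_pre_1612 is made up, and the first entry is assumed to be compared against 0, so that the default start of 10 is not shown.

```python
def render_pre_1612(name: str, entries: list[tuple[int, str]]) -> list[str]:
    """Render an IPv6 ACL the way the pre IOS-XE 16.12 config shows it."""
    lines = [f"ipv6 access-list {name}"]
    previous = 0
    for sequence, entry in entries:
        if sequence == previous + 10:
            lines.append(f" {entry}")  # default distance, number is hidden
        else:
            lines.append(f" sequence {sequence} {entry}")
        previous = sequence
    return lines


acl = [(10, "permit ipv6 host 2001:DB8::1 any"),
       (15, "permit ipv6 host 2001:DB8::2 any"),
       (35, "permit ipv6 host 2001:DB8::5 any")]
print("\n".join(render_pre_1612("test", acl)))
```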

The show ipv6 access-list test command shows you all sequence numbers, but at the end of the entry, not at the beginning as in the config.

IPv6 access list test
    permit ipv6 host 2001:DB8::1 any sequence 10
    permit ipv6 host 2001:DB8::2 any sequence 15
    permit ipv6 host 2001:DB8::3 any sequence 20
    permit ipv6 host 2001:DB8::4 any sequence 25
    permit ipv6 host 2001:DB8::5 any sequence 35
    permit ipv6 host 2001:DB8::6 any sequence 40
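
For a parser this means handling both positions. A small Python sketch that normalizes a line from either source into a (sequence number, entry) pair. The function name parse_entry is invented, and the show output is assumed to look exactly like the sample above, without any extra text after the sequence number.

```python
import re

CONFIG_STYLE = re.compile(r"^sequence (\d+) (.+)$")  # number in front
SHOW_STYLE = re.compile(r"^(.+) sequence (\d+)$")    # number at the end


def parse_entry(line: str):
    """Return (sequence or None, entry text) for a config or show line."""
    line = line.strip()
    match = CONFIG_STYLE.match(line)
    if match:
        return int(match.group(1)), match.group(2)
    match = SHOW_STYLE.match(line)
    if match:
        return int(match.group(2)), match.group(1)
    return None, line  # config line with a hidden sequence number


print(parse_entry(" sequence 15 permit ipv6 host 2001:DB8::2 any"))
print(parse_entry("    permit ipv6 host 2001:DB8::2 any sequence 15"))
```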

Post IOS-XE 16.12

In IOS-XE 16.12 the sequence numbers are always shown in the config. So there is at least one point where IOS got more consistent.

ipv6 access-list test
 sequence 10 permit ipv6 host 2001:DB8::1 any
 sequence 15 permit ipv6 host 2001:DB8::2 any
 sequence 20 permit ipv6 host 2001:DB8::3 any
 sequence 25 permit ipv6 host 2001:DB8::4 any
 sequence 35 permit ipv6 host 2001:DB8::5 any
 sequence 40 permit ipv6 host 2001:DB8::6 any

But sadly the output of the show ipv6 access-list command still puts the sequence number at the end of the entries.

IPv4 Remarks

IPv4 ACL Remarks do not have their own sequence numbers.

Pre IOS-XE 16.12

This means inserting remarks in IPv4 ACLs at a specific position does not work. Only permit and deny are allowed.

asr920(config-ext-nacl)#42 ?
  deny    Specify packets to reject
  permit  Specify packets to forward

If you want to add a remark somewhere in the middle you have to delete the whole ACL, and insert the remark at the correct position while recreating the ACL.

But if you want to insert a new entry together with a remark, you can work your way around this issue. First you enter the remark without a sequence number, then you enter the new entry with a sequence number. The remark will end up at the same position as the new entry.

ip access-list extended abcd
10 permit ip host 10.0.0.0 any
20 permit ip host 10.0.0.1 any
remark test
15 permit ip host 10.0.0.2 any

results in

ip access-list extended abcd
 permit ip host 10.0.0.0 any
 remark test
 permit ip host 10.0.0.2 any
 permit ip host 10.0.0.1 any

Post IOS-XE 16.12

In IOS-XE 16.12 every entry got a sequence number. But as mentioned earlier, IPv4 remarks did not have sequence numbers. So Cisco "solved" that. An ACL now looks like this in the config:

ip access-list extended remark-seq-numbers
 10 remark foo
 10 permit ip host 10.0.0.0 any
 20 remark bar
 20 permit ip host 10.0.0.42 any

Also inserting remarks at a specific sequence number works now (to some degree). If you enter 15 remark test for the ACL above you get this:

ip access-list extended remark-seq-numbers
 10 remark foo
 10 permit ip host 10.0.0.0 any
 20 remark bar
 20 remark asdf
 20 permit ip host 10.0.0.42 any
 15 remark test

Our remark is now at the end, where it does not belong. This change in IOS-XE 16.12 does result in sequence numbers appearing multiple times and being out of order.

Reusing the same sequence number

When you want to replace an entry in an IPv6 ACL with another one you can just use the same sequence number and the entry will be replaced.

If you try that with an IPv4 permit/deny entry you get this response:

% Duplicate sequence number
%Failed to add ace to access-list

(Yes, the space after the % in the first line and the lack thereof in the second are not a copy+paste error.)

However if you do this in IOS-XE 16.12 with a remark line it works just fine and it replaces the remark.

Singular/Plural in the ACL show commands

While the config commands for ACLs are ip[v6] access-list, the show commands are show ip access-lists with an s at the end and show ipv6 access-list without an s at the end.

c1111-lab#show ip access-lists
Standard IP access list 2
 ...
c1111-lab#show ipv6 access-list
IPv6 access list test
 ...
c1111-lab#show ipv6 access-lists
                               ^
% Invalid input detected at '^' marker.

ACL naming: numbers vs names

There is not only the difference between standard and extended ACLs, there is also a difference between ACLs with a name and ACLs with a number. Which numbers can be given to an ACL depends on their type.

asr920(config)#access-list ?
  <1-99>            IP standard access list
  <100-199>         IP extended access list
  <1300-1999>       IP standard access list (expanded range)
  <2000-2699>       IP extended access list (expanded range)
  <2700-2799>       MPLS access list
<snip>

To make things more complicated the numbers for standard and extended ACLs each have 2 separate ranges.

A little tip at this point: do not use numbered ACLs. Use ACLs with names, because you can give them a name that hopefully tells you and others what this ACL is for. Or can you remember what ACL 42 was for, and where it is used? (Good luck with sh run | incl 42 for large configs.)

Standard vs Extended Numbered ACL Config format

Pre IOS-XE 16.12

If you use an extended ACL with a number (please don't) you probably enter

ip access-list extended 101
permit ip host 10.0.0.0 any

and that is exactly what will end up in your config. However if you want to use a standard ACL with a number (pls dont) you enter

ip access-list standard 1
permit host 10.0.0.0

And you will end up with this in your config:

access-list 1 permit 10.0.0.0

Post IOS-XE 16.12

This format is no longer used in IOS-XE 16.12. It has been changed to the same format that named ACLs use and looks like this:

ip access-list standard 1
 10 permit 10.0.0.0

This is nice, because it is less inconsistent. But unless you have absolutely no legacy gear you have to support the old format anyway.

Remarks

Remarks are very handy when trying to understand lengthy ACLs (lengthy in my case sometimes means hundreds of lines, but others might laugh at that). Sadly IOS does not show remarks with show ip access-list, for those you have to take a look into your config.

Interface-ACLs that do not exist

If you delete an ACL that is configured on an interface, the config line is not deleted from the interface. But the interface now accepts all traffic.

asr920(config)#ip access-list extended delete-test
asr920(config-ext-nacl)#deny ip any any
asr920(config-ext-nacl)#do sh run int Gi0/0/0
interface GigabitEthernet0/0/0
 ip address dhcp
 ip access-group delete-test in
 negotiation auto
end

asr920(config-ext-nacl)#do sh ip access-list delete-test
Extended IP access list delete-test
    10 deny ip any any
asr920(config-ext-nacl)#do ping  10.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
asr920(config-ext-nacl)#exit
asr920(config)#no ip access-list extended delete-test
asr920(config)#do sh ip access-list delete-test
asr920(config)#do sh run int Gi0/0/0
interface GigabitEthernet0/0/0
 ip address dhcp
 ip access-group delete-test in
 negotiation auto
end

asr920(config)#do ping  10.0.0.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/4 ms
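
Since such a dangling reference silently turns the filter off, it is worth checking configs for it. A rough Python sketch that compares ip access-group references with the IPv4 ACLs defined in the same config text. The function name dangling_access_groups is made up, and only the config formats shown in this article are covered.

```python
import re


def dangling_access_groups(config: str) -> set[str]:
    """Names used in 'ip access-group' that have no ACL definition."""
    defined = set(re.findall(
        r"^ip access-list (?:standard|extended) (\S+)", config, re.MULTILINE))
    # Old-style numbered ACLs: "access-list 1 permit ..."
    defined |= set(re.findall(r"^access-list (\d+) ", config, re.MULTILINE))
    referenced = set(re.findall(
        r"^ +ip access-group (\S+) (?:in|out)", config, re.MULTILINE))
    return referenced - defined


sample = """interface GigabitEthernet0/0/0
 ip address dhcp
 ip access-group delete-test in
 negotiation auto
"""
print(dangling_access_groups(sample))  # prints {'delete-test'}
```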

IPv4 ACLs that only consist of remarks

But what if the ACL exists but there is no action statement in it at all? We can build such an ACL that is only a remark. The ACL shows up in the config, but the implicit deny does not block any traffic. As soon as we add an action, the implicit deny suddenly works.

asr920(config)#ip access-list extended noentry
asr920(config-ext-nacl)#remark test1
asr920(config)#do sh run | section ip access-list extended noentry
ip access-list extended noentry
 remark test1
asr920(config-std-nacl)#in Gi0/0/0
asr920(config-if)#ip access-group noentry in
asr920(config-if)#do ping 10.0.0.42
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.42, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
asr920(config-if)#exit
asr920(config)#ip access-list extended noentry
asr920(config-ext-nacl)#permit ip host 192.168.0.0 any
asr920(config-ext-nacl)#do ping 10.0.0.42
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.42, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Default settings for config sections that can have an ACL

Things like SSH, NTP, SNMP, NETCONF, ... may (and probably should) be secured by ACLs. But by default they are not protected and answer to everything. This means you want to have an ACL for all of them, for IPv4 and for IPv6. If you add the first IPv6 address to a device and you don't have an IPv6 ACL configured for those services, then your device might be reachable via IPv6. So even if you haven't started with IPv6 for management and monitoring you have to keep in mind that your router listens on every address, and if you have IPv6 enabled on any interface your router might be reachable via IPv6.

Port numbers

The router converts some port numbers to names. This is annoying if you want to parse and compare that. At least they match up with the services list published by IANA, although in some cases the service-name column does not work and you have to use one of the aliases (e.g. port 80 is www instead of http). It's just another piece that makes your parser a bit more complex.
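
In a parser this usually ends up as a lookup table. A minimal Python sketch with a deliberately partial table. The names IOS_PORT_NAMES and normalize_port are invented for this example, and the table has to be extended with whatever names show up in your configs.

```python
# Partial table of port names as IOS prints them; extend as needed.
IOS_PORT_NAMES = {
    "bgp": 179,
    "domain": 53,
    "ftp": 21,
    "smtp": 25,
    "telnet": 23,
    "www": 80,
}


def normalize_port(token: str) -> int:
    """Turn a port token from an ACL entry into a number."""
    if token.isdigit():
        return int(token)
    # Unknown names raise KeyError on purpose, so gaps in the table get noticed.
    return IOS_PORT_NAMES[token]


print(normalize_port("www"))  # prints 80
print(normalize_port("443"))  # prints 443
```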

Order of entries in standard ACLs

Standard ACLs are funny. They automagically put single addresses in front of larger prefixes. It looks like those changes are done in a way that does not influence what is permitted and denied by the ACL, but it makes some things hard to read, because now everything is out of order.

Entering

ip access-list standard order1
permit 10.0.0.0 0.0.0.255
permit host 10.0.42.0

results in this config:

ip access-list standard order1
 permit 10.0.42.0
 permit 10.0.0.0 0.0.0.255

But because that is not confusing enough, you can always add remarks to make things easier to understand...

These commands

ip access-list standard order2
permit 10.0.0.0 0.0.0.255
remark test1

Result in the expected config:

ip access-list standard order2
 permit 10.0.0.0 0.0.0.255
 remark test1

Add another address, e.g. this:

permit 10.0.42.0

And your config now looks like this:

ip access-list standard order2
 remark test1
 permit 10.0.42.0
 permit 10.0.0.0 0.0.0.255

This is because a remark is always attached to the entry that follows it (a remark with no entry after it is probably a corner case). When the single address is added, the remark gets attached to that new entry. But single addresses are moved to the top, and the remark moves with them.

Putting one remark in front of several entries can screw you up beautifully. In this example we have 2 groups, called group1 and group2. Each group has a single address and a prefix in it. One could now build an ACL that looks like this.

ip access-list standard out-of-order
remark group1
permit 10.0.1.0 0.0.0.255
permit 10.0.0.0
remark group2
permit 10.0.2.0
permit 10.0.3.0 0.0.0.255

What ends up in the config is this:

ip access-list standard out-of-order
 remark group2
 permit 10.0.2.0
 permit 10.0.0.0
 remark group1
 permit 10.0.1.0 0.0.0.255
 permit 10.0.3.0 0.0.0.255

It looks like the groups have switched positions and 10.0.0.0 and 10.0.3.0 0.0.0.255 have changed groups. Good luck reverse engineering ancient firewall rules...

The sequence numbers in IOS-XE 16.12 can explain what happens here.

ip access-list standard out-of-order
 30 remark group2
 30 permit 10.0.2.0
 20 permit 10.0.0.0
 10 remark group1
 10 permit 10.0.1.0 0.0.0.255
 40 permit 10.0.3.0 0.0.0.255
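
With the sequence numbers in the config, the order in which the entries were entered can be recovered with a stable sort. A small Python sketch, assuming that a remark carries the same sequence number as the entry it is attached to and is printed directly before it, as in the output above. The function name entered_order is made up for this example.

```python
def entered_order(config_lines: list[str]) -> list[str]:
    """Sort ACL lines by their leading sequence number.

    sorted() is stable, so a remark stays in front of the entry that
    shares its sequence number.
    """
    return sorted(config_lines, key=lambda line: int(line.split()[0]))


shown = ["30 remark group2",
         "30 permit 10.0.2.0",
         "20 permit 10.0.0.0",
         "10 remark group1",
         "10 permit 10.0.1.0 0.0.0.255",
         "40 permit 10.0.3.0 0.0.0.255"]
print("\n".join(entered_order(shown)))
```

For this ACL the result is the order that was typed in, with each remark back in front of its own group.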

Connecting to a Device via SSH from Netbox

2020-11-08

Connecting via SSH to a device in netbox would be nice. So I built a thing.

Keep in mind that you have to adjust almost every config or script to your environment.

First of all you need a custom link for devices in netbox that somehow encodes the hostname of your device in a URL.

So your URL could look like this:

ssh://{{ obj.name }}.example.com

And when you give it a name like {% if obj.device_role.name == "Router" %}SSH Login{% endif %}, the link is only shown for devices with the Router role.

When you click on the link your browser should ask you how to open it.

Now you can build a little script that parses that URL, opens your favorite terminal and starts ssh in there. My script is called sshterminal. Don't forget to set execute permissions on the script.

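#!/bin/bash

# $1 is the clicked URL (ssh://<name>.example.com); field 3 of the "/" split is the host
HOST=$(echo "$1" | cut -d"/" -f 3)

# quote the variable so that an odd URL is not split into several arguments
alacritty -e ssh "$HOST"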

alacritty is currently the terminal of my choice, but any terminal that has an option to directly execute a program should work. (That's what the -e option does.)
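
If you prefer Python for the wrapper, the URL parsing can be done with the standard library, which also allows a stricter check of what the browser hands over. This is only a sketch of the parsing step. The function name host_from_ssh_url is made up, and starting the terminal is left out.

```python
from urllib.parse import urlsplit


def host_from_ssh_url(url: str) -> str:
    """Extract the hostname from an ssh:// URL, rejecting anything else."""
    parts = urlsplit(url)
    if parts.scheme != "ssh" or not parts.hostname:
        raise ValueError(f"not a usable ssh:// URL: {url!r}")
    return parts.hostname


print(host_from_ssh_url("ssh://router1.example.com"))
# prints router1.example.com
```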

If something else opens that URL you have to set up your browser to use the script. In Firefox that is somewhere in the "applications" menu in about:preferences#general.

Now you have a terminal with an ssh connection to the device.

You probably also want to have some options like jump hosts, ancient ciphers for shitty routers, a username and an ssh key. So put something like this in your ssh config:

host *.example.com
    user admin
    Ciphers=+aes256-cbc
    KexAlgorithms=+diffie-hellman-group14-sha1
    IdentityFile=~/.ssh/my_private_key
    ProxyJump my_jump_host

Weechat setup for IRC in tmux via SSH

2020-09-12

Joining some IRC channels was something that I wanted to do for a long time. But just using a regular client on my laptop does not work, because IRC does not store messages for later delivery. Since my laptop is often not connected to the IRC server for various reasons I had to find a workaround for that. So I gave the Weechat + tmux + ssh setup a try.

I started with a Debian VM that is online 24/7. Then I installed Weechat and tmux:

apt install weechat-curses weechat-plugins tmux

Now I needed a user to log in as via SSH and to run Weechat. After that lingering has to be enabled so that the user is allowed to run systemd user services while not being logged in.

loginctl enable-linger $USERNAME

The systemd service is basically copied from the Arch Linux Wiki and looks like this:

[Unit]
Description=A WeeChat client and relay service using Tmux
After=network.target

[Service]
Type=forking
RemainAfterExit=yes
ExecStart=/usr/bin/tmux -L weechat new -d -s weechat weechat
ExecStop=/usr/bin/tmux -L weechat kill-session -t weechat

[Install]
WantedBy=default.target

This unit file is stored as ~/.config/systemd/user/weechat.service

The important part is the tmux -L weechat in the commands in the unit file. For reasons explained further in the Arch Linux wiki article, the way systemd starts services would kill the tmux session in which weechat is started after startup. With -L weechat a socket called weechat is used instead of the default one, and that socket is not impacted by the systemd behavior.

To enable and start the service use

systemctl --user enable weechat.service
systemctl --user start weechat.service

There should now be a tmux session with Weechat inside.

Connecting to it is possible via tmux -L weechat attach. To make sure that systemd works as intended reboot the VM and check again.

After that some work has to be done to make it easier to access the tmux session. I started with a script like this on the server:

#!/bin/bash

tmux -L weechat attach

stored in /usr/local/bin/weechat-connect so I don't have to type in the long tmux command every time.

Then I wanted to start this as if it was a regular program on my laptop. So I made a script

alacritty -e ssh -t <USERNAME>@<SERVER> weechat-connect

and put it somewhere in my path. alacritty is my terminal emulator, if you use something else just use it. The -e is just the "run this command" option. I hope that every terminal emulator has a similar option, just pick the right one.

Running weechat on my laptop attaches me to the WeeChat running in the tmux session via SSH. Since the SSH connection might die every now and then I might switch to mosh instead of ssh in the future.

Connecting to a Raspberry Pi at first boot via IPv6 link local addresses

2020-03-07

I installed a Raspberry Pi today. My problem is that I am lazy and I don't want to take out an HDMI cable and a keyboard to configure it. You can configure it via SSH. To do this you have to create a file called ssh in the boot partition of the image. But I didn't have a network with DHCP in it. This is not a problem because the Pi speaks IPv6 and when I plug it directly into my laptop it does IPv6 router solicitation. This means that its address will end up in my laptop's neighbor table. So I can just use ssh pi@fe80::2083:1fb1:2912:d686%enp0s25 and log in as usual.

Wireguard AllowedIPs caveats

2020-02-15

I recently tried out wireguard. The AllowedIPs setting confused me a bit. The name, many blog posts and some parts of the documentation mention that this setting is some kind of source IP address filter.

But when you have 2 peers in a config that have all addresses allowed you will get errors. An example config for that could look like this (pls forgive me for only using legacy IP in this example):

[Interface]
PrivateKey = < priv key here >
ListenPort = 51820

[Peer]
Endpoint = 172.20.1.104:51820
PublicKey = zfiU3b7CSyTiGZ9YIAOyvKgDHsFsL78Vij6kB9615ys=
AllowedIPs = 0.0.0.0/0

[Peer]
Endpoint = 172.20.1.126:51820
PublicKey = X++p6RAoYuGE0GXiDI+bJbsS0kI9odzwIUpef5nVKRo=
AllowedIPs = 0.0.0.0/0

If the interface is running and you run wg show, the result has no AllowedIPs for the first peer.

interface: wg-p2p
  public key: EfGMYv8Vd3DPLzyAKeMtDe6FzU+EVangtMRnkC+urik=
  private key: (hidden)
  listening port: 51820

peer: zfiU3b7CSyTiGZ9YIAOyvKgDHsFsL78Vij6kB9615ys=
  endpoint: 172.20.1.104:51820
  allowed ips: (none)
  latest handshake: 1 minute, 59 seconds ago
  transfer: 180 B received, 284 B sent

peer: X++p6RAoYuGE0GXiDI+bJbsS0kI9odzwIUpef5nVKRo=
  endpoint: 172.20.1.126:51820
  allowed ips: 0.0.0.0/0
  transfer: 0 B received, 3.47 KiB sent

This is because the AllowedIPs setting is not only used as a source filter. It is also how wireguard decides to which peer a packet is sent. A packet is sent to the peer which has the destination address in its AllowedIPs.

The same prefix cannot belong to two peers on one interface, because wireguard would not know which of the peers it has to choose. Each prefix can only be used once per interface: if it is configured a second time it moves to the later peer, which is why the first peer above ends up with (none). Wireguard does not do routing itself! As far as I can tell, a more specific prefix on another peer is resolved by longest-prefix match, but that only selects the peer inside the interface and does not replace an entry in the routing table.
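The wg show output above can be reproduced with a small toy model. This is only a sketch in Python of the behaviour I observed, not wireguard's real implementation: the class, the method names and the peer names are made up for illustration, and the model only covers the case that one prefix is configured on two peers (prefixes of different length that overlap are out of scope here).

```python
import ipaddress


class ToyInterface:
    """Toy model of one wireguard interface (hypothetical, not the real code)."""

    def __init__(self):
        # Maps each allowed prefix to exactly one peer name.
        self.owner = {}

    def set_allowed_ip(self, peer, prefix):
        # Assigning a prefix that another peer already has moves it
        # to the new peer, so the old peer loses it.
        self.owner[ipaddress.ip_network(prefix)] = peer

    def allowed_ips(self, peer):
        return [str(net) for net, p in self.owner.items() if p == peer]

    def peer_for(self, destination):
        # Sending: pick the peer whose prefix contains the destination.
        addr = ipaddress.ip_address(destination)
        for net, peer in self.owner.items():
            if addr in net:
                return peer
        return None  # no peer is responsible for this destination


wg = ToyInterface()
wg.set_allowed_ip("peer-104", "0.0.0.0/0")
wg.set_allowed_ip("peer-126", "0.0.0.0/0")

print(wg.allowed_ips("peer-104"))  # the first peer lost its prefix
print(wg.allowed_ips("peer-126"))
print(wg.peer_for("192.0.2.10"))
```

With one interface per link, each interface has its own table like this, so 0.0.0.0/0 can be used on every link without conflict.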

The routing is still done by the Linux kernel. The regular routing table needs an entry to send some traffic via the wireguard interface. Only if the routing table decides to send traffic via the wireguard interface does the peer config come into play.

This also means that if you want to build point-to-point links with wireguard, e.g. for an encrypted, routed backbone, you have to create one interface per link. (And not one interface with many peers.)

A Prometheus Exporter for CSGO (and other SRCDS Games)

2019-01-29

I am running the network and/or gameservers for many LAN parties. Once one of the team members told me that some CPUs are too slow to run a CSGO server at 128 ticks per second. Verifying that this happens was not that easy because back then he only had a couple of commands he could execute on the command line of the client. This means that he could only verify that the server is running fast enough while actually being on the server. This is not possible during matches.

At some point I stumbled across srcds_perfmon. I actually never used the software. But it showed me the stats command for SRCDS. This command prints out several statistics, e.g. the tickrate of the server, network traffic, number of maps played and several others. Sadly the output is not consistent between different kinds of servers. CSGO for example outputs

CPU   NetIn   NetOut    Uptime  Maps   FPS   Players  Svms    +-ms   ~tick
10.0      0.0      0.0    8967     0   63.80       0    5.22    0.25    0.05

while L4D2 does not have all of those fields. Another issue is that those fields are not documented. Here is an annotated list which I made from observation, tests and guesses:

  • CPU - unknown
  • NetIn - inbound network traffic in kbit/s
  • NetOut - outbound network traffic in kbit/s
  • Uptime - uptime of the server in minutes
  • Maps - number of maps played
  • FPS - the tickrate of the server
  • Players - the number of real players, bots are not counted
  • Svms - unknown
  • +-ms - unknown
  • ~tick - unknown

If you find mistakes in this list feel free to contact me.
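Because the columns differ between games, it helps to parse the output by its header line instead of by fixed positions. A minimal sketch in Python, assuming the output always consists of one header line and one line of whitespace-separated numbers as in the CSGO sample above; parse_stats is a name I made up here and this is not the code of the exporter.

```python
def parse_stats(output):
    """Parse the two-line output of the SRCDS stats command into a dict."""
    lines = [line for line in output.splitlines() if line.strip()]
    header, values = lines[0].split(), lines[1].split()
    # Pair each column name with its value. The names differ between
    # games, so nothing is hard-coded here.
    return {name: float(value) for name, value in zip(header, values)}


sample = """CPU   NetIn   NetOut    Uptime  Maps   FPS   Players  Svms    +-ms   ~tick
10.0      0.0      0.0    8967     0   63.80       0    5.22    0.25    0.05"""

stats = parse_stats(sample)
print(stats["FPS"], stats["Uptime"])
```

A game that leaves out some columns simply produces a smaller dict.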

Now to the monitoring part. Since Prometheus is used at many of the events I am helping with, I wrote an exporter for Prometheus. Connecting to a gameserver is often solved via RCON, which allows you to execute commands on the gameserver and retrieve the output. I am using the Python library aiorcon for this. I am not only executing the stats command but also the status command. status returns some further information not included in stats, e.g. the server's name, the number of bots and the maximum number of players allowed on the server. The name of the server is used as a label and might be used in dashboards.

To handle the webserver part which faces Prometheus I used aiohttp, an asynchronous webserver for Python. The whole exporter follows the recommendations for Prometheus exporters. It does not have to run once per gameserver on the same machine. It can run on any server, because from the query perspective it makes no difference where it is running. Also, at some events the gameserver people don't know how to handle such an exporter, so just giving the monitoring guy the RCON password and address is sufficient. You can also easily monitor servers to which you have no direct access, e.g. when the server is run by some hoster which does not allow you to execute random code on his servers. One exporter can therefore query many servers.

Putting the password into the exporter requires a little trick with the relabeling configs of Prometheus. Instead of defining targets as <addr>:<port> they are defined as <addr>:<port>:<rcon_pw>. The password is extracted via a regex from the given target specification. Then the target specification is rewritten to remove the :<rcon_pw> part.
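The split itself is easy to illustrate outside of Prometheus. The following is a minimal sketch in Python, not the relabeling config from the repository: the regex and the function name are my own, and it assumes that address and port contain no colon (so no bare IPv6 literals) and that everything after the second colon is the password.

```python
import re

# Assumption: <addr> has no colon and <port> is numeric. The RCON password
# is everything after the second colon and may itself contain colons.
TARGET_RE = re.compile(r"^(?P<addr>[^:]+):(?P<port>\d+):(?P<rcon_pw>.+)$")


def split_target(target):
    """Split '<addr>:<port>:<rcon_pw>' into ('<addr>:<port>', '<rcon_pw>')."""
    match = TARGET_RE.match(target)
    if match is None:
        raise ValueError(f"not in <addr>:<port>:<rcon_pw> form: {target!r}")
    address = f"{match['addr']}:{match['port']}"
    return address, match["rcon_pw"]


print(split_target("192.0.2.5:27015:s3cret"))
```

The first element is what remains as the target, the second one is what gets handed to the exporter as the password.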

The full relabeling config and the source code can be found in the github repository. I later realized that someone built a similar exporter before, which sadly has the same name, only uses the status command and therefore does not support the performance metrics.

Updating a HP SE326M1 Server

2018-10-08

I have an old HP SE326M1 server. According to rumors on the internet those servers are an OEM version of the DL180G6 with 25 2.5" drives and were used in Microsoft datacenters.

Anyway, I wanted to use this server again. One issue I had with this server was that the combination of the SAS backplane and the RAID controller has a problem. When using the cli tool from within Linux to configure the RAID controller all disks are shown in slot 0. This means that you can not create or modify arrays because the controller (at least from within the Linux cli tool) can not distinguish the different drives. A long time ago I heard that a firmware upgrade was required.

So I set out to find the firmware upgrade. At some point I stumbled over an SPP from HP. This is the "Service Pack for ProLiant" which contains the firmware upgrades for HP servers. Normally these SPPs require an active support contract. But I found one on the HP website which did not. If you can't find that SPP you may ask someone else who has access to one, e.g. someone who is working for a company with HP servers. I downloaded this file after creating an account to download this firmware pack. (WTF HP!? Why do I need an account to download a firmware package?) After downloading a 5.6GB iso file (WTF HP!?) with only 3MB/s (WTF HP!?), which took about 30 minutes, I tried what every mentally sane Linux guy would do: use dd to copy the image to a USB stick. This did not boot at all. So I went the long way: booted up Windows, copied the 5.6GB file to a USB stick, mounted the iso on Windows and used the hpusbkey.exe tool to build a bootable USB stick. The tool copied all the files and then just crashed. I tried booting from the stick anyway. But after booting I just got the message that the boot failed and that pressing any key would reboot the system. Okay, this did not work.

The next place to try was iLo2. This is HP's remote management tool. After several failed attempts to configure an IP address, either static or via DHCP, I was close to throwing the server out of the window. Then I reset the iLo config completely and suddenly it accepted IPs via DHCP.

Now there was the next problem: when connecting to the iLo you get an SSL_BAD_HMAC_ERROR, which means that something is broken with SSL/TLS. A workaround is to change some settings of Firefox in about:config. The required key is called security.tls.version.fallback-limit. If you set this to 1, Firefox talks to iLo2. But be careful: this enables fallbacks from newer TLS versions to older ones, which makes some MITM attacks possible that involve falling back to older SSL/TLS versions.

Upgrading iLo2 to the most recent firmware version worked very well. The firmware can be found for example at http://pingtool.org/latest-hp-ilo-firmwares/. This page also describes how to get the bin file extracted from the .scexe file HP gives you. Somewhere in the Administration tab you can find a firmware upgrade section. Just upload the firmware there and wait a minute for the iLo to reboot.

Now starts the funny part. iLo2 uses Java applets for remote media mounting and for the remote console. All major browsers removed the support for Java applets. Even the current Firefox extended support release got rid of it by now. Someone on reddit mentioned a browser called PaleMoon which still supports it, but I haven't tried that yet. The solution that I ended up with was using Windows 7 and Internet Explorer 11. This works because iLo2 also has an implementation specifically for IE. This way I was able to mount the SPP iso image on the server and managed to boot from that. But this takes ages. First it needs ~1 minute to boot. Then you have to select if you want to run it in automatic or interactive mode. I chose the automatic mode by accident, which worked out. Then it copies things to memory. This took 10-15 minutes for me. Then I reached some kind of GUI with a blue background. At least I had a mouse now. After another 15 minutes it loaded something which looked like a browser window. At least it pretended to be waiting to transfer some data from 127.0.0.1. After another 10 minutes the page loaded and started analysing the system. Then it showed a page, scanned for firmware upgrades and applied some of them. After a sudden reboot the backplane and RAID controller firmware were up-to-date.

Installing the hpacucli or ssacli utilities

hpacucli, which was later renamed to hpssacli and is now known as ssacli, is a tool to access the HP RAID controllers from within the operating system. Sadly this is not in the official repositories, but HP has a repository for Debian which is maintained for the current (stretch) release.

Further details on how to install this can be found here. If you have issues with installing, take a look at the comment section of this page. I will keep it short and add the commands I used here.

echo -e "deb http://downloads.linux.hpe.com/SDR/repo/mcp/ stretch/current non-free" > /etc/apt/sources.list.d/proliant.sources.list
apt install dirmngr
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C208ADDE26C2B797 # this is adding the required key from a repo server.
apt update
apt install ssacli

Snake was fun, but Tetris is better!

2018-06-13

A couple of weeks ago I made a snake clone. Then I made Whac-a-Mole and now I stopped at Tetris.

You can find this and the other games here. Like the last games this one is also built with jQuery, JavaScript, HTML and CSS. This game was a lot more painful to write, because there are several boundary checks at several positions. Also I ran into some interesting bugs which made me more careful when using the || operator.

Often this is used like this

a = a || default_value;

to assign default values if no value is given. This works due to JS values being truthy or falsy. The || operator returns the first truthy operand (or the last operand if none of them is truthy). Basically, as long as a is not null, undefined or false, one gets the left value. But there are more falsy values. There are also 0, -0, NaN and the empty string, written "" or '' (to be accurate, the spec says every string of length 0 is falsy). While digging deeper I realized that there are some further caveats; this stackoverflow answer sums them up. In conclusion: you have to be careful when using this operator, because you could get a 0 or an empty string as input which might screw you, since this might have been a value which you wouldn't want to replace.

My workaround looks like this:

if(rotation === undefined){
    rotation = current_tetromino.rotation;
}

rotation is a parameter of the function that contains this code and has undefined as a default value. This works but there are probably some nicer versions. Maybe this one is better, but still looks very clumsy due to the long variable names.

rotation = (rotation === undefined) ? current_tetromino.rotation : rotation;

With shorter names it looks like this:

rot = (rot === undefined) ? cur.rot : rot;

This is a bit shorter, but the ternary operator still looks a bit weird to me. I never really liked that fellow, but perhaps I should use it every now and then.

Okay, back to the game. There is still something left to do: I want to change the way the field is drawn. Currently the complete field is redrawn every time something changes. It feels a bit slow sometimes, especially when you rotate a tetromino (that's the name of a block) twice. Most of the time at most 8 grid cells need to be adjusted. When the player fills one or more lines, those lines need to be deleted and everything from that point upwards needs to be drawn again.

I made a snake clone part 2

2018-05-23

In my last post I wrote about the snake clone I have written.

With some feedback I got from a couple of friends I added some features to make the game more interesting.

First of all, every 10 points the game spawns a stone on the field. The stone is another obstacle which the snake is not allowed to run into. If you do, you lose a life. The stones are only generated on positions where there is no food or snake and which are at least 4 blocks away from the head of the snake. (Pythagorean theorem fuck yeah!)
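The placement rule can be sketched in a few lines. The game itself is written in JavaScript; the following Python version is only an illustration of the rule, and the function name, the parameters and the grid representation as (x, y) tuples are assumptions of mine.

```python
import math
import random


def pick_stone_position(width, height, snake, food, min_distance=4, rng=random):
    """Pick a random free cell at least min_distance cells from the head.

    snake is a list of (x, y) cells with the head first, food is a set
    of (x, y) cells. Returns None if no cell qualifies.
    """
    head_x, head_y = snake[0]
    blocked = set(snake) | set(food)
    candidates = [
        (x, y)
        for x in range(width)
        for y in range(height)
        if (x, y) not in blocked
        # Euclidean distance to the head via the Pythagorean theorem.
        and math.hypot(x - head_x, y - head_y) >= min_distance
    ]
    return rng.choice(candidates) if candidates else None


print(pick_stone_position(10, 10, [(5, 5), (5, 6), (5, 7)], {(0, 0)}))
```

Collecting all valid cells first and then choosing one avoids retrying forever when the field is nearly full.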

The second adjustment I made is to increase the game's speed over time. I initially wanted to change the game speed whenever some food is eaten. But since coding, testing and thinking about what you are actually doing hasn't worked out, I changed the game speed on every step the game does. After a couple of games I realized what I had done, and I like the mechanic because it rewards a more aggressive playstyle which collects the food quickly, instead of always taking a nice route which gets your snake out of the way and keeps the field open. Especially with a formula like

timestep = timestep * 0.97

the game got pretty fast pretty fast. At ~30 points the game was too fast to be played by a human. Other values like 0.995 moved that point further away but were not really motivating because the game still got insanely fast. So I had to limit the speed of the game to some level that is still playable but not demotivating. I came up with

timestep = basestep * (0.5 + e**(-0.01*step)/2)

basestep is the starting speed of the snake. By default this is 333ms. step is the number of steps the game has done so far. This is basically how often your snake has moved yet. e**(-0.01*step)/2 decreases from 0.5 to 0 for large step counts. Together with the 0.5 this results in a timestep that decreases from basestep to 0.5*basestep. For example after 500 steps the timestep is at ~0.503*basestep, so the game is about twice as fast as at the beginning.
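To get a feeling for the curve, the formula can be evaluated for a few step counts. This is just the formula from above written as a Python function; the function name is mine, and the game itself is JavaScript.

```python
import math


def timestep(step, basestep=333.0):
    """Delay between two moves in ms after the given number of steps."""
    return basestep * (0.5 + math.exp(-0.01 * step) / 2)


for step in (0, 100, 500, 5000):
    print(step, round(timestep(step), 1))
```

The printed values start at basestep and level off at half of it, so the game never gets faster than twice the starting speed.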

Also the game can now be played here.

I made a snake clone

2018-05-18

Today I procrastinated and built a snake clone with HTML, CSS, JavaScript and jQuery.

You can play it here

All together the game has ~200 lines of code. Everything runs in the browser, no fancy server needed. I used the CSS flexbox for many things and wrote my own little "library" for the grid layout.

I am thinking about adding some stones or something in the way to make the game harder. Other options would be some powerups that increase how many points you gain or make the snake faster/slower/shorter. Another idea would be to make this a multiplayer game with the goal to enclose the other player.

Also I am thinking about making more games based on the grid library.

Spontaneous ideas are:

  • tetris
  • whac-a-mole
  • 2048

I don't know what the future might bring. Have fun playing.

LAN Party L3 Networks without additional Hardware

2018-05-18

  • LAN Parties need Broadcast traffic for network discovery
  • Many Networks rely on Spanning Tree and Fat-Links as Uplinks.
  • Hub-Spoke Network Design
  • Single Point of Failure
  • Real topology on the floor is more like a grid or mesh
  • L3 Networks can provide Failover and Load Balancing
  • requires additional hardware

LAN party networks need broadcast traffic for the local game discovery. This means that LAN party networks are usually a large layer 2 network. But large layer 2 networks with several hundred or thousand hosts are not that nice because the amount of broadcast that normal (Windows) computers send grows rapidly. Usually those networks are large hub-spoke designs. But this often results in very long cable runs if every switch has a direct link to the core switch. The alternative to very long cables would be daisy chaining a lot of switches (or even building loops which would be disabled by spanning tree).

The "L3" networks I have seen are still a hub-spoke design on L1 and L2. Each access switch puts all its access ports into a VLAN. The gateway of that VLAN is somewhere on a core switch which does inter-VLAN routing. To get the required broadcast traffic between the separate VLANs, all those VLANs are also connected to a server. This server then uses software like [2] or [3] which listens for broadcasts on all those VLANs and duplicates the received packets on all other VLANs. Depending on the software that is used, either whitelisting or blacklisting is used.

This design has several disadvantages:

  • The server that bridges the broadcasts is a single point of failure.
  • The core switch is a single point of failure for the whole network
  • The network cannot use redundant links for traffic because it still has to use spanning tree.

[2] (link to service-discovery-helper) [3] (link to bcast-bridge)

Importing music with beets (part 1)

2018-05-02

beets is a tool for organizing music. It can auto import your music collection. But the auto import feature is missing some things in my opinion. I will work around some of its limitations and report on the progress.

The current circumstances are:

  • I have a lot of music sorted in different ways. This means that I have multiple harddisks/folders with music on them. Each of them might be sorted differently, e.g. one with $artist/$album/$number - $title and another one with $artist - $album/$title. Some folders contain music that is also in others. Some may contain full albums, others might not. Some folders may contain the same song in a different file format. Some folders may also contain non-music files, e.g. the cover art, log files from converting to another file format or just other random things.
  • I want to sort my music collection in one place, sorted by $artist/$album/$number - $title
  • I want to have everything just once
  • I have ~30000 tracks in my collection that are already imported into beets and sorted properly.
  • I have several 100GB of music lying around that has to be sorted.
  • I want to do as little as possible by hand, because a manual approach does not scale well.
  • I am not afraid to lose some of the things that are somewhere on some harddisk because the difference between not finding them because I am too lazy to search several disks and not having them is not relevant from a practical perspective.

Okay, let's dive into the sorting process.

beets settings

Everything I have changed in the beets settings (~/.config/beets/config.yaml) is

directory: /data/music/library/
library: /data/music/library.blb
import:
        move: yes

The first line is where my music library will be stored and the second line defines where the database used by beets is stored. The relevant setting for me is the fourth line. This tells beets to move files into the library folder, which leaves me with folders that do not contain music files anymore.

Automatic importing

First I am running the automatic import feature of beets over everything. The important point is to create a logfile, because this will help us with duplicates.

The command to do this is: beet import -q -l import.log

Removing duplicates

beets can find duplicates but it does not automatically remove them. We will take care of those duplicates now with the log file we made with the last step. The logfile now says something like this:

import started Wed May  2 15:05:42 2018
skip /data/music/F/F.R. - Mixtape
skip /data/music/F/FAME - Real Talk Mixtape
skip /data/music/F/FaSy - 33
skip /data/music/F/Fabricant - Demo 2010
skip /data/music/F/Face Of Ruin - Within The Infinite
skip /data/music/F/Faeces - Upstream
skip /data/music/F/Fail Emotions - 2009 - Side A
skip /data/music/F/Fail Emotions - 2010 - Dance Macabre
skip /data/music/F/Fail Emotions - 2010 - Make Bad
skip /data/music/F/Falconer - 2003 - The Sceptre Of Deception
duplicate-skip /data/music/F/Fall Out Boy - Infinity On High
skip /data/music/F/Fall Out Boy - Take This To Your Grave-Direct
skip /data/music/F/Farin Urlaub - Am Ende der Sonne
skip /data/music/F/Farin Urlaub - Die Wahrheit übers Lügen
skip /data/music/F/Farin Urlaub - Endlich Urlaub
skip /data/music/F/Farin Urlaub - Livealbum of Death
skip /data/music/F/Farin Urlaub - Porzellan
skip /data/music/F/Fasics - Ich tu was ich kann! - EP
duplicate-skip /data/music/F/Fear Factory - Archetype (2004) [320KB]
skip /data/music/F/Fear Factory - Demanufacture (1995) [320KB]
duplicate-skip /data/music/F/Fear Factory - Demanufacture [Remastered] (2005) [320KB]
duplicate-skip /data/music/F/Fear Factory - Digimortal (2001) [320KB]
skip /data/music/F/Fear Factory - Mechanize (2010) [320KB]
skip /data/music/F/Fear Factory - Transgression (2005) [320KB]
... [chopped off here]

All lines that started with duplicate-skip have automatically been detected as duplicates. They are not needed anymore and can be removed. A simple bash one-liner for this could look like this:

grep "^duplicate-skip" import.log | cut -d" " -f2- | while IFS= read -r line; do rm -r "$line"; done

First we find all lines that start with duplicate-skip, then we chop off the duplicate-skip to get the path. Then we delete each of the duplicates. Since building one-liners is dangerous I would first start with

grep "^duplicate-skip" import.log | cut -d" " -f2- | while IFS= read -r line; do echo rm -r "$line"; done

The little but important difference lies in the echo. This prints all the rm -r commands that would be executed to the terminal, so they can be checked manually to prevent major screw-ups. When everything looks fine, the echo can be removed again.

In my case it looks like this:

rm -r /data/music/F/Fall Out Boy - Infinity On High
rm -r /data/music/F/Fear Factory - Archetype (2004) [320KB]
rm -r /data/music/F/Fear Factory - Demanufacture [Remastered] (2005) [320KB]
rm -r /data/music/F/Fear Factory - Digimortal (2001) [320KB]
rm -r /data/music/F/Finntroll - Jaktens Tid
rm -r /data/music/F/Finntroll - Jaktens Tid (2001)
rm -r /data/music/F/Finntroll - Midnattens Widunder (1999)
rm -r /data/music/F/Finntroll - Nattfodd (2004)
rm -r /data/music/F/Finntroll - Nattfödd
rm -r /data/music/F/Finntroll - Trollhammaren (2004)
rm -r /data/music/F/Finntroll - Ur Jordens Djup
rm -r /data/music/F/Finntroll - Visor Om Slutet (2003)
rm -r /data/music/F/For The Fallen Dreams - Back Burner 2011
rm -r /data/music/F/For The Fallen Dreams - Changes
rm -r /data/music/F/For The Fallen Dreams - Wasted Youth (2012)
rm -r /data/music/F/For the Fallen Dreams - Relentless
rm -r /data/music/F/Fort Minor - The Rising Tied
rm -r /data/music/F/Frithjof Brauer - Nevertheless
rm -r /data/music/F/Frithjof Brauer - Tales from the past
rm -r /data/music/F/From Autumn to Ashes - Holding a Wolf by the Ears
rm -r /data/music/F/From Autumn to Ashes - The Fiction We Live
rm -r /data/music/F/From Autumn to Ashes - The Fiction We Live_
rm -r /data/music/F/From Autumn to Ashes - Too Bad You're Beautiful_
rm -r /data/music/F/Funeral for a Friend - Hours
rm -r /data/music/F/Funeral for a Friend - Memory and Humanity
rm -r /data/music/F/Funeral for a Friend - Seven Ways to Scream Your Name
rm -r /data/music/F/Funeral for a Friend - Tales Don't Tell Themselves
rm -r /data/music/F/Funeral for a Friend - Welcome Home Armageddon

Since these seem to match with the logfile we can probably use the command without the echo and get rid of the duplicates.

This code has one little corner case: when beets decides to group two folders together, e.g. if you have a 2 CD album with the different CDs in different folders, the logfile does something like this:

duplicate-skip path1; path2

This screws up the command with the semicolon.
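One way around this corner case is to split such lines before doing anything with the paths. A minimal sketch in Python, assuming that the separator is always a semicolon followed by a space and that no folder name contains that sequence; the function name is mine and the two Album CD paths in the sample are made up for illustration.

```python
def duplicate_paths(log_text):
    """Return all paths from duplicate-skip lines of a beets import log."""
    prefix = "duplicate-skip "
    paths = []
    for line in log_text.splitlines():
        if not line.startswith(prefix):
            continue
        # Grouped folders are logged as "path1; path2" on a single line.
        for path in line[len(prefix):].split("; "):
            if path.strip():
                paths.append(path.strip())
    return paths


sample_log = """import started Wed May  2 15:05:42 2018
skip /data/music/F/Falconer - 2003 - The Sceptre Of Deception
duplicate-skip /data/music/F/Fall Out Boy - Infinity On High
duplicate-skip /data/music/F/Album CD1; /data/music/F/Album CD2
"""

# Dry run: only print what would be removed, like the echo variant.
for path in duplicate_paths(sample_log):
    print("would remove:", path)
```

Like the shell version this only prints at first; the actual removal should only be added after checking the output by hand.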

remove empty directories and other crap

Many of my music folders contain things like cover art, conversion logs or other files. Now it is time to get rid of them. A first and easy approach is to go for file extensions. This is only an approximation because technically the file extension is irrelevant. But it works pretty well.

To find all files I use find

find * -type f -print

This prints all files to the terminal. But we are only interested in the file extension, which is everything after the last dot. I don't care about files that don't have a file extension. They are probably not music files, so they will be deleted later.

find * -type f -name "*.*" -print | rev | cut -d"." -f1 | rev | sort -u > musicfileextensions.txt

The rev command is a little workaround for cut, which can only count fields from the start of a line and not from the end. The first rev reverses each line, then we cut and keep everything in front of the first ., which was the ending of the file name. The second rev reverses our file extensions back again. The sort -u sorts them and removes duplicates. The > musicfileextensions.txt writes the list to a file. Now we have a list of all file endings that exist in the directory we are working on.
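The same list can be built without the rev trick. A minimal sketch in Python using only the standard library; the function name is mine, and unlike find * it also looks at hidden files in the top-level directory. The demo runs on a throwaway directory with made-up file names.

```python
import tempfile
from pathlib import Path


def file_extensions(root):
    """Return the sorted set of extensions of all files below root."""
    extensions = set()
    for path in Path(root).rglob("*"):
        # Like -name "*.*": only files whose name contains a dot.
        if path.is_file() and "." in path.name:
            # Everything after the last dot, case preserved.
            extensions.add(path.name.rsplit(".", 1)[1])
    return sorted(extensions)


# Small demo on a temporary directory instead of a real music folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.mp3", "b.MP3", "cover.jpg", "README"):
        (Path(tmp) / name).touch()
    print(file_extensions(tmp))
```

Upper- and lowercase variants stay separate entries, just like in the output of sort -u.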

Mine looks like this:

JPG
MP3
db
gif
ini
jpg
log
m4a
mp3

My list is pretty short because I am only working on a subset of my collection. The final list might be longer.

Now I sadly have to edit this list manually. After removing every non music file extension the list is a little bit shorter.

MP3
m4a
mp3

This shows us something which is pretty annoying: people can not decide if the file extension is written in upper- or lowercase letters... We could now take care of this and build all commands in a way which ignores the case of letters, but since we gathered this list from all our music files it should be complete and contain all occurring case variations.

Time to get rid of some other files. find to the rescue (again)! We want to delete:

  • empty directories
  • all files which are not music files (which in our simplified view means all files that don't have one of the music file extensions)

find * -type f ! -name "*.mp3" ! -name "*.MP3" ! -name "*.m4a" -print
find -type d -empty -print

The first command finds all non-music files, the second one the empty directories. If we append -delete the files/directories are deleted. Use this with care, first try it without the -delete flag.

find * -type f ! -name "*.mp3" ! -name "*.MP3" ! -name "*.m4a" -print -delete
find -type d -empty -print -delete
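For a more careful dry run the two find commands can also be mirrored in a small script that only collects candidates and deletes nothing. This is a sketch in Python under my own assumptions: the function name is made up, the extension set is the hand-edited list from above, and the demo uses a temporary directory with invented file names.

```python
import os
import tempfile
from pathlib import Path

MUSIC_EXTENSIONS = {"mp3", "MP3", "m4a"}  # the hand-edited list from above


def cleanup_candidates(root):
    """Return (non_music_files, empty_dirs) below root without deleting."""
    non_music, empty_dirs = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        if not dirnames and not filenames:
            empty_dirs.append(Path(dirpath))
        for name in filenames:
            # Files without a dot have no extension and count as non-music.
            extension = name.rsplit(".", 1)[1] if "." in name else ""
            if extension not in MUSIC_EXTENSIONS:
                non_music.append(Path(dirpath) / name)
    return non_music, empty_dirs


# Small demo on a temporary directory instead of a real music folder.
with tempfile.TemporaryDirectory() as tmp:
    album = Path(tmp) / "album"
    album.mkdir()
    for name in ("song.mp3", "cover.jpg", "convert.log"):
        (album / name).touch()
    (Path(tmp) / "empty").mkdir()
    files, dirs = cleanup_candidates(tmp)
    print(sorted(p.name for p in files), [p.name for p in dirs])
```

Directories that only become empty after their non-music files are gone are not reported in the same pass, so the check has to be repeated after deleting.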

Now we got rid of a lot of things (hopefully).

The next steps will be harder and I will talk about them in the next part.

Today I am starting a blog.

2018-03-11

I am not sure what will happen here.