+++
date = 2024-02-08
title = "Arch Linux: Improve boot time performance"
tags = ["homelab", "arch", "linux"]
+++

{{% figure src="/images/2024/02/arch-linux-logo.png" class="side-image" %}}

I run Debian on all my servers. It's a great, stable OS and I love it. Proxmox, [which I run on my homelab server](https://www.devroom.io/2020/11/12/the-big-diy-nas-update/#proxmox), is also based on Debian.

However, on my desktop I run [Arch Linux](https://archlinux.org/). It's a great distro to tinker with. It comes with a lot of _up-to-date_ packages, and it also has the AUR, the Arch User Repository. So for any app you can think of, there is probably an easy way to install it.

### Slllooooowwww...

Lately, I noticed that boot times on my system were getting longer, which is strange, because I run some pretty okay hardware.

As it turns out, cold booting this box takes 1min 7.538s, according to my logs.

Luckily, the [Arch Wiki](https://wiki.archlinux.org/) offers a [nice guide on how to troubleshoot boot performance](https://wiki.archlinux.org/title/Improving_performance/Boot_process).

There's `systemd-analyze blame`, which shows the time each service takes to start up. I've copied the top 10 here, which, incidentally, are exactly the units that take more than one second to start.

<pre><font color="#E64747">❯</font> <font color="#42E66C">systemd-analyze</font> blame
20.771s docker.service
 3.514s dev-sdb3.device
 2.459s systemd-journal-flush.service
 1.880s upower.service
 1.806s ldconfig.service
 1.687s systemd-tmpfiles-setup.service
 1.587s containerd.service
 1.287s systemd-modules-load.service
 1.032s systemd-fsck@dev-disk-by\x2duuid-96EB\x2d4C82.service
 1.028s cups.service
</pre>

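To pull just the slow units out of a long `blame` listing, a small filter can help. This is only a sketch: `slow_units` is a hypothetical helper name, not part of systemd, and it assumes that sub-second durations are printed with an `ms` or `us` suffix, so every other entry took at least one second.

```shell
# Hypothetical helper: keep only units that took one second or longer.
# Assumption: sub-second durations end in "ms" or "us"; anything else
# (for example "3.514s" or "1min 2.000s") is at least one second.
slow_units() {
  awk '$1 !~ /(ms|us)$/'
}

# Usage: systemd-analyze blame | slow_units
```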
Docker is a clear offender here. `dev-sdb3` is also quite slow, it seems.

Another command recommended in the wiki is `systemd-analyze critical-chain`. It shows the chain of units that your boot has to wait for. Again, docker is clearly a big offender here.

<pre><font color="#E356A7">❯</font> <font color="#42E66C">systemd-analyze</font> critical-chain
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @33.660s
└─multi-user.target @33.660s
  └─<font color="#E356A7"><b>docker.service @12.888s +20.771s</b></font>
    └─<font color="#E356A7"><b>containerd.service @11.264s +1.587s</b></font>
      └─network.target @11.236s
        └─<font color="#E356A7"><b>wpa_supplicant.service @27.465s +268ms</b></font>
          └─basic.target @10.366s
            └─<font color="#E356A7"><b>dbus-broker.service @9.822s +541ms</b></font>
              └─dbus.socket @9.793s
                └─sysinit.target @9.759s
                  └─<font color="#E356A7"><b>systemd-update-done.service @9.722s +36ms</b></font>
                    └─<font color="#E356A7"><b>systemd-journal-catalog-update.service @9.375s +326ms</b></font>
                      └─<font color="#E356A7"><b>systemd-tmpfiles-setup.service @7.657s +1.687s</b></font>
                        └─local-fs.target @7.587s
                          └─<font color="#E356A7"><b>boot.mount @7.458s +128ms</b></font>
                            └─<font color="#E356A7"><b>systemd-fsck@dev-disk-by\x2duuid-96EB\x2d4C82.service @6.398s +1.032s</b></font>
                              └─dev-disk-by\x2duuid-96EB\x2d4C82.device @6.397s
</pre>

But wait, there's more. `systemd-analyze plot > plot.svg` will generate an SVG image showing the entire boot process over time. It's big, but there are some clear red markers that indicate issues.

At the bottom right you'll find `graphical.target`, which is where we want to end up as quickly as possible. And it's clear `docker` is in the way.

![](/images/2024/02/pre-plot.svg)
_Open the SVG in a new window to see more detail._

## Fixed it!

So, with `docker` as the clear offender in slowing down the boot process, let's fix that.

There are two systemd units: `docker.service` and `docker.socket`.

- `docker.service` starts docker and makes sure it is up and running.
- `docker.socket` listens on `/run/docker.sock` (or `/var/run/docker.sock` through a symlink) and will start `docker.service` when needed.

I think you know where this is going. `docker.socket` is disabled by default and `docker.service` is enabled. That makes sense: when you boot your machine, you want docker up and running as well, especially on servers.

For my desktop, not so much. I use docker, but not all the time, and I prefer to log in and check my email while docker starts in the background anyway.

The trick, then, is to stop `docker.service` from starting automatically and make sure `docker.socket` is enabled. That takes docker out of the critical chain when booting and starts docker when I'm logged in and ready to use it.

```
$ sudo systemctl disable docker.service
$ sudo systemctl enable docker.socket
```

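To double-check the change, `systemctl is-enabled` prints the enablement state of a unit. The sketch below wraps it in a hypothetical `expect_state` helper (the name is made up for illustration); the expected states are simply the ones this setup aims for.

```shell
# Hypothetical helper: compare a unit's enablement state with what we expect.
# `systemctl is-enabled UNIT` prints the state, e.g. "enabled" or "disabled".
expect_state() {
  unit="$1"
  wanted="$2"
  # is-enabled exits non-zero for disabled units, so ignore the exit code
  # and only look at the printed state.
  state="$(systemctl is-enabled "$unit" 2>/dev/null || true)"
  if [ "$state" = "$wanted" ]; then
    echo "ok: $unit is $state"
  else
    echo "unexpected: $unit is '$state', wanted '$wanted'"
    return 1
  fi
}

# Usage:
#   expect_state docker.service disabled
#   expect_state docker.socket enabled
```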
So, what does that look like in `systemd-analyze`?

<pre><font color="#E356A7">❯</font> <font color="#42E66C">systemd-analyze</font> critical-chain
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @3.893s
└─multi-user.target @3.893s
  └─<font color="#E356A7"><b>cups.service @3.672s +220ms</b></font>
    └─nss-user-lookup.target @3.763s
</pre>

<pre><font color="#E356A7">❯</font> <font color="#42E66C">systemd-analyze</font> blame
2.152s systemd-modules-load.service
1.295s dev-sdb3.device
 622ms boot.mount
 385ms NetworkManager.service
 310ms systemd-udev-trigger.service
 280ms udisks2.service
 258ms systemd-remount-fs.service
 220ms cups.service
 203ms user@1000.service
 189ms systemd-tmpfiles-setup.service
</pre>

![](/images/2024/02/post_plot.svg)
_Open the SVG in a new window to see more detail._

<pre><font color="#E64747">❯</font> <font color="#42E66C">systemctl</font> status docker.socket
<font color="#42E66C"><b>●</b></font> docker.socket - Docker Socket for the API
     Loaded: loaded (/usr/lib/systemd/system/docker.socket; <font color="#42E66C"><b>enabled</b></font>; preset: <font color="#D7D75F"><b>disabled</b></font>)
     Active: <font color="#42E66C"><b>active (running)</b></font> since Thu 2024-02-08 10:38:47 CET; 5min ago
   Triggers: <font color="#42E66C"><b>●</b></font> docker.service
     Listen: /run/docker.sock (Stream)
      Tasks: 0 (limit: 38400)
     Memory: 0B (peak: 516.0K)
        CPU: 1ms
     CGroup: /system.slice/docker.socket
</pre>

and

<pre><font color="#E64747">❯</font> <font color="#42E66C">systemctl</font> status docker.service
<font color="#42E66C"><b>●</b></font> docker.service - Docker Application Container Engine
     Loaded: loaded (/usr/lib/systemd/system/docker.service; <font color="#D7D75F"><b>disabled</b></font>; preset: <font color="#D7D75F"><b>disabled</b></font>)
     Active: <font color="#42E66C"><b>active (running)</b></font> since Thu 2024-02-08 10:39:33 CET; 5min ago
TriggeredBy: <font color="#42E66C"><b>●</b></font> docker.socket
       Docs: https://docs.docker.com
   Main PID: 2522 (dockerd)
      Tasks: 42
     Memory: 222.1M (peak: 235.7M)
        CPU: 797ms
     CGroup: /system.slice/docker.service
             └─<font color="#8A8A8A">2522 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock</font>
</pre>

## Was it worth it?

Before:

> Startup finished in 14.729s (firmware) + 6.386s (loader) + 12.761s (kernel) + 33.661s (userspace) = 1min 7.538s graphical.target reached after 33.660s in userspace.

After:

> Startup finished in 13.735s (firmware) + 4.074s (loader) + 6.744s (kernel) + 3.893s (userspace) = 28.448s graphical.target reached after 3.893s in userspace.

Total boot time went down from 1m8s to 28s. I cannot explain the difference in kernel boot time, but the userspace savings are significant.

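The savings can be checked with a bit of arithmetic. The sketch below uses a hypothetical `to_seconds` helper that only understands the `min`, `s` and `ms` suffixes; it is not a general parser for systemd time spans. The two inputs are the totals quoted above.

```shell
# Hypothetical helper: convert a time span such as "1min 7.538s" to seconds.
# Only the "min", "ms" and "s" suffixes are handled.
to_seconds() {
  echo "$1" | awk '{
    total = 0
    for (i = 1; i <= NF; i++) {
      if ($i ~ /min$/)     { sub(/min$/, "", $i); total += $i * 60 }
      else if ($i ~ /ms$/) { sub(/ms$/, "", $i);  total += $i / 1000 }
      else if ($i ~ /s$/)  { sub(/s$/, "", $i);   total += $i }
    }
    printf "%.3f\n", total
  }'
}

before="$(to_seconds '1min 7.538s')"
after="$(to_seconds '28.448s')"

# Print the absolute and relative savings of the total boot time.
awk -v b="$before" -v a="$after" \
  'BEGIN { printf "saved %.3fs (%.0f%%)\n", b - a, (b - a) / b * 100 }'
```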
From here I could probably optimize more by compiling a customized kernel or using a different bootloader. Suspend to RAM would be even faster, but that feels like cheating compared to a cold boot.

Hopefully this gives you some pointers on how to troubleshoot slow boot times on your machine.