NextGen Stage3 - Dracut-based Initramfs¶
Based on the Dracut Praktikum project run in Freiburg.
Overview¶
dracut is a modular initramfs generation framework. Its basic functionality can be easily extended with custom modules, e.g. to realize a network boot module using qcow2 images exported via dnbd3 as the root filesystem.
The core framework provides a set of builtin modules handling typical tasks of the init process (mounting pseudo-fs, hardware detection with udev, ...).
See dracut for more information.
A custom dracut module was needed to extend the base functionality with the required components to support a network boot based on dnbd3.
See systemd-init for the module's repository.
Boot process overview¶
Here is a brief description of the most critical steps this module realizes:
- Parse the kernel command line for
  - the slxsrv and slxbase parameters, used later to fetch the configuration
  - DHCP information received during the initial PXE
- Hardware detection with udev
  - Network cards (mandatory)
  - Hard disk drive (optional, sort of)
- Network setup
  - (see dedicated section)
- Fetch the configuration file from http://${slxsrv}/${slxbase}/config, which contains, amongst others:
  - Server and path to the dnbd3 image to connect
  - Label of the partition to use as writable space during stage4
- Stage4 setup
  - Connect the dnbd3 image specified in the configuration file
  - Mount it as raw with xmount + libxmount_input_qemu
  - Add writable layer through device mapper
- Stage4 configuration
  - Copy over core stage3 services to stage4
  - Extract config.tgz
- pivot-root
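As an illustration of the first step, the following is a minimal sketch of extracting slxsrv and slxbase from a kernel command line string. The helper name kcl_get is hypothetical and not taken from the module; inside a real dracut module, the getarg helper from dracut-lib.sh would typically serve this purpose.

```shell
#!/bin/sh
# Print the value of the first "<key>=<value>" token found in a kernel
# command line string; return 1 if the key is absent.
# kcl_get is a hypothetical helper for illustration only. Values that
# contain spaces or glob characters are not handled.
kcl_get() {
    key="$1"
    cmdline="$2"
    for token in $cmdline; do
        case "$token" in
            "$key"=*)
                # strip everything up to and including the first "="
                printf '%s\n' "${token#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# On a booted client the input would be the real command line, e.g.:
#   slxsrv="$(kcl_get slxsrv "$(cat /proc/cmdline)")"
#   slxbase="$(kcl_get slxbase "$(cat /proc/cmdline)")"
```

Taking the command line as an argument instead of reading /proc/cmdline directly keeps the helper usable outside of an initramfs.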
Network setup¶
(Dedicated section because it's such a critical part of stage3)
The main job of a custom dracut module is to set up the network to access the remote rootfs.
In a PXE setup, the IP information received by the client during the initial DHCP should be passed to the initramfs via the kernel command line.
This is achieved with the syslinux option IPAPPEND 3 or by crafting the ip= option using iPXE's builtin variables.
- The standard network module parses the IP information from the kernel command line and configures the network interface accordingly using low-level tools like ip and route. It only supports a limited set of specific ip= parameter formats (see Dracut network), which does not include the PXE-like format. A workaround for this problem is to rewrite the ip= parameter to a dracut-friendly format by modifying the kernel command line (KCL) using a bind-mount on /proc/cmdline.
- The systemd-networkd module uses systemd's network manager to bring up the network. The module itself only installs the required binaries and service files to the initramfs but, unlike the network module, does not set up a "wait-for-network" logic in dracut's main boot process. As the dnbd3-rootfs module expects the network to be accessible in the pre-mount phase, an extension of the basic systemd-networkd module should handle the
- A custom network setup module would require a lot of work in writing a solid network module capable of handling different hardware setups. While this approach is the most flexible when it comes to supporting exotic network configurations, the use of existing network-related modules should be evaluated first and a complete custom network setup solution should only be considered as a last resort.
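The ip= rewrite mentioned as a workaround for the standard network module could look roughly like the sketch below. It assumes the PXE-like format is the four-field ip=<client>:<server>:<gateway>:<netmask> appended by PXELINUX, and that the target is dracut's seven-field static form with the peer and hostname fields left empty. The function name rewrite_pxe_ip and the interface name passed to it are hypothetical.

```shell
#!/bin/sh
# Rewrite a PXELINUX-style "ip=<client>:<server>:<gateway>:<netmask>" token
# into "ip=<client>::<gateway>:<netmask>::<interface>:none".
# All other tokens are passed through unchanged.
# rewrite_pxe_ip and its interface argument are hypothetical.
rewrite_pxe_ip() {
    iface="$1"
    cmdline="$2"
    out=""
    for token in $cmdline; do
        case "$token" in
            ip=*:*:*:*)
                # split the value on ":" into positional parameters
                old_ifs="$IFS"
                IFS=:
                set -- ${token#ip=}
                IFS="$old_ifs"
                # only touch the exact four-field PXE form
                if [ "$#" -eq 4 ]; then
                    token="ip=$1::$3:$4::$iface:none"
                fi
                ;;
        esac
        out="${out:+$out }$token"
    done
    printf '%s\n' "$out"
}

# Possible usage early in the initramfs (needs root; not run here):
#   rewrite_pxe_ip eth0 "$(cat /proc/cmdline)" > /tmp/cmdline
#   mount --bind /tmp/cmdline /proc/cmdline
```

The server field is dropped on purpose in this sketch, on the assumption that the boot server should not be configured as a point-to-point peer.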
First tests seem to indicate that systemd-networkd is best suited for our scenarios.
Stage4 setup¶
This section describes how stage4 is set up during the stage3 initial boot phase.
The configuration file downloaded during stage3's boot process contains the path to the stage4 image to use.
- Connect the dnbd3 image specified in the SLX configuration file config
- Expose the qcow2 image as RAW with xmount + libxmount_input_qemu
- Add RW-layer with device mapper (either ID44 or tmpfs)
- Mount the device mapper output device as the future rootfs
These steps are central to every boot process and are performed on every dnbd3 image.
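To make the RW-layer step more concrete, here is a minimal sketch that builds a table for device mapper's snapshot target. Whether the module uses exactly this target and these parameters is an assumption; the function name snapshot_table, the device names, and the mapping name root in the usage comments are hypothetical.

```shell
#!/bin/sh
# Print a device mapper "snapshot" table of the form
#   <start> <length> snapshot <origin> <cow device> <P|N> <chunk size>
# "N" requests a non-persistent snapshot; the chunk size of 8 sectors
# (4 KiB) is an arbitrary choice for this sketch.
snapshot_table() {
    origin="$1"    # read-only device exposing the RAW image
    cow="$2"       # writable device that receives changed blocks
    sectors="$3"   # size of the origin in 512-byte sectors
    printf '0 %s snapshot %s %s N 8\n' "$sectors" "$origin" "$cow"
}

# Possible usage on a client (needs root and real devices; not run here):
#   sectors="$(blockdev --getsz /dev/loop0)"
#   dmsetup create root --table "$(snapshot_table /dev/loop0 /dev/loop1 "$sectors")"
#   mount /dev/mapper/root /sysroot
```

The copy-on-write device could be backed by a file on the ID44 partition or on a tmpfs, matching the two variants listed above.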
Some open questions concerning future development:
- Where to save qcow2 backing files (diffs)? User management?
- Mounting: Filesystem snapshots like BTRFS?
- Caveat: resize2fs problems with ext4 (konrad)
Stage4 configuration¶
WIP:
- user-config: download and extract config.tgz (if needed)
- auto-config: generate configuration file for the stage4's network manager
- auto-config: generate fstab for stage4
- auto-config: copy required dracut services
Configuration¶
During the boot process, the clients will download a configuration file from the boot server containing the path to the DNBD3 image, the root partition label and various options needed for the RW-layer (of note: OpenSLX-ID44 GPT partition label).
Its path is built using the kernel command line parameters slxsrv and slxbase: http://${slxsrv}/${slxbase}/config
It will be downloaded to /etc/openslx in the stage3 and copied over as /opt/openslx/openslx in the stage4.
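The download described above might be sketched as follows. The function names config_url and fetch_config are hypothetical, and the use of curl is an assumption about which HTTP client is available in the initramfs.

```shell
#!/bin/sh
# Build the configuration URL from the slxsrv and slxbase values.
# config_url and fetch_config are hypothetical helper names.
config_url() {
    printf 'http://%s/%s/config\n' "$1" "$2"
}

# Download the configuration to the given destination file.
# curl options: -s silent, -f fail on HTTP errors, -o output file.
fetch_config() {
    curl -sf -o "$3" "$(config_url "$1" "$2")"
}

# Possible usage in stage3 (needs network access; not run here):
#   fetch_config "$slxsrv" "$slxbase" /etc/openslx
```

Using -f makes curl return a non-zero exit status on HTTP errors, so a missing configuration can be detected instead of saving an error page as the configuration file.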
Example¶
The example configuration below will embed the DNBD3 image named SLX_DNBD3_IMAGE with revision SLX_DNBD3_RID from the DNBD3 server at SLX_DNBD3_SERVERS (more than one server is supported, but let's assume only one is given for now).
It will then use SLX_SYSTEM_PARTITION_IDENTIFIER to find the root partition within the DNBD3 image (using lsblk).
If a partition labeled OpenSLX-ID44 is found, it will be used as writable space for the copy-on-write file of the base qcow2 image using device mapper.
The writable device is then mounted as /sysroot using SLX_MOUNT_ROOT_OPTIONS.
    SLX_CONFIGURATION_LOCATION='/opt/openslx/'
    SLX_DNBD3_SERVERS='1.2.3.4'
    SLX_DNBD3_RID='1'
    SLX_DNBD3_DEVICE='/dev/dnbd0'
    SLX_DNBD3_IMAGE='packer.ubuntu1604.qcow2'
    SLX_SYSTEM_PARTITION_IDENTIFIER='SLX_SYS'
    SLX_SYSTEM_PARTITION_PREPARATION_SCRIPT=''
    SLX_WRITABLE_DEVICE_IDENTIFIER='OpenSLX-ID44'
    SLX_WRITABLE_DEVICE_IDENTIFIER_TIMEOUT_IN_SECONDS='10'
    SLX_WRITABLE_DEVICE_STORAGE_FILESYSTEM_CREATE_COMMAND='mkfs.ext4'
    SLX_WRITABLE_DEVICE_STORAGE_FILESYSTEM_CHECK_COMMAND='fsck.ext4'
    SLX_WRITABLE_DEVICE_STORAGE_MAXIMUM_FILE_SIZE_IN_MB='2000'
    SLX_WRITABLE_DEVICE_STORAGE_FILE_PATH=''
    SLX_WRITABLE_DEVICE_PERSISTENT='no'
    SLX_GENERATE_FSTAB_SCRIPT=''
    SLX_RAMDISK_SIZE_IN_KB='1000000'
    SLX_MOUNT_ROOT_OPTIONS='-o subvol=@'
    SLX_LOG_FILE_PATH='/var/log/openslx'
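The lsblk-based partition lookup could be sketched as follows. This is an illustration rather than the module's actual code: the function name find_device_by_identifier is hypothetical, and matching the identifier against both the filesystem label and the GPT partition label is an assumption.

```shell
#!/bin/sh
# Read `lsblk --pairs` output on stdin and print /dev/<name> of the first
# device whose LABEL or PARTLABEL equals the given identifier.
# Returns 1 if no device matches.
# find_device_by_identifier is a hypothetical helper name.
find_device_by_identifier() {
    wanted="$1"
    q='"'
    while read -r line; do
        case "$line" in
            *" LABEL=$q$wanted$q"*|*" PARTLABEL=$q$wanted$q"*)
                # the line starts with NAME="<name>"; cut out <name>
                name="${line#NAME=$q}"
                name="${name%%$q*}"
                printf '/dev/%s\n' "$name"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible usage on a client (not run here):
#   lsblk --pairs --noheadings --output NAME,LABEL,PARTLABEL \
#       | find_device_by_identifier "$SLX_SYSTEM_PARTITION_IDENTIFIER"
```

Reading the lsblk output from stdin keeps the matching logic independent of the hardware it runs on.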
Notes¶
systemd-networkd¶
- The network dracut module sets up an active wait for network in dracut's initqueue step:

      # make sure dracut runs initqueue
      touch /lib/dracut/need-initqueue
      # stay in initqueue as long as we don't have network access
      /sbin/initqueue --finished /lib/systemd/systemd-networkd-wait-online

- The systemd-networkd module does not do this by default
  - needs a module extension
- Renaming of the boot interface can be done via .link files
- Tested ok on Ubuntu 16.04
- Failed on CentOS 7.3
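Regarding the renaming of the boot interface via .link files, a minimal sketch of generating such a file from the BOOTIF value that PXELINUX appends to the kernel command line is shown below. The function names, the interface name boot0 and the file name 10-boot.link are hypothetical; only the [Match]/MACAddress= and [Link]/Name= keys are standard systemd.link settings.

```shell
#!/bin/sh
# Convert a PXELINUX BOOTIF value ("01-aa-bb-cc-dd-ee-ff") into a MAC
# address ("aa:bb:cc:dd:ee:ff"). The leading "01-" is the hardware type.
bootif_to_mac() {
    printf '%s\n' "${1#01-}" | sed 's/-/:/g'
}

# Print a .link file that renames the interface with the given MAC address.
# link_file and its arguments are hypothetical.
link_file() {
    cat <<EOF
[Match]
MACAddress=$1

[Link]
Name=$2
EOF
}

# Possible usage in the initramfs (not run here):
#   link_file "$(bootif_to_mac "$BOOTIF")" boot0 > /etc/systemd/network/10-boot.link
```

Since .link files are applied by udev when the device appears, the file would have to be in place before the network card is detected.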
Updated by Jonathan Bauer about 7 years ago · 18 revisions