Linux kernel module for reading compressed QCOW2 images

State of the art

  • Direct reading of QCOW2 disk image files is not possible in the Linux kernel
  • Workarounds: converting the image to a raw binary image of fixed size, reading the image as a network block device provided by nbd, or using xmount

Motivation

  • Avoid these workarounds because of their heavy resource usage:
    - Conversion to binary raw image: needs a lot of additional storage
    - Using nbd or xmount: requires many switches between kernel space and user space
  • Implement a dedicated Linux kernel module that reads QCOW2 images directly and avoids this heavy resource usage

Use case

  • The Linux kernel and the initramfs are loaded over the network via PXE/GRUB
  • A QCOW2 image with data is loaded from a server
  • The developed QCOW2 kernel module reads data from the already loaded QCOW2 image and allows mounting the file systems inside the QCOW2 image container

Requirements

  • Read-only access to the QEMU QCOW2 disk image file format
  • Sparse disk image files must be supported
  • Compressed disk image files must be supported
  • Must be compilable and runnable under Linux kernel version 5 or later
  • Optional: Should be merged into the Linux mainline kernel source code

Risks

  • The kernel module stops working due to external circumstances (e.g. the kernel source changes in the future)
  • A team member cannot finish the project due to force majeure or illness
  • The resulting source code of the project is not accepted by the kernel maintainers
  • The merging process of the resulting source code into the mainline kernel source tree exceeds the project period

Rating

  • Stability of the running kernel module
  • Quality of the code
  • Optional: Merge into the Linux mainline kernel source tree

Implementation approaches

FUSE

Implement the reading of the QCOW2 disk image as a FUSE module.

Advantages:
  • File format of QCOW2 is not embedded into the kernel source code
  • FUSE module can be used by every user and does not need root permissions
Disadvantages:
  • Low performance due to switches between kernel space and user space
  • FUSE is only intended for file system usage and not for block layer functionality (e.g. missing partition scanning)

Device mapper target

Reading is done via a device mapper target that is placed on top of an already existing block device to read the QCOW2 disk image content.

Advantages:
  • Underlying layers and infrastructure can be used to implement the reading of QCOW2 disk images easily
  • Only the processing of the content must be implemented because QCOW2 disk image content is already provided via block device
Disadvantages:
  • Difficult to extend the implementation to support write access
  • File format of QCOW2 is embedded into the kernel source code

Loop device

Extend the existing loop device to read QCOW2 disk image files, too.

Advantages:
  • Already existing kernel functionality can be used, e.g. partition scanning of block devices and supporting sparse disk images even for writing is possible
  • Can be extended for many file formats besides QCOW2 in the future
Disadvantages:
  • Implementation is not as easy as the device mapper target approach
  • File format of QCOW2 is embedded into the kernel source code

Existing technologies

Use multiple device mapper targets and file systems to reproduce the functionality of QCOW2 disk images.

Advantages:
  • Flexible way to reuse already existing solutions
  • Components can be stacked together or replaced if the requirements change
Disadvantages:
  • Maintenance takes longer due to the more complex setup, and the QCOW2 disk image file format itself is not used
  • Deployment is difficult because many different modules (device mapper and file system) on different layers are involved

Discussion with the Linux kernel developers

The different implementation approaches have been discussed on the Linux block layer mailing list.
All messages can be found here: https://www.spinics.net/lists/linux-block/msg39538.html

Overall, the discussion suggested several technologies for general use cases but did not favor a particular approach for the realization.
It only revealed contradictions in certain implementation approaches, including the QCOW2 realization plan.
For that reason, a good compromise had to be found.

Analysis of the technologies for the implementation approach

The loop device approach was selected as a good compromise for the implementation of the new QCOW2 in-kernel support.
Therefore, the loop device module and the corresponding user space tool losetup must be analyzed.
Furthermore, the QCOW2 file format must be taken into account to obtain the knowledge for making a good software design.

Loop device structure

The loop device module implements the functionality to provide a specified binary file as block device.
The structure of the module was reconstructed as a UML class diagram, which can be found in the following files:
source:analysis/loop/class_diagram_loop_device_small.svg
source:analysis/loop/class_diagram_loop_device_small.pdf

The initialization and exit of the entire driver are visualized as a sequence diagram:
source:analysis/loop/sequence_diagram_loop_device_init_exit.svg
source:analysis/loop/sequence_diagram_loop_device_init_exit.pdf

The call graph of the loop device module was reconstructed, too:
source:analysis/loop/call_graph_loop_device.svg
source:analysis/loop/call_graph_loop_device.pdf

The loop device driver implements special I/O control functions (ioctls) to control the individual loop devices.
It uses sysfs entries to provide information about each loop device to the user.
A loop device is set up through the loop-control misc device. After that, the device can be configured or modified through the ioctls of its block_device_operations.
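
The two-step setup described above can be sketched from user space as follows. This is a minimal illustration against the stock loop driver interface only: the ioctl numbers are taken from the upstream linux/loop.h header, the function names are hypothetical, and any file format selection added by this project's modified loop.h is not shown. Running it requires root permissions and an available /dev/loop-control.

```python
import fcntl
import os

# ioctl request numbers from the upstream <linux/loop.h> UAPI header.
LOOP_SET_FD = 0x4C00        # bind a backing file to a loop device
LOOP_CLR_FD = 0x4C01        # detach the backing file again
LOOP_CTL_GET_FREE = 0x4C82  # ask loop-control for a free device index


def get_free_loop_index(control_path="/dev/loop-control"):
    """Return the index of a free loop device via the loop-control misc device."""
    fd = os.open(control_path, os.O_RDWR)
    try:
        # For this request the ioctl return value is the device index.
        return fcntl.ioctl(fd, LOOP_CTL_GET_FREE)
    finally:
        os.close(fd)


def attach_file(index, image_path):
    """Bind image_path read-only to /dev/loop<index> and return the device path."""
    device_path = f"/dev/loop{index}"
    image_fd = os.open(image_path, os.O_RDONLY)
    try:
        loop_fd = os.open(device_path, os.O_RDONLY)
        try:
            # Block device ioctl: the argument is the file descriptor of the image.
            fcntl.ioctl(loop_fd, LOOP_SET_FD, image_fd)
        finally:
            os.close(loop_fd)
    finally:
        os.close(image_fd)
    return device_path


def detach(index):
    """Remove the backing file from /dev/loop<index>."""
    loop_fd = os.open(f"/dev/loop{index}", os.O_RDONLY)
    try:
        fcntl.ioctl(loop_fd, LOOP_CLR_FD)
    finally:
        os.close(loop_fd)
```

A typical call sequence would be get_free_loop_index(), then attach_file() with the returned index, and detach() once the device is no longer needed.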

Losetup functionality

Losetup is the corresponding user space tool for configuring the loop device driver running in the kernel context. It uses the ioctls and sysfs entries provided by the driver to set up and configure loop devices from user space.

The functionality of the losetup utility is visualized as a call graph:
source:analysis/losetup/call_graph_losetup.svg
source:analysis/losetup/call_graph_losetup.pdf
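
As a rough illustration of the sysfs side of this interface, the following sketch reads the attribute files that the stock loop driver exposes for a bound device. The attribute names are those of the upstream driver; whether the modified driver of this project adds further entries is not covered here, and the function name and the sysfs_root parameter are hypothetical.

```python
import os

# Attribute files the stock loop driver exposes under /sys/block/loopN/loop/.
LOOP_ATTRS = ("backing_file", "offset", "sizelimit", "autoclear", "partscan", "dio")


def read_loop_attrs(index, sysfs_root="/sys/block"):
    """Return {attribute: value} for loop device <index>.

    Attributes that cannot be read (e.g. because the device is not bound
    to a file) are skipped, so an unbound device yields an empty dict.
    """
    attr_dir = os.path.join(sysfs_root, f"loop{index}", "loop")
    result = {}
    for name in LOOP_ATTRS:
        try:
            with open(os.path.join(attr_dir, name)) as attr_file:
                result[name] = attr_file.read().strip()
        except OSError:
            continue
    return result
```

For example, read_loop_attrs(0) returns the backing file path and offset of /dev/loop0 if that device is currently set up.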

QCOW2 disk image file format

A detailed description of the file format can be found at source:analysis/qcow2-qemu/doc/qcow2.txt.
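
To give an impression of the format, the following user space sketch parses the fixed part of the QCOW2 header as laid out in the QEMU specification (72 big-endian bytes shared by versions 2 and 3). It is an illustration only and not the kernel implementation of this project; the function name is hypothetical.

```python
import struct

QCOW_MAGIC = 0x514649FB  # the bytes 'Q', 'F', 'I', 0xfb

# Big-endian layout of the first 72 header bytes (common to versions 2 and 3).
HEADER_FORMAT = ">IIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)


def parse_qcow2_header(data):
    """Parse the fixed QCOW2 header fields from the first bytes of an image."""
    length = struct.calcsize(HEADER_FORMAT)
    if len(data) < length:
        raise ValueError("not enough data for a QCOW2 header")
    header = dict(zip(HEADER_FIELDS, struct.unpack(HEADER_FORMAT, data[:length])))
    if header["magic"] != QCOW_MAGIC:
        raise ValueError("not a QCOW2 image (bad magic)")
    if header["version"] not in (2, 3):
        raise ValueError("unsupported QCOW2 version")
    # The cluster size is stored as a power of two.
    header["cluster_size"] = 1 << header["cluster_bits"]
    return header
```

Passing the first 72 bytes of an image file to parse_qcow2_header() yields, among other fields, the virtual disk size (size), the cluster size, and the location of the L1 table.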

The call graph of the implementation is displayed in the following files:
source:analysis/qcow2-qemu/call_graph_qemu_qcow2.svg
source:analysis/qcow2-qemu/call_graph_qemu_qcow2.pdf
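
Sparse and compressed images are both expressed through the L2 table entries of the format. The following sketch classifies a single 64-bit L2 entry according to the QEMU specification. It is a simplified illustration (the version 3 "reads as zeros" flag and extended L2 entries are ignored), and the function name and the returned dictionary layout are hypothetical.

```python
L2E_COMPRESSED = 1 << 62              # bit 62: cluster is stored compressed
L2E_OFFSET_MASK = 0x00FFFFFFFFFFFE00  # bits 9-55: host offset of a standard cluster


def decode_l2_entry(entry, cluster_bits):
    """Classify an L2 entry as unallocated, standard, or compressed."""
    if entry & L2E_COMPRESSED:
        # Compressed descriptor: the low x bits hold the host offset, the
        # bits from x up to 61 hold the number of additional 512-byte sectors.
        x = 62 - (cluster_bits - 8)
        host_offset = entry & ((1 << x) - 1)
        extra_sectors = (entry >> x) & ((1 << (cluster_bits - 8)) - 1)
        return {"type": "compressed", "host_offset": host_offset,
                "sectors": extra_sectors + 1}
    host_offset = entry & L2E_OFFSET_MASK
    if host_offset == 0:
        # No host cluster assigned: this part of the image is sparse.
        return {"type": "unallocated"}
    return {"type": "standard", "host_offset": host_offset}
```

A reader would return zeros (or consult a backing file) for unallocated clusters, read standard clusters directly at host_offset, and decompress the data starting at host_offset for compressed clusters.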

The implementation

The implementation can be built either manually with the following steps or with the Makefile provided in the root directory. Invoking the Makefile with the parameter BUILD=out-of-tree builds the kernel modules of the project outside the Linux source tree, for the Linux kernel currently used on the development machine. If the parameter is not specified, an in-kernel-tree build is performed. In addition, the user space tool losetup is built.

Building the Linux kernel modules

In-kernel-tree build

The modified loop device kernel module and all new file format subsystem kernel modules can be built manually by executing the following commands:

~$ KCONFIG_CONFIG=$(pwd)/config/kernel-qcow2_x86-64_module.config make -C implementation/loop drivers/block/loop/loop.ko
~$ KCONFIG_CONFIG=$(pwd)/config/kernel-qcow2_x86-64_module.config make -C implementation/loop drivers/block/loop/loop_file_fmt_raw.ko
~$ KCONFIG_CONFIG=$(pwd)/config/kernel-qcow2_x86-64_module.config make -C implementation/loop drivers/block/loop/loop_file_fmt_qcow.ko

After that, the built loop device and file format kernel modules are stored under implementation/loop/drivers/block/loop/*.ko.

Out-of-kernel-tree build

Before building the modified loop device module and all new file format subsystem kernel modules, the Linux headers matching the currently running Linux kernel must be installed. Note that the original kernel headers must be patched with the following modified or new files:

~$ sudo cp -f implementation/loop/include/uapi/linux/loop.h           /lib/modules/`uname -r`/build/include/uapi/linux/loop.h
~$ sudo cp -f implementation/loop/drivers/block/Kconfig               /lib/modules/`uname -r`/build/drivers/block/Kconfig

~$ sudo mkdir -p /lib/modules/`uname -r`/build/drivers/block/loop

~$ sudo cp -f implementation/loop/drivers/block/loop/loop_main.h      /lib/modules/`uname -r`/build/drivers/block/loop/loop_main.h
~$ sudo cp -f implementation/loop/drivers/block/loop/loop_file_fmt.h  /lib/modules/`uname -r`/build/drivers/block/loop/loop_file_fmt.h
~$ sudo cp -f implementation/loop/drivers/block/loop/Kconfig          /lib/modules/`uname -r`/build/drivers/block/loop/Kconfig

After that, all prerequisites are available and all kernel modules can be built manually out-of-tree by executing the following commands:

~$ make -C /lib/modules/`uname -r`/build CONFIG_BLK_DEV_LOOP=m M=$(pwd)/implementation/loop/drivers/block/loop loop.ko
~$ make -C /lib/modules/`uname -r`/build CONFIG_BLK_DEV_LOOP_FILE_FMT_RAW=m M=$(pwd)/implementation/loop/drivers/block/loop loop_file_fmt_raw.ko
~$ make -C /lib/modules/`uname -r`/build CONFIG_BLK_DEV_LOOP_FILE_FMT_QCOW=m M=$(pwd)/implementation/loop/drivers/block/loop loop_file_fmt_qcow.ko

The built loop device and file format kernel modules are then stored under implementation/loop/drivers/block/loop/*.ko.

Building the modules into the Linux kernel

If the implementation is to be built into the Linux kernel, a kbuild configuration file must be prepared first. The configuration file must contain the following options to enable all functionality of the implementation:

CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_LOOP_FILE_FMT_RAW=y
CONFIG_BLK_DEV_LOOP_FILE_FMT_QCOW=y

Please keep in mind that the CONFIG_BLK_DEV_LOOP_FILE_FMT_QCOW option requires ZLIB functionality, which must be enabled as well. Once the configuration is prepared, the modules can be built into the kernel using the normal Linux build command:

~$ KCONFIG_CONFIG=$(pwd)/config/kernel-qcow2_x86-64_built-in.config make -C implementation/loop

In this example, a default kbuild configuration file is used to test the built-in support of the implementation.
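
Before starting a built-in build, a configuration file can be checked for the three options listed above. The following helper is a hypothetical sketch and not part of the project; it only parses the plain NAME=value lines of a kbuild configuration file. The ZLIB option is not checked because its exact symbol name is not stated here.

```python
REQUIRED_OPTIONS = (
    "CONFIG_BLK_DEV_LOOP",
    "CONFIG_BLK_DEV_LOOP_FILE_FMT_RAW",
    "CONFIG_BLK_DEV_LOOP_FILE_FMT_QCOW",
)


def parse_kconfig(text):
    """Return {option: value} for all NAME=value lines of a kbuild config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as '# CONFIG_FOO is not set'.
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if separator:
            options[name] = value
    return options


def missing_options(text, required=REQUIRED_OPTIONS, accepted=("y",)):
    """List the required options that are not set to one of the accepted values."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name) not in accepted]
```

Calling missing_options() with the content of the configuration file returns an empty list if all three options are built in. For a module build, accepted=("y", "m") can be passed instead.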

Building the losetup utility

The userspace tool losetup can be built manually by executing the following commands:

~$ cd implementation/losetup
~$ ./tools/config-gen all
~$ make losetup

The built losetup utility can be found in the current directory. In case of a shared build, the corresponding libraries can be found in the directory .libs.

Repository structure

The development of the kernel modules and the losetup utility takes place in the two following subrepositories included as Git submodules in the main qcow2-kernel repository:
  • qcow2-kernel-linux: Contains all Linux kernel related changes and kernel modules. The implementation is placed in the current Linux development tree (Kernel 5.3.0-rc5+) in the branch kernel-qcow2. A backport for Linux kernel 4.19.67+ is available in the branch kernel-qcow2-linux-4.19.y.
  • qcow2-kernel-util-linux: The changes for the user space utility losetup are committed to this repository in the branch kernel-qcow2.

Debugging

The QCOW2 file format driver provides a debugfs interface for debugging purposes if the Linux kernel is built with the CONFIG_DEBUG_FS option enabled. At the moment, the QCOW2 driver exports debugfs entries for each QCOW loop device X under loop/loopX/QCOW/ in the mounted debugfs folder.

The debugfs entry header shows the header of the loaded QCOW file in human-readable form.

The debugfs entry offset calculates the current cluster offset in the QCOW file for a given block device offset. Writing a block device offset into the file triggers the calculation of the corresponding cluster offset. The result can be obtained by reading from the file afterwards.
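
The first step of such a lookup, splitting a block device offset into table indices, can be sketched as follows. The arithmetic follows the QCOW2 specification for standard 8-byte L2 entries; the function name is hypothetical, and the sketch does not read the L1 and L2 tables themselves, so it does not reproduce the value reported by the debugfs entry.

```python
def split_guest_offset(offset, cluster_bits):
    """Split a block device offset into L1 index, L2 index, and in-cluster offset."""
    # One L2 table fills a cluster with 8-byte entries: cluster_size / 8 entries.
    l2_bits = cluster_bits - 3
    cluster_index = offset >> cluster_bits
    return {
        "l1_index": cluster_index >> l2_bits,
        "l2_index": cluster_index & ((1 << l2_bits) - 1),
        "offset_in_cluster": offset & ((1 << cluster_bits) - 1),
    }
```

With 64 KB clusters (cluster_bits = 16), one L2 table covers 8192 clusters, i.e. 512 MiB. The offset 600 MiB + 5 bytes therefore maps to L1 index 1, L2 index 1408, and an offset of 5 bytes within the cluster.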

Performance

Measurements were taken to determine the read performance of the QCOW2 implementation and to compare it with qemu-nbd. All measurements were carried out in the following test environment:
  • CPU: Intel(R) Core(TM) i3-3225 CPU @ 3.30 GHz
  • Disk: Samsung SSD 840 PRO Series
  • RAM: 8 GB
  • OS: Linux kernel 5.3.0-rc5
  • FS: ext4
  • QCOW2: 650 MB QCOW2 file with 64 KB cluster size
  • Block device benchmark: fio-3.15

The log files of the measurements are stored in the measurements folder and have the suffix .log: source:measurements/

Project presentation

The final presentation of the project can be found in the presentation folder: source:presentation/presentation.pdf

Upstream merge of the implementation

An optional goal of the project is the upstream merge of the implementation into the Linux kernel source tree and the util-linux repository.
Before the changes to the losetup utility can be submitted as a merge request for the util-linux repository, the kernel modules must be approved by the Linux kernel developers. Therefore, a patch series was created and submitted to the kernel developers via the Linux kernel mailing list. The submitted patch series and the subsequent discussion can be found here: https://www.spinics.net/lists/linux-block/msg44251.html
An upstream merge of the patch series did not happen within the project duration because the kernel developers saw no need for this specific implementation.

Updated by Manuel Bentele over 4 years ago · 19 revisions