Deprecating QEMU xilinx_zynq Board Support for ignore_memory_transaction_failures

Original Info

Source: TinyLab Technology
Author: Chao Liu
Original: https://tinylab.org/qemu-drop-ignore_memory_transaction_failures/
Publication date: 2024-10-29

Abstract

The article revolves around ignore_memory_transaction_failures in QEMU’s legacy board models, explaining the background of early RAZ/WI behavior, and how modern QEMU prefers unimplemented-device to explicitly expose accesses to unimplemented devices.

Using the Xilinx Zynq board as an example, it records the startup verification after removing this field, gdbstub debugging, device address-space checks, and the overall patch cleanup approach. This topic is suitable for archiving under QEMU board modeling, memory transaction fault handling, and upstream cleanup workflows.

Archival Note

This is an index for an externally published article; the full text has been imported below.

Main Text

Corrector: TinyCorrect v0.2-rc2 - [toc comments codeblock refs]
Author: Liu Chao chao.liu@yeah.net
Date: 2024/09/22
Revisor: Bin Meng, falcon
Project: RISC-V Linux Kernel Analysis
Sponsor: PLCT Lab, ISCAS

Introduction

Some early embedded systems or hardware platforms were not very strict about memory management. When a CPU accessed an unmapped or unallocated memory address:

a read would return zero, a behavior known as Read-As-Zero (RAZ);
a write would be ignored, a behavior known as Write-Ignored (WI).

In practice, RAZ/WI behavior is typically used for:

Debugging: during development, accesses to unmapped addresses may be ignored or return zero so the system does not crash;
Hardware compatibility: on some hardware platforms, this behavior is treated as a convention to ensure software compatibility across different hardware configurations.

In QEMU, this behavior is usually meant to preserve the legacy behavior of certain hardware platforms. On those platforms, accesses to unmapped memory regions may simply be ignored or return zero instead of raising an error or exception.

`ignore_memory_transaction_failures`

As QEMU evolved, the preferred approach became to model each device that QEMU has not implemented yet with unimplemented-device. A board that declares its unmodeled devices this way no longer needs a global RAZ/WI policy, so an access to an address where nothing exists at all can be reported as an error or exception, which is what modern operating systems expect, instead of silently returning zero or dropping the write.

In addition, a device created through unimplemented-device logs every guest CPU access to it through QEMU’s debug log (enabled with -d unimp), while still completing the access as RAZ/WI. This helps with debugging and validating device-model behavior.

However, some legacy board models may still depend on RAZ/WI behavior to handle devices that QEMU has not modeled yet.

To remain compatible with legacy board models (typically ARM boards), QEMU added a new ignore_memory_transaction_failures field to MachineClass in v2.10.0-291-ged860129ac. The field type is bool.

If this field is set to true, the vCPU will ignore memory transaction failures caused by accesses to unallocated physical addresses; these failures would otherwise usually trigger an exception.

This flag should only be used by legacy board models whose guests still depend on the old RAZ/WI behavior when accessing devices that QEMU does not model. New board models should cover such devices with unimplemented-device instead.

Analyzing the Implementation

Here we use QEMU v9.0.2 source code as the basis for analysis.

During the realize phase, the CPU model obtains the value of ignore_memory_transaction_failures from MachineClass through cpu_common_realizefn, and then updates cpu->ignore_memory_transaction_failures as follows:

// hw/core/cpu-common.c:195
static void cpu_common_realizefn(DeviceState *dev, Error **errp)
{
    CPUState *cpu = CPU(dev);
    Object *machine = qdev_get_machine();

    /* qdev_get_machine() can return something that's not TYPE_MACHINE
     * if this is one of the user-only emulators; in that case there's
     * no need to check the ignore_memory_transaction_failures board flag.
     */
    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
        MachineClass *mc = MACHINE_GET_CLASS(machine);
        if (mc) {
            cpu->ignore_memory_transaction_failures =
                mc->ignore_memory_transaction_failures;
        }
    }

    ...
}

When the CPU executes memory-access instructions in a full-system simulation, all memory accesses first go through SoftMMU. Taking TCG as an example:

The CPU first checks the soft TLB. On a TLB miss, it enters the slow path and calls a helper function to perform the memory access, as shown below:

                tb binary code
                  ---+---
find tlb    --->     |
                    bne -----------------------------+
direct ld   --->     |                               |
                    -+-                              | TLB Miss
                     |                               |
TB end      --->    -+-                              |
                     |  <--- ldst slow path code <---+
                    ...

Taking a read operation as an example, the helper function eventually calls int_ld_mmio_beN, as shown below:

// accel/tcg/cputlb.c:1268
static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
                      unsigned size, MMUAccessType access_type, int mmu_idx,
                      MemTxResult response, uintptr_t retaddr)
{
    if (!cpu->ignore_memory_transaction_failures // ignore memory access faults
        && cpu->cc->tcg_ops->do_transaction_failed) {
        hwaddr physaddr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);

        cpu->cc->tcg_ops->do_transaction_failed(cpu, physaddr, addr, size,
                                                access_type, mmu_idx,
                                                full->attrs, response, retaddr);
    }
}

// accel/tcg/cputlb.c:1928
static uint64_t int_ld_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
                                uint64_t ret_be, vaddr addr, int size,
                                int mmu_idx, MMUAccessType type, uintptr_t ra,
                                MemoryRegion *mr, hwaddr mr_offset)
{
    MemTxResult r;

    ...

    r = memory_region_dispatch_read(mr, mr_offset, &val,
                                    this_mop, full->attrs);
    if (unlikely(r != MEMTX_OK)) {
        io_failed(cpu, full, addr, this_size, type, mmu_idx, r, ra);
    }

    ...

    return ret_be;
}

If memory_region_dispatch_read returns MEMTX_OK, the access succeeded; otherwise io_failed is called.

Inside io_failed, QEMU decides whether to raise an exception based on ignore_memory_transaction_failures. If the value is true, the access is handled as RAZ/WI.
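
The condition in io_failed has two parts: the board flag must be clear, and the target CPU must provide a do_transaction_failed hook. The following sketch isolates that decision. ToyCpu, ToyFaultHook, toy_io_failed, and toy_count_fault are hypothetical stand-ins, not QEMU types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CPU state consulted by io_failed(). */
typedef void (*ToyFaultHook)(void *opaque);

typedef struct {
    bool ignore_memory_transaction_failures; /* copied from the board */
    ToyFaultHook do_transaction_failed;      /* NULL if the target has none */
    void *hook_opaque;
} ToyCpu;

/* Returns true if the failure was delivered to the guest as an exception,
 * false if it was swallowed (the access then behaves as RAZ/WI). */
static bool toy_io_failed(const ToyCpu *cpu)
{
    assert(cpu != NULL);
    if (!cpu->ignore_memory_transaction_failures &&
        cpu->do_transaction_failed) {
        cpu->do_transaction_failed(cpu->hook_opaque);
        return true;
    }
    return false;
}

/* Example hook: counts how many faults were delivered. */
static void toy_count_fault(void *opaque)
{
    ++*(unsigned *)opaque;
}
```

A ToyCpu with the flag set never calls the hook, which mirrors a legacy board. Clearing the flag is what makes failed transactions visible to the guest.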

Removing `ignore_memory_transaction_failures` from Xilinx Zynq

Because I happened to have a Linux kernel binary image for Xilinx Zynq in my environment, I decided to try removing ignore_memory_transaction_failures from the Xilinx Zynq board.

First, remove the field initialization in zynq_machine_class_init:

// hw/arm/xilinx_zynq.c
@@ -394,7 +437,6 @@ static void zynq_machine_class_init(ObjectClass *oc, void *data)
     mc->init = zynq_init;
     mc->max_cpus = ZYNQ_MAX_CPUS;
     mc->no_sdcard = 1;
-    mc->ignore_memory_transaction_failures = true;
     mc->valid_cpu_types = valid_cpu_types;
     mc->default_ram_id = "zynq.ext_ram";
     prop = object_class_property_add_str(oc, "boot-mode", NULL,
--

Then rebuild (details omitted here), and run QEMU:

$ ./qemu/build/qemu-system-arm -M xilinx-zynq-a9 \
-serial /dev/null \
-serial mon:stdio \
-display none \
-kernel QEMU_CPUFreq_Zynq/Prebuilt_functional/kernel_standard_linux/uImage \
-dtb QEMU_CPUFreq_Zynq/Prebuilt_functional/my_devicetree.dtb \
--initrd QEMU_CPUFreq_Zynq/Prebuilt_functional/umy_ramdisk.image.gz

PS: To get the matching Linux kernel binary image for testing, run git clone https://github.com/zevorn/QEMU_CPUFreq_Zynq.git

The result was that the terminal showed no output at all. When this happens, don’t panic. We can use QEMU’s gdb remote debugging to inspect what the guest is doing.

Modify the QEMU startup command to add -s -S at the end to enable gdb remote debugging, then open another terminal to debug:

$ gdb-multiarch QEMU_CPUFreq_Zynq/Prebuilt_functional/kernel_standard_linux/uImage
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
...
determining executable automatically.  Try using the "file" command.
0x00000000 in ?? ()
(gdb) c
Continuing.
# Wait a little longer here, because enabling GDB remote debugging in QEMU reduces performance
# Press Ctrl + C here to pause guest execution
Program received signal SIGINT, Interrupt.
0xc0770240 in ?? ()
(gdb) display /i $pc
1: x/i $pc
=> 0xc0770240:  subs    r0, r0, #1
(gdb) si
0xc0770244 in ?? ()
1: x/i $pc
=> 0xc0770244:  bhi     0xc0770240
(gdb) si
0xc0770240 in ?? ()
1: x/i $pc
=> 0xc0770240:  subs    r0, r0, #1
(gdb)
0xc0770244 in ?? ()
1: x/i $pc
=> 0xc0770244:  bhi     0xc0770240

Here’s a small trick: use display /i $pc so that every single-step prints the disassembly of the current instruction, which helps locate the issue.

After observing several single steps, it is clear that the code is stuck in an infinite loop (jumping back and forth between 0xc0770240 and 0xc0770244), which strongly suggests that a memory-access fault triggered an infinite loop in the exception handler. But we still do not have enough information to draw a firm conclusion.

Next, let’s continue debugging to find out where the memory-access fault is triggered.

A conventional way is to use bt to inspect the call stack and analyze the CPU execution path, so we can infer the program logic. The steps are as follows:

(gdb) bt
#0  0xc0770244 in ?? ()
#1  0xc011cb10 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) x /i 0xc011cb10
   0xc011cb10:  b       0xc011cafc
(gdb)

From the backtrace output, there is no useful information because debug symbols are missing.

At this point we need to change our debugging strategy and analyze the guest memory-access fault address directly by debugging the QEMU source code.

As mentioned above, memory-access faults are reported through io_failed, so let’s add a print there to output the guest virtual address (GVA), the guest physical address (GPA), and the access type:

static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
                      unsigned size, MMUAccessType access_type, int mmu_idx,
                      MemTxResult response, uintptr_t retaddr)
{
    if (!cpu->ignore_memory_transaction_failures
        && cpu->cc->tcg_ops->do_transaction_failed) {
        hwaddr physaddr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);

        // Add debug logging to print the GVA, GPA, and access_type.
        // VADDR_PRIx and HWADDR_PRIx keep the format portable across hosts.
        printf("vaddr %" VADDR_PRIx " phyaddr %" HWADDR_PRIx " access_type %d\n",
               addr, physaddr, access_type);

        cpu->cc->tcg_ops->do_transaction_failed(cpu, physaddr, addr, size,
                                                access_type, mmu_idx,
                                                full->attrs, response, retaddr);
    }
}

After rebuilding and running QEMU again, the output is:

vaddr c884f080 phyaddr f8007080 access_type 0

The guest physical address is 0xf8007080, and access_type is 0, which means it is a read operation.
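
The printed values can be checked by hand. The sketch below repeats the physaddr expression from io_failed and names the access types. It assumes 4 KiB guest pages (the real TARGET_PAGE_MASK depends on the target configuration), and all toy_ and TOY_ names are hypothetical. Only the numeric values 0, 1, and 2 are taken from QEMU's MMUAccessType enum.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB guest pages, so the low 12 bits are the page offset. */
#define TOY_PAGE_BITS 12
#define TOY_PAGE_MASK (~(((uint64_t)1 << TOY_PAGE_BITS) - 1))

/* Same numbering as QEMU's MMUAccessType enum. */
typedef enum {
    TOY_DATA_LOAD = 0,
    TOY_DATA_STORE = 1,
    TOY_INST_FETCH = 2,
} ToyAccessType;

/* Combine the physical page base with the offset taken from the virtual
 * address, mirroring the expression used in io_failed(). */
static uint64_t toy_physaddr(uint64_t phys_page, uint64_t vaddr)
{
    assert((phys_page & ~TOY_PAGE_MASK) == 0); /* must be page aligned */
    return phys_page | (vaddr & ~TOY_PAGE_MASK);
}

/* Human-readable name for an access type value. */
static const char *toy_access_name(ToyAccessType t)
{
    switch (t) {
    case TOY_DATA_LOAD:  return "read";
    case TOY_DATA_STORE: return "write";
    case TOY_INST_FETCH: return "instruction fetch";
    }
    return "unknown";
}
```

Under that page-size assumption, toy_physaddr(0xf8007000, 0xc884f080) keeps the offset 0x080 from the virtual address and yields 0xf8007080, the address in the log line.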

After checking the Xilinx Zynq DTS, we find that address 0xf8007080 falls within the devcfg device. In QEMU, this device is modeled as xlnx.ps7-dev-cfg, and according to the DTS its register address space is 0xf8007000-0xf80070ff, so this memory-access fault should be coming from the QEMU-emulated devcfg device.

// roms/u-boot/arch/arm/dts/zynq-7000.dtsi
devcfg: devcfg@f8007000 {
    compatible = "xlnx,zynq-devcfg-1.0";
    interrupt-parent = <&intc>;
    interrupts = <0 8 4>;
    reg = <0xf8007000 0x100>;
    clocks = <&clkc 12>, <&clkc 15>, <&clkc 16>, <&clkc 17>, <&clkc 18>;
    clock-names = "ref_clk", "fclk0", "fclk1", "fclk2", "fclk3";
    syscon = <&slcr>;
};

Here, reg = <0xf8007000 0x100> gives the base address and length of devcfg, that is, the range 0xf8007000-0xf80070ff.

However, this device is modeled by QEMU, so in principle it should not generate a memory-access fault unless there is an “address hole”, that is, some address ranges inside the devcfg address space are not implemented.

To verify that hypothesis, first try adding an unimplemented-device for devcfg during Xilinx Zynq board initialization, as shown below:

// hw/arm/xilinx_zynq.c:34
#include "hw/net/cadence_gem.h"
#include "hw/cpu/a9mpcore.h"
#include "hw/qdev-clock.h"
#include "hw/misc/unimp.h" // add header

// hw/arm/xilinx_zynq.c:203
static void zynq_init(MachineState *machine)
{
    /* Other */
    create_unimplemented_device("amba.devcfg", 0xf8007000, 0x100);
    ...
}

After rebuilding QEMU and running it, QEMU prints:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.6.0-axiom+ (kromes@mcsoc2-Latitude-7480) (gcc version 8.2.1 20180802 (GNU Toolchain for the A-profile Architecture 8.2-2018.11 (arm-rel-8.26))) #10 SMP PREEMPT Fri Jul 3 08:42:52 CEST 2020
[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[    0.000000] OF: fdt: Machine model: xlnx,zynq-zed
[    0.000000] Memory policy: Data cache writeback
[    0.000000] cma: Failed to reserve 64 MiB
[    0.000000] CPU: All CPU(s) started in SVC mode.
[    0.000000] percpu: Embedded 15 pages/cpu s31948 r8192 d21300 u61440
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32512
[    0.000000] Kernel command line: console=ttyPS0, 115200 root=/dev/ram rw
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, l
...

It now prints normally.

But this is still not the end. Our test coverage is not sufficient, so the Xilinx Zynq board may still have other unimplemented devices. Let’s enter the QEMU monitor and inspect the devices and address spaces that the Xilinx Zynq board has already implemented:

$ ./qemu/build/qemu-system-arm -M xilinx-zynq-a9 -display none -monitor stdio
QEMU 9.1.50 monitor - type 'help' for more information
(qemu) info mtree
address-space: cpu-memory-0
address-space: cpu-secure-memory-0
address-space: dma
address-space: dma
address-space: memory
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000000000-0000000007ffffff (prio 0, ram): zynq.ext_ram
    00000000e0000000-00000000e0000fff (prio 0, i/o): uart
    00000000e0001000-00000000e0001fff (prio 0, i/o): uart
    00000000e0002000-00000000e0002fff (prio 0, i/o): ehci
      00000000e0002000-00000000e00020ff (prio 0, i/o): usb-chipidea.misc
      00000000e0002100-00000000e000210f (prio 0, i/o): capabilities
    ...

address-space: I/O
  0000000000000000-000000000000ffff (prio 0, i/o): io

(qemu) vaddr 8000000 phyaddr 8000000 a

info mtree prints every address space and, within each one, lists the memory regions in ascending address order. Here we use the UART address space in the output as an example for analysis:

00000000e0000000-00000000e0000fff (prio 0, i/o): uart

Here 0xe0000000-0xe0000fff is the UART address range, and prio is the priority of this memory region (mr): where regions overlap, the one with the higher priority handles the access. Therefore, we can also sort the devices in the Xilinx Zynq DTS from low to high addresses, compare them with this list, and identify all unimplemented devices.
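
The role of prio can be sketched with a toy lookup that picks the highest-priority region containing an address. ToyRegion and toy_resolve are hypothetical names, and the real memory API resolves overlaps when it builds its flat view rather than on each access. The -1000 priority in the usage note reflects how, to my knowledge, unimplemented-device maps itself, so treat that number as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat list of memory regions with priorities. */
typedef struct {
    const char *name;
    uint64_t base;
    uint64_t size;
    int prio;
} ToyRegion;

/* Return the highest-priority region containing addr, or NULL for a hole. */
static const ToyRegion *toy_resolve(const ToyRegion *regions, size_t n,
                                    uint64_t addr)
{
    assert(regions != NULL || n == 0);
    const ToyRegion *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const ToyRegion *r = &regions[i];
        if (addr >= r->base && addr - r->base < r->size &&
            (best == NULL || r->prio > best->prio)) {
            best = r;
        }
    }
    return best;
}
```

If a low-priority stub (for example prio -1000) covers a wide range and a real device at prio 0 sits inside it, toy_resolve returns the real device for addresses it claims and the stub everywhere else in the range. An address covered by neither returns NULL, which corresponds to a hole.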

As an example, take the PMU device and first check its address range in the DTS:

// roms/u-boot/arch/arm/dts/zynq-7000.dtsi
	pmu@f8891000 {
		compatible = "arm,cortex-a9-pmu";
		interrupts = <0 5 4>, <0 6 4>;
		interrupt-parent = <&intc>;
		reg = <0xf8891000 0x1000>,
		      <0xf8893000 0x1000>;
	};

We can see that PMU has two address ranges: 0xf8891000-0xf8891fff and 0xf8893000-0xf8893fff.

Next, inspect the info mtree output and locate the nearby ranges:

(qemu) info mtree
address-space: cpu-memory-0
address-space: cpu-secure-memory-0
address-space: dma
address-space: dma
address-space: memory
    ...
    00000000e2000000-00000000e5ffffff (prio 0, romd): zynq.pflash
    00000000f8000000-00000000f8000fff (prio 0, i/o): slcr
    00000000f8001000-00000000f8001fff (prio 0, i/o): timer
    00000000f8002000-00000000f8002fff (prio 0, i/o): timer
    00000000f8003000-00000000f8003fff (prio 0, i/o): dma
      00000000f8007000-00000000f800703f (prio 0, i/o): xlnx.ps7-dev-cfg
    00000000f8007100-00000000f800711f (prio 0, i/o): zynq-xadc
    # Roughly here, but there is no corresponding device
    00000000f8f00000-00000000f8f01fff (prio 0, i/o): a9mp-priv-container
    ...
address-space: I/O
  0000000000000000-000000000000ffff (prio 0, i/o): io
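
The comparison between the DTS and the info mtree listing is mechanical, so it can be written down as a small check. The sketch below converts a DTS <base size> pair into an inclusive range and tests it against the mapped ranges. ToyRange, toy_from_reg, and toy_overlaps_any are hypothetical helper names, and the mapped ranges have to be copied by hand from the monitor output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t start;  /* first byte */
    uint64_t end;    /* last byte, inclusive, as printed by info mtree */
} ToyRange;

/* Convert a DTS-style <base size> pair into an inclusive range. */
static ToyRange toy_from_reg(uint64_t base, uint64_t size)
{
    assert(size > 0);
    ToyRange r = { base, base + size - 1 };
    return r;
}

/* True if any mapped range overlaps the candidate range. */
static bool toy_overlaps_any(const ToyRange *mapped, size_t n, ToyRange want)
{
    for (size_t i = 0; i < n; i++) {
        if (mapped[i].start <= want.end && want.start <= mapped[i].end) {
            return true;
        }
    }
    return false;
}
```

Fed with the four neighboring ranges from the listing above (dma, xlnx.ps7-dev-cfg, zynq-xadc, a9mp-priv-container), both PMU ranges come back as not overlapping anything, which matches the observation that no device is mapped there.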

Then, in the QEMU source code, add the PMU device as an unimplemented-device:

// hw/arm/xilinx_zynq.c:203
static void zynq_init(MachineState *machine)
{
    ...

    /* DDR remapped to address zero. */
    memory_region_add_subregion(address_space_mem, 0, machine->ram);

    /* PMU */
    create_unimplemented_device("pmu.region0", 0xf8891000, 0x1000);
    create_unimplemented_device("pmu.region1", 0xf8893000, 0x1000);
    ...
}

Continuing the audit, it turns out that many devices are indeed unimplemented. Because there are so many, I will present the code directly without listing them one by one:

// hw/arm/xilinx_zynq.c:203
static void zynq_init(MachineState *machine)
{
    ...

    /* DDR remapped to address zero. */
    memory_region_add_subregion(address_space_mem, 0, machine->ram);

    /* CAN */
    create_unimplemented_device("amba.can0", 0xe0008000, 0x1000);
    create_unimplemented_device("amba.can1", 0xe0009000, 0x1000);

    /* GPIO */
    create_unimplemented_device("amba.gpio0", 0xe000a000, 0x1000);

    /* I2C */
    create_unimplemented_device("amba.i2c0", 0xe0004000, 0x1000);
    create_unimplemented_device("amba.i2c1", 0xe0005000, 0x1000);

    /* Interrupt Controller */
    create_unimplemented_device("amba.intc.region0", 0xf8f00100, 0x100);
    create_unimplemented_device("amba.intc.region1", 0xf8f01000, 0x1000);

    /* Memory Controller */
    create_unimplemented_device("amba.mc", 0xf8006000, 0x1000);

    /* SMCC */
    create_unimplemented_device("amba.smcc", 0xe000e000, 0x1000);
    create_unimplemented_device("amba.smcc.nand0", 0xe1000000, 0x1000000);

    /* Timer */
    create_unimplemented_device("amba.global_timer", 0xf8f00200, 0x20);
    create_unimplemented_device("amba.scutimer", 0xf8f00600, 0x20);

    /* Watchdog */
    create_unimplemented_device("amba.watchdog0", 0xf8005000, 0x1000);

    /* Other */
    create_unimplemented_device("amba.devcfg", 0xf8007000, 0x100);
    create_unimplemented_device("amba.efuse", 0xf800d000, 0x20);
    create_unimplemented_device("amba.etb", 0xf8801000, 0x1000);
    create_unimplemented_device("amba.tpiu", 0xf8803000, 0x1000);
    create_unimplemented_device("amba.funnel", 0xf8804000, 0x1000);
    create_unimplemented_device("amba.ptm.region0", 0xf889c000, 0x1000);
    create_unimplemented_device("amba.ptm.region1", 0xf889d000, 0x1000);

    ...
}
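
A long run of near-identical calls like this could also be expressed as a table, which makes it easy to sanity-check the entries against each other. The following is only a sketch of that alternative and is not what the patch does: ToyUnimp, toy_unimp_table, and toy_table_is_disjoint are hypothetical, and only the first five entries are copied from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry describing one stub region. */
typedef struct {
    const char *name;
    uint64_t base;
    uint64_t size;
} ToyUnimp;

/* A few entries copied from the list above. */
static const ToyUnimp toy_unimp_table[] = {
    { "amba.can0",  0xe0008000, 0x1000 },
    { "amba.can1",  0xe0009000, 0x1000 },
    { "amba.gpio0", 0xe000a000, 0x1000 },
    { "amba.i2c0",  0xe0004000, 0x1000 },
    { "amba.i2c1",  0xe0005000, 0x1000 },
};

#define TOY_UNIMP_COUNT (sizeof(toy_unimp_table) / sizeof(toy_unimp_table[0]))

/* Return true if no two table entries overlap each other. */
static bool toy_table_is_disjoint(const ToyUnimp *t, size_t n)
{
    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            uint64_t end_i = t[i].base + t[i].size; /* exclusive */
            uint64_t end_j = t[j].base + t[j].size; /* exclusive */
            if (t[i].base < end_j && t[j].base < end_i) {
                return false;
            }
        }
    }
    return true;
}
```

In board code, a loop over such a table would pass each entry's name, base, and size to create_unimplemented_device. The disjointness check catches copy-and-paste mistakes such as two stubs claiming the same page.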

Also, while printing info mtree, I found that the devcfg address range was incorrect:

(qemu) info mtree
    ...
      00000000f8007000-00000000f800703f (prio 0, i/o): xlnx.ps7-dev-cfg
    ...

According to the device-tree configuration above, the devcfg address range should be 0xf8007000-0xf80070ff. Opening the corresponding code reveals the issue:

// include/hw/dma/xlnx-zynq-devcfg.h
#define XLNX_ZYNQ_DEVCFG_R_MAX (0x100 / 4)

XLNX_ZYNQ_DEVCFG_R_MAX is 0x100 / 4 = 0x40, the number of 32-bit registers, whereas the region itself is 0x100 bytes long.

I checked why the patch author made this change at the time. The message was:

dma: xlnx-zynq-devcfg: Fix up XLNX_ZYNQ_DEVCFG_R_MAX

Whilst according to the Zynq TRM this device covers a register region of
0x000 - 0x120. The register region is also shared with XADCIF prefix
registers at 0x100 and above. Due to how the devcfg and the xadc devices
are implemented in QEMU these are separate models with individual mmio
regions. As such the region registered by the devcfg overlaps with the
xadc when initialized in a machine model (e.g. xilinx-zynq-a9).

This patch fixes up the incorrect region size, where
XLNX_ZYNQ_DEVCFG_R_MAX is missing its '/ 4' causing it to be 0x460 in
size. As well as setting the region size to the 0x0 - 0x100 region so
that an xadc device instance can be registered in the correct region to
pair with the devcfg device instance.

Mapping with XLNX_ZYNQ_DEVCFG_R_MAX = 0x118:
dev: xlnx.ps7-dev-cfg, id ""
mmio 00000000f8007000/0000000000000460
dev: xlnx,zynq-xadc, id ""
mmio 00000000f8007100/0000000000000020

Mapping with XLNX_ZYNQ_DEVCFG_R_MAX = 0x100 / 4:
dev: xlnx.ps7-dev-cfg, id ""
mmio 00000000f8007000/0000000000000100
dev: xlnx,zynq-xadc, id ""
mmio 00000000f8007100/0000000000000020

Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20160921180911.32289-1-nathan@nathanrossi.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

This patch mainly fixes the overlap between the devcfg and xadc address-mapping ranges. However, in xlnx_zynq_devcfg_init, the last argument to register_init_block32 is not multiplied by 4, causing the actual address range created by devcfg to be only 0x40.

The relevant code currently looks like this:

// hw/dma/xlnx-zynq-devcfg.c:360
static void xlnx_zynq_devcfg_init(Object *obj)
{
    SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
    XlnxZynqDevcfg *s = XLNX_ZYNQ_DEVCFG(obj);
    RegisterInfoArray *reg_array;

    sysbus_init_irq(sbd, &s->irq);

    memory_region_init(&s->iomem, obj, "devcfg", XLNX_ZYNQ_DEVCFG_R_MAX * 4);
    reg_array =
        register_init_block32(DEVICE(obj), xlnx_zynq_devcfg_regs_info,
                              ARRAY_SIZE(xlnx_zynq_devcfg_regs_info),
                              s->regs_info, s->regs,
                              &xlnx_zynq_devcfg_reg_ops,
                              XLNX_ZYNQ_DEVCFG_ERR_DEBUG,
                              XLNX_ZYNQ_DEVCFG_R_MAX); // missing * 4 here
}

We only need to change the last argument of register_init_block32 to XLNX_ZYNQ_DEVCFG_R_MAX * 4.
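
The arithmetic behind the fix is short enough to verify directly. The sketch below uses the devcfg base address and the macro value quoted above. The toy_ and TOY_ names are hypothetical, and the helper only models "a region of N bytes starting at base", not QEMU's register API.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the article: devcfg base and the register-count macro. */
#define TOY_DEVCFG_BASE  0xf8007000u
#define TOY_DEVCFG_R_MAX (0x100 / 4)   /* number of 32-bit registers */

/* Last byte covered by a region of size_bytes starting at base. */
static uint32_t toy_region_end(uint32_t base, uint32_t size_bytes)
{
    assert(size_bytes > 0);
    return base + size_bytes - 1;
}

/* Whether addr falls inside [base, base + size_bytes). */
static int toy_in_region(uint32_t base, uint32_t size_bytes, uint32_t addr)
{
    return addr >= base && addr - base < size_bytes;
}
```

Passing the register count (0x40) as a byte size ends the region at 0xf800703f, the value seen in info mtree, so the faulting address 0xf8007080 lies outside it. Passing the count multiplied by 4 ends the region at 0xf80070ff and covers that address.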

Conclusion

This article described how ignore_memory_transaction_failures is implemented and how to drop it from a legacy board by covering the unmodeled devices with unimplemented-device.
