Post-launched user VM with preempt-RT kernel doesn't start


c.susen@...
 

Hi everyone,

I am having trouble setting up an RTVM with ACRN and would be very grateful for your advice!

First some details on my setup:

ACRN Host Hardware:
00:00.0 Host bridge: Intel Corporation 8th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] (rev 0d)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 0d)
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 630 (Desktop 9 Series) (rev 02)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:16.3 Serial controller: Intel Corporation Cannon Lake PCH Active Management Technology - SOL (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1b.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #17 (rev f0)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #4 (rev f0)
00:1c.6 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #7 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Q370 Chipset LPC/eSPI Controller (rev 10)
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
02:00.0 Non-Volatile memory controller: SK hynix Device 1627
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

Software Versions:
ACRN Kernel v2.6
ACRN Hypervisor v2.6
SOS -> Ubuntu 20.04.3

Scenario Configuration and Launch Scripts:
I use a scenario configuration and launch scripts based on the industry scenario. I removed all VMs except those corresponding to VM numbers 2 and 3 in the industry scenario; apart from that, the scenario and launch scripts are unmodified. My board configuration file, scenario configuration file, launch configuration file, and launch scripts are attached in case they help in understanding or solving my problem.

Description of the Problem:
I have a user VM image with Ubuntu 20.04.3 and a preempt-RT kernel (5.10.52-rt47). When I use the image with launch script number one (non-RTVM), the VM starts without problems. However, when I use launch script number two (RTVM), I get no output at all on stdio from the VM. I enabled kernel output to hvc0 by modifying the GRUB settings, and I verified with the working non-RTVM launch script that this produces output during start-up.

The standard output from both launch scripts can be found in output_launch_uos_id1.txt and output_launch_uos_id2.txt in the attachments. For the RTVM (number two), nothing more is printed after the last line, even after a few minutes. The machine gets loud and the fan spins very fast. It is also not possible to stop the VM with acrnctl; at that point, I have to hard power off the host machine.

Unfortunately, after days of trying I have not figured out what the problem is, and I have no idea how best to approach it. I would really appreciate any hint about what the problem could be or what I should try!

Thanks in advance!

Best regards,
Christoph


c.susen@...
 

Hi again,

I managed to add a PCIe card with serial ports to my ACRN host and can now use the ACRN shell to get further information on what is going wrong. I hope the following information makes it easier for you to help me with the setup.

ACRN shell output when using launch_uos_id1.sh -> user VM starts successfully
[53863044us][cpu=0][vm1:vcpu0][sev=3][seq=100]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[53874398us][cpu=0][vm1:vcpu0][sev=3][seq=101]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
[59499335us][cpu=0][vm1:vcpu0][sev=3][seq=104]:vlapic: Start Secondary VCPU1 for VM[1]...
[59507819us][cpu=1][vm1:vcpu1][sev=3][seq=105]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[59519253us][cpu=1][vm1:vcpu1][sev=3][seq=106]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
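
As a side note, the "VMX ctrl ... not fully enabled" values can be compared bit by bit. The following is a minimal sketch, assuming the message reports the requested control word followed by the value that was actually applied; the helper name missing_bits is made up for illustration. It only lists which requested bits are absent from the applied value and does not say what those bits mean.

```python
def missing_bits(requested, granted):
    """Return the bit positions that are set in `requested` but clear in `granted`."""
    diff = requested & ~granted
    return [bit for bit in range(32) if diff & (1 << bit)]


# Values copied from the log lines above (first field is the number after "VMX ctrl").
for ctrl, requested, granted in [
    (0x482, 0x92220088, 0x9621E1FA),
    (0x48B, 0x041004AB, 0x001004AB),
]:
    bits = missing_bits(requested, granted)
    print(f"VMX ctrl {ctrl:#x}: requested bits not granted: {bits}")
```

The same two messages appear for the VM that boots, so on their own they do not distinguish the working launch from the failing one.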

ACRN shell output when using launch_uos_id2.sh -> user VM does not start
[186233178us][cpu=5][vm0:vcpu5][sev=1][seq=114]:vpci_assign_pcidev 1:0.0 not support FLR or not support PM reset
[186244017us][cpu=5][vm0:vcpu5][sev=3][seq=115]:pci_vdev_update_vbar_base reprogram PCI:01:00.0 BAR4 to addr:0x4000, which is out of mmio window[0x80000000 - 0xe0000000] or not aligned with size: 0x4000
[186286873us][cpu=2][vm2:vcpu0][sev=3][seq=116]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[186298343us][cpu=2][vm2:vcpu0][sev=3][seq=117]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
[193720869us][cpu=2][vm2:vcpu0][sev=3][seq=121]:vlapic: Start Secondary VCPU1 for VM[2]...
[193729429us][cpu=3][vm2:vcpu1][sev=3][seq=122]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[193740972us][cpu=3][vm2:vcpu1][sev=3][seq=123]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
[194318797us][cpu=5][vm0:vcpu5][sev=3][seq=171]:ret=-1 hypercall=0x80000023 failed in vmcall_vmexit_handler
[194329142us][cpu=5][vm0:vcpu5][sev=3][seq=172]:ret=-1 hypercall=0x80000023 failed in vmcall_vmexit_handler
[194339567us][cpu=5][vm0:vcpu5][sev=3][seq=173]:ret=-1 hypercall=0x80000023 failed in vmcall_vmexit_handler
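
To see which hypervisor messages occur only in the failing launch, the two logs can be compared mechanically. This is a rough sketch, assuming every log line has the [time][cpu][vm:vcpu][sev][seq]:message layout shown above; the function names and the VM[N] normalization are illustrative choices, and the sample strings are shortened copies of the logs.

```python
import re

# Layout of one ACRN shell log line as it appears in the output above.
LINE_RE = re.compile(
    r"\[(?P<us>\d+)us\]\[cpu=(?P<cpu>\d+)\]\[vm(?P<vm>\d+):vcpu(?P<vcpu>\d+)\]"
    r"\[sev=(?P<sev>\d+)\]\[seq=(?P<seq>\d+)\]:(?P<msg>.*)"
)


def messages(log_text):
    """Return the set of normalized message texts found in a log."""
    found = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            # Replace the VM index so "VM[1]" and "VM[2]" compare equal.
            found.add(re.sub(r"VM\[\d+\]", "VM[N]", match.group("msg")))
    return found


def only_in_failing(working_log, failing_log):
    """Return messages seen in the failing log but not in the working one."""
    return sorted(messages(failing_log) - messages(working_log))


WORKING = """
[53863044us][cpu=0][vm1:vcpu0][sev=3][seq=100]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[53874398us][cpu=0][vm1:vcpu0][sev=3][seq=101]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
[59499335us][cpu=0][vm1:vcpu0][sev=3][seq=104]:vlapic: Start Secondary VCPU1 for VM[1]...
"""

FAILING = """
[186233178us][cpu=5][vm0:vcpu5][sev=1][seq=114]:vpci_assign_pcidev 1:0.0 not support FLR or not support PM reset
[186244017us][cpu=5][vm0:vcpu5][sev=3][seq=115]:pci_vdev_update_vbar_base reprogram PCI:01:00.0 BAR4 to addr:0x4000, which is out of mmio window[0x80000000 - 0xe0000000] or not aligned with size: 0x4000
[186286873us][cpu=2][vm2:vcpu0][sev=3][seq=116]:VMX ctrl 0x482 not fully enabled: request 0x92220088 but get 0x9621e1fa
[186298343us][cpu=2][vm2:vcpu0][sev=3][seq=117]:VMX ctrl 0x48b not fully enabled: request 0x41004ab but get 0x1004ab
[193720869us][cpu=2][vm2:vcpu0][sev=3][seq=121]:vlapic: Start Secondary VCPU1 for VM[2]...
[194318797us][cpu=5][vm0:vcpu5][sev=3][seq=171]:ret=-1 hypercall=0x80000023 failed in vmcall_vmexit_handler
"""

for msg in only_in_failing(WORKING, FAILING):
    print(msg)
```

With these samples, the messages left over are the vpci_assign_pcidev line, the pci_vdev_update_vbar_base line, and the failed hypercall, all logged from vm0.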

Output when using vm_console 2
Loading Linux 5.10.52-rt47 ...
Loading initial ramdisk ...
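
The pci_vdev_update_vbar_base message in the second log names two possible reasons for rejecting the BAR4 address: it is outside the MMIO window, or it is not aligned to the BAR size. A quick arithmetic check with the numbers from that message can tell the two apart. This is only a sketch: check_bar is a hypothetical helper, and treating the window end as exclusive is an assumption, not something taken from the ACRN sources.

```python
def check_bar(addr, size, window_start, window_end):
    """Return (aligned, in_window) for a BAR base address.

    Assumes the window is [window_start, window_end), i.e. end exclusive.
    """
    aligned = addr % size == 0
    in_window = window_start <= addr and addr + size <= window_end
    return aligned, in_window


# Numbers copied from the log message for PCI 01:00.0 BAR4.
aligned, in_window = check_bar(
    addr=0x4000, size=0x4000, window_start=0x80000000, window_end=0xE0000000
)
print(f"aligned to size: {aligned}, inside MMIO window: {in_window}")
```

With these inputs the address is aligned to 0x4000 but lies below the start of the window, so under the assumptions above the window check is the condition that does not hold.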

I would really appreciate it if someone could help me with this! Thanks in advance!

Best regards,
Christoph


Liu, Fuzhong
 

Hi Christoph,

Could you please create a GitHub issue with the following information? https://github.com/projectacrn/acrn-hypervisor/issues/new?assignees=&labels=status%3A+new&template=bug_report.md&title=

  1. Code base
  2. Board.xml
  3. Scenario.xml
  4. Launch script
  5. Grub menu for preempt-RT VM

 

Thanks!

BR.

Fuzhong

From: acrn-users@... <acrn-users@...> On Behalf Of c.susen@...
Sent: Friday, December 3, 2021 4:19 PM
To: acrn-users@...
Subject: Re: [acrn-users] Post-launched user VM with preempt-RT kernel doesn't start

 



c.susen@...
 

Hi Fuzhong,

Thanks for your reply! I created an issue on GitHub (see https://github.com/projectacrn/acrn-hypervisor/issues/6950).

Best regards,
Christoph