Zephyr as UOS slows down when SOS is busy
Alfonso Sanchez-Beato
Hello,
I have been experimenting with Zephyr as UOS on top of Ubuntu as SOS, using the Industry scenario and launching the UOS with launch_zephyr.sh.
I created this small program to prove that Zephyr was not affected by what happened in the SOS: https://paste.ubuntu.com/p/c3j9XwvrT6/ (with config https://paste.ubuntu.com/p/sPhwTRrRTZ/). The program simply calculates the number of primes below a number and prints the time spent.
Although the program runs and I get output from the serial port, I found two problems:
1. gettimeofday is not returning good times: the elapsed time is ~20 times more than the real one. I see the same problem when using k_uptime_get and k_uptime_delta. It looks related to the handling of the HW clock, but the pre-defined frequency (100) seems correct.
2. More importantly, I see that the time spent in the calculation varies when I do something CPU intensive on the SOS. For instance, if I run the same calculation on the SOS, Zephyr spends 5% more time in the calculation, which should not happen and defeats the purpose of this set-up.
ACRN version is:
HV version 1.4-unstable-2020-01-09 09:45:06-bcefd673 DBG (daily tag: acrn-2019w47.1-140000p) build by ubuntu
API version 1.0
Any ideas on what might be going on?
Thanks
Alfonso
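For readers who cannot open the pastes, a benchmark of the kind described above might look roughly like the following. This is only a sketch in Python, not the program from the paste (which runs on Zephyr); it would suit the "same calculation on the SOS" side of the experiment. The function names and the limit of 200000 are assumptions chosen for illustration.

```python
import time
from typing import Tuple


def count_primes_below(limit: int) -> int:
    """Count the primes strictly below `limit` using plain trial division."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        # Only divisors up to sqrt(n) need to be checked.
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def timed_run(limit: int) -> Tuple[int, float]:
    """Run the calculation once and return (prime count, elapsed seconds)."""
    # A monotonic clock is not affected by wall-clock adjustments.
    start = time.monotonic()
    count = count_primes_below(limit)
    elapsed = time.monotonic() - start
    return count, elapsed


if __name__ == "__main__":
    LIMIT = 200_000  # hypothetical value, not taken from the original program
    for _ in range(5):
        count, elapsed = timed_run(LIMIT)
        print(f"{count} primes below {LIMIT} in {elapsed:.3f} s")
```

Repeating the run several times, as the loop does, makes it easier to see whether the elapsed time drifts when the other VM is busy.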
Alfonso Sanchez-Beato
On Tue, Jan 21, 2020 at 1:07 PM Alfonso Sanchez-Beato <alfonso.sanchez-beato@...> wrote:
I forgot to mention that I am testing on a NUC7i7DNH.
Hi Alfonso,
My best guess is that the Service VM and the Zephyr VM are contending for the cache. Zephyr would typically stay in cache, but under heavy load in the Service VM it may get evicted at times. This is where Cache Allocation Technology (CAT) can help further isolate the VMs, especially against so-called “noisy neighbours”. Unfortunately, Kaby Lake does not support CAT. Here is more about CAT and how to use it with ACRN: https://projectacrn.github.io/latest/getting-started/rt_industry.html?#configure-cat
Geoffroy
Hi Alfonso,
I’m assuming you have been using the standard launch script (from /usr/share/acrn/samples/nuc/launch_zephyr.sh) to test your Zephyr app. If so, can you try to use this script instead to see if it makes any difference? https://paste.ubuntu.com/p/rCjfYv3QV4/
I have not tried it yet with your app but it’s able to launch the Zephyr “Hello World! Acrn” app.
Thanks, Geoffroy
Alfonso Sanchez-Beato
Hi Geoffroy,
On Wed, Jan 22, 2020 at 9:32 AM Geoffroy Van Cutsem <geoffroy.vancutsem@...> wrote:
I have tried the new script, and I think it helps a bit, although it does not completely remove the increase in time when something is running in the SOS. But it is a bit difficult to say tbh.
Thanks Alfonso
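When the difference is hard to judge by eye, one option is to collect several timings with the SOS idle and several with it loaded, then compare the means and the spread. The sketch below assumes the timings have been copied by hand from the serial output. The helper names and the sample numbers are hypothetical and are not measurements from this thread.

```python
from statistics import mean, stdev


def relative_slowdown(idle_times, loaded_times):
    """Return how much slower the loaded runs are than the idle runs, in percent."""
    idle_avg = mean(idle_times)
    loaded_avg = mean(loaded_times)
    return 100.0 * (loaded_avg - idle_avg) / idle_avg


def summarize(label, times):
    """Format the mean and sample standard deviation of a list of timings."""
    return f"{label}: mean={mean(times):.3f} s, stdev={stdev(times):.3f} s"


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    idle = [10.02, 9.98, 10.01, 9.99]
    loaded = [10.48, 10.55, 10.51, 10.46]
    print(summarize("SOS idle", idle))
    print(summarize("SOS loaded", loaded))
    print(f"slowdown: {relative_slowdown(idle, loaded):.1f} %")
```

If the slowdown is comparable to the standard deviation of the idle runs, the two configurations cannot really be told apart with that number of samples.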
Alfonso Sanchez-Beato
On Wed, Jan 22, 2020 at 2:28 PM Alfonso Sanchez-Beato <alfonso.sanchez-beato@...> wrote:
It turns out that disabling Intel Turbo Boost in the BIOS makes the time spent in the calculation on Zephyr almost constant regardless of the load on the SOS. That makes sense, given that Turbo Boost changes the CPU frequency dynamically. Of course, the drawback is that the calculations now take more time.
Yan, Like
Hi Alfonso,
#1 Did you calibrate the Zephyr clock rate? I am not an expert on Zephyr, but as far as I know, clock calibration is needed in Zephyr.
#2 Echoing Geoffroy: heavy workloads on the SOS may have a significant impact on the guest VM if there is no CAT to do LLC isolation.
BRs, Like