Re: [PATCH V6 7/8] hv: vCAT: implementing the vCAT MSRs write handler


Eddie Dong
 

-----Original Message-----
From: acrn-dev@... <acrn-dev@...> On
Behalf Of Dongsheng Zhang
Sent: Thursday, October 21, 2021 1:13 AM
To: acrn-dev@...
Subject: Re: [acrn-dev] [PATCH V6 7/8] hv: vCAT: implementing the vCAT
MSRs write handler



-----Original Message-----
From: acrn-dev@... <acrn-dev@...>
On Behalf Of Eddie Dong
Sent: Wednesday, October 20, 2021 5:53 AM
To: acrn-dev@...
Subject: Re: [acrn-dev] [PATCH V6 7/8] hv: vCAT: implementing the vCAT
MSRs write handler



-----Original Message-----
From: acrn-dev@...
<acrn-dev@...> On Behalf Of Dongsheng Zhang
Sent: Wednesday, October 20, 2021 6:42 AM
To: acrn-dev@...
Subject: [acrn-dev] [PATCH V6 7/8] hv: vCAT: implementing the vCAT
MSRs write handler

From: dongshen <dongsheng.x.zhang@...>

Implement the write_vcat_msr() function to handle write requests to the
MSR_IA32_PQR_ASSOC and MSR_IA32_type_MASK_n vCAT MSRs.

Several vCAT P2V (physical to virtual) and V2P (virtual to physical)
mappings exist:

struct acrn_vm_config *vm_config = get_vm_config(vm_id)

max_pcbm = vm_config->max_type_pcbm (type: l2 or l3)
mask_shift = ffs64(max_pcbm)

vclosid = vmsr - MSR_IA32_type_MASK_0
pclosid = vm_config->pclosids[vclosid]

pmsr = MSR_IA32_type_MASK_0 + pclosid
pcbm = vcbm << mask_shift
vcbm = pcbm >> mask_shift

Where
MSR_IA32_type_MASK_n: L2 or L3 mask msr address for CLOSIDn, from
0C90H through 0D8FH (inclusive).

max_pcbm: a bitmask that selects all the physical cache ways
assigned to the VM

vclosid: virtual CLOSID, always starts from 0

pclosid: corresponding physical CLOSID for a given vclosid

vmsr: virtual msr address, passed to vCAT handlers by the
caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler()

pmsr: physical msr address

vcbm: virtual CBM, passed to vCAT handlers by the
caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler()

pcbm: physical CBM

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@...>
---
 hypervisor/arch/x86/guest/vcat.c             | 100 +++++++++++++++++--
 hypervisor/arch/x86/guest/vmsr.c             |   9 ++
 hypervisor/include/arch/x86/asm/guest/vcat.h |   1 +
 3 files changed, 104 insertions(+), 6 deletions(-)

diff --git a/hypervisor/arch/x86/guest/vcat.c b/hypervisor/arch/x86/guest/vcat.c
index 03e9a3cc9..1dda6b61f 100644
--- a/hypervisor/arch/x86/guest/vcat.c
+++ b/hypervisor/arch/x86/guest/vcat.c
@@ -365,6 +365,21 @@ int32_t read_vcat_msr(const struct acrn_vcpu *vcpu, uint32_t vmsr, uint64_t *rval)
 	return ret;
 }

+/**
+ * @brief Map vCBM to pCBM
+ *
+ * @pre vm != NULL
+ */
+static uint64_t vcbm_to_pcbm(const struct acrn_vm *vm, uint64_t vcbm, int res)
+{
+ uint64_t max_pcbm = get_max_pcbm(vm, res);
+
+ /* Find the position low (the first bit set) in max_pcbm */
+ uint16_t low = ffs64(max_pcbm);
+
+ return vcbm << low;
+}
+
 /**
  * @brief Map vMSR address (abbreviated as vmsr) to corresponding pMSR address (abbreviated as pmsr)
  * Each vMSR or pMSR is identified by a 32-bit integer
@@ -395,19 +410,70 @@ static uint32_t vmsr_to_pmsr(const struct acrn_vm *vm, uint32_t vmsr, int res)
 	return pmsr;
 }

+static void write_pcbm(uint32_t pmsr, uint64_t pcbm)
+{
+	/* Preserve reserved bits, and only set the pCBM bits */
+	uint64_t pmsr_value = (msr_read(pmsr) & 0xFFFFFFFF00000000UL) | pcbm;
+
+	msr_write(pmsr, pmsr_value);
+}
+
+/* Check if bitmask is contiguous:
+ * All (and only) contiguous '1' combinations are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.)
+ */
+static bool is_contiguous_bit_set(uint64_t bitmask)
+{
+	bool ret = false;
+	uint64_t tmp64 = bitmask;
+
+	if (tmp64 != 0UL) {
+		while ((tmp64 & 1UL) == 0UL) {
+			tmp64 >>= 1U;
+		}
+
+		while ((tmp64 & 1UL) != 0UL) {
+			tmp64 >>= 1U;
+		}
+
+		if (tmp64 == 0UL) {
+			ret = true;
+		}
+	}
That is too ugly.

We can use a bit-scan instruction to find the highest set bit (say high) and
the lowest set bit (say low). Then a contiguous value must be:
(1 << (high + 1)) - (1 << low)
Don: no, I do not think your algorithm is right; we need to check that the 1s
are contiguous.
Did you try it?

return (tmp64 == ((1 << (high + 1)) - (1 << low)));


Can you rethink this?


+
+ return ret;
+}
+
/**
* @brief vCBM MSR write handler
*
* @pre vcpu != NULL && vcpu->vm != NULL
*/
-static int32_t write_vcbm(__unused struct acrn_vcpu *vcpu, __unused uint32_t vmsr, __unused uint64_t val, __unused int res)
+static int32_t write_vcbm(struct acrn_vcpu *vcpu, uint32_t vmsr, uint64_t val, int res)
{
-	/* TODO: this is going to be implemented in a subsequent commit, will perform the following actions:
-	 * write vCBM
-	 * vmsr_to_pmsr and vcbm_to_pcbm
-	 * write pCBM
+	int32_t ret = -EINVAL;
+	/*
+	 * vcbm set bits should only be in the range of [0, vcbm_len) (vcat_get_max_vcbm),
+	 * so mask with vcat_get_max_vcbm to prevent erroneous vCBM value
 	 */
-	return -EFAULT;
+	uint64_t masked_vcbm = val & vcat_get_max_vcbm(vcpu->vm, res);
If SW sets bits beyond the hardware-supported value, I thought we should
inject #GP. No?
Don: yes, just return a non-zero value (-EINVAL) to the upper caller here if
the mask is not contiguous; eventually, in hv_main.c, it will inject a #GP to
the guest after vmexit_handler(), see below:
	ret = vmexit_handler(vcpu);
	if (ret < 0) {
		pr_fatal("dispatch VM exit handler failed for reason"
			" %d, ret = %d!", vcpu->arch.exit_reason, ret);
		vcpu_inject_gp(vcpu, 0U);
		continue;
	}
I am saying you missed the check for the case when the set value is beyond
the allowed scope, but you are talking about is_contiguous_bit_set.

Those are two different things, and we need to make sure both conditions are met.

+
+ if (is_contiguous_bit_set(masked_vcbm)) {
+ uint32_t pmsr;
+ uint64_t pcbm;
+
+ /* Write vCBM first */
+		vcpu_set_guest_msr(vcpu, vmsr, masked_vcbm | (val & 0xFFFFFFFF00000000UL));
+
I didn't understand why we do not directly set the guest value here.
Don: we can, but if the vcbm is not within range (vcbm set bits should only be
in the range of [0, vcbm_len), per vcat_get_max_vcbm), then what should we do
That is the scope check in the comments above. We must do both.

here: return an error code to the upper caller (which will inject #GP), or just
set the masked-off vcbm (masked_vcbm) to the guest? I chose the latter here.
Here we are setting the V value. Please follow exactly what the SDM says.
vcpu_set_guest_msr is setting the V value: whatever the guest writes, either
set it or inject #GP.



+ /* Write pCBM: */
+ pmsr = vmsr_to_pmsr(vcpu->vm, vmsr, res);
+ pcbm = vcbm_to_pcbm(vcpu->vm, masked_vcbm, res);
+ write_pcbm(pmsr, pcbm);
+
+ ret = 0;
+ }
+
+	/* Return non-zero to vmexit_handler if vcbm is not contiguous,
+	 * which will result in #GP injected to guest */
+ return ret;
}

/**
@@ -444,6 +510,28 @@ static int32_t write_vclosid(struct acrn_vcpu *vcpu, uint64_t val)
 	return 0;
 }

+/**
+ * @brief vCAT MSRs write handler
+ *
+ * @pre vcpu != NULL && vcpu->vm != NULL
+ */
+int32_t write_vcat_msr(struct acrn_vcpu *vcpu, uint32_t vmsr, uint64_t val)
+{
+ int32_t ret = -EACCES;
+
+ if (is_vcat_configured(vcpu->vm)) {
+ if (vmsr == MSR_IA32_PQR_ASSOC) {
+ ret = write_vclosid(vcpu, val);
Please move this to the caller side to avoid a double switch/if-else...
This one happens frequently at runtime.
Don: yes we can, but moving this to the caller code (vmsr.c) would complicate
the logic and require more code.
I don't see any hard part. The only thing we need to do is expose
write_vclosid publicly.

I would try to avoid that if I had the choice.


+ } else if (is_l2_vcbm_msr(vcpu->vm, vmsr)) {
+			ret = write_vcbm(vcpu, vmsr, val, RDT_RESOURCE_L2);
+ } else if (is_l3_vcbm_msr(vcpu->vm, vmsr)) {
+			ret = write_vcbm(vcpu, vmsr, val, RDT_RESOURCE_L3);
+ }
+ }
+
+ return ret;
+}
+
/**
* @brief Initialize vCBM MSRs
*
diff --git a/hypervisor/arch/x86/guest/vmsr.c b/hypervisor/arch/x86/guest/vmsr.c
index 06825d211..412388106 100644
--- a/hypervisor/arch/x86/guest/vmsr.c
+++ b/hypervisor/arch/x86/guest/vmsr.c
@@ -1092,6 +1092,15 @@ int32_t wrmsr_vmexit_handler(struct acrn_vcpu *vcpu)
 		}
 		break;
 	}
+#ifdef CONFIG_VCAT_ENABLED
+	case MSR_IA32_L2_MASK_BASE ... (MSR_IA32_L2_MASK_BASE + NUM_VCAT_L2_MSRS - 1U):
+	case MSR_IA32_L3_MASK_BASE ... (MSR_IA32_L3_MASK_BASE + NUM_VCAT_L3_MSRS - 1U):
+ case MSR_IA32_PQR_ASSOC:
+ {
+ err = write_vcat_msr(vcpu, msr, v);
+ break;
+ }
+#endif
default:
{
if (is_x2apic_msr(msr)) {
diff --git a/hypervisor/include/arch/x86/asm/guest/vcat.h b/hypervisor/include/arch/x86/asm/guest/vcat.h
index 16ae0802b..8effa7604 100644
--- a/hypervisor/include/arch/x86/asm/guest/vcat.h
+++ b/hypervisor/include/arch/x86/asm/guest/vcat.h
@@ -16,6 +16,7 @@ void init_vcat_msrs(struct acrn_vcpu *vcpu);
 uint16_t vcat_get_num_vclosids(const struct acrn_vm *vm);
 uint64_t vcat_pcbm_to_vcbm(const struct acrn_vm *vm, uint64_t pcbm, int res);
 int32_t read_vcat_msr(const struct acrn_vcpu *vcpu, uint32_t vmsr, uint64_t *rval);
+int32_t write_vcat_msr(struct acrn_vcpu *vcpu, uint32_t vmsr, uint64_t val);

#endif /* VCAT_H_ */

--
2.25.1









