OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Reinette Chatre	a232b35483	selftests/sgx: Enable multiple thread support commit `26e688f126` upstream. Each thread executing in an enclave is associated with a Thread Control Structure (TCS). The test enclave contains two hardcoded TCS. Each TCS contains meta-data used by the hardware to save and restore thread specific information when entering/exiting the enclave. The two TCS structures within the test enclave share their SSA (State Save Area) resulting in the threads clobbering each other's data. Fix this by providing each TCS their own SSA area. Additionally, there is an 8K stack space and its address is computed from the enclave entry point which is correctly done for TCS #1 that starts on the first address inside the enclave but results in out of bounds memory when entering as TCS #2. Split 8K stack space into two separate pages with offset symbol between to ensure the current enclave entry calculation can continue to be used for both threads. While using the enclave with multiple threads requires these fixes the impact is not apparent because every test up to this point enters the enclave from the first TCS. More detail about the stack fix: ------------------------------- Before this change the test enclave (test_encl) looks as follows: .tcs (2 pages): (page 1) TCS #1 (page 2) TCS #2 .text (1 page) One page of code .data (5 pages) (page 1) encl_buffer (page 2) encl_buffer (page 3) SSA (page 4 and 5) STACK encl_stack: As shown above there is a symbol, encl_stack, that points to the end of the .data segment (pointing to the end of page 5 in .data) which is also the end of the enclave. The enclave entry code computes the stack address by adding encl_stack to the pointer to the TCS that entered the enclave. When entering at TCS #1 the stack is computed correctly but when entering at TCS #2 the stack pointer would point to one page beyond the end of the enclave and a #PF would result when TCS #2 attempts to enter the enclave. The fix involves moving the encl_stack symbol between the two stack pages. Doing so enables the stack address computation in the entry code to compute the correct stack address for each TCS. Intel-SIG: commit `26e688f126` selftests/sgx: Enable multiple thread support. Backport for SGX EDMM support. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/a49dc0d85401db788a0a3f0d795e848abf3b1f44.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:06 +08:00
Reinette Chatre	589c092c85	selftests/sgx: Add page permission and exception test commit `abc5cec473` upstream. The Enclave Page Cache Map (EPCM) is a secure structure used by the processor to track the contents of the enclave page cache. The EPCM contains permissions with which enclave pages can be accessed. SGX support allows EPCM and PTE page permissions to differ - as long as the PTE permissions do not exceed the EPCM permissions. Add a test that: (1) Creates an SGX enclave page with writable EPCM permission. (2) Changes the PTE permission on the page to read-only. This should be permitted because the permission does not exceed the EPCM permission. (3) Attempts a write to the page. This should generate a page fault (#PF) because of the read-only PTE even though the EPCM permissions allow the page to be written to. This introduces the first test of SGX exception handling. In this test the issue that caused the exception (PTE page permissions) can be fixed from outside the enclave and after doing so it is possible to re-enter enclave at original entrypoint with ERESUME. Intel-SIG: commit `abc5cec473` selftests/sgx: Add page permission and exception test. Backport for SGX EDMM support. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/3bcc73a4b9fe8780bdb40571805e7ced59e01df7.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:05 +08:00
Reinette Chatre	1a8bdc5061	selftests/sgx: Rename test properties in preparation for more enclave tests commit `c085dfc768` upstream. SGX selftests prepares a data structure outside of the enclave with the type of and data for the operation that needs to be run within the enclave. At this time only two complementary operations are supported by the enclave: copying a value from outside the enclave into a default buffer within the enclave and reading a value from the enclave's default buffer into a variable accessible outside the enclave. In preparation for more operations supported by the enclave the names of the current enclave operations are changed to more accurately reflect the operations and more easily distinguish it from future operations: * The enums ENCL_OP_PUT and ENCL_OP_GET are renamed to ENCL_OP_PUT_TO_BUFFER and ENCL_OP_GET_FROM_BUFFER respectively. * The structs encl_op_put and encl_op_get are renamed to encl_op_put_to_buf and encl_op_get_from_buf respectively. * The enclave functions do_encl_op_put and do_encl_op_get are renamed to do_encl_op_put_to_buf and do_encl_op_get_from_buf respectively. No functional changes. Intel-SIG: commit `c085dfc768` selftests/sgx: Rename test properties in preparation for more enclave tests. Backport for SGX EDMM support. Suggested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/023fda047c787cf330b88ed9337705edae6a0078.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:05 +08:00
Jarkko Sakkinen	27e18d52f6	selftests/sgx: Provide per-op parameter structs for the test enclave commit `41493a095e` upstream. To add more operations to the test enclave, the protocol needs to allow to have operations with varying parameters. Create a separate parameter struct for each existing operation, with the shared parameters in struct encl_op_header. Intel-SIG: commit `41493a095e` selftests/sgx: Provide per-op parameter structs for the test enclave. Backport for SGX EDMM support. [reinette: rebased to apply on top of oversubscription test series] Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/f9a4a8c436b538003b8ebddaa66083992053cef1.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:05 +08:00
Jarkko Sakkinen	0d7d49d9bd	selftests/sgx: Fix corrupted cpuid macro invocation commit `572a0a647b` upstream. The SGX selftest fails to build on tip/x86/sgx: main.c: In function ‘get_total_epc_mem’: main.c:296:17: error: implicit declaration of function ‘__cpuid’ [-Werror=implicit-function-declaration] 296 \| __cpuid(&eax, &ebx, &ecx, &edx); \| ^~~~~~~ Include cpuid.h and use __cpuid_count() macro in order to fix the compilation issue. [ dhansen: tweak commit message ] Intel-SIG: commit `572a0a647b` selftests/sgx: Fix corrupted cpuid macro invocation. Backport for SGX EDMM support. Fixes: `f0ff2447b8` ("selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211204202355.23005-1-jarkko@kernel.org Cc: Shuah Khan <shuah@kernel.org> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:04 +08:00
Jarkko Sakkinen	c59086a59a	selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed commit `f0ff2447b8` upstream. Add a variation of the unclobbered_vdso test. In the new test, create a heap for the test enclave, which has the same size as all available Enclave Page Cache (EPC) pages in the system. This will guarantee that all test_encl.elf pages and SGX Enclave Control Structure (SECS) have been swapped out by the page reclaimer during the load time. This test will trigger both the page reclaimer and the page fault handler. The page reclaimer triggered, while the heap is being created during the load time. The page fault handler is triggered for all the required pages, while the test case is executing. Intel-SIG: commit `f0ff2447b8` selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/41f7c508eea79a3198b5014d7691903be08f9ff1.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:04 +08:00
Jarkko Sakkinen	319148e6ed	selftests/sgx: Move setup_test_encl() to each TEST_F() commit `065825db1f` upstream. Create the test enclave inside each TEST_F(), instead of FIXTURE_SETUP(), so that the heap size can be defined per test. Intel-SIG: commit `065825db1f` selftests/sgx: Move setup_test_encl() to each TEST_F(). Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/70ca264535d2ca0dc8dcaf2281e7d6965f8d4a24.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:03 +08:00
Jarkko Sakkinen	2d57863265	selftests/sgx: Encpsulate the test enclave creation commit `1b35eb7195` upstream. Introduce setup_test_encl() so that the enclave creation can be moved to TEST_F()'s. This is required for a reclaimer test where the heap size needs to be set large enough to triger the page reclaimer. Intel-SIG: commit `1b35eb7195` selftests/sgx: Encpsulate the test enclave creation. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/bee0ca867a95828a569c1ba2a8e443a44047dc71.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:03 +08:00
Jarkko Sakkinen	88c61332b5	selftests/sgx: Dump segments and /proc/self/maps only on failure commit `1471721489` upstream. Logging is always a compromise between clarity and detail. The main use case for dumping VMA's is when FIXTURE_SETUP() fails, and is less important for enclaves that do initialize correctly. Therefore, print the segments and /proc/self/maps only in the error case. Finally, if a single test ever creates multiple enclaves, the amount of log lines would become enormous. Intel-SIG: commit `1471721489` selftests/sgx: Dump segments and /proc/self/maps only on failure. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/23cef0ae1de3a8a74cbfbbe74eca48ca3f300fde.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:03 +08:00
Jarkko Sakkinen	f47c1b16e7	selftests/sgx: Create a heap for the test enclave commit `3200505d4d` upstream. Create a heap for the test enclave, which is allocated from /dev/null, and left unmeasured. This is beneficial by its own because it verifies that an enclave built from multiple choices, works properly. If LSM hooks are added for SGX some day, a multi source enclave has higher probability to trigger bugs on access control checks. The immediate need comes from the need to implement page reclaim tests. In order to trigger the page reclaimer, one can just set the size of the heap to high enough. Intel-SIG: commit `3200505d4d` selftests/sgx: Create a heap for the test enclave. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/e070c5f23578c29608051cab879b1d276963a27a.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:02 +08:00
Jarkko Sakkinen	6069d0ed3b	selftests/sgx: Make data measurement for an enclave segment optional commit `5f0ce664d8` upstream. For a heap makes sense to leave its contents "unmeasured" in the SGX enclave build process, meaning that they won't contribute to the cryptographic signature (a RSA-3072 signed SHA56 hash) of the enclave. Enclaves are signed blobs where the signature is calculated both from page data and also from "structural properties" of the pages. For instance a page offset of every page added to the enclave is hashed. For data, this is optional, not least because hashing a page has a significant contribution to the enclave load time. Thus, where there is no reason to hash, do not. The SGX ioctl interface supports this with SGX_PAGE_MEASURE flag. Only when the flag is set, data is measured. Add seg->measure boolean flag to struct encl_segment. Only when the flag is set, include the segment data to the signature (represented by SIGSTRUCT architectural structure). Intel-SIG: commit `5f0ce664d8` selftests/sgx: Make data measurement for an enclave segment optional. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/625b6fe28fed76275e9238ec4e15ec3c0d87de81.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:02 +08:00
Jarkko Sakkinen	d97e62407b	selftests/sgx: Assign source for each segment commit `39f62536be` upstream. Define source per segment so that enclave pages can be added from different sources, e.g. anonymous VMA for zero pages. In other words, add 'src' field to struct encl_segment, and assign it to 'encl->src' for pages inherited from the enclave binary. Intel-SIG: commit `39f62536be` selftests/sgx: Assign source for each segment. Backport for SGX EDMM support. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/7850709c3089fe20e4bcecb8295ba87c54cc2b4a.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:02 +08:00
Sean Christopherson	bbc2567b6c	selftests/sgx: Fix a benign linker warning commit `5064343fb1` upstream. The enclave binary (test_encl.elf) is built with only three sections (tcs, text, and data) as controlled by its custom linker script. If gcc is built with "--enable-linker-build-id" (this appears to be a common configuration even if it is by default off) then gcc will pass "--build-id" to the linker that will prompt it (the linker) to write unique bits identifying the linked file to a ".note.gnu.build-id" section. The section ".note.gnu.build-id" does not exist in the test enclave resulting in the following warning emitted by the linker: /usr/bin/ld: warning: .note.gnu.build-id section discarded, --build-id ignored The test enclave does not use the build id within the binary so fix the warning by passing a build id of "none" to the linker that will disable the setting from any earlier "--build-id" options and thus disable the attempt to write the build id to a ".note.gnu.build-id" section that does not exist. Intel-SIG: commit `5064343fb1` selftests/sgx: Fix a benign linker warning. Backport for SGX EDMM support. Link: https://lore.kernel.org/linux-sgx/20191017030340.18301-2-sean.j.christopherson@intel.com/ Suggested-by: Cedric Xing <cedric.xing@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/ca0f8a81fc1e78af9bdbc6a88e0f9c37d82e53f2.1636997631.git.reinette.chatre@intel.com [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:01 +08:00
Tony Luck	f5f9636ca8	x86/sgx: Add check for SGX pages to ghes_do_memory_failure() commit `3ad6fd77a2` upstream. SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Intel-SIG: commit `3ad6fd77a2` x86/sgx: Add check for SGX pages to ghes_do_memory_failure(). Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-8-tony.luck@intel.com [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:01 +08:00
Tony Luck	75a042336e	x86/sgx: Add hook to error injection address validation commit `c6acb1e7bf` upstream. SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Intel-SIG: commit `c6acb1e7bf` x86/sgx: Add hook to error injection address validation. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-7-tony.luck@intel.com [ Zhiquan: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:01 +08:00
Tony Luck	487d5438cd	x86/sgx: Hook arch_memory_failure() into mainline code commit `03b122da74` upstream. Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. Intel-SIG: commit `03b122da74` x86/sgx: Hook arch_memory_failure() into mainline code. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-6-tony.luck@intel.com [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:00 +08:00
Tony Luck	f8c8adb6ca	x86/sgx: Add SGX infrastructure to recover from poison commit `a495cbdffa` upstream. Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave is of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the per-node poison page list. Intel-SIG: commit `a495cbdffa` x86/sgx: Add SGX infrastructure to recover from poison. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-5-tony.luck@intel.com [ Zhiquan: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:00 +08:00
Tony Luck	3792e69713	x86/sgx: Initial poison handling for dirty and free pages commit `992801ae92` upstream. A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags (Currently just sgx_reclaimer_lock which protects the SGX_EPC_PAGE_RECLAIMER_TRACKED bit in page->flags). In both cases place the poisoned page on a per-node list of poisoned epc pages to make sure it will not be reallocated. Intel-SIG: commit `992801ae92` x86/sgx: Initial poison handling for dirty and free pages. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-4-tony.luck@intel.com [ Zhiquan: amend commit log and adapt the code ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:24:00 +08:00
Tony Luck	58f83be5d4	x86/sgx: Add infrastructure to identify SGX EPC pages commit `40e0e7843e` upstream. X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. N.B. adds CONFIG_XARRAY_MULTI to the SGX dependecies. So "select" that in arch/x86/Kconfig for X86/SGX. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. Intel-SIG: commit `40e0e7843e` x86/sgx: Add infrastructure to identify SGX EPC pages. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-3-tony.luck@intel.com [ Zhiquan: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:59 +08:00
Tony Luck	649ecbcdf0	x86/sgx: Add new sgx_epc_page flag bit to mark free pages commit `d6d261bded` upstream. SGX EPC pages go through the following life cycle: DIRTY ---> FREE ---> IN-USE --\ ^ \| \-----------------/ Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IS_FREE that is set when a page is added to a free list and cleared when the page is allocated. Notes: 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IS_FREE bit is set, then the page is on the free list. 2) Initially while the pages are on the dirty list the SGX_EPC_PAGE_IS_FREE bit is cleared. Intel-SIG: commit `d6d261bded` x86/sgx: Add new sgx_epc_page flag bit to mark free pages. Backport for SGX MCA recovery co-existence support. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20211026220050.697075-2-tony.luck@intel.com [ Zhiquan: amend commit log and adapt the code ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:59 +08:00
Sean Christopherson	2c7e165f6c	KVM: x86: Fix implicit enum conversion goof in scattered reverse CPUID code commit `462f8ddebc` upstream. Take "enum kvm_only_cpuid_leafs" in scattered specific CPUID helpers (which is obvious in hindsight), and use "unsigned int" for leafs that can be the kernel's standard "enum cpuid_leaf" or the aforementioned KVM-only variant. Loss of the enum params is a bit disapponting, but gcc obviously isn't providing any extra sanity checks, and the various BUILD_BUG_ON() assertions ensure the input is in range. This fixes implicit enum conversions that are detected by clang-11: arch/x86/kvm/cpuid.c:499:29: warning: implicit conversion from enumeration type 'enum kvm_only_cpuid_leafs' to different enumeration type 'enum cpuid_leafs' [-Wenum-conversion] kvm_cpu_cap_init_scattered(CPUID_12_EAX, ~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~ arch/x86/kvm/cpuid.c:837:31: warning: implicit conversion from enumeration type 'enum kvm_only_cpuid_leafs' to different enumeration type 'enum cpuid_leafs' [-Wenum-conversion] cpuid_entry_override(entry, CPUID_12_EAX); ~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~ 2 warnings generated. Intel-SIG: commit `462f8ddebc` KVM: x86: Fix implicit enum conversion goof in scattered reverse CPUID code. Backport for SGX virtualization support. Fixes: `4e66c0cb79` ("KVM: x86: Add support for reverse CPUID lookup of scattered features") Cc: Kai Huang <kai.huang@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210421010850.3009718-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:58 +08:00
Mauro Carvalho Chehab	e1677961e8	docs: virt: api.rst: fix a pointer to SGX documentation commit `0a5fab9f08` upstream. The document which describes the SGX kernel architecture was added at commit `3fa97bf001` ("Documentation/x86: Document SGX kernel architecture") but the reference at virt/kvm/api.rst is pointing to some non-existing document. Intel-SIG: commit `0a5fab9f08` docs: virt: api.rst: fix a pointer to SGX documentation Backport for SGX virtualization support on Intel Xeon platform. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/138c24633c6e4edf862a2b4d77033c603fc10406.1621413933.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:58 +08:00
Reinette Chatre	63114f3b50	x86/sgx: Silence softlockup detection when releasing large enclaves commit `8795359e35` upstream. Vijay reported that the "unclobbered_vdso_oversubscribed" selftest triggers the softlockup detector. Actual SGX systems have 128GB of enclave memory or more. The "unclobbered_vdso_oversubscribed" selftest creates one enclave which consumes all of the enclave memory on the system. Tearing down such a large enclave takes around a minute, most of it in the loop where the EREMOVE instruction is applied to each individual 4k enclave page. Spending one minute in a loop triggers the softlockup detector. Add a cond_resched() to give other tasks a chance to run and placate the softlockup detector. Intel-SIG: commit `8795359e35` x86/sgx: Silence softlockup detection when releasing large enclaves. Backport for SGX virtualization support. Cc: stable@vger.kernel.org Fixes: `1728ab54b4` ("x86/sgx: Add a page reclaimer") Reported-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> (kselftest as sanity check) Link: https://lkml.kernel.org/r/ced01cac1e75f900251b0a4ae1150aa8ebd295ec.1644345232.git.reinette.chatre@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. The issue has been fixed by commit `d8c6333085` ("sgx: fix softlockup when sgx_encl_release"), just align the code to upstream here. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:58 +08:00
Reinette Chatre	a0e0443a2b	x86/sgx: Fix free page accounting commit `ac5d272a0a` upstream. The SGX driver maintains a single global free page counter, sgx_nr_free_pages, that reflects the number of free pages available across all NUMA nodes. Correspondingly, a list of free pages is associated with each NUMA node and sgx_nr_free_pages is updated every time a page is added or removed from any of the free page lists. The main usage of sgx_nr_free_pages is by the reclaimer that runs when it (sgx_nr_free_pages) goes below a watermark to ensure that there are always some free pages available to, for example, support efficient page faults. With sgx_nr_free_pages accessed and modified from a few places it is essential to ensure that these accesses are done safely but this is not the case. sgx_nr_free_pages is read without any protection and updated with inconsistent protection by any one of the spin locks associated with the individual NUMA nodes. For example: CPU_A CPU_B ----- ----- spin_lock(&nodeA->lock); spin_lock(&nodeB->lock); ... ... sgx_nr_free_pages--; /* NOT SAFE / sgx_nr_free_pages--; spin_unlock(&nodeA->lock); spin_unlock(&nodeB->lock); Since sgx_nr_free_pages may be protected by different spin locks while being modified from different CPUs, the following scenario is possible: CPU_A CPU_B ----- ----- {sgx_nr_free_pages = 100} spin_lock(&nodeA->lock); spin_lock(&nodeB->lock); sgx_nr_free_pages--; sgx_nr_free_pages--; / LOAD sgx_nr_free_pages = 100 / / LOAD sgx_nr_free_pages = 100 / / sgx_nr_free_pages-- / / sgx_nr_free_pages-- / / STORE sgx_nr_free_pages = 99 / / STORE sgx_nr_free_pages = 99 */ spin_unlock(&nodeA->lock); spin_unlock(&nodeB->lock); In the above scenario, sgx_nr_free_pages is decremented from two CPUs but instead of sgx_nr_free_pages ending with a value that is two less than it started with, it was only decremented by one while the number of free pages were actually reduced by two. The consequence of sgx_nr_free_pages not being protected is that its value may not accurately reflect the actual number of free pages on the system, impacting the availability of free pages in support of many flows. The problematic scenario is when the reclaimer does not run because it believes there to be sufficient free pages while any attempt to allocate a page fails because there are no free pages available. In the SGX driver the reclaimer's watermark is only 32 pages so after encountering the above example scenario 32 times a user space hang is possible when there are no more free pages because of repeated page faults caused by no free pages made available. The following flow was encountered: asm_exc_page_fault ... sgx_vma_fault() sgx_encl_load_page() sgx_encl_eldu() // Encrypted page needs to be loaded from backing // storage into newly allocated SGX memory page sgx_alloc_epc_page() // Allocate a page of SGX memory __sgx_alloc_epc_page() // Fails, no free SGX memory ... if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) // Wake reclaimer wake_up(&ksgxd_waitq); return -EBUSY; // Return -EBUSY giving reclaimer time to run return -EBUSY; return -EBUSY; return VM_FAULT_NOPAGE; The reclaimer is triggered in above flow with the following code: static bool sgx_should_reclaim(unsigned long watermark) { return sgx_nr_free_pages < watermark && !list_empty(&sgx_active_page_list); } In the problematic scenario there were no free pages available yet the value of sgx_nr_free_pages was above the watermark. The allocation of SGX memory thus always failed because of a lack of free pages while no free pages were made available because the reclaimer is never started because of sgx_nr_free_pages' incorrect value. The consequence was that user space kept encountering VM_FAULT_NOPAGE that caused the same address to be accessed repeatedly with the same result. Change the global free page counter to an atomic type that ensures simultaneous updates are done safely. While doing so, move the updating of the variable outside of the spin lock critical section to which it does not belong. Intel-SIG: commit `ac5d272a0a` x86/sgx: Fix free page accounting Backport for SGX virtualization support. Cc: stable@vger.kernel.org Fixes: `901ddbb9ec` ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/a95a40743bbd3f795b465f30922dde7f1ea9e0eb.1637004094.git.reinette.chatre@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:57 +08:00
Paolo Bonzini	984d2b5033	x86/sgx/virt: implement SGX_IOC_VEPC_REMOVE ioctl commit `ae095b16fc` upstream. For bare-metal SGX on real hardware, the hardware provides guarantees SGX state at reboot. For instance, all pages start out uninitialized. The vepc driver provides a similar guarantee today for freshly-opened vepc instances, but guests such as Windows expect all pages to be in uninitialized state on startup, including after every guest reboot. Some userspace implementations of virtual SGX would rather avoid having to close and reopen the /dev/sgx_vepc file descriptor and re-mmap the virtual EPC. For example, they could sandbox themselves after the guest starts and forbid further calls to open(), in order to mitigate exploits from untrusted guests. Therefore, add a ioctl that does this with EREMOVE. Userspace can invoke the ioctl to bring its vEPC pages back to uninitialized state. There is a possibility that some pages fail to be removed if they are SECS pages, and the child and SECS pages could be in separate vEPC regions. Therefore, the ioctl returns the number of EREMOVE failures, telling userspace to try the ioctl again after it's done with all vEPC regions. A more verbose description of the correct usage and the possible error conditions is documented in sgx.rst. Intel-SIG: commit `ae095b16fc` x86/sgx/virt: implement SGX_IOC_VEPC_REMOVE ioctl. Backport for SGX virtualization support. Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20211021201155.1523989-3-pbonzini@redhat.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:57 +08:00
Paolo Bonzini	d4b3139ba7	x86/sgx/virt: extract sgx_vepc_remove_page commit `fd5128e622` upstream. For bare-metal SGX on real hardware, the hardware provides guarantees SGX state at reboot. For instance, all pages start out uninitialized. The vepc driver provides a similar guarantee today for freshly-opened vepc instances, but guests such as Windows expect all pages to be in uninitialized state on startup, including after every guest reboot. One way to do this is to simply close and reopen the /dev/sgx_vepc file descriptor and re-mmap the virtual EPC. However, this is problematic because it prevents sandboxing the userspace (for example forbidding open() after the guest starts; this is doable with heavy use of SCM_RIGHTS file descriptor passing). In order to implement this, we will need a ioctl that performs EREMOVE on all pages mapped by a /dev/sgx_vepc file descriptor: other possibilities, such as closing and reopening the device, are racy. Start the implementation by creating a separate function with just the __eremove wrapper. Intel-SIG: commit `fd5128e622` x86/sgx/virt: extract sgx_vepc_remove_page. Backport for SGX virtualization support. Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20211021201155.1523989-2-pbonzini@redhat.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:57 +08:00
Liam Howlett	024277f378	x86/sgx: use vma_lookup() in sgx_encl_find() commit `9ce2c3fc0b` upstream. Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Intel-SIG: commit `9ce2c3fc0b` x86/sgx: use vma_lookup() in sgx_encl_find(). Backport for SGX virtualization support. Link: https://lkml.kernel.org/r/20210521174745.2219620-10-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:56 +08:00
Kai Huang	0aa1db9b22	x86/sgx: Add missing xa_destroy() when virtual EPC is destroyed commit `4692bc775d` upstream. xa_destroy() needs to be called to destroy a virtual EPC's page array before calling kfree() to free the virtual EPC. Currently it is not called so add the missing xa_destroy(). Intel-SIG: commit `4692bc775d` x86/sgx: Add missing xa_destroy() when virtual EPC is destroyed. Backport for SGX virtualization support. Fixes: `540745ddbc` ("x86/sgx: Introduce virtual EPC for use by KVM guests") Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Tested-by: Yang Zhong <yang.zhong@intel.com> Link: https://lkml.kernel.org/r/20210615101639.291929-1-kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:56 +08:00
Jarkko Sakkinen	c85942cb1c	x86/sgx: Do not update sgx_nr_free_pages in sgx_setup_epc_section() commit `ae40aaf6bd` upstream. The commit in Fixes: changed the SGX EPC page sanitization to end up in sgx_free_epc_page() which puts clean and sanitized pages on the free list. This was done for the reason that it is best to keep the logic to assign available-for-use EPC pages to the correct NUMA lists in a single location. sgx_nr_free_pages is also incremented by sgx_free_epc_pages() but those pages which are being added there per EPC section do not belong to the free list yet because they haven't been sanitized yet - they land on the dirty list first and the sanitization happens later when ksgxd starts massaging them. So remove that addition there and have sgx_free_epc_page() do that solely. [ bp: Sanitize commit message too. ] Intel-SIG: commit `ae40aaf6bd` x86/sgx: Do not update sgx_nr_free_pages in sgx_setup_epc_section(). Backport for SGX virtualization support. Fixes: `51ab30eb2a` ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210408092924.7032-1-jarkko@kernel.org Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:56 +08:00
Sean Christopherson	1ca19405c2	KVM: x86: Add capability to grant VM access to privileged SGX attribute commit `fe7e948837` upstream. Add a capability, KVM_CAP_SGX_ATTRIBUTE, that can be used by userspace to grant a VM access to a priveleged attribute, with args[0] holding a file handle to a valid SGX attribute file. The SGX subsystem restricts access to a subset of enclave attributes to provide additional security for an uncompromised kernel, e.g. to prevent malware from using the PROVISIONKEY to ensure its nodes are running inside a geniune SGX enclave and/or to obtain a stable fingerprint. To prevent userspace from circumventing such restrictions by running an enclave in a VM, KVM restricts guest access to privileged attributes by default. Intel-SIG: commit `fe7e948837` KVM: x86: Add capability to grant VM access to privileged SGX attribute. Backport for SGX virtualization support Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <0b099d65e933e068e3ea934b0523bab070cb8cea.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:55 +08:00
Sean Christopherson	2f673d4403	KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC commit `72add915fb` upstream. Enable SGX virtualization now that KVM has the VM-Exit handlers needed to trap-and-execute ENCLS to ensure correctness and/or enforce the CPU model exposed to the guest. Add a KVM module param, "sgx", to allow an admin to disable SGX virtualization independent of the kernel. When supported in hardware and the kernel, advertise SGX1, SGX2 and SGX LC to userspace via CPUID and wire up the ENCLS_EXITING bitmap based on the guest's SGX capabilities, i.e. to allow ENCLS to be executed in an SGX-enabled guest. With the exception of the provision key, all SGX attribute bits may be exposed to the guest. Guest access to the provision key, which is controlled via securityfs, will be added in a future patch. Note, KVM does not yet support exposing ENCLS_C leafs or ENCLV leafs. Intel-SIG: commit `72add915fb` KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <a99e9c23310c79f2f4175c1af4c4cbcef913c3e5.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. - When the changes were made at v5.13, we have call path as below: kvm_vcpu_ioctl_set_cpuid2() -> kvm_set_cpuid() -> kvm_update_cpuid_runtime() -> kvm_vcpu_after_set_cpuid() chunk 1) -> static_call(kvm_x86_vcpu_after_set_cpuid)() => vmx_vcpu_after_set_cpuid() chunk 2) Now we have call path as below at v5.4: kvm_vcpu_ioctl_set_cpuid2() -> kvm_x86_ops->cpuid_update() => vmx_cpuid_update() chunk 2) -> kvm_update_cpuid() chunk 1) 1. Move chunk 1) in function kvm_vcpu_after_set_cpuid() to kvm_update_cpuid(). 2. Move chunk 2) in function vmx_vcpu_after_set_cpuid() to vmx_cpuid_update(). - In function nested_vmx_exit_reflected(), invert the return value of nested_vmx_exit_handled_encls() as per the old logic. - The commit `5a085326d5` ("KVM: VMX: Rename ops.h to vmx_ops.h") has not been backported, so including old ops.h. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:55 +08:00
Sean Christopherson	4cbbea4d55	KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) commit `b6f084ca55` upstream. Add a VM-Exit handler to trap-and-execute EINIT when SGX LC is enabled in the host. When SGX LC is enabled, the host kernel may rewrite the hardware values at will, e.g. to launch enclaves with different signers, thus KVM needs to intercept EINIT to ensure it is executed with the correct LE hash (even if the guest sees a hardwired hash). Switching the LE hash MSRs on VM-Enter/VM-Exit is not a viable option as writing the MSRs is prohibitively expensive, e.g. on SKL hardware each WRMSR is ~400 cycles. And because EINIT takes tens of thousands of cycles to execute, the ~1500 cycle overhead to trap-and-execute EINIT is unlikely to be noticed by the guest, let alone impact its overall SGX performance. Intel-SIG: commit `b6f084ca55` KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC). Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <57c92fa4d2083eb3be9e6355e3882fc90cffea87.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:54 +08:00
Sean Christopherson	2a51bfc16b	KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs commit `8f102445d4` upstream. Emulate the four Launch Enclave public key hash MSRs (LE hash MSRs) that exist on CPUs that support SGX Launch Control (LC). SGX LC modifies the behavior of ENCLS[EINIT] to use the LE hash MSRs when verifying the key used to sign an enclave. On CPUs without LC support, the LE hash is hardwired into the CPU to an Intel controlled key (the Intel key is also the reset value of the LE hash MSRs). Track the guest's desired hash so that a future patch can stuff the hash into the hardware MSRs when executing EINIT on behalf of the guest, when those MSRs are writable in host. Intel-SIG: commit `8f102445d4` KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <c58ef601ddf88f3a113add837969533099b1364a.1618196135.git.kai.huang@intel.com> [Add a comment regarding the MSRs being available until SGX is locked. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:54 +08:00
Sean Christopherson	439463947f	KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions commit `70210c044b` upstream. Add an ECREATE handler that will be used to intercept ECREATE for the purpose of enforcing and enclave's MISCSELECT, ATTRIBUTES and XFRM, i.e. to allow userspace to restrict SGX features via CPUID. ECREATE will be intercepted when any of the aforementioned masks diverges from hardware in order to enforce the desired CPUID model, i.e. inject #GP if the guest attempts to set a bit that hasn't been enumerated as allowed-1 in CPUID. Note, access to the PROVISIONKEY is not yet supported. Intel-SIG: commit `70210c044b` KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <c3a97684f1b71b4f4626a1fc3879472a95651725.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:54 +08:00
Sean Christopherson	679d595826	KVM: VMX: Frame in ENCLS handler for SGX virtualization commit `9798adbc04` upstream. Introduce sgx.c and sgx.h, along with the framework for handling ENCLS VM-Exits. Add a bool, enable_sgx, that will eventually be wired up to a module param to control whether or not SGX virtualization is enabled at runtime. Intel-SIG: commit `9798adbc04` KVM: VMX: Frame in ENCLS handler for SGX virtualization. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <1c782269608b2f5e1034be450f375a8432fb705d.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:53 +08:00
Sean Christopherson	412af65f9c	KVM: VMX: Add basic handling of VM-Exit from SGX enclave commit `3c0c2ad1ae` upstream. Add support for handling VM-Exits that originate from a guest SGX enclave. In SGX, an "enclave" is a new CPL3-only execution environment, wherein the CPU and memory state is protected by hardware to make the state inaccesible to code running outside of the enclave. When exiting an enclave due to an asynchronous event (from the perspective of the enclave), e.g. exceptions, interrupts, and VM-Exits, the enclave's state is automatically saved and scrubbed (the CPU loads synthetic state), and then reloaded when re-entering the enclave. E.g. after an instruction based VM-Exit from an enclave, vmcs.GUEST_RIP will not contain the RIP of the enclave instruction that trigered VM-Exit, but will instead point to a RIP in the enclave's untrusted runtime (the guest userspace code that coordinates entry/exit to/from the enclave). To help a VMM recognize and handle exits from enclaves, SGX adds bits to existing VMCS fields, VM_EXIT_REASON.VMX_EXIT_REASON_FROM_ENCLAVE and GUEST_INTERRUPTIBILITY_INFO.GUEST_INTR_STATE_ENCLAVE_INTR. Define the new architectural bits, and add a boolean to struct vcpu_vmx to cache VMX_EXIT_REASON_FROM_ENCLAVE. Clear the bit in exit_reason so that checks against exit_reason do not need to account for SGX, e.g. "if (exit_reason == EXIT_REASON_EXCEPTION_NMI)" continues to work. KVM is a largely a passive observer of the new bits, e.g. KVM needs to account for the bits when propagating information to a nested VMM, but otherwise doesn't need to act differently for the majority of VM-Exits from enclaves. The one scenario that is directly impacted is emulation, which is for all intents and purposes impossible[1] since KVM does not have access to the RIP or instruction stream that triggered the VM-Exit. The inability to emulate is a non-issue for KVM, as most instructions that might trigger VM-Exit unconditionally #UD in an enclave (before the VM-Exit check. For the few instruction that conditionally #UD, KVM either never sets the exiting control, e.g. PAUSE_EXITING[2], or sets it if and only if the feature is not exposed to the guest in order to inject a #UD, e.g. RDRAND_EXITING. But, because it is still possible for a guest to trigger emulation, e.g. MMIO, inject a #UD if KVM ever attempts emulation after a VM-Exit from an enclave. This is architecturally accurate for instruction VM-Exits, and for MMIO it's the least bad choice, e.g. it's preferable to killing the VM. In practice, only broken or particularly stupid guests should ever encounter this behavior. Add a WARN in skip_emulated_instruction to detect any attempt to modify the guest's RIP during an SGX enclave VM-Exit as all such flows should either be unreachable or must handle exits from enclaves before getting to skip_emulated_instruction. [1] Impossible for all practical purposes. Not truly impossible since KVM could implement some form of para-virtualization scheme. [2] PAUSE_LOOP_EXITING only affects CPL0 and enclaves exist only at CPL3, so we also don't need to worry about that interaction. Intel-SIG: commit `3c0c2ad1ae` KVM: VMX: Add basic handling of VM-Exit from SGX enclave Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <315f54a8507d09c292463ef29104e1d4c62e9090.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:53 +08:00
Sean Christopherson	b59f25207d	KVM: x86: Add reverse-CPUID lookup support for scattered SGX features commit `01de8682b3` upstream. Define a new KVM-only feature word for advertising and querying SGX sub-features in CPUID.0x12.0x0.EAX. Because SGX1 and SGX2 are scattered in the kernel's feature word, they need to be translated so that the bit numbers match those of hardware. Intel-SIG: commit `01de8682b3` KVM: x86: Add reverse-CPUID lookup support for scattered SGX features. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <e797c533f4c71ae89265bbb15a02aef86b67cbec.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:53 +08:00
Sean Christopherson	b90f0185c7	KVM: x86: Define new #PF SGX error code bit commit `00e7646c35` upstream. Page faults that are signaled by the SGX Enclave Page Cache Map (EPCM), as opposed to the traditional IA32/EPT page tables, set an SGX bit in the error code to indicate that the #PF was induced by SGX. KVM will need to emulate this behavior as part of its trap-and-execute scheme for virtualizing SGX Launch Control, e.g. to inject SGX-induced #PFs if EINIT faults in the host, and to support live migration. Intel-SIG: commit `00e7646c35` KVM: x86: Define new #PF SGX error code bit. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <e170c5175cb9f35f53218a7512c9e3db972b97a2.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:52 +08:00
Sean Christopherson	81b1a77341	KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) commit `54f958cdaa` upstream. Export the gva_to_gpa() helpers for use by SGX virtualization when executing ENCLS[ECREATE] and ENCLS[EINIT] on behalf of the guest. To execute ECREATE and EINIT, KVM must obtain the GPA of the target Secure Enclave Control Structure (SECS) in order to get its corresponding HVA. Because the SECS must reside in the Enclave Page Cache (EPC), copying the SECS's data to a host-controlled buffer via existing exported helpers is not a viable option as the EPC is not readable or writable by the kernel. SGX virtualization will also use gva_to_gpa() to obtain HVAs for non-EPC pages in order to pass user pointers directly to ECREATE and EINIT, which avoids having to copy pages worth of data into the kernel. Intel-SIG: commit `54f958cdaa` KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX). Backport for SGX virtualization support Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <02f37708321bcdfaa2f9d41c8478affa6e84b04d.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:52 +08:00
Sean Christopherson	b5a31aa4d3	x86/sgx: Move provisioning device creation out of SGX driver commit `b3754e5d3d` upstream. And extract sgx_set_attribute() out of sgx_ioc_enclave_provision() and export it as symbol for KVM to use. The provisioning key is sensitive. The SGX driver only allows to create an enclave which can access the provisioning key when the enclave creator has permission to open /dev/sgx_provision. It should apply to a VM as well, as the provisioning key is platform-specific, thus an unrestricted VM can also potentially compromise the provisioning key. Move the provisioning device creation out of sgx_drv_init() to sgx_init() as a preparation for adding SGX virtualization support, so that even if the SGX driver is not enabled due to flexible launch control not being available, SGX virtualization can still be enabled, and use it to restrict a VM's capability of being able to access the provisioning key. [ bp: Massage commit message. ] Intel-SIG: commit `b3754e5d3d` x86/sgx: Move provisioning device creation out of SGX driver. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/0f4d044d621561f26d5f4ef73e8dc6cd18cc7e79.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:52 +08:00
Sean Christopherson	5bffd94a65	x86/sgx: Add helpers to expose ECREATE and EINIT to KVM commit `d155030b1e` upstream. The host kernel must intercept ECREATE to impose policies on guests, and intercept EINIT to be able to write guest's virtual SGX_LEPUBKEYHASH MSR values to hardware before running guest's EINIT so it can run correctly according to hardware behavior. Provide wrappers around __ecreate() and __einit() to hide the ugliness of overloading the ENCLS return value to encode multiple error formats in a single int. KVM will trap-and-execute ECREATE and EINIT as part of SGX virtualization, and reflect ENCLS execution result to guest by setting up guest's GPRs, or on an exception, injecting the correct fault based on return value of __ecreate() and __einit(). Use host userspace addresses (provided by KVM based on guest physical address of ENCLS parameters) to execute ENCLS/EINIT when possible. Accesses to both EPC and memory originating from ENCLS are subject to segmentation and paging mechanisms. It's also possible to generate kernel mappings for ENCLS parameters by resolving PFN but using __uaccess_xx() is simpler. [ bp: Return early if the __user memory accesses fail, use cpu_feature_enabled(). ] Intel-SIG: commit `d155030b1e` x86/sgx: Add helpers to expose ECREATE and EINIT to KVM. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20e09daf559aa5e9e680a0b4b5fba940f1bad86e.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:51 +08:00
Kai Huang	01f2dfaa47	x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs commit `73916b6a0c` upstream. Add a helper to update SGX_LEPUBKEYHASHn MSRs. SGX virtualization also needs to update those MSRs based on guest's "virtual" SGX_LEPUBKEYHASHn before EINIT from guest. Intel-SIG: commit `73916b6a0c` x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs. Backport for SGX virtualization support. Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/dfb7cd39d4dd62ea27703b64afdd8bccb579f623.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:51 +08:00
Sean Christopherson	e456ee71e7	x86/sgx: Add encls_faulted() helper commit `a67136b458` upstream. Add a helper to extract the fault indicator from an encoded ENCLS return value. SGX virtualization will also need to detect ENCLS faults. Intel-SIG: commit `a67136b458` x86/sgx: Add encls_faulted() helper. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/c1f955898110de2f669da536fc6cf62e003dff88.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:50 +08:00
Sean Christopherson	3d7c41aabd	x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) commit `32ddda8e44` upstream. Define the ENCLS leafs that are available with SGX2, also referred to as Enclave Dynamic Memory Management (EDMM). The leafs will be used by KVM to conditionally expose SGX2 capabilities to guests. Intel-SIG: commit `32ddda8e44` x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT). Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/5f0970c251ebcc6d5add132f0d750cc753b7060f.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:50 +08:00
Sean Christopherson	85e9629da7	x86/sgx: Move ENCLS leaf definitions to sgx.h commit `9c55c78a73` upstream. Move the ENCLS leaf definitions to sgx.h so that they can be used by KVM. Intel-SIG: commit `9c55c78a73` x86/sgx: Move ENCLS leaf definitions to sgx.h. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/2e6cd7c5c1ced620cfcd292c3c6c382827fde6b2.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:50 +08:00
Kai Huang	0480ed73c0	x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled commit `faa7d3e6f3` upstream. Modify sgx_init() to always try to initialize the virtual EPC driver, even if the SGX driver is disabled. The SGX driver might be disabled if SGX Launch Control is in locked mode, or not supported in the hardware at all. This allows (non-Linux) guests that support non-LC configurations to use SGX. [ bp: De-silli-fy the test. ] Intel-SIG: commit `faa7d3e6f3` x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled. Backport for SGX virtualization support. Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <seanjc@google.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/d35d17a02bbf8feef83a536cec8b43746d4ea557.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:49 +08:00
Sean Christopherson	6b21ebc4dd	x86/cpu/intel: Allow SGX virtualization without Launch Control support commit `332bfc7bec` upstream. The kernel will currently disable all SGX support if the hardware does not support launch control. Make it more permissive to allow SGX virtualization on systems without Launch Control support. This will allow KVM to expose SGX to guests that have less-strict requirements on the availability of flexible launch control. Improve error message to distinguish between three cases. There are two cases where SGX support is completely disabled: 1) SGX has been disabled completely by the BIOS 2) SGX LC is locked by the BIOS. Bare-metal support is disabled because of LC unavailability. SGX virtualization is unavailable (because of Kconfig). One where it is partially available: 3) SGX LC is locked by the BIOS. Bare-metal support is disabled because of LC unavailability. SGX virtualization is supported. Intel-SIG: commit `332bfc7bec` x86/cpu/intel: Allow SGX virtualization without Launch Control support. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/b3329777076509b3b601550da288c8f3c406a865.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:49 +08:00
Sean Christopherson	94045e80c7	x86/sgx: Introduce virtual EPC for use by KVM guests commit `540745ddbc` upstream. Add a misc device /dev/sgx_vepc to allow userspace to allocate "raw" Enclave Page Cache (EPC) without an associated enclave. The intended and only known use case for raw EPC allocation is to expose EPC to a KVM guest, hence the 'vepc' moniker, virt.{c,h} files and X86_SGX_KVM Kconfig. The SGX driver uses the misc device /dev/sgx_enclave to support userspace in creating an enclave. Each file descriptor returned from opening /dev/sgx_enclave represents an enclave. Unlike the SGX driver, KVM doesn't control how the guest uses the EPC, therefore EPC allocated to a KVM guest is not associated with an enclave, and /dev/sgx_enclave is not suitable for allocating EPC for a KVM guest. Having separate device nodes for the SGX driver and KVM virtual EPC also allows separate permission control for running host SGX enclaves and KVM SGX guests. To use /dev/sgx_vepc to allocate a virtual EPC instance with particular size, the hypervisor opens /dev/sgx_vepc, and uses mmap() with the intended size to get an address range of virtual EPC. Then it may use the address range to create one KVM memory slot as virtual EPC for a guest. Implement the "raw" EPC allocation in the x86 core-SGX subsystem via /dev/sgx_vepc rather than in KVM. Doing so has two major advantages: - Does not require changes to KVM's uAPI, e.g. EPC gets handled as just another memory backend for guests. - EPC management is wholly contained in the SGX subsystem, e.g. SGX does not have to export any symbols, changes to reclaim flows don't need to be routed through KVM, SGX's dirty laundry doesn't have to get aired out for the world to see, and so on and so forth. The virtual EPC pages allocated to guests are currently not reclaimable. Reclaiming an EPC page used by enclave requires a special reclaim mechanism separate from normal page reclaim, and that mechanism is not supported for virutal EPC pages. Due to the complications of handling reclaim conflicts between guest and host, reclaiming virtual EPC pages is significantly more complex than basic support for SGX virtualization. [ bp: - Massage commit message and comments - use cpu_feature_enabled() - vertically align struct members init - massage Virtual EPC clarification text - move Kconfig prompt to Virtualization ] Intel-SIG: commit `540745ddbc` x86/sgx: Introduce virtual EPC for use by KVM guests. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/0c38ced8c8e5a69872db4d6a1c0dabd01e07cad7.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:49 +08:00
Sean Christopherson	11353b1c93	x86/sgx: Add SGX_CHILD_PRESENT hardware error code commit `231d3dbdda` upstream. SGX driver can accurately track how enclave pages are used. This enables SECS to be specifically targeted and EREMOVE'd only after all child pages have been EREMOVE'd. This ensures that SGX driver will never encounter SGX_CHILD_PRESENT in normal operation. Virtual EPC is different. The host does not track how EPC pages are used by the guest, so it cannot guarantee EREMOVE success. It might, for instance, encounter a SECS with a non-zero child count. Add a definition of SGX_CHILD_PRESENT. It will be used exclusively by the SGX virtualization driver to handle recoverable EREMOVE errors when saniziting EPC pages after they are freed. Intel-SIG: commit `231d3dbdda` x86/sgx: Add SGX_CHILD_PRESENT hardware error code. Backport for SGX virtualization support. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/050b198e882afde7e6eba8e6a0d4da39161dbb5a.1616136308.git.kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:48 +08:00
Kai Huang	f6164a994e	x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() commit `b0c7459be0` upstream. EREMOVE takes a page and removes any association between that page and an enclave. It must be run on a page before it can be added into another enclave. Currently, EREMOVE is run as part of pages being freed into the SGX page allocator. It is not expected to fail, as it would indicate a use-after-free of EPC pages. Rather than add the page back to the pool of available EPC pages, the kernel intentionally leaks the page to avoid additional errors in the future. However, KVM does not track how guest pages are used, which means that SGX virtualization use of EREMOVE might fail. Specifically, it is legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to KVM guest, because KVM/kernel doesn't track SECS pages. To allow SGX/KVM to introduce a more permissive EREMOVE helper and to let the SGX virtualization code use the allocator directly, break out the EREMOVE call from the SGX page allocator. Rename the original sgx_free_epc_page() to sgx_encl_free_epc_page(), indicating that it is used to free an EPC page assigned to a host enclave. Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call sites so there's no functional change. At the same time, improve the error message when EREMOVE fails, and add documentation to explain to the user what that failure means and to suggest to the user what to do when this bug happens in the case it happens. [ bp: Massage commit message, fix typos and sanitize text, simplify. ] Intel-SIG: commit `b0c7459be0` x86/sgx: Wipe out EREMOVE from sgx_free_epc_page(). Backport for SGX virtualization support. Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20210325093057.122834-1-kai.huang@intel.com Signed-off-by: Fan Du <fan.du@intel.com> [ Zhiquan Li: amend commit log and resolve the conflict. ] Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>	2024-06-11 21:23:48 +08:00

... 2 3 4 5 6 ...

875845 Commits All Branches Search

875845 Commits

All Branches