=====================================
The Linux kernel user-space API guide
=====================================

.. _man-pages: https://www.kernel.org/doc/man-pages/

While much of the kernel's user-space API is documented elsewhere
(particularly in the man-pages_ project), some user-space information can
also be found in the kernel tree itself. This manual is intended to be the
place where this information is gathered.

.. class:: toc-title

   Table of contents

.. toctree::
   :maxdepth: 2

   no_new_privs
   seccomp_filter
   unshare
   spec_ctrl
   accelerators/ocxl
   ioasid

.. only:: subproject and html

   Indices
   =======

   * :ref:`genindex`