commit 052adee668 upstream.
None of the call sites cares about the actual return code. Change the
return type to boolean and return 'true' on success.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.736773588@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 2beb4a53fc upstream.
The kernel pushes context on to the userspace stack to prepare for the
user's signal handler. When the user has supplied an alternate signal
stack, via sigaltstack(2), it is easy for the kernel to verify that the
stack size is sufficient for the current hardware context.
Check if writing the hardware context to the alternate stack will exceed
it's size. If yes, then instead of corrupting user-data and proceeding with
the original signal handler, an immediate SIGSEGV signal is delivered.
Refactor the stack pointer check code from on_sig_stack() and use the new
helper.
While the kernel allows new source code to discover and use a sufficient
alternate signal stack size, this check is still necessary to protect
binaries with insufficient alternate signal stack size from data
corruption.
Fixes: c2bc11f10a ("x86, AVX-512: Enable AVX-512 States Context Switch")
Reported-by: Florian Weimer <fweimer@redhat.com>
Suggested-by: Jann Horn <jannh@google.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20210518200320.17239-6-chang.seok.bae@intel.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=153531
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit fcfb716332 upstream.
When the direct saving of the FPU registers to the user space sigframe
fails, copy_fpregs_to_sigframe() attempts to clear the user buffer.
The most likely reason for such a fail is a page fault. As
copy_fpregs_to_sigframe() is invoked with pagefaults disabled the chance
that __clear_user() succeeds is minuscule.
Move the clearing out into the caller which replaces the
fault_in_pages_writeable() in that error handling path.
The return value confusion will be cleaned up separately.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.679356300@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 4164a482a5 upstream.
There is no reason to have the header zeroing in the pagefault disabled
region. Do it upfront once.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.621674721@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 4339d0c63c upstream.
FPU restore from a signal frame can trigger various exceptions. The
exceptions are caught with an exception table entry. The handler of this
entry stores the trap number in EAX. The FPU specific fixup negates that
trap number to convert it into an negative error code.
Any other exception than #PF is fatal and recovery is not possible. This
relies on the fact that the #PF exception number is the same as EFAULT, but
that's not really obvious.
Remove the negation from the exception fixup as it really has no value and
check for X86_TRAP_PF at the call site.
There is still confusion due to the return code conversion for the error
case which will be cleaned up separately.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.506192488@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 0c2e62ba04 upstream.
Now that the MC safe copy and FPU have been converted to use the MCE safe
fixup types remove EX_TYPE_FAULT from the list of types which MCE considers
to be safe to be recovered in kernel.
This removes the SGX exception handling of ENCLS from the #MC safe
handling, but according to the SGX wizards the current SGX implementations
cannot survive #MC on ENCLS:
https://lore.kernel.org/r/YS+upEmTfpZub3s9@google.com
The code relies on the trap number being stored if ENCLS raised an
exception. That's still working, but it does no longer trick the MCE code
into assuming that #MC is handled correctly for ENCLS.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.445255957@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit c6304556f3 upstream.
The macros used for restoring FPU state from a user space buffer can handle
all exceptions including #MC. They need to return the trap number in the
error case as the code which invokes them needs to distinguish the cause of
the failure. It aborts the operation for anything except #PF.
Use the new EX_TYPE_FAULT_MCE_SAFE exception table fixup type to document
the nature of the fixup.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.387464538@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit c1c97d1754 upstream.
Nothing in that code uses the trap number which was stored by the exception
fixup which is instantiated via _ASM_EXTABLE_FAULT().
Use _ASM_EXTABLE(... EX_TYPE_DEFAULT_MCE_SAFE) instead which just handles
the IP fixup and the type indicates to the #MC handler that the call site
can handle the abort caused by #MC correctly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.328706042@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 2cadf5248b upstream.
Provide exception fixup types which can be used to identify fixups which
allow in kernel #MC recovery and make them invoke the existing handlers.
These will be used at places where #MC recovery is handled correctly by the
caller.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.269689153@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit opencloudos.
_ASM_EXTABLE_HANDLE() was removed by 46d28947d9,
so that complied failed. Fix it.
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 46d28947d9 upstream.
The exception table entries contain the instruction address, the fixup
address and the handler address. All addresses are relative. Storing the
handler address has a few downsides:
1) Most handlers need to be exported
2) Handlers can be defined everywhere and there is no overview about the
handler types
3) MCE needs to check the handler type to decide whether an in kernel #MC
can be recovered. The functionality of the handler itself is not in any
way special, but for these checks there need to be separate functions
which in the worst case have to be exported.
Some of these 'recoverable' exception fixups are pretty obscure and
just reuse some other handler to spare code. That obfuscates e.g. the
#MC safe copy functions. Cleaning that up would require more handlers
and exports
Rework the exception fixup mechanics by storing a fixup type number instead
of the handler address and invoke the proper handler for each fixup
type. Also teach the extable sort to leave the type field alone.
This makes most handlers static except for special cases like the MCE
MSR fixup and the BPF fixup. This allows to add more types for cleaning up
the obscure places without adding more handler code and exports.
There is a marginal code size reduction for a production config and it
removes _eight_ exported symbols.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lkml.kernel.org/r/20210908132525.211958725@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 083b32d6f4 upstream.
and the random number of tabs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.154428878@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 32fd8b59f9 upstream.
No point in defining the identical macros twice depending on C or assembly
mode. They are still identical.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.023659534@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 326b567f82 upstream.
No need to have the same code all over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132524.963232825@linutronix.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 41f45fb045 upstream.
Stephen Rothwell reported that the objtool cross-build breaks on
non-x86 hosts:
> tools/arch/x86/include/asm/asm.h:185:24: error: invalid register name for 'current_stack_pointer'
> 185 | register unsigned long current_stack_pointer asm(_ASM_SP);
> | ^~~~~~~~~~~~~~~~~~~~~
The PowerPC host obviously doesn't know much about x86 register names.
Protect the kernel-specific bits of <asm/asm.h>, so that it can be
included by tooling and cross-built.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit e100777016 upstream.
They do get called from the #MC handler which is already marked
"noinstr".
Commit
e2def7d49d ("x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR")
already got rid of the instrumentation in the MSR accessors, fix the
annotation now too, in order to get rid of:
vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200915194020.28807-1-bp@alien8.de
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 2976908e41 upstream.
A user has reported that they are seeing spurious corrected errors on
their hardware.
Intel Errata HSD131, HSM142, HSW131, and BDM48 report that "spurious
corrected errors may be logged in the IA32_MC0_STATUS register with
the valid field (bit 63) set, the uncorrected error field (bit 61) not
set, a Model Specific Error Code (bits [31:16]) of 0x000F, and an MCA
Error Code (bits [15:0]) of 0x0005." The Errata PDFs are linked in the
bugzilla below.
Block these spurious errors from the console and logs.
[ bp: Move the intel_filter_mce() header declarations into the already
existing CONFIG_X86_MCE_INTEL ifdeffery. ]
Co-developed-by: Alexander Krupp <centos@akr.yagii.de>
Signed-off-by: Alexander Krupp <centos@akr.yagii.de>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206587
Link: https://lkml.kernel.org/r/20200219131611.36816-1-prarit@redhat.com
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 8919f07276 upstream.
The test measures the kernel's signal delivery with different (enough vs.
insufficient) stack sizes.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20210518200320.17239-7-chang.seok.bae@intel.com
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit bdf6c8b84a upstream.
The SIGSTKSZ constant may not represent enough stack size in some
architectures as the hardware state size grows.
Use getauxval(AT_MINSIGSTKSZ) to increase the stack size.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20210518200320.17239-5-chang.seok.bae@intel.com
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 1c33bb0507 upstream.
Historically, signal.h defines MINSIGSTKSZ (2KB) and SIGSTKSZ (8KB), for
use by all architectures with sigaltstack(2). Over time, the hardware state
size grew, but these constants did not evolve. Today, literal use of these
constants on several architectures may result in signal stack overflow, and
thus user data corruption.
A few years ago, the ARM team addressed this issue by establishing
getauxval(AT_MINSIGSTKSZ). This enables the kernel to supply a value
at runtime that is an appropriate replacement on current and future
hardware.
Add getauxval(AT_MINSIGSTKSZ) support to x86, analogous to the support
added for ARM in
94b07c1f8c ("arm64: signal: Report signal frame size to userspace via auxv").
Also, include a documentation to describe x86-specific auxiliary vectors.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20210518200320.17239-4-chang.seok.bae@intel.com
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 7cd60e43a6 upstream.
Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
as generic ABI in glibc's generic elf.h, and this define will prevent
future namespace conflicts. In particular, x86 is also using this
generic definition.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20210518200320.17239-2-chang.seok.bae@intel.com
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit a3e340c157 upstream.
The wq resources needs to be released before the kernel type is reset by
__drv_disable_wq(). With dma channels unregistered and wq quiesced, all the
wq resources for dmaengine can be freed. There is no need to wait until wq
is disabled. With the wq->type being reset to "unknown", the driver is
skipping the freeing of the resources.
Fixes: 0cda4f6986 ("dmaengine: idxd: create dmaengine driver for wq 'device'")
Reported-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163517405099.3484556.12521975053711345244.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 2efe58cfaa upstream.
According to core-api/dma-api-howto.rst, the address from
dma_alloc_coherent is gauranteed to align to the smallest PAGE_SIZE order.
That supercedes the 64B/32B alignment requirement of the completion record.
Remove alignment adjustment code.
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163517396063.3484297.7494385225280705372.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 88d97ea82c upstream.
Add halt interrupt support. Given that the misc interrupt handler already
check halt state, the driver just need to run the halt handling code when
receiving the halt interrupt.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163114224352.846654.14334468363464318828.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 15af840831 upstream.
Remove setting of wq type on guest kernel during configuration load on RO
device config. The user will set the kernel wq type and this setting based
on config is not necessary.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163474724511.2607444.1876715711451990426.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 79c4c3db7d upstream.
DSA spec 1.2 has moved the GENCFG register under the GENCAP configuration
support with respect to writability. Add check in driver before writing to
GENCFG register.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163406171896.1303830.11217958011385656998.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 1f6a89efbf upstream.
The variable err is being initialized with a value that is never read, it
is being updated later on. The assignment is redundant and can be removed
and move the declaration into the local scope.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20211015123447.27560-1-colin.king@canonical.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 85f604af9c upstream.
percpu_ref_tryget_live() is safe to call as long as ref is between init and
exit according to the function comment. Move percpu_ref_exit() so it is
called after the dma channel is no longer valid to ensure this holds true.
Fixes: 93a40a6d74 ("dmaengine: idxd: add percpu_ref to descriptor submission path")
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163294293832.914350.10326422026738506152.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit f9f4082dbc upstream.
The spinlock is not being used in hard interrupt context. There is no need
to disable irq when acquiring the lock. The interrupt thread handler also
is not in bottom half context, therefore we can also remove disabling of
the bh. Convert all dev_lock acquisition to plain spin_lock() calls.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162984026772.1939166.11504067782824765879.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit f9f4082dbc upstream.
The cmd_lock spinlock is not being used in hard interrupt context. There is
no need to disable irq when acquiring the lock. Convert all cmd_lock
acquisition to plain spin_lock() calls.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162984027930.1939209.15758413737332339204.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit d8071323c5 upstream.
DSA spec says WQ priv bit is 0 if the Privileged Mode Enable field of the
PCI Express PASID capability is 0 and pasid is enabled. Make sure that the
WQCFG priv field is set correctly according to usage type. Reject config if
setting up kernel WQ type and no support. Also add the correct priv setup
for a descriptor.
Fixes: 484f910e93 ("dmaengine: idxd: fix wq config registers offset programming")
Cc: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162939084657.903168.14160019185148244596.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 9806eb5c79 upstream.
Since submission is sent to limited portal, the actual wq size for shared
wq is set by the threshold rather than the wq size. When the wq type is
shared, set the allocated descriptors to the threshold.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162827151733.3459223.3829837172226042408.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 0b030f54f0 upstream.
The submission path for dmaengine API does not do descriptor freeing on
failure. Also, with the abort mechanism, the freeing of descriptor happens
when the abort callback is completed. Therefore free descriptor on all
error paths for submission call to make things consistent. Also remove the
double free that would happen on abort in idxd_dma_tx_submit() call.
Fixes: 6b4b87f2c3 ("dmaengine: idxd: fix submission race window")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162827146072.3459011.10255348500504659810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 9fce3b3a0a upstream.
The list lock is never acquired in interrupt context. Therefore there is no
need to disable interrupts. Remove interrupt flags for lock operations.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162826417450.3454650.3733188117742416238.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit d803c8b9f3 upstream.
The interrupt thread handler currently loops forever to process outstanding
completions. This causes either an "irq X: nobody cared" kernel splat or
the NMI watchdog kicks in due to running too long in the function. The irq
thread handler is expected to run again after exiting if there are
interrupts fired while the thread handler is running. So the handler code
can process all the completed I/O in a single pass and exit without losing
the follow on completed I/O.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162802977005.3084234.11836261157026497585.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit bd2f4ae5e0 upstream.
The block on fault flag is not cleared when we disable or reset wq. This
causes it to remain set if the user does not clear it on the next
configuration load. Add clear of flag in dxd_wq_disable_cleanup()
routine.
Fixes: da32b28c95 ("dmaengine: idxd: cleanup workqueue config after disabling")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162803023553.3086015.8158952172068868803.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 81c2f79c21 upstream.
The device general capability has a bit that indicate whether 'block on
fault' is supported. Add check to wq sysfs knob to check if cap exists
before allowing user to toggle.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162802992615.3084999.12539468940404102898.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit e9c5b0b53c upstream.
'device_driver_attach()' dereferences its first argument (i.e. 'alt_drv')
so it must not be NULL.
Simplify the error handling logic about NULL 'alt_drv' in order to be
more robust and future-proof.
Fixes: 568b212646 ("dmaengine: idxd: fix uninit var for alt_drv")
Fixes: 6e7f3ee97b ("dmaengine: idxd: move dsa_drv support to compatible mode")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/77f0dc4f3966591d1f0cffb614a94085f8895a85.1627560174.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit b60bb6e2bf upstream.
Coverity static analysis of linux-next found issue.
The check (status == IDXD_COMP_DESC_ABORT) is always false since status
was previously masked with 0x7f and IDXD_COMP_DESC_ABORT is 0xff.
Fixes: 6b4b87f2c3 ("dmaengine: idxd: fix submission race window")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162698465160.3560828.18173186265683415384.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 125d10373a upstream.
Enabling device and wq returns standard errno and that does not provide
enough details to indicate what exactly failed. The hardware command status
is only 8bits. Expand the command status to 32bits and use the upper 16
bits to define software errors to provide more details on the exact
failure. Bit 31 will be used to indicate the error is software set as the
driver is using some of the spec defined hardware error as well.
Cc: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373579.1968485.5891788397526827892.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit a9c171527a upstream.
The device submission portal is on a 4k page and any of those 64bit aligned
address on the page can be used for descriptor submission. By rotating the
offset through the 4k range and prevent successive writes to the same MMIO
address, performance improvement is observed through testing.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681372446.1968485.10634280461681015569.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 673d812d30 upstream.
The sbitmap wait and allocate routine checks the index that is returned
from sbitmap_queue_get(). It should be idxd >= 0 as 0 is also a valid
index. This fixes issue where submission path hangs when WQ size is 1.
Fixes: 0705107fcc ("dmaengine: idxd: move submission to sbitmap_queue")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162697645067.3478714.506720687816951762.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 568b212646 upstream.
0-day detected uninitialized alt_drv variable in the bind_store() function.
The branch can be taken when device is not idxd device or wq 'struct
device'. Init alt_drv to NULL.
Fixes: 6e7f3ee97b ("dmaengine: idxd: move dsa_drv support to compatible mode")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162689250332.2114335.636367120454420852.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit ade8a86b51 upstream.
Set GRPCFG traffic class to value of 1 for best performance on current
generation of accelerators. Also add override option to allow experimentation.
Sysfs knobs are disabled for DSA/IAX gen1 devices.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373005.1968485.3761065664382799202.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>