OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Ming Lei	c7e2d94b3d	blk-mq: free hw queue's resource in hctx's release handler Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit `45a9c9d909` ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before `45a9c9d909`, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: `8dc765d438` ("SCSI: fix queue cleanup race before queue initialization is done") `c2856ae2f3` ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reported-by: James Smart <james.smart@broadcom.com> Fixes: `45a9c9d909` ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-04 07:24:05 -06:00
Ming Lei	fbc2a15e34	blk-mq: move cancel of requeue_work into blk_mq_release With holding queue's kobject refcount, it is safe for driver to schedule requeue. However, blk_mq_kick_requeue_list() may be called after blk_sync_queue() is done because of concurrent requeue activities, then requeue work may not be completed when freeing queue, and kernel oops is triggered. So moving the cancel of requeue_work into blk_mq_release() for avoiding race between requeue and freeing queue. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-04 07:24:04 -06:00
Ming Lei	e87eb301be	blk-mq: grab .q_usage_counter when queuing request from plug code path Just like aio/io_uring, we need to grab 2 refcount for queuing one request, one is for submission, another is for completion. If the request isn't queued from plug code path, the refcount grabbed in generic_make_request() serves for submission. In theroy, this refcount should have been released after the sumission(async run queue) is done. blk_freeze_queue() works with blk_sync_queue() together for avoiding race between cleanup queue and IO submission, given async run queue activities are canceled because hctx->run_work is scheduled with the refcount held, so it is fine to not hold the refcount when running the run queue work function for dispatch IO. However, if request is staggered into plug list, and finally queued from plug code path, the refcount in submission side is actually missed. And we may start to run queue after queue is removed because the queue's kobject refcount isn't guaranteed to be grabbed in flushing plug list context, then kernel oops is triggered, see the following race: blk_mq_flush_plug_list(): blk_mq_sched_insert_requests() insert requests to sw queue or scheduler queue blk_mq_run_hw_queue Because of concurrent run queue, all requests inserted above may be completed before calling the above blk_mq_run_hw_queue. Then queue can be freed during the above blk_mq_run_hw_queue(). Fixes the issue by grab .q_usage_counter before calling blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is safe because the queue is absolutely alive before inserting request. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-04 07:24:02 -06:00
Linus Torvalds	aa1be08f52	* PPC and ARM bugfixes from submaintainers * Fix old Windows versions on AMD (recent regression) * Fix old Linux versions on processors without EPT * Fixes for LAPIC timer optimizations -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlzMc18UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNE0ggAj4c9FVC5aFeiBAj1YIcDijT3UtmG AjhoESE61rZI3PkZ5vcj2GC8eS7sKxExpCrQLsB5rLCF+7X90+tW155BHTHGU0ey ZgfGj23vlbZpvwZ4B5ujQ/Lmpry76pmy8EYekQogPP/eJxOB3oMk06tjh1mfSdIn D4Gj8jvYBB2ygAfmW91+YLLZos56id0N+Hyn/s95w4I1o6hKlkdpTOURAJKSGTb1 2t0+XADUt4ZwPM6+2X/eOBMGpeZP0/eR7H3kdyPy3ydm0sFjMiAAs0NbNp3eblB6 oqnytnGUPt8EEoq+wdZahLTbgJst2Ds++XAvVdBZED7zwGaBSETfg03eCg== =YP4M -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: - PPC and ARM bugfixes from submaintainers - Fix old Windows versions on AMD (recent regression) - Fix old Linux versions on processors without EPT - Fixes for LAPIC timer optimizations * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: nVMX: Fix size checks in vmx_set_nested_state KVM: selftests: make hyperv_cpuid test pass on AMD KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE KVM: x86: Whitelist port 0x7e for pre-incrementing %rip Documentation: kvm: fix dirty log ioctl arch lists KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP KVM: arm/arm64: Ensure vcpu target is unset on reset failure KVM: lapic: Convert guest TSC to host time domain if necessary KVM: lapic: Allow user to disable adaptive tuning of timer advancement KVM: lapic: Track lapic timer advance per vCPU KVM: lapic: Disable timer advancement if adaptive tuning goes haywire x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012 KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short KVM: PPC: Book3S: Protect memslots while validating user address KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs ...	2019-05-03 16:49:46 -07:00
John David Anglin	11c03dc85f	parisc: Update huge TLB page support to use per-pagetable spinlock This patch updates the parisc huge TLB page support to use per-pagetable spinlocks. This patch requires Mikulas' per-pagetable spinlock patch and the revised TLB serialization patch from Helge and myself. With Mikulas' patch, we need to use the per-pagetable spinlock for page table updates. The TLB lock is only used to serialize TLB flushes on machines with the Merced bus. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:41 +02:00
Mikulas Patocka	b37d1c1898	parisc: Use per-pagetable spinlock PA-RISC uses a global spinlock to protect pagetable updates in the TLB fault handlers. When multiple cores are taking TLB faults simultaneously, the cache line containing the spinlock becomes a bottleneck. This patch embeds the spinlock in the top level page directory, so that every process has its own lock. It improves performance by 30% when doing parallel compilations. At least on the N class systems, only one PxTLB inter processor broadcast can be active at any one time on the Merced bus. If a Merced bus is found, this patch serializes the TLB flushes with the pa_tlb_flush_lock spinlock. v1: Initial patch by Mikulas v2: Added Merced detection by Helge v3: Revised TLB serialization by Dave & Helge Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:41 +02:00
Helge Deller	d19a12906e	parisc: Allow live-patching of __meminit functions When making the text sections writeable with set_kernel_text_rw(1), include all text sections including those in the __init section. Otherwise functions marked with __meminit will stay read-only. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # 4.20+	2019-05-03 23:47:40 +02:00
Helge Deller	2d94a832e2	parisc: Add memory barrier to asm pdc and sync instructions Add compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Cc: stable@vger.kernel.org # v4.20+ Fixes: `3847dab774` ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
John David Anglin	44224bdb99	parisc: Add memory clobber to TLB purges The pdtlb and pitlb instructions are strongly ordered. The asms invoking these instructions should be compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Signed-off-by: John David Anglin <dave.anglin@bell.net> CC: stable@vger.kernel.org # v4.20+ Fixes: `3847dab774` ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
John David Anglin	9e5c602186	parisc: Use ldcw instruction for SMP spinlock release barrier There are only a couple of instructions that can function as a memory barrier on parisc. Currently, we use the sync instruction as a memory barrier when releasing a spinlock. However, the ldcw instruction is a better barrier when we have a handy memory location since it operates in the cache on coherent machines. This patch updates the spinlock release code to use ldcw. I also changed the "stw,ma" instructions to "stw" instructions as it is not an adequate barrier. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
John David Anglin	6c63ef8001	parisc: Remove lock code to serialize TLB operations in pacache.S TLB operations only need to be serialized on machines with the Merced (Stretch) bus. The only machines in this category are L and N class, and they require a 64-bit PA 2.0 kernel. On these machines, we use local TLB purges in the tmpalias routines. We don't need to serialize TLB purges on all other machines. Thus, the lock/unlock code can be removed when CONFIG_PA20 is not defined. Further, when CONFIG_PA20 is not defined, alternative patching converts the TLB purges to local purges when PA 2.0 hardware has been detected. Signed-off-by: John David Anglin <dave.anglin@bell.net> Tested-By: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
Helge Deller	dbdf076099	parisc: Switch from DISCONTIGMEM to SPARSEMEM The commit `1c30844d2d` ("mm: reclaim small amounts of memory when an external fragmentation event occurs") breaks memory management on a parisc c8000 workstation with this memory layout: 0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB 1) Start 0x0000000100000000 End 0x00000001bfdfffff Size 3070 MB 2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB With the patch `1c30844d2d`, the kernel will incorrectly reclaim the first zone when it fills up, ignoring the fact that there are two completely free zones. Basiscally, it limits cache size to 1GiB. The parisc kernel is currently using the DISCONTIGMEM implementation, but isn't NUMA. Avoid this issue or strange work-arounds by switching to the more commonly used SPARSEMEM implementation. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: `1c30844d2d` ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
Sven Schnelle	6b1370ae39	parisc: enable wide mode early The idle task might have been allocated above 4GB. With the current code we cannot access that memory because the CPU is still running in narrow mode. This was found on a J5000 machine and the patch is required to enable SPARSEMEM on that machine. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
Sven Schnelle	75da60ff53	parisc: update feature lists Update lists to reflect that we have now KGDB and kretprobes support. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:40 +02:00
Helge Deller	0e4db23e12	parisc: Show n/a if product number not available Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	ea5a8c620f	parisc: remove unused flags parameter in __patch_text() It's not used by patch_map()/patch_unmap(), so lets remove it. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	376e5fd7ec	doc: update kprobes supported architecture list Now that kprobes and kretprobes are implemented, update the list in Documentation to reflect that. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	e0b59b7b63	parisc: Implement kretprobes Implement kretprobes on parisc, parts stolen from powerpc. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	1253d18d2d	parisc: remove kprobes.h from generic-y We're providing our own version now. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	8858ac8e9e	parisc: Implement kprobes Implement kprobes support for PA-RISC. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Sven Schnelle	ea1afe339a	parisc: add functions required by KPROBE_EVENTS implement regs_get_register(), regs_get_kernel_stack_nth() and regs_within_kernel_stack() Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Helge Deller	82d96bf68e	parisc: PA-Linux requires at least 32 MB RAM Even a 32-bit kernel requires at least 27 MB to decompress itself, so halt the system with a message if the system has less memory than 32 MB. Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:39 +02:00
Helge Deller	b438749044	parisc: Skip registering LED when running in QEMU No need to spend CPU cycles when we run on QEMU. Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+	2019-05-03 23:47:39 +02:00
Helge Deller	f30bfa6d29	parisc: Tune LASI LAN for QEMU Do not loose cycles when we run on QEMU, and fix one trivial typo. Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:38 +02:00
Helge Deller	3e1120f4b5	parisc: Export running_on_qemu symbol for modules Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+	2019-05-03 23:47:38 +02:00
Sven Schnelle	eacbfce19d	parisc: add KGDB support This patch add KGDB support to PA-RISC. It also implements single-stepping utilizing the recovery counter. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:38 +02:00
Sven Schnelle	620a53d522	parisc: add parisc code patching Instead of re-mapping the whole kernel text with RWX rights add a patch_text() which can be used to replace instructions in the kernel .text section. Based on the ARM implementation. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:38 +02:00
Sven Schnelle	ccfbc68d41	parisc: add set_fixmap()/clear_fixmap() These functions will be used for adding code patching functions later. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:38 +02:00
Alexandre Ghiti	17d9822d4b	parisc: Consider stack randomization for mmap base only when necessary Do not offset mmap base address because of stack randomization if current task does not want randomization. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Helge Deller <deller@gmx.de>	2019-05-03 23:47:38 +02:00
Iker Perez del Palomar Sustatxa	39abe9d88b	hwmon: (lm75) Add support for TMP75B The TMP75B has a different control register, supports 12-bit resolution and the default conversion rate is 37 Hz. Signed-off-by: Iker Perez del Palomar Sustatxa <iker.perez@codethink.co.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2019-05-03 13:16:18 -07:00
Iker Perez del Palomar Sustatxa	be889be778	dt-bindings: hwmon: Add tmp75b to lm75.txt Update the LM75's devicetree definition to allow Texas Instruments TMP75B be probed. Signed-off-by: Iker Perez del Palomar Sustatxa <iker.perez@codethink.co.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2019-05-03 13:13:15 -07:00
Petr Mladek	f68d67cf2f	livepatch: Remove duplicated code for early initialization kobject_init() call added one more operation that has to be done when doing the early initialization of both static and dynamic livepatch structures. It would have been easier when the early initialization code was not duplicated. Let's deduplicate it for future generations of livepatching hackers. The patch does not change the existing behavior. Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2019-05-03 21:11:23 +02:00
Petr Mladek	4d141ab341	livepatch: Remove custom kobject state handling kobject_init() always succeeds and sets the reference count to 1. It allows to always free the structures via kobject_put() and the related release callback. Note that the custom kobject state handling was used only because we did not know that kobject_put() can and actually should get called even when kobject_init_and_add() fails. The patch should not change the existing behavior. Suggested-by: "Tobin C. Harding" <tobin@kernel.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2019-05-03 21:11:22 +02:00
Linus Torvalds	82463436a7	Merge branch 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C driver bugfixes and a MAINTAINERS update for you" * 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: Prevent runtime suspend of adapter when Host Notify is required i2c: synquacer: fix enumeration of slave devices MAINTAINERS: friendly takeover of i2c-gpio driver i2c: designware: ratelimit 'transfer when suspended' errors i2c: imx: correct the method of getting private data in notifier_call	2019-05-03 11:42:01 -07:00
Nicholas Piggin	08ae95f4fd	nohz_full: Allow the boot CPU to be nohz_full Allow the boot CPU/CPU0 to be nohz_full. Have the boot CPU take the do_timer duty during boot until a housekeeping CPU can take over. This is supported when CONFIG_PM_SLEEP_SMP is not configured, or when it is configured and the arch allows suspend on non-zero CPUs. nohz_full has been trialed at a large supercomputer site and found to significantly reduce jitter. In order to deploy it in production, they need CPU0 to be nohz_full because their job control system requires the application CPUs to start from 0, and the housekeeping CPUs are placed higher. An equivalent job scheduling that uses CPU0 for housekeeping could be achieved by modifying their system, but it is preferable if nohz_full can support their environment without modification. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-6-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-05-03 19:42:58 +02:00
Nicholas Piggin	9219565aa8	sched/isolation: Require a present CPU in housekeeping mask During housekeeping mask setup, currently a possible CPU is required. That does not guarantee the CPU would be available at boot time, so check to ensure that at least one present CPU is in the mask. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-5-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-05-03 19:42:58 +02:00
Nicholas Piggin	9ca12ac04b	kernel/cpu: Allow non-zero CPU to be primary for suspend / kexec freeze This patch provides an arch option, ARCH_SUSPEND_NONZERO_CPU, to opt-in to allowing suspend to occur on one of the housekeeping CPUs rather than hardcoded CPU0. This will allow CPU0 to be a nohz_full CPU with a later change. It may be possible for platforms with hardware/firmware restrictions on suspend/wake effectively support this by handing off the final stage to CPU0 when kernel housekeeping is no longer required. Another option is to make housekeeping / nohz_full mask dynamic at runtime, but the complexity could not be justified at this time. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-4-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-05-03 19:42:58 +02:00
Nicholas Piggin	2f1a6fbbef	power/suspend: Add function to disable secondaries for suspend This adds a function to disable secondary CPUs for suspend that are not necessarily non-zero / non-boot CPUs. Platforms will be able to use this to suspend using non-zero CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20190411033448.20842-3-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-05-03 19:42:41 +02:00
Alexander Shishkin	aad14ad3cf	intel_th: msu: Add current window tracking Now that we have a way to switch between MSC buffer windows, add code to track the current window. The hardware register NWSA that contains the address of the next window is unfortunately not always usable, and since the driver has full control of the window switching, there is no reason not to keep this on the software side. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:17:40 +02:00
Alexander Shishkin	6cac7866c2	intel_th: msu: Add a sysfs attribute to trigger window switch Now that we have the means to trigger a window switch for the MSU trace store, add a sysfs file to allow triggering it from userspace. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:17:40 +02:00
Alexander Shishkin	4840572d3d	intel_th: msu: Correct the block wrap detection In multi window mode the MSU will set "window wrap" bit to indicate block wrapping as well. Take this into account when checking data blocks. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:16:21 +02:00
Alexander Shishkin	8116db57cf	intel_th: Add switch triggering support Add support for asserting window switch trigger when tracing to MSU output ports. This allows for software controlled switching between windows of the MSU buffer, which can be used for double buffering while exporting the trace data further from the MSU. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:16:21 +02:00
Alexander Shishkin	9958e02523	intel_th: gth: Factor out trace start/stop The trace enable/disable functions of the GTH include the code that starts and stops trace flom from the sources. This start/stop functionality will also be used in the window switch trigger sequence. Factor out start/stop code from the larger trace enable/disable code in preparation for the window switch sequence. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:16:21 +02:00
Alexander Shishkin	8d4155126e	intel_th: msu: Factor out pipeline draining The code that waits for the pipeline empty condition of the MSU is currently called in the path that disables the trace. We will also need this in the window switch trigger sequence. Therefore, factor out this code and make it accessible to the GTH device. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:16:21 +02:00
Alexander Shishkin	ba39bd8306	intel_th: msu: Switch over to scatterlist Instead of using a home-grown array of pointers to the DMA pages, switch over to scatterlist data types and accessors, which has all the convenient accessors, can be used to batch-map DMA memory and is convenient for passing around between different layers, which will be useful when MSU buffer management has to cross the boundaries of the MSU driver. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:30 +02:00
Alexander Shishkin	0de9e0351d	intel_th: msu: Replace open-coded list_{first,last,next}_entry variants There are a few places in the code where open-coded versions of list entry accessors list_first_entry()/list_last_entry()/list_next_entry() are used. Replace those with the standard macros. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:30 +02:00
Alexander Shishkin	4c5bb6eb40	intel_th: Only report useful IRQs to subdevices The only type of IRQ triggering event that is useful to us at the moment is the "last block" interrupt of the MSU. This interrupt can only be enabled via "MINTCTL" register that doesn't exist in earlier version of the Intel TH. Enumerate the presence of MINTCTL via per-device driver data structure and only instantiate the IRQ resource for subdevices if this capability is present. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:30 +02:00
Alexander Shishkin	aac8da6517	intel_th: msu: Start handling IRQs We intend to use the interrupt to detect Last Block condition in the MSU driver, which we can use for double-buffering software-managed data transfers. Add an interrupt handler to the MSU driver. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:30 +02:00
Alexander Shishkin	7b7036d47c	intel_th: pci: Use MSI interrupt signalling Since Intel TH is capable of MSI interrupt signalling, make use of it. The way it works is, each of the 7 interrupt triggering events has its own vector in this mode, as opposed to interrupt line delivery, where all events are signalled via the same line. Failing to enable MSI, the driver falls back to using an interrupt line. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:30 +02:00
Alexander Shishkin	62a593022c	intel_th: Communicate IRQ via resource Currently, the IRQ is passed between the glue layers and the core as a separate argument, while the MMIO resources are passed as resources. This also limits the number of IRQs thus used to one, while the current versions of Intel TH use a different MSI vector for each interrupt triggering event, of which there are 7. Change this to pass IRQ in the resources array. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-03 18:14:29 +02:00

... 2 3 4 5 6 ...

830194 Commits All Branches Search

830194 Commits

All Branches