When we have an SHPC-capable bridge with a second SHPC-capable bridge
below it, pushing the upstream bridge's attention button causes a
deadlock.
The deadlock happens because we use the shpchp_wq workqueue to run
shpchp_pushbutton_thread(), which uses shpchp_disable_slot() to remove
devices below the upstream bridge. When we remove the downstream bridge,
we call shpc_remove(), the shpchp driver's .remove() method. That calls
flush_workqueue(shpchp_wq), which deadlocks because the
shpchp_pushbutton_thread() work item is still running.
This patch avoids the deadlock by creating a workqueue for every slot
and removing the single shared workqueue.
Here's the call path that leads to the deadlock:
shpchp_queue_pushbutton_work
queue_work(shpchp_wq) # shpchp_pushbutton_thread
...
shpchp_pushbutton_thread
shpchp_disable_slot
remove_board
shpchp_unconfigure_device
pci_stop_and_remove_bus_device
...
shpc_remove # shpchp driver .remove method
hpc_release_ctlr
cleanup_slots
flush_workqueue(shpchp_wq)
This change is based on code inspection, since we don't have hardware
with this topology.
Based-on-patch-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in shpchp as a result.
486b10b9f4 ("PCI: pciehp: Handle push button event asynchronously") made
the same change to pciehp. I split this out from a patch by Yijing Wang
<wangyijing@huawei.com> so we fix one thing at a time and to make the
shpchp history correspond more closely with the pciehp history.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
"slots_not_empty" is initialized to zero and can't be set again before
reaching this point, so this return statement is dead. Remove it.
Found by Coverity (CID 114324).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* Rename shpchp_wq to shpchp_ordered_wq and add non-ordered shpchp_wq
which is used instead of the system workqueue. This is to remove
the use of flush_scheduled_work() which is deprecated and scheduled
for removal.
* With cmwq in place, there's no point in creating workqueues lazily.
Create both shpchp_wq and shpchp_ordered_wq upfront.
* Include workqueue.h from shpchp.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Stanse found a cut&pasted memory leak in pciehp_queue_pushbutton_work
and shpchp_queue_pushbutton_work. info is not freed/assigned on all
paths. Fix that.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move the max_bus_speed and cur_bus_speed into the pci_bus. Expose the
values through the PCI slot driver instead of the hotplug slot driver.
Update all the hotplug drivers to use the pci_bus instead of their own
data structures.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch refines messages in shpchp module. The main changes are as
follows:
- remove the trailing "."
- remove __func__ as much as possible
- capitalize the first letter of messages
- show PCI device address including its domain
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We do not need to manage our own name parameter, especially since
the PCI core can change it on our behalf, in the case of duplicate
slot names.
Remove 'name' from shpchp's version of struct slot.
This change also removes the unused struct task_event from the
slot structure.
Cc: kristen.c.accardi@intel.com
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch deletes trailing white space in SHPCHP driver. This has no
functional change.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The struct php_ctlr seems to be only for complicating codes. This
patch removes struct php_ctlr and related codes.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current SHPCHP driver shows device number of slots in info messages,
but it is useless and should be replaced with slot name.
This patch replaces the device number shown in the info messages with
the slot name.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Kristen Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The code related to handling bus speed in SHPCHP driver is
unnecessarily complex. This patch cleans up and simplify that.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current SHPCHP driver doesn't care about the confliction between
hotplug operation via sysfs and hotplug operation via attention
button. So if those ware conflicted, slot could be an unexpected
state.
This patch changes SHPCHP driver to handle slot state properly. With
this patch, slot events are handled according to the current slot
state as shown at the Table below.
Table. Slot States and Event Handling
=========================================================================
Slot State Event and Action
=========================================================================
STATIC - Go to POWERON state if user initiates
(Slot enabled, insertion request via sysfs
Slot disabled) - Go to POWEROFF state if user initiates removal
request via sysfs
- Go to BLINKINGON state if user presses
attention button when the slot is disabled
- Go to BLINKINGOFF state if user presses
attention button when the slot is enabled
The event handler of SHPCHP driver is unnecessarily very complex. In
addition, current event handler can only a fixed number of events at
the same time, and some of events would be lost if several number of
events happened at the same time.
This patch simplify the event handler by using 'work queue', and it
also fix the above-mentioned issue.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The wait_for_ctrl_irq() function in SHPCHP driver is no longer needed.
This patch removes that. This patch has no functional change.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current shpchp driver might cause system panic because of lack of
serialization. It can be reproduced very easily by the following
operation.
# cd /sys/bus/pci/slots/<slot#>
# while true; do echo 0 > power ; echo 1 > power ; done &
# while true; do echo 0 > power ; echo 1 > power ; done &
This patch fixes this issue by changing shpchp to get appropreate
semaphore for hot-plug operation.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch cleanups codes that check the command status. For this, it
introduces a new semaphore "cmd_sem" for each controller.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
semaphore to mutex conversion.
the conversion was generated via scripts, and the result was validated
automatically via a script as well.
build tested with allyesconfig.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes the AMD POGO errata on the hotplug controller where the
platform will lock up or reboot if PERR/SERR generation is enabled and a
slot is sent an enable command. This fix disables PERR/SERR generation
before a slot is sent the enable command by first saving related
registers, turning off SERR/PERR generation, enabling the slot, then
restoring the registers.
Signed-off-by: David Keck <david.keck@amd.com>
Cc: Kristen Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current SHPCHP driver uses msleep_interruptible() function to wait for
a command completion event. But I think this would cause an unnecessary
long wait until timeout, if command completion interrupt came before
task state was changed to TASK_INTERRUPTIBLE. This patch fixes this
issue. With this patch, command completion becomes faster as follows:
o Without this patch
# time echo 1 > power
real 0m4.708s
user 0m0.000s
sys 0m0.524s
o With this patch
# time echo 1 > power
real 0m2.221s
user 0m0.000s
sys 0m0.532s
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the number of debug messages generated if shpchp debug is
enabled. I tried to restrict this to removing debug messages that
are either early-driver-debug type messages, or print information
that can be inferred through other debug prints.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove un-necessary header includes, remove dead code, remove
some type casts, receive function return in the correct data
type...
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
State information is currently stored in per-slot as well as
per-pci-function data structures in shpchp. There's a lot of
overlap in the information kept, and some of it is never used.
This patch consolidates the state information to per-slot and
eliminates unused data structures. The biggest change is to
eliminate the pci_func structure and the code around managing
its lists.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch eliminates saving the PCI config header for devices
in hotplug capable slots. We now use the PCI core to get the
specific parts of the config header as required.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the SHPC hotplug driver's dependence on ACPI. We don't
walk the acpi namespace anymore to build a list of bridges and
devices. The remaining interaction with ACPI is to run the
_OSHP method to transition control of hotplug hardware from
system BIOS to the shpc hotplug driver, and to run the _HPP
method to get hotplug device parameters like cache line size,
latency timer and SERR/PERR enable from BIOS.
Note that one of the side effects of this patch is that shpchp
does not enable the hot-added device or its DMA bus mastering
automatically now. It expects the device driver to do that.
This may break some drivers and we will have to fix them as
they are reported.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch converts the standard hotplug controller driver to use
the PCI core for resource management. This eliminates a whole lot
of duplicated code, and integrates shpchp in the system's normal
PCI handling code.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
PCI_ROM_ADDRESS is a 32 bit register and as such should be accessed using
pci_bus_{read,write}_config_dword(). A recent audit of drivers/ turned up
several cases of byte- and word-sized accesses. The harmful ones were fixed
by Linus directly. This patches up one of the remaining
harmless-but-still-wrong cases caught in the dragnet.
Signed-off-by: Adam Kropelin <akropel1@rochester.rr.com>
Cc: <kristen.c.accardi@intel.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is a patch to fix the problem of echoing 1 to "power" file
to enabled slot causing the slot to power down, and echoing 0
to disabled slot causing shpchp_disabled_slot() to be called
twice. This problem was reported by kenji Kaneshige.
Thanks,
Dely
Signed-off-by: Dely Sy <dely.l.sy@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!