OpenCloudOS-Kernel/Documentation
Oza Pawandeep 7e9084b367 PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices
PCIe ERR_FATAL errors mean the Link is unreliable.  Components on the Link
may need to be reset to return to reliable operation (PCIe r4.0, sec
6.2.2).  We previously handled these errors much differently depending on
whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0,
sec 6.2.10) or not.

The AER driver has historically logged the error details, called
driver-supplied pci_error_handlers callbacks, and reset the Link.  This
reset downstream devices, but did not remove them from the PCI subsystem,
re-enumerate them, or call their driver .remove() or .probe() methods.

DPC is different because the hardware automatically disables the Link when
it detects ERR_FATAL, which resets downstream devices.  There's no
opportunity for pci_error_handlers callbacks before resetting the Link.
The DPC driver removes affected devices (which calls their driver .remove()
methods), brings the Link back up, and re-enumerates (which calls driver
.probe() methods).

Align AER ERR_FATAL handling with DPC by resetting the Link in software,
skipping the driver pci_error_handlers callbacks, removing the devices from
the PCI subsystem, and re-enumerating.  The idea is that drivers and
devices should see the same behavior for ERR_FATAL events, regardless of
whether they're handled by AER or DPC.

Here are the basic ERR_FATAL recovery steps, showing the previous AER
behavior, the AER behavior after this patch, and the DPC behavior:

                          AER        AER      DPC
                          previous   new      behavior
                          --------   ---      --------
  Log error               yes        yes      yes (minimal)
  drv.error_detected()    yes        no       no
  Reset Link              yes        yes      yes
  drv.mmio_enabled()      yes        no       no
  drv.slot_reset()        yes        no       no
  drv.resume()            yes        no       no
  Remove PCI devices      no         yes      yes
    (calls drv.remove())
  Re-enumerate            no         yes      yes
    (calls drv.probe())

N.B. With DPC, the Link reset happens before the driver .remove() calls,
while with AER, the reset happens *after* the .remove() calls.  The goal is
to eventually do the reset before .remove() for AER as well.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, squash doc patch into this, remove unused
"result_data"]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-17 16:44:13 -05:00
..
ABI RTC for 4.17 2018-04-10 10:22:27 -07:00
EDID
PCI PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices 2018-05-17 16:44:13 -05:00
RCU Merge branches 'cond_resched.2017.12.04a', 'dyntick.2017.11.28a', 'fixes.2017.12.11a', 'srbd.2017.12.05a' and 'torture.2017.12.11a' into HEAD 2017-12-11 09:21:58 -08:00
accelerators ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL 2018-03-02 13:02:15 +11:00
accounting
acpi ACPI / LPIT: Add Low Power Idle Table (LPIT) support 2017-10-11 15:38:10 +02:00
admin-guide ARM: 2018-04-09 11:42:31 -07:00
aoe
arm MTD changes: 2018-04-06 12:15:41 -07:00
arm64 ARM: 2018-04-09 11:42:31 -07:00
auxdisplay
backlight
block block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP 2017-11-14 20:13:33 -07:00
blockdev SCSI misc on 20170907 2017-09-07 21:11:05 -07:00
bpf bpf, doc: add description wrt native/bpf clang target and pointer size 2018-03-20 15:47:45 -07:00
bus-devices
cdrom Documentation/cdrom: fix German sharp s in LaTex 2018-03-08 19:35:29 -07:00
cgroup-v1 page cache: use xa_lock 2018-04-11 10:28:39 -07:00
cma
connector
console
core-api Documentation: Mention why %p prints ptrval 2018-03-23 12:42:18 -06:00
cpu-freq cpufreq: Drop cpufreq_table_validate_and_show() 2018-04-10 08:40:45 +02:00
cpuidle cpuidle: Add definition of residency to sysfs documentation 2018-04-09 13:44:37 +02:00
crypto crypto: doc - clarify hash callbacks state machine 2018-03-31 01:33:02 +08:00
dev-tools There's been a fair amount of activity in Documentation/ this time around: 2018-04-03 13:35:51 -07:00
device-mapper dm verity: add 'check_at_most_once' option to only validate hashes once 2018-04-03 15:04:29 -04:00
devicetree The large diff this time around is from the addition of a new clk driver 2018-04-13 15:51:06 -07:00
doc-guide doc-guide: kernel-doc: add comment about formatting verification 2018-02-23 08:06:15 -07:00
driver-api MTD changes: 2018-04-06 12:15:41 -07:00
driver-model serdev: Introduce devm_serdev_device_open() 2018-01-08 10:08:34 +00:00
early-userspace
extcon
fault-injection Documentation: nvme: Documentation for nvme fault injection 2018-03-26 08:53:43 -06:00
fb documentation: fb: update list of available compiled-in fonts 2017-11-08 03:39:52 -07:00
features arch: remove obsolete architecture ports 2018-04-02 20:20:12 -07:00
filesystems Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2018-04-13 16:55:41 -07:00
firmware_class
fmc
fpga fpga: mgr: separate getting/locking FPGA manager 2017-11-28 16:30:37 +01:00
gpio Documentation: gpio: Move drivers-on-gpio.txt to driver-api 2018-03-23 04:22:29 +01:00
gpu Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
hid Documentation: fix input related doc refs 2017-10-12 11:14:06 -06:00
hwmon hwmon: (lm92) Add max6635 to lm92_id[] 2018-03-22 09:33:24 -07:00
i2c i2c: i801: Add missing documentation entries for Braswell and Kaby Lake 2018-02-21 09:17:20 +01:00
ia64 ia64: doc: tweak whitespace for 'console=' parameter 2018-03-05 14:41:38 -08:00
ide
iio iio: adc: New driver for Cirrus Logic EP93xx ADC 2017-07-25 19:56:23 +01:00
infiniband Documentation/ABI: update infiniband sysfs interfaces 2018-02-23 08:18:33 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2018-04-05 13:21:57 -07:00
ioctl arch: remove tile port 2018-03-16 10:56:03 +01:00
isdn Documentation/isdn: check and fix dead links ... 2018-03-26 12:31:13 -04:00
kbuild Kconfig updates for v4.17 2018-04-03 16:28:01 -07:00
kdump kexec/kdump: minor Documentation updates for arm64 and Image 2017-07-12 16:26:00 -07:00
kernel-hacking Documentation: Fix misconversion of #if 2018-01-17 16:45:01 -07:00
laptops Documentation: fix admin-guide doc refs 2017-10-12 11:13:28 -06:00
leds Documentation: leds: Update 00-INDEX file 2017-10-23 20:17:03 +02:00
lightnvm
livepatch livepatch: Remove immediate feature 2018-01-11 10:58:03 +01:00
locking Linux 4.16-rc2 2018-02-21 09:57:55 +01:00
m68k
maintainer docs: Add an intro note to the maintainers handbook 2017-12-11 14:46:10 -07:00
md raid5-ppl: PPL support for disks with write-back cache enabled 2018-01-15 14:29:42 -08:00
media media: extended-controls.rst: transmitter -> receiver 2018-04-04 08:05:20 -04:00
memory-devices
mic
mips Documentation: mips: Update AU1xxx_IDE Kconfig dependencies 2018-02-01 12:45:35 -07:00
misc-devices
mmc
mtd mtd: spi-nor: add an API to restore the status of SPI flash chip 2017-12-13 00:36:00 +01:00
namespaces
netlabel
networking Staging/IIO patches for 4.17-rc1 2018-04-04 18:56:27 -07:00
nfc
nios2
nvdimm
nvmem NVMEM documentation fix: A minor typo 2017-08-24 13:31:58 -06:00
openrisc Documentation: openrisc: Updates to README 2017-10-30 21:37:53 +09:00
parisc
pcmcia
perf drivers/bus: Move Arm CCN PMU driver 2018-03-06 17:26:15 +01:00
phy
platform
power PM / hibernate: Make passing hibernate offsets more friendly 2018-03-30 12:01:20 +02:00
powerpc powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
pps drivers/pps: aesthetic tweaks to PPS-related content 2017-09-08 18:26:51 -07:00
process Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-04-15 16:12:35 -07:00
pti
ptp ptp: Fix documentation to match code. 2018-03-26 12:13:21 -04:00
rapidio Documentation: rapidio: move sysfs interface to ABI 2018-02-23 08:25:45 -07:00
s390 vfio-ccw: update documentation 2018-03-01 17:32:14 +01:00
scheduler sched/deadline: Fix the description of runtime accounting in the documentation 2017-11-16 09:00:35 +01:00
scsi scsi: documentation: Obsolete documentation references 2018-03-21 18:34:20 -04:00
security selinux: Update SELinux SCTP documentation 2018-03-20 16:26:15 -04:00
serial tty: n_gsm: do not send/receive in ldisc close path 2017-06-03 18:48:52 +09:00
sh docs-rst: convert sh book to ReST 2017-05-16 08:44:18 -03:00
sound sound updates for 4.15-rc1 2017-11-14 18:01:46 -08:00
sparc sparc64: Add support for ADI (Application Data Integrity) 2018-03-18 07:38:48 -07:00
sphinx Documentation/sphinx: Fix Directive import error 2018-03-09 10:46:14 -07:00
sphinx-static docs RTD theme: code-block with line nos - lines and line numbers don't line up. 2017-07-17 13:48:45 -06:00
spi spi: Document SPI slave controller support 2017-05-26 13:11:00 +01:00
sysctl taint: add taint for randstruct 2018-04-11 10:28:35 -07:00
target
thermal thermal: Add cooling device's statistics in sysfs 2018-04-02 21:49:01 +08:00
timers sched/isolation: Eliminate NO_HZ_FULL_ALL 2018-02-15 15:40:37 -08:00
trace New features: 2018-04-10 11:27:30 -07:00
translations Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2018-02-01 13:36:15 -08:00
usb Documentation updates for 4.16. New stuff includes refcount_t 2018-01-31 19:25:25 -08:00
userspace-api seccomp: Implement SECCOMP_RET_KILL_PROCESS action 2017-08-14 13:46:50 -07:00
virtual KVM: trivial documentation cleanups 2018-03-28 22:47:06 +02:00
vm page cache: use xa_lock 2018-04-11 10:28:39 -07:00
w1 Documentation updates for 4.16. New stuff includes refcount_t 2018-01-31 19:25:25 -08:00
watchdog watchdog: remove bfin_wdt driver 2018-03-26 15:57:04 +02:00
wimax
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-04-15 16:12:35 -07:00
xtensa xtensa: add support for KASAN 2017-12-16 22:37:12 -08:00
.gitignore
00-INDEX CRIS: Drop support for the CRIS port 2018-03-16 10:56:05 +01:00
Changes
CodingStyle
DMA-API-HOWTO.txt DMA-API-HOWTO.txt: standardize document format 2017-07-14 13:51:32 -06:00
DMA-API.txt dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags 2017-09-01 11:59:17 +02:00
DMA-ISA-LPC.txt DMA-ISA-LPC.txt: standardize document format 2017-07-14 13:51:33 -06:00
DMA-attributes.txt DMA-attributes.txt: standardize document format 2017-07-14 13:51:33 -06:00
IPMI.txt ipmi: Make IPMI panic strings always available 2017-09-27 16:03:45 -05:00
IRQ-affinity.txt IRQ-affinity.txt: standardize document format 2017-07-14 13:51:41 -06:00
IRQ-domain.txt irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUG 2018-01-24 12:32:58 +01:00
IRQ.txt IRQ.txt: add a markup for its title 2017-07-14 13:51:42 -06:00
Intel-IOMMU.txt Intel-IOMMU.txt: standardize document format 2017-07-14 13:51:38 -06:00
Makefile Documentation: add script and build target to check for broken file references 2017-10-12 11:07:42 -06:00
SAK.txt SAK.txt: standardize document format 2017-07-14 13:58:04 -06:00
SM501.txt SM501.txt: standardize document format 2017-07-14 13:58:06 -06:00
SubmittingPatches
atomic_bitops.txt locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit() 2018-02-13 14:55:53 +01:00
atomic_t.txt Documentation/locking/atomic: Finish the document... 2017-08-25 11:06:33 +02:00
bcache.txt bcache.txt: standardize document format 2017-07-14 13:51:27 -06:00
bt8xxgpio.txt bt8xxgpio.txt: standardize document format 2017-07-14 13:51:27 -06:00
btmrvl.txt btmrvl.txt: standardize document format 2017-07-14 13:51:27 -06:00
bus-virt-phys-mapping.txt bus-virt-phys-mapping.txt: standardize document format 2017-07-14 13:51:28 -06:00
cachetlb.txt cachetlb.txt: standardize document format 2017-07-14 13:51:28 -06:00
cgroup-v2.txt cgroup, docs: document the root cgroup behavior of cpu and io controllers 2018-01-16 08:07:09 -08:00
circular-buffers.txt doc: De-emphasize smp_read_barrier_depends 2017-12-05 11:57:53 -08:00
clearing-warn-once.txt kernel debug: support resetting WARN*_ONCE 2017-11-17 16:10:00 -08:00
clk.txt Documentation: clk: enable lock is not held for clk_is_enabled API 2018-03-16 15:44:43 -07:00
conf.py docs: Remove "could not extract kernel version" warning 2017-12-11 15:20:04 -07:00
cpu-load.txt cpu-load: standardize document format 2017-07-14 13:51:30 -06:00
cputopology.txt cputopology.txt: standardize document format 2017-07-14 13:51:30 -06:00
crc32.txt crc32.txt: standardize document format 2017-07-14 13:51:30 -06:00
dcdbas.txt dcdbas.txt: standardize document format 2017-07-14 13:51:31 -06:00
debugging-modules.txt
debugging-via-ohci1394.txt debugging-via-ohci1394.txt: standardize document format 2017-07-14 13:51:34 -06:00
dell_rbu.txt dell_rbu.txt: standardize document format 2017-07-14 13:58:12 -06:00
digsig.txt digsig.txt: standardize document format 2017-07-14 13:51:31 -06:00
docutils.conf
dontdiff Remove gperf usage from toolchain 2017-08-19 11:02:53 -07:00
efi-stub.txt efi-stub.txt: standardize document format 2017-07-14 13:51:34 -06:00
eisa.txt eisa.txt: standardize document format 2017-07-14 13:51:34 -06:00
flexible-arrays.txt flexible-arrays.txt: standardize document format 2017-07-14 13:51:35 -06:00
futex-requeue-pi.txt futex-requeue-pi.txt: standardize document format 2017-07-14 13:51:35 -06:00
gcc-plugins.txt gcc-plugins.txt: standardize document format 2017-07-14 13:51:36 -06:00
highuid.txt highuid.txt: standardize document format 2017-07-14 13:51:36 -06:00
hw_random.txt hw_random.txt: standardize document format 2017-07-14 13:51:37 -06:00
hwspinlock.txt hwspinlock.txt: standardize document format 2017-07-14 13:51:37 -06:00
index.rst Documentation: add Linux tracing to Sphinx TOC tree 2018-03-07 10:22:53 -07:00
intel_txt.txt intel_txt.txt: standardize document format 2017-07-14 13:51:38 -06:00
io-mapping.txt io-mapping.txt: standardize document format 2017-07-14 13:51:38 -06:00
io_ordering.txt io_ordering.txt: standardize document format 2017-07-14 13:51:39 -06:00
iostats.txt iostats.txt: update it to cover recent Kernels 2017-07-14 13:51:40 -06:00
irqflags-tracing.txt irqflags-tracing.txt: standardize document format 2017-07-14 13:51:42 -06:00
isa.txt isa.txt: standardize document format 2017-07-14 13:51:43 -06:00
isapnp.txt isapnp.txt: promote title level 2017-07-14 13:51:43 -06:00
kernel-per-CPU-kthreads.txt kernel-per-CPU-kthreads.txt: standardize document format 2017-07-14 13:51:43 -06:00
kobject.txt kobject.txt: standardize document format 2017-07-14 13:51:44 -06:00
kprobes.txt kprobes/docs: Remove jprobes related documents 2017-10-20 11:02:55 +02:00
kref.txt kref.txt: standardize document format 2017-07-14 13:51:45 -06:00
ldm.txt ldm.txt: standardize document format 2017-07-14 13:51:45 -06:00
lockup-watchdogs.txt lockup-watchdogs.txt: standardize document format 2017-07-14 13:51:46 -06:00
logo.gif
logo.txt
lsm.txt docs-rst: convert lsm from DocBook to ReST 2017-05-16 08:44:19 -03:00
lzo.txt lzo.txt: standardize document format 2017-07-14 13:51:46 -06:00
mailbox.txt mailbox.txt: standardize document format 2017-07-14 13:51:47 -06:00
memory-barriers.txt locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more 2018-03-10 10:22:22 +01:00
memory-hotplug.txt memory-hotplug.txt: standardize document format 2017-07-14 13:57:53 -06:00
men-chameleon-bus.txt men-chameleon-bus.txt: standardize document format 2017-07-14 13:57:54 -06:00
nommu-mmap.txt nommu-mmap.txt: don't use all upper case on titles 2017-07-14 13:57:55 -06:00
ntb.txt This series converts a number of top-level documents to the RST format 2017-07-15 12:58:58 -07:00
numastat.txt numastat.txt: standardize document format 2017-07-14 13:57:56 -06:00
padata.txt padata.txt: standardize document format 2017-07-14 13:57:56 -06:00
parport-lowlevel.txt parport-lowlevel.txt: standardize document format 2017-07-14 13:57:57 -06:00
percpu-rw-semaphore.txt percpu-rw-semaphore.txt: standardize document format 2017-07-14 13:57:58 -06:00
phy.txt phy.txt: standardize document format 2017-07-14 13:57:58 -06:00
pi-futex.txt Documentation: fix locking rt-mutex doc refs 2017-10-19 12:56:44 -06:00
pnp.txt pnp.txt: standardize document format 2017-07-14 13:57:59 -06:00
preempt-locking.txt preempt-locking.txt: standardize document format 2017-07-14 13:58:00 -06:00
pwm.txt pwm: Standardize document format 2017-07-06 08:23:30 +02:00
rbtree.txt rbtree: cache leftmost node internally 2017-09-08 18:26:48 -07:00
remoteproc.txt remoteproc.txt: standardize document format 2017-07-14 13:58:02 -06:00
rfkill.txt rfkill.txt: standardize document format 2017-07-14 13:58:02 -06:00
robust-futex-ABI.txt robust-futex-ABI.txt: standardize document format 2017-07-14 13:58:03 -06:00
robust-futexes.txt robust-futexes.txt: standardize document format 2017-07-14 13:58:03 -06:00
rpmsg.txt rpmsg.txt: standardize document format 2017-07-14 13:58:04 -06:00
rtc.txt Documentation: rtc: move iotcl interface documentation to ABI 2018-01-12 00:20:41 +01:00
sgi-ioc4.txt sgi-ioc4.txt: standardize document format 2017-07-14 13:58:05 -06:00
siphash.txt siphash.txt: standardize document format 2017-07-14 13:58:06 -06:00
smsc_ece1099.txt smsc_ece1099.txt: standardize document format 2017-07-14 13:58:07 -06:00
speculation.txt Documentation: Document array_index_nospec 2018-01-30 21:54:28 +01:00
static-keys.txt jump_label: Provide hotplug context variants 2017-08-10 12:28:59 +02:00
svga.txt documentation/svga.txt: update outdated file 2017-11-20 10:45:50 -07:00
switchtec.txt NTB: switchtec_ntb: Update switchtec documentation with notes for NTB 2017-11-18 20:37:13 -05:00
sync_file.txt sync_file.txt: standardize document format 2017-05-24 13:01:27 -03:00
tee.txt tee.txt: standardize document format 2017-07-14 13:58:14 -06:00
this_cpu_ops.txt this_cpu_ops.txt: standardize document format 2017-07-14 13:58:08 -06:00
unaligned-memory-access.txt unaligned-memory-access.txt: standardize document format 2017-07-14 13:58:09 -06:00
vfio-mediated-device.txt vfio-mediated-device.txt: standardize document format 2017-07-14 13:58:10 -06:00
vfio.txt vfio.txt: standardize document format 2017-07-14 13:58:10 -06:00
video-output.txt
xillybus.txt xillybus.txt: standardize document format 2017-07-14 13:58:11 -06:00
xz.txt xz.txt: standardize document format 2017-07-14 13:58:11 -06:00
zorro.txt zorro.txt: standardize document format 2017-07-14 13:58:12 -06:00