OpenCloudOS-Kernel/drivers
Chris Wilson d1d1ebf412 drm/i915: Don't touch fence->error when resetting an innocent request
If the request has been completed before the reset took effect, we don't
need to mark it up as being a victim. Touching fence->error after the
fence has been signaled is detected by dma_fence_set_error() and
triggers a BUG:

[  231.743133] kernel BUG at ./include/linux/dma-fence.h:434!
[  231.743156] invalid opcode: 0000 [#1] SMP KASAN
[  231.743172] Modules linked in: i915 drm_kms_helper drm iptable_nat nf_nat_ipv4 nf_nat x86_pkg_temp_thermal iosf_mbi i2c_algo_bit cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb font fbdev [last unloaded: drm]
[  231.743221] CPU: 2 PID: 20 Comm: kworker/2:0 Tainted: G     U          4.13.0-rc1+ #52
[  231.743236] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[  231.743363] Workqueue: events_long i915_hangcheck_elapsed [i915]
[  231.743382] task: ffff8801f42e9780 task.stack: ffff8801f42f8000
[  231.743489] RIP: 0010:i915_gem_reset_engine+0x45a/0x460 [i915]
[  231.743505] RSP: 0018:ffff8801f42ff770 EFLAGS: 00010202
[  231.743521] RAX: 0000000000000007 RBX: ffff8801bf6b1880 RCX: ffffffffa02881a6
[  231.743537] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff8801bf6b18c8
[  231.743551] RBP: ffff8801f42ff7c8 R08: 0000000000000001 R09: 0000000000000000
[  231.743566] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801edb02d00
[  231.743581] R13: ffff8801e19d4200 R14: 000000000000001d R15: ffff8801ce2a4000
[  231.743599] FS:  0000000000000000(0000) GS:ffff8801f5a80000(0000) knlGS:0000000000000000
[  231.743614] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  231.743629] CR2: 00007f0ebd1add10 CR3: 0000000002621000 CR4: 00000000000406e0
[  231.743643] Call Trace:
[  231.743752]  i915_gem_reset+0x6c/0x150 [i915]
[  231.743853]  i915_reset+0x175/0x210 [i915]
[  231.743958]  i915_reset_device+0x33b/0x350 [i915]
[  231.744061]  ? valleyview_pipestat_irq_handler+0xe0/0xe0 [i915]
[  231.744081]  ? trace_hardirqs_off_caller+0x70/0x110
[  231.744102]  ? _raw_spin_unlock_irqrestore+0x46/0x50
[  231.744120]  ? find_held_lock+0x119/0x150
[  231.744138]  ? mark_lock+0x6d/0x850
[  231.744241]  ? gen8_gt_irq_ack+0x1f0/0x1f0 [i915]
[  231.744262]  ? work_on_cpu_safe+0x60/0x60
[  231.744284]  ? rcu_read_lock_sched_held+0x57/0xa0
[  231.744400]  ? gen6_read32+0x2ba/0x320 [i915]
[  231.744506]  i915_handle_error+0x382/0x5f0 [i915]
[  231.744611]  ? gen6_rps_reset_ei+0x20/0x20 [i915]
[  231.744630]  ? vsnprintf+0x128/0x8e0
[  231.744649]  ? pointer+0x6b0/0x6b0
[  231.744667]  ? debug_check_no_locks_freed+0x1a0/0x1a0
[  231.744688]  ? scnprintf+0x92/0xe0
[  231.744706]  ? snprintf+0xb0/0xb0
[  231.744820]  hangcheck_declare_hang+0x15a/0x1a0 [i915]
[  231.744932]  ? engine_stuck+0x440/0x440 [i915]
[  231.744951]  ? rcu_read_lock_sched_held+0x57/0xa0
[  231.745062]  ? gen6_read32+0x2ba/0x320 [i915]
[  231.745173]  ? gen6_read16+0x320/0x320 [i915]
[  231.745284]  ? intel_engine_get_active_head+0x91/0x170 [i915]
[  231.745401]  i915_hangcheck_elapsed+0x3d8/0x400 [i915]
[  231.745424]  process_one_work+0x3e8/0xac0
[  231.745444]  ? pwq_dec_nr_in_flight+0x110/0x110
[  231.745464]  ? do_raw_spin_lock+0x8e/0x120
[  231.745484]  worker_thread+0x8d/0x720
[  231.745506]  kthread+0x19e/0x1f0
[  231.745524]  ? process_one_work+0xac0/0xac0
[  231.745541]  ? kthread_create_on_node+0xa0/0xa0
[  231.745560]  ret_from_fork+0x27/0x40
[  231.745581] Code: 8b 7d c8 e8 49 0d 02 e1 49 8b 7f 38 48 8b 75 b8 48 83 c7 10 e8 b8 89 be e1 e9 95 fc ff ff 4c 89 e7 e8 4b b9 ff ff e9 30 ff ff ff <0f> 0b 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe
[  231.745767] RIP: i915_gem_reset_engine+0x45a/0x460 [i915] RSP: ffff8801f42ff770

At first glance this looks to be related to commit c64992e035
("drm/i915: Look for active requests earlier in the reset path"), but it
could easily happen before as well. On the other hand, we no longer
logged victims due to the active_request being dropped earlier.

v2: Be trickier to unwind the incomplete request as we cannot rely on
request retirement for the lockless per-engine reset.
v3: Reprobe the active request at the time of the reset.

Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: c64992e035 ("drm/i915: Look for active requests earlier in the reset path")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-15-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:57 +02:00
..
accessibility
acpi acpi/nfit: Fix memory corruption/Unregister mce decoder on failure 2017-07-17 11:43:58 -07:00
amba
android binder: Use wake up hint for synchronous transactions. 2017-07-17 14:44:19 +02:00
ata Merge branch 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2017-07-06 09:41:58 -07:00
atm atm: zatm: Fix an error handling path in 'zatm_init_one()' 2017-07-18 11:37:46 -07:00
auxdisplay
base Power management fixes for v4.13-rc2 2017-07-20 14:56:46 -07:00
bcma
block nbd: kill unused ret in recv_work 2017-07-13 08:03:30 -06:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-07-05 12:31:59 -07:00
bus main drm pull for v4.13 2017-07-09 18:48:37 -07:00
cdrom block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
char Add wait_for_random_bytes() and get_random_*_wait() functions so that 2017-07-15 12:44:02 -07:00
clk Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2017-07-15 10:59:54 -07:00
clocksource clocksource/drivers/timer-of: Handle of_irq_get_byname() result correctly 2017-07-17 22:43:00 +02:00
connector
cpufreq Merge branches 'intel_pstate' and 'pm-domains' 2017-07-20 18:57:15 +02:00
cpuidle powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2017-07-14 22:49:50 -07:00
dax device-dax: fix sysfs duplicate warnings 2017-07-18 17:49:14 -07:00
dca
devfreq PM / devfreq: constify attribute_group structures. 2017-07-06 10:17:24 +09:00
dio
dma dmaengine updates for 4.13-rc1 2017-07-08 12:36:50 -07:00
dma-buf Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
edac EDAC, pnd2: Fix Apollo Lake DIMM detection 2017-06-29 10:37:50 +02:00
eisa
extcon
firewire
firmware efi: avoid fortify checks in EFI stub 2017-07-12 16:26:02 -07:00
fmc
fpga
fsi drivers/fsi: fix fsi_slave_mode prototype 2017-07-17 16:13:54 +02:00
gpio Merge (most of) tag 'mfd-next-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd 2017-07-07 13:30:05 -07:00
gpu drm/i915: Don't touch fence->error when resetting an innocent request 2017-07-27 09:38:57 +02:00
hid HID: hid-logitech-hidpp: add NULL check on devm_kmemdup() return value 2017-07-20 15:45:39 +02:00
hsi HSI changes for the v4.13 series 2017-07-04 14:28:22 -07:00
hv vmbus: re-enable channel tasklet 2017-07-17 15:00:47 +02:00
hwmon hwmon: (applesmc) Avoid buffer overruns 2017-07-15 16:38:56 -07:00
hwspinlock
hwtracing Char/Misc patches for 4.13-rc1 2017-07-03 20:55:59 -07:00
i2c Merge branch 'i2c/for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2017-07-12 10:04:56 -07:00
ide ide: avoid warning for timings calculation 2017-07-21 04:37:22 +01:00
idle intel_idle: Use more common logging style 2017-06-29 22:58:35 +02:00
iio hwmon updates for v4.13: 2017-07-04 11:48:27 -07:00
infiniband RDMA/core: Initialize port_num in qp_attr 2017-07-20 11:24:13 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2017-07-14 22:53:37 -07:00
iommu IOMMU Updates for Linux v4.13 2017-07-12 10:00:04 -07:00
ipack
irqchip irqchip/digicolor: Drop unnecessary static 2017-07-18 21:59:23 +02:00
isdn isdn: avm: c4: constify pci_device_id. 2017-07-15 21:25:56 -07:00
leds LED updates for 4.13 2017-07-06 11:32:40 -07:00
lguest
lightnvm lightnvm: pblk: remove unnecessary checks 2017-07-07 13:17:36 -06:00
macintosh Merge branch 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 13:13:32 -07:00
mailbox Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration 2017-07-07 10:24:07 -07:00
mcb
md Merge tag 'md/4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-07-18 11:51:08 -07:00
media main drm pull for v4.13 2017-07-09 18:48:37 -07:00
memory ARM: SoC driver updates 2017-07-04 14:47:47 -07:00
memstick
message
mfd chrome-platform-for-linus-4.13 2017-07-11 09:55:47 -07:00
misc powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
mmc MMC core: 2017-07-14 13:10:06 -07:00
mtd MTD updates for v4.13-rc1: 2017-07-13 12:07:44 -07:00
mux mux: mux-core: unregister mux_class in mux_exit() 2017-07-17 16:38:35 +02:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-07-20 16:33:39 -07:00
nfc NFC 4.13 pull request 2017-07-01 14:30:39 -07:00
ntb ntb: Add error path/handling to Debug FS entry creation 2017-07-06 11:30:08 -04:00
nubus
nvdimm libnvdimm: fix badblock range handling of ARS range 2017-07-17 11:43:58 -07:00
nvme nvmet: don't report 0-bytes in serial number 2017-07-20 08:41:56 -06:00
nvmem nvmem: rockchip-efuse: amend compatible rk322x-efuse to rk3228-efuse 2017-07-17 16:15:57 +02:00
of Device properties framework updates for v4.13-rc1 2017-07-10 15:23:45 -07:00
oprofile
parisc parisc: ->mapping_error 2017-07-05 21:46:42 +02:00
parport
pci Power management fixes for v4.13-rc1 2017-07-14 22:24:25 -07:00
pcmcia
perf Merge branch 'aarch64/for-next/ras-apei' into aarch64/for-next/core 2017-06-26 10:54:27 +01:00
phy
pinctrl This is the big bulk of pin control changes for the v4.13 series: 2017-07-06 11:38:59 -07:00
platform Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2017-07-15 10:59:54 -07:00
pnp This is the bulk of GPIO changes for the v4.13 series: 2017-07-07 12:40:27 -07:00
power power supply and reset changes for the v4.13 series (part 2) 2017-07-13 11:47:59 -07:00
powercap powercap/RAPL: prevent overridding bits outside of the mask 2017-06-28 00:38:34 +02:00
pps
ps3
ptp ptp: dte: Use LL suffix for 64-bit constants 2017-07-06 11:40:58 +01:00
pwm pwm: Changes for v4.13-rc1 2017-07-13 11:49:52 -07:00
rapidio
ras arm64 updates for 4.13: 2017-07-05 17:09:27 -07:00
regulator Merge remote-tracking branches 'regulator/topic/settle', 'regulator/topic/tps65910' and 'regulator/topic/tps65917' into regulator-next 2017-07-03 16:52:21 +01:00
remoteproc remoteproc/keystone: Fix circular dependencies for ARM configs 2017-06-27 16:21:34 -07:00
reset ARM: SoC driver updates 2017-07-04 14:47:47 -07:00
rpmsg rpmsg updates for v4.13 2017-07-06 15:38:31 -07:00
rtc RTC for 4.13 2017-07-13 12:15:06 -07:00
s390 drivers: s390: move static and inline before return type 2017-07-12 16:26:04 -07:00
sbus block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
scsi SCSI fixes on 20170715 2017-07-17 12:26:12 -07:00
sfi
sh drivers/sh/intc/virq.c: delete an error message for a failed memory allocation in add_virq_to_pirq() 2017-07-06 16:24:30 -07:00
sn
soc ARM: SoC driver updates 2017-07-04 14:47:47 -07:00
spi Merge branch 'for-spi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-08 10:41:53 -07:00
spmi spmi: pmic-arb: Always allocate ppid_to_apid table 2017-07-17 15:00:47 +02:00
ssb
staging drm: linux-next: build failure after merge of the drm-misc tree 2017-07-27 08:27:11 +10:00
target Add wait_for_random_bytes() and get_random_*_wait() functions so that 2017-07-15 12:44:02 -07:00
tc
tee
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2017-07-14 13:12:32 -07:00
thunderbolt thunderbolt: Correct access permissions for active NVM contents 2017-07-17 15:55:08 +02:00
tty tty: hide unused pty_get_peer function 2017-07-17 17:04:41 +02:00
uio
usb xhci: fix memleak in xhci_run() 2017-07-20 14:40:36 +02:00
uwb driver core patches for 4.13-rc1 2017-07-03 20:27:48 -07:00
vfio VFIO updates for v4.13-rc1 2017-07-13 12:23:54 -07:00
vhost Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2017-07-13 14:27:32 -07:00
video Merge branch 'akpm' (patches from Andrew) 2017-07-13 12:38:49 -07:00
virt
virtio
vlynq
vme
w1 w1: omap-hdq: fix error return code in omap_hdq_probe() 2017-07-17 16:48:15 +02:00
watchdog Merge git://www.linux-watchdog.org/linux-watchdog 2017-07-11 09:59:37 -07:00
xen xen/balloon: don't online new memory initially 2017-07-23 08:13:18 +02:00
zorro
Kconfig
Makefile