Pull vfs fixes from Al Viro.
- fix io_destroy()/aio_complete() race
- the vfs_open() change to get rid of open_check_o_direct() boilerplate
was nice, but buggy. Al has a patch avoiding a revert, but that's
definitely not a last-day fodder, so for now revert it is...
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Revert "fs: fold open_check_o_direct into do_dentry_open"
fix io_destroy()/aio_complete() race
This reverts commit cab64df194.
Having vfs_open() in some cases drop the reference to
struct file combined with
error = vfs_open(path, f, cred);
if (error) {
put_filp(f);
return ERR_PTR(error);
}
return f;
is flat-out wrong. It used to be
error = vfs_open(path, f, cred);
if (!error) {
/* from now on we need fput() to dispose of f */
error = open_check_o_direct(f);
if (error) {
fput(f);
f = ERR_PTR(error);
}
} else {
put_filp(f);
f = ERR_PTR(error);
}
and sure, having that open_check_o_direct() boilerplate gotten rid of is
nice, but not that way...
Worse, another call chain (via finish_open()) is FUBAR now wrt
FILE_OPENED handling - in that case we get error returned, with file
already hit by fput() *AND* FILE_OPENED not set. Guess what happens in
path_openat(), when it hits
if (!(opened & FILE_OPENED)) {
BUG_ON(!error);
put_filp(file);
}
The root cause of all that crap is that the callers of do_dentry_open()
have no way to tell which way did it fail; while that could be fixed up
(by passing something like int *opened to do_dentry_open() and have it
marked if we'd called ->open()), it's probably much too late in the
cycle to do so right now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Update prctl and cpufeatures.h tools/ copies with the kernel sources
originals, which makes 'perf trace' know about the new prctl options
for speculation control and silences the build warnings (Arnaldo Carvalho de Melo)
- Update insn.h in Intel-PT instruction decoder with its original from from the
kernel sources, to silence build warnings, no effect on the actual tools this
time around (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlsSr6wACgkQ1lAW81NS
qkDGKg/+KrSDzwaP0kjow02QY4AxmCDZMjWRN5sz9pVF5RIGG36HgTcLoSJgVTFU
Fv1My+7EtHDScb+8SJLPKCiSskBL2jWZdbAaivvWHxkL1U089TiyBDhAr3s+OwFD
Xlp+GFEhkMlDHYkGOwQ6W+aIobZeGo9Pm96/QTYUWWaO/euxFsjPYDuDgeE5JVDq
ppzrDSajz56m1k146WtnCdpQm/jepcaoYLdo8W/8MPIvDsHUxplFC+EB7BEad3Bx
VCO9ovnfF6vRdt8mM/CJl2cnRDj6vbeeivzlp3pHviDwioklgVEvjLsfez617kRV
xWQ32Dez+6EKbhdxgGQDinGooEoSAKLhVM2abvpE7Mp+rUw2L0B6eJrXcHzy32y0
ZouQbQaDRtVSNHbN/xONpjXCzKMrFIZV24wCj2Zo1WTdpOT6/OGruh9dqwzYcSxP
bS4xd1n8dhu1A6cqDGbUtrlWfol7NJNx/YbssNIY7RDXytoOLEVXhtMR3cvYiCJg
uCqKfDHQ5RQYiN896YmY/3uhmFniIBut+RQIqT3b/NVp0BKh5S4JZF/+tjxYv2wq
9XM0pW2fm/4SYyd6wqcoP0JxyPdo8Ugx2Tf/9f6hAHvsRTYgGZwPWlR12XYYCnrc
zkGVp8Tf+l0JdUODWc4M/I68YU5xadl31J/+I/C+DFAEWM++XB0=
=/cVu
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-4.17-20180602' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Carvalho de Melo:
- Update prctl and cpufeatures.h tools/ copies with the kernel sources
originals, which makes 'perf trace' know about the new prctl options
for speculation control and silences the build warnings (Arnaldo Carvalho de Melo)
- Update insn.h in Intel-PT instruction decoder with its original from from the
kernel sources, to silence build warnings, no effect on the actual tools this
time around (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler fixes from Thomas Gleixner:
- two patches addressing the problem that the scheduler allows under
certain conditions user space tasks to be scheduled on CPUs which are
not yet fully booted which causes a few subtle and hard to debug
issue
- add a missing runqueue clock update in the deadline scheduler which
triggers a warning under certain circumstances
- fix a silly typo in the scheduler header file
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/headers: Fix typo
sched/deadline: Fix missing clock update
sched/core: Require cpu_active() in select_task_rq(), for user tasks
sched/core: Fix rules for running on online && !active CPUs
Pull turbostat utility updates for v4.18 from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (65 commits)
tools/power turbostat: update version number
tools/power turbostat: Add Node in output
tools/power turbostat: add node information into turbostat calculations
tools/power turbostat: remove num_ from cpu_topology struct
tools/power turbostat: rename num_cores_per_pkg to num_cores_per_node
tools/power turbostat: track thread ID in cpu_topology
tools/power turbostat: Calculate additional node information for a package
tools/power turbostat: Fix node and siblings lookup data
tools/power turbostat: set max_num_cpus equal to the cpumask length
tools/power turbostat: if --num_iterations, print for specific number of iterations
tools/power turbostat: Add Cannon Lake support
tools/power turbostat: delete duplicate #defines
x86: msr-index.h: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
tools/power turbostat: add POLL and POLL% column
tools/power turbostat: Fix --hide Pk%pc10
tools/power turbostat: Build-in "Low Power Idle" counters support
tools/power turbostat: Don't make man pages executable
tools/power turbostat: remove blank lines
tools/power turbostat: a small C-states dump readability immprovement
...
Now we setup q->nr_requests when switching to one new scheduler,
but not do it for 'none', then q->nr_requests may not be correct
for 'none'.
This patch fixes this issue by always updating 'nr_requests' when
switching to 'none'.
Cc: Marco Patalano <mpatalan@redhat.com>
Cc: "Ewan D. Milne" <emilne@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we end up splitting a bio and the queue goes away between
the initial submission and the later split submission, then we
can block forever in blk_queue_enter() waiting for the reference
to drop to zero. This will never happen, since we already hold
a reference.
Mark a split bio as already having entered the queue, so we can
just use the live non-blocking queue enter variant.
Thanks to Tetsuo Handa for the analysis.
Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The counter for the number of allocated pages includes pages in the
mempool's reserve, so checking that the number of allocated pages is 0
needs to happen after we exit the mempool.
Fixes: 6f1c819c21 ("dm: convert to bioset_init()/mempool_init()")
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Fixed to always just use percpu_counter_sum()
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull networking fixes from David Miller:
1) Infinite loop in _decode_session6(), from Eric Dumazet.
2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric
Dumazet.
3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux.
4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki
Makita.
5) Incorrect idr release in cls_flower, from Paul Blakey.
6) Probe error handling fix in davinci_emac, from Dan Carpenter.
7) Memory leak in XPS configuration, from Alexander Duyck.
8) Use after free with cloned sockets in kcm, from Kirill Tkhai.
9) MTU handling fixes fo ip_tunnel and ip6_tunnel, from Nicolas
Dichtel.
10) Fix UAPI hole in bpf data structure for 32-bit compat applications,
from Daniel Borkmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
bpf: fix uapi hole for 32 bit compat applications
net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
ip6_tunnel: remove magic mtu value 0xFFF8
ip_tunnel: restore binding to ifaces with a large mtu
net: dsa: b53: Add BCM5389 support
kcm: Fix use-after-free caused by clonned sockets
net-sysfs: Fix memory leak in XPS configuration
ixgbe: fix parsing of TC actions for HW offload
net: ethernet: davinci_emac: fix error handling in probe()
net/ncsi: Fix array size in dumpit handler
cls_flower: Fix incorrect idr release when failing to modify rule
net/sonic: Use dma_mapping_error()
xfrm Fix potential error pointer dereference in xfrm_bundle_create.
vhost_net: flush batched heads before trying to busy polling
tun: Fix NULL pointer dereference in XDP redirect
be2net: Fix error detection logic for BE3
net: qmi_wwan: Add Netgear Aircard 779S
mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG
atm: zatm: fix memcmp casting
iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs
...
Add a function to allocate wdata without allocating pages for data
transfer. This gives the caller an option to pass a number of pages that
point to the data buffer to write to.
wdata is reponsible for free those pages after it's done.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
With offset defined in rdata, transport functions need to look at this
offset when reading data into the correct places in pages.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Add a function to allocate rdata without allocating pages for data
transfer. This gives the caller an option to pass a number of pages
that point to the data buffer.
rdata is still reponsible for free those pages after it's done.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Eve of merge window fix: The original code was so bogus as to be
casting the wrong generic device to an rport and proceeding to take
actions based on the bogus values it found. Fortunately it seems the
location that is dereferenced always exists, so the code hasn't oopsed
yet, but it certainly annoys the memory checkers.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWxMJDSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWkTAP0QPfpP
ywyhrODRRPNg73zZnF3qo3CSeswSxDdjyW/4JAD/aLgTfPOydD+EA/sr/hjcs+Z/
DU3lt68c+CVp1kRtZ9A=
=9sLD
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"Eve of merge window fix: The original code was so bogus as to be
casting the wrong generic device to an rport and proceeding to take
actions based on the bogus values it found.
Fortunately it seems the location that is dereferenced always exists,
so the code hasn't oopsed yet, but it certainly annoys the memory
checkers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_transport_srp: Fix shost to rport translation
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbEvsXAAoJEAx081l5xIa+LncP/355OijfiJxjbze+eJVlz+1r
CIlZsZBMc9aqsGU2HyVCTx3dqglrx0UbXZWRyI5WgPuqgi8x/rgbrTk+RyWr3eKw
QQD+oHMN6vXn07ss3VYyHgW+5tkEZ/bbkTaTVRagag/wCZ9eOFgtww1VllYEw1ZG
QI+tOWKhBpcU1dnoL+Bxu1W4JckPXK5FPlcFF/zkt/bIOSedoj+3MZl5Db7c1NSz
quw0mR8C/luf5H2Ump5WIgqKRD8j3xM/EDnY0gdIg9HMmm3k0xIhNLcU/rwNi5ns
qvurOPGK9Fteu94QnmmIbKj1E/ms/KRDA+71UoqmW2YYKJJAYHYr6t8Q4HXlKFcC
MGq1pDGzaZrTbOtqHPJv6iLcnA28GgjDGQ0nQuNWp7mhjmb+fbqLftFQjLeHNPUc
lu3pmmE8FZWI4lSTiqj5ojM1ceZFgGFN2l52PS+17wVHAHln+WpIMbFpaqkxlFRz
zwMr09d70w8qQiW/0b5Pf8n7hq7ud3SZEOhzaAjQ6ggPNXRQ1czj1s8QUsUNt+3b
o+uYC9kSpS67PxUs4QTPFyicMueZaGB+WT5Y+Cr5d4OJIwenzkmRhlAuR05W755T
P5vbe4SJxih+FqcxfAWFJBFoRIRC49YxB/UXYpCIK4iSXpWVVNYearwzjJxywvzA
QppQU+Y9IfvVPNHQtzxX
=MK3/
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A few final fixes:
i915:
- fix for potential Spectre vector in the new query uAPI
- fix NULL pointer deref (FDO #106559)
- DMI fix to hide LVDS for Radiant P845 (FDO #105468)
amdgpu:
- suspend/resume DC regression fix
- underscan flicker fix on fiji
- gamma setting fix after dpms
omap:
- fix oops regression
core:
- fix PSR timing
dw-hdmi:
- fix oops regression"
* tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/amd/display: Update color props when modeset is required
drm/amd/display: Make atomic-check validate underscan changes
drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense
drm/amd/display: Fix BUG_ON during CRTC atomic check update
drm/i915/query: nospec expects no more than an unsigned long
drm/i915/query: Protect tainted function pointer lookup
drm/i915/lvds: Move acpi lid notification registration to registration phase
drm/i915: Disable LVDS on Radiant P845
drm/omap: fix NULL deref crash with SDI displays
drm/psr: Fix missed entry in PSR setup time table.
Two last minute DC fixes for 4.17. A fix for underscan on fiji and
a fix for gamma settings getting after dpms.
* 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/display: Update color props when modeset is required
drm/amd/display: Make atomic-check validate underscan changes
A final few MIPS fixes for 4.17:
- Drop Lantiq gphy reboot/remove reset (4.14)
- prctl(PR_SET_FP_MODE): Disallow PRE without FR (4.0)
- ptrace(PTRACE_PEEKUSR): Fix 64-bit FGRs (3.15)
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQS7lRNBWUYtqfDOVL41zuSGKxAj8gUCWxHFOwAKCRA1zuSGKxAj
8i39AQCX9phkffpRDnA1e/MiGGeRZ5+f9FBOzuS1x2nzdUagoQD9FzWxdcx57Syj
ye0kUtmc/wm8U6kz3qC3OInSeVuIEQg=
=3NGM
-----END PGP SIGNATURE-----
Merge tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from James Hogan:
"A final few MIPS fixes for 4.17:
- drop Lantiq gphy reboot/remove reset (4.14)
- prctl(PR_SET_FP_MODE): Disallow PRE without FR (4.0)
- ptrace(PTRACE_PEEKUSR): Fix 64-bit FGRs (3.15)"
* tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs
MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests
MIPS: lantiq: gphy: Drop reboot/remove reset asserts
- Revert a pfn page mapping optimization identified as introducing
a bad page state regression (Alex Williamson)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJbEsWYAAoJECObm247sIsi1aUP/ROnej4AhAT8CI9/6YMAdejd
M5aJVAGjsa3fcmGmm92IVkNsnYk9cm7YlmwvTVuOq8zoN0ZAuZvBkQwyfdqXoI8M
ow1GQkv9kjZA+A1vH+A8HW3Liaf7wqPU0Db5nkypOYlaPPNqSAj2VUmbaiBcq+W+
FEshsiIjshRr+YJxKvV1nCRQGlBPMPxMLqqiufVDJE9cSHwwul8lEBHRg1WRcfpN
kKTO+RUEDT99B+HE77icq9l6IVQqnlGMXZ5/ODtpFwhIeQRo1eklspHdajJgUUnT
ksQzrFXCgOnkFJt9pCoquPF/aXGhbTHZbEBudwobeNPZiI3Rqri1RlDcSaqdjZbT
eJUvHt7xSPv0zbqClUOZmqr1QPEym901jA0NrFaoQLjjA1KpjEUIv7/FEKsaE2w8
N2leTwCo04Atkp+gm9XKzZnsrBLaG5vo4pYJrhv6BTBLrJyHQMnLFjYQCzTaBsjM
cnFyv9zlkztSqbLYB9cgSQRNQc8pxfckdPrLhVnrsissLF8Ce5XA3oKzjBu+0Z80
PfvEeA//vzmReYTrm+M0i/gA3whvllEkUMWFMMqH93CtygmyxWrJuLDI+EVg1pbK
ZnwaPnl9q12dPM3MtIefbqXiHSj5rmvHS966oppY/KSUcyM6iEU5XIOZYjfXkluh
W5fYxvPyBE+AvtYwr0wa
=Qb4S
-----END PGP SIGNATURE-----
Merge tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:
"Revert a pfn page mapping optimization identified as introducing a bad
page state regression (Alex Williamson)"
* tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio:
Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"
Here are 4 small bugfixes for some char/misc drivers. Well, really 3
fixes and one fix for one of those fixes due to problems found by 0-day.
This resolves some reported issues with the hwtracing drivers, and a
reported regression for the thunderbolt subsystem. All of these have
been in linux-next for a while now with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWxKMCw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymK2wCdEsDr7v19XalCGEUwrUlTiVM8Du0An2MkgogQ
EzZn7+QsxTgLMYG4N9gl
=Z00b
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are four small bugfixes for some char/misc drivers. Well, really
three fixes and one fix for one of those fixes due to problems found
by 0-day.
This resolves some reported issues with the hwtracing drivers, and a
reported regression for the thunderbolt subsystem. All of these have
been in linux-next for a while now with no reported problems"
* tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
hwtracing: stm: fix build error on some arches
intel_th: Use correct device when freeing buffers
stm class: Use vmalloc for the master map
thunderbolt: Handle NULL boot ACL entries properly
Here are some old IIO driver fixes that were sitting in my tree for a
few weeks. Sorry about not getting them to you sooner. They fix a
number of small IIO driver issues that have been reported.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWxKN4g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymkogCfbe2/iQJcT8kXD0s/73A/KqkKPksAn0NbVWyh
HEroVoD7yDcdm2A+t36A
=OWdB
-----END PGP SIGNATURE-----
Merge tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Greg KH:
"Here are some old IIO driver fixes that were sitting in my tree for a
few weeks. Sorry about not getting them to you sooner. They fix a
number of small IIO driver issues that have been reported.
All of these have been in linux-next for a while with no reported
problems"
* tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: select buffer for at91-sama5d2_adc
iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume
iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels
iio:kfifo_buf: check for uint overflow
iio:buffer: make length types match kfifo types
iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock
iio: adc: stm32-dfsdm: fix successive oversampling settings
iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ
- bnxt netdev changes merged this cycle caused the bnxt RDMA driver to crash under
certain situations
- Arnd found (several, unfortunately) kconfig problems with the patches adding
INFINIBAND_ADDR_TRANS. Reverting this last part, will fix it more fully
outside -rc.
- Subtle change in error code for a uapi function caused breakage in userspace.
This was bug was subtly introduced cycle
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCgAGBQJbEXg1AAoJEDht9xV+IJsax2YQAJjXnfQ9+QsbtT8sqFzGFLo3
r5aqqwF6RkGDFsovVDRH/9S62JjeqJRQA+4ykooajD+6mXU06Sf3p4tcoco0Dqhn
66b/lkdPFzXSytlne7AnUnA3xkKG4u5jYGReSryIQjXu29iwt8scgiiqt8nX9Gzi
eC9U2UQn5ZF65yRo4V/UGuHjdnUXiPYfg2Ff5YqLUxdL0XE42ftpuiR3Xuzgfj8c
/rdqDwnvdViQwPeTSNTJoZzeV+49WKp9BP+lzsCeIXvzuzY1aOd94z0i7fq342hB
jXpz6PtRTSBkZ4xBuvtopnoz0HQXMv7kQFMobkyjaP3qXcKz6Dx9d3QvWkLq8qdQ
D4MPYjVCONJAJvXppxuNSzyz0lg5EaSICWeZhkr5P68Ja0fptiXAumVCTr/kEifV
Yz8y0xsEVMIxBy3C9HOCe4khzWKqd9Uoo4/VDM+GdhKHIpUCnnKTte9vIZcYg1JL
1doCithudNX7KF4K79JJqADwtFhXoPaHt3XF9YRhgouDN9gC+/hB5ZZvD4jjcqhF
tLfRh1+LeK6QuVTE//ON/OS5xGgEzcZVLUJSUs/NdB+5YAXREz/2+IoK/0+rFiP0
wlHlgUk9ZQx3j7La5iiPrRdK+h6u0LUYd02XuMevnnswGqv+DWrxDnFy/icovPCt
AxHZ0KxdLJeo2euOd755
=Ppe+
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Just three small last minute regressions that were found in the last
week. The Broadcom fix is a bit big for rc7, but since it is fixing
driver crash regressions that were merged via netdev into rc1, I am
sending it.
- bnxt netdev changes merged this cycle caused the bnxt RDMA driver
to crash under certain situations
- Arnd found (several, unfortunately) kconfig problems with the
patches adding INFINIBAND_ADDR_TRANS. Reverting this last part,
will fix it more fully outside -rc.
- Subtle change in error code for a uapi function caused breakage in
userspace. This was bug was subtly introduced cycle"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/core: Fix error code for invalid GID entry
IB: Revert "remove redundant INFINIBAND kconfig dependencies"
RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes
Merge two fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: fix the NULL mapping case in __isolate_lru_page()
mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
George Boole would have noticed a slight error in 4.16 commit
69d763fc6d ("mm: pin address_space before dereferencing it while
isolating an LRU page"). Fix it, to match both the comment above it,
and the original behaviour.
Although anonymous pages are not marked PageDirty at first, we have an
old habit of calling SetPageDirty when a page is removed from swap
cache: so there's a category of ex-swap pages that are easily
migratable, but were inadvertently excluded from compaction's async
migration in 4.16.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils
Fixes: 69d763fc6d ("mm: pin address_space before dereferencing it while isolating an LRU page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Ivan Kalvachev <ikalvachev@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
eager, but I'm not sure that is relevant) soon hung uninterruptibly,
waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
often when "cp -a" was trying to write to a smallish file. Debug showed
that the page in question was not locked, and page->mapping NULL by now,
but page->index consistent with having been in a huge page before.
Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede7
("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added
in; but took hours to reproduce on a 4.17 kernel (no idea why).
The culprit proved to be the __ClearPageDirty() on tails beyond i_size in
__split_huge_page(): the non-atomic __bitoperation may have been safe when
4.8's baa355fd33 ("thp: file pages support for split_huge_page()")
introduced it, but liable to erase PageWaiters after 4.10's 6290602709
("mm: add PageWaiters indicating tasks are waiting for a page bit").
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils
Fixes: 6290602709 ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bisection by Amadeusz Sławiński implicates this commit leading to bad
page state issues after VM shutdown, likely due to unbalanced page
references. The original commit was intended only as a performance
improvement, therefore revert for offline rework.
Link: https://lkml.org/lkml/2018/6/2/97
Fixes: 356e88ebe4 ("vfio/type1: Improve memory pinning process for raw PFN mapping")
Cc: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com>
Reported-by: Amadeusz Sławiński <amade@asmblr.net>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Daniel Borkmann says:
====================
pull-request: bpf 2018-06-02
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) BPF uapi fix in struct bpf_prog_info and struct bpf_map_info in
order to fix offsets on 32 bit archs.
This will have a minor merge conflict with net-next which has the
__u32 gpl_compatible:1 bitfield in struct bpf_prog_info at this
location. Resolution is to use the gpl_compatible member.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Output a Node column if there is more than one node/socket.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The previous patches have added node information to turbostat, but the
counters code does not take it into account.
Add node information from cpu_topology calculations to turbostat
counters.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cleanup, remove num_ from num_nodes_per_pkg, num_cores_per_node, and
num_threads_per_node.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat incorrectly assumes that there is one node per package. As a
result num_cores_per_pkg is not correctly named and is actually
num_cores_per_node.
Rename num_cores_per_pkg to num_cores_per_node.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The code can be simplified if the cpu_topology *cpus tracks the thread
IDs. This removes an additional file lookup and simplifies the counter
initialization code.
Add thread ID to cpu_topology information and cleanup the counter
initialization code.
v2: prevent thread_id from being overwritten
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The code currently assumes each package has exactly one node. This is not
the case for AMD systems and Intel systems with COD. AMD systems also
may re-enumerate each node's core IDs starting at 0 (for example, an AMD
processor may have two nodes, each with core IDs from 0 to 7). In order
to properly enumerate the cores we need to track both the physical and
logical node IDs.
Add physical_node_id to track the node ID assigned by the kernel, and
logical_node_id used by turbostat to track the nodes per package ie) a
0-based count within the package.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The turbostat code only looks at thread_siblings_list to determine if
processing units/threads are on the same the core. This works well on
Intel systems which have a shared L1 instruction and data cache. This
does not work on AMD systems which have shared L1 instruction cache but
separate L1 data caches. Other utilities also check sibling's core ID
to determine if the processing unit shares the same core.
Additionally, the cpu_topology *cpus list used in topology_probe() can
be used elsewhere in the code to simplify things.
Export *cpus to the entire turbostat code, and add Processing Unit/Thread
IDs information to each cpu_topology struct. Confirm that the thread
is on the same core as indicated by thread_siblings_list.
[v2]: Fixup CPU_* usage that caused gcc malloc error.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Future fixes will use sysfs files that contain cpumask output. The code
needs to know the length of the cpumask in order to determine which cpus
are set in a cpumask. Currently topo.max_cpu_num is the maximum cpu
number. It can be increased the the maximum value of cpus represented in
cpumasks.
Set max_num_cpus to the length of a cpumask.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
There's a use case during test to only print specific round of iterations
if --num_iterations is specified, for example, with this patch applied:
turbostat -i 5 -n 4
will capture 4 samples with 5 seconds interval.
[lenb: renamed to --num_iterations from --iterations]
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
All MSRs related to turbostat are same as Kabylake.
Even though SDM claims that core C3 residency can be read from MSR 0x662,
the read on this MSR fails on CNL platform. Hence disabled C3 MSR read
and display.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The SNB_C1_AUTO_UNDEMOTE definition should have been deleted once
it was copied into msr-index.h. One copy of the truth is better --
particularly when Matt needs to fix it:-)
Signed-off-by: Len Brown <len.brown@intel.com>
According to the Intel Software Developers' Manual, Vol. 4, Order No.
335592, these macros have been reversed since they were added in the
initial turbostat commit. The reversed definitions were presumably
copied from turbostat.c to this file.
Fixes: 9c63a650bb ("tools/power/x86/turbostat: share kernel MSR #defines")
Signed-off-by: Matt Turner <mattst88@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Len Brown <len.brown@intel.com>
According to the Intel Software Developers' Manual, Vol. 4, Order No.
335592, these macros have been reversed since they were added.
Fixes: 889facbee3 ("tools/power turbostat: v3.0: monitor Watts and Temperature")
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Like the "C1" and "C1%" column, the new POLL and POLL% columns
show invocations and residency% during the measurement interval.
While it didn't seem important to track in the past,
we've recently found some Linux cpuidle bugs related to POLL%.
Signed-off-by: Len Brown <len.brown@intel.com>
The column header for PC10 residency is "Pk%pc10"
This is missing the 'g' that others have, eg Pkg%pc6,
to allow tab-delimited columns to fit into 8-columns.
However, --hide Pk%pc10 did not work, it was still looking for the 'g'.
This was confusing, because --list shows the correct "Pk%pc10"
Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Linux 4.15 exports the ACPI Low Power Idle Table's
counters in /sys/devices/system/cpu/cpuidle/
low_power_idle_cpu_residency_us
Show this in the "CPU%LPI" column.
Today this reflects the "North Complex"
residency in PC10, so expect it to
closely follow "Pk%pc10".
low_power_idle_system_residency_us
Show this in the "SYS%LPI" column.
Today, this reflects the North is in PC10,
plus the PCH is sufficiently quiescent
to save additional power via the "S0ix"
system state, as measured by the
PCH SLP_S0 counter.
Signed-off-by: Len Brown <len.brown@intel.com>
This way the implementation doesn't depend on buffer_head internals.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
We only call into this function through the iomap iterators, so we already
know the buffer is unwritten. In addition to that we always require the
uptodate flag that is ORed with the result anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>