Commit Graph

933704 Commits

Author SHA1 Message Date
Horatiu Vultur 7882c895b7 bridge: mrp: Validate when setting the port role
This patch adds specific checks for primary(0x0) and secondary(0x1) when
setting the port role. For any other value the function
'br_mrp_set_port_role' will return -EINVAL.

Fixes: 20f6a05ef6 ("bridge: mrp: Rework the MRP netlink interface")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 14:38:05 -07:00
Horatiu Vultur 2464bc7c28 bridge: uapi: mrp: Fix MRP_PORT_ROLE
Currently the MRP_PORT_ROLE_NONE has the value 0x2 but this is in conflict
with the IEC 62439-2 standard. The standard defines the following port
roles: primary (0x0), secondary(0x1), interconnect(0x2).
Therefore remove the port role none.

Fixes: 4714d13791 ("bridge: uapi: mrp: Add mrp attributes.")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 14:38:05 -07:00
Andy Shevchenko 852f71942c i2c: designware: Adjust bus speed independently of ACPI
John Stultz reported that commit f9288fcc5c ("i2c: designware: Move
ACPI parts into common module") caused a regression on the HiKey board
where adv7511 HDMI bridge driver wasn't probing anymore due the I2C bus
failed to start.

It seems the change caused the bus speed being zero when CONFIG_ACPI
not set and neither speed based on "clock-frequency" device property
or default fast mode is set.

Fix this by splitting i2c_dw_acpi_adjust_bus_speed() to
i2c_dw_acpi_round_bus_speed() and i2c_dw_adjust_bus_speed(), where
the latter one has the code that runs independently of ACPI.

Fixes: f9288fcc5c ("i2c: designware: Move ACPI parts into common module")
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-06-23 21:24:33 +02:00
Linus Torvalds 26e122e97a All bugfixes except for a couple cleanup patches.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7x2lwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPiVAgAn/83Vx/YrF9sr0+TLzukzfOubJVK
 Majxb0I06De23VDExiDoZjh5CnCN3kDja0m2c543ZI1XOrHRbp09v1goJQkAgiT0
 AQ8Npi1KB71io18SbZtrAhPLmSiUgRirF+XWHB38qjdbZixvZyWz8nvSITFY8aJQ
 ICgbm5jftzBdSOKEhqbHwZ+LcXjEGZsehwTiHpUBKUR/kNlRFV5UFAd5m+CT5i4O
 3DydLIReATDCoZUKfkBjYtoR3c9DyWESyfWD4GZ/2xRKr/1QfiZ4dA0cd/P9hJYz
 7MAG+ULvJGlasSzmcEQJ/X3o9QuIJzpQFpwbKeMX6gOsEsSVUQeriUHIFA==
 =jTFw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "All bugfixes except for a couple cleanup patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru
  KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling
  KVM: X86: Fix MSR range of APIC registers in X2APIC mode
  KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL
  KVM: nVMX: Plumb L2 GPA through to PML emulation
  KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic()
  KVM: LAPIC: ensure APIC map is up to date on concurrent update requests
  kvm: lapic: fix broken vcpu hotplug
  Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
  KVM: VMX: Add helpers to identify interrupt type from intr_info
  kvm/svm: disable KCSAN for svm_vcpu_run()
  KVM: MIPS: Fix a build error for !CPU_LOONGSON64
2020-06-23 11:01:16 -07:00
Xuan Zhuo b772f07add io_uring: fix io_sq_thread no schedule when busy
When the user consumes and generates sqe at a fast rate,
io_sqring_entries can always get sqe, and ret will not be equal to -EBUSY,
so that io_sq_thread will never call cond_resched or schedule, and then
we will get the following system error prompt:

rcu: INFO: rcu_sched self-detected stall on CPU
or
watchdog: BUG: soft lockup-CPU#23 stuck for 112s! [io_uring-sq:1863]

This patch checks whether need to call cond_resched() by checking
the need_resched() function every cycle.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-23 11:54:30 -06:00
Maxime Ripard 842ec61f40
drm/sun4i: mixer: Call of_dma_configure if there's an IOMMU
The main DRM device is actually a virtual device so it doesn't have the
iommus property, which is instead on the DMA masters, in this case the
mixers.

Add a call to of_dma_configure with the mixers DT node but on the DRM
virtual device to configure it in the same way than the mixers.

Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/9a4daf438dd3f2fe07afb23688bfb793a0613d7d.1589378833.git-series.maxime@cerno.tech
(cherry picked from commit b718102dbd)
[Maxime: Applied to -fixes since it missed the merge window and display is
         broken without it]
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-06-23 19:13:37 +02:00
Zhang Xiaoxu 95a3d8f3af cifs: Fix double add page to memcg when cifs_readpages
When xfstests generic/451, there is an BUG at mm/memcontrol.c:
  page:ffffea000560f2c0 refcount:2 mapcount:0 mapping:000000008544e0ea
       index:0xf
  mapping->aops:cifs_addr_ops dentry name:"tst-aio-dio-cycle-write.451"
  flags: 0x2fffff80000001(locked)
  raw: 002fffff80000001 ffffc90002023c50 ffffea0005280088 ffff88815cda0210
  raw: 000000000000000f 0000000000000000 00000002ffffffff ffff88817287d000
  page dumped because: VM_BUG_ON_PAGE(page->mem_cgroup)
  page->mem_cgroup:ffff88817287d000
  ------------[ cut here ]------------
  kernel BUG at mm/memcontrol.c:2659!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 2038 Comm: xfs_io Not tainted 5.8.0-rc1 #44
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_
    073836-buildvm-ppc64le-16.ppc.4
  RIP: 0010:commit_charge+0x35/0x50
  Code: 0d 48 83 05 54 b2 02 05 01 48 89 77 38 c3 48 c7
        c6 78 4a ea ba 48 83 05 38 b2 02 05 01 e8 63 0d9
  RSP: 0018:ffffc90002023a50 EFLAGS: 00010202
  RAX: 0000000000000000 RBX: ffff88817287d000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88817ac97ea0 RDI: ffff88817ac97ea0
  RBP: ffffea000560f2c0 R08: 0000000000000203 R09: 0000000000000005
  R10: 0000000000000030 R11: ffffc900020237a8 R12: 0000000000000000
  R13: 0000000000000001 R14: 0000000000000001 R15: ffff88815a1272c0
  FS:  00007f5071ab0800(0000) GS:ffff88817ac80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055efcd5ca000 CR3: 000000015d312000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mem_cgroup_charge+0x166/0x4f0
   __add_to_page_cache_locked+0x4a9/0x710
   add_to_page_cache_locked+0x15/0x20
   cifs_readpages+0x217/0x1270
   read_pages+0x29a/0x670
   page_cache_readahead_unbounded+0x24f/0x390
   __do_page_cache_readahead+0x3f/0x60
   ondemand_readahead+0x1f1/0x470
   page_cache_async_readahead+0x14c/0x170
   generic_file_buffered_read+0x5df/0x1100
   generic_file_read_iter+0x10c/0x1d0
   cifs_strict_readv+0x139/0x170
   new_sync_read+0x164/0x250
   __vfs_read+0x39/0x60
   vfs_read+0xb5/0x1e0
   ksys_pread64+0x85/0xf0
   __x64_sys_pread64+0x22/0x30
   do_syscall_64+0x69/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5071fcb1af
  Code: Bad RIP value.
  RSP: 002b:00007ffde2cdb8e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
  RAX: ffffffffffffffda RBX: 00007ffde2cdb990 RCX: 00007f5071fcb1af
  RDX: 0000000000001000 RSI: 000055efcd5ca000 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000001
  R13: 000000000009f000 R14: 0000000000000000 R15: 0000000000001000
  Modules linked in:
  ---[ end trace 725fa14a3e1af65c ]---

Since commit 3fea5a499d ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") not cancel the page charge, the pages maybe
double add to pagecache:
thread1                       | thread2
cifs_readpages
readpages_get_pages
 add_to_page_cache_locked(head,index=n)=0
                              | readpages_get_pages
                              | add_to_page_cache_locked(head,index=n+1)=0
 add_to_page_cache_locked(head, index=n+1)=-EEXIST
 then, will next loop with list head page's
 index=n+1 and the page->mapping not NULL
readpages_get_pages
add_to_page_cache_locked(head, index=n+1)
 commit_charge
  VM_BUG_ON_PAGE

So, we should not do the next loop when any page add to page cache
failed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-06-23 12:04:52 -05:00
Linus Torvalds 3e08a95294 for-5.8-rc2-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl7yABEACgkQxWXV+ddt
 WDtGoQ//cBWRRWLlLTRgpaKnY6t8JgVUqNvPJISHHf45cNbOJh0yo8hUuKMW+440
 8ovYqtFoZD+JHcHDE2sMueHBFe38rG5eT/zh8j/ruhBzeJcTb3lSYz53d7sfl5kD
 cIVngPEVlGziDqW2PsWLlyh8ulBGzY3YmS6kAEkyP/6/uhE/B1dq6qn3GUibkbKI
 dfNjHTLwZVmwnqoxLu8ZE2/hHFbzhl0sm09snsXYSVu13g36+edp0Z+pF0MlKGVk
 G6YrnZcts8TWwneZ4nogD9f2CMvzMhYDDLyEjsX0Ouhb+Cu2WNxdfrJ2ZbPNU82w
 EGbo451mIt6Ht8wicEjh27LWLI7YMraF/Ig/ODMdvFBYDbhl4voX2t+4n+p5Czbg
 AW6Wtg/q5EaaNFqrTsqAAiUn0+R3sMiDWrE0AewcE7syPGqQ2XMwP4la5pZ36rz8
 8Vo5KIGo44PIJ1dMwcX+bg3HTtUnBJSxE5fUi0rJ3ZfHKGjLS79VonEeQjh3QD6W
 0UlK+jCjo6KZoe33XdVV2hVkHd63ZIlliXWv0LOR+gpmqqgW2b3wf181zTvo/5sI
 v0fDjstA9caqf68ChPE9jJi7rZPp/AL1yAQGEiNzjKm4U431TeZJl2cpREicMJDg
 FCDU51t9425h8BFkM4scErX2/53F1SNNNSlAsFBGvgJkx6rTENs=
 =/eCR
 -----END PGP SIGNATURE-----

Merge tag 'for-5.8-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A number of fixes, located in two areas, one performance fix and one
  fixup for better integration with another patchset.

   - bug fixes in nowait aio:
       - fix snapshot creation hang after nowait-aio was used
       - fix failure to write to prealloc extent past EOF
       - don't block when extent range is locked

   - block group fixes:
       - relocation failure when scrub runs in parallel
       - refcount fix when removing fails
       - fix race between removal and creation
       - space accounting fixes

   - reinstante fast path check for log tree at unlink time, fixes
     performance drop up to 30% in REAIM

   - kzfree/kfree fixup to ease treewide patchset renaming kzfree"

* tag 'for-5.8-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: use kfree() in btrfs_ioctl_get_subvol_info()
  btrfs: fix RWF_NOWAIT writes blocking on extent locks and waiting for IO
  btrfs: fix RWF_NOWAIT write not failling when we need to cow
  btrfs: fix failure of RWF_NOWAIT write into prealloc extent beyond eof
  btrfs: fix hang on snapshot creation after RWF_NOWAIT write
  btrfs: check if a log root exists before locking the log_mutex on unlink
  btrfs: fix bytes_may_use underflow when running balance and scrub in parallel
  btrfs: fix data block group relocation failure due to concurrent scrub
  btrfs: fix race between block group removal and block group creation
  btrfs: fix a block group ref counter leak after failure to remove block group
2020-06-23 09:20:11 -07:00
Christophe JAILLET 0659d4205d PM: sleep: core: mark 2 functions as __init to save some memory
'early_resume_init()' and 'late_resume_init() 'are only called respectively
via 'early_resume_init' and 'late_resume_init'.

They can be marked as __init to save a few bytes of memory.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-23 17:35:33 +02:00
Srinivas Pandruvada 589bab6bb3 cpufreq: intel_pstate: Add one more OOB control bit
Add one more bit for OOB (Out Of Band) enabling of P-states.

If OOB handling of P-states is enabled, intel_pstate shouldn't load.
Currently, only "BIT(8) == 1" of the MSR MSR_MISC_PWR_MGMT is
considered as OOB, but "BIT(18) == 1" needs to be taken into
consideration as OOB condition too.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Add an empty code line, edit subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-23 17:24:32 +02:00
Steven Rostedt (VMware) 097350d1c6 ring-buffer: Zero out time extend if it is nested and not absolute
Currently the ring buffer makes events that happen in interrupts that preempt
another event have a delta of zero. (Hopefully we can change this soon). But
this is to deal with the races of updating a global counter with lockless
and nesting functions updating deltas.

With the addition of absolute time stamps, the time extend didn't follow
this rule. A time extend can happen if two events happen longer than 2^27
nanoseconds appart, as the delta time field in each event is only 27 bits.
If that happens, then a time extend is injected with 2^59 bits of
nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between
two events, and as it is writing the event, an interrupt triggers, it will
see the 2^27 difference as well and inject a time extend of its own. But a
recent change made the time extend logic not take into account the nesting,
and this can cause two time extend deltas to happen moving the time stamp
much further ahead than the current time. This gets all reset when the ring
buffer moves to the next page, but that can cause time to appear to go
backwards.

This was observed in a trace-cmd recording, and since the data is saved in a
file, with trace-cmd report --debug, it was possible to see that this indeed
did happen!

  bash-52501   110d... 81778.908247: sched_switch:         bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
  <idle>-0     110d... 81778.908757: sched_switch:         swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
 TIME EXTEND: delta:306454770 length:0
  bash-52501   110.... 81779.215212: sched_swap_numa:      src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
 TIME EXTEND: delta:306458165 length:0
  bash-52501   110dNh. 81779.521670: sched_wakeup:         migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]

and at the next page, caused the time to go backwards:

  bash-52504   110d... 81779.685411: sched_switch:         bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
  <idle>-0     110dN.. 81779.379166: sched_wakeup:         bash:52504 [120] success=1 CPU:110 [0:0x10:40]
  <idle>-0     110d... 81779.379167: sched_switch:         swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]

Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: dc4e2801d4 ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-23 11:18:42 -04:00
Mark Brown 4dc9b282bf arm64: Depend on newer binutils when building PAC
Versions of binutils prior to 2.33.1 don't understand the ELF notes that
are added by modern compilers to indicate the PAC and BTI options used
to build the code. This causes them to emit large numbers of warnings in
the form:

aarch64-linux-gnu-nm: warning: .tmp_vmlinux.kallsyms2: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000

during the kernel build which is currently causing quite a bit of
disruption for automated build testing using clang.

In commit 15cd0e675f (arm64: Kconfig: ptrauth: Add binutils version
check to fix mismatch) we added a dependency on binutils to avoid this
issue when building with versions of GCC that emit the notes but did not
do so for clang as it was believed that the existing check for
.cfi_negate_ra_state was already requiring a new enough binutils. This
does not appear to be the case for some versions of binutils (eg, the
binutils in Debian 10) so instead refactor so we require a new enough
GNU binutils in all cases other than when we are using an old GCC
version that does not emit notes.

Other, more exotic, combinations of tools are possible such as using
clang, lld and gas together are possible and may have further problems
but rather than adding further version checks it looks like the most
robust thing will be to just test that we can build cleanly with the
configured tools but that will require more review and discussion so do
this for now to address the immediate problem disrupting build testing.

Reported-by: KernelCI <bot@kernelci.org>
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1054
Link: https://lore.kernel.org/r/20200619123550.48098-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-23 16:18:17 +01:00
Chen Yu 81e6737518 PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
Suspend to idle was found to not work on Goldmont CPU recently.

The issue happens due to:

 1. On Goldmont the CPU in idle can only be woken up via IPIs,
    not POLLING mode, due to commit 08e237fa56 ("x86/cpu: Add
    workaround for MONITOR instruction erratum on Goldmont based
    CPUs")

 2. When the CPU is entering suspend to idle process, the
    _TIF_POLLING_NRFLAG remains on, because cpuidle_enter_s2idle()
    doesn't match call_cpuidle() exactly.

 3. Commit b2a02fc43a ("smp: Optimize send_call_function_single_ipi()")
    makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to idle
    CPUs.

 4. As a result, some IPIs related functions might not work
    well during suspend to idle on Goldmont. For example, one
    suspected victim:

    tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume()
    -> clock_was_set() -> on_each_cpu() might wait forever,
    because the IPIs will not be sent to the CPUs which are
    sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU
    could not be woken up by only setting _TIF_NEED_RESCHED
    on the monitor address.

To avoid that, clear the _TIF_POLLING_NRFLAG flag before invoking
enter_s2idle_proper() in cpuidle_enter_s2idle() in analogy with the
call_cpuidle() code flow.

Fixes: b2a02fc43a ("smp: Optimize send_call_function_single_ipi()")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
[ rjw: Subject / changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-23 17:06:55 +02:00
Macpaul Lin a32a1fc998 ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
We've found Samsung USBC Headset (AKG) (VID: 0x04e8, PID: 0xa051)
need a tiny delay after each class compliant request.
Otherwise the device might not be able to be recognized each times.

Signed-off-by: Chihhao Chen <chihhao.chen@mediatek.com>
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1592910203-24035-1-git-send-email-macpaul.lin@mediatek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-06-23 16:13:49 +02:00
Will Deacon 2d071968a4 arm64: compat: Remove 32-bit sigreturn code from the vDSO
The sigreturn code in the compat vDSO is unused. Remove it.

Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-23 14:56:39 +01:00
Will Deacon 8e411be6aa arm64: compat: Always use sigpage for sigreturn trampoline
The 32-bit sigreturn trampoline in the compat sigpage matches the binary
representation of the arch/arm/ sigpage exactly. This is important for
debuggers (e.g. GDB) and unwinders (e.g. libunwind) since they rely
on matching the instruction sequence in order to identify that they are
unwinding through a signal. The same cannot be said for the sigreturn
trampoline in the compat vDSO, which defeats the unwinder heuristics and
instead attempts to use unwind directives for the unwinding. This is in
contrast to arch/arm/, which never uses the vDSO for sigreturn.

Ensure compatibility with arch/arm/ and existing unwinders by always
using the sigpage for the sigreturn trampoline, regardless of the
presence of the compat vDSO.

Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-23 14:56:24 +01:00
Will Deacon a39060b009 arm64: compat: Allow 32-bit vdso and sigpage to co-exist
In preparation for removing the signal trampoline from the compat vDSO,
allow the sigpage and the compat vDSO to co-exist.

For the moment the vDSO signal trampoline will still be used when built.
Subsequent patches will move to the sigpage consistently.

Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-23 14:47:03 +01:00
Will Deacon 87676cfca1 arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline
Commit 7e9f5e6629 ("arm64: vdso: Add --eh-frame-hdr to ldflags") results
in a .eh_frame_hdr section for the vDSO, which in turn causes the libgcc
unwinder to unwind out of signal handlers using the .eh_frame information
populated by our .cfi directives. In conjunction with a4eb355a3f
("arm64: vdso: Fix CFI directives in sigreturn trampoline"), this has
been shown to cause segmentation faults originating from within the
unwinder during thread cancellation:

 | Thread 14 "virtio-net-rx" received signal SIGSEGV, Segmentation fault.
 | 0x0000000000435e24 in uw_frame_state_for ()
 | (gdb) bt
 | #0  0x0000000000435e24 in uw_frame_state_for ()
 | #1  0x0000000000436e88 in _Unwind_ForcedUnwind_Phase2 ()
 | #2  0x00000000004374d8 in _Unwind_ForcedUnwind ()
 | #3  0x0000000000428400 in __pthread_unwind (buf=<optimized out>) at unwind.c:121
 | #4  0x0000000000429808 in __do_cancel () at ./pthreadP.h:304
 | #5  sigcancel_handler (sig=32, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:200
 | #6  sigcancel_handler (sig=<optimized out>, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:165
 | #7  <signal handler called>
 | #8  futex_wait_cancelable (private=0, expected=0, futex_word=0x3890b708) at ../sysdeps/unix/sysv/linux/futex-internal.h:88

After considerable bashing of heads, it appears that our CFI directives
for unwinding out of the sigreturn trampoline are only processed by libgcc
when both a .eh_frame_hdr section is present *and* the mysterious NOP is
covered by an entry in .eh_frame. With both of these now in place, it has
highlighted that our CFI directives are not comprehensive enough to
restore the stack pointer of the interrupted context. This results in libgcc
falling back to an arm64-specific unwinder after computing a bogus PC value
from the unwind tables. The unwinder promptly dereferences this bogus address
in an attempt to see if the pointed-to instruction sequence looks like
the sigreturn trampoline.

Restore the old unwind behaviour, which relied solely on heuristics in
the unwinder, by removing the .eh_frame_hdr section from the vDSO and
commenting out the insufficient CFI directives for now. Add comments to
explain the current, miserable state of affairs.

Cc: Tamas Zsoldos <tamas.zsoldos@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-23 14:47:03 +01:00
Eric Auger 8e36baf97b dma-remap: align the size in dma_common_*_remap()
Running a guest with a virtio-iommu protecting virtio devices
is broken since commit 515e5b6d90 ("dma-mapping: use vmap insted
of reimplementing it"). Before the conversion, the size was
page aligned in __get_vm_area_node(). Doing so fixes the
regression.

Fixes: 515e5b6d90 ("dma-mapping: use vmap insted of reimplementing it")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-23 14:14:41 +02:00
Christoph Hellwig d07ae4c486 dma-mapping: DMA_COHERENT_POOL should select GENERIC_ALLOCATOR
The dma coherent pool code needs genalloc.  Move the select over
from DMA_REMAP, which doesn't actually need it.

Fixes: dbed452a07 ("dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Rientjes <rientjes@google.com>
2020-06-23 14:13:58 +02:00
David Rientjes 1a2b3357e8 dma-direct: add missing set_memory_decrypted() for coherent mapping
When a coherent mapping is created in dma_direct_alloc_pages(), it needs
to be decrypted if the device requires unencrypted DMA before returning.

Fixes: 3acac06550 ("dma-mapping: merge the generic remapping helpers into dma-direct")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-23 14:13:54 +02:00
Christian Borntraeger 827c491392 s390/debug: avoid kernel warning on too large number of pages
When specifying insanely large debug buffers a kernel warning is
printed. The debug code does handle the error gracefully, though.
Instead of duplicating the check let us silence the warning to
avoid crashes when panic_on_warn is used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:55 +02:00
Vasily Gorbik 998f5bbe3d s390/kasan: fix early pgm check handler execution
Currently if early_pgm_check_handler is called it ends up in pgm check
loop. The problem is that early_pgm_check_handler is instrumented by
KASAN but executed without DAT flag enabled which leads to addressing
exception when KASAN checks try to access shadow memory.

Fix that by executing early handlers with DAT flag on under KASAN as
expected.

Reported-and-tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:50 +02:00
Sven Schnelle e64a1618af s390: fix system call single stepping
When single stepping an svc instruction on s390, the kernel is entered
with a PER program check interruption. The program check handler than
jumps to the system call handler by reloading the PSW. The code didn't
set GPR13 to the thread pointer in struct task_struct. This made the
kernel access invalid memory while trying to fetch the syscall function
address. Fix this by always assigned GPR13 after .Lsysc_per.

Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:45 +02:00
Anson Huang c95c9693b1 soc: imx8m: Correct i.MX8MP UID fuse offset
Correct i.MX8MP UID fuse offset according to fuse map:

UID_LOW: 0x420
UID_HIGH: 0x430

Fixes: fc40200ebf ("soc: imx: increase build coverage for imx8m soc driver")
Fixes: 18f662a738 ("soc: imx: Add i.MX8MP SoC driver support")
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-06-23 19:45:21 +08:00
Hans de Goede a05caf9e62 drm: panel-orientation-quirks: Use generic orientation-data for Acer S1003
The Acer S1003 has proper DMI strings for sys-vendor and product-name,
so we do not need to match by BIOS-date.

This means that the Acer S1003 can use the generic lcd800x1280_rightside_up
drm_dmi_panel_orientation_data struct which is also used by other quirks.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200531093025.28050-2-hdegoede@redhat.com
2020-06-23 12:32:06 +02:00
Hans de Goede 6c22bc18a3 drm: panel-orientation-quirks: Add quirk for Asus T101HA panel
Like the Asus T100HA the Asus T101HA also uses a panel which has been
mounted 90 degrees rotated, albeit in the opposite direction.
Add a quirk for this.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200531093025.28050-1-hdegoede@redhat.com
2020-06-23 12:30:34 +02:00
Christoffer Nielsen 73094608b8 ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S
Similar to the Kingston HyperX AMP, the Kingston HyperX Cloud
Alpha S (0951:0x16ea) uses two interfaces, but only the second
interface contains the capture stream. This patch delays the
registration until the second interface appears.

Signed-off-by: Christoffer Nielsen <cn@obviux.dk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAOtG2YHOM3zy+ed9KS-J4HkZo_QGzcUG9MigSp4e4_-13r6B=Q@mail.gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-06-23 12:09:35 +02:00
Sean Christopherson e4553b4976 KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru
Remove vcpu_vmx.host_pkru, which got left behind when PKRU support was
moved to common x86 code.

No functional change intended.

Fixes: 37486135d3 ("KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200617034123.25647-1-sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-23 06:01:29 -04:00
Marcelo Tosatti 26769f96e6 KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling
The Linux TSC calibration procedure is subject to small variations
(its common to see +-1 kHz difference between reboots on a given CPU, for example).

So migrating a guest between two hosts with identical processor can fail, in case
of a small variation in calibrated TSC between them.

Without TSC scaling, the current kernel interface will either return an error
(if user_tsc_khz <= tsc_khz) or enable TSC catchup mode.

This change enables the following TSC tolerance check to
accept KVM_SET_TSC_KHZ within tsc_tolerance_ppm (which is 250ppm by default).

        /*
         * Compute the variation in TSC rate which is acceptable
         * within the range of tolerance and decide if the
         * rate being applied is within that bounds of the hardware
         * rate.  If so, no scaling or compensation need be done.
         */
        thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
        thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
        if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
                pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
                use_scaling = 1;
        }

NTP daemon in the guest can correct this difference (NTP can correct upto 500ppm).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Message-Id: <20200616114741.GA298183@fuller.cnet>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-23 05:55:17 -04:00
Xiaoyao Li bf10bd0be5 KVM: X86: Fix MSR range of APIC registers in X2APIC mode
Only MSR address range 0x800 through 0x8ff is architecturally reserved
and dedicated for accessing APIC registers in x2APIC mode.

Fixes: 0105d1a526 ("KVM: x2apic interface to lapic")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200616073307.16440-1-xiaoyao.li@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-23 05:49:45 -04:00
Lu Baolu 48f0bcfb7a iommu/vt-d: Fix misuse of iommu_domain_identity_map()
The iommu_domain_identity_map() helper takes start/end PFN as arguments.
Fix a misuse case where the start and end addresses are passed.

Fixes: e70b081c6f ("iommu/vt-d: Remove IOVA handling code from the non-dma_ops path")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Link: https://lore.kernel.org/r/20200622231345.29722-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Lu Baolu 04c00956ee iommu/vt-d: Update scalable mode paging structure coherency
The Scalable-mode Page-walk Coherency (SMPWC) field in the VT-d extended
capability register indicates the hardware coherency behavior on paging
structures accessed through the pasid table entry. This is ignored in
current code and using ECAP.C instead which is only valid in legacy mode.
Fix this so that paging structure updates could be manually flushed from
the cache line if hardware page walking is not snooped.

Fixes: 765b6a98c1 ("iommu/vt-d: Enumerate the scalable mode capability")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Lu Baolu 50310600eb iommu/vt-d: Enable PCI ACS for platform opt in hint
PCI ACS is disabled if Intel IOMMU is off by default or intel_iommu=off
is used in command line. Unfortunately, Intel IOMMU will be forced on if
there're devices sitting on an external facing PCI port that is marked
as untrusted (for example, thunderbolt peripherals). That means, PCI ACS
is disabled while Intel IOMMU is forced on to isolate those devices. As
the result, the devices of an MFD will be grouped by a single group even
the ACS is supported on device.

[    0.691263] pci 0000:00:07.1: Adding to iommu group 3
[    0.691277] pci 0000:00:07.2: Adding to iommu group 3
[    0.691292] pci 0000:00:07.3: Adding to iommu group 3

Fix it by requesting PCI ACS when Intel IOMMU is detected with platform
opt in hint.

Fixes: 89a6079df7 ("iommu/vt-d: Force IOMMU on for platform opt in hint")
Co-developed-by: Lalithambika Krishnakumar <lalithambika.krishnakumar@intel.com>
Signed-off-by: Lalithambika Krishnakumar <lalithambika.krishnakumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Rajat Jain 67e8a5b18d iommu/vt-d: Don't apply gfx quirks to untrusted devices
Currently, an external malicious PCI device can masquerade the VID:PID
of faulty gfx devices, and thus apply iommu quirks to effectively
disable the IOMMU restrictions for itself.

Thus we need to ensure that the device we are applying quirks to, is
indeed an internal trusted device.

Signed-off-by: Rajat Jain <rajatja@google.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Lu Baolu 16ecf10e81 iommu/vt-d: Set U/S bit in first level page table by default
When using first-level translation for IOVA, currently the U/S bit in the
page table is cleared which implies DMA requests with user privilege are
blocked. As the result, following error messages might be observed when
passing through a device to user level:

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000
        [fault reason 129] SM: U/S set 0 for first-level translation
        with user privilege

This fixes it by setting U/S bit in the first level page table and makes
IOVA over first level compatible with previous second-level translation.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:31 +02:00
Lu Baolu 9486727f59 iommu/vt-d: Make Intel SVM code 64-bit only
Current Intel SVM is designed by setting the pgd_t of the processor page
table to FLPTR field of the PASID entry. The first level translation only
supports 4 and 5 level paging structures, hence it's infeasible for the
IOMMU to share a processor's page table when it's running in 32-bit mode.
Let's disable 32bit support for now and claim support only when all the
missing pieces are ready in the future.

Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:31 +02:00
Alexander Usyskin 8c289ea064 mei: me: add tiger lake point device ids for H platforms.
Add Tiger Lake device ids H for HECI1.
TGH_H is also used in Tatlow SPS platform we need to
disable the mei interface there.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20200619165121.2145330-7-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-23 07:55:47 +02:00
Tomas Winkler f76d77f50b mei: me: disable mei interface on Mehlow server platforms
For SPS firmware versions 5.0 and newer the way detection has changed.
The detection is done now via PCI_CFG_HFS_3 register.
To prevent conflict the previous method will get sps_4 suffix
Disable both CNP_H and CNP_H_3 interfaces. CNP_H_3 requires
a separate configuration as it doesn't support DMA.

Cc: <stable@vger.kernel.org>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20200619165121.2145330-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-23 07:55:47 +02:00
Todd Kjos d35d3660e0 binder: fix null deref of proc->context
The binder driver makes the assumption proc->context pointer is invariant after
initialization (as documented in the kerneldoc header for struct proc).
However, in commit f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
proc->context is set to NULL during binder_deferred_release().

Another proc was in the middle of setting up a transaction to the dying
process and crashed on a NULL pointer deref on "context" which is a local
set to &proc->context:

    new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;

Here's the stack:

[ 5237.855435] Call trace:
[ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
[ 5237.855446] binder_inc_ref_for_node+0x140/0x280
[ 5237.855451] binder_translate_binder+0x1d0/0x388
[ 5237.855456] binder_transaction+0x2228/0x3730
[ 5237.855461] binder_thread_write+0x640/0x25bc
[ 5237.855466] binder_ioctl_write_read+0xb0/0x464
[ 5237.855471] binder_ioctl+0x30c/0x96c
[ 5237.855477] do_vfs_ioctl+0x3e0/0x700
[ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
[ 5237.855488] el0_svc_common+0xb4/0x194
[ 5237.855493] el0_svc_handler+0x74/0x98
[ 5237.855497] el0_svc+0x8/0xc

The fix is to move the kfree of the binder_device to binder_free_proc()
so the binder_device is freed when we know there are no references
remaining on the binder_proc.

Fixes: f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-23 07:54:46 +02:00
Aiden Leong 26ac10be3c GUE: Fix a typo
Fix a typo in gue.h

Signed-off-by: Aiden Leong <aiden.leong@aibsd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:12:44 -07:00
Geliang Tang b562f58bbc mptcp: drop sndr_key in mptcp_syn_options
In RFC 8684, we don't need to send sndr_key in SYN package anymore, so drop
it.

Fixes: cc7972ea19 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec")
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:06:39 -07:00
Gaurav Singh 21a739c64d ethtool: Fix check in ethtool_rx_flow_rule_create
Fix check in ethtool_rx_flow_rule_create

Fixes: eca4205f9e ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:48:12 -07:00
Taehee Yoo de0083c7ed hsr: avoid to create proc file after unregister
When an interface is being deleted, "/proc/net/dev_snmp6/<interface name>"
is deleted.
The function for this is addrconf_ifdown() in the addrconf_notify() and
it is called by notification, which is NETDEV_UNREGISTER.
But, if NETDEV_CHANGEMTU is triggered after NETDEV_UNREGISTER,
this proc file will be created again.
This recreated proc file will be deleted by netdev_wati_allrefs().
Before netdev_wait_allrefs() is called, creating a new HSR interface
routine can be executed and It tries to create a proc file but it will
find an un-deleted proc file.
At this point, it warns about it.

To avoid this situation, it can use ->dellink() instead of
->ndo_uninit() to release resources because ->dellink() is called
before NETDEV_UNREGISTER.
So, a proc file will not be recreated.

Test commands
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link set dummy0 mtu 1300

    #SHELL1
    while :
    do
        ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
    done

    #SHELL2
    while :
    do
        ip link del hsr0
    done

Splat looks like:
[ 9888.980852][ T2752] proc_dir_entry 'dev_snmp6/hsr0' already registered
[ 9888.981797][    C2] WARNING: CPU: 2 PID: 2752 at fs/proc/generic.c:372 proc_register+0x2d5/0x430
[ 9888.981798][    C2] Modules linked in: hsr dummy veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6x
[ 9888.981814][    C2] CPU: 2 PID: 2752 Comm: ip Tainted: G        W         5.8.0-rc1+ #616
[ 9888.981815][    C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 9888.981816][    C2] RIP: 0010:proc_register+0x2d5/0x430
[ 9888.981818][    C2] Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 65 01 00 00 49 8b b5 e0 00 00 00 48 89 ea 40
[ 9888.981819][    C2] RSP: 0018:ffff8880628dedf0 EFLAGS: 00010286
[ 9888.981821][    C2] RAX: dffffc0000000008 RBX: ffff888028c69170 RCX: ffffffffaae09a62
[ 9888.981822][    C2] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f75ac
[ 9888.981823][    C2] RBP: ffff888028c693f4 R08: ffffed100d9401bd R09: ffffed100d9401bd
[ 9888.981824][    C2] R10: ffffffffaddf406f R11: 0000000000000001 R12: ffff888028c69308
[ 9888.981825][    C2] R13: ffff8880663584c8 R14: dffffc0000000000 R15: ffffed100518d27e
[ 9888.981827][    C2] FS:  00007f3876b3b0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
[ 9888.981828][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9888.981829][    C2] CR2: 00007f387601a8c0 CR3: 000000004101a002 CR4: 00000000000606e0
[ 9888.981830][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9888.981831][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9888.981832][    C2] Call Trace:
[ 9888.981833][    C2]  ? snmp6_seq_show+0x180/0x180
[ 9888.981834][    C2]  proc_create_single_data+0x7c/0xa0
[ 9888.981835][    C2]  snmp6_register_dev+0xb0/0x130
[ 9888.981836][    C2]  ipv6_add_dev+0x4b7/0xf60
[ 9888.981837][    C2]  addrconf_notify+0x684/0x1ca0
[ 9888.981838][    C2]  ? __mutex_unlock_slowpath+0xd0/0x670
[ 9888.981839][    C2]  ? kasan_unpoison_shadow+0x30/0x40
[ 9888.981840][    C2]  ? wait_for_completion+0x250/0x250
[ 9888.981841][    C2]  ? inet6_ifinfo_notify+0x100/0x100
[ 9888.981842][    C2]  ? dropmon_net_event+0x227/0x410
[ 9888.981843][    C2]  ? notifier_call_chain+0x90/0x160
[ 9888.981844][    C2]  ? inet6_ifinfo_notify+0x100/0x100
[ 9888.981845][    C2]  notifier_call_chain+0x90/0x160
[ 9888.981846][    C2]  register_netdevice+0xbe5/0x1070
[ ... ]

Reported-by: syzbot+1d51c8b74efa4c44adeb@syzkaller.appspotmail.com
Fixes: e0a4b99773 ("hsr: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:42:23 -07:00
Frieder Schrempf d22a16cc92 ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
The WDOG_ANY signal is connected to the RESET_IN signal of the SoM
and baseboard. It is currently configured as push-pull, which means
that if some external device like a programmer wants to assert the
RESET_IN signal by pulling it to ground, it drives against the high
level WDOG_ANY output of the SoC.

To fix this we set the WDOG_ANY signal to open-drain configuration.
That way we make sure that the RESET_IN can be asserted by the
watchdog as well as by external devices.

Fixes: 1ea4b76cdf ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-06-23 11:39:35 +08:00
Frieder Schrempf 04a2c05179 ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM
The watchdog's WDOG_ANY signal is used to trigger a POR of the SoC,
if a soft reset is issued. As the SoM hardware connects the WDOG_ANY
and the POR signals, the watchdog node itself and the pin
configuration should be part of the common SoM devicetree.
Let's move it from the baseboard's devicetree to its proper place.

Fixes: 1ea4b76cdf ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-06-23 11:39:21 +08:00
Sean Christopherson bf09fb6cba KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL
Remove support for context switching between the guest's and host's
desired UMWAIT_CONTROL.  Propagating the guest's value to hardware isn't
required for correct functionality, e.g. KVM intercepts reads and writes
to the MSR, and the latency effects of the settings controlled by the
MSR are not architecturally visible.

As a general rule, KVM should not allow the guest to control power
management settings unless explicitly enabled by userspace, e.g. see
KVM_CAP_X86_DISABLE_EXITS.  E.g. Intel's SDM explicitly states that C0.2
can improve the performance of SMT siblings.  A devious guest could
disable C0.2 so as to improve the performance of their workloads at the
detriment to workloads running in the host or on other VMs.

Wholesale removal of UMWAIT_CONTROL context switching also fixes a race
condition where updates from the host may cause KVM to enter the guest
with the incorrect value.  Because updates are are propagated to all
CPUs via IPI (SMP function callback), the value in hardware may be
stale with respect to the cached value and KVM could enter the guest
with the wrong value in hardware.  As above, the guest can't observe the
bad value, but it's a weird and confusing wart in the implementation.

Removal also fixes the unnecessary usage of VMX's atomic load/store MSR
lists.  Using the lists is only necessary for MSRs that are required for
correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on
old hardware, or for MSRs that need to-the-uop precision, e.g. perf
related MSRs.  For UMWAIT_CONTROL, the effects are only visible in the
kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in
vcpu_vmx_run().  Using the atomic lists is undesirable as they are more
expensive than direct RDMSR/WRMSR.

Furthermore, even if giving the guest control of the MSR is legitimate,
e.g. in pass-through scenarios, it's not clear that the benefits would
outweigh the overhead.  E.g. saving and restoring an MSR across a VMX
roundtrip costs ~250 cycles, and if the guest diverged from the host
that cost would be paid on every run of the guest.  In other words, if
there is a legitimate use case then it should be enabled by a new
per-VM capability.

Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can
correctly expose other WAITPKG features to the guest, e.g. TPAUSE,
UMWAIT and UMONITOR.

Fixes: 6e3ba4abce ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
Cc: stable@vger.kernel.org
Cc: Jingqi Liu <jingqi.liu@intel.com>
Cc: Tao Xu <tao3.xu@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623005135.10414-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22 20:54:57 -04:00
Tuomas Tynkkynen b835a71ef6 usbnet: smsc95xx: Fix use-after-free after removal
Syzbot reports an use-after-free in workqueue context:

BUG: KASAN: use-after-free in mutex_unlock+0x19/0x40 kernel/locking/mutex.c:737
 mutex_unlock+0x19/0x40 kernel/locking/mutex.c:737
 __smsc95xx_mdio_read drivers/net/usb/smsc95xx.c:217 [inline]
 smsc95xx_mdio_read+0x583/0x870 drivers/net/usb/smsc95xx.c:278
 check_carrier+0xd1/0x2e0 drivers/net/usb/smsc95xx.c:644
 process_one_work+0x777/0xf90 kernel/workqueue.c:2274
 worker_thread+0xa8f/0x1430 kernel/workqueue.c:2420
 kthread+0x2df/0x300 kernel/kthread.c:255

It looks like that smsc95xx_unbind() is freeing the structures that are
still in use by the concurrently running workqueue callback. Thus switch
to using cancel_delayed_work_sync() to ensure the work callback really
is no longer active.

Reported-by: syzbot+29dc7d4ae19b703ff947@syzkaller.appspotmail.com
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:34:31 -07:00
Ido Schimmel f3fe412b0a mlxsw: spectrum: Do not rely on machine endianness
The second commit cited below performed a cast of 'u32 buffsize' to
'(u16 *)' when calling mlxsw_sp_port_headroom_8x_adjust():

mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize);

Colin noted that this will behave differently on big endian
architectures compared to little endian architectures.

Fix this by following Colin's suggestion and have the function accept
and return 'u32' instead of passing the current size by reference.

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Fixes: 60833d54d5 ("mlxsw: spectrum: Adjust headroom buffers for 8x ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:29:51 -07:00
Dejin Zheng 6d61f483f1 net: phy: smsc: fix printing too many logs
Commit 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout()
to simplify the code") will print a lot of logs as follows when Ethernet
cable is not connected:

[    4.473105] SMSC LAN8710/LAN8720 2188000.ethernet-1:00: lan87xx_read_status failed: -110

When wait 640 ms for check ENERGYON bit, the timeout should not be
regarded as an actual error and an error message also should not be
printed. due to a hardware bug in LAN87XX device, it leads to unstable
detection of plugging in Ethernet cable when LAN87xx is in Energy Detect
Power-Down mode. the workaround for it involves, when the link is down,
and at each read_status() call:

- disable EDPD mode, forcing the PHY out of low-power mode
- waiting 640ms to see if we have any energy detected from the media
- re-enable entry to EDPD mode

This is presumably enough to allow the PHY to notice that a cable is
connected, and resume normal operations to negotiate with the partner.
The problem is that when no media is detected, the 640ms wait times
out and this commit was modified to prints an error message. it is an
inappropriate conversion by used phy_read_poll_timeout() to introduce
this bug. so fix this issue by use read_poll_timeout() to replace
phy_read_poll_timeout().

Fixes: 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout() to simplify the code")
Reported-by: Kevin Groeneveld <kgroeneveld@gmail.com>
Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:08:48 -07:00