Commit Graph

1074149 Commits

Author SHA1 Message Date
Jason A. Donenfeld 042e293e16 random: wake up /dev/random writers after zap
When account() is called, and the amount of entropy dips below
random_write_wakeup_bits, we wake up the random writers, so that they
can write some more in. However, the RNDZAPENTCNT/RNDCLEARPOOL ioctl
sets the entropy count to zero -- a potential reduction just like
account() -- but does not unblock writers. This commit adds the missing
logic to that ioctl to unblock waiting writers.
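
Roughly, the ioctl path gains something like the following (a sketch
only; the pool and wait-queue names follow the historical
drivers/char/random.c code and are assumptions, not the exact patch):

  case RNDZAPENTCNT:
  case RNDCLEARPOOL:
          if (!capable(CAP_SYS_ADMIN))
                  return -EPERM;
          /* Zero the count; if it was previously at or above the writer
           * wake-up threshold, unblock any waiting writers now.
           */
          if (xchg(&input_pool.entropy_count, 0) >= random_write_wakeup_bits) {
                  wake_up_interruptible(&random_write_wait);
                  kill_fasync(&fasync, SIGIO, POLL_OUT);
          }
          return 0;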

Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04 19:22:32 +01:00
Dominik Brodowski c321e907aa random: continually use hwgenerator randomness
The rngd kernel thread may sleep indefinitely if the entropy count is
kept above random_write_wakeup_bits by other entropy sources. To make
best use of multiple sources of randomness, mix entropy from hardware
RNGs into the pool at least once within CRNG_RESEED_INTERVAL.
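
Conceptually, the indefinite wait in the hwrng feed path
(add_hwgenerator_randomness()) becomes a bounded one, roughly as
sketched below (the exact wait condition and names are assumptions
based on the description above, not the literal patch):

  /* Bound the sleep so hw-RNG bytes get mixed in at least once per
   * CRNG_RESEED_INTERVAL, even if the pool never drops below the
   * writer wake-up threshold.
   */
  wait_event_interruptible_timeout(random_write_wait,
                  kthread_should_stop() ||
                  input_pool.entropy_count <= random_write_wakeup_bits,
                  CRNG_RESEED_INTERVAL);
  mix_pool_bytes(buffer, count);
  credit_entropy_bits(entropy);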

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04 19:22:32 +01:00
Jason A. Donenfeld d2a02e3c8b lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI
blake2s_compress_generic is weakly aliased by blake2s_compress. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".

[    0.000000][    T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[    0.000000][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[    0.000000][    T0] Hardware name: MT6873 (DT)
[    0.000000][    T0] Call trace:
[    0.000000][    T0]  dump_backtrace+0xfc/0x1dc
[    0.000000][    T0]  dump_stack_lvl+0xa8/0x11c
[    0.000000][    T0]  panic+0x194/0x464
[    0.000000][    T0]  __cfi_check_fail+0x54/0x58
[    0.000000][    T0]  __cfi_slowpath_diag+0x354/0x4b0
[    0.000000][    T0]  blake2s_update+0x14c/0x178
[    0.000000][    T0]  _extract_entropy+0xf4/0x29c
[    0.000000][    T0]  crng_initialize_primary+0x24/0x94
[    0.000000][    T0]  rand_initialize+0x2c/0x6c
[    0.000000][    T0]  start_kernel+0x2f8/0x65c
[    0.000000][    T0]  __primary_switched+0xc4/0x7be4
[    0.000000][    T0] Rebooting in 5 seconds..

Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.

In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.
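
For illustration only, the selection ends up shaped roughly like this
(the helper name and the arch entry point are assumptions, not the
exact kernel code):

  static inline void blake2s_compress_one(struct blake2s_state *state,
                                          const u8 *block, size_t nblocks,
                                          u32 inc, bool force_generic)
  {
          /* A branch on a compile-time-constant bool folds away and never
           * emits the indirect call through a weak alias that trips CFI.
           */
          if (force_generic)
                  blake2s_compress_generic(state, block, nblocks, inc);
          else
                  blake2s_compress(state, block, nblocks, inc);
  }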

Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Link: https://github.com/ClangBuiltLinux/linux/issues/1567
Reported-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04 19:22:32 +01:00
Linus Torvalds cff7f2237c A patch to make it possible to disable zero copy path in the messenger
to avoid checksum or authentication tag mismatches and ensuing session
 resets in case the destination buffer isn't guaranteed to be stable.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmH9JWMTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi6GVB/0QZtlzCwL0JVNlF1kro96/Sbyb4kNi
 vUgD9L1RLBBDBuGAgVHgIch3E8KxAwTia0BHWH/kxLAV84RqmpcIwuZiAjrqJaoz
 9JbXmO47+2/lul6YOrzTLDwWzvoMcv/ngUJYbulD0F6oeVqD9Kl3qUkrpf5cy+mJ
 uJpzXDhqrx9A5ruopQRFlx2br1sPp3Jn/45WXejoEUSxnbyKtejK6aZBmatvBgsX
 gtfVqSCQeY+bWXkhkg4ZaYAHRqH1lG5we6FEbB0RIG5gY9ygf1w2OWr33S2qrbTg
 DKz96jQ4nDAsMpvlis1y0IjpxiAY0c1A0B06E0Xxov/d4fNdtAlnaci5
 =pHA5
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.17-rc3' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A patch to make it possible to disable zero copy path in the messenger
  to avoid checksum or authentication tag mismatches and ensuing session
  resets in case the destination buffer isn't guaranteed to be stable"

* tag 'ceph-for-5.17-rc3' of git://github.com/ceph/ceph-client:
  libceph: optionally use bounce buffer on recv path in crc mode
  libceph: make recv path in secure mode work the same as send path
2022-02-04 09:54:02 -08:00
Linus Torvalds 1eb7de177d 9p-for-5.17-rc3: fix cannot walk open fid rule
the 9p 'walk' operation requires fid arguments to not originate from
 an open or create call, and we've missed that for a while as the
 servers we regularly run tests with don't enforce the check and
 no active reviewer knew about the rule.
 
 Both reporters confirmed reverting this patch fixes things for them
 and looking at it further wasn't actually required...
 We'll take more time to follow up and enforce the rule more
 thoroughly later.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmH9A04ACgkQq06b7GqY
 5nBsmQ/9G4MTSusDbUL80d/VMXmUWQ4wfiQ9/t8ikH/QQL3nNf5az5WJxp38McHQ
 SJrZdpSbfy44DRwoAtkO964Qrvn+Yxm+H7aVfXWwThW5l6bF2/tox7VgSLnQmQoE
 96V4zHMDmP1ibmAf3Eh0SfmQQnN7eDec9AXD6sf7sI0pKoRCUD2R8oHVrMLSdJKr
 tRWpFP0C0HLrB6ptkLYOEt0IrQxMfdN7qlWNE7016REJgtNXPrN6U2DN3iXITAHp
 f1pSm+BKrzPcDpT5fGzPfhDvrVtEoep8+vTaszSEedg+Va9sJp8fc3Ha+YWNdZGv
 7ZM6uMTubNOFaSEn33HqsJ7EkUuihckdbkg3FwQJNK6wOHslzWBLeQFj2vNDw8ji
 jBU8owqAbWOlQ2wciHPXrb83mV97nnEMQVge4+9LQQaxLJaoIPTwohKUQxbjiRBU
 b6St4Xs1Npq/h818aycH3pMsoA5FwDrny5OR++E4RbbgiffejmjkxjXG8BPx/mWt
 UjT5BQQ4tD/JQGbB6OCpDkuCnMpHApNIvEodN8P72uZCU9nK6ARXse1tYcR6JqNf
 otNcJBrGTd4o3r/Fgb4wi8D0bLMcVNvXYRQyV/ebRC+NJkUZDIjmfFT8joKnxuIf
 w54gPfPRroxTnvwlniyxsT7HPAS1F6Ds5ve3wTiUNqiGWX6JtsQ=
 =afc6
 -----END PGP SIGNATURE-----

Merge tag '9p-for-5.17-rc3' of git://github.com/martinetd/linux

Pull 9p fix from Dominique Martinet:
 "Fix 'cannot walk open fid' rule

  The 9p 'walk' operation requires fid arguments to not originate from
  an open or create call, and we've missed that for a while as the
  servers we regularly run tests with don't enforce the check and no
  active reviewer knew about the rule.

  Both reporters confirmed reverting this patch fixes things for them
  and looking at it further wasn't actually required... We'll take more
  time to follow up and enforce the rule more thoroughly later"

* tag '9p-for-5.17-rc3' of git://github.com/martinetd/linux:
  Revert "fs/9p: search open fids first"
2022-02-04 09:44:42 -08:00
Linus Torvalds 633a8e8986 7 smb3 fixes, one for stable, the rest fscache related, and an fscache helper
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmH8v0oACgkQiiy9cAdy
 T1F8CQwAjxP339Ln4AQMiuYNWxnzhHgbKKgPD+PZxvS/lvBzIcyTz0aHqRTIGwLr
 8+NH39nzQCDgXRh8qMPc/RPj+qDUMeEV7mEwv3pj9R13tUDJB41bi2nLEIcOoISq
 uJ66aiJ+Nm6e7dU8rbHq7fpvdfnaQgOK64hMk4WWJsiByehUT/fGMs0GTMQKry5i
 t0xYfyfX/ty8/8yrlHTAHFrq60468VHLsaX3Bkxm3Mxff1pziAtwjiqhkP6U1jNQ
 7/qaxHHwgY6Yogoy9Oe9MsFHwhx9dFi3g3j+BLbZn4rpffvU6mJF28iFx9Q42P+K
 BDXLzYhnczkgyTnyASkoZt6s5YVWMLR7K2WT7obLwAmS2UbOkvGiuZCs7xp6AOFz
 9AQCKJqM2V5tWs9doNGubE3pIRjOOFi5ohXq6+CUZv5kqEzyfy+xfEr557JJxSAw
 /PVb1dW7CbDWK+BW1AtfcQo9NA6FQmkXTy+/QdzeEoIxm3szuHOJ+/lY7RP9QTt0
 Fdc+8Y1o
 =2fkO
 -----END PGP SIGNATURE-----

Merge tag '5.17-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "SMB3 client fixes including:

   - multiple fscache related fixes, reenabling ability to read/write to
     cached files for cifs.ko (that was temporarily disabled for cifs.ko
     a few weeks ago due to the recent fscache changes)

   - also includes a new fscache helper function ("query_occupancy")
     used by above

   - fix for multiuser mounts and NTLMSSP auth (workstation name) for
     stable

   - fix locking ordering problem in multichannel code

   - trivial malformed comment fix"

* tag '5.17-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix workstation_name for multiuser mounts
  Invalidate fscache cookie only when inode attributes are changed.
  cifs: Fix the readahead conversion to manage the batch when reading from cache
  cifs: Implement cache I/O by accessing the cache directly
  netfs, cachefiles: Add a method to query presence of data in the cache
  cifs: Transition from ->readpages() to ->readahead()
  cifs: unlock chan_lock before calling cifs_put_tcp_session
  Fix a warning about a malformed kernel doc comment in cifs
2022-02-04 09:34:37 -08:00
Shuah Khan 07d2505b96 kselftest/vm: revert "tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner"
With the change being reverted, userfaultfd fails to build with an
undefined reference to swap() error:

  userfaultfd.c: In function `userfaultfd_stress':
  userfaultfd.c:1530:17: warning: implicit declaration of function `swap'; did you mean `swab'? [-Wimplicit-function-declaration]
   1530 |                 swap(area_src, area_dst);
        |                 ^~~~
        |                 swab
  /usr/bin/ld: /tmp/ccDGOAdV.o: in function `userfaultfd_stress':
  userfaultfd.c:(.text+0x549e): undefined reference to `swap'
  /usr/bin/ld: userfaultfd.c:(.text+0x54bc): undefined reference to `swap'
  collect2: error: ld returned 1 exit status

Revert the commit to fix the problem.

Link: https://lkml.kernel.org/r/20220202003340.87195-1-skhan@linuxfoundation.org
Fixes: 2c769ed713 ("tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner")
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Minghao Chi <chi.minghao@zte.com.cn>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:05 -08:00
Mike Rapoport 6a0fb704b0 MAINTAINERS: update rppt's email
Use my @kernel.org address

Link: https://lkml.kernel.org/r/20220203090324.3701774-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:05 -08:00
Lang Yu c10a0f877f mm/kmemleak: avoid scanning potential huge holes
When using devm_request_free_mem_region() and devm_memremap_pages() to
add ZONE_DEVICE memory, if the requested free mem region's end pfn is
huge (e.g., 0x400000000), node_end_pfn() will also be huge (see
move_pfn_range_to_zone()).  This creates a huge hole between
node_start_pfn() and node_end_pfn().

We found that on some AMD APUs, amdkfd requested such a free mem region
and created a huge hole.  In such a case, the following code snippet
was just doing busy test_bit() looping on the huge hole.

  for (pfn = start_pfn; pfn < end_pfn; pfn++) {
          struct page *page = pfn_to_online_page(pfn);

          if (!page)
                  continue;
          ...
  }

So we got a soft lockup:

  watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
  CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
  RIP: 0010:pfn_to_online_page+0x5/0xd0
  Call Trace:
    ? kmemleak_scan+0x16a/0x440
    kmemleak_write+0x306/0x3a0
    ? common_file_perm+0x72/0x170
    full_proxy_write+0x5c/0x90
    vfs_write+0xb9/0x260
    ksys_write+0x67/0xe0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x3b/0xc0
    entry_SYSCALL_64_after_hwframe+0x44/0xae

I did some tests with the patch.

(1) amdgpu module unloaded

before the patch:

  real    0m0.976s
  user    0m0.000s
  sys     0m0.968s

after the patch:

  real    0m0.981s
  user    0m0.000s
  sys     0m0.973s

(2) amdgpu module loaded

before the patch:

  real    0m35.365s
  user    0m0.000s
  sys     0m35.354s

after the patch:

  real    0m1.049s
  user    0m0.000s
  sys     0m1.042s

Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:05 -08:00
Minghao Chi 520ba72406 ipc/sem: do not sleep with a spin lock held
We can't call kvfree() with a spin lock held, so defer it.
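
The shape of the fix, sketched with illustrative names (not the
literal diff):

  spin_lock(&ulp->lock);
  /* ... unlink the sem_undo entry while holding the lock ... */
  list_del(&un->list_id);
  spin_unlock(&ulp->lock);

  /* kvfree() may sleep (vfree path), so only call it after the
   * spin lock has been dropped.
   */
  kvfree(un);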

Link: https://lkml.kernel.org/r/20211223031207.556189-1-chi.minghao@zte.com.cn
Fixes: fc37a3b8b4 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation")
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Yang Guang <cgel.zte@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:05 -08:00
Mike Rapoport 314c459a6f mm/pgtable: define pte_index so that preprocessor could recognize it
Since commit 974b9b2c68 ("mm: consolidate pte_index() and
pte_offset_*() definitions") pte_index is a static inline and there is
no define for it that can be recognized by the preprocessor.  As a
result, vm_insert_pages() uses slower loop over vm_insert_page() instead
of insert_pages() that amortizes the cost of spinlock operations when
inserting multiple pages.
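
The fix amounts to making the definition visible to the preprocessor
with a self-referential define; a sketch matching the generic
definition:

  static inline unsigned long pte_index(unsigned long address)
  {
          return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
  }
  #define pte_index pte_index     /* let #ifdef pte_index see it */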

Link: https://lkml.kernel.org/r/20220111145457.20748-1-rppt@kernel.org
Fixes: 974b9b2c68 ("mm: consolidate pte_index() and pte_offset_*() definitions")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: Christian Dietrich <stettberger@dokucode.de>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:05 -08:00
Pasha Tatashin 80110bbfbb mm/page_table_check: check entries at pmd levels
syzbot detected a case where the page table counters were not properly
updated.

  syzkaller login:  ------------[ cut here ]------------
  kernel BUG at mm/page_table_check.c:162!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  CPU: 0 PID: 3099 Comm: pasha Not tainted 5.16.0+ #48
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIO4
  RIP: 0010:__page_table_check_zero+0x159/0x1a0
  Call Trace:
   free_pcp_prepare+0x3be/0xaa0
   free_unref_page+0x1c/0x650
   free_compound_page+0xec/0x130
   free_transhuge_page+0x1be/0x260
   __put_compound_page+0x90/0xd0
   release_pages+0x54c/0x1060
   __pagevec_release+0x7c/0x110
   shmem_undo_range+0x85e/0x1250
  ...

The repro involved having a huge page that is split due to an uprobe
event temporarily replacing one of the pages in the huge page.  Later
the huge page was combined again, but the counters were off, as the PTE
level was not properly updated.

Make sure that when the PMD is cleared, and prior to freeing the level,
the PTEs are updated.

Link: https://lkml.kernel.org/r/20220131203249.2832273-5-pasha.tatashin@soleen.com
Fixes: df4e817b71 ("mm: page table check")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Paul Turner <pjt@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:04 -08:00
Pasha Tatashin e59a47b8a4 mm/khugepaged: unify collapse pmd clear, flush and free
Unify the code that flushes, clears the pmd entry, and frees the PTE
table level into a new function, collapse_and_free_pmd().

This cleanup is useful as in the next patch we will add another call to
this function to iterate through the PTEs prior to freeing the level
for the page table check.

Link: https://lkml.kernel.org/r/20220131203249.2832273-4-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Paul Turner <pjt@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:04 -08:00
Pasha Tatashin 64d8b9e145 mm/page_table_check: use unsigned long for page counters and cleanup
For consistency, use "unsigned long" for all page counters.

Also, reduce code duplication by calling __page_table_check_*_clear()
from __page_table_check_*_set() functions.

Link: https://lkml.kernel.org/r/20220131203249.2832273-3-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Wei Xu <weixugc@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Paul Turner <pjt@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:04 -08:00
Pasha Tatashin fb5222aae6 mm/debug_vm_pgtable: remove pte entry from the page table
Patch series "page table check fixes and cleanups", v5.

This patch (of 4):

The pte entry that is used in pte_advanced_tests() is never removed from
the page table at the end of the test.

The issue is detected by page_table_check.  To reproduce it, compile
the kernel with the following configs:

CONFIG_DEBUG_VM_PGTABLE=y
CONFIG_PAGE_TABLE_CHECK=y
CONFIG_PAGE_TABLE_CHECK_ENFORCED=y

During the boot the following BUG is printed:

  debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
  ------------[ cut here ]------------
  kernel BUG at mm/page_table_check.c:162!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-11413-g2c271fe77d52 #3
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
  ...

The entry should be properly removed from the page table before the page
is released to the free list.

Link: https://lkml.kernel.org/r/20220131203249.2832273-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20220131203249.2832273-2-pasha.tatashin@soleen.com
Fixes: a5c3b9ffb0 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>	[5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:04 -08:00
Chen Wandun a85468b766 Revert "mm/page_isolation: unset migratetype directly for non Buddy page"
This reverts commit 721fb891ad.

Commit 721fb891ad ("mm/page_isolation: unset migratetype directly for
non Buddy page") will result memory that should in buddy disappear by
mistake.  move_freepages_block moves all pages in pageblock instead of
pages indicated by input parameter, so if input pages is not in buddy
but other pages in pageblock is in buddy, it will result in page out of
control.

Link: https://lkml.kernel.org/r/20220126024436.13921-1-chenwandun@huawei.com
Fixes: 721fb891ad ("mm/page_isolation: unset migratetype directly for non Buddy page")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Dong Aisheng <aisheng.dong@nxp.com>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-04 09:25:04 -08:00
Jakub Kicinski 40106e005b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Don't refresh timeout for SCTP flows in CLOSED state.

2) Don't allow access to the transport header if the fragment offset is set.

3) Reinitialize internal conntrack state for retransmitted TCP
   syn-ack packet.

4) Update MAINTAINER file to add the Netfilter group tree. Moving
   forward, Florian Westphal has access to this tree so he can also
   send pull requests.

5) Turn on IPS_HELPER for entries created via ctnetlink, otherwise NAT
   might zap it.

All patches from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: ctnetlink: disable helper autoassign
  MAINTAINERS: netfilter: update git links
  netfilter: conntrack: re-init state for retransmitted syn-ack
  netfilter: conntrack: move synack init code to helper
  netfilter: nft_payload: don't allow th access for fragments
  netfilter: conntrack: don't refresh sctp entries in closed state
====================

Link: https://lore.kernel.org/r/20220204151903.320786-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04 08:47:42 -08:00
Hou Tao b5e975d256 bpf, arm64: Enable kfunc call
Since commit b2eed9b588 ("arm64/kernel: kaslr: reduce module
randomization range to 2 GB"), for arm64 whether KASLR is enabled
or not, the module is placed within 2GB of the kernel region, so
s32 in bpf_kfunc_desc is sufficient to represente the offset of
module function relative to __bpf_call_base. The only thing needed
is to override bpf_jit_supports_kfunc_call().
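
The override itself is tiny; in the arm64 JIT it looks essentially
like this (the weak default in the BPF core returns false):

  bool bpf_jit_supports_kfunc_call(void)
  {
          return true;
  }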

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220130092917.14544-2-hotforest@gmail.com
2022-02-04 16:37:13 +01:00
Jiapeng Chong c761161851 mac80211: Remove redundant assignment to channel_type
Fix the following coccicheck warnings:

net/mac80211/util.c:3265:3: warning: Value stored to 'channel_type' is
never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220113161557.129427-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:27:45 +01:00
Baligh Gasmi 45d33746d2 mac80211: remove useless ieee80211_vif_is_mesh() check
We already check ieee80211_vif_is_mesh() in the top if() block,
so there's no need to check for it again.

Signed-off-by: Baligh Gasmi <gasmibal@gmail.com>
Link: https://lore.kernel.org/r/20220203153035.198697-1-gasmibal@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:27:07 +01:00
Avraham Stern ea5907db2a mac80211: fix struct ieee80211_tx_info size
The size of the status_driver_data field was not adjusted when
the is_valid_ack_signal field was added.
Since the size of struct ieee80211_tx_info is limited, replace
the is_valid_ack_signal field with a flags field, and adjust the
struct size accordingly.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.0ff363d4fa56.I45792c0187034a6d0e1c99a7db741996ef7caba3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:26:53 +01:00
Mordechay Goodstein 97634ef4bf mac80211: mlme: validate peer HE supported rates
We validate that the AP has the mandatory rates set in its HE
capabilities.

Also, we make sure the AP is consistent with itself between the rates
set in the HE basic rate set required to join the BSS and the rates set
in its HE capabilities.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.7023450fdf16.I194df59252097ba25a0a543456d4350f1607a538@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:26:40 +01:00
Johannes Berg 453a2a8205 mac80211: remove unused macros
Various macros in mac80211 aren't used, so remove them. In one case
the macro is only used under an ifdef, so put its definition under the
same ifdef to address the W=2 warning.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.5172d7fd878e.I2f1fce686a2b71003f083b2566fb09cf16b8165a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:26:27 +01:00
Johannes Berg 1b198233a3 cfg80211: pmsr: remove useless ifdef guards
This isn't a header file; I guess I must've copied from
the header file and forgotten to remove the guards.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.330d03623b08.Idda91cd6f1c7bd865a50c47d408e5cdab0fd951f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:26:16 +01:00
Johannes Berg ae962e5f63 mac80211: airtime: avoid variable shadowing
This isn't very dangerous, since the outer 'rate' variable
isn't even a pointer, but it's still confusing, so use a
different variable inside.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.8e9b2bfaa0f5.I41c53f754eef28206d04dafc7263ccb99b63d490@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:25:55 +01:00
Mordechay Goodstein 6ad1dce5eb mac80211: mlme: add documentation from spec to code
Reference the spec on why we decline HE support in case the STA
doesn't support all of the HE basic rates required by the AP.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.f1bafd0861b7.I566612d99bca5245dc06cbcc70369b94a525389c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:25:45 +01:00
Mordechay Goodstein abd5a8e5cc mac80211: vht: use HE macros for parsing HE capabilities
IEEE80211_VHT_MCS_NOT_SUPPORTED and IEEE80211_HE_MCS_NOT_SUPPORTED
have the same value, so there is no real bug, but for code consistency
use the HE macros when parsing HE capabilities.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.e974b7b3b217.I732cc7f770c7fa06e4840adb5d45d7ee99ac8eb5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:25:32 +01:00
Johannes Berg 5beb53d6ba ieee80211: radiotap: fix -Wcast-qual warnings
When enabling -Wcast-qual e.g. via W=3, we get a lot of
warnings from this file, whenever it's included. Since
the fixes are simple, just do that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.cc733aeb1a18.I03396e1bf7a1af364cbd0916037f65d800035039@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:25:21 +01:00
Johannes Berg 7e367b06f1 cfg80211: fix -Wcast-qual warnings
When enabling -Wcast-qual e.g. via W=3, we get a lot of
warnings from this file, whenever it's included. Since
the fixes are simple, just do that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.6a1d52213019.I92d82e7251cf712faa43fd09db3142327a3bce3d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:25:10 +01:00
Johannes Berg bed8947893 ieee80211: fix -Wcast-qual warnings
When enabling -Wcast-qual e.g. via W=3, we get a lot of
warnings from this file, whenever it's included. Since
the fixes are simple, just do that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.79ec4a4bab29.I8177a0c79d656c552e22c88931d8da06f2977896@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:24:56 +01:00
Avraham Stern 5666ee154f cfg80211: don't add non transmitted BSS to 6GHz scanned channels
When adding 6GHz channels to a scan request based on reported
co-located APs, don't add channels that have only APs with
"non-transmitted" BSSes if they only match the wildcard SSID, since
they will be found by probing the "transmitted" BSS.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.f6ddf099f934.I231e55885d3644f292d00dfe0f42653269f2559e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:24:41 +01:00
Johannes Berg 667aa74264 cfg80211/mac80211: assume CHECKSUM_COMPLETE includes SNAP
There's currently only one driver that reports CHECKSUM_COMPLETE,
and that is iwlwifi. The current hardware there calculates the checksum
after the SNAP header, but only for RFC 1042 frames (and some other
cases; replicating the exact hardware logic for the corner cases in
the driver seemed awkward).

Newer generations of hardware will checksum _including_ the SNAP,
which makes things easier.

To handle that, simply always assume the checksum _includes_ the
SNAP header, which this patch does. That requires first adding the
SNAP bytes to the checksum for older iwlwifi hardware, and then
removing them again later during conversion.
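
For the older hardware, folding the SNAP header into the reported
checksum could look roughly like the following (a sketch; the
header-length and SNAP-length variables are placeholders, not the
driver's actual fields):

  /* Hardware checksummed from just after the SNAP header; fold the
   * SNAP bytes in so CHECKSUM_COMPLETE covers them as well.
   */
  skb->csum = csum_add(skb->csum,
                       csum_partial(skb->data + hdrlen, snap_len, 0));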

Alternatively, we could have:

 1) Always assumed the checksum starts _after_ the SNAP header;
    the problem with this is that we'd have to replace the exact
    "what is the SNAP" check in iwlwifi that cfg80211 has.

 2) Made it configurable with some flag, but that seemed like too
    much complexity.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.230736e19e0e.I3e6745873585ad943c152fab9e23b5221f17a95f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:23:19 +01:00
Mordechay Goodstein f39b7d62a1 mac80211: consider RX NSS in UHB connection
In a UHB connection we don't have any HT/VHT elements, so in order to
calculate the max RX NSS we also need to look at the HE capa element;
otherwise the max RX NSS in UHB ends up limited to 1.

In any case, we need to look at the HE max RX NSS, and not only at the
HT/VHT capa, to determine the max RX NSS over the connection.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.3713e0dea5dd.I3b9a15b4c53465c3f86f35459e9dc15ae4ea2abd@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:23:01 +01:00
Johannes Berg 1f2c104448 mac80211: limit bandwidth in HE capabilities
If we're limiting bandwidth for some reason such as regulatory
restrictions, then advertise that limitation just like we do
for VHT today, so the AP is aware we cannot use the higher BW
it might be using.

Fixes: 41cbb0f5a2 ("mac80211: add support for HE")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220202104617.70c8e3e7ee76.If317630de69ff1146bec7d47f5b83038695eb71d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-04 16:22:39 +01:00
Joerg Roedel 9b45a7738e iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
The polling loop for the register change in iommu_ga_log_enable() needs
to have a udelay() in it.  Otherwise the CPU might be faster than the
IOMMU hardware and wrongly trigger the WARN_ON() further down the code
stream. Use 10us for the udelay(), as there is some hardware where
activation of the GA log can take more than 100ms.

A future optimization should move the activation check of the GA log
to the point where it gets used for the first time. But that is a
bigger change and not suitable for a fix.
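
The resulting poll loop is roughly of this shape (illustrative only;
the limit, register offset and status bit below are placeholders
rather than the driver's actual symbols):

  u32 status;
  int i;

  for (i = 0; i < GA_LOG_POLL_LIMIT; i++) {
          status = readl(iommu->mmio_base + STATUS_REG_OFFSET);
          if (status & GA_LOG_RUNNING_BIT)
                  break;
          udelay(10);     /* give slow hardware time to flip the bit */
  }

  if (WARN_ON(i >= GA_LOG_POLL_LIMIT))
          return -EINVAL;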

Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220204115537.3894-1-joro@8bytes.org
2022-02-04 12:57:26 +01:00
Bo Jiao b3ad9d6a1d mt76: redefine mt76_for_each_q_rx to adapt mt7986 changes
Check the ndesc field of q_rx to avoid a potential hole in the iteration.
This is an intermediate patch to add mt7986 support.

Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com>
Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
2022-02-04 11:28:36 +01:00
Samuel Mendoza-Jonas fe68195daf ixgbevf: Require large buffers for build_skb on 82599VF
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb
around new data in the page buffer shared with the ixgbe PF.
This uses either a 2K or 3K buffer, and offsets the DMA mapping by
NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to
ensure the PF does not write a full 2K bytes into the buffer, which is
actually 2K minus the offset.

However on the 82599 virtual function, the RXDCTL mechanism is not
available. The driver attempts to work around this by using the SET_LPE
mailbox method to lower the maximum frame size, but the ixgbe PF driver
ignores this in order to keep the PF and all VFs in sync[0].

This means the PF will write up to the full 2K set in SRRCTL, causing it
to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer.
With 4K pages split into two buffers, this means it either writes
NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the
second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA
mapping.

Avoid this by only enabling build_skb when using "large" buffers (3K).
These are placed in each half of an order-1 page, preventing the PF from
writing past the end of the mapping.

[0]: Technically it only ever raises the max frame size, see
ixgbe_set_vf_lpe() in ixgbe_sriov.c

Fixes: f15c5ba5b6 ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:23:21 +00:00
David S. Miller c531adaf88 Merge branch 'ipa-RX-replenish'
Alex Elder says:

====================
net: ipa: improve RX buffer replenishing

This series revises the algorithm used for replenishing receive
buffers on RX endpoints.  Currently there are two atomic variables
that track how many receive buffers can be sent to the hardware.
The new algorithm obviates the need for those, by just assuming we
always want to provide the hardware with buffers until it can hold
no more.

The first patch eliminates an atomic variable that's not required.
The next moves some code into the main replenish function's caller,
making one of the called function's arguments unnecessary.   The
next six refactor things a bit more, adding a new helper function
that allows us to eliminate an additional atomic variable.  And the
final two implement two more minor improvements.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:09 +00:00
Alex Elder 9654d8c462 net: ipa: determine replenish doorbell differently
Rather than tracking the number of receive buffer transactions that
have been submitted without a doorbell, just track the total number
of transactions that have been issued.  Then ring the doorbell when
that number modulo the replenish batch size is 0.

The effect is roughly the same, but the new count is slightly more
interesting, and this approach will someday allow the replenish
batch size to be tuned at runtime.
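
In other words, the commit decision becomes a modulo test on a running
total, roughly as below (field and constant names are illustrative,
not necessarily the driver's):

  endpoint->replenish_count++;    /* total transactions ever issued */
  bool doorbell = !(endpoint->replenish_count % IPA_REPLENISH_BATCH);
  gsi_trans_commit(trans, doorbell);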

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder 5d6ac24fb1 net: ipa: replenish after delivering payload
Replenishing is now solely driven by whether transactions are
available for a channel, and it doesn't really matter whether
we replenish before or after we deliver received packets to the
network stack.

Replenishing before delivering the payload adds a little latency.
Eliminate that by requesting a replenish after the payload is
delivered.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder 09b337deda net: ipa: kill replenish_backlog
We no longer use the replenish_backlog atomic variable to decide
when we've got work to do providing receive buffers to hardware.
Basically, we try to keep the hardware as full as possible, all the
time.  We keep supplying buffers until the hardware has no more
space for them.

As a result, we can get rid of the replenish_backlog field and the
atomic operations performed on it.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder 5fc7f9ba2e net: ipa: introduce gsi_channel_trans_idle()
Create a new function that returns true if all transactions for a
channel are available for use.

Use it in ipa_endpoint_replenish_enable() to see whether to start
replenishing, and in ipa_endpoint_replenish() to determine whether,
after a failure, it's necessary to schedule delayed work to ensure a
future replenish attempt occurs.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder d0ac30e74e net: ipa: don't use replenish_backlog
Rather than determining when to stop replenishing using the
replenish backlog, just stop when we have exhausted all available
transactions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder 6a606b9015 net: ipa: allocate transaction in replenish loop
When replenishing, have ipa_endpoint_replenish() allocate a
transaction, and pass that to ipa_endpoint_replenish_one() to fill.
Then, if that produces no error, commit the transaction within the
replenish loop as well.  In this way we can distinguish between
transaction failures and buffer allocation/mapping failures.

Failure to allocate a transaction simply means the hardware already
has as many receive buffers as it can hold.  In that case we can
break out of the replenish loop because there's nothing more to do.

If we fail to allocate or map pages for the receive buffer, just
try again later.
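
The replenish loop then has roughly this shape (a sketch with
illustrative names, not the exact driver code):

  while ((trans = ipa_endpoint_trans_alloc(endpoint, 1))) {
          if (ipa_endpoint_replenish_one(endpoint, trans)) {
                  /* Page allocation or DMA mapping failed: retry later. */
                  gsi_trans_free(trans);
                  schedule_delayed_work(&endpoint->replenish_work,
                                        msecs_to_jiffies(1));
                  break;
          }
          gsi_trans_commit(trans, doorbell);      /* doorbell decided per batch */
  }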

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder b9dbabc5ca net: ipa: decide on doorbell in replenish loop
Decide whether the doorbell should be signaled when committing a
replenish transaction in the main replenish loop, rather than in
ipa_endpoint_replenish_one().  This is a step to facilitate the
next patch.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:08 +00:00
Alex Elder 4b22d84195 net: ipa: increment backlog in replenish caller
Three spots call ipa_endpoint_replenish(), and just one of those
requests that the backlog be incremented after completing the
replenish operation.

Instead, have the caller increment the backlog, and get rid of the
add_one argument to ipa_endpoint_replenish().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:07 +00:00
Alex Elder b4061c136b net: ipa: allocate transaction before pages when replenishing
A transaction failure only occurs if no more transactions are
available for an endpoint.  It's a very cheap test.

When replenishing an RX endpoint buffer, there's no point in
allocating pages if transactions are exhausted.  So don't bother
doing so unless the transaction allocation succeeds.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:07 +00:00
Alex Elder a9bec7ae70 net: ipa: kill replenish_saved
The replenish_saved field keeps track of the number of times a new
buffer is added to the backlog when replenishing is disabled.  We
don't really use it though, so there's no need for us to track it
separately.  Whether replenishing is enabled or not, we can simply
increment the backlog.

Get rid of replenish_saved, and initialize and increment the backlog
where it would have otherwise been used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:16:07 +00:00
Jakub Kicinski b93235e689 tls: cap the output scatter list to something reasonable
TLS recvmsg() passes user pages as destination for decrypt.
The decrypt operation is repeated record by record, each
record being 16kB, max. TLS allocates an sg_table and uses
iov_iter_get_pages() to populate it with enough pages to
fit the decrypted record.

Even though we decrypt a single message at a time we size
the sg_table based on the entire length of the iovec.
This leads to unnecessarily large allocations, risking
triggering OOM conditions.

Use iov_iter_truncate() / iov_iter_reexpand() to construct
a "capped" version of iov_iter_npages(). Alternatively we
could parametrize iov_iter_npages() to take the size as
arg instead of using i->count, or do something else..
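
A sketch of the capped helper described above (the helper name is
illustrative; it relies on the iov_iter helpers from <linux/uio.h>):

  static int iov_iter_npages_cap(struct iov_iter *iter, int maxpages,
                                 size_t max_len)
  {
          size_t orig_count = iov_iter_count(iter);
          int npages;

          if (orig_count > max_len) {
                  /* Count pages for one record only, then restore. */
                  iov_iter_truncate(iter, max_len);
                  npages = iov_iter_npages(iter, maxpages);
                  iov_iter_reexpand(iter, orig_count);
          } else {
                  npages = iov_iter_npages(iter, maxpages);
          }

          return npages;
  }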

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:14:07 +00:00
Russell King (Oracle) 6ff6064605 net: dsa: realtek: convert to phylink_generic_validate()
Populate the supported interfaces and MAC capabilities for the Realtek
rtl8365 DSA switch and remove the old validate implementation to allow
DSA to use phylink_generic_validate() for this switch driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04 10:12:53 +00:00