OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	194dfe88d6	asm-generic updates for 5.18 There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. There are some obvious conflicts against changes to the removed files. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmI69BsACgkQmmx57+YA GNn/zA//f4d5VTT0ThhRxRWTu9BdThGHoB8TUcY7iOhbsWu0X/913NItRC3UeWNl IdmisaXgVtirg1dcC2pWUmrcHdoWOCEGfK4+Zr2NhSWfuZDWvODHK9pGWk4WLnhe cQgUNBvIuuAMryGtrOBwHPO4TpfCyy2ioeVP36ZfcsWXdDxTrqfaq/56mk3sxIP6 sUTk1UEjut9NG4C9xIIvcSU50R3l6LryQE/H9kyTLtaSvfvTOvprcVYCq0GPmSzo DtQ1Wwa9zbJ+4EqoMiP5RrgQwWvOTg2iRByLU8ytwlX3e/SEF0uihvMv1FQbL8zG G8RhGUOKQSEhaBfc3lIkm8GpOVPh0uHzB6zhn7daVmAWtazRD2Nu59BMjipa+ims a8Z58iHH7jRAnKeEkVZqXKb1CEiUxaQx/IeVPzN4QlwMhDtwrI76LY7ZJ1zCqTGY ENG0yRLav1XselYBslOYXGtOEWcY5EZPWqLyWbp4P9vz2g0Fe0gZxoIOvPmNQc89 QnfXpCt7vm/DGkyO255myu08GOLeMkisVqUIzLDB9avlym5mri7T7vk9abBa2YyO CRpTL5gl1/qKPWuH1UI5mvhT+sbbBE2SUHSuy84btns39ZKKKynwCtdu+hSQkKLE h9pV30Gf1cLTD4JAE0RWlUgOmbBLVp34loTOexQj4MrLM1noOnw= =vtCN -----END PGP SIGNATURE----- Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...	2022-03-23 18:03:08 -07:00
Linus Torvalds	9c4b86ebf5	fbdev fixes and updates for kernel v5.18-rc1 Lots of small fixes and code cleanups across most of the fbdev drivers. This includes conversions to use helper functions, const conversions, spelling fixes, help text updates, adding return value checks, small build fixes, and much more. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYjogFQAKCRD3ErUQojoP X3Y7AQD/0Qd0zm6klv4EPeyLXOYzs6uXdyHiJGyCBABP3WxKZQD+J4yNXjd1g50A iGbsawaUpFMcaXTETr0NcrtRkc5jCgY= =HHa3 -----END PGP SIGNATURE----- Merge tag 'for-5.18/fbdev-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev updates from Helge Deller: "Lots of small fixes and code cleanups across most of the fbdev drivers. This includes conversions to use helper functions, const conversions, spelling fixes, help text updates, adding return value checks, small build fixes, and much more" * tag 'for-5.18/fbdev-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (59 commits) video: fbdev: kyro: make read-only array ODValues static const video: fbdev: offb: fix warning comparing pointer to 0 video: fbdev: omapfb: Add missing of_node_put() in dvic_probe_of video: fbdev: sm712fb: Fix crash in smtcfb_write() video: fbdev: s3c-fb: fix platform_get_irq.cocci warning video: fbdev: sm712fb: Fix crash in smtcfb_read() video: fbdev: via: check the return value of kstrdup() video: fbdev: au1100fb: Spelling s/palette/palette/ video: fbdev: atari: Atari 2 bpp (STe) palette bugfix video: fbdev: atari: Remove unused atafb_setcolreg() video: fbdev: atari: Convert to standard round_up() helper video: fbdev: atari: Fix TT High video mode video: fbdev: udlfb: replace snprintf in show functions with sysfs_emit video: fbdev: omapfb: panel-tpo-td043mtea1: Use sysfs_emit() instead of snprintf() video: fbdev: omapfb: panel-dsi-cm: Use sysfs_emit() instead of snprintf() video: fbdev: omapfb: Use sysfs_emit() instead of snprintf() video: fbdev: s3c-fb: Use platform_get_irq() to get the interrupt video: fbdev: Fix wrong file path for pvr2fb.c in Kconfig help text video: fbdev: pxa3xx-gcu: Remove unnecessary print function dev_err() video: fbdev: pxa168fb: Remove unnecessary print function dev_err() ...	2022-03-23 14:45:01 -07:00
Kajol Jain	d0007eb15c	powerpc/papr_scm: Fix build failure when The following build failure occurs when CONFIG_PERF_EVENTS is not set as generic pmu functions are not visible in that scenario. arch/powerpc/platforms/pseries/papr_scm.c:372:35: error: ‘struct perf_event’ has no member named ‘attr’ p->nvdimm_events_map[event->attr.config], ^~ In file included from ./include/linux/list.h:5, from ./include/linux/kobject.h:19, from ./include/linux/of.h:17, from arch/powerpc/platforms/pseries/papr_scm.c:5: arch/powerpc/platforms/pseries/papr_scm.c: In function ‘papr_scm_pmu_event_init’: arch/powerpc/platforms/pseries/papr_scm.c:389:49: error: ‘struct perf_event’ has no member named ‘pmu’ struct nvdimm_pmu nd_pmu = to_nvdimm_pmu(event->pmu); ^~ ./include/linux/container_of.h:18:26: note: in definition of macro ‘container_of’ void __mptr = (void )(ptr); \ ^~~ arch/powerpc/platforms/pseries/papr_scm.c:389:30: note: in expansion of macro ‘to_nvdimm_pmu’ struct nvdimm_pmu nd_pmu = to_nvdimm_pmu(event->pmu); ^~~~~~~~~~~~~ In file included from ./include/linux/bits.h:22, from ./include/linux/bitops.h:6, from ./include/linux/of.h:15, from arch/powerpc/platforms/pseries/papr_scm.c:5: Fix the build issue by adding check for CONFIG_PERF_EVENTS config option and also add stub function for papr_scm_pmu_register to handle the CONFIG_PERF_EVENTS=n case. Also move the position of macro "to_nvdimm_pmu" inorder to merge it in CONFIG_PERF_EVENTS=y block. based on libnvdimm-for-next tree) Fixes: `4c08d4bbc0` ("powerpc/papr_scm: Add perf interface support") (Commit id Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Link: https://lore.kernel.org/r/20220323164550.109768-2-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-03-23 12:17:36 -07:00
Linus Torvalds	9030fb0bb9	Folio changes for 5.18 - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmI4ucgACgkQDpNsjXcp gj69Wgf6AwqwmO5Tmy+fLScDPqWxmXJofbocae1kyoGHf7Ui91OK4U2j6IpvAr+g P/vLIK+JAAcTQcrSCjymuEkf4HkGZOR03QQn7maPIEe4eLrZRQDEsmHC1L9gpeJp s/GMvDWiGE0Tnxu0EOzfVi/yT+qjIl/S8VvqtCoJv1HdzxitZ7+1RDuqImaMC5MM Qi3uHag78vLmCltLXpIOdpgZhdZexCdL2Y/1npf+b6FVkAJRRNUnA0gRbS7YpoVp CbxEJcmAl9cpJLuj5i5kIfS9trr+/QcvbUlzRxh4ggC58iqnmF2V09l2MJ7YU3XL v1O/Elq4lRhXninZFQEm9zjrri7LDQ== =n9Ad -----END PGP SIGNATURE----- Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...	2022-03-22 17:03:12 -07:00
Linus Torvalds	3bf03b9a08	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...	2022-03-22 16:11:53 -07:00
David Hildenbrand	2848a28b0a	drivers/base/node: consolidate node device subsystem initialization in node_dev_init() ... and call node_dev_init() after memory_dev_init() from driver_init(), so before any of the existing arch/subsys calls. All online nodes should be known at that point: early during boot, arch code determines node and zone ranges and sets the relevant nodes online; usually this happens in setup_arch(). This is in line with memory_dev_init(), which initializes the memory device subsystem and creates all memory block devices. Similar to memory_dev_init(), panic() if anything goes wrong, we don't want to continue with such basic initialization errors. The important part is that node_dev_init() gets called after memory_dev_init() and after cpu_dev_init(), but before any of the relevant archs call register_cpu() to register the new cpu device under the node device. The latter should be the case for the current users of topology_init(). Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64) Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:10 -07:00
Hari Bathini	ee97347fe0	powerpc/fadump: opt out from freeing pages on cma activation failure With commit `a4e92ce8e4` ("powerpc/fadump: Reservationless firmware assisted dump"), Linux kernel's Contiguous Memory Allocator (CMA) based reservation was introduced in fadump. That change was aimed at using CMA to let applications utilize the memory reserved for fadump while blocking it from being used for kernel pages. The assumption was, even if CMA activation fails for whatever reason, the memory still remains reserved to avoid it from being used for kernel pages. But commit `072355c1cf` ("mm/cma: expose all pages to the buddy if activation of an area fails") breaks this assumption as it started exposing all pages to buddy allocator on CMA activation failure. It led to warning messages like below while running crash-utility on vmcore of a kernel having above two commits: crash: seek error: kernel virtual address: <from reserved region> To fix this problem, opt out from exposing pages to buddy allocator on CMA activation failure for fadump reserved memory. Link: https://lkml.kernel.org/r/20220117075246.36072-3-hbathini@linux.ibm.com Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:09 -07:00
David Hildenbrand	e16faf2678	cma: factor out minimum alignment requirement Patch series "mm: enforce pageblock_order < MAX_ORDER". Having pageblock_order >= MAX_ORDER seems to be able to happen in corner cases and some parts of the kernel are not prepared for it. For example, Aneesh has shown [1] that such kernels can be compiled on ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will run into a WARN_ON_ONCE(order >= MAX_ORDER) in comapction code right during boot. We can get pageblock_order >= MAX_ORDER when the default hugetlb size is bigger than the maximum allocation granularity of the buddy, in which case we are no longer talking about huge pages but instead gigantic pages. Having pageblock_order >= MAX_ORDER can only make alloc_contig_range() of such gigantic pages more likely to succeed. Reliable use of gigantic pages either requires boot time allcoation or CMA, no need to overcomplicate some places in the kernel to optimize for corner cases that are broken in other areas of the kernel. This patch (of 2): Let's enforce pageblock_order < MAX_ORDER and simplify. Especially patch #1 can be regarded a cleanup before: [PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range alignment. [2] [1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com [2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: John Garry via iommu <iommu@lists.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:05 -07:00
Anshuman Khandual	16785bd774	mm: merge pte_mkhuge() call into arch_make_huge_pte() Each call into pte_mkhuge() is invariably followed by arch_make_huge_pte(). Instead arch_make_huge_pte() can accommodate pte_mkhuge() at the beginning. This updates generic fallback stub for arch_make_huge_pte() and available platforms definitions. This makes huge pte creation much cleaner and easier to follow. Link: https://lkml.kernel.org/r/1643860669-26307-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:04 -07:00
Linus Torvalds	14705fda8f	New features: - NFSv3 support in NFSD is now always built - Added NFSD support for the NFSv4 birth-time file attribute - Added support for storing and displaying sockaddrs in trace points - NFSD now recognizes RPC_AUTH_TLS probes Performance improvements: - Optimized the svc transport enqueuing mechanism - Added micro-optimizations for the duplicate reply cache Notable bug fixes: - Allocation of the NFSD file cache hash table is more reliable -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmI3XNkACgkQM2qzM29m f5dNERAAqJ/nzfVp2H5BKLszJ7p/s13wFqbW719Rzymzl6/tUUHOqIsVdBsJsa/b BdZQLLDwa6ZB5zOAnC6FosRKYu4lwixOOf94pC6a9ZDD/glYVKF8mZG+RZXPAy16 g3JUOi/bcyHXv0ZUhbv7DqW+HHM+owPP4vlNJ9ChiiLr/Xdp8NBKj+4Qtn/wcAo+ Xuvx7fU/5Mbemh+dd5mWker4afHvpt9V6U6s04m5LiTPPnHVnxmeyekJGUCOY0QO cm/6SPNDqyn/VEfM/SRxEnLE9QcHRhZo/4PKRGF4wYolcviIogbILE1M7Ig/r/Gv 6Du2kcRAhyZ3zgWnu799Ivn3Q6IrVjxZwqmsi7YHURTwYKyZtxYsUk0MCBcpnxrE WyTS2onpElbMop3viKCnQdpIetbsHnUNg3udUV6ugbdCbnZuhUw5B/d6x0o8ZWDE C0f+jnX+GnBstn0vkcj2H0+VQTd5hUJtXMrooI42ODJoboQRZcmePwoXjqCmw3sy PXTxLZC/5+4zNHGUuz4Pq4V7FKr4nHhDzaW4ZDO3mILx4ahceotulY1B/yoBUu8o /LAhu2kJ6nFQkmpzdrGzPeOstgJYHm9CaitRvMzg+NJxEAJdebypdQDbX5iNpgfW MDXH4n8eIqroTlQ/mQYEV0EbC7BaTqSCL6rQdcrcFUPu9n4Fcno= =5nac -----END PGP SIGNATURE----- Merge tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "New features: - NFSv3 support in NFSD is now always built - Added NFSD support for the NFSv4 birth-time file attribute - Added support for storing and displaying sockaddrs in trace points - NFSD now recognizes RPC_AUTH_TLS probes Performance improvements: - Optimized the svc transport enqueuing mechanism - Added micro-optimizations for the duplicate reply cache Notable bug fixes: - Allocation of the NFSD file cache hash table is more reliable" * tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (30 commits) nfsd: fix using the correct variable for sizeof() nfsd: use correct format characters NFSD: prevent integer overflow on 32 bit systems NFSD: prevent underflow in nfssvc_decode_writeargs() fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock. NFSD: Fix nfsd_breaker_owns_lease() return values NFSD: Clean up _lm_ operation names arch: Remove references to CONFIG_NFSD_V3 in the default configs NFSD: Remove CONFIG_NFSD_V3 nfsd: more robust allocation failure handling in nfsd_file_cache_init SUNRPC: Teach server to recognize RPC_AUTH_TLS NFSD: Move svc_serv_ops::svo_function into struct svc_serv NFSD: Remove svc_serv_ops::svo_module SUNRPC: Remove svc_shutdown_net() SUNRPC: Rename svc_close_xprt() SUNRPC: Rename svc_create_xprt() SUNRPC: Remove svo_shutdown method SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt() SUNRPC: Remove the .svo_enqueue_xprt method SUNRPC: Record endpoint information in trace log ...	2022-03-22 10:29:51 -07:00
Linus Torvalds	2142b7f0c6	hardening updates for v5.18-rc1 - Add arm64 Shadow Call Stack support for GCC 12 (Dan Li) - Avoid memset with stack offset randomization under Clang (Marco Elver) - Clean up stackleak plugin to play nice with .noinstr (Kees Cook) - Check stack depth for greater usercopy hardening coverage (Kees Cook) -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmI4kXMWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhBoD/wJFr0s13Cvsbibuk7PLAPJlQe9 QBMolrrS9+JNoqdIMiILrmthCPnDBkBNrU/YvfkIyGQOO2RGxrtZVzLhyHKCDg6u iIkNG9S5D12ucEdqqLWdZxyBZcQuR6Rf//lGvtx8ps+jYy8fDwRekurJIb3kWl5u qB0O0PFd+RjGgvtm+Fh8h0FiBMxbKfPXI+s7W2rCfcwe+w5Z24YD1eoCHmnQJYcu Mnuk7cHsx2TFms4UqUK1Z/0EBpCKNEEX4s0z/nrfu8dRTPvLqLgbGpcmXTkik9PN BucIxgdRqqYbTyGvhsDhpEUVfmFcQzdPmuMnnnUc8BiXy9EqGqSfjMEzutuf+RS7 0i4LWoDW2LYMUixqDLAMdLpwdC2Ca7hP62kE4vNVqW3jBty+jhPBVO6ddhHO14nd q6m+CQz0SVTIyrLI4N+TNg/EIj2DpBpAhs49QWDOL/ZqP0ewYk8Ef8pXKgJo2jJC aAs+18pdpoVCEs1fztzjuWZT77iTmziYhb2BOMnT4yBcAdifi7eW6l0pYsgfxoJ/ WC/MmTWt08/IHBk09d8GbFdoP8byDUgzmzUUoskJJH2JA7475xM6qhI2J627Lpth baEv3UT8JWBBX+koU2wxhxKgscIvbNjJjpEGNt2YuBBeQ4lrlijsFzQjmu62gZDL LG0XOVV97/1V9uJ2CA== =yaWZ -----END PGP SIGNATURE----- Merge tag 'hardening-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening updates from Kees Cook: - Add arm64 Shadow Call Stack support for GCC 12 (Dan Li) - Avoid memset with stack offset randomization under Clang (Marco Elver) - Clean up stackleak plugin to play nice with .noinstr (Kees Cook) - Check stack depth for greater usercopy hardening coverage (Kees Cook) * tag 'hardening-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: Add gcc Shadow Call Stack support m68k: Implement "current_stack_pointer" xtensa: Implement "current_stack_pointer" usercopy: Check valid lifetime via stack depth stack: Constrain and fix stack offset randomization with Clang builds stack: Introduce CONFIG_RANDOMIZE_KSTACK_OFFSET gcc-plugins/stackleak: Ignore .noinstr.text and .entry.text gcc-plugins/stackleak: Exactly match strings instead of prefixes gcc-plugins/stackleak: Provide verbose mode	2022-03-21 19:32:04 -07:00
Linus Torvalds	93e220a62d	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - hwrng core now credits for low-quality RNG devices. Algorithms: - Optimisations for neon aes on arm/arm64. - Add accelerated crc32_be on arm64. - Add ffdheXYZ(dh) templates. - Disallow hmac keys < 112 bits in FIPS mode. - Add AVX assembly implementation for sm3 on x86. Drivers: - Add missing local_bh_disable calls for crypto_engine callback. - Ensure BH is disabled in crypto_engine callback path. - Fix zero length DMA mappings in ccree. - Add synchronization between mailbox accesses in octeontx2. - Add Xilinx SHA3 driver. - Add support for the TDES IP available on sama7g5 SoC in atmel" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits) crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list crypto: dh - Remove the unused function dh_safe_prime_dh_alg() hwrng: nomadik - Change clk_disable to clk_disable_unprepare crypto: arm64 - cleanup comments crypto: qat - fix initialization of pfvf rts_map_msg structures crypto: qat - fix initialization of pfvf cap_msg structures crypto: qat - remove unneeded assignment crypto: qat - disable registration of algorithms crypto: hisilicon/qm - fix memset during queues clearing crypto: xilinx: prevent probing on non-xilinx hardware crypto: marvell/octeontx - Use swap() instead of open coding it crypto: ccree - Fix use after free in cc_cipher_exit() crypto: ccp - ccp_dmaengine_unregister release dma channels crypto: octeontx2 - fix missing unlock hwrng: cavium - fix NULL but dereferenced coccicheck error crypto: cavium/nitrox - don't cast parameter in bit operations crypto: vmx - add missing dependencies MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver crypto: xilinx - Add Xilinx SHA3 driver ...	2022-03-21 16:02:36 -07:00
Matthew Wilcox (Oracle)	9e996c2115	powerpc: Add pmd_pfn() This is straightforward for everything except nohash64 where we indirect through pmd_page(). There must be a better way to do this. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2022-03-21 12:59:02 -04:00
Matthew Wilcox (Oracle)	d1d8a3b4d0	mm: Turn isolate_lru_page() into folio_isolate_lru() Add isolate_lru_page() as a wrapper around isolate_lru_folio(). TestClearPageLRU() would have always failed on a tail page, so returning -EBUSY is the same behaviour. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com>	2022-03-21 12:56:36 -04:00
Michael Ellerman	fe2640bd7a	powerpc/pseries: Fix use after free in remove_phb_dynamic() In remove_phb_dynamic() we use &phb->io_resource, after we've called device_unregister(&host_bridge->dev). But the unregister may have freed phb, because pcibios_free_controller_deferred() is the release function for the host_bridge. If there are no outstanding references when we call device_unregister() then phb will be freed out from under us. This has gone mainly unnoticed, but with slub_debug and page_poison enabled it can lead to a crash: PID: 7574 TASK: c0000000d492cb80 CPU: 13 COMMAND: "drmgr" #0 [c0000000e4f075a0] crash_kexec at c00000000027d7dc #1 [c0000000e4f075d0] oops_end at c000000000029608 #2 [c0000000e4f07650] __bad_page_fault at c0000000000904b4 #3 [c0000000e4f076c0] do_bad_slb_fault at c00000000009a5a8 #4 [c0000000e4f076f0] data_access_slb_common_virt at c000000000008b30 Data SLB Access [380] exception frame: R0: c000000000167250 R1: c0000000e4f07a00 R2: c000000002a46100 R3: c000000002b39ce8 R4: 00000000000000c0 R5: 00000000000000a9 R6: 3894674d000000c0 R7: 0000000000000000 R8: 00000000000000ff R9: 0000000000000100 R10: 6b6b6b6b6b6b6b6b R11: 0000000000008000 R12: c00000000023da80 R13: c0000009ffd38b00 R14: 0000000000000000 R15: 000000011c87f0f0 R16: 0000000000000006 R17: 0000000000000003 R18: 0000000000000002 R19: 0000000000000004 R20: 0000000000000005 R21: 000000011c87ede8 R22: 000000011c87c5a8 R23: 000000011c87d3a0 R24: 0000000000000000 R25: 0000000000000001 R26: c0000000e4f07cc8 R27: c00000004d1cc400 R28: c0080000031d00e8 R29: c00000004d23d800 R30: c00000004d1d2400 R31: c00000004d1d2540 NIP: c000000000167258 MSR: 8000000000009033 OR3: c000000000e9f474 CTR: 0000000000000000 LR: c000000000167250 XER: 0000000020040003 CCR: 0000000024088420 MQ: 0000000000000000 DAR: 6b6b6b6b6b6b6ba3 DSISR: c0000000e4f07920 Syscall Result: fffffffffffffff2 [NIP : release_resource+56] [LR : release_resource+48] #5 [c0000000e4f07a00] release_resource at c000000000167258 (unreliable) #6 [c0000000e4f07a30] remove_phb_dynamic at c000000000105648 #7 [c0000000e4f07ab0] dlpar_remove_slot at c0080000031a09e8 [rpadlpar_io] #8 [c0000000e4f07b50] remove_slot_store at c0080000031a0b9c [rpadlpar_io] #9 [c0000000e4f07be0] kobj_attr_store at c000000000817d8c #10 [c0000000e4f07c00] sysfs_kf_write at c00000000063e504 #11 [c0000000e4f07c20] kernfs_fop_write_iter at c00000000063d868 #12 [c0000000e4f07c70] new_sync_write at c00000000054339c #13 [c0000000e4f07d10] vfs_write at c000000000546624 #14 [c0000000e4f07d60] ksys_write at c0000000005469f4 #15 [c0000000e4f07db0] system_call_exception at c000000000030840 #16 [c0000000e4f07e10] system_call_vectored_common at c00000000000c168 To avoid it, we can take a reference to the host_bridge->dev until we're done using phb. Then when we drop the reference the phb will be freed. Fixes: `2dd9c11b9d` ("powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)") Reported-by: David Dai <zdai@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Link: https://lore.kernel.org/r/20220318034219.1188008-1-mpe@ellerman.id.au	2022-03-21 13:17:47 +11:00
Jakub Kicinski	e243f39685	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 13:56:58 -07:00
Nicholas Piggin	35de589cb8	powerpc/time: improve decrementer clockevent processing The stop/shutdown op should not use decrementer_set_next_event because that sets decrementers_next_tb to now + decrementer_max, which means a decrementer interrupt that occurs after that time will call the clockevent event handler unexpectedly. Set next_tb to ~0 here to prevent any clock event call. Init all clockevents to stopped. Then the decrementer clockevent device always has event_handler set and applicable because we know the clock event device was not stopped. So make this call unconditional to show that it is always called. next_tb need not be set to ~0 before the event handler is called because it will stop the clockevent device if there is no other timer. Finally, the timer broadcast interrupt should not modify next_tb because it is not involved with the local decrementer clockevent on this CPU. This doesn't fix a known bug, just tidies the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220124143930.3923442-3-npiggin@gmail.com	2022-03-16 16:55:23 +11:00
Nicholas Piggin	cf74ff52e3	powerpc/time: Fix KVM host re-arming a timer beyond decrementer range If the next host timer is beyond decrementer range, timer_rearm_host_dec will leave decrementer not programmed. This will not cause a problem for the host it will just set the decrementer correctly when the decrementer interrupt hits, it seems safer not to leave the next host decrementer interrupt timing able to be influenced by a guest. This code is only used in the P9 KVM paths so it's unlikely to be hit practically unless large decrementer is force disabled in the host. Fixes: `25aa145856` ("powerpc/time: add API for KVM to re-arm the host timer/decrementer") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220124143930.3923442-2-npiggin@gmail.com	2022-03-16 16:55:23 +11:00
Nicholas Piggin	9d71165d39	powerpc/tm: Fix more userspace r13 corruption Commit `cf13435b73` ("powerpc/tm: Fix userspace r13 corruption") fixes a problem in treclaim where a SLB miss can occur on the thread_struct->ckpt_regs while SCRATCH0 is live with the saved user r13 value, clobbering it with the kernel r13 and ultimately resulting in kernel r13 being stored in ckpt_regs. There is an equivalent problem in trechkpt where the user r13 value is loaded into r13 from chkpt_regs to be recheckpointed, but a SLB miss could occur on ckpt_regs accesses after that, which will result in r13 being clobbered with a kernel value and that will get recheckpointed and then restored to user registers. The same memory page is accessed right before this critical window where a SLB miss could cause corruption, so hitting the bug requires the SLB entry be removed within a small window of instructions, which is possible if a SLB related MCE hits there. PAPR also permits the hypervisor to discard this SLB entry (because slb_shadow->persistent is only set to SLB_NUM_BOLTED) although it's not known whether any implementations would do this (KVM does not). So this is an extremely unlikely bug, only found by inspection. Fix this by also storing user r13 in a temporary location on the kernel stack and don't change the r13 register from kernel r13 until the RI=0 critical section that does not fault. The SCRATCH0 change is not strictly part of the fix, it's only used in the RI=0 section so it does not have the same problem as the previous SCRATCH0 bug. Fixes: `98ae22e15b` ("powerpc: Add helper functions for transactional memory context switching") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220311024733.48926-1-npiggin@gmail.com	2022-03-16 11:59:24 +11:00
Randy Dunlap	d64e3eab75	powerpc/xive: fix return value of __setup handler __setup() handlers should return 1 to obsolete_checksetup() in init/main.c to indicate that the boot option has been handled. A return of 0 causes the boot option/value to be listed as an Unknown kernel parameter and added to init's (limited) argument or environment strings. Also, error return codes don't mean anything to obsolete_checksetup() -- only non-zero (usually 1) or zero. So return 1 from xive_off() and xive_store_eoi_cmdline(). Fixes: `243e25112d` ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Fixes: `c21ee04f11` ("powerpc/xive: Add a kernel parameter for StoreEOI") [lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru] Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>: Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220313065936.4363-1-rdunlap@infradead.org	2022-03-16 11:34:07 +11:00
Peter Zijlstra	cc66bb9145	x86/ibt,kprobes: Cure sym+0 equals fentry woes In order to allow kprobes to skip the ENDBR instructions at sym+0 for X86_KERNEL_IBT builds, change _kprobe_addr() to take an architecture callback to inspect the function at hand and modify the offset if needed. This streamlines the existing interface to cover more cases and require less hooks. Once PowerPC gets fully converted there will only be the one arch hook. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.405947704@infradead.org	2022-03-15 10:32:38 +01:00
Peter Zijlstra	d15cb3dab1	x86/livepatch: Validate __fentry__ location Currently livepatch assumes __fentry__ lives at func+0, which is most likely untrue with IBT on. Instead make it use ftrace_location() by default which both validates and finds the actual ip if there is any in the same symbol. Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.285971256@infradead.org	2022-03-15 10:32:37 +01:00
Chuck Lever	f3e4080edd	arch: Remove references to CONFIG_NFSD_V3 in the default configs CONFIG_NFSD_V3 has been removed. NFSD support for NFSv3 can no longer be disabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2022-03-11 10:25:16 -05:00
Christophe Leroy	3af722cb73	powerpc/net: Implement powerpc specific csum_shift() to remove branch Today's implementation of csum_shift() leads to branching based on parity of 'offset' 000002f8 <csum_block_add>: 2f8: 70 a5 00 01 andi. r5,r5,1 2fc: 41 a2 00 08 beq 304 <csum_block_add+0xc> 300: 54 84 c0 3e rotlwi r4,r4,24 304: 7c 63 20 14 addc r3,r3,r4 308: 7c 63 01 94 addze r3,r3 30c: 4e 80 00 20 blr Use first bit of 'offset' directly as input of the rotation instead of branching. 000002f8 <csum_block_add>: 2f8: 54 a5 1f 38 rlwinm r5,r5,3,28,28 2fc: 20 a5 00 20 subfic r5,r5,32 300: 5c 84 28 3e rotlw r4,r4,r5 304: 7c 63 20 14 addc r3,r3,r4 308: 7c 63 01 94 addze r3,r3 30c: 4e 80 00 20 blr And change to left shift instead of right shift to skip one more instruction. This has no impact on the final sum. 000002f8 <csum_block_add>: 2f8: 54 a5 1f 38 rlwinm r5,r5,3,28,28 2fc: 5c 84 28 3e rotlw r4,r4,r5 300: 7c 63 20 14 addc r3,r3,r4 304: 7c 63 01 94 addze r3,r3 308: 4e 80 00 20 blr Seems like only powerpc benefits from a branchless implementation. Other main architectures like ARM or X86 get better code with the generic implementation and its branch. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-11 10:57:22 +00:00
Jakub Kicinski	1e8a3f0d2a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net net/dsa/dsa2.c commit `afb3cc1a39` ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails") commit `e83d565378` ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master") https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/ drivers/net/ethernet/intel/ice/ice.h commit `97b0129146` ("ice: Fix error with handling of bonding MTU") commit `43113ff734` ("ice: add TTY for GNSS module for E810T device") https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/ drivers/staging/gdm724x/gdm_lte.c commit `fc7f750dc9` ("staging: gdm724x: fix use after free in gdm_lte_rx()") commit `4bcc4249b4` ("staging: Use netif_rx().") https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-10 17:16:56 -08:00
Eric W. Biederman	03248addad	resume_user_mode: Move to resume_user_mode.h Move set_notify_resume and tracehook_notify_resume into resume_user_mode.h. While doing that rename tracehook_notify_resume to resume_user_mode_work. Update all of the places that included tracehook.h for these functions to include resume_user_mode.h instead. Update all of the callers of tracehook_notify_resume to call resume_user_mode_work. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-12-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-03-10 16:51:50 -06:00
Eric W. Biederman	153474ba1a	ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h Rename tracehook_report_syscall_{entry,exit} to ptrace_report_syscall_{entry,exit} and place them in ptrace.h There is no longer any generic tracehook infractructure so make these ptrace specific functions ptrace specific. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-03-10 13:35:08 -06:00
Kajol Jain	4c08d4bbc0	powerpc/papr_scm: Add perf interface support Performance monitoring support for papr-scm nvdimm devices via perf interface is added which includes addition of pmu functions like add/del/read/event_init for nvdimm_pmu struture. A new parameter 'priv' in added to the pdev_archdata structure to save nvdimm_pmu device pointer, to handle the unregistering of pmu device. papr_scm_pmu_register function populates the nvdimm_pmu structure with name, capabilities, cpumask along with event handling functions. Finally the populated nvdimm_pmu structure is passed to register the pmu device. Event handling functions internally uses hcall to get events and counter data. Result in power9 machine with 2 nvdimm device: Ex: List all event by perf list command:# perf list nmem nmem0/cache_rh_cnt/ [Kernel PMU event] nmem0/cache_wh_cnt/ [Kernel PMU event] nmem0/cri_res_util/ [Kernel PMU event] nmem0/ctl_res_cnt/ [Kernel PMU event] nmem0/ctl_res_tm/ [Kernel PMU event] nmem0/fast_w_cnt/ [Kernel PMU event] nmem0/host_l_cnt/ [Kernel PMU event] nmem0/host_l_dur/ [Kernel PMU event] nmem0/host_s_cnt/ [Kernel PMU event] nmem0/host_s_dur/ [Kernel PMU event] nmem0/med_r_cnt/ [Kernel PMU event] nmem0/med_r_dur/ [Kernel PMU event] nmem0/med_w_cnt/ [Kernel PMU event] nmem0/med_w_dur/ [Kernel PMU event] nmem0/mem_life/ [Kernel PMU event] nmem0/poweron_secs/ [Kernel PMU event] ... nmem1/mem_life/ [Kernel PMU event] nmem1/poweron_secs/ [Kernel PMU event] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> [Add numa_map_to_online_node function call to get online node id] Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@in.ibm.com> Link: https://lore.kernel.org/r/20220225143024.47947-4-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-03-09 17:50:50 -08:00
Alexey Kardashevskiy	d799769188	powerpc/64: Add UADDR64 relocation support When ld detects unaligned relocations, it emits R_PPC64_UADDR64 relocations instead of R_PPC64_RELATIVE. Currently R_PPC64_UADDR64 are detected by arch/powerpc/tools/relocs_check.sh and expected not to work. Below is a simple chunk to trigger this behaviour (this disables optimization for the demonstration purposes only, this also happens with -O1/-O2 when CONFIG_PRINTK_INDEX=y, for example): \#pragma GCC push_options \#pragma GCC optimize ("O0") struct entry { const char *file; int line; } __attribute__((packed)); static const struct entry e1 = { .file = __FILE__, .line = __LINE__ }; static const struct entry e2 = { .file = __FILE__, .line = __LINE__ }; ... prom_printf("e1=%s %lx %lx\n", e1.file, (unsigned long) e1.file, mfmsr()); prom_printf("e2=%s %lx\n", e2.file, (unsigned long) e2.file); \#pragma GCC pop_options This adds support for UADDR64 for 64bit. This reuses __dynamic_symtab from the 32bit code which supports more relocation types already. Because RELACOUNT includes only R_PPC64_RELATIVE, this replaces it with RELASZ which is the size of all relocation records. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220309061822.168173-1-aik@ozlabs.ru	2022-03-09 21:47:53 +11:00
Hangyu Hua	3fd46e551f	powerpc: 8xx: fix a return value error in mpc8xx_pic_init mpc8xx_pic_init() should return -ENOMEM instead of 0 when irq_domain_add_linear() return NULL. This cause mpc8xx_pics_init to continue executing even if mpc8xx_pic_host is NULL. Fixes: `cc76404fea` ("powerpc/8xx: Fix possible device node reference leak") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220223070223.26845-1-hbh25y@gmail.com	2022-03-09 21:46:55 +11:00
jing yangyang	9f5196065e	powerpc/ps3: remove unneeded semicolons Eliminate the following coccicheck warnings: ./arch/powerpc/platforms/ps3/system-bus.c:606:2-3: Unneeded semicolon ./arch/powerpc/platforms/ps3/system-bus.c:765:2-3: Unneeded semicolon Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: jing yangyang <jing.yangyang@zte.com.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/01647102607ce9640c9f27786d976109a3c4ea7e.1629197153.git.jing.yangyang@zte.com.cn	2022-03-09 14:26:35 +11:00
Paolo Bonzini	37b2a6510a	KVM: use __vcalloc for very large allocations Allocations whose size is related to the memslot size can be arbitrarily large. Do not use kvzalloc/kvcalloc, as those are limited to "not crazy" sizes that fit in 32 bits. Cc: stable@vger.kernel.org Fixes: `7661809d49` ("mm: don't allow oversized kvmalloc() calls") Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-08 09:30:57 -05:00
Christophe Leroy	7929939193	powerpc/64: Force inlining of prevent_user_access() and set_kuap() A ppc64_defconfig build exhibits about 10 copied of prevent_user_access(). It also have one copy of set_kuap(). c000000000017340 <.prevent_user_access.constprop.0>: c00000000001a038: 4b ff d3 09 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001aabc: 4b ff c8 85 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001ab38: 4b ff c8 09 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001ade0: 4b ff c5 61 bl c000000000017340 <.prevent_user_access.constprop.0> c000000000039b90 <.prevent_user_access.constprop.0>: c00000000003ac08: 4b ff ef 89 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003b9d0: 4b ff e1 c1 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003ba54: 4b ff e1 3d bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003bbfc: 4b ff df 95 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000015dde0 <.prevent_user_access.constprop.0>: c0000000001612c0: 4b ff cb 21 bl c00000000015dde0 <.prevent_user_access.constprop.0> c000000000161b54: 4b ff c2 8d bl c00000000015dde0 <.prevent_user_access.constprop.0> c000000000188cf0 <.prevent_user_access.constprop.0>: c00000000018d658: 4b ff b6 99 bl c000000000188cf0 <.prevent_user_access.constprop.0> c00000000030fe20 <.prevent_user_access.constprop.0>: c0000000003123d4: 4b ff da 4d bl c00000000030fe20 <.prevent_user_access.constprop.0> c000000000313970: 4b ff c4 b1 bl c00000000030fe20 <.prevent_user_access.constprop.0> c0000000005e6bd0 <.prevent_user_access.constprop.0>: c0000000005e7d8c: 4b ff ee 45 bl c0000000005e6bd0 <.prevent_user_access.constprop.0> c0000000007bcae0 <.prevent_user_access.constprop.0>: c0000000007bda10: 4b ff f0 d1 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007bda54: 4b ff f0 8d bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007bdd28: 4b ff ed b9 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007c0390: 4b ff c7 51 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c00000000094e4f0 <.prevent_user_access.constprop.0>: c000000000950e40: 4b ff d6 b1 bl c00000000094e4f0 <.prevent_user_access.constprop.0> c00000000097d2d0 <.prevent_user_access.constprop.0>: c0000000009813fc: 4b ff be d5 bl c00000000097d2d0 <.prevent_user_access.constprop.0> c000000000acd540 <.prevent_user_access.constprop.0>: c000000000ad1d60: 4b ff b7 e1 bl c000000000acd540 <.prevent_user_access.constprop.0> c000000000e5d680 <.prevent_user_access.constprop.0>: c000000000e64b60: 4b ff 8b 21 bl c000000000e5d680 <.prevent_user_access.constprop.0> c000000000e64b6c: 4b ff 8b 15 bl c000000000e5d680 <.prevent_user_access.constprop.0> c000000000e64c38: 4b ff 8a 49 bl c000000000e5d680 <.prevent_user_access.constprop.0> When building signal_64.c with -Winline the following messages appear: ./arch/powerpc/include/asm/book3s/64/kup.h:331:20: error: inlining failed in call to 'set_kuap': call is unlikely and code size would grow [-Werror=inline] ./arch/powerpc/include/asm/book3s/64/kup.h:401:20: error: inlining failed in call to 'prevent_user_access.constprop': call is unlikely and code size would grow [-Werror=inline] Those functions are used on hot pathes and have been expected to be inlined at all time. Force them inline. This patch reduces the kernel text size by 700 bytes, confirming that not inlining those functions is not worth it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eff9b2b211957fa2e8707e46f31674097fd563a3.1644588972.git.christophe.leroy@csgroup.eu	2022-03-08 22:33:07 +11:00
Christophe Leroy	0b0057cc41	powerpc/bitops: Force inlining of fls() Building a kernel with CONFIG_CC_OPTIMISE_FOR_SIZE leads to the following functions being copied several times in vmlinux: 31 times __ilog2_u32() 34 times fls() Disassembly follows: c00f476c <fls>: c00f476c: 7c 63 00 34 cntlzw r3,r3 c00f4770: 20 63 00 20 subfic r3,r3,32 c00f4774: 4e 80 00 20 blr c00f4778 <__ilog2_u32>: c00f4778: 94 21 ff f0 stwu r1,-16(r1) c00f477c: 7c 08 02 a6 mflr r0 c00f4780: 90 01 00 14 stw r0,20(r1) c00f4784: 4b ff ff e9 bl c00f476c <fls> c00f4788: 80 01 00 14 lwz r0,20(r1) c00f478c: 38 63 ff ff addi r3,r3,-1 c00f4790: 7c 08 03 a6 mtlr r0 c00f4794: 38 21 00 10 addi r1,r1,16 c00f4798: 4e 80 00 20 blr When forcing inlining of fls(), we get c0008b80 <__ilog2_u32>: c0008b80: 7c 63 00 34 cntlzw r3,r3 c0008b84: 20 63 00 1f subfic r3,r3,31 c0008b88: 4e 80 00 20 blr vmlinux size gets reduced by 1 kbyte with that change. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/adc9c9d6378f6b5008246ca717993d7870188efb.1644569473.git.christophe.leroy@csgroup.eu	2022-03-08 22:33:03 +11:00
Rohan McLure	6b3a3e12f8	powerpc: declare unmodified attribute_group usages const Inspired by (bd75b4ef4977: Constify static attribute_group structs), accepted by linux-next, reported: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220210202805.7750-4-rikard.falkeborn@gmail.com/ Nearly all singletons of type struct attribute_group are never modified, and so are candidates for being const. Declare them as const. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220307231414.86560-1-rmclure@linux.ibm.com	2022-03-08 22:15:32 +11:00
YueHaibing	5986f6b657	powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n arch/powerpc/platforms/cell/spufs/sched.c:1055:12: warning: ‘show_spu_loadavg’ defined but not used [-Wunused-function] static int show_spu_loadavg(struct seq_file s, void private) ^~~~~~~~~~~~~~~~ Move it into #ifdef block to fix this, also remove unneeded semicolon. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308100928.23540-1-yuehaibing@huawei.com	2022-03-08 22:07:41 +11:00
Hangyu Hua	d601fd24e6	powerpc/secvar: fix refcount leak in format_show() Refcount leak will happen when format_show returns failure in multiple cases. Unified management of of_node_put can fix this problem. Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220302021959.10959-1-hbh25y@gmail.com	2022-03-08 22:07:41 +11:00
Michael Ellerman	1a76e520ee	powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E Since the IBM A2 CPU support was removed, see commit `fb5a515704` ("powerpc: Remove platforms/wsp and associated pieces"), the only 64-bit Book3E CPUs we support are Freescale (NXP) ones. However our Kconfig still allows configurating a kernel that has 64-bit Book3E support, but no Freescale CPU support enabled. Such a kernel would never boot, it doesn't know about any CPUs. It also causes build errors, as reported by lkp, because PPC_BARRIER_NOSPEC is not enabled in such a configuration: powerpc64-linux-ld: arch/powerpc/net/bpf_jit_comp64.o:(.toc+0x0): undefined reference to `powerpc_security_features' To fix this, force PPC_FSL_BOOK3E to be selected whenever we are building a 64-bit Book3E kernel. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220304061222.2478720-1-mpe@ellerman.id.au	2022-03-08 22:07:41 +11:00
Christophe Leroy	76222808fc	powerpc: Move C prototypes out of asm-prototypes.h We originally added asm-prototypes.h in commit `42f5b4cacd` ("powerpc: Introduce asm-prototypes.h"). It's purpose was for prototypes of C functions that are only called from asm, in order to fix sparse warnings about missing prototypes. A few months later Nick added a different use case in commit `4efca4ed05` ("kbuild: modversions for EXPORT_SYMBOL() for asm") for C prototypes for exported asm functions. This is basically the inverse of our original usage. Since then we've added various prototypes to asm-prototypes.h for both reasons, meaning we now need to unstitch it all. Dispatch prototypes of C functions into relevant headers and keep only the prototypes for functions defined in assembly. For the time being, leave prom_init() there because moving it into asm/prom.h or asm/setup.h conflicts with drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o This will be fixed later by untaggling asm/pci.h and asm/prom.h or by renaming the function in shadowrom.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu	2022-03-08 22:06:25 +11:00
Nicholas Piggin	f771b55731	KVM: PPC: Use KVM_CAP_PPC_AIL_MODE_3 Use KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2]. [1] https://lists.nongnu.org/archive/html/qemu-ppc/2022-02/msg00437.html [2] https://lists.nongnu.org/archive/html/qemu-ppc/2022-02/msg00439.html Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> [mpe: Rebase onto `93b71801a8` from kvm-ppc-cap-210 branch, add EXPORT_SYMBOL] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220222064727.2314380-4-npiggin@gmail.com	2022-03-08 13:14:01 +11:00
Nicholas Piggin	839d893b40	KVM: PPC: Book3S PR: Disallow AIL != 0 KVM PR does not implement address translation modes on interrupt, so it must not allow H_SET_MODE to succeed. The behaviour change caused by this mode is architected and not advisory (interrupts must behave differently). QEMU does not deal with differences in AIL support in the host. The solution to that is a spapr capability and corresponding KVM CAP, but this patch does not break things more than before (the host behaviour already differs, this change just disallows some modes that are not implemented properly). By happy coincidence, this allows PR Linux guests that are using the SCV facility to boot and run, because Linux disables the use of SCV if AIL can not be set to 3. This does not fix the underlying problem of missing SCV support (an OS could implement real-mode SCV vectors and try to enable the facility). The true fix for that is for KVM PR to emulate scv interrupts from the facility unavailable interrupt. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Link: https://lore.kernel.org/r/20220222064727.2314380-3-npiggin@gmail.com	2022-03-08 13:14:00 +11:00
Nicholas Piggin	b5149e2292	KVM: PPC: Book3S PR: Disable SCV when AIL could be disabled PR KVM does not support running with AIL enabled, and SCV does is not supported with AIL disabled. Fix this by ensuring the SCV facility is disabled with FSCR while a CPU could be running with AIL=0. The PowerNV host supports disabling AIL on a per-CPU basis, so SCV just needs to be disabled when a vCPU is being run. The pSeries machine can only switch AIL on a system-wide basis, so it must disable SCV support at boot if the configuration can potentially run a PR KVM guest. Also ensure a the FSCR[SCV] bit can not be enabled when emulating mtFSCR for the guest. SCV is not emulated for the PR guest at the moment, this just fixes the host crashes. Alternatives considered and rejected: - SCV support can not be disabled by PR KVM after boot, because it is advertised to userspace with HWCAP. - AIL can not be disabled on a per-CPU basis. At least when running on pseries it is a per-LPAR setting. - Support for real-mode SCV vectors will not be added because they are at 0x17000 so making such a large fixed head space causes immediate value limits to be exceeded, requiring a lot rework and more code. - Disabling SCV for any PR KVM possible kernel will cause a slowdown when not using PR KVM. - A boot time option to disable SCV to use PR KVM is user-hostile. - System call instruction emulation for SCV facility unavailable instructions is too complex and old emulation code was subtly broken and removed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Link: https://lore.kernel.org/r/20220222064727.2314380-2-npiggin@gmail.com	2022-03-08 13:13:58 +11:00
Christophe Leroy	a4abd55a24	powerpc/kexec: Declare kexec_paca static kexec_paca is exclusively used in kexec/core_64.c Declare it static. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/094983ee851644165b7700c73cac63cfe20596cd.1646413435.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:01 +11:00
Christophe Leroy	e15c703be4	powerpc/smp: Declare current_set static current_set extern not needed anymore since commit `eafd825ed7` ("powerpc/64: Simplify __secondary_start paca->kstack handling") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a55eb65c9d7319f0af3c31e3f6ba36522f10003d.1646413435.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:01 +11:00
Christophe Leroy	e86debbbb5	powerpc: Cleanup asm-prototypes.c Last call to sys_swapcontext() from ASM was removed by commit `fbcee2ebe8` ("powerpc/32: Always save non volatile GPRs at syscall entry") sys_debug_setcontext() prototype not needed anymore since commit `f3675644e1` ("powerpc/syscalls: signal_{32, 64} - switch to SYSCALL_DEFINE") sys_switch_endian() prototype not needed anymore since commit `81dac81778` ("powerpc/64: Make sys_switch_endian() traceable") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Keep _mcount() prototype to avoid modpost errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3ed660a585df2080ea8412ec20fbf652f5bf013a.1646413435.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:01 +11:00
Christophe Leroy	2ca48dbb21	powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S Instead of open coding offset value 24, use STK_GOT when accessing got register in stack. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9042bb30fa972056715fe5b6598a7c8049681293.1645099283.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:01 +11:00
Christophe Leroy	a5f04d1f27	powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S CONFIG_MPROFILE_KERNEL is only for PPC64 and ftrace_mprofile.o is build on PPC64 only when CONFIG_MPROFILE_KERNEL is defined. Move saving of r0 inside #ifdef PPC64 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/619dfb672bf4f1b777a4b3f8b4f14e637fea2716.1645099283.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:01 +11:00
Christophe Leroy	228216716c	powerpc/ftrace: Refactor ftrace_{regs_}caller ftrace_caller() and frace_regs_caller() have now a lot in common. Refactor them using GAS macros. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9d7df9e4fc98a86051489f61d3c9bc67f92f7e27.1645099283.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:00 +11:00
Christophe Leroy	9bdb2eec3d	powerpc/ftrace: Don't use lmw/stmw in ftrace_regs_caller() For the same reason as commit `a85c728cb5` ("powerpc/32: Don't use lmw/stmw for saving/restoring non volatile regs"), don't use lmw/stmw in ftrace_regs_caller(). Use the same macros for PPC32 and PPC64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ec286d2cc6989668a96f14543275437d2f3f0e3a.1645099283.git.christophe.leroy@csgroup.eu	2022-03-08 00:05:00 +11:00
Pratik R. Sampat	3c14b73454	powerpc/pseries: Interface to represent PAPR firmware attributes Adds a syscall interface to represent the energy and frequency related PAPR attributes on the system using the new H_CALL "H_GET_ENERGY_SCALE_INFO". H_GET_EM_PARMS H_CALL was previously responsible for exporting this information in the lparcfg, however the H_GET_EM_PARMS H_CALL will be deprecated P10 onwards. The H_GET_ENERGY_SCALE_INFO H_CALL is of the following call format: hcall( uint64 H_GET_ENERGY_SCALE_INFO, // Get energy scale info uint64 flags, // Per the flag request uint64 firstAttributeId,// The attribute id uint64 bufferAddress, // Guest physical address of the output buffer uint64 bufferSize // The size in bytes of the output buffer ); As specified in PAPR+ v2.11, section 14.14.3. This H_CALL can query either all the attributes at once with firstAttributeId = 0, flags = 0 as well as query only one attribute at a time with firstAttributeId = id, flags = 1. The output buffer consists of the following 1. number of attributes - 8 bytes 2. array offset to the data location - 8 bytes 3. version info - 1 byte 4. A data array of size num attributes, which contains the following: a. attribute ID - 8 bytes b. attribute value in number - 8 bytes c. attribute name in string - 64 bytes d. attribute value in string - 64 bytes The new H_CALL exports information in direct string value format, hence a new interface has been introduced in /sys/firmware/papr/energy_scale_info to export this information to userspace so that the firmware can add new values without the need for the kernel to be changed. The H_CALL returns the name, numeric value and string value (if exists) The format of exposing the sysfs information is as follows: /sys/firmware/papr/energy_scale_info/ \|-- <id>/ \|-- desc \|-- value \|-- value_desc (if exists) \|-- <id>/ \|-- desc \|-- value \|-- value_desc (if exists) ... The energy information that is exported is useful for userspace tools such as powerpc-utils. Currently these tools infer the "power_mode_data" value in the lparcfg, which in turn is obtained from the to be deprecated H_GET_EM_PARMS H_CALL. On future platforms, such userspace utilities will have to look at the data returned from the new H_CALL being populated in this new sysfs interface and report this information directly without the need of interpretation. Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220217105321.52941-2-psampat@linux.ibm.com	2022-03-08 00:05:00 +11:00
Ganesh Goudar	cc15ff3275	powerpc/mce: Avoid using irq_work_queue() in realmode In realmode mce handler we use irq_work_queue() to defer the processing of mce events, irq_work_queue() can only be called when translation is enabled because it touches memory outside RMA, hence we enable translation before calling irq_work_queue and disable on return, though it is not safe to do in realmode. To avoid this, program the decrementer and call the event processing functions from timer handler. Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220120121931.517974-1-ganeshgr@linux.ibm.com	2022-03-08 00:05:00 +11:00
Ganesh Goudar	0a182611d1	powerpc/mce: Modify the real address error logging messages To avoid ambiguity, modify the strings in real address error logging messages to "foreign/control memory" from "foreign", Since the error discriptions in P9 user manual and P10 user manual are different for same type of errors. P9 User Manual for MCE: DSISR:59 Host real address to foreign space during translation. DSISR:60 Host real address to foreign space on a load or store access. P10 User Manual for MCE: DSISR:59 D-side tablewalk used a host real address in the control memory address range. DSISR:60 D-side operand access to control memory address space. Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220107141428.67862-3-ganeshgr@linux.ibm.com	2022-03-08 00:05:00 +11:00
Ganesh Goudar	0f54bddefe	powerpc/pseries: Parse control memory access error Add support to parse and log control memory access error for pseries. These changes are made according to PAPR v2.11 10.3.2.2.12. Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220107141428.67862-1-ganeshgr@linux.ibm.com	2022-03-08 00:04:59 +11:00
Naveen N. Rao	49c3af43e6	powerpc/bpf: Simplify bpf_to_ppc() and adopt it for powerpc64 Convert bpf_to_ppc() to a macro to help simplify its usage since codegen_context is available in all places it is used. Adopt it also for powerpc64 for uniformity and get rid of the global b2p structure. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/09f0540ce3e0cd4120b5b33993b5e73b6ef9e979.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:59 +11:00
Jordan Niethe	3a3fc9bf10	powerpc64/bpf: Store temp registers' bpf to ppc mapping In bpf_jit_build_body(), the mapping of TMP_REG_1 and TMP_REG_2's bpf register to ppc register is evalulated at every use despite not changing. Instead, determine the ppc register once and store the result. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> [Rebased, converted additional usage sites] Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0944e2f0fa6dd254ea401f1c946fb6c9a5294278.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:59 +11:00
Naveen N. Rao	036d559c0b	powerpc/bpf: Use _Rn macros for GPRs Use _Rn macros to specify register names to make their usage clear. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7df626b8cdc6141d4295ac16137c82ad570b6637.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:59 +11:00
Naveen N. Rao	576a6c3a00	powerpc/bpf: Move bpf_jit64.h into bpf_jit_comp64.c There is no need for a separate header anymore. Move the contents of bpf_jit64.h into bpf_jit_comp64.c Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b873a8e6eff7d91bf2a2cabdd53082aadfe20761.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:59 +11:00
Naveen N. Rao	7b187dcdb5	powerpc/bpf: Cleanup bpf_jit.h - PPC_EX32() is only used by ppc32 JIT. Move it to bpf_jit_comp32.c - PPC_LI64() is only valid in ppc64. #ifdef it - PPC_FUNC_ADDR() is not used anymore. Remove it. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/58f5b66b2f8546bbbee620f62103a8e97a63eb7c.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:59 +11:00
Naveen N. Rao	794abc08d7	powerpc64/bpf: Get rid of PPC_BPF_[LL\|STL\|STLU] macros All these macros now have a single user. Expand their usage in place. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e0526fc7633a34f983a7a330712b55bdfaf20482.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	391c271f4d	powerpc64/bpf: Convert some of the uses of PPC_BPF_[LL\|STL] to PPC_BPF_[LD\|STD] PPC_BPF_[LL\|STL] are macros meant for scenarios where we may have to deal with a non-word aligned offset. Limit their usage to only those scenarios by converting the rest to just use PPC_BPF_[LD\|STD]. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0eb472428165a307f6fdaf22b0c33cbf13a9a635.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	74bbe3f084	powerpc/bpf: Rename PPC_BL_ABS() to PPC_BL() PPC_BL_ABS() is just doing a relative branch with link. The name suggests that it is for branching to an absolute address, which is incorrect. Rename the macro to a more appropriate PPC_BL(). Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f0e57b6c7a6ee40dba645535b70da46f46e8af5e.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	feb6307289	powerpc64/bpf: Optimize instruction sequence used for function calls When calling BPF helpers, we load the function address to call into a register. This can result in upto 5 instructions. Optimize this by instead using the kernel toc in r2 and adjusting offset to the BPF helper. This works since all BPF helpers are part of kernel text, and all BPF programs/functions utilize the kernel TOC. Further more: - load the actual function entry address in elf v1, rather than loading it through the function descriptor address. - load the Local Entry Point (LEP) in elf v2 skipping TOC setup. - consolidate code across elf abi v1 and v2 by using r12 on both. Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1233c7544e60dcb021c52b1f840b0f21a87b33ed.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	43d636f8b4	powerpc64/bpf elfv1: Do not load TOC before calling functions BPF helpers always reside in core kernel and all BPF programs use the kernel TOC. As such, there is no need to load the TOC before calling helpers or other BPF functions. Drop code to do the same. Add a check to ensure we don't proceed if this assumption ever changes in future. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a3cd3da4d24d95d845cd10382b1af083600c9074.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	b10cb163c4	powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry In preparation for using kernel TOC, load the same in r2 on entry. With elfv1, the kernel TOC is already setup by our caller. We adjust the number of instructions to skip on a tail call accordingly. We get rid of the #ifdef in bpf_jit_emit_tail_call() since FUNCTION_DESCR_SIZE is itself under a #ifdef. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/18a05a4ceec14a8617c9dd4b7128d0afa83fd14e.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:58 +11:00
Naveen N. Rao	4eeac2b0aa	powerpc64: Set PPC64_ELF_ABI_v[1\|2] macros to 1 Set macros to 1 so that they can be used with __is_defined(). Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/abad4868416ddfd42893f99c0cad8e5faf998095.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Naveen N. Rao	1d4866d565	powerpc64/bpf: Use r12 for constant blinding In preparation for preserving kernel toc in r2, switch BPF_REG_AX from r2 to r12. r12 is not used by bpf JIT except during external helper/bpf calls, or with BPF_NOSPEC. These sequences aren't emitted when BPF_REG_AX is used for constant blinding and other purposes. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e109f98617eacb4512c17a48525e94eda42889e6.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Naveen N. Rao	c2067f7f88	powerpc64/bpf: Do not save/restore LR on each call to bpf_stf_barrier() Instead of saving and restoring LR before each invocation to bpf_stf_barrier(), set SEEN_FUNC flag so that we save/restore LR in prologue/epilogue. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4446f25478d82a2a4ac9dab2ebdfd88ddf923eb7.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Naveen N. Rao	0ffdbce6f4	powerpc/bpf: Handle large branch ranges with BPF_EXIT In some scenarios, it is possible that the program epilogue is outside the branch range for a BPF_EXIT instruction. Instead of rejecting such programs, emit epilogue as an alternate exit point from the program. Track the location of the same so that subsequent exits can take either of the two paths. Reported-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/33aa2e92645a92712be23b18035a2c6dcb92ff8d.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Naveen N. Rao	bafb5898de	powerpc/bpf: Emit a single branch instruction for known short branch ranges PPC_BCC() emits two instructions to accommodate scenarios where we need to branch outside the range of a conditional branch. PPC_BCC_SHORT() emits a single branch instruction and can be used when the branch is known to be within a conditional branch range. Convert some of the uses of PPC_BCC() in the powerpc BPF JIT over to PPC_BCC_SHORT() where we know the branch range. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/edbca01377d1d5f472868bf6d8962b0a0d85b96f.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Naveen N. Rao	acd7408d27	powerpc/bpf: Skip branch range validation during first pass During the first pass, addrs[] is still being populated. So, all branches to following instructions will appear to be going to the start of the JIT program. Ignore branch range validation for such instructions and assume those to be in range. Branch range validation will happen during the second pass after addrs[] is setup properly. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bc517413d11636e20dbfc88503dad14bcbe391e2.1644834730.git.naveen.n.rao@linux.vnet.ibm.com	2022-03-08 00:04:57 +11:00
Michael Ellerman	591b4b2684	powerpc/code-patching: Pre-map patch area Paul reported a warning with DEBUG_ATOMIC_SLEEP=y: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:256 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0 preempt_count: 0, expected: 0 ... Call Trace: dump_stack_lvl+0xa0/0xec (unreliable) __might_resched+0x2f4/0x310 kmem_cache_alloc+0x220/0x4b0 __pud_alloc+0x74/0x1d0 hash__map_kernel_page+0x2cc/0x390 do_patch_instruction+0x134/0x4a0 arch_jump_label_transform+0x64/0x78 __jump_label_update+0x148/0x180 static_key_enable_cpuslocked+0xd0/0x120 static_key_enable+0x30/0x50 check_kvm_guest+0x60/0x88 pSeries_smp_probe+0x54/0xb0 smp_prepare_cpus+0x3e0/0x430 kernel_init_freeable+0x20c/0x43c kernel_init+0x30/0x1a0 ret_from_kernel_thread+0x5c/0x64 Peter pointed out that this is because do_patch_instruction() has disabled interrupts, but then map_patch_area() calls map_kernel_page() then hash__map_kernel_page() which does a sleeping memory allocation. We only see the warning in KVM guests with SMT enabled, which is not particularly common, or on other platforms if CONFIG_KPROBES is disabled, also not common. The reason we don't see it in most configurations is that another path that happens to have interrupts enabled has allocated the required page tables for us, eg. there's a path in kprobes init that does that. That's just pure luck though. As Christophe suggested, the simplest solution is to do a dummy map/unmap when we initialise the patching, so that any required page table levels are pre-allocated before the first call to do_patch_instruction(). This works because the unmap doesn't free any page tables that were allocated by the map, it just clears the PTE, leaving the page table levels there for the next map. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Debugged-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220223015821.473097-1-mpe@ellerman.id.au	2022-03-08 00:04:57 +11:00
Michael Ellerman	d4679ac8ea	powerpc/64s: Don't use DSISR for SLB faults Since commit `46ddcb3950` ("powerpc/mm: Show if a bad page fault on data is read or write.") we use page_fault_is_write(regs->dsisr) in __bad_page_fault() to determine if the fault is for a read or write, and change the message printed accordingly. But SLB faults, aka Data Segment Interrupts, don't set DSISR (Data Storage Interrupt Status Register) to a useful value. All ISA versions from v2.03 through v3.1 specify that the Data Segment Interrupt sets DSISR "to an undefined value". As far as I can see there's no mention of SLB faults setting DSISR in any BookIV content either. This manifests as accesses that should be a read being incorrectly reported as writes, for example, using the xmon "dump" command: 0:mon> d 0x5deadbeef0000000 5deadbeef0000000 [359526.415354][ C6] BUG: Unable to handle kernel data access on write at 0x5deadbeef0000000 [359526.415611][ C6] Faulting instruction address: 0xc00000000010a300 cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf400] pc: c00000000010a300: mread+0x90/0x190 If we disassemble the PC, we see a load instruction: 0:mon> di c00000000010a300 c00000000010a300 89490000 lbz r10,0(r9) We can also see in exceptions-64s.S that the data_access_slb block doesn't set IDSISR=1, which means it doesn't load DSISR into pt_regs. So the value we're using to determine if the fault is a read/write is some stale value in pt_regs from a previous page fault. Rework the printing logic to separate the SLB fault case out, and only print read/write in the cases where we can determine it. The result looks like eg: 0:mon> d 0x5deadbeef0000000 5deadbeef0000000 [ 721.779525][ C6] BUG: Unable to handle kernel data access at 0x5deadbeef0000000 [ 721.779697][ C6] Faulting instruction address: 0xc00000000014cbe0 cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf390] 0:mon> d 0 0000000000000000 [ 742.793242][ C6] BUG: Kernel NULL pointer dereference at 0x00000000 [ 742.793316][ C6] Faulting instruction address: 0xc00000000014cbe0 cpu 0x6: Vector: 380 (Data SLB Access) at [c00000000ffbf390] Fixes: `46ddcb3950` ("powerpc/mm: Show if a bad page fault on data is read or write.") Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/20220222113449.319193-1-mpe@ellerman.id.au	2022-03-08 00:04:56 +11:00
Jakob Koschel	fa1321b11b	powerpc/sysdev: fix incorrect use to determine if list is empty 'gtm' will always be set by list_for_each_entry(). It is incorrect to assume that the iterator value will be NULL if the list is empty. Instead of checking the pointer it should be checked if the list is empty. Fixes: `83ff9dcf37` ("powerpc/sysdev: implement FSL GTM support") Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220228142434.576226-1-jakobkoschel@gmail.com	2022-03-08 00:04:56 +11:00
Haren Myneni	37e6764895	powerpc/pseries/vas: Add VAS migration handler Since the VAS windows belong to the VAS hardware resource, the hypervisor expects the partition to close them on source partition and reopen them after the partition migrated on the destination machine. This handler is called before pseries_suspend() to close these windows and again invoked after migration. All active windows for both default and QoS types will be closed and mark them inactive and reopened after migration with this handler. During the migration, the user space receives paste instruction failure if it issues copy/paste on these inactive windows. The current migration implementation does not freeze the user space and applications can continue to open VAS windows while migration is in progress. So when the migration_in_progress flag is set, VAS open window API returns -EBUSY. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05e45ff4f8babd2490ccb7ae923884f4aa21a7e5.camel@linux.ibm.com	2022-03-08 00:04:56 +11:00
Haren Myneni	716d7a2e37	powerpc/pseries/vas: Modify reconfig open/close functions for migration VAS is a hardware engine stays on the chip. So when the partition migrates, all VAS windows on the source system have to be closed and reopen them on the destination after migration. The kernel has to consider both DLPAR CPU and migration events to take action on VAS windows. So using VAS_WIN_NO_CRED_CLOSE and VAS_WIN_MIGRATE_CLOSE status bits and windows will be reopened after migration only after both status bits are cleared. This patch make changes to the current reconfig_open/close_windows functions to support migration: - Set VAS_WIN_MIGRATE_CLOSE to the window status when closes and reopen windows with the same status during resume. - Continue to close all windows even if deallocate HCALL failed (should not happen) since no way to stop migration with the current LPM implementation. - If the DLPAR CPU event happens while migration is in progress, set VAS_WIN_NO_CRED_CLOSE to the window status. Close window happens with the first event (migration or DLPAR) and Reopen window happens only with the last event (migration or DLPAR). Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0aad580387cb58379496b4cbbd7c5596e9ea70be.camel@linux.ibm.com	2022-03-08 00:04:56 +11:00
Haren Myneni	278fe1cc22	powerpc/pseries/vas: Define global hv_cop_caps struct The coprocessor capabilities struct is used to get default and QoS capabilities from the hypervisor during init, DLPAR event and migration. So instead of allocating this struct for each event, define global struct and reuse it which allows the migration code to avoid adding an error path. Also disable copy/paste feature flag if any capabilities HCALL is failed. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/57da6a270fcb9308cd57be7c88037029343080f7.camel@linux.ibm.com	2022-03-08 00:04:56 +11:00
Haren Myneni	45f06eac30	powerpc/pseries/vas: Add 'update_total_credits' entry for QoS capabilities pseries supports two types of credits - Default (uses normal priority FIFO) and Qality of service (QoS uses high priority FIFO). The user decides the number of QoS credits and sets this value with HMC interface. The total credits for QoS capabilities can be changed dynamically with HMC interface which invokes drmgr to communicate to the kernel. This patch creats 'update_total_credits' entry for QoS capabilities so that drmgr command can write the new target QoS credits in sysfs. Instead of using this value, the kernel gets the new QoS capabilities from the hypervisor whenever update_total_credits is updated to make sure sync with the QoS target credits in the hypervisor. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b01ef31a0f964686d00243e7de7f09c73c07e69e.camel@linux.ibm.com	2022-03-08 00:04:56 +11:00
Haren Myneni	b903737bc5	powerpc/pseries/vas: sysfs interface to export capabilities The hypervisor provides the available VAS GZIP capabilities such as default or QoS window type and the target available credits in each type. This patch creates sysfs entries and exports the target, used and the available credits for each feature. This interface can be used by the user space to determine the credits usage or to set the target credits in the case of QoS type (for DLPAR). /sys/devices/vas/vas0/gzip/default_capabilities (default GZIP capabilities) nr_total_credits /* Total credits available. Can be /* changed with DLPAR operation / nr_used_credits / Used credits */ /sys/devices/vas/vas0/gzip/qos_capabilities (QoS GZIP capabilities) nr_total_credits nr_used_credits Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/702d8b626ebfac2b52f4995eebeafe1c9a6fcb75.camel@linux.ibm.com	2022-03-08 00:04:56 +11:00
Haren Myneni	c656cfe571	powerpc/pseries/vas: Reopen windows with DLPAR core add VAS windows can be closed in the hypervisor due to lost credits when the core is removed and the kernel gets fault for NX requests on these inactive windows. If the NX requests are issued on these inactive windows, OS gets page faults and the paste failure will be returned to the user space. If the lost credits are available later with core add, reopen these windows and set them active. Later when the OS sees page faults on these active windows, it creates mapping on the new paste address. Then the user space can continue to use these windows and send HW compression requests to NX successfully. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d9f360e21355e6826142c81146acfa9b60bc7ecc.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	8ef7b9e176	powerpc/pseries/vas: Close windows with DLPAR core removal The hypervisor assigns vas credits (windows) for each LPAR based on the number of cores configured in that system. The OS is expected to release credits when cores are removed, and may allocate more when cores are added. So there is a possibility of using excessive credits (windows) in the LPAR and the hypervisor expects the system to close the excessive windows so that NX load can be equally distributed across all LPARs in the system. When the OS closes the excessive windows in the hypervisor, it sets the window status inactive and invalidates window virtual address mapping. The user space receives paste instruction failure if any NX requests are issued on the inactive window. Then the user space can use with the available open windows or retry NX requests until this window active again. This patch also adds the notifier for core removal/add to close windows in the hypervisor if the system lost credits (core removal) and reopen windows in the hypervisor when the previously lost credits are available. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/108928f9c00a48cc6a722315d482d07cf66acf5a.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	6a8d4ca891	powerpc/vas: Map paste address only if window is active The paste address mapping is done with mmap() after the window is opened with ioctl. The partition has to close VAS windows in the hypervisor if it lost credits due to DLPAR core removal. But the kernel marks these windows inactive until the previously lost credits are available later. If the window is inactive due to DLPAR after this mmap(), the paste instruction returns failure until the the OS reopens this window again. Before the user space issuing mmap(), there is a possibility of happening DLPAR core removal event which causes the corresponding window inactive. So if the window is not active, return mmap() failure with -EACCES and expects the user space reissue mmap() when the window is active or open a new window when the credit is available. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bbb203c26b324534e25658cb1dbbcb5160a2f93a.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	b5c63d90cc	powerpc/vas: Return paste instruction failure if no active window The VAS window may not be active if the system looses credits and the NX generates page fault when it receives request on unmap paste address. The kernel handles the fault by remap new paste address if the window is active again, Otherwise return the paste instruction failure if the executed instruction that caused the fault was a paste. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/492b9aefd593061d51dda67ee4d2fc449c000dce.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	1fe3a33ba0	powerpc/vas: Add paste address mmap fault handler The user space opens VAS windows and issues NX requests by pasting CRB on the corresponding paste address mmap. When the system lost credits due to core removal, the kernel has to close the window in the hypervisor and make the window inactive by unmapping this paste address. Also the OS has to handle NX request page faults if the user space issue NX requests. This handler maps the new paste address with the same VMA when the window is active again (due to core add with DLPAR). Otherwise returns paste failure. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3956e1c1fdfde69127055ff1c0256c7d71104030.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	976410cd2c	powerpc/pseries/vas: Save PID in pseries_vas_window struct The kernel sets the VAS window with PID when it is opened in the hypervisor. During DLPAR operation, windows can be closed and reopened in the hypervisor when the credit is available. So saves this PID in pseries_vas_window struct when the window is opened initially and reuse it later during DLPAR operation. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a57cbe6d292fe49ad55a0b49c5679d6a24d8fe73.camel@linux.ibm.com	2022-03-08 00:04:55 +11:00
Haren Myneni	40562fe4fa	powerpc/pseries/vas: Use common names in VAS capability structure nr_total/nr_used_credits provides credits usage to user space via sysfs and the same interface can be used on PowerNV in future. Changed with proper naming so that applicable on both pseries and PowerNV. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f4313e9f198ee4f8d4fa4d015d8d1873e17851e6.camel@linux.ibm.com	2022-03-08 00:04:54 +11:00
Michael Ellerman	9ef78b6293	Merge branch 'topic/ppc-kvm' into next Merge our topic branch containing powerpc KVM related commits. Alexey Kardashevskiy (1): KVM: PPC: Merge powerpc's debugfs entry content into generic entry Fabiano Rosas (9): KVM: PPC: Book3S HV: Stop returning internal values to userspace KVM: PPC: Fix vmx/vsx mixup in mmio emulation KVM: PPC: mmio: Reject instructions that access more than mmio.data size KVM: PPC: mmio: Return to guest after emulation failure KVM: PPC: Book3s: mmio: Deliver DSI after emulation failure KVM: PPC: Book3S HV: Check return value of kvmppc_radix_init KVM: PPC: Book3S HV: Delay setting of kvm ops KVM: PPC: Book3S HV: Free allocated memory if module init fails KVM: PPC: Decrement module refcount if init_vm fails Jason Wang (1): powerpc/kvm: no need to initialise statics to 0 Nour-eddine Taleb (1): KVM: PPC: Book3S HV: remove unnecessary casts	2022-03-08 00:02:29 +11:00
Michael Ellerman	4bc06c59f6	Merge branch 'topic/func-desc-lkdtm' into next Merge a topic branch we are maintaining with some cross-architecture changes to function descriptor handling and their use in LKDTM. From Christophe's cover letter: Fix LKDTM for PPC64/IA64/PARISC PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work on those three architectures because LKDTM messes up function descriptors with functions. This series does some cleanup in the three architectures and refactors function descriptors so that it can then easily use it in a generic way in LKDTM.	2022-03-07 23:34:32 +11:00
Nicholas Piggin	c7fa848ff0	KVM: PPC: Book3S HV P9: Fix "lost kick" race When new work is created that requires attention from the hypervisor (e.g., to inject an interrupt into the guest), fast_vcpu_kick is used to pull the target vcpu out of the guest if it may have been running. Therefore the work creation side looks like this: vcpu->arch.doorbell_request = 1; kvmppc_fast_vcpu_kick_hv(vcpu) { smp_mb(); cpu = vcpu->cpu; if (cpu != -1) send_ipi(cpu); } And the guest entry side should look like this: vcpu->cpu = smp_processor_id(); smp_mb(); if (vcpu->arch.doorbell_request) { // do something (abort entry or inject doorbell etc) } But currently the store and load are flipped, so it is possible for the entry to see no doorbell pending, and the doorbell creation misses the store to set cpu, resulting lost work (or at least delayed until the next guest exit). Fix this by reordering the entry operations and adding a smp_mb between them. The P8 path appears to have a similar race which is commented but not addressed yet. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-2-npiggin@gmail.com	2022-03-07 13:14:30 +11:00
Michael Ellerman	48015b632f	powerpc: Fix STACKTRACE=n build Our skiroot_defconfig doesn't enable FTRACE, and so doesn't get STACKTRACE enabled either. That leads to a build failure since commit `1614b2b11f` ("arch: Make ARCH_STACKWALK independent of STACKTRACE") made stacktrace.c build even when STACKTRACE=n. arch/powerpc/kernel/stacktrace.c: In function ‘handle_backtrace_ipi’: arch/powerpc/kernel/stacktrace.c:171:2: error: implicit declaration of function ‘nmi_cpu_backtrace’ 171 \| nmi_cpu_backtrace(regs); \| ^~~~~~~~~~~~~~~~~ arch/powerpc/kernel/stacktrace.c: In function ‘arch_trigger_cpumask_backtrace’: arch/powerpc/kernel/stacktrace.c:226:2: error: implicit declaration of function ‘nmi_trigger_cpumask_backtrace’ 226 \| nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This happens because our headers haven't defined arch_trigger_cpumask_backtrace, which causes lib/nmi_backtrace.c not to build nmi_cpu_backtrace(). The code in question doesn't actually depend on STACKTRACE=y, that was just added because arch_trigger_cpumask_backtrace() lived in stacktrace.c for convenience. So drop the dependency on CONFIG_STACKTRACE, that causes lib/nmi_backtrace.c to build nmi_cpu_backtrace() etc. and fixes the build. Fixes: `1614b2b11f` ("arch: Make ARCH_STACKWALK independent of STACKTRACE") [mpe: Cherry pick of `5a72345e6a` from next into fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220212111349.2806972-1-mpe@ellerman.id.au	2022-03-07 10:26:20 +11:00
Murilo Opsfelder Araujo	58dbe9b373	powerpc/64s: Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set The following build failure occurs when CONFIG_PPC_64S_HASH_MMU is not set: arch/powerpc/kernel/setup_64.c: In function ‘setup_per_cpu_areas’: arch/powerpc/kernel/setup_64.c:811:21: error: ‘mmu_linear_psize’ undeclared (first use in this function); did you mean ‘mmu_virtual_psize’? 811 \| if (mmu_linear_psize == MMU_PAGE_4K) \| ^~~~~~~~~~~~~~~~ \| mmu_virtual_psize arch/powerpc/kernel/setup_64.c:811:21: note: each undeclared identifier is reported only once for each function it appears in Move the declaration of mmu_linear_psize outside of CONFIG_PPC_64S_HASH_MMU ifdef. After the above is fixed, it fails later with the following error: ld: arch/powerpc/kexec/file_load_64.o: in function `.arch_kexec_kernel_image_probe': file_load_64.c:(.text+0x1c1c): undefined reference to `.add_htab_mem_range' Fix that, too, by conditioning add_htab_mem_range() symbol to CONFIG_PPC_64S_HASH_MMU. Fixes: `387e220a2e` ("powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMU") Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215567 Link: https://lore.kernel.org/r/20220301204743.45133-1-muriloo@linux.ibm.com	2022-03-05 20:42:21 +11:00
Nour-eddine Taleb	e40b38a41c	KVM: PPC: Book3S HV: remove unnecessary casts Remove unnecessary casts, from "void " to "struct kvmppc_xics " Signed-off-by: Nour-eddine Taleb <kernel.noureddine@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303143416.201851-1-kernel.noureddine@gmail.com	2022-03-04 12:58:46 +11:00
Christoph Hellwig	27674ef6c7	mm: remove the extra ZONE_DEVICE struct page refcount ZONE_DEVICE struct pages have an extra reference count that complicates the code for put_page() and several places in the kernel that need to check the reference count to see that a page is not being used (gup, compaction, migration, etc.). Clean up the code so the reference count doesn't need to be treated specially for ZONE_DEVICE pages. Note that this excludes the special idle page wakeup for fsdax pages, which still happens at refcount 1. This is a separate issue and will be sorted out later. Given that only fsdax pages require the notifiacation when the refcount hits 1 now, the PAGEMAP_OPS Kconfig symbol can go away and be replaced with a FS_DAX check for this hook in the put_page fastpath. Based on an earlier patch from Ralph Campbell <rcampbell@nvidia.com>. Link: https://lkml.kernel.org/r/20220210072828.2930359-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Chaitanya Kulkarni <kch@nvidia.com> Cc: Christian Knig <christian.koenig@amd.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2022-03-03 12:47:33 -05:00
Christoph Hellwig	dc90f0846d	mm: don't include <linux/memremap.h> in <linux/mm.h> Move the check for the actual pgmap types that need the free at refcount one behavior into the out of line helper, and thus avoid the need to pull memremap.h into mm.h. Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Chaitanya Kulkarni <kch@nvidia.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2022-03-03 12:47:33 -05:00
Anders Roxell	8219d31eff	powerpc/lib/sstep: Fix build errors with newer binutils Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian 2.37.90.20220207) the following build error shows up: {standard input}: Assembler messages: {standard input}:10576: Error: unrecognized opcode: `stbcx.' {standard input}:10680: Error: unrecognized opcode: `lharx' {standard input}:10694: Error: unrecognized opcode: `lbarx' Rework to add assembler directives [1] around the instruction. The problem with this might be that we can trick a power6 into single-stepping through an stbcx. for instance, and it will execute that in kernel mode. [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo Fixes: `350779a29f` ("powerpc: Handle most loads and stores in instruction emulation code") Cc: stable@vger.kernel.org # v4.14+ Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220224162215.3406642-3-anders.roxell@linaro.org	2022-03-01 23:51:09 +11:00
Anders Roxell	8667d0d64d	powerpc: Fix build errors with newer binutils Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian 2.37.90.20220207) the following build error shows up: {standard input}: Assembler messages: {standard input}:1190: Error: unrecognized opcode: `stbcix' {standard input}:1433: Error: unrecognized opcode: `lwzcix' {standard input}:1453: Error: unrecognized opcode: `stbcix' {standard input}:1460: Error: unrecognized opcode: `stwcix' {standard input}:1596: Error: unrecognized opcode: `stbcix' ... Rework to add assembler directives [1] around the instruction. Going through them one by one shows that the changes should be safe. Like __get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(), which according to the name is specific to power9. And __raw_rm_read*() are only called in things that are powernv or book3s_hv specific. [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo Cc: stable@vger.kernel.org Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Make commit subject more descriptive] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220224162215.3406642-2-anders.roxell@linaro.org	2022-03-01 23:51:08 +11:00
Anders Roxell	a633cb1edd	powerpc/lib/sstep: Fix 'sthcx' instruction Looks like there been a copy paste mistake when added the instruction 'stbcx' twice and one was probably meant to be 'sthcx'. Changing to 'sthcx' from 'stbcx'. Fixes: `350779a29f` ("powerpc: Handle most loads and stores in instruction emulation code") Cc: stable@vger.kernel.org # v4.14+ Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220224162215.3406642-1-anders.roxell@linaro.org	2022-03-01 23:51:08 +11:00
Michael Ellerman	2863dd2db2	powerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit When CONFIG_GENERIC_CPU=y (true for all our defconfigs) we pass -mcpu=powerpc64 to the compiler, even when we're building a 32-bit kernel. This happens because we have an ifdef CONFIG_PPC_BOOK3S_64/else block in the Makefile that was written before 32-bit supported GENERIC_CPU. Prior to that the else block only applied to 64-bit Book3E. The GCC man page says -mcpu=powerpc64 "[specifies] a pure ... 64-bit big endian PowerPC ... architecture machine [type], with an appropriate, generic processor model assumed for scheduling purposes." It's unclear how that interacts with -m32, which we are also passing, although obviously -m32 is taking precedence in some sense, as the 32-bit kernel only contains 32-bit instructions. This was noticed by inspection, not via any bug reports, but it does affect code generation. Comparing before/after code generation, there are some changes to instruction scheduling, and the after case (with -mcpu=powerpc64 removed) the compiler seems more keen to use r8. Fix it by making the else case only apply to Book3E 64, which excludes 32-bit. Fixes: `0e00a8c9fd` ("powerpc: Allow CPU selection also on PPC32") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220215112858.304779-1-mpe@ellerman.id.au	2022-03-01 23:51:04 +11:00
Daniel Henrique Barboza	749ed4a206	powerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties() Executing node_set_online() when nid = NUMA_NO_NODE results in an undefined behavior. node_set_online() will call node_set_state(), into __node_set(), into set_bit(), and since NUMA_NO_NODE is -1 we'll end up doing a negative shift operation inside arch/powerpc/include/asm/bitops.h. This potential UB was detected running a kernel with CONFIG_UBSAN. The behavior was introduced by commit `10f78fd0da` ("powerpc/numa: Fix a regression on memoryless node 0"), where the check for nid > 0 was removed to fix a problem that was happening with nid = 0, but the result is that now we're trying to online NUMA_NO_NODE nids as well. Checking for nid >= 0 will allow node 0 to be onlined while avoiding this UB with NUMA_NO_NODE. Fixes: `10f78fd0da` ("powerpc/numa: Fix a regression on memoryless node 0") Reported-by: Ping Fang <pifang@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220224182312.1012527-1-danielhb413@gmail.com	2022-03-01 23:41:01 +11:00
Christophe Leroy	973e2e6462	powerpc/interrupt: Remove struct interrupt_state Since commit `ceff77efa4` ("powerpc/64e/interrupt: Use new interrupt context tracking scheme") struct interrupt_state has been empty and unused. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1d862ce3eab3da6ca7ac47d4a78a18f154462511.1645806970.git.christophe.leroy@csgroup.eu	2022-03-01 23:41:00 +11:00
Hari Bathini	607451ce0a	powerpc/fadump: register for fadump as early as possible Crash recovery (fadump) is setup in the userspace by some service. This service rebuilds initrd with dump capture capability, if it is not already dump capture capable before proceeding to register for firmware assisted dump (echo 1 > /sys/kernel/fadump/registered). But arming the kernel with crash recovery support does not have to wait for userspace configuration. So, register for fadump while setting it up itself. This can at worst lead to a scenario, where /proc/vmcore is ready afer crash but the initrd does not know how/where to offload it, which is always better than not having a /proc/vmcore at all due to incomplete configuration in the userspace at the time of crash. Commit `0823c68b05` ("powerpc/fadump: re-register firmware-assisted dump if already registered") ensures this change does not break userspace. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> [mpe: Reword comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220201105305.155511-1-hbathini@linux.ibm.com	2022-03-01 23:41:00 +11:00

1 2 3 4 5 ...

24907 Commits