OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	888d3c9f7f	sysctl-6.4-rc1 This pull request goes with only a few sysctl moves from the kernel/sysctl.c file, the rest of the work has been put towards deprecating two API calls which incur recursion and prevent us from simplifying the registration process / saving memory per move. Most of the changes have been soaking on linux-next since v6.3-rc3. I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's feedback that we should see if we could save memory with these moves instead of incurring more memory. We currently incur more memory since when we move a syctl from kernel/sysclt.c out to its own file we end up having to add a new empty sysctl used to register it. To achieve saving memory we want to allow syctls to be passed without requiring the end element being empty, and just have our registration process rely on ARRAY_SIZE(). Without this, supporting both styles of sysctls would make the sysctl registration pretty brittle, hard to read and maintain as can be seen from Meng Tang's efforts to do just this [0]. Fortunately, in order to use ARRAY_SIZE() for all sysctl registrations also implies doing the work to deprecate two API calls which use recursion in order to support sysctl declarations with subdirectories. And so during this development cycle quite a bit of effort went into this deprecation effort. I've annotated the following two APIs are deprecated and in few kernel releases we should be good to remove them: * register_sysctl_table() * register_sysctl_paths() During this merge window we should be able to deprecate and unexport register_sysctl_paths(), we can probably do that towards the end of this merge window. Deprecating register_sysctl_table() will take a bit more time but this pull request goes with a few example of how to do this. As it turns out each of the conversions to move away from either of these two API calls also saves memory. And so long term, all these changes will prove to have saved a bit of memory on boot. The way I see it then is if remove a user of one deprecated call, it gives us enough savings to move one kernel/sysctl.c out from the generic arrays as we end up with about the same amount of bytes. Since deprecating register_sysctl_table() and register_sysctl_paths() does not require maintainer coordination except the final unexport you'll see quite a bit of these changes from other pull requests, I've just kept the stragglers after rc3. Most of these changes have been soaking on linux-next since around rc3. [0] https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRHAjQSHG1jZ3JvZkBr ZXJuZWwub3JnAAoJEM4jHQowkoinTzgQAI/uKHKi0VlUR1l2Psl0XbseUVueuyj3 ZDxSJpbVUmsoDf2MlLjzB8mYE3ricnNTDbLr7qOyA6pXdM1N0mY5LQmRVRu8/ffd 2T1hQ5pl7YnJdWP5dPhcF9Y+jnu1tjX1MW5DS4fzllwK7FnD86HuIruGq52RAPS/ /FH+BD9eodLWWXk6A/o2GFqoWxPKQI0GLxEYWa7Hg7yt8E/3PQL9QsRzn8i6U+HW BrN/+G3YD1VCCzXu0UAeXnm+i1Z7CdvqNdZuSkvE3DObiZ5WpOS+/i7FrDB7zdiu zAbHaifHnDPtcK3w2ZodbLAAwEWD/mG4iwIjE2kgIMVYxBv7TFDBRREXAWYAevIT UUuZnWDQsGaWdjywrebaUycEfd6dytKyan0fTXgMFkcoWRjejhitfdM2iZDdQROg q453p4HqOw4vTrhy4ov4zOX7J3EFiBzpZdl+SmLqcXk+jbLVb/Q9snUWz1AFtHBl gHoP5bS82uVktGG3MsObjgTzYYMQjO9YGIrVuW1VP9uWs8WaoWx6M9FQJIIhtwE+ h6wG2s7CjuFWnS0/IxWmDOn91QyUn1w7ohiz9TuvYj/5GLSBpBDGCJHsNB5T2WS1 qbQRaZ2Kg3j9TeyWfXxdlxBx7bt3ni+J/IXDY0zom2sTpGHKl8D2g5AzmEXJDTpl kd7Z3gsmwhDh =0U0W -----END PGP SIGNATURE----- Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl updates from Luis Chamberlain: "This only does a few sysctl moves from the kernel/sysctl.c file, the rest of the work has been put towards deprecating two API calls which incur recursion and prevent us from simplifying the registration process / saving memory per move. Most of the changes have been soaking on linux-next since v6.3-rc3. I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's feedback that we should see if we could save memory with these moves instead of incurring more memory. We currently incur more memory since when we move a syctl from kernel/sysclt.c out to its own file we end up having to add a new empty sysctl used to register it. To achieve saving memory we want to allow syctls to be passed without requiring the end element being empty, and just have our registration process rely on ARRAY_SIZE(). Without this, supporting both styles of sysctls would make the sysctl registration pretty brittle, hard to read and maintain as can be seen from Meng Tang's efforts to do just this [0]. Fortunately, in order to use ARRAY_SIZE() for all sysctl registrations also implies doing the work to deprecate two API calls which use recursion in order to support sysctl declarations with subdirectories. And so during this development cycle quite a bit of effort went into this deprecation effort. I've annotated the following two APIs are deprecated and in few kernel releases we should be good to remove them: - register_sysctl_table() - register_sysctl_paths() During this merge window we should be able to deprecate and unexport register_sysctl_paths(), we can probably do that towards the end of this merge window. Deprecating register_sysctl_table() will take a bit more time but this pull request goes with a few example of how to do this. As it turns out each of the conversions to move away from either of these two API calls also saves memory. And so long term, all these changes will prove to have saved a bit of memory on boot. The way I see it then is if remove a user of one deprecated call, it gives us enough savings to move one kernel/sysctl.c out from the generic arrays as we end up with about the same amount of bytes. Since deprecating register_sysctl_table() and register_sysctl_paths() does not require maintainer coordination except the final unexport you'll see quite a bit of these changes from other pull requests, I've just kept the stragglers after rc3" Link: https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org [0] * tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (29 commits) fs: fix sysctls.c built mm: compaction: remove incorrect #ifdef checks mm: compaction: move compaction sysctl to its own file mm: memory-failure: Move memory failure sysctls to its own file arm: simplify two-level sysctl registration for ctl_isa_vars ia64: simplify one-level sysctl registration for kdump_ctl_table utsname: simplify one-level sysctl registration for uts_kern_table ntfs: simplfy one-level sysctl registration for ntfs_sysctls coda: simplify one-level sysctl registration for coda_table fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls xfs: simplify two-level sysctl registration for xfs_table nfs: simplify two-level sysctl registration for nfs_cb_sysctls nfs: simplify two-level sysctl registration for nfs4_cb_sysctls lockd: simplify two-level sysctl registration for nlm_sysctls proc_sysctl: enhance documentation xen: simplify sysctl registration for balloon md: simplify sysctl registration hv: simplify sysctl registration scsi: simplify sysctl registration with register_sysctl() csky: simplify alignment sysctl registration ...	2023-04-27 16:52:33 -07:00
Luis Chamberlain	96200952ab	apparmor: simplify sysctls with register_sysctl_init() Using register_sysctl_paths() is really only needed if you have subdirectories with entries. We can use the simple register_sysctl() instead. Acked-by: John Johansen <john.johansen@canonical.com> Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>	2023-04-13 11:49:20 -07:00
Paul Moore	f22f9aaf6c	selinux: remove the runtime disable functionality After working with the larger SELinux-based distros for several years, we're finally at a place where we can disable the SELinux runtime disable functionality. The existing kernel deprecation notice explains the functionality and why we want to remove it: The selinuxfs "disable" node allows SELinux to be disabled at runtime prior to a policy being loaded into the kernel. If disabled via this mechanism, SELinux will remain disabled until the system is rebooted. The preferred method of disabling SELinux is via the "selinux=0" boot parameter, but the selinuxfs "disable" node was created to make it easier for systems with primitive bootloaders that did not allow for easy modification of the kernel command line. Unfortunately, allowing for SELinux to be disabled at runtime makes it difficult to secure the kernel's LSM hooks using the "__ro_after_init" feature. It is that last sentence, mentioning the '__ro_after_init' hardening, which is the real motivation for this change, and if you look at the diffstat you'll see that the impact of this patch reaches across all the different LSMs, helping prevent tampering at the LSM hook level. From a SELinux perspective, it is important to note that if you continue to disable SELinux via "/etc/selinux/config" it may appear that SELinux is disabled, but it is simply in an uninitialized state. If you load a policy with `load_policy -i`, you will see SELinux come alive just as if you had loaded the policy during early-boot. It is also worth noting that the "/sys/fs/selinux/disable" file is always writable now, regardless of the Kconfig settings, but writing to the file has no effect on the system, other than to display an error on the console if a non-zero/true value is written. Finally, in the several years where we have been working on deprecating this functionality, there has only been one instance of someone mentioning any user visible breakage. In this particular case it was an individual's kernel test system, and the workaround documented in the deprecation notice ("selinux=0" on the kernel command line) resolved the issue without problem. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2023-03-20 12:34:23 -04:00
Linus Torvalds	f122a08b19	capability: just use a 'u64' instead of a 'u32[2]' array Back in 2008 we extended the capability bits from 32 to 64, and we did it by extending the single 32-bit capability word from one word to an array of two words. It was then obfuscated by hiding the "2" behind two macro expansions, with the reasoning being that maybe it gets extended further some day. That reasoning may have been valid at the time, but the last thing we want to do is to extend the capability set any more. And the array of values not only causes source code oddities (with loops to deal with it), but also results in worse code generation. It's a lose-lose situation. So just change the 'u32[2]' into a 'u64' and be done with it. We still have to deal with the fact that the user space interface is designed around an array of these 32-bit values, but that was the case before too, since the array layouts were different (ie user space doesn't use an array of 32-bit values for individual capability masks, but an array of 32-bit slices of multiple masks). So that marshalling of data is actually simplified too, even if it does remain somewhat obscure and odd. This was all triggered by my reaction to the new "cap_isidentical()" introduced recently. By just using a saner data structure, it went from unsigned __capi; CAP_FOR_EACH_U32(__capi) { if (a.cap[__capi] != b.cap[__capi]) return false; } return true; to just being return a.val == b.val; instead. Which is rather more obvious both to humans and to compilers. Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-03-01 10:01:22 -08:00
Linus Torvalds	3822a7c409	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K DmxHkn0LAitGgJRS/W9w81yrgig9tAQ= =MlGs -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...	2023-02-23 17:09:35 -08:00
Linus Torvalds	05e6295f7b	fs.idmapped.v6.3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NlQAKCRCRxhvAZXjc orOaAP9i2h3OJy95nO2Fpde0Bt2UT+oulKCCcGlvXJ8/+TQpyQD/ZQq47gFQ0EAz Br5NxeyGeecAb0lHpFz+CpLGsxMrMwQ= =+BG5 -----END PGP SIGNATURE----- Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs idmapping updates from Christian Brauner: - Last cycle we introduced the dedicated struct mnt_idmap type for mount idmapping and the required infrastucture in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). As promised in last cycle's pull request message this converts everything to rely on struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this was a potential source for bugs. This finishes the conversion. Instead of passing the plain namespace around this updates all places that currently take a pointer to a mnt_userns with a pointer to struct mnt_idmap. Now that the conversion is done all helpers down to the really low-level helpers only accept a struct mnt_idmap argument instead of two namespace arguments. Conflating mount and other idmappings will now cause the compiler to complain loudly thus eliminating the possibility of any bugs. This makes it impossible for filesystem developers to mix up mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. Everything associated with struct mnt_idmap is moved into a single separate file. With that change no code can poke around in struct mnt_idmap. It can only be interacted with through dedicated helpers. That means all filesystems are and all of the vfs is completely oblivious to the actual implementation of idmappings. We are now also able to extend struct mnt_idmap as we see fit. For example, we can decouple it completely from namespaces for users that don't require or don't want to use them at all. We can also extend the concept of idmappings so we can cover filesystem specific requirements. In combination with the vfs{g,u}id_t work we finished in v6.2 this makes this feature substantially more robust and thus difficult to implement wrong by a given filesystem and also protects the vfs. - Enable idmapped mounts for tmpfs and fulfill a longstanding request. A long-standing request from users had been to make it possible to create idmapped mounts for tmpfs. For example, to share the host's tmpfs mount between multiple sandboxes. This is a prerequisite for some advanced Kubernetes cases. Systemd also has a range of use-cases to increase service isolation. And there are more users of this. However, with all of the other work going on this was way down on the priority list but luckily someone other than ourselves picked this up. As usual the patch is tiny as all the infrastructure work had been done multiple kernel releases ago. In addition to all the tests that we already have I requested that Rodrigo add a dedicated tmpfs testsuite for idmapped mounts to xfstests. It is to be included into xfstests during the v6.3 development cycle. This should add a slew of additional tests. * tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits) shmem: support idmapped mounts for tmpfs fs: move mnt_idmap fs: port vfs{g,u}id helpers to mnt_idmap fs: port fs{g,u}id helpers to mnt_idmap fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap fs: port i_{g,u}id_{needs_}update() to mnt_idmap quota: port to mnt_idmap fs: port privilege checking helpers to mnt_idmap fs: port inode_owner_or_capable() to mnt_idmap fs: port inode_init_owner() to mnt_idmap fs: port acl to mnt_idmap fs: port xattr to mnt_idmap fs: port ->permission() to pass mnt_idmap fs: port ->fileattr_set() to pass mnt_idmap fs: port ->set_acl() to pass mnt_idmap fs: port ->get_acl() to pass mnt_idmap fs: port ->tmpfile() to pass mnt_idmap fs: port ->rename() to pass mnt_idmap fs: port ->mknod() to pass mnt_idmap fs: port ->mkdir() to pass mnt_idmap ...	2023-02-20 11:53:11 -08:00
John Johansen	cbb13e12a5	apparmor: Fix regression in compat permissions for getattr This fixes a regression in mediation of getattr when old policy built under an older ABI is loaded and mapped to internal permissions. The regression does not occur for all getattr permission requests, only appearing if state zero is the final state in the permission lookup. This is because despite the first state (index 0) being guaranteed to not have permissions in both newer and older permission formats, it may have to carry permissions that were not mediated as part of an older policy. These backward compat permissions are mapped here to avoid special casing the mediation code paths. Since the mapping code already takes into account backwards compat permission from older formats it can be applied to state 0 to fix the regression. Fixes: `408d53e923` ("apparmor: compute file permissions on profile load") Reported-by: Philip Meulengracht <the_meulengracht@hotmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2023-02-15 11:24:38 -08:00
Christian Brauner	e67fe63341	fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap Convert to struct mnt_idmap. Remove legacy file_mnt_user_ns() and mnt_user_ns(). Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	4609e1f18e	fs: port ->permission() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:28 +01:00
Christian Brauner	c54bd91e9e	fs: port ->mkdir() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Hao Sun	0b7b8704dd	mm: new primitive kvmemdup() Similar to kmemdup(), but support large amount of bytes with kvmalloc() and does not guarantee that the result will be physically contiguous. Use only in cases where kvmalloc() is needed and free it with kvfree(). Also adapt policy_unpack.c in case someone bisect into this. Link: https://lkml.kernel.org/r/20221221144245.27164-1-sunhao.th@gmail.com Signed-off-by: Hao Sun <sunhao.th@gmail.com> Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nick Terrell <terrelln@fb.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-01-18 17:12:47 -08:00
Linus Torvalds	93761c93e9	+ Features - switch to zstd compression for profile raw data + Cleanups - Simplify obtain the newest label on a cred - remove useless static inline functions - compute permission conversion on policy unpack - refactor code to share common permissins - refactor unpack to group policy backwards compatiblity code - add __init annotation to aa_{setup/teardown}_dfa_engine() + Bug Fixes - fix a memleak in - multi_transaction_new() - free_ruleset() - unpack_profile() - alloc_ns() - fix lockdep warning when removing a namespace - fix regression in stacking due to label flags - fix loading of child before parent - fix kernel-doc comments that differ from fns - fix spelling errors in comments - store return value of unpack_perms_table() to signed variable -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAmOZwywACgkQBS82cBjV w9jBjRAAmj4gyK0L3eGY4IV2BpvnkHwHY4lOObJulTwILOOj0Pz8CJqRCa/HDCGj aOlnwqksPsAjadzzfi58D6TnT+3fOuskbcMgTyvX5jraTXPrUl90+hXorbXKuLrw iaX6QxW8soNW/s3oJhrC2HxbIhGA9VpVnmQpVZpJMmz5bU2xmzL62FCN8x88kytr 9CygaudPrvwYJf5pPd62p7ltj2S6lFwZ6dVCyiDQGTc+Gyng4G8p4MCfI1CwMMyo mAUeeRnoeeBwH3tSy/Wsr72jPKjsMASpcMHo3ns/dVSw/ug2FYYToZbfxT/uAa6O WVHfS1Kv/5afG9xxyfocWecd+Yp3lsXq9F+q36uOT9NeJmlej9aJr5sWMcvV3sru QVNN7tFZbHqCnLhpl6RDH/NiguweNYQXrl2lukXZe/FKu/KDasFIOzL+IAt2TqZE 3mWrha7Q7j/gdBw8+fHHGtXCx0NSQlz1oFLo/y/mI7ztwUPJsBYbH5+108iP0ys/ 7Kd+jkYRucJB4upGH4meQbN6f/rrs3+m/b/j0Q8RCFHAs2f+mYZeN/JOHCo0T4YH KO1W60846fPs+7yZTVxWYFpR/kIuXksyxMWpEEZFFtF4MNoaeM1uypBWqm/JmKYr 8oDtEyiOd/qmZnWRcuO3/bmdoJUZY1zTXWA0dlScYc8vR4KC+EE= =6GKy -----END PGP SIGNATURE----- Merge tag 'apparmor-pr-2022-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "Features: - switch to zstd compression for profile raw data Cleanups: - simplify obtaining the newest label on a cred - remove useless static inline functions - compute permission conversion on policy unpack - refactor code to share common permissins - refactor unpack to group policy backwards compatiblity code - add __init annotation to aa_{setup/teardown}_dfa_engine() Bug Fixes: - fix a memleak in - multi_transaction_new() - free_ruleset() - unpack_profile() - alloc_ns() - fix lockdep warning when removing a namespace - fix regression in stacking due to label flags - fix loading of child before parent - fix kernel-doc comments that differ from fns - fix spelling errors in comments - store return value of unpack_perms_table() to signed variable" * tag 'apparmor-pr-2022-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (64 commits) apparmor: Fix uninitialized symbol 'array_size' in policy_unpack_test.c apparmor: Add __init annotation to aa_{setup/teardown}_dfa_engine() apparmor: Fix memleak in alloc_ns() apparmor: Fix memleak issue in unpack_profile() apparmor: fix a memleak in free_ruleset() apparmor: Fix spelling of function name in comment block apparmor: Use pointer to struct aa_label for lbs_cred AppArmor: Fix kernel-doc LSM: Fix kernel-doc AppArmor: Fix kernel-doc apparmor: Fix loading of child before parent apparmor: refactor code that alloc null profiles apparmor: fix obsoleted comments for aa_getprocattr() and audit_resource() apparmor: remove useless static inline functions apparmor: Fix unpack_profile() warn: passing zero to 'ERR_PTR' apparmor: fix uninitialize table variable in error in unpack_trans_table apparmor: store return value of unpack_perms_table() to signed variable apparmor: Fix kunit test for out of bounds array apparmor: Fix decompression of rawdata for read back to userspace apparmor: Fix undefined references to zstd_ symbols ...	2022-12-14 13:42:09 -08:00
Linus Torvalds	c76ff350bd	lsm/stable-6.2 PR 20221212 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmOXmxkUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMPXg//cxfYC8lRtVpuGNCZWDietSiHzpzu +qFntaTplvybJMQX0HfgNee5cTBZM+W5mp1BHRcZInvV5LRhyrVtgsxDBifutE4x LyUJAw5SkiPdRC+XLDIRLKiZCobFBLVs2zO+qibIqsyR60pFjU6WXBLbJfidXBFR yWudDbLU0YhQJCHdNHNqnHCgqrEculxn6q3QPvm/DX0xzBwkFHSSYBkGNvHW2ZTA lKNreEOwEk5DTLIKjP4bJ72ixp0xbshw5CXuxtwB/12/4h8QbWbJVQLlIeZrTLmp zQXQLJ3pCqKJ2OUCgMDK+wmkvLezd80BV3Due7KX0pT0YRDygoh5QEpZ5/8k8eG7 prxToh2gJWk2htfJF6kgMpAh9Jqewcke4BysbYVM/427OPZYwQqLDZDGOzbtT6pl FYF+adN9wwkAErnHnPlzYipUEpBWurbjtsV8KFWNERoZ4YmzfSPEisRqGIHDGRws bTyq/7qs5FXkb1zULELj8V+S2ULsmxPqsxJ63p9di54Uo9lHK0I+0IUtajGDdfze psAasa9DD/oH2PAbSmpQ5Xo9XyfHRXsVuz1twEmEA14ML0m4wHbNWVHaK0aaXVdG kJKSDSjMsiV+GiwNo7ISJ4pVdUpnMI/iZSghFfV28cJslNhJDeaREHaE/Wtn1/xF /bCVmEfS16UoJsQ= =klFk -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improve the error handling in the device cgroup such that memory allocation failures when updating the access policy do not potentially alter the policy. - Some minor fixes to reiserfs to ensure that it properly releases LSM-related xattr values. - Update the security_socket_getpeersec_stream() LSM hook to take sockptr_t values. Previously the net/BPF folks updated the getsockopt code in the network stack to leverage the sockptr_t type to make it easier to pass both kernel and __user pointers, but unfortunately when they did so they didn't convert the LSM hook. While there was/is no immediate risk by not converting the LSM hook, it seems like this is a mistake waiting to happen so this patch proactively does the LSM hook conversion. - Convert vfs_getxattr_alloc() to return an int instead of a ssize_t and cleanup the callers. Internally the function was never going to return anything larger than an int and the callers were doing some very odd things casting the return value; this patch fixes all that and helps bring a bit of sanity to vfs_getxattr_alloc() and its callers. - More verbose, and helpful, LSM debug output when the system is booted with "lsm.debug" on the command line. There are examples in the commit description, but the quick summary is that this patch provides better information about which LSMs are enabled and the ordering in which they are processed. - General comment and kernel-doc fixes and cleanups. * tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: Fix description of fs_context_parse_param lsm: Add/fix return values in lsm_hooks.h and fix formatting lsm: Clarify documentation of vm_enough_memory hook reiserfs: Add missing calls to reiserfs_security_free() lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths device_cgroup: Roll back to original exceptions after copy failure LSM: Better reporting of actual LSMs at boot lsm: make security_socket_getpeersec_stream() sockptr_t safe audit: Fix some kernel-doc warnings lsm: remove obsoleted comments for security hooks fs: edit a comment made in bad taste	2022-12-13 09:47:48 -08:00
Linus Torvalds	299e2b1967	Landlock updates for v6.2-rc1 -----BEGIN PGP SIGNATURE----- iIYEABYIAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCY5b27RAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSg9YA/0K10H+VsGt1+qqR4+w9SM7SFzbgszrV3Yw9 rwiPgaPVAP9rxXPr2bD2hAk7/Lv9LeJ2kfM9RzMErP1A6UsC5YVbDA== =mAG7 -----END PGP SIGNATURE----- Merge tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This adds file truncation support to Landlock, contributed by Günther Noack. As described by Günther [1], the goal of these patches is to work towards a more complete coverage of file system operations that are restrictable with Landlock. The known set of currently unsupported file system operations in Landlock is described at [2]. Out of the operations listed there, truncate is the only one that modifies file contents, so these patches should make it possible to prevent the direct modification of file contents with Landlock. The new LANDLOCK_ACCESS_FS_TRUNCATE access right covers both the truncate(2) and ftruncate(2) families of syscalls, as well as open(2) with the O_TRUNC flag. This includes usages of creat() in the case where existing regular files are overwritten. Additionally, this introduces a new Landlock security blob associated with opened files, to track the available Landlock access rights at the time of opening the file. This is in line with Unix's general approach of checking the read and write permissions during open(), and associating this previously checked authorization with the opened file. An ongoing patch documents this use case [3]. In order to treat truncate(2) and ftruncate(2) calls differently in an LSM hook, we split apart the existing security_path_truncate hook into security_path_truncate (for truncation by path) and security_file_truncate (for truncation of previously opened files)" Link: https://lore.kernel.org/r/20221018182216.301684-1-gnoack3000@gmail.com [1] Link: https://www.kernel.org/doc/html/v6.1/userspace-api/landlock.html#filesystem-flags [2] Link: https://lore.kernel.org/r/20221209193813.972012-1-mic@digikod.net [3] * tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: samples/landlock: Document best-effort approach for LANDLOCK_ACCESS_FS_REFER landlock: Document Landlock's file truncation support samples/landlock: Extend sample tool to support LANDLOCK_ACCESS_FS_TRUNCATE selftests/landlock: Test ftruncate on FDs created by memfd_create(2) selftests/landlock: Test FD passing from restricted to unrestricted processes selftests/landlock: Locally define __maybe_unused selftests/landlock: Test open() and ftruncate() in multiple scenarios selftests/landlock: Test file truncation support landlock: Support file truncation landlock: Document init_layer_masks() helper landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed() security: Create file_truncate hook from path_truncate hook	2022-12-13 09:14:50 -08:00
Linus Torvalds	e1212e9b6f	fs.vfsuid.conversion.v6.2 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5bspgAKCRCRxhvAZXjc opEWAQDpF5rnZn1vv4/uOTij9ztcA4yLxu/Q19CdqBaoHlWZ9AD/d3eecee3bh5h iPHtlUK5/VspfD9LPpdc5ZbPCdZ2pA4= =t6NN -----END PGP SIGNATURE----- Merge tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfsuid updates from Christian Brauner: "Last cycle we introduced the vfs{g,u}id_t types and associated helpers to gain type safety when dealing with idmapped mounts. That initial work already converted a lot of places over but there were still some left, This converts all remaining places that still make use of non-type safe idmapping helpers to rely on the new type safe vfs{g,u}id based helpers. Afterwards it removes all the old non-type safe helpers" * tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: fs: remove unused idmapping helpers ovl: port to vfs{g,u}id_t and associated helpers fuse: port to vfs{g,u}id_t and associated helpers ima: use type safe idmapping helpers apparmor: use type safe idmapping helpers caps: use type safe idmapping helpers fs: use type safe idmapping helpers mnt_idmapping: add missing helpers	2022-12-12 19:20:05 -08:00
Rae Moar	b11e51dd70	apparmor: test: make static symbols visible during kunit testing Use macros, VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT, to allow static symbols to be conditionally set to be visible during apparmor_policy_unpack_test, which removes the need to include the testing file in the implementation file. Change the namespace of the symbols that are now conditionally visible (by adding the prefix aa_) to avoid confusion with symbols of the same name. Allow the test to be built as a module and namespace the module name from policy_unpack_test to apparmor_policy_unpack_test to improve clarity of the module name. Provide an example of how static symbols can be dealt with in testing. Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2022-12-12 14:13:48 -07:00
Paul Moore	f6fbd8cbf3	lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths The vfs_getxattr_alloc() function currently returns a ssize_t value despite the fact that it only uses int values internally for return values. Fix this by converting vfs_getxattr_alloc() to return an int type and adjust the callers as necessary. As part of these caller modifications, some of the callers are fixed to properly free the xattr value buffer on both success and failure to ensure that memory is not leaked in the failure case. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-11-18 17:07:03 -05:00
Paul Moore	b10b9c342f	lsm: make security_socket_getpeersec_stream() sockptr_t safe Commit `4ff09db1b7` ("bpf: net: Change sk_getsockopt() to take the sockptr_t argument") made it possible to call sk_getsockopt() with both user and kernel address space buffers through the use of the sockptr_t type. Unfortunately at the time of conversion the security_socket_getpeersec_stream() LSM hook was written to only accept userspace buffers, and in a desire to avoid having to change the LSM hook the commit author simply passed the sockptr_t's userspace buffer pointer. Since the only sk_getsockopt() callers at the time of conversion which used kernel sockptr_t buffers did not allow SO_PEERSEC, and hence the security_socket_getpeersec_stream() hook, this was acceptable but also very fragile as future changes presented the possibility of silently passing kernel space pointers to the LSM hook. There are several ways to protect against this, including careful code review of future commits, but since relying on code review to catch bugs is a recipe for disaster and the upstream eBPF maintainer is "strongly against defensive programming", this patch updates the LSM hook, and all of the implementations to support sockptr_t and safely handle both user and kernel space buffers. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-11-04 23:25:30 -04:00
John Johansen	4295c60bbe	apparmor: Fix uninitialized symbol 'array_size' in policy_unpack_test.c Make sure array_size is initialized in the kunit test to get rid of compiler warnings. This will also make sure the following tests fail consistently if the first test fails. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-11-01 21:23:05 -07:00
Xiu Jianfeng	f6c64dc32a	apparmor: Add __init annotation to aa_{setup/teardown}_dfa_engine() The aa_setup_dfa_engine() and aa_teardown_dfa_engine() is only called in apparmor_init(), so let us add __init annotation to them. Fixes: `11c236b89d` ("apparmor: add a default null dfa") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-11-01 21:17:26 -07:00
Xiu Jianfeng	e9e6fa49db	apparmor: Fix memleak in alloc_ns() After changes in commit `a1bd627b46` ("apparmor: share profile name on replacement"), the hname member of struct aa_policy is not valid slab object, but a subset of that, it can not be freed by kfree_sensitive(), use aa_policy_destroy() to fix it. Fixes: `a1bd627b46` ("apparmor: share profile name on replacement") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-11-01 05:32:13 -07:00
Christian Brauner	5e26a01e56	apparmor: use type safe idmapping helpers We already ported most parts and filesystems over for v6.0 to the new vfs{g,u}id_t type and associated helpers for v6.0. Convert the remaining places so we can remove all the old helpers. This is a non-functional change. Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2022-10-26 10:03:19 +02:00
Xiu Jianfeng	3265949f7c	apparmor: Fix memleak issue in unpack_profile() Before aa_alloc_profile(), it has allocated string for @*ns_name if @tmpns is not NULL, so directly return -ENOMEM if aa_alloc_profile() failed will cause a memleak issue, and even if aa_alloc_profile() succeed, in the @fail_profile tag of aa_unpack(), it need to free @ns_name as well, this patch fixes them. Fixes: `736ec752d9` ("AppArmor: policy routines for loading and unpacking policy") Fixes: `04dc715e24` ("apparmor: audit policy ns specified in policy load") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:19 -07:00
Gaosheng Cui	7dd426e33e	apparmor: fix a memleak in free_ruleset() When the aa_profile is released, we will call free_ruleset to release aa_ruleset, but we don't free the memory of aa_ruleset, so there will be memleak, fix it. unreferenced object 0xffff8881475df800 (size 1024): comm "apparmor_parser", pid 883, jiffies 4294899650 (age 9114.088s) hex dump (first 32 bytes): 00 f8 5d 47 81 88 ff ff 00 f8 5d 47 81 88 ff ff ..]G......]G.... 00 00 00 00 00 00 00 00 00 dc 65 47 81 88 ff ff ..........eG.... backtrace: [<00000000370e658e>] __kmem_cache_alloc_node+0x182/0x700 [<00000000f2f5a6d2>] kmalloc_trace+0x2c/0x130 [<00000000c5c905b3>] aa_alloc_profile+0x1bc/0x5c0 [<00000000bc4fa72b>] unpack_profile+0x319/0x30c0 [<00000000eab791e9>] aa_unpack+0x307/0x1450 [<000000002c3a6ee1>] aa_replace_profiles+0x1b8/0x3790 [<00000000d0c3fd54>] policy_update+0x35a/0x890 [<00000000d04fed90>] profile_replace+0x1d1/0x260 [<00000000cba0c0a7>] vfs_write+0x283/0xd10 [<000000006bae64a5>] ksys_write+0x134/0x260 [<00000000b2fd8f31>] __x64_sys_write+0x78/0xb0 [<00000000f3c8a015>] do_syscall_64+0x5c/0x90 [<00000000a242b1db>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: `217af7e2f4` ("apparmor: refactor profile rules and attachments") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:19 -07:00
Yang Li	d44c692350	apparmor: Fix spelling of function name in comment block 'resouce' -> 'resource' Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2396 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:19 -07:00
Xiu Jianfeng	37923d4321	apparmor: Use pointer to struct aa_label for lbs_cred According to the implementations of cred_label() and set_cred_label(), we should use pointer to struct aa_label for lbs_cred instead of struct aa_task_ctx, this patch fixes it. Fixes: `bbd3662a83` ("Infrastructure management of the cred security blob") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:19 -07:00
Jiapeng Chong	a2217387c3	AppArmor: Fix kernel-doc security/apparmor/ipc.c:53: warning: expecting prototype for audit_cb(). Prototype was for audit_signal_cb() instead. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2337 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:18 -07:00
Jiapeng Chong	391f121150	LSM: Fix kernel-doc security/apparmor/lsm.c:753: warning: expecting prototype for apparmor_bprm_committed_cred(). Prototype was for apparmor_bprm_committed_creds() instead. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2338 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:18 -07:00
Jiapeng Chong	64a27ba984	AppArmor: Fix kernel-doc security/apparmor/audit.c:93: warning: expecting prototype for audit_base(). Prototype was for audit_pre() instead. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2339 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:18 -07:00
John Johansen	665b1856dc	apparmor: Fix loading of child before parent Unfortunately it is possible for some userspace's to load children profiles before the parent profile. This can even happen when the child and the parent are in different load sets. Fix this by creating a null place holder profile that grants no permissions and can be replaced by the parent once it is loaded. Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-25 00:15:11 -07:00
John Johansen	58f89ce58b	apparmor: refactor code that alloc null profiles Bother unconfined and learning profiles use the null profile as their base. Refactor so they are share a common base routine. This doesn't save much atm but will be important when the feature set of the parent is inherited. Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-24 22:35:36 -07:00
Gaosheng Cui	1f2bc06a8d	apparmor: fix obsoleted comments for aa_getprocattr() and audit_resource() Update the comments for aa_getprocattr() and audit_resource(), the args of them have beed changed since commit `76a1d263ab` ("apparmor: switch getprocattr to using label_print fns()"). Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-24 22:35:23 -07:00
Gaosheng Cui	2f7a29deba	apparmor: remove useless static inline functions Remove the following useless static inline functions: 1. label_is_visible() is a static function in security/apparmor/label.c, and it's not used, aa_ns_visible() can do the same things as it, so it's redundant. 2. is_deleted() is a static function in security/apparmor/file.c, and it's not used since commit `aebd873e8d` ("apparmor: refactor path name lookup and permission checks around labels"), so it's redundant. They are redundant, so remove them. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-24 22:35:11 -07:00
Günther Noack	3350607dc5	security: Create file_truncate hook from path_truncate hook Like path_truncate, the file_truncate hook also restricts file truncation, but is called in the cases where truncation is attempted on an already-opened file. This is required in a subsequent commit to handle ftruncate() operations differently to truncate() operations. Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20221018182216.301684-2-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2022-10-19 09:01:40 +02:00
John Johansen	53991aedcd	apparmor: Fix unpack_profile() warn: passing zero to 'ERR_PTR' unpack_profile() sets a default error on entry but this gets overridden by error assignment by functions called in its body. If an error check that was relying on the default value is triggered after one of these error assignments then zero will be passed to ERR_PTR. Fix this by setting up a default -EPROTO assignment in the error path and while we are at it make sure the correct error is returned in non-default cases. Fixes: `217af7e2f4` ("apparmor: refactor profile rules and attachments") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-10 17:17:19 -07:00
John Johansen	ee21a175ec	apparmor: fix uninitialize table variable in error in unpack_trans_table The error path has one case where *table is uninitialized, initialize it. Fixes: `a0792e2ced` ("apparmor: make transition table unpack generic so it can be reused") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-10 11:18:50 -07:00
Muhammad Usama Anjum	5515a8e30e	apparmor: store return value of unpack_perms_table() to signed variable The unpack_perms_table() can return error which is negative value. Store the return value to a signed variable. policy->size is unsigned variable. It shouldn't be used to store the return status. Fixes: 2d6b2dea7f3c ("apparmor: add the ability for policy to specify a permission table") Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-04 02:34:29 -07:00
John Johansen	3249054168	apparmor: Fix kunit test for out of bounds array The apparmor kunit tests are failing on the out of bounds array check with the following failure # policy_unpack_test_unpack_array_out_of_bounds: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:178 Expected unpack_array(puf->e, name, &array_size) == 1, but unpack_array(puf->e, name, &array_size) == -1 # policy_unpack_test_unpack_array_out_of_bounds: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:180 Expected array_size == 0, but array_size == 64192 not ok 5 - policy_unpack_test_unpack_array_out_of_bounds This is because unpack_array changed to allow distinguishing between the array not being present and an error. In the error case the array size is not set and should not be tested. Reported-by: kernel test robot <yujie.liu@intel.com> Fixes: 995a5b64620e ("apparmor: make unpack_array return a trianary value") Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	a2f31df06b	apparmor: Fix decompression of rawdata for read back to userspace The rawdata readback has a few of problems. First if compression is enabled when the data is read then the compressed data is read out instead decompressing the data. Second if compression of the data fails, the code does not handle holding onto the raw_data in uncompressed form. Third if the compression is enabled/disabled after the rawdata was loaded, the check against the global control of whether to use compression does not reflect what was already done to the data. Fix these by always storing the compressed size, along with the original data size even if compression fails or is not used. And use this to detect whether the rawdata is actually compressed. Fixes: 52ccc20c652b ("apparmor: use zstd compression for profile data") Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Jon Tourville <jon.tourville@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	70f24a9f90	apparmor: Fix undefined references to zstd_ symbols Unfortunately the switch to using zstd compression did not properly ifdef all the code that uses zstd_ symbols. So that if exporting of binary policy is disabled in the config the compile will fail with the following errors security/apparmor/lsm.c:1545: undefined reference to `zstd_min_clevel' aarch64-linux-ld: security/apparmor/lsm.c:1545: undefined reference to `zstd_max_clevel' Reported-by: kernel test robot <lkp@intel.com> Fixes: 52ccc20c652b ("apparmor: use zstd compression for profile data") Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Jon Tourville <jon.tourville@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	14d37a7f14	apparmor: make sure the decompression ctx is promperly initialized The decompress ctx was not properly initialized when reading raw profile data back to userspace. Reported-by: kernel test robot <lkp@intel.com> Fixes: 52ccc20c652b ("apparmor: use zstd compression for profile data") Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	73c7e91c8b	apparmor: Remove unnecessary size check when unpacking trans_table The index into the trans_table has a max size of 2^24 bits which the code was testing but this is unnecessary as unpack_array can only unpack a table of 2^16 bits in size so the table unpacked will never be larger than what can be indexed, and any test here is redundant. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	1ddece8cd0	apparmor: Fix doc comment for compute_fperms When compute_fperms was moved to policy_compat and made static it was renamed from aa_compute_fperms to just compute_fperms to help indicate it is only available statically. Unfortunately the doc comment did not also get updated to reflect the change. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
Xiu Jianfeng	65f7f666f2	apparmor: make __aa_path_perm() static Make __aa_path_perm() static as it's only used inside apparmor/file.c. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
Gaosheng Cui	adaa9a3f72	apparmor: Simplify obtain the newest label on a cred In aa_get_task_label(), aa_get_newest_cred_label(__task_cred(task)) can do the same things as aa_get_newest_label(__aa_task_raw_label(task)), so we can replace it and remove __aa_task_raw_label() to simplify the code. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	1f939c6bd1	apparmor: Fix regression in stacking due to label flags The unconfined label flag is not being computed correctly. It should only be set if all the profiles in the vector are set, which is different than what is required for the debug and stale flag that are set if any on the profile flags are set. Fixes: `c1ed5da197` ("apparmor: allow label to carry debug flags") Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	961f3e3de1	apparmor: fix aa_class_names[] to match reserved classes The class name map did not have the reserved names added. Fix this Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	1ad22fcc4d	apparmor: rework profile->rules to be a list Convert profile->rules to a list as the next step towards supporting multiple rulesets in a profile. For this step only support a single list entry item. The logic for iterating the list will come as a separate step. Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	217af7e2f4	apparmor: refactor profile rules and attachments In preparation for moving from a single set of rules and a single attachment to multiple rulesets and attachments separate from the profile refactor attachment information and ruleset info into their own structures. Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00
John Johansen	3bf3d728a5	apparmor: verify loaded permission bits masks don't overlap Add an additional verification that loaded permission sets don't overlap in ways that are not intended. This will help ensure that permission accumulation can't result in an invalid permission set. Signed-off-by: John Johansen <john.johansen@canonical.com>	2022-10-03 14:49:04 -07:00

1 2 3 4 5 ...

736 Commits