OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Paul Moore	cdbec3ede0	selinux: shorten the policy capability enum names The SELinux policy capability enum names are rather long and follow the "POLICYDB_CAPABILITY_XXX format". While the "POLICYDB_" prefix is helpful in tying the enums to other SELinux policy constants, macros, etc. there is no reason why we need to spell out "CAPABILITY" completely. Shorten "CAPABILITY" to "CAP" in order to make things a bit shorter and cleaner. Moving forward, the SELinux policy capability enum names should follow the "POLICYDB_CAP_XXX" format. Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-03-02 11:37:03 -05:00
Roopa Prabhu	7b8135f4df	rtnetlink: add new rtm tunnel api for tunnel id filtering This patch adds new rtm tunnel msg and api for tunnel id filtering in dst_metadata devices. First dst_metadata device to use the api is vxlan driver with AF_BRIDGE family. This and later changes add ability in vxlan driver to do tunnel id filtering (or vni filtering) on dst_metadata devices. This is similar to vlan api in the vlan filtering bridge. this patch includes selinux nlmsg_route_perms support for RTM_*TUNNEL api from Benjamin Poirier. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-01 08:38:02 +00:00
Richard Haines	65881e1db4	selinux: allow FIOCLEX and FIONCLEX with policy capability These ioctls are equivalent to fcntl(fd, F_SETFD, flags), which SELinux always allows too. Furthermore, a failed FIOCLEX could result in a file descriptor being leaked to a process that should not have access to it. As this patch removes access controls, a policy capability needs to be enabled in policy to always allow these ioctls. Based-on-patch-by: Demi Marie Obenour <demiobenour@gmail.com> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-25 15:35:19 -05:00
Ondrej Mosnacek	ce2fc710c9	selinux: fix misuse of mutex_is_locked() mutex_is_locked() tests whether the mutex is locked by any task, while here we want to test if it is held by the current task. To avoid false/missed WARNINGs, use lockdep_assert_is_held() and lockdep_assert_is_not_held() instead, which do the right thing (though they are a no-op if CONFIG_LOCKDEP=n). Cc: stable@vger.kernel.org Fixes: `2554a48f44` ("selinux: measure state and policy capabilities") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-22 18:02:58 -05:00
Christian Göttsche	b97df7c098	selinux: use correct type for context length security_sid_to_context() expects a pointer to an u32 as the address where to store the length of the computed context. Reported by sparse: security/selinux/xfrm.c:359:39: warning: incorrect type in arg 4 (different signedness) security/selinux/xfrm.c:359:39: expected unsigned int [usertype] scontext_len security/selinux/xfrm.c:359:39: got int Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: wrapped commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-18 10:45:54 -05:00
Christian Göttsche	5ea33af9d4	selinux: drop return statement at end of void functions Those return statements at the end of a void function are redundant. Reported by clang-tidy [readability-redundant-control-flow] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-18 10:42:12 -05:00
Ondrej Mosnacek	3eb8eaf2ca	security: implement sctp_assoc_established hook in selinux Do this by extracting the peer labeling per-association logic from selinux_sctp_assoc_request() into a new helper selinux_sctp_process_new_assoc() and use this helper in both selinux_sctp_assoc_request() and selinux_sctp_assoc_established(). This ensures that the peer labeling behavior as documented in Documentation/security/SCTP.rst is applied both on the client and server side: """ An SCTP socket will only have one peer label assigned to it. This will be assigned during the establishment of the first association. Any further associations on this socket will have their packet peer label compared to the sockets peer label, and only if they are different will the ``association`` permission be validated. This is validated by checking the socket peer sid against the received packets peer sid to determine whether the association should be allowed or denied. """ At the same time, it also ensures that the peer label of the association is set to the correct value, such that if it is peeled off into a new socket, the socket's peer label will then be set to the association's peer label, same as it already works on the server side. While selinux_inet_conn_established() (which we are replacing by selinux_sctp_assoc_established() for SCTP) only deals with assigning a peer label to the connection (socket), in case of SCTP we need to also copy the (local) socket label to the association, so that selinux_sctp_sk_clone() can then pick it up for the new socket in case of SCTP peeloff. Careful readers will notice that the selinux_sctp_process_new_assoc() helper also includes the "IPv4 packet received over an IPv6 socket" check, even though it hadn't been in selinux_sctp_assoc_request() before. While such check is not necessary in selinux_inet_conn_request() (because struct request_sock's family field is already set according to the skb's family), here it is needed, as we don't have request_sock and we take the initial family from the socket. In selinux_sctp_assoc_established() it is similarly needed as well (and also selinux_inet_conn_established() already has it). Fixes: `72e89f5008` ("security: Add support for SCTP security hooks") Reported-by: Prashanth Prahlad <pprahlad@redhat.com> Based-on-patch-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Tested-by: Richard Haines <richard_c_haines@btinternet.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-15 15:06:32 -05:00
Ondrej Mosnacek	70f4169ab4	selinux: parse contexts for mount options early Commit `b8b87fd954` ("selinux: Fix selinux_sb_mnt_opts_compat()") started to parse mount options into SIDs in selinux_add_opt() if policy has already been loaded. Since it's extremely unlikely that anyone would depend on the ability to set SELinux contexts on fs_context before loading the policy and then mounting that context after simplify the logic by always parsing the options early. Note that the multi-step mounting is only possible with the new fscontext mount API and wasn't possible before its introduction. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-04 13:40:15 -05:00
Vratislav Bendel	186edf7e36	selinux: fix double free of cond_list on error paths On error path from cond_read_list() and duplicate_policydb_cond_list() the cond_list_destroy() gets called a second time in caller functions, resulting in NULL pointer deref. Fix this by resetting the cond_list_len to 0 in cond_list_destroy(), making subsequent calls a noop. Also consistently reset the cond_list pointer to NULL after freeing. Cc: stable@vger.kernel.org Signed-off-by: Vratislav Bendel <vbendel@redhat.com> [PM: fix line lengths in the description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-02 11:02:10 -05:00
Paul Moore	0e326df069	selinux: various sparse fixes When running the SELinux code through sparse, there are a handful of warnings. This patch resolves some of these warnings caused by "__rcu" mismatches. % make W=1 C=1 security/selinux/ Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-01 19:08:28 -05:00
Scott Mayhew	6bc1968c14	selinux: try to use preparsed sid before calling parse_sid() Avoid unnecessary parsing of sids that have already been parsed via selinux_sb_eat_lsm_opts(). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-01 16:37:17 -05:00
Scott Mayhew	b8b87fd954	selinux: Fix selinux_sb_mnt_opts_compat() selinux_sb_mnt_opts_compat() is called under the sb_lock spinlock and shouldn't be performing any memory allocations. Fix this by parsing the sids at the same time we're chopping up the security mount options string and then using the pre-parsed sids when doing the comparison. Fixes: `cc274ae776` ("selinux: fix sleeping function called from invalid context") Fixes: `69c4a42d72` ("lsm,selinux: add new hook to compare new mount to an existing mount") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-01 16:21:22 -05:00
Casey Schaufler	ecff30575b	LSM: general protection fault in legacy_parse_param The usual LSM hook "bail on fail" scheme doesn't work for cases where a security module may return an error code indicating that it does not recognize an input. In this particular case Smack sees a mount option that it recognizes, and returns 0. A call to a BPF hook follows, which returns -ENOPARAM, which confuses the caller because Smack has processed its data. The SELinux hook incorrectly returns 1 on success. There was a time when this was correct, however the current expectation is that it return 0 on success. This is repaired. Reported-by: syzbot+d1e3b1d92d25abf97943@syzkaller.appspotmail.com Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-27 20:43:02 -05:00
Paul Moore	cdeea45422	selinux: fix a type cast problem in cred_init_security() In the process of removing an explicit type cast to preserve a cred const qualifier in cred_init_security() we ran into a problem where the task_struct::real_cred field is defined with the "__rcu" attribute but the selinux_cred() function parameter is not, leading to a sparse warning: security/selinux/hooks.c:216:36: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct cred const cred @@ got struct cred const [noderef] __rcu real_cred As we don't want to add the "__rcu" attribute to the selinux_cred() parameter, we're going to add an explicit cast back to cred_init_security(). Fixes: `b084e189b0` ("selinux: simplify cred_init_security") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-27 12:52:43 -05:00
Christian Göttsche	b5e68162f8	selinux: drop unused macro The macro _DEBUG_HASHES is nowhere used. The configuration DEBUG_HASHES enables debugging of the SELinux hash tables, but the with an underscore prefixed macro definition has no direct impact or any documentation. Reported by clang [-Wunused-macros] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 16:17:18 -05:00
Christian Göttsche	b084e189b0	selinux: simplify cred_init_security The parameter of selinux_cred() is declared const, so an explicit cast dropping the const qualifier is not necessary. Without the cast the local variable cred serves no purpose. Reported by clang [-Wcast-qual] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 15:57:39 -05:00
Christian Göttsche	73073d956a	selinux: do not discard const qualifier in cast Do not discard the const qualifier on the cast from const void* to __be32*; the addressed value is not modified. Reported by clang [-Wcast-qual] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 15:54:45 -05:00
Christian Göttsche	056945a96c	selinux: drop unused parameter of avtab_insert_node The parameter cur is not used in avtab_insert_node(). Reported by clang [-Wunused-parameter] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 15:37:27 -05:00
Christian Göttsche	0b3c2b3dc9	selinux: drop cast to same type Both the lvalue scontextp and rvalue scontext are of the type char. Drop the redundant explicit cast not needed since commit `9a59daa03d` ("SELinux: fix sleeping allocation in security_context_to_sid"), where the type of scontext changed from const char to char*. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 15:25:47 -05:00
Christian Göttsche	9e2fe574c0	selinux: enclose macro arguments in parenthesis Enclose the macro arguments in parenthesis to avoid potential evaluation order issues. Note the xperm and ebitmap macros are still not side-effect safe due to double evaluation. Reported by clang-tidy [bugprone-macro-parentheses] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 15:13:58 -05:00
Christian Göttsche	d3b1161f29	selinux: declare name parameter of hash_eval const String literals are passed as second argument to hash_eval(). Also the parameter is already declared const in the DEBUG_HASHES configuration. Reported by clang [-Wwrite-strings]: security/selinux/ss/policydb.c:1881:26: error: passing 'const char [8]' to parameter of type 'char ' discards qualifiers hash_eval(&p->range_tr, rangetr); ^~~~~~~~~ security/selinux/ss/policydb.c:707:55: note: passing argument to parameter 'hash_name' here static inline void hash_eval(struct hashtab h, char hash_name) ^ security/selinux/ss/policydb.c:2099:32: error: passing 'const char [11]' to parameter of type 'char ' discards qualifiers hash_eval(&p->filename_trans, filenametr); ^~~~~~~~~~~~ security/selinux/ss/policydb.c:707:55: note: passing argument to parameter 'hash_name' here static inline void hash_eval(struct hashtab h, char hash_name) ^ Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: line wrapping in description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-26 13:54:25 -05:00
Christian Göttsche	08df49054f	selinux: declare path parameters of _genfs_sid const The path parameter is only read from in security_genfs_sid(), selinux_policy_genfs_sid() and __security_genfs_sid(). Since a string literal is passed as argument, declare the parameter const. Also align the parameter names in the declaration and definition. Reported by clang [-Wwrite-strings]: security/selinux/hooks.c:553:60: error: passing 'const char [2]' to parameter of type 'char ' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] rc = security_genfs_sid(&selinux_state, ... , /, ^~~ ./security/selinux/include/security.h:389:36: note: passing argument to parameter 'name' here const char fstype, char *name, u16 sclass, ^ Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: wrapped description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-25 19:59:52 -05:00
Christian Göttsche	bcb62828e3	selinux: check return value of sel_make_avc_files sel_make_avc_files() might fail and return a negative errno value on memory allocation failures. Re-add the check of the return value, dropped in `66f8e2f03c` ("selinux: sidtab reverse lookup hash table"). Reported by clang-analyzer: security/selinux/selinuxfs.c:2129:2: warning: Value stored to 'ret' is never read [deadcode.DeadStores] ret = sel_make_avc_files(dentry); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `66f8e2f03c` ("selinux: sidtab reverse lookup hash table") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> [PM: description line wrapping, added proper commit ref] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-25 19:21:21 -05:00
GONG, Ruiqi	0266c25e7c	selinux: access superblock_security_struct in LSM blob way LSM blob has been involved for superblock's security struct. So fix the remaining direct access to sb->s_security by using the LSM blob mechanism. Fixes: `08abe46b2c` ("selinux: fall back to SECURITY_FS_USE_GENFS if no xattr support") Fixes: `69c4a42d72` ("lsm,selinux: add new hook to compare new mount to an existing mount") Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-01-25 18:46:04 -05:00
Linus Torvalds	a135ce4400	selinux/stable-5.17 PR 20220110 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmHcekAUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNMAQ//TlHhQPEkPYZwD0EPXQIkl5IxGZfR DSg24OMzrh+x/jKVYhMXuW6BlnF10wd2ocG0DNAt9weF22PLrbDsxbzAAnv+MLXM wCzWLmgKpPpbaG57oa17LMcswDjVr8YbxXPOvyZJ/G3YsWDarH8Ezot8iNlVUkOh oqjjXTy0dyLUB/FsW7sIGWa3O18SkbI4pDQxL2vcpdDbvPtAY94kNG3bKTONK/kn OxLrjKscTtu6EuSdFhMqwcUxpfqvqUDrEfiSdNruMlT0/DixHgFlJA8WHXjn7Wpq XTbqdFrClfpmIiTrPSSszLrZAceegTdefDAf5wgfTxmcfYKiPDF9kPu5UPX1rAgO hSoXvGvPjez+RzyXmv9S292lkhV4tz/Wg0YlxZ0LUWif/CSZEyeGGoFZs8080Ukl 1C/oNrByriaGGRLVwKNGOg5x/UkP1ipnorNnAiiIhw+xkzfXm12RdisUyUgLh9+w WsfsC9BJkXMpuvfkgya5PJ677wVNou4ZtJX+2MV8CfGEKTAJp/X/HKh3tWkQFYKx 35QLVO1LD6gJZlSZZTsjZDxUyPwSd8e55GSFn1qKIHuC2jKjSkwCE7JfErIrI/W0 js6HKO2ak7oMTWjxRizANyKI/yva/DeMl+mKCC4QV0xmwlm5JsOe7mYSCNGws7AV A2qYX7S1xKbLTGI= =8H+p -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20220110' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Nothing too significant, but five SELinux patches for v5.17 that do the following: - Harden the code through additional use of the struct_size() macro - Plug some memory leaks - Clean up the code via removal of the security_add_mnt_opt() LSM hook and minor tweaks to selinux_add_opt() - Rename security_task_getsecid_subj() to better reflect its actual behavior/use - now called security_current_getsecid_subj()" * tag 'selinux-pr-20220110' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: minor tweaks to selinux_add_opt() selinux: fix potential memleak in selinux_add_opt() security,selinux: remove security_add_mnt_opt() selinux: Use struct_size() helper in kmalloc() lsm: security_task_getsecid_subj() -> security_current_getsecid_subj()	2022-01-11 13:03:06 -08:00
Tom Rix	732bc2ff08	selinux: initialize proto variable in selinux_ip_postroute_compat() Clang static analysis reports this warning hooks.c:5765:6: warning: 4th function call argument is an uninitialized value if (selinux_xfrm_postroute_last(sksec->sid, skb, &ad, proto)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ selinux_parse_skb() can return ok without setting proto. The later call to selinux_xfrm_postroute_last() does an early check of proto and can return ok if the garbage proto value matches. So initialize proto. Cc: stable@vger.kernel.org Fixes: `eef9b41622` ("selinux: cleanup selinux_xfrm_sock_rcv_skb() and selinux_xfrm_postroute_last()") Signed-off-by: Tom Rix <trix@redhat.com> [PM: typo/spelling and checkpatch.pl description fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-27 10:41:20 -05:00
Paul Moore	6cd9d4b978	selinux: minor tweaks to selinux_add_opt() Two minor edits to selinux_add_opt(): use "sizeof(*ptr)" instead of "sizeof(type)" in the kzalloc() call, and rename the "Einval" jump target to "err" for the sake of consistency. Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-21 15:14:45 -05:00
Bernard Zhao	2e08df3c7c	selinux: fix potential memleak in selinux_add_opt() This patch try to fix potential memleak in error branch. Fixes: `ba64186233` ("selinux: new helper - selinux_add_opt()") Signed-off-by: Bernard Zhao <bernard@vivo.com> [PM: tweak the subject line, add Fixes tag] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-21 14:47:35 -05:00
Scott Mayhew	cc274ae776	selinux: fix sleeping function called from invalid context selinux_sb_mnt_opts_compat() is called via sget_fc() under the sb_lock spinlock, so it can't use GFP_KERNEL allocations: [ 868.565200] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230 [ 868.568246] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 4914, name: mount.nfs [ 868.569626] preempt_count: 1, expected: 0 [ 868.570215] RCU nest depth: 0, expected: 0 [ 868.570809] Preemption disabled at: [ 868.570810] [<0000000000000000>] 0x0 [ 868.571848] CPU: 1 PID: 4914 Comm: mount.nfs Kdump: loaded Tainted: G W 5.16.0-rc5.2585cf9dfa #1 [ 868.573273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014 [ 868.574478] Call Trace: [ 868.574844] <TASK> [ 868.575156] dump_stack_lvl+0x34/0x44 [ 868.575692] __might_resched.cold+0xd6/0x10f [ 868.576308] slab_pre_alloc_hook.constprop.0+0x89/0xf0 [ 868.577046] __kmalloc_track_caller+0x72/0x420 [ 868.577684] ? security_context_to_sid_core+0x48/0x2b0 [ 868.578569] kmemdup_nul+0x22/0x50 [ 868.579108] security_context_to_sid_core+0x48/0x2b0 [ 868.579854] ? _nfs4_proc_pathconf+0xff/0x110 [nfsv4] [ 868.580742] ? nfs_reconfigure+0x80/0x80 [nfs] [ 868.581355] security_context_str_to_sid+0x36/0x40 [ 868.581960] selinux_sb_mnt_opts_compat+0xb5/0x1e0 [ 868.582550] ? nfs_reconfigure+0x80/0x80 [nfs] [ 868.583098] security_sb_mnt_opts_compat+0x2a/0x40 [ 868.583676] nfs_compare_super+0x113/0x220 [nfs] [ 868.584249] ? nfs_try_mount_request+0x210/0x210 [nfs] [ 868.584879] sget_fc+0xb5/0x2f0 [ 868.585267] nfs_get_tree_common+0x91/0x4a0 [nfs] [ 868.585834] vfs_get_tree+0x25/0xb0 [ 868.586241] fc_mount+0xe/0x30 [ 868.586605] do_nfs4_mount+0x130/0x380 [nfsv4] [ 868.587160] nfs4_try_get_tree+0x47/0xb0 [nfsv4] [ 868.587724] vfs_get_tree+0x25/0xb0 [ 868.588193] do_new_mount+0x176/0x310 [ 868.588782] __x64_sys_mount+0x103/0x140 [ 868.589388] do_syscall_64+0x3b/0x90 [ 868.589935] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 868.590699] RIP: 0033:0x7f2b371c6c4e [ 868.591239] Code: 48 8b 0d dd 71 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 71 0e 00 f7 d8 64 89 01 48 [ 868.593810] RSP: 002b:00007ffc83775d88 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [ 868.594691] RAX: ffffffffffffffda RBX: 00007ffc83775f10 RCX: 00007f2b371c6c4e [ 868.595504] RDX: 0000555d517247a0 RSI: 0000555d51724700 RDI: 0000555d51724540 [ 868.596317] RBP: 00007ffc83775f10 R08: 0000555d51726890 R09: 0000555d51726890 [ 868.597162] R10: 0000000000000000 R11: 0000000000000246 R12: 0000555d51726890 [ 868.598005] R13: 0000000000000003 R14: 0000555d517246e0 R15: 0000555d511ac925 [ 868.598826] </TASK> Cc: stable@vger.kernel.org Fixes: `69c4a42d72` ("lsm,selinux: add new hook to compare new mount to an existing mount") Signed-off-by: Scott Mayhew <smayhew@redhat.com> [PM: cleanup/line-wrap the backtrace] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-16 17:47:39 -05:00
Ondrej Mosnacek	52f982f00b	security,selinux: remove security_add_mnt_opt() Its last user has been removed in commit `f2aedb713c` ("NFS: Add fs_context support."). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-06 13:46:24 -05:00
Xiu Jianfeng	5fe3757289	selinux: Use struct_size() helper in kmalloc() Make use of struct_size() helper instead of an open-coded calculation. Link: https://github.com/KSPP/linux/issues/160 Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-12-05 21:58:32 -05:00
Paul Moore	6326948f94	lsm: security_task_getsecid_subj() -> security_current_getsecid_subj() The security_task_getsecid_subj() LSM hook invites misuse by allowing callers to specify a task even though the hook is only safe when the current task is referenced. Fix this by removing the task_struct argument to the hook, requiring LSM implementations to use the current task. While we are changing the hook declaration we also rename the function to security_current_getsecid_subj() in an effort to reinforce that the hook captures the subjective credentials of the current task and not an arbitrary task on the system. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-11-22 17:52:47 -05:00
Ondrej Mosnacek	dc27f3c5d1	selinux: fix NULL-pointer dereference when hashtab allocation fails When the hash table slot array allocation fails in hashtab_init(), h->size is left initialized with a non-zero value, but the h->htable pointer is NULL. This may then cause a NULL pointer dereference, since the policydb code relies on the assumption that even after a failed hashtab_init(), hashtab_map() and hashtab_destroy() can be safely called on it. Yet, these detect an empty hashtab only by looking at the size. Fix this by making sure that hashtab_init() always leaves behind a valid empty hashtab when the allocation fails. Cc: stable@vger.kernel.org Fixes: `03414a49ad` ("selinux: do not allocate hashtabs dynamically") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-11-19 16:11:39 -05:00
Paul Moore	32a370abf1	net,lsm,selinux: revert the security_sctp_assoc_established() hook This patch reverts two prior patches, `e7310c9402` ("security: implement sctp_assoc_established hook in selinux") and `7c2ef0240e` ("security: add sctp_assoc_established hook"), which create the security_sctp_assoc_established() LSM hook and provide a SELinux implementation. Unfortunately these two patches were merged without proper review (the Reviewed-by and Tested-by tags from Richard Haines were for previous revisions of these patches that were significantly different) and there are outstanding objections from the SELinux maintainers regarding these patches. Work is currently ongoing to correct the problems identified in the reverted patches, as well as others that have come up during review, but it is unclear at this point in time when that work will be ready for inclusion in the mainline kernel. In the interest of not keeping objectionable code in the kernel for multiple weeks, and potentially a kernel release, we are reverting the two problematic patches. Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-11-12 12:07:02 -05:00
Xin Long	e7310c9402	security: implement sctp_assoc_established hook in selinux Different from selinux_inet_conn_established(), it also gives the secid to asoc->peer_secid in selinux_sctp_assoc_established(), as one UDP-type socket may have more than one asocs. Note that peer_secid in asoc will save the peer secid for this asoc connection, and peer_sid in sksec will just keep the peer secid for the latest connection. So the right use should be do peeloff for UDP-type socket if there will be multiple asocs in one socket, so that the peeloff socket has the right label for its asoc. v1->v2: - call selinux_inet_conn_established() to reduce some code duplication in selinux_sctp_assoc_established(), as Ondrej suggested. - when doing peeloff, it calls sock_create() where it actually gets secid for socket from socket_sockcreate_sid(). So reuse SECSID_WILD to ensure the peeloff socket keeps using that secid after calling selinux_sctp_sk_clone() for client side. Fixes: `72e89f5008` ("security: Add support for SCTP security hooks") Reported-by: Prashanth Prahlad <pprahlad@redhat.com> Reviewed-by: Richard Haines <richard_c_haines@btinternet.com> Tested-by: Richard Haines <richard_c_haines@btinternet.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-03 11:09:20 +00:00
Xin Long	c081d53f97	security: pass asoc to sctp_assoc_request and sctp_sk_clone This patch is to move secid and peer_secid from endpoint to association, and pass asoc to sctp_assoc_request and sctp_sk_clone instead of ep. As ep is the local endpoint and asoc represents a connection, and in SCTP one sk/ep could have multiple asoc/connection, saving secid/peer_secid for new asoc will overwrite the old asoc's. Note that since asoc can be passed as NULL, security_sctp_assoc_request() is moved to the place right after the new_asoc is created in sctp_sf_do_5_1B_init() and sctp_sf_do_unexpected_init(). v1->v2: - fix the description of selinux_netlbl_skbuff_setsid(), as Jakub noticed. - fix the annotation in selinux_sctp_assoc_request(), as Richard Noticed. Fixes: `72e89f5008` ("security: Add support for SCTP security hooks") Reported-by: Prashanth Prahlad <pprahlad@redhat.com> Reviewed-by: Richard Haines <richard_c_haines@btinternet.com> Tested-by: Richard Haines <richard_c_haines@btinternet.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-03 11:09:20 +00:00
Linus Torvalds	cdab10bf32	selinux/stable-5.16 PR 20211101 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2 G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW 29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU FPgmwgqePJXGi60= =sb8J -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add LSM/SELinux/Smack controls and auditing for io-uring. As usual, the individual commit descriptions have more detail, but we were basically missing two things which we're adding here: + establishment of a proper audit context so that auditing of io-uring ops works similarly to how it does for syscalls (with some io-uring additions because io-uring ops are not syscalls) + additional LSM hooks to enable access control points for some of the more unusual io-uring features, e.g. credential overrides. The additional audit callouts and LSM hooks were done in conjunction with the io-uring folks, based on conversations and RFC patches earlier in the year. - Fixup the binder credential handling so that the proper credentials are used in the LSM hooks; the commit description and the code comment which is removed in these patches are helpful to understand the background and why this is the proper fix. - Enable SELinux genfscon policy support for securityfs, allowing improved SELinux filesystem labeling for other subsystems which make use of securityfs, e.g. IMA. * tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: security: Return xattr name from security_dentry_init_security() selinux: fix a sock regression in selinux_ip_postroute_compat() binder: use cred instead of task for getsecid binder: use cred instead of task for selinux checks binder: use euid from cred instead of using task LSM: Avoid warnings about potentially unused hook variables selinux: fix all of the W=1 build warnings selinux: make better use of the nf_hook_state passed to the NF hooks selinux: fix race condition when computing ocontext SIDs selinux: remove unneeded ipv6 hook wrappers selinux: remove the SELinux lockdown implementation selinux: enable genfscon labeling for securityfs Smack: Brutalist io_uring support selinux: add support for the io_uring access controls lsm,io_uring: add LSM hooks to io_uring io_uring: convert io_uring to the secure anon inode interface fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure() audit: add filtering for io_uring records audit,io_uring,io-wq: add some basic audit support to io_uring audit: prepare audit_context for use in calling contexts beyond syscalls	2021-11-01 21:06:18 -07:00
Vivek Goyal	15bf32398a	security: Return xattr name from security_dentry_init_security() Right now security_dentry_init_security() only supports single security label and is used by SELinux only. There are two users of this hook, namely ceph and nfs. NFS does not care about xattr name. Ceph hardcodes the xattr name to security.selinux (XATTR_NAME_SELINUX). I am making changes to fuse/virtiofs to send security label to virtiofsd and I need to send xattr name as well. I also hardcoded the name of xattr to security.selinux. Stephen Smalley suggested that it probably is a good idea to modify security_dentry_init_security() to also return name of xattr so that we can avoid this hardcoding in the callers. This patch adds a new parameter "const char **xattr_name" to security_dentry_init_security() and LSM puts the name of xattr too if caller asked for it (xattr_name != NULL). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: James Morris <jamorris@linux.microsoft.com> [PM: fixed typos in the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-20 08:17:08 -04:00
Paul Moore	1c73213ba9	selinux: fix a sock regression in selinux_ip_postroute_compat() Unfortunately we can't rely on nf_hook_state->sk being the proper originating socket so revert to using skb_to_full_sk(skb). Fixes: `1d1e1ded13` ("selinux: make better use of the nf_hook_state passed to the NF hooks") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-19 12:35:18 -04:00
Todd Kjos	52f8869337	binder: use cred instead of task for selinux checks Since binder was integrated with selinux, it has passed 'struct task_struct' associated with the binder_proc to represent the source and target of transactions. The conversion of task to SID was then done in the hook implementations. It turns out that there are race conditions which can result in an incorrect security context being used. Fix by using the 'struct cred' saved during binder_open and pass it to the selinux subsystem. Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables) Fixes: `79af73079d` ("Add security hooks to binder and implement the hooks for SELinux.") Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Todd Kjos <tkjos@google.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-14 20:48:04 -04:00
Paul Moore	e9fd729293	selinux: fix all of the W=1 build warnings There were a number of places in the code where the function definition did not match the associated comment block as well at least one file where the appropriate header files were not included (missing function declaration/prototype); this patch fixes all of these issue such that building the SELinux code with "W=1" is now warning free. % make W=1 security/selinux/ Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-13 16:31:51 -04:00
Paul Moore	1d1e1ded13	selinux: make better use of the nf_hook_state passed to the NF hooks This patch builds on a previous SELinux/netfilter patch by Florian Westphal and makes better use of the nf_hook_state variable passed into the SELinux/netfilter hooks as well as a number of other small cleanups in the related code. Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-13 16:31:18 -04:00
Ondrej Mosnacek	cbfcd13be5	selinux: fix race condition when computing ocontext SIDs Current code contains a lot of racy patterns when converting an ocontext's context structure to an SID. This is being done in a "lazy" fashion, such that the SID is looked up in the SID table only when it's first needed and then cached in the "sid" field of the ocontext structure. However, this is done without any locking or memory barriers and is thus unsafe. Between commits `24ed7fdae6` ("selinux: use separate table for initial SID lookup") and `66f8e2f03c` ("selinux: sidtab reverse lookup hash table"), this race condition lead to an actual observable bug, because a pointer to the shared sid field was passed directly to sidtab_context_to_sid(), which was using this location to also store an intermediate value, which could have been read by other threads and interpreted as an SID. In practice this caused e.g. new mounts to get a wrong (seemingly random) filesystem context, leading to strange denials. This bug has been spotted in the wild at least twice, see [1] and [2]. Fix the race condition by making all the racy functions use a common helper that ensures the ocontext::sid accesses are made safely using the appropriate SMP constructs. Note that security_netif_sid() was populating the sid field of both contexts stored in the ocontext, but only the first one was actually used. The SELinux wiki's documentation on the "netifcon" policy statement [3] suggests that using only the first context is intentional. I kept only the handling of the first context here, as there is really no point in doing the SID lookup for the unused one. I wasn't able to reproduce the bug mentioned above on any kernel that includes commit `66f8e2f03c`, even though it has been reported that the issue occurs with that commit, too, just less frequently. Thus, I wasn't able to verify that this patch fixes the issue, but it makes sense to avoid the race condition regardless. [1] https://github.com/containers/container-selinux/issues/89 [2] https://lists.fedoraproject.org/archives/list/selinux@lists.fedoraproject.org/thread/6DMTAMHIOAOEMUAVTULJD45JZU7IBAFM/ [3] https://selinuxproject.org/page/NetworkStatements#netifcon Cc: stable@vger.kernel.org Cc: Xinjie Zheng <xinjie@google.com> Reported-by: Sujithra Periasamy <sujithra@google.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-11 19:18:04 -04:00
Florian Westphal	4342f70538	selinux: remove unneeded ipv6 hook wrappers Netfilter places the protocol number the hook function is getting called from in state->pf, so we can use that instead of an extra wrapper. While at it, remove one-line wrappers too and make selinux_ip_{out,forward,postroute} useable as hook function. Signed-off-by: Florian Westphal <fw@strlen.de> Message-Id: <20211011202229.28289-1-fw@strlen.de> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-10-11 17:44:00 -04:00
David S. Miller	578f393227	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2021-10-07 1) Fix a sysbot reported shift-out-of-bounds in xfrm_get_default. From Pavel Skripkin. 2) Fix XFRM_MSG_MAPPING ABI breakage. The new XFRM_MSG_MAPPING messages were accidentally not paced at the end. Fix by Eugene Syromiatnikov. 3) Fix the uapi for the default policy, use explicit field and macros and make it accessible to userland. From Nicolas Dichtel. 4) Fix a missing rcu lock in xfrm_notify_userpolicy(). From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-07 12:44:41 +01:00
Paul Moore	f5d0e5e9d7	selinux: remove the SELinux lockdown implementation NOTE: This patch intentionally omits any "Fixes:" metadata or stable tagging since it removes a SELinux access control check; while removing the control point is the right thing to do moving forward, removing it in stable kernels could be seen as a regression. The original SELinux lockdown implementation in `59438b4647` ("security,lockdown,selinux: implement SELinux lockdown") used the current task's credentials as both the subject and object in the SELinux lockdown hook, selinux_lockdown(). Unfortunately that proved to be incorrect in a number of cases as the core kernel was calling the LSM lockdown hook in places where the credentials from the "current" task_struct were not the correct credentials to use in the SELinux access check. Attempts were made to resolve this by adding a credential pointer to the LSM lockdown hook as well as suggesting that the single hook be split into two: one for user tasks, one for kernel tasks; however neither approach was deemed acceptable by Linus. Faced with the prospect of either changing the subj/obj in the access check to a constant context (likely the kernel's label) or removing the SELinux lockdown check entirely, the SELinux community decided that removing the lockdown check was preferable. The supporting changes to the general LSM layer are left intact, this patch only removes the SELinux implementation. Acked-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-09-30 10:12:33 -04:00
Christian Göttsche	8a764ef1bd	selinux: enable genfscon labeling for securityfs Add support for genfscon per-file labeling of securityfs files. This allows for separate labels and thereby access control for different files. For example a genfscon statement genfscon securityfs /integrity/ima/policy \ system_u:object_r:ima_policy_t:s0 will set a private label to the IMA policy file and thus allow to control the ability to set the IMA policy. Setting labels directly with setxattr(2), e.g. by chcon(1) or setfiles(8), is still not supported. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: line width fixes in the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-09-28 18:49:03 -04:00
Paul Moore	a3727a8bac	selinux,smack: fix subjective/objective credential use mixups Jann Horn reported a problem with commit `eb1231f73c` ("selinux: clarify task subjective and objective credentials") where some LSM hooks were attempting to access the subjective credentials of a task other than the current task. Generally speaking, it is not safe to access another task's subjective credentials and doing so can cause a number of problems. Further, while looking into the problem, I realized that Smack was suffering from a similar problem brought about by a similar commit `1fb057dcde` ("smack: differentiate between subjective and objective task credentials"). This patch addresses this problem by restoring the use of the task's objective credentials in those cases where the task is other than the current executing task. Not only does this resolve the problem reported by Jann, it is arguably the correct thing to do in these cases. Cc: stable@vger.kernel.org Fixes: `eb1231f73c` ("selinux: clarify task subjective and objective credentials") Fixes: `1fb057dcde` ("smack: differentiate between subjective and objective task credentials") Reported-by: Jann Horn <jannh@google.com> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-09-23 12:30:59 -04:00
Paul Moore	740b03414b	selinux: add support for the io_uring access controls This patch implements two new io_uring access controls, specifically support for controlling the io_uring "personalities" and IORING_SETUP_SQPOLL. Controlling the sharing of io_urings themselves is handled via the normal file/inode labeling and sharing mechanisms. The io_uring { override_creds } permission restricts which domains the subject domain can use to override it's own credentials. Granting a domain the io_uring { override_creds } permission allows it to impersonate another domain in io_uring operations. The io_uring { sqpoll } permission restricts which domains can create asynchronous io_uring polling threads. This is important from a security perspective as operations queued by this asynchronous thread inherit the credentials of the thread creator by default; if an io_uring is shared across process/domain boundaries this could result in one domain impersonating another. Controlling the creation of sqpoll threads, and the sharing of io_urings across processes, allow policy authors to restrict the ability of one domain to impersonate another via io_uring. As a quick summary, this patch adds a new object class with two permissions: io_uring { override_creds sqpoll } These permissions can be seen in the two simple policy statements below: allow domA_t domB_t : io_uring { override_creds }; allow domA_t self : io_uring { sqpoll }; Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-09-19 22:40:32 -04:00
Eugene Syromiatnikov	844f7eaaed	include/uapi/linux/xfrm.h: Fix XFRM_MSG_MAPPING ABI breakage Commit `2d151d3907` ("xfrm: Add possibility to set the default to block if we have no policy") broke ABI by changing the value of the XFRM_MSG_MAPPING enum item, thus also evading the build-time check in security/selinux/nlmsgtab.c:selinux_nlmsg_lookup for presence of proper security permission checks in nlmsg_xfrm_perms. Fix it by placing XFRM_MSG_SETDEFAULT/XFRM_MSG_GETDEFAULT to the end of the enum, right before __XFRM_MSG_MAX, and updating the nlmsg_xfrm_perms accordingly. Fixes: `2d151d3907` ("xfrm: Add possibility to set the default to block if we have no policy") References: https://lore.kernel.org/netdev/20210901151402.GA2557@altlinux.org/ Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Acked-by: Antony Antony <antony.antony@secunet.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2021-09-14 10:31:35 +02:00
Linus Torvalds	aef4892a63	integrity-v5.15 -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCYS4c6hQcem9oYXJAbGlu dXguaWJtLmNvbQAKCRDLwZzRsCrn5b4OAP9l7cnpkOzVUtjoNIIYdIiKTDp+Kb8v 3o08lxtyzALfKgEAlrizzLfphqLa2yCdxbyaTjkx19J7tav27xVti8uVGgs= =hIxY -----END PGP SIGNATURE----- Merge tag 'integrity-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity subsystem updates from Mimi Zohar: - Limit the allowed hash algorithms when writing security.ima xattrs or verifying them, based on the IMA policy and the configured hash algorithms. - Return the calculated "critical data" measurement hash and size to avoid code duplication. (Preparatory change for a proposed LSM.) - and a single patch to address a compiler warning. * tag 'integrity-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: IMA: reject unknown hash algorithms in ima_get_hash_algo IMA: prevent SETXATTR_CHECK policy rules with unavailable algorithms IMA: introduce a new policy option func=SETXATTR_CHECK IMA: add a policy option to restrict xattr hash algorithms on appraisal IMA: add support to restrict the hash algorithms used for file appraisal IMA: block writes of the security.ima xattr with unsupported algorithms IMA: remove the dependency on CRYPTO_MD5 ima: Add digest and digest_len params to the functions to measure a buffer ima: Return int in the functions to measure a buffer ima: Introduce ima_get_current_hash_algo() IMA: remove -Wmissing-prototypes warning	2021-09-02 12:51:41 -07:00
Linus Torvalds	9e9fb7655e	Core: - Enable memcg accounting for various networking objects. BPF: - Introduce bpf timers. - Add perf link and opaque bpf_cookie which the program can read out again, to be used in libbpf-based USDT library. - Add bpf_task_pt_regs() helper to access user space pt_regs in kprobes, to help user space stack unwinding. - Add support for UNIX sockets for BPF sockmap. - Extend BPF iterator support for UNIX domain sockets. - Allow BPF TCP congestion control progs and bpf iterators to call bpf_setsockopt(), e.g. to switch to another congestion control algorithm. Protocols: - Support IOAM Pre-allocated Trace with IPv6. - Support Management Component Transport Protocol. - bridge: multicast: add vlan support. - netfilter: add hooks for the SRv6 lightweight tunnel driver. - tcp: - enable mid-stream window clamping (by user space or BPF) - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD - more accurate DSACK processing for RACK-TLP - mptcp: - add full mesh path manager option - add partial support for MP_FAIL - improve use of backup subflows - optimize option processing - af_unix: add OOB notification support. - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the router. - mac80211: Target Wake Time support in AP mode. - can: j1939: extend UAPI to notify about RX status. Driver APIs: - Add page frag support in page pool API. - Many improvements to the DSA (distributed switch) APIs. - ethtool: extend IRQ coalesce uAPI with timer reset modes. - devlink: control which auxiliary devices are created. - Support CAN PHYs via the generic PHY subsystem. - Proper cross-chip support for tag_8021q. - Allow TX forwarding for the software bridge data path to be offloaded to capable devices. Drivers: - veth: more flexible channels number configuration. - openvswitch: introduce per-cpu upcall dispatch. - Add internet mix (IMIX) mode to pktgen. - Transparently handle XDP operations in the bonding driver. - Add LiteETH network driver. - Renesas (ravb): - support Gigabit Ethernet IP - NXP Ethernet switch (sja1105) - fast aging support - support for "H" switch topologies - traffic termination for ports under VLAN-aware bridge - Intel 1G Ethernet - support getcrosststamp() with PCIe PTM (Precision Time Measurement) for better time sync - support Credit-Based Shaper (CBS) offload, enabling HW traffic prioritization and bandwidth reservation - Broadcom Ethernet (bnxt) - support pulse-per-second output - support larger Rx rings - Mellanox Ethernet (mlx5) - support ethtool RSS contexts and MQPRIO channel mode - support LAG offload with bridging - support devlink rate limit API - support packet sampling on tunnels - Huawei Ethernet (hns3): - basic devlink support - add extended IRQ coalescing support - report extended link state - Netronome Ethernet (nfp): - add conntrack offload support - Broadcom WiFi (brcmfmac): - add WPA3 Personal with FT to supported cipher suites - support 43752 SDIO device - Intel WiFi (iwlwifi): - support scanning hidden 6GHz networks - support for a new hardware family (Bz) - Xen pv driver: - harden netfront against malicious backends - Qualcomm mobile - ipa: refactor power management and enable automatic suspend - mhi: move MBIM to WWAN subsystem interfaces Refactor: - Ambient BPF run context and cgroup storage cleanup. - Compat rework for ndo_ioctl. Old code removal: - prism54 remove the obsoleted driver, deprecated by the p54 driver. - wan: remove sbni/granch driver. Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEukBYACgkQMUZtbf5S IrsyHA//TO8dw18NYts4n9LmlJT2naJ7yBUUSSXK/M+DtW0MQ9nnHhqzPm5uJdRl IgQTNJrW3dYzRwgqaWZqEwO1t5/FI+f87ND1Nsekg7x9tF66a6ov5WxU26TwwSba U+si/inQ/4chuQ+LxMQobqCDxaLE46I2dIoRl+YfndJ24DRzYSwAEYIPPbSdfyU+ +/l+3s4GaxO4k/hLciPAiOniyxLoUNiGUTNh+2yqRBXelSRJRKVnl+V22ANFrxRW nTEiplfVKhlPU1e4iLuRtaxDDiePHhw9I3j/lMHhfeFU2P/gKJIvz4QpGV0CAZg2 1VvDU32WEx1GQLXJbKm0KwoNRUq1QSjOyyFti+BO7ugGaYAR4gKhShOqlSYLzUtB tbtzQhSNLWOGqgmSJOztZb5kFDm2EdRSll5/lP2uyFlPkIsIp0QbscJVzNTnS74b Xz15ZOw41Z4TfWPEMWgfrx6Zkm7pPWkly+7WfUkPcHa1gftNz6tzXXxSXcXIBPdi yQ5JCzzxrM5573YHuk5YedwZpn6PiAt4A/muFGk9C6aXP60TQAOS/ppaUzZdnk4D NfOk9mj06WEULjYjPcKEuT3GGWE6kmjb8Pu0QZWKOchv7vr6oZly1EkVZqYlXELP AfhcrFeuufie8mqm0jdb4LnYaAnqyLzlb1J4Zxh9F+/IX7G3yoc= =JDGD -----END PGP SIGNATURE----- Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Enable memcg accounting for various networking objects. BPF: - Introduce bpf timers. - Add perf link and opaque bpf_cookie which the program can read out again, to be used in libbpf-based USDT library. - Add bpf_task_pt_regs() helper to access user space pt_regs in kprobes, to help user space stack unwinding. - Add support for UNIX sockets for BPF sockmap. - Extend BPF iterator support for UNIX domain sockets. - Allow BPF TCP congestion control progs and bpf iterators to call bpf_setsockopt(), e.g. to switch to another congestion control algorithm. Protocols: - Support IOAM Pre-allocated Trace with IPv6. - Support Management Component Transport Protocol. - bridge: multicast: add vlan support. - netfilter: add hooks for the SRv6 lightweight tunnel driver. - tcp: - enable mid-stream window clamping (by user space or BPF) - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD - more accurate DSACK processing for RACK-TLP - mptcp: - add full mesh path manager option - add partial support for MP_FAIL - improve use of backup subflows - optimize option processing - af_unix: add OOB notification support. - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the router. - mac80211: Target Wake Time support in AP mode. - can: j1939: extend UAPI to notify about RX status. Driver APIs: - Add page frag support in page pool API. - Many improvements to the DSA (distributed switch) APIs. - ethtool: extend IRQ coalesce uAPI with timer reset modes. - devlink: control which auxiliary devices are created. - Support CAN PHYs via the generic PHY subsystem. - Proper cross-chip support for tag_8021q. - Allow TX forwarding for the software bridge data path to be offloaded to capable devices. Drivers: - veth: more flexible channels number configuration. - openvswitch: introduce per-cpu upcall dispatch. - Add internet mix (IMIX) mode to pktgen. - Transparently handle XDP operations in the bonding driver. - Add LiteETH network driver. - Renesas (ravb): - support Gigabit Ethernet IP - NXP Ethernet switch (sja1105): - fast aging support - support for "H" switch topologies - traffic termination for ports under VLAN-aware bridge - Intel 1G Ethernet - support getcrosststamp() with PCIe PTM (Precision Time Measurement) for better time sync - support Credit-Based Shaper (CBS) offload, enabling HW traffic prioritization and bandwidth reservation - Broadcom Ethernet (bnxt) - support pulse-per-second output - support larger Rx rings - Mellanox Ethernet (mlx5) - support ethtool RSS contexts and MQPRIO channel mode - support LAG offload with bridging - support devlink rate limit API - support packet sampling on tunnels - Huawei Ethernet (hns3): - basic devlink support - add extended IRQ coalescing support - report extended link state - Netronome Ethernet (nfp): - add conntrack offload support - Broadcom WiFi (brcmfmac): - add WPA3 Personal with FT to supported cipher suites - support 43752 SDIO device - Intel WiFi (iwlwifi): - support scanning hidden 6GHz networks - support for a new hardware family (Bz) - Xen pv driver: - harden netfront against malicious backends - Qualcomm mobile - ipa: refactor power management and enable automatic suspend - mhi: move MBIM to WWAN subsystem interfaces Refactor: - Ambient BPF run context and cgroup storage cleanup. - Compat rework for ndo_ioctl. Old code removal: - prism54 remove the obsoleted driver, deprecated by the p54 driver. - wan: remove sbni/granch driver" * tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits) net: Add depends on OF_NET for LiteX's LiteETH ipv6: seg6: remove duplicated include net: hns3: remove unnecessary spaces net: hns3: add some required spaces net: hns3: clean up a type mismatch warning net: hns3: refine function hns3_set_default_feature() ipv6: remove duplicated 'net/lwtunnel.h' include net: w5100: check return value after calling platform_get_resource() net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx() net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource() net: mdio-ipq4019: Make use of devm_platform_ioremap_resource() fou: remove sparse errors ipv4: fix endianness issue in inet_rtm_getroute_build_skb() octeontx2-af: Set proper errorcode for IPv4 checksum errors octeontx2-af: Fix static code analyzer reported issues octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg octeontx2-af: Fix loop in free and unmap counter af_unix: fix potential NULL deref in unix_dgram_connect() dpaa2-eth: Replace strlcpy with strscpy octeontx2-af: Use NDC TX for transmit packet data ...	2021-08-31 16:43:06 -07:00
Linus Torvalds	befa491ce6	Merge tag 'selinux-pr-20210830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux update from Paul Moore: "We've got an unusually small SELinux pull request for v5.15 that consists of only one (?!) patch that is really pretty minor when you look at it. Unsurprisingly it passes all of our tests and merges cleanly on top of your tree right now, please merge this for v5.15" * tag 'selinux-pr-20210830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: return early for possible NULL audit buffers	2021-08-31 12:53:34 -07:00
Jakub Kicinski	0ca8d3ca45	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: `589918df93` ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") `0fac6aa098` ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit `589918df93` ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-05 15:08:47 -07:00
Xiu Jianfeng	4c156084da	selinux: correct the return value when loads initial sids It should not return 0 when SID 0 is assigned to isids. This patch fixes it. Cc: stable@vger.kernel.org Fixes: `e3e0b582c3` ("selinux: remove unused initial SIDs and improve handling") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> [PM: remove changelog from description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-08-02 09:59:50 -04:00
Jeremy Kerr	bc49d8169a	mctp: Add MCTP base Add basic Kconfig, an initial (empty) af_mctp source object, and {AF,PF}_MCTP definitions, and the required definitions for a new protocol type. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-29 15:06:49 +01:00
Roberto Sassu	ca3c9bdb10	ima: Add digest and digest_len params to the functions to measure a buffer This patch performs the final modification necessary to pass the buffer measurement to callers, so that they provide a functionality similar to ima_file_hash(). It adds the 'digest' and 'digest_len' parameters to ima_measure_critical_data() and process_buffer_measurement(). These functions calculate the digest even if there is no suitable rule in the IMA policy and, in this case, they simply return 1 before generating a new measurement entry. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2021-07-23 09:27:02 -04:00
Austin Kim	893c47d196	selinux: return early for possible NULL audit buffers audit_log_start() may return NULL in below cases: - when audit is not initialized. - when audit backlog limit exceeds. After the call to audit_log_start() is made and then possible NULL audit buffer argument is passed to audit_log_() functions, audit_log_() functions return immediately in case of a NULL audit buffer argument. But it is optimal to return early when audit_log_start() returns NULL, because it is not necessary for audit_log_*() functions to be called with NULL audit buffer argument. So add exception handling for possible NULL audit buffers where return value can be handled from callers. Signed-off-by: Austin Kim <austin.kim@lge.com> [PM: tweak subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-07-14 15:25:27 -04:00
Al Viro	d99cf13f14	selinux: kill 'flags' argument in avc_has_perm_flags() and avc_audit() ... along with avc_has_perm_flags() itself, since now it's identical to avc_has_perm() (as pointed out by Paul Moore) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [PM: add "selinux:" prefix to subj and tweak for length] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-06-11 13:11:45 -04:00
Al Viro	b17ec22fb3	selinux: slow_avc_audit has become non-blocking dump_common_audit_data() is safe to use under rcu_read_lock() now; no need for AVC_NONBLOCKING and games around it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-06-11 13:05:18 -04:00
Yang Li	d0a83314db	selinux: Fix kernel-doc Fix function name and add comment for parameter state in ss/services.c kernel-doc to remove some warnings found by running make W=1 LLVM=1. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-06-11 12:59:45 -04:00
Minchan Kim	648f2c6100	selinux: use __GFP_NOWARN with GFP_NOWAIT in the AVC In the field, we have seen lots of allocation failure from the call path below. 06-03 13:29:12.999 1010315 31557 31557 W Binder : 31542_2: page allocation failure: order:0, mode:0x800(GFP_NOWAIT), nodemask=(null),cpuset=background,mems_allowed=0 ... ... 06-03 13:29:12.999 1010315 31557 31557 W Call trace: 06-03 13:29:12.999 1010315 31557 31557 W : dump_backtrace.cfi_jt+0x0/0x8 06-03 13:29:12.999 1010315 31557 31557 W : dump_stack+0xc8/0x14c 06-03 13:29:12.999 1010315 31557 31557 W : warn_alloc+0x158/0x1c8 06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_slowpath+0x9d8/0xb80 06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_nodemask+0x1c4/0x430 06-03 13:29:12.999 1010315 31557 31557 W : allocate_slab+0xb4/0x390 06-03 13:29:12.999 1010315 31557 31557 W : ___slab_alloc+0x12c/0x3a4 06-03 13:29:12.999 1010315 31557 31557 W : kmem_cache_alloc+0x358/0x5e4 06-03 13:29:12.999 1010315 31557 31557 W : avc_alloc_node+0x30/0x184 06-03 13:29:12.999 1010315 31557 31557 W : avc_update_node+0x54/0x4f0 06-03 13:29:12.999 1010315 31557 31557 W : avc_has_extended_perms+0x1a4/0x460 06-03 13:29:12.999 1010315 31557 31557 W : selinux_file_ioctl+0x320/0x3d0 06-03 13:29:12.999 1010315 31557 31557 W : __arm64_sys_ioctl+0xec/0x1fc 06-03 13:29:12.999 1010315 31557 31557 W : el0_svc_common+0xc0/0x24c 06-03 13:29:12.999 1010315 31557 31557 W : el0_svc+0x28/0x88 06-03 13:29:12.999 1010315 31557 31557 W : el0_sync_handler+0x8c/0xf0 06-03 13:29:12.999 1010315 31557 31557 W : el0_sync+0x1a4/0x1c0 .. .. 06-03 13:29:12.999 1010315 31557 31557 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010315 31557 31557 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010315 31557 31557 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 10230 30892 30892 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 Based on [1], selinux is tolerate for failure of memory allocation. Then, use __GFP_NOWARN together. [1] `476accbe2f` ("selinux: use GFP_NOWAIT in the AVC kmem_caches") Signed-off-by: Minchan Kim <minchan@kernel.org> [PM: subj fix, line wraps, normalized commit refs] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-06-10 21:13:53 -04:00
Ondrej Mosnacek	869cbeef18	lsm_audit,selinux: pass IB device name by reference While trying to address a Coverity warning that the dev_name string might end up unterminated when strcpy'ing it in selinux_ib_endport_manage_subnet(), I realized that it is possible (and simpler) to just pass the dev_name pointer directly, rather than copying the string to a buffer. The ibendport variable goes out of scope at the end of the function anyway, so the lifetime of the dev_name pointer will never be shorter than that of ibendport, thus we can safely just pass the dev_name pointer and be done with it. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-14 16:38:19 -04:00
Jiapeng Chong	fd781f459b	selinux: Remove redundant assignment to rc Variable rc is set to '-EINVAL' but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Cleans up the following clang-analyzer warning: security/selinux/ss/services.c:2103:3: warning: Value stored to 'rc' is never read [clang-analyzer-deadcode.DeadStores]. security/selinux/ss/services.c:2079:2: warning: Value stored to 'rc' is never read [clang-analyzer-deadcode.DeadStores]. security/selinux/ss/services.c:2071:2: warning: Value stored to 'rc' is never read [clang-analyzer-deadcode.DeadStores]. security/selinux/ss/services.c:2062:2: warning: Value stored to 'rc' is never read [clang-analyzer-deadcode.DeadStores]. security/selinux/ss/policydb.c:2592:3: warning: Value stored to 'rc' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-10 21:48:11 -04:00
Souptick Joarder	7cffc377e1	selinux: Corrected comment to match kernel-doc comment Minor documentation update. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-10 21:41:52 -04:00
Zhongjun Tan	8a922805fb	selinux: delete selinux_xfrm_policy_lookup() useless argument seliunx_xfrm_policy_lookup() is hooks of security_xfrm_policy_lookup(). The dir argument is uselss in security_xfrm_policy_lookup(). So remove the dir argument from selinux_xfrm_policy_lookup() and security_xfrm_policy_lookup(). Signed-off-by: Zhongjun Tan <tanzhongjun@yulong.com> [PM: reformat the subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-10 21:38:31 -04:00
Ondrej Mosnacek	e1cce3a3cb	selinux: constify some avtab function arguments This makes the code a bit easier to reason about. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-10 21:35:02 -04:00
Ondrej Mosnacek	fba472bb38	selinux: simplify duplicate_policydb_cond_list() by using kmemdup() We can do the allocation + copying of expr.nodes in one go using kmemdup(). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-05-10 21:31:58 -04:00
Linus Torvalds	17ae69aba8	Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+ 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM= =uCN4 -----END PGP SIGNATURE----- Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull Landlock LSM from James Morris: "Add Landlock, a new LSM from Mickaël Salaün. Briefly, Landlock provides for unprivileged application sandboxing. From Mickaël's cover letter: "The goal of Landlock is to enable to restrict ambient rights (e.g. global filesystem access) for a set of processes. Because Landlock is a stackable LSM [1], it makes possible to create safe security sandboxes as new security layers in addition to the existing system-wide access-controls. This kind of sandbox is expected to help mitigate the security impact of bugs or unexpected/malicious behaviors in user-space applications. Landlock empowers any process, including unprivileged ones, to securely restrict themselves. Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic. Landlock also takes inspiration from other OS sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil. In this current form, Landlock misses some access-control features. This enables to minimize this patch series and ease review. This series still addresses multiple use cases, especially with the combined use of seccomp-bpf: applications with built-in sandboxing, init systems, security sandbox tools and security-oriented APIs [2]" The cover letter and v34 posting is here: https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/ See also: https://landlock.io/ This code has had extensive design discussion and review over several years" Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1] Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2] * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: landlock: Enable user space to infer supported features landlock: Add user and kernel documentation samples/landlock: Add a sandbox manager example selftests/landlock: Add user space tests landlock: Add syscall implementations arch: Wire up Landlock syscalls fs,security: Add sb_delete hook landlock: Support filesystem access-control LSM: Infrastructure management of the superblock landlock: Add ptrace restrictions landlock: Set up the security framework and manage credentials landlock: Add ruleset and domain management landlock: Add object management	2021-05-01 18:50:44 -07:00
Linus Torvalds	9d31d23389	Networking changes for 5.13. Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- -independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCKFPIACgkQMUZtbf5S Irtw0g/+NA8bWdHNgG4H5rya0pv2z3IieLRmSdDfKRQQXcJpklawc5MKVVaTee/Q 5/QqgPdCsu1LAU6JXBKsKmyDDaMlQKdWuKbOqDSiAQKoMesZStTEHf9d851ZzgxA Cdb6O7BD3lBl/IN+oxNG+KcmD1LKquTPKGySq2mQtEdLO12ekAsranzmj4voKffd q9tBShpXQ7Dq77DLYfiQXVCvsizNcbbJFuxX0o9Lpb9+61ZyYAbogZSa9ypiZZwR I/9azRBtJg7UV1aD/cLuAfy66Qh7t63+rCxVazs5Os8jVO26P/jQdisnnOe/x+p9 wYEmKm3GSu0V4SAPxkWW+ooKusflCeqDoMIuooKt6kbP6BRj540veGw3Ww/m5YFr 7pLQkTSP/tSjuGQIdBE1LOP5LBO8DZeC8Kiop9V0fzAW9hFSZbEq25WW0bPj8QQO zA4Z7yWlslvxcfY2BdJX3wD8klaINkl/8fDWZFFsBdfFX2VeLtm7Xfduw34BJpvU rYT3oWr6PhtkPAKR32SUcemSfeWgIVU41eSshzRz3kez1NngBUuLlSGGSEaKbes5 pZVt6pYFFVByyf6MTHFEoQvafZfEw04JILZpo4R5V8iTHzom0kD3Py064sBiXEw2 B6t+OW4qgcxGblpFkK2lD4kR2s1TPUs0ckVO6sAy1x8q60KKKjY= =vcbA -----END PGP SIGNATURE----- Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes" * tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits) net: selftest: fix build issue if INET is disabled net: netrom: nr_in: Remove redundant assignment to ns net: tun: Remove redundant assignment to ret net: phy: marvell: add downshift support for M88E1240 net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255 net/sched: act_ct: Remove redundant ct get and check icmp: standardize naming of RFC 8335 PROBE constants bpf, selftests: Update array map tests for per-cpu batched ops bpf: Add batched ops support for percpu array bpf: Implement formatted output helpers with bstr_printf seq_file: Add a seq_bprintf function sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues net:nfc:digital: Fix a double free in digital_tg_recv_dep_req net: fix a concurrency bug in l2tp_tunnel_register() net/smc: Remove redundant assignment to rc mpls: Remove redundant assignment to err llc2: Remove redundant assignment to rc net/tls: Remove redundant initialization of record rds: Remove redundant assignment to nr_sig dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0 ...	2021-04-29 11:57:23 -07:00
Linus Torvalds	f1c921fb70	selinux/stable-5.13 PR 20210426 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmCHM2sUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNfCg/9GmoCyCh+ZRj5RGQ6M+yJas1+yyJQ uEfTNde54yfATUTaaWYnZG59yqzM3I2uaV11U7tqg8ajiFPxJKqbs5R9jl3lnSjH 0Dg22nXPSCOTKcU0x/DeLoKRr+M9jO1K/nQ8NEZvYX4nC/OgtCvJqb/oEQZIKAk5 2a7OEmNNQyFGd274p9dELaDHxN9UIaJ2PzQFXtq7ROHgBXQO4ONb2ajOf6mDSFQb vP/CDHwaH+pcE28w44oRy0/YBkO1SrdqoFQchg5yFagM5tQRLGkXK4OFSs5KHi5Q YMtmaOzMPIv1e5eaC1HuuMJYA4pPb30T9hFHP7tmBVZfmZaFaDeUs+BhMm98WTiS o0iTP7tfs36/poOR1Q0/sB06uvF9hUAAX1ZuE95YySifbXU9hsUc9b0uQSwCdg9P /J9rcdHLTpWqjw9n02mezWmAvo5U8ZvbDs+0xPIwI+3RTUP5t6mp+Hd5Tc7bPTq1 0rpWXx+FQoSytFap5qiUSiwBp+HF6HQnNIXB0Muf6wctChoTjvo7TwoxH//z4kEm +SddhOCNkB7VC/X7hOxhl0F/rdHuXvb1AFIWjpTLJH2CR1PvMtF+sGey+uPT6hKZ /gvhmQGjFdph99eGlfVbCNvx1pM61O25IscaYD1T2wGImw+z7dX4WkG3WoOdDSkR bRjrBkcHh0gLhWk= =HTEy -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add support for measuring the SELinux state and policy capabilities using IMA. - A handful of SELinux/NFS patches to compare the SELinux state of one mount with a set of mount options. Olga goes into more detail in the patch descriptions, but this is important as it allows more flexibility when using NFS and SELinux context mounts. - Properly differentiate between the subjective and objective LSM credentials; including support for the SELinux and Smack. My clumsy attempt at a proper fix for AppArmor didn't quite pass muster so John is working on a proper AppArmor patch, in the meantime this set of patches shouldn't change the behavior of AppArmor in any way. This change explains the bulk of the diffstat beyond security/. - Fix a problem where we were not properly terminating the permission list for two SELinux object classes. * tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: add proper NULL termination to the secclass_map permissions smack: differentiate between subjective and objective task credentials selinux: clarify task subjective and objective credentials lsm: separate security_task_getsecid() into subjective and objective variants nfs: account for selinux security context when deciding to share superblock nfs: remove unneeded null check in nfs_fill_super() lsm,selinux: add new hook to compare new mount to an existing mount selinux: fix misspellings using codespell tool selinux: fix misspellings using codespell tool selinux: measure state and policy capabilities selinux: Allow context mounts for unpriviliged overlayfs	2021-04-27 13:42:11 -07:00
Casey Schaufler	1aea780837	LSM: Infrastructure management of the superblock Move management of the superblock->sb_security blob out of the individual security modules and into the security infrastructure. Instead of allocating the blobs from within the modules, the modules tell the infrastructure how much space is required, and the space is allocated there. Cc: John Johansen <john.johansen@canonical.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>	2021-04-22 12:22:10 -07:00
Paul Moore	e4c82eafb6	selinux: add proper NULL termination to the secclass_map permissions This patch adds the missing NULL termination to the "bpf" and "perf_event" object class permission lists. This missing NULL termination should really only affect the tools under scripts/selinux, with the most important being genheaders.c, although in practice this has not been an issue on any of my dev/test systems. If the problem were to manifest itself it would likely result in bogus permissions added to the end of the object class; thankfully with no access control checks using these bogus permissions and no policies defining these permissions the impact would likely be limited to some noise about undefined permissions during policy load. Cc: stable@vger.kernel.org Fixes: `ec27c3568a` ("selinux: bpf: Add selinux check for eBPF syscall operations") Fixes: `da97e18458` ("perf_event: Add support for LSM and SELinux checks") Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-04-21 21:43:25 -04:00
Jakub Kicinski	8859a44ea0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 20:48:35 -07:00
Linus Torvalds	60144b23c9	selinux/stable-5.12 PR 20210409 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmBwjZcUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOAcg//eZL4z0ksGo9s/Y+/9qOIYH2tMPU5 OOVZCekBENiq2LOuVbzAndeHOLZflf3iigBwtMvqHaAsdPAKH/3UedzD0/nxG39m S2gowEuNEfxtuBwuIZMFaMGzzLyjlZJ3xxi6omIyj/2JqPNyBbbFxR/VC4agJZI5 oG6VfwhZJmFi1oJiNoGKjwihKHZQ90yd8UU5rMI+Np0TnP1Or3OvRaZjR47r+dWS tAu3nTKrVEyGTcPeGzg9TS5tIko0jQ1FyrqPDBhfaJta48bX/9s70We6rwqJj8Vg HiiSDPMK5EKkPLso+1vqvBI9q6xdhNeS+M2JP+/ewK/cqVKMkTVVys6l+T3a6HcY rIXdgTWdMFiAQ6OW44z30fiwSxW3kI5M62um31nepoqvzX7acl6R1laILFztedWM EOfCznZmE6ccYmcZnrqEmNsdF+Se1TUiM87bN90tAGmF9F4Yw2qGM0raiV3OJhDZ P2zR/+DceSHI2pNfFtB5VVXZelHoKVhoRcRWvpzn7YW3UmnAl83HoJasBfa/j4rx qvo+nj5ptCSX/kUYjvfvrRV1rY/BAaSVlFLpgYKY1r8/hdRN5DLpdE5cHh6Gky6B fJen4a7yVecp8IKK+WR3maJ0hymo5ccUoB5AKzMOXeECKRqKkIDAKiEN59g9t96+ avKfojgsh1tNHA0= =mGDl -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fixes from Paul Moore: "Three SELinux fixes. These fix known problems relating to (re)loading SELinux policy or changing the policy booleans, and pass our test suite without problem" * tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix race between old and new sidtab selinux: fix cond_list corruption when changing booleans selinux: make nslot handling in avtab more robust	2021-04-09 11:51:06 -07:00
Ondrej Mosnacek	9ad6e9cb39	selinux: fix race between old and new sidtab Since commit `1b8b31a2e6` ("selinux: convert policy read-write lock to RCU"), there is a small window during policy load where the new policy pointer has already been installed, but some threads may still be holding the old policy pointer in their read-side RCU critical sections. This means that there may be conflicting attempts to add a new SID entry to both tables via sidtab_context_to_sid(). See also (and the rest of the thread): https://lore.kernel.org/selinux/CAFqZXNvfux46_f8gnvVvRYMKoes24nwm2n3sPbMjrB8vKTW00g@mail.gmail.com/ Fix this by installing the new policy pointer under the old sidtab's spinlock along with marking the old sidtab as "frozen". Then, if an attempt to add new entry to a "frozen" sidtab is detected, make sidtab_context_to_sid() return -ESTALE to indicate that a new policy has been installed and that the caller will have to abort the policy transaction and try again after re-taking the policy pointer (which is guaranteed to be a newer policy). This requires adding a retry-on-ESTALE logic to all callers of sidtab_context_to_sid(), but fortunately these are easy to determine and aren't that many. This seems to be the simplest solution for this problem, even if it looks somewhat ugly. Note that other places in the kernel (e.g. do_mknodat() in fs/namei.c) use similar stale-retry patterns, so I think it's reasonable. Cc: stable@vger.kernel.org Fixes: `1b8b31a2e6` ("selinux: convert policy read-write lock to RCU") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-04-07 20:42:56 -04:00
Ondrej Mosnacek	d8f5f0ea5b	selinux: fix cond_list corruption when changing booleans Currently, duplicate_policydb_cond_list() first copies the whole conditional avtab and then tries to link to the correct entries in cond_dup_av_list() using avtab_search(). However, since the conditional avtab may contain multiple entries with the same key, this approach often fails to find the right entry, potentially leading to wrong rules being activated/deactivated when booleans are changed. To fix this, instead start with an empty conditional avtab and add the individual entries one-by-one while building the new av_lists. This approach leads to the correct result, since each entry is present in the av_lists exactly once. The issue can be reproduced with Fedora policy as follows: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True # setsebool ftpd_anon_write=off ftpd_connect_all_unreserved=off ftpd_connect_db=off ftpd_full_access=off On fixed kernels, the sesearch output is the same after the setsebool command: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True While on the broken kernels, it will be different: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True While there, also simplify the computation of nslots. This changes the nslots values for nrules 2 or 3 to just two slots instead of 4, which makes the sequence more consistent. Cc: stable@vger.kernel.org Fixes: `c7c556f1e8` ("selinux: refactor changing booleans") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-04-02 11:46:55 -04:00
Ondrej Mosnacek	442dc00f82	selinux: make nslot handling in avtab more robust 1. Make sure all fileds are initialized in avtab_init(). 2. Slightly refactor avtab_alloc() to use the above fact. 3. Use h->nslot == 0 as a sentinel in the access functions to prevent dereferencing h->htable when it's not allocated. Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-04-02 11:46:37 -04:00
David S. Miller	efd13b71a3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 15:31:22 -07:00
Paul Moore	eb1231f73c	selinux: clarify task subjective and objective credentials SELinux has a function, task_sid(), which returns the task's objective credentials, but unfortunately is used in a few places where the subjective task credentials should be used. Most notably in the new security_task_getsecid_subj() LSM hook. This patch fixes this and attempts to make things more obvious by introducing a new function, task_sid_subj(), and renaming the existing task_sid() function to task_sid_obj(). This patch also adds an interesting function in task_sid_binder(). The task_sid_binder() function has a comment which hopefully describes it's reason for being, but it basically boils down to the simple fact that we can't safely access another task's subjective credentials so in the case of binder we need to stick with the objective credentials regardless. Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-22 15:24:01 -04:00
Paul Moore	4ebd7651bf	lsm: separate security_task_getsecid() into subjective and objective variants Of the three LSMs that implement the security_task_getsecid() LSM hook, all three LSMs provide the task's objective security credentials. This turns out to be unfortunate as most of the hook's callers seem to expect the task's subjective credentials, although a small handful of callers do correctly expect the objective credentials. This patch is the first step towards fixing the problem: it splits the existing security_task_getsecid() hook into two variants, one for the subjective creds, one for the objective creds. void security_task_getsecid_subj(struct task_struct p, u32 secid); void security_task_getsecid_obj(struct task_struct p, u32 secid); While this patch does fix all of the callers to use the correct variant, in order to keep this patch focused on the callers and to ease review, the LSMs continue to use the same implementation for both hooks. The net effect is that this patch should not change the behavior of the kernel in any way, it will be up to the latter LSM specific patches in this series to change the hook implementations and return the correct credentials. Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA) Acked-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-22 15:23:32 -04:00
Olga Kornievskaia	69c4a42d72	lsm,selinux: add new hook to compare new mount to an existing mount Add a new hook that takes an existing super block and a new mount with new options and determines if new options confict with an existing mount or not. A filesystem can use this new hook to determine if it can share the an existing superblock with a new superblock for the new mount. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [PM: tweak the subject line, fix tab/space problems] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-22 14:53:37 -04:00
Linus Torvalds	8419639062	selinux/stable-5.12 PR 20210322 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmBYx3gUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPSSw/+MnJxbBEfxMXll2LwCRXvyW0Q/F++ sSLPKZL9B5E7jANbTBlkUW+tMwsckTS7euPvRuJj2+mrSujRnSTl158JAAcn34gd lpiGQpttFZD75Eh9sLNg0OZ7PflwQvAzHt52EweD8/OE5O8BLBg7o56SYMr3LkGu Up9YcZPHNlj+NhfvWebv3jSB6dv392cG33iZoqmW81wSzmlXHGdzS5UTiIFnsp3X kbhLKaZWDSBHuAVMuAxtx3x3sQO1ElfFHxKRYM1fzfl0BMy30Wv6YnXHW2nn08Hr oT26968C0Rl9carTnA+G60Nj4WoTWW2dF20Mih+05vkpqFLjdMtFra7fFndbmfNi f7Gj5DJNrbunX1dMFJkyPnO/1x74RFUhZbCKm5ffvmF8AcYVivbbsyUAy/xduPWo m9hjXDVZLUbWxGBUFxyJD6qQw/wuz+qII8B7SBCKaDdCtM74TlXBVug8prrPcWHV tO3ljjbxEjBJ6zsFIJ9IlV3rJTL0v4RbAELXXp5qcZOJpnUtuH8cxj0Ryzo3yCY5 g/m6IHhm5OfJ5TBSc5UIj2NJQi7sJ+Yv/++lms+RB2MVopx4lJ+UK7140gCA40iC 1EPOGXCnB/b1k5F38dqdpI5MD+/uAzOMusQvPfL4x0xoQidzsqDmqgaS+V8pIYl6 nisL4eEe2K7PWX4= =mFaE -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fixes from Paul Moore: "Three SELinux patches: - Fix a problem where a local variable is used outside its associated function. Thankfully this can only be triggered by reloading the SELinux policy, which is a restricted operation for other obvious reasons. - Fix some incorrect, and inconsistent, audit and printk messages when loading the SELinux policy. All three patches are relatively minor and have been through our testing with no failures" * tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinuxfs: unify policy load error reporting selinux: fix variable scope issue in live sidtab conversion selinux: don't log MAC_POLICY_LOAD record on failed policy load	2021-03-22 11:34:31 -07:00
Ondrej Mosnacek	ee5de60a08	selinuxfs: unify policy load error reporting Let's drop the pr_err()s from sel_make_policy_nodes() and just add one pr_warn_ratelimited() call to the sel_make_policy_nodes() error path in sel_write_load(). Changing from error to warning makes sense, since after `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs"), this error path no longer leads to a broken selinuxfs tree (it's just kept in the original state and policy load is aborted). I also added _ratelimited to be consistent with the other prtin in the same function (it's probably not necessary, but can't really hurt... there are likely more important error messages to be printed when filesystem entry creation starts erroring out). Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-18 23:26:59 -04:00
Ondrej Mosnacek	6406887a12	selinux: fix variable scope issue in live sidtab conversion Commit `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") moved the selinux_policy_commit() call out of security_load_policy() into sel_write_load(), which caused a subtle yet rather serious bug. The problem is that security_load_policy() passes a reference to the convert_params local variable to sidtab_convert(), which stores it in the sidtab, where it may be accessed until the policy is swapped over and RCU synchronized. Before `02a52c5c8c`, selinux_policy_commit() was called directly from security_load_policy(), so the convert_params pointer remained valid all the way until the old sidtab was destroyed, but now that's no longer the case and calls to sidtab_context_to_sid() on the old sidtab after security_load_policy() returns may cause invalid memory accesses. This can be easily triggered using the stress test from commit `ee1a84fdfe` ("selinux: overhaul sidtab to fix bug and improve performance"): ``` function rand_cat() { echo $(( $RANDOM % 1024 )) } function do_work() { while true; do echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \ >/sys/fs/selinux/context 2>/dev/null \|\| true done } do_work >/dev/null & do_work >/dev/null & do_work >/dev/null & while load_policy; do echo -n .; sleep 0.1; done kill %1 kill %2 kill %3 ``` Fix this by allocating the temporary sidtab convert structures dynamically and passing them among the selinux_policy_{load,cancel,commit} functions. Fixes: `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> [PM: merge fuzz in security.h and services.c] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-18 23:23:46 -04:00
Ondrej Mosnacek	519dad3bcd	selinux: don't log MAC_POLICY_LOAD record on failed policy load If sel_make_policy_nodes() fails, we should jump to 'out', not 'out1', as the latter would incorrectly log an MAC_POLICY_LOAD audit record, even though the policy hasn't actually been reloaded. The 'out1' jump label now becomes unused and can be removed. Fixes: `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-18 23:13:04 -04:00
Ido Schimmel	710ec56223	nexthop: Add netlink defines and enumerators for resilient NH groups - RTM_NEWNEXTHOP et.al. that handle resilient groups will have a new nested attribute, NHA_RES_GROUP, whose elements are attributes NHA_RES_GROUP_. - RTM_NEWNEXTHOPBUCKET et.al. is a suite of new messages that will currently serve only for dumping of individual buckets of resilient next hop groups. For nexthop group buckets, these messages will carry a nested attribute NHA_RES_BUCKET, whose elements are attributes NHA_RES_BUCKET_. There are several reasons why a new suite of messages is created for nexthop buckets instead of overloading the information on the existing RTM_{NEW,DEL,GET}NEXTHOP messages. First, a nexthop group can contain a large number of nexthop buckets (4k is not unheard of). This imposes limits on the amount of information that can be encoded for each nexthop bucket given a netlink message is limited to 64k bytes. Second, while RTM_NEWNEXTHOPBUCKET is only used for notifications at this point, in the future it can be extended to provide user space with control over nexthop buckets configuration. - The new group type is NEXTHOP_GRP_TYPE_RES. Note that nexthop code is adjusted to bounce groups with that type for now. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Xiong Zhenwu	431c3be16b	selinux: fix misspellings using codespell tool A typo is f out by codespell tool in 422th line of security.h: $ codespell ./security/selinux/include/ ./security.h:422: thie ==> the, this Fix a typo found by codespell. Signed-off-by: Xiong Zhenwu <xiong.zhenwu@zte.com.cn> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-08 19:46:33 -05:00
Xiong Zhenwu	63ddf1baa0	selinux: fix misspellings using codespell tool A typo is found out by codespell tool in 16th line of hashtab.c $ codespell ./security/selinux/ss/ ./hashtab.c:16: rouding ==> rounding Fix a typo found by codespell. Signed-off-by: Xiong Zhenwu <xiong.zhenwu@zte.com.cn> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-08 19:44:30 -05:00
Lakshmi Ramasubramanian	2554a48f44	selinux: measure state and policy capabilities SELinux stores the configuration state and the policy capabilities in kernel memory. Changes to this data at runtime would have an impact on the security guarantees provided by SELinux. Measuring this data through IMA subsystem provides a tamper-resistant way for an attestation service to remotely validate it at runtime. Measure the configuration state and policy capabilities by calling the IMA hook ima_measure_critical_data(). To enable SELinux data measurement, the following steps are required: 1, Add "ima_policy=critical_data" to the kernel command line arguments to enable measuring SELinux data at boot time. For example, BOOT_IMAGE=/boot/vmlinuz-5.11.0-rc3+ root=UUID=fd643309-a5d2-4ed3-b10d-3c579a5fab2f ro nomodeset security=selinux ima_policy=critical_data 2, Add the following rule to /etc/ima/ima-policy measure func=CRITICAL_DATA label=selinux Sample measurement of SELinux state and policy capabilities: 10 2122...65d8 ima-buf sha256:13c2...1292 selinux-state 696e...303b Execute the following command to extract the measured data from the IMA's runtime measurements list: grep "selinux-state" /sys/kernel/security/integrity/ima/ascii_runtime_measurements \| tail -1 \| cut -d' ' -f 6 \| xxd -r -p The output should be a list of key-value pairs. For example, initialized=1;enforcing=0;checkreqprot=1;network_peer_controls=1;open_perms=1;extended_socket_class=1;always_check_network=0;cgroup_seclabel=1;nnp_nosuid_transition=1;genfs_seclabel_symlinks=0; To verify the measurement is consistent with the current SELinux state reported on the system, compare the integer values in the following files with those set in the IMA measurement (using the following commands): - cat /sys/fs/selinux/enforce - cat /sys/fs/selinux/checkreqprot - cat /sys/fs/selinux/policy_capabilities/[capability_file] Note that the actual verification would be against an expected state and done on a separate system (likely an attestation server) requiring "initialized=1;enforcing=1;checkreqprot=0;" for a secure state and then whatever policy capabilities are actually set in the expected policy (which can be extracted from the policy itself via seinfo, for example). Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-08 19:39:07 -05:00
Vivek Goyal	7fa2e79a6b	selinux: Allow context mounts for unpriviliged overlayfs Now overlayfs allow unpriviliged mounts. That is root inside a non-init user namespace can mount overlayfs. This is being added in 5.11 kernel. Giuseppe tried to mount overlayfs with option "context" and it failed with error -EACCESS. $ su test $ unshare -rm $ mkdir -p lower upper work merged $ mount -t overlay -o lowerdir=lower,workdir=work,upperdir=upper,userxattr,context='system_u:object_r:container_file_t:s0' none merged This fails with -EACCESS. It works if option "-o context" is not specified. Little debugging showed that selinux_set_mnt_opts() returns -EACCESS. So this patch adds "overlay" to the list, where it is fine to specific context from non init_user_ns. Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> [PM: trimmed the changelog from the description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-03-08 19:34:38 -05:00
Linus Torvalds	7d6beb71da	idmapped-mounts-v5.12 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg= =yPaw -----END PGP SIGNATURE----- Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs and without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystem that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown (2) changing ownership of large sets of files is instantenous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. With idmapped mounts a single syscall mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is commong with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD:To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor refering to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: `1d7b902e28` In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows to create idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example to a simple idmapped mount of another user's home directory: u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt u1001@f2-vm:/$ ls -al /home/ubuntu/ total 28 drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 . drwxr-xr-x 4 root root 4096 Oct 28 04:00 .. -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile -rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ ls -al /mnt/ total 28 drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 . drwxr-xr-x 29 root root 4096 Oct 28 22:01 .. -rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile -rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ touch /mnt/my-file u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file u1001@f2-vm:/$ ls -al /mnt/my-file -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file u1001@f2-vm:/$ ls -al /home/ubuntu/my-file -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file u1001@f2-vm:/$ getfacl /mnt/my-file getfacl: Removing leading '/' from absolute path names # file: mnt/my-file # owner: u1001 # group: u1001 user::rw- user:u1001:rwx group::rw- mask::rwx other::r-- u1001@f2-vm:/$ getfacl /home/ubuntu/my-file getfacl: Removing leading '/' from absolute path names # file: home/ubuntu/my-file # owner: ubuntu # group: ubuntu user::rw- user:ubuntu:rwx group::rw- mask::rwx other::r--" * tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...	2021-02-23 13:39:45 -08:00
Linus Torvalds	d643a99089	integrity-v5.12 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAmArRwIUHHpvaGFyQGxp bnV4LmlibS5jb20ACgkQa3kkZrA+cVo6JxAAkZHDhv6Zv7FfVsFjE7yDJwRFBu4o jnAowPxa/xl6MlR2ICTFLHjOimEcpvzySO2IM85WCxjRYaNevITOxEZE+qfE/Byo K1MuZOSXXBa2+AgO1Tku+ZNrQvzTsphgtvhlSD9ReN7P84C/rxG5YDomME+8/6rR QH7Ly/izyc3VNKq7nprT8F2boJ0UxpcwNHZiH2McQD3UvUaZOecwpcpvth5pbgad Ej2r72Q+IR0voqM/T1dc4TjW5Wcw/m27vhGQoOfQ5f+as5r9r1cPSWj0wRJTkATo F/SiKuyWUwOGkRO8I9aaXXzTBgcJw/7MmZe8yNDg5QJrUzD8F5cdjlHZdsnz5BJq tLo4kUsR4xMePEppJ4a10ZUDQa737j97C20xTwOHf6mKGIqmoooGAsjW9xUyYqHU rYuLP4qB7ua4j8Uz9zVJazjgQWPQ+8Ad9MkjQLLhr00Azpz4mVweWVGjCJQC0pky Jr2H4xj3JLAoygqMWfJxr9aVBpfy4Wmo0U29ryZuxZUr178qSXoL3QstGWXRa2MN TwzpgHi1saItQ6iXAO0HB6Tsw0h8INyjrm7c3ANbmBwMsYMYeKcTG87+Z0LkK82w C5SW2uQT9aLBXx9lZx8z0RpxygO1cW+KjlZxRYSfQa/ev/aF2kBz0ruGQgvqai4K ceh/cwrYjrCbFVc= =mojv -----END PGP SIGNATURE----- Merge tag 'integrity-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull IMA updates from Mimi Zohar: "New is IMA support for measuring kernel critical data, as per usual based on policy. The first example measures the in memory SELinux policy. The second example measures the kernel version. In addition are four bug fixes to address memory leaks and a missing 'static' function declaration" * tag 'integrity-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Make function integrity_add_key() static ima: Free IMA measurement buffer after kexec syscall ima: Free IMA measurement buffer on error IMA: Measure kernel version in early boot selinux: include a consumer of the new IMA critical data hook IMA: define a builtin critical data measurement policy IMA: extend critical data hook to limit the measurement based on a label IMA: limit critical data measurement based on a label IMA: add policy rule to measure critical data IMA: define a hook to measure kernel integrity critical data IMA: add support to measure buffer data hash IMA: generalize keyring specific measurement constructs evm: Fix memleak in init_desc	2021-02-21 17:08:06 -08:00
Christian Brauner	71bc356f93	commoncap: handle idmapped mounts When interacting with user namespace and non-user namespace aware filesystem capabilities the vfs will perform various security checks to determine whether or not the filesystem capabilities can be used by the caller, whether they need to be removed and so on. The main infrastructure for this resides in the capability codepaths but they are called through the LSM security infrastructure even though they are not technically an LSM or optional. This extends the existing security hooks security_inode_removexattr(), security_inode_killpriv(), security_inode_getsecurity() to pass down the mount's user namespace and makes them aware of idmapped mounts. In order to actually get filesystem capabilities from disk the capability infrastructure exposes the get_vfs_caps_from_disk() helper. For user namespace aware filesystem capabilities a root uid is stored alongside the capabilities. In order to determine whether the caller can make use of the filesystem capability or whether it needs to be ignored it is translated according to the superblock's user namespace. If it can be translated to uid 0 according to that id mapping the caller can use the filesystem capabilities stored on disk. If we are accessing the inode that holds the filesystem capabilities through an idmapped mount we map the root uid according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts: reading filesystem caps from disk enforces that the root uid associated with the filesystem capability must have a mapping in the superblock's user namespace and that the caller is either in the same user namespace or is a descendant of the superblock's user namespace. For filesystems that are mountable inside user namespace the caller can just mount the filesystem and won't usually need to idmap it. If they do want to idmap it they can create an idmapped mount and mark it with a user namespace they created and which is thus a descendant of s_user_ns. For filesystems that are not mountable inside user namespaces the descendant rule is trivially true because the s_user_ns will be the initial user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:17 +01:00
Tycho Andersen	c7c7a1a18a	xattr: handle idmapped mounts When interacting with extended attributes the vfs verifies that the caller is privileged over the inode with which the extended attribute is associated. For posix access and posix default extended attributes a uid or gid can be stored on-disk. Let the functions handle posix extended attributes on idmapped mounts. If the inode is accessed through an idmapped mount we need to map it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. This has no effect for e.g. security xattrs since they don't store uids or gids and don't perform permission checks on them like posix acls do. Link: https://lore.kernel.org/r/20210121131959.646623-10-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:17 +01:00
Christian Brauner	21cb47be6f	inode: make init and permission helpers idmapped mount aware The inode_owner_or_capable() helper determines whether the caller is the owner of the inode or is capable with respect to that inode. Allow it to handle idmapped mounts. If the inode is accessed through an idmapped mount it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Similarly, allow the inode_init_owner() helper to handle idmapped mounts. It initializes a new inode on idmapped mounts by mapping the fsuid and fsgid of the caller from the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:16 +01:00
Lakshmi Ramasubramanian	fdd1ffe8a8	selinux: include a consumer of the new IMA critical data hook SELinux stores the active policy in memory, so the changes to this data at runtime would have an impact on the security guarantees provided by SELinux. Measuring in-memory SELinux policy through IMA subsystem provides a secure way for the attestation service to remotely validate the policy contents at runtime. Measure the hash of the loaded policy by calling the IMA hook ima_measure_critical_data(). Since the size of the loaded policy can be large (several MB), measure the hash of the policy instead of the entire policy to avoid bloating the IMA log entry. To enable SELinux data measurement, the following steps are required: 1, Add "ima_policy=critical_data" to the kernel command line arguments to enable measuring SELinux data at boot time. For example, BOOT_IMAGE=/boot/vmlinuz-5.10.0-rc1+ root=UUID=fd643309-a5d2-4ed3-b10d-3c579a5fab2f ro nomodeset security=selinux ima_policy=critical_data 2, Add the following rule to /etc/ima/ima-policy measure func=CRITICAL_DATA label=selinux Sample measurement of the hash of SELinux policy: To verify the measured data with the current SELinux policy run the following commands and verify the output hash values match. sha256sum /sys/fs/selinux/policy \| cut -d' ' -f 1 grep "selinux-policy-hash" /sys/kernel/security/integrity/ima/ascii_runtime_measurements \| tail -1 \| cut -d' ' -f 6 Note that the actual verification of SELinux policy would require loading the expected policy into an identical kernel on a pristine/known-safe system and run the sha256sum /sys/kernel/selinux/policy there to get the expected hash. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2021-01-14 23:41:46 -05:00
Daniel Colascione	29cd6591ab	selinux: teach SELinux about anonymous inodes This change uses the anon_inodes and LSM infrastructure introduced in the previous patches to give SELinux the ability to control anonymous-inode files that are created using the new anon_inode_getfd_secure() function. A SELinux policy author detects and controls these anonymous inodes by adding a name-based type_transition rule that assigns a new security type to anonymous-inode files created in some domain. The name used for the name-based transition is the name associated with the anonymous inode for file listings --- e.g., "[userfaultfd]" or "[perf_event]". Example: type uffd_t; type_transition sysadm_t sysadm_t : anon_inode uffd_t "[userfaultfd]"; allow sysadm_t uffd_t:anon_inode { create }; (The next patch in this series is necessary for making userfaultfd support this new interface. The example above is just for exposition.) Signed-off-by: Daniel Colascione <dancol@google.com> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-01-14 17:38:10 -05:00
Ondrej Mosnacek	08abe46b2c	selinux: fall back to SECURITY_FS_USE_GENFS if no xattr support When a superblock is assigned the SECURITY_FS_USE_XATTR behavior by the policy yet it lacks xattr support, try to fall back to genfs rather than rejecting the mount. If a genfscon rule is found for the filesystem, then change the behavior to SECURITY_FS_USE_GENFS, otherwise reject the mount as before. A similar fallback is already done in security_fs_use() if no behavior specification is found for the given filesystem. This is needed e.g. for virtiofs, which may or may not support xattrs depending on the backing host filesystem. Example: # seinfo --genfs \| grep ' ramfs' genfscon ramfs / system_u:object_r:ramfs_t:s0 # echo '(fsuse xattr ramfs (system_u object_r fs_t ((s0) (s0))))' >ramfs_xattr.cil # semodule -i ramfs_xattr.cil # mount -t ramfs none /mnt Before: mount: /mnt: mount(2) system call failed: Operation not supported. After: (mount succeeds) # ls -Zd /mnt system_u:object_r:ramfs_t:s0 /mnt See also: https://lore.kernel.org/selinux/20210105142148.GA3200@redhat.com/T/ https://github.com/fedora-selinux/selinux-policy/pull/478 Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-01-13 08:55:11 -05:00
Ondrej Mosnacek	e0de8a9aeb	selinux: mark selinux_xfrm_refcount as __read_mostly This is motivated by a perfomance regression of selinux_xfrm_enabled() that happened on a RHEL kernel due to false sharing between selinux_xfrm_refcount and (the late) selinux_ss.policy_rwlock (i.e. the .bss section memory layout changed such that they happened to share the same cacheline). Since the policy rwlock's memory region was modified upon each read-side critical section, the readers of selinux_xfrm_refcount had frequent cache misses, eventually leading to a significant performance degradation under a TCP SYN flood on a system with many cores (32 in this case, but it's detectable on less cores as well). While upstream has since switched to RCU locking, so the same can no longer happen here, selinux_xfrm_refcount could still share a cacheline with another frequently written region, thus marking it __read_mostly still makes sense. __read_mostly helps, because it will put the symbol in a separate section along with other read-mostly variables, so there should never be a clash with frequently written data. Since selinux_xfrm_refcount is modified only in case of an explicit action, it should be safe to do this (i.e. it shouldn't disrupt other read-mostly variables too much). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2021-01-12 10:12:58 -05:00

1 2 3 4 5 ...

1851 Commits