OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Michal Hocko	476accbe2f	selinux: use GFP_NOWAIT in the AVC kmem_caches There is a strange __GFP_NOMEMALLOC usage pattern in SELinux, specifically GFP_ATOMIC \| __GFP_NOMEMALLOC which doesn't make much sense. GFP_ATOMIC on its own allows to access memory reserves while __GFP_NOMEMALLOC dictates we cannot use memory reserves. Replace this with the much more sane GFP_NOWAIT in the AVC code as we can tolerate memory allocation failures in that code. Signed-off-by: Michal Hocko <mhocko@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-08-08 09:12:23 -04:00
Stephen Smalley	af63f4193f	selinux: Generalize support for NNP/nosuid SELinux domain transitions As systemd ramps up enabling NNP (NoNewPrivileges) for system services, it is increasingly breaking SELinux domain transitions for those services and their descendants. systemd enables NNP not only for services whose unit files explicitly specify NoNewPrivileges=yes but also for services whose unit files specify any of the following options in combination with running without CAP_SYS_ADMIN (e.g. specifying User= or a CapabilityBoundingSet= without CAP_SYS_ADMIN): SystemCallFilter=, SystemCallArchitectures=, RestrictAddressFamilies=, RestrictNamespaces=, PrivateDevices=, ProtectKernelTunables=, ProtectKernelModules=, MemoryDenyWriteExecute=, or RestrictRealtime= as per the systemd.exec(5) man page. The end result is bad for the security of both SELinux-disabled and SELinux-enabled systems. Packagers have to turn off these options in the unit files to preserve SELinux domain transitions. For users who choose to disable SELinux, this means that they miss out on at least having the systemd-supported protections. For users who keep SELinux enabled, they may still be missing out on some protections because it isn't necessarily guaranteed that the SELinux policy for that service provides the same protections in all cases. commit `7b0d0b40cd` ("selinux: Permit bounded transitions under NO_NEW_PRIVS or NOSUID.") allowed bounded transitions under NNP in order to support limited usage for sandboxing programs. However, defining typebounds for all of the affected service domains is impractical to implement in policy, since typebounds requires us to ensure that each domain is allowed everything all of its descendant domains are allowed, and this has to be repeated for the entire chain of domain transitions. There is no way to clone all allow rules from descendants to their ancestors in policy currently, and doing so would be undesirable even if it were practical, as it requires leaking permissions to objects and operations into ancestor domains that could weaken their own security in order to allow them to the descendants (e.g. if a descendant requires execmem permission, then so do all of its ancestors; if a descendant requires execute permission to a file, then so do all of its ancestors; if a descendant requires read to a symbolic link or temporary file, then so do all of its ancestors...). SELinux domains are intentionally not hierarchical / bounded in this manner normally, and making them so would undermine their protections and least privilege. We have long had a similar tension with SELinux transitions and nosuid mounts, albeit not as severe. Users often have had to choose between retaining nosuid on a mount and allowing SELinux domain transitions on files within those mounts. This likewise leads to unfortunate tradeoffs in security. Decouple NNP/nosuid from SELinux transitions, so that we don't have to make a choice between them. Introduce a nnp_nosuid_transition policy capability that enables transitions under NNP/nosuid to be based on a permission (nnp_transition for NNP; nosuid_transition for nosuid) between the old and new contexts in addition to the current support for bounded transitions. Domain transitions can then be allowed in policy without requiring the parent to be a strict superset of all of its children. With this change, systemd unit files can be left unmodified from upstream. SELinux-disabled and SELinux-enabled users will benefit from retaining any of the systemd-provided protections. SELinux policy will only need to be adapted to enable the new policy capability and to allow the new permissions between domain pairs as appropriate. NB: Allowing nnp_transition between two contexts opens up the potential for the old context to subvert the new context by installing seccomp filters before the execve. Allowing nosuid_transition between two contexts opens up the potential for a context transition to occur on a file from an untrusted filesystem (e.g. removable media or remote filesystem). Use with care. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-08-02 16:36:04 -04:00
Kees Cook	62874c3adf	selinux: Refactor to remove bprm_secureexec hook The SELinux bprm_secureexec hook can be merged with the bprm_set_creds hook since it's dealing with the same information, and all of the details are finalized during the first call to the bprm_set_creds hook via prepare_binprm() (subsequent calls due to binfmt_script, etc, are ignored via bprm->called_set_creds). Here, the test can just happen at the end of the bprm_set_creds hook, and the bprm_secureexec hook can be dropped. Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Paul Moore <paul@paul-moore.com> Tested-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org>	2017-08-01 12:03:07 -07:00
Kees Cook	ddb4a1442d	exec: Rename bprm->cred_prepared to called_set_creds The cred_prepared bprm flag has a misleading name. It has nothing to do with the bprm_prepare_cred hook, and actually tracks if bprm_set_creds has been called. Rename this flag and improve its comment. Cc: David Howells <dhowells@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com>	2017-08-01 12:02:48 -07:00
Florian Westphal	591bb2789b	netfilter: nf_hook_ops structs can be const We no longer place these on a list so they can be const. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-31 19:10:44 +02:00
Luis Ressel	2a764b529a	selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets For PF_UNIX, SOCK_RAW is synonymous with SOCK_DGRAM (cf. net/unix/af_unix.c). This is a tad obscure, but libpcap uses it. Signed-off-by: Luis Ressel <aranea@aixah.de> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-07-25 15:13:41 -04:00
Florian Westphal	09c7570480	xfrm: remove flow cache After rcu conversions performance degradation in forward tests isn't that noticeable anymore. See next patch for some numbers. A followup patcg could then also remove genid from the policies as we do not cache bundles anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:13:41 -07:00
Linus Torvalds	7114f51fcb	Merge branch 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull memdup_user() conversions from Al Viro: "A fairly self-contained series - hunting down open-coded memdup_user() and memdup_user_nul() instances" * 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: bpf: don't open-code memdup_user() kimage_file_prepare_segments(): don't open-code memdup_user() ethtool: don't open-code memdup_user() do_ip_setsockopt(): don't open-code memdup_user() do_ipv6_setsockopt(): don't open-code memdup_user() irda: don't open-code memdup_user() xfrm_user_policy(): don't open-code memdup_user() ima_write_policy(): don't open-code memdup_user_nul() sel_write_validatetrans(): don't open-code memdup_user_nul()	2017-07-05 16:05:24 -07:00
Linus Torvalds	5518b69b76	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Reasonably busy this cycle, but perhaps not as busy as in the 4.12 merge window: 1) Several optimizations for UDP processing under high load from Paolo Abeni. 2) Support pacing internally in TCP when using the sch_fq packet scheduler for this is not practical. From Eric Dumazet. 3) Support mutliple filter chains per qdisc, from Jiri Pirko. 4) Move to 1ms TCP timestamp clock, from Eric Dumazet. 5) Add batch dequeueing to vhost_net, from Jason Wang. 6) Flesh out more completely SCTP checksum offload support, from Davide Caratti. 7) More plumbing of extended netlink ACKs, from David Ahern, Pablo Neira Ayuso, and Matthias Schiffer. 8) Add devlink support to nfp driver, from Simon Horman. 9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa Prabhu. 10) Add stack depth tracking to BPF verifier and use this information in the various eBPF JITs. From Alexei Starovoitov. 11) Support XDP on qed device VFs, from Yuval Mintz. 12) Introduce BPF PROG ID for better introspection of installed BPF programs. From Martin KaFai Lau. 13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann. 14) For loads, allow narrower accesses in bpf verifier checking, from Yonghong Song. 15) Support MIPS in the BPF selftests and samples infrastructure, the MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David Daney. 16) Support kernel based TLS, from Dave Watson and others. 17) Remove completely DST garbage collection, from Wei Wang. 18) Allow installing TCP MD5 rules using prefixes, from Ivan Delalande. 19) Add XDP support to Intel i40e driver, from Björn Töpel 20) Add support for TC flower offload in nfp driver, from Simon Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub Kicinski, and Bert van Leeuwen. 21) IPSEC offloading support in mlx5, from Ilan Tayari. 22) Add HW PTP support to macb driver, from Rafal Ozieblo. 23) Networking refcount_t conversions, From Elena Reshetova. 24) Add sock_ops support to BPF, from Lawrence Brako. This is useful for tuning the TCP sockopt settings of a group of applications, currently via CGROUPs" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits) net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap cxgb4: Support for get_ts_info ethtool method cxgb4: Add PTP Hardware Clock (PHC) support cxgb4: time stamping interface for PTP nfp: default to chained metadata prepend format nfp: remove legacy MAC address lookup nfp: improve order of interfaces in breakout mode net: macb: remove extraneous return when MACB_EXT_DESC is defined bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case bpf: fix return in load_bpf_file mpls: fix rtm policy in mpls_getroute net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t net, ax25: convert ax25_route.refcount from atomic_t to refcount_t net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t ...	2017-07-05 12:31:59 -07:00
Linus Torvalds	e24dd9ee53	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: - a major update for AppArmor. From JJ: * several bug fixes and cleanups * the patch to add symlink support to securityfs that was floated on the list earlier and the apparmorfs changes that make use of securityfs symlinks * it introduces the domain labeling base code that Ubuntu has been carrying for several years, with several cleanups applied. And it converts the current mediation over to using the domain labeling base, which brings domain stacking support with it. This finally will bring the base upstream code in line with Ubuntu and provide a base to upstream the new feature work that Ubuntu carries. * This does _not_ contain any of the newer apparmor mediation features/controls (mount, signals, network, keys, ...) that Ubuntu is currently carrying, all of which will be RFC'd on top of this. - Notable also is the Infiniband work in SELinux, and the new file:map permission. From Paul: "While we're down to 21 patches for v4.13 (it was 31 for v4.12), the diffstat jumps up tremendously with over 2k of line changes. Almost all of these changes are the SELinux/IB work done by Daniel Jurgens; some other noteworthy changes include a NFS v4.2 labeling fix, a new file:map permission, and reporting of policy capabilities on policy load" There's also now genfscon labeling support for tracefs, which was lost in v4.1 with the separation from debugfs. - Smack incorporates a safer socket check in file_receive, and adds a cap_capable call in privilege check. - TPM as usual has a bunch of fixes and enhancements. - Multiple calls to security_add_hooks() can now be made for the same LSM, to allow LSMs to have hook declarations across multiple files. - IMA now supports different "ima_appraise=" modes (eg. log, fix) from the boot command line. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (126 commits) apparmor: put back designators in struct initialisers seccomp: Switch from atomic_t to recount_t seccomp: Adjust selftests to avoid double-join seccomp: Clean up core dump logic IMA: update IMA policy documentation to include pcr= option ima: Log the same audit cause whenever a file has no signature ima: Simplify policy_func_show. integrity: Small code improvements ima: fix get_binary_runtime_size() ima: use ima_parse_buf() to parse template data ima: use ima_parse_buf() to parse measurements headers ima: introduce ima_parse_buf() ima: Add cgroups2 to the defaults list ima: use memdup_user_nul ima: fix up #endif comments IMA: Correct Kconfig dependencies for hash selection ima: define is_ima_appraise_enabled() ima: define Kconfig IMA_APPRAISE_BOOTPARAM option ima: define a set of appraisal rules requiring file signatures ima: extend the "ima_policy" boot command line to support multiple policies ...	2017-07-05 11:26:35 -07:00
David S. Miller	3d09198243	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two entries being added at the same time to the IFLA policy table, whilst parallel bug fixes to decnet routing dst handling overlapping with the dst gc removal in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-21 17:35:22 -04:00
Julien Gomes	94df30a652	rtnetlink: add NEWCACHEREPORT message type New NEWCACHEREPORT message type to be used for cache reports sent via Netlink, effectively allowing splitting cache report reception from mroute programming. Suggested-by: Ryan Halbrook <halbrook@arista.com> Signed-off-by: Julien Gomes <julien@arista.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-21 11:22:52 -04:00
Jeff Vander Stoep	6a3911837d	selinux: enable genfscon labeling for tracefs In kernel version 4.1, tracefs was separated from debugfs into its own filesystem. Prior to this split, files in /sys/kernel/debug/tracing could be labeled during filesystem creation using genfscon or later from userspace using setxattr. This change re-enables support for genfscon labeling. Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-06-20 15:53:34 -04:00
Paul Moore	023f108dcc	selinux: fix double free in selinux_parse_opts_str() This patch is based on a discussion generated by an earlier patch from Tetsuo Handa: * https://marc.info/?t=149035659300001&r=1&w=2 The double free problem involves the mnt_opts field of the security_mnt_opts struct, selinux_parse_opts_str() frees the memory on error, but doesn't set the field to NULL so if the caller later attempts to call security_free_mnt_opts() we trigger the problem. In order to play it safe we change selinux_parse_opts_str() to call security_free_mnt_opts() on error instead of free'ing the memory directly. This should ensure that everything is handled correctly, regardless of what the caller may do. Fixes: `e000752989` ("LSM/SELinux: Interfaces to allow FS to control mount options") Cc: stable@vger.kernel.org Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-06-13 17:34:22 +10:00
Scott Mayhew	0b4d3452b8	security/selinux: allow security_sb_clone_mnt_opts to enable/disable native labeling behavior When an NFSv4 client performs a mount operation, it first mounts the NFSv4 root and then does path walk to the exported path and performs a submount on that, cloning the security mount options from the root's superblock to the submount's superblock in the process. Unless the NFS server has an explicit fsid=0 export with the "security_label" option, the NFSv4 root superblock will not have SBLABEL_MNT set, and neither will the submount superblock after cloning the security mount options. As a result, setxattr's of security labels over NFSv4.2 will fail. In a similar fashion, NFSv4.2 mounts mounted with the context= mount option will not show the correct labels because the nfs_server->caps flags of the cloned superblock will still have NFS_CAP_SECURITY_LABEL set. Allowing the NFSv4 client to enable or disable SECURITY_LSM_NATIVE_LABELS behavior will ensure that the SBLABEL_MNT flag has the correct value when the client traverses from an exported path without the "security_label" option to one with the "security_label" option and vice versa. Similarly, checking to see if SECURITY_LSM_NATIVE_LABELS is set upon return from security_sb_clone_mnt_opts() and clearing NFS_CAP_SECURITY_LABEL if necessary will allow the correct labels to be displayed for NFSv4.2 mounts mounted with the context= mount option. Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/35 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov> Tested-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-06-09 16:17:47 -04:00
Junil Lee	b4958c892e	selinux: use kmem_cache for ebitmap The allocated size for each ebitmap_node is 192byte by kzalloc(). Then, ebitmap_node size is fixed, so it's possible to use only 144byte for each object by kmem_cache_zalloc(). It can reduce some dynamic allocation size. Signed-off-by: Junil Lee <junil0814.lee@lge.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-06-09 16:13:50 -04:00
Florian Westphal	8e71bf75ef	selinux: use pernet operations for hook registration It will allow us to remove the old netfilter hook api in the near future. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-06-02 10:27:46 -04:00
Al Viro	0b884d25f5	sel_write_validatetrans(): don't open-code memdup_user_nul() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-05-25 23:28:36 -04:00
Daniel Jurgens	409dcf3153	selinux: Add a cache for quicker retreival of PKey SIDs It is likely that the SID for the same PKey will be requested many times. To reduce the time to modify QPs and process MADs use a cache to store PKey SIDs. This code is heavily based on the "netif" and "netport" concept originally developed by James Morris <jmorris@redhat.com> and Paul Moore <paul@paul-moore.com> (see security/selinux/netif.c and security/selinux/netport.c for more information) Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:28:12 -04:00
Daniel Jurgens	ab861dfca1	selinux: Add IB Port SMP access vector Add a type for Infiniband ports and an access vector for subnet management packets. Implement the ib_port_smp hook to check that the caller has permission to send and receive SMPs on the end port specified by the device name and port. Add interface to query the SID for a IB port, which walks the IB_PORT ocontexts to find an entry for the given name and port. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:28:02 -04:00
Daniel Jurgens	cfc4d882d4	selinux: Implement Infiniband PKey "Access" access vector Add a type and access vector for PKeys. Implement the ib_pkey_access hook to check that the caller has permission to access the PKey on the given subnet prefix. Add an interface to get the PKey SID. Walk the PKey ocontexts to find an entry for the given subnet prefix and pkey. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:27:50 -04:00
Daniel Jurgens	3a976fa676	selinux: Allocate and free infiniband security hooks Implement and attach hooks to allocate and free Infiniband object security structures. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:27:41 -04:00
Daniel Jurgens	a806f7a161	selinux: Create policydb version for Infiniband support Support for Infiniband requires the addition of two new object contexts, one for infiniband PKeys and another IB Ports. Added handlers to read and write the new ocontext types when reading or writing a binary policy representation. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:27:32 -04:00
Daniel Jurgens	8f408ab64b	selinux lsm IB/core: Implement LSM notification system Add a generic notificaiton mechanism in the LSM. Interested consumers can register a callback with the LSM and security modules can produce events. Because access to Infiniband QPs are enforced in the setup phase of a connection security should be enforced again if the policy changes. Register infiniband devices for policy change notification and check all QPs on that device when the notification is received. Add a call to the notification mechanism from SELinux when the AVC cache changes or setenforce is cleared. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 12:27:11 -04:00
Matthias Kaehlcke	270e857314	selinux: Remove redundant check for unknown labeling behavior The check is already performed in ocontext_read() when the policy is loaded. Removing the array also fixes the following warning when building with clang: security/selinux/hooks.c:338:20: error: variable 'labeling_behaviors' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:24:06 -04:00
Stephen Smalley	4dc2fce342	selinux: log policy capability state when a policy is loaded Log the state of SELinux policy capabilities when a policy is loaded. For each policy capability known to the kernel, log the policy capability name and the value set in the policy. For policy capabilities that are set in the loaded policy but unknown to the kernel, log the policy capability index, since this is the only information presently available in the policy. Sample output with a policy created with a new capability defined that is not known to the kernel: SELinux: policy capability network_peer_controls=1 SELinux: policy capability open_perms=1 SELinux: policy capability extended_socket_class=1 SELinux: policy capability always_check_network=0 SELinux: policy capability cgroup_seclabel=0 SELinux: unknown policy capability 5 Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/32 Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:50 -04:00
Stephen Smalley	ccb544781d	selinux: do not check open permission on sockets open permission is currently only defined for files in the kernel (COMMON_FILE_PERMS rather than COMMON_FILE_SOCK_PERMS). Construction of an artificial test case that tries to open a socket via /proc/pid/fd will generate a recvfrom avc denial because recvfrom and open happen to map to the same permission bit in socket vs file classes. open of a socket via /proc/pid/fd is not supported by the kernel regardless and will ultimately return ENXIO. But we hit the permission check first and can thus produce these odd/misleading denials. Omit the open check when operating on a socket. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:42 -04:00
Stephen Smalley	3ba4bf5f1e	selinux: add a map permission check for mmap Add a map permission check on mmap so that we can distinguish memory mapped access (since it has different implications for revocation). When a file is opened and then read or written via syscalls like read(2)/write(2), we revalidate access on each read/write operation via selinux_file_permission() and therefore can revoke access if the process context, the file context, or the policy changes in such a manner that access is no longer allowed. When a file is opened and then memory mapped via mmap(2) and then subsequently read or written directly in memory, we presently have no way to revalidate or revoke access. The purpose of a separate map permission check on mmap(2) is to permit policy to prohibit memory mapping of specific files for which we need to ensure that every access is revalidated, particularly useful for scenarios where we expect the file to be relabeled at runtime in order to reflect state changes (e.g. cross-domain solution, assured pipeline without data copying). Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:39 -04:00
Stephen Smalley	db59000ab7	selinux: only invoke capabilities and selinux for CAP_MAC_ADMIN checks SELinux uses CAP_MAC_ADMIN to control the ability to get or set a raw, uninterpreted security context unknown to the currently loaded security policy. When performing these checks, we only want to perform a base capabilities check and a SELinux permission check. If any other modules that implement a capable hook are stacked with SELinux, we do not want to require them to also have to authorize CAP_MAC_ADMIN, since it may have different implications for their security model. Rework the CAP_MAC_ADMIN checks within SELinux to only invoke the capabilities module and the SELinux permission checking. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:22 -04:00
Markus Elfring	46be14d2b6	selinux: Return an error code only as a constant in sidtab_insert() * Return an error code without storing it in an intermediate variable. * Delete the local variable "rc" and the jump label "out" which became unnecessary with this refactoring. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:17 -04:00
Markus Elfring	62934ffb9e	selinux: Return directly after a failed memory allocation in policydb_index() Replace five goto statements (and previous variable assignments) by direct returns after a memory allocation failure in this function. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:12 -04:00
Tetsuo Handa	a79be23860	selinux: Use task_alloc hook rather than task_create hook This patch is a preparation for getting rid of task_create hook because task_alloc hook which can do what task_create hook can do was revived. Creating a new thread is unlikely prohibited by security policy, for fork()/execve()/exit() is fundamental of how processes are managed in Unix. If a program is known to create a new thread, it is likely that permission to create a new thread is given to that program. Therefore, a situation where security_task_create() returns an error is likely that the program was exploited and lost control. Even if SELinux failed to check permission to create a thread at security_task_create(), SELinux can later check it at security_task_alloc(). Since the new thread is not yet visible from the rest of the system, nobody can do bad things using the new thread. What we waste will be limited to some initialization steps such as dup_task_struct(), copy_creds() and audit_alloc() in copy_process(). We can tolerate these overhead for unlikely situation. Therefore, this patch changes SELinux to use task_alloc hook rather than task_create hook so that we can remove task_create hook. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-23 10:23:02 -04:00
Linus Torvalds	11fbf53d66	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted bits and pieces from various people. No common topic in this pile, sorry" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/affs: add rename exchange fs/affs: add rename2 to prepare multiple methods Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx() fs: don't set *REFERENCED on single use objects fs: compat: Remove warning from COMPATIBLE_IOCTL remove pointless extern of atime_need_update_rcu() fs: completely ignore unknown open flags fs: add a VALID_OPEN_FLAGS fs: remove _submit_bh() fs: constify tree_descr arrays passed to simple_fill_super() fs: drop duplicate header percpu-rwsem.h fs/affs: bugfix: Write files greater than page size on OFS fs/affs: bugfix: enable writes on OFS disks fs/affs: remove node generation check fs/affs: import amigaffs.h fs/affs: bugfix: make symbolic links work again	2017-05-09 09:12:53 -07:00
Linus Torvalds	0302e28dee	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: IMA: - provide ">" and "<" operators for fowner/uid/euid rules KEYS: - add a system blacklist keyring - add KEYCTL_RESTRICT_KEYRING, exposes keyring link restriction functionality to userland via keyctl() LSM: - harden LSM API with __ro_after_init - add prlmit security hook, implement for SELinux - revive security_task_alloc hook TPM: - implement contextual TPM command 'spaces'" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (98 commits) tpm: Fix reference count to main device tpm_tis: convert to using locality callbacks tpm: fix handling of the TPM 2.0 event logs tpm_crb: remove a cruft constant keys: select CONFIG_CRYPTO when selecting DH / KDF apparmor: Make path_max parameter readonly apparmor: fix parameters so that the permission test is bypassed at boot apparmor: fix invalid reference to index variable of iterator line 836 apparmor: use SHASH_DESC_ON_STACK security/apparmor/lsm.c: set debug messages apparmor: fix boolreturn.cocci warnings Smack: Use GFP_KERNEL for smk_netlbl_mls(). smack: fix double free in smack_parse_opts_str() KEYS: add SP800-56A KDF support for DH KEYS: Keyring asymmetric key restrict method with chaining KEYS: Restrict asymmetric key linkage using a specific keychain KEYS: Add a lookup_restriction function for the asymmetric key type KEYS: Add KEYCTL_RESTRICT_KEYRING KEYS: Consistent ordering for __key_link_begin and restrict check KEYS: Add an optional lookup_restriction hook to key_type ...	2017-05-03 08:50:52 -07:00
Eric Biggers	cda37124f4	fs: constify tree_descr arrays passed to simple_fill_super() simple_fill_super() is passed an array of tree_descr structures which describe the files to create in the filesystem's root directory. Since these arrays are never modified intentionally, they should be 'const' so that they are placed in .rodata and benefit from memory protection. This patch updates the function signature and all users, and also constifies tree_descr.name. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:06 -04:00
Dan Carpenter	cae303df3f	selinux: Fix an uninitialized variable bug We removed this initialization as a cleanup but it is probably required. The concern is that "nel" can be zero. I'm not an expert on SELinux code but I think it looks possible to write an SELinux policy which triggers this bug. GCC doesn't catch this, but my static checker does. Fixes: `9c312e79d6` ("selinux: Delete an unnecessary variable initialisation in range_read()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-31 15:16:18 -04:00
Matthias Kaehlcke	342e91578e	selinux: Remove unnecessary check of array base in selinux_set_mapping() 'perms' will never be NULL since it isn't a plain pointer but an array of u32 values. This fixes the following warning when building with clang: security/selinux/ss/services.c:158:16: error: address of array 'p_in->perms' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] while (p_in->perms && p_in->perms[k]) { Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 18:57:25 -04:00
Markus Elfring	710a0647ba	selinuxfs: Use seq_puts() in sel_avc_stats_seq_show() A string which did not contain data format specifications should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:52:30 -04:00
Markus Elfring	8ee4586ca5	selinux: Adjust two checks for null pointers The script "checkpatch.pl" pointed information out like the following. Comparison to NULL could be written !… Thus fix affected source code places. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:46:10 -04:00
Markus Elfring	b380f78377	selinux: Use kmalloc_array() in sidtab_init() A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:44:10 -04:00
Markus Elfring	ebd2b47ba5	selinux: Return directly after a failed kzalloc() in roles_init() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:39:17 -04:00
Markus Elfring	7befb7514e	selinux: Return directly after a failed kzalloc() in perm_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:30:51 -04:00
Markus Elfring	442ca4d656	selinux: Return directly after a failed kzalloc() in common_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:29:11 -04:00
Markus Elfring	df4a14dfb4	selinux: Return directly after a failed kzalloc() in class_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:24:58 -04:00
Markus Elfring	ea6e2f7d12	selinux: Return directly after a failed kzalloc() in role_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:22:12 -04:00
Markus Elfring	549fe69ee5	selinux: Return directly after a failed kzalloc() in type_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:20:07 -04:00
Markus Elfring	4bd9f07b89	selinux: Return directly after a failed kzalloc() in user_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 11:15:17 -04:00
Markus Elfring	b592119100	selinux: Improve another size determination in sens_read() Replace the specification of a data type by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 10:51:09 -04:00
Markus Elfring	3c354d7d7b	selinux: Return directly after a failed kzalloc() in sens_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 09:56:48 -04:00
Markus Elfring	7f6d0ad8b7	selinux: Return directly after a failed kzalloc() in cat_read() Return directly after a call of the function "kzalloc" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-29 09:54:48 -04:00
David Ahern	983701eb06	rtnetlink: Add RTM_DELNETCONF Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 22:32:42 -07:00
Markus Elfring	9c312e79d6	selinux: Delete an unnecessary variable initialisation in range_read() The local variable "rt" will be set to an appropriate pointer a bit later. Thus omit the explicit initialisation at the beginning which became unnecessary with a previous update step. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 18:16:55 -04:00
Markus Elfring	57152a5be0	selinux: Return directly after a failed next_entry() in range_read() Return directly after a call of the function "next_entry" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 18:11:33 -04:00
Markus Elfring	02fcef27cc	selinux: Delete an unnecessary variable assignment in filename_trans_read() The local variable "ft" was set to a null pointer despite of an immediate reassignment. Thus remove this statement from the beginning of a loop. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 18:08:05 -04:00
Markus Elfring	315e01ada8	selinux: One function call less in genfs_read() after null pointer detection Call the function "kfree" at the end only after it was determined that the local variable "newgenfs" contained a non-null pointer. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 17:53:29 -04:00
Markus Elfring	3a0aa56518	selinux: Return directly after a failed next_entry() in genfs_read() Return directly after a call of the function "next_entry" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 17:45:29 -04:00
Markus Elfring	b4e4686f65	selinux: Delete an unnecessary return statement in policydb_destroy() The script "checkpatch.pl" pointed information out like the following. WARNING: void function return statements are not generally useful Thus remove such a statement in the affected function. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 17:21:26 -04:00
Markus Elfring	ad10a10567	selinux: Use kcalloc() in policydb_index() Multiplications for the size determination of memory allocations indicated that array data structures should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 17:14:16 -04:00
Markus Elfring	cb8d21e364	selinux: Adjust four checks for null pointers The script "checkpatch.pl" pointed information out like the following. Comparison to NULL could be written !… Thus fix affected source code places. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 16:36:38 -04:00
Markus Elfring	2f00e680fe	selinux: Use kmalloc_array() in hashtab_create() A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 16:31:20 -04:00
Markus Elfring	fb13a312da	selinux: Improve size determinations in four functions Replace the specification of data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 16:29:02 -04:00
Markus Elfring	e34cfef901	selinux: Delete an unnecessary return statement in cond_compute_av() The script "checkpatch.pl" pointed information out like the following. WARNING: void function return statements are not generally useful Thus remove such a statement in the affected function. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 16:25:42 -04:00
Markus Elfring	f6076f704a	selinux: Use kmalloc_array() in cond_init_bool_indexes() * A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. * Replace the specification of a data type by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-23 16:21:41 -04:00
Alexander Potapenko	e2f586bd83	selinux: check for address length in selinux_socket_bind() KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of uninitialized memory in selinux_socket_bind(): ================================================================== BUG: KMSAN: use of unitialized memory inter: 0 CPU: 3 PID: 1074 Comm: packet2 Tainted: G B 4.8.0-rc6+ #1916 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000000 ffff8800882ffb08 ffffffff825759c8 ffff8800882ffa48 ffffffff818bf551 ffffffff85bab870 0000000000000092 ffffffff85bab550 0000000000000000 0000000000000092 00000000bb0009bb 0000000000000002 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff825759c8>] dump_stack+0x238/0x290 lib/dump_stack.c:51 [<ffffffff818bdee6>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1008 [<ffffffff818bf0fb>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424 [<ffffffff822dae71>] selinux_socket_bind+0xf41/0x1080 security/selinux/hooks.c:4288 [<ffffffff8229357c>] security_socket_bind+0x1ec/0x240 security/security.c:1240 [<ffffffff84265d98>] SYSC_bind+0x358/0x5f0 net/socket.c:1366 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292 [<ffffffff8518217c>] entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.o:? chained origin: 00000000ba6009bb [<ffffffff810bb7a7>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67 [< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322 [< inline >] kmsan_save_stack mm/kmsan/kmsan.c:337 [<ffffffff818bd2b8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:530 [<ffffffff818bf033>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380 [<ffffffff84265b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292 [<ffffffff8518217c>] return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.o:? origin description: ----address@SYSC_bind (origin=00000000b8c00900) ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) , when I run the following program as root: ======================================================= #include <string.h> #include <sys/socket.h> #include <netinet/in.h> int main(int argc, char *argv[]) { struct sockaddr addr; int size = 0; if (argc > 1) { size = atoi(argv[1]); } memset(&addr, 0, sizeof(addr)); int fd = socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP); bind(fd, &addr, size); return 0; } ======================================================= (for different values of \|size\| other error reports are printed). This happens because bind() unconditionally copies \|size\| bytes of \|addr\| to the kernel, leaving the rest uninitialized. Then security_socket_bind() reads the IP address bytes, including the uninitialized ones, to determine the port, or e.g. pass them further to sel_netnode_find(), which uses them to calculate a hash. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Eric Dumazet <edumazet@google.com> [PM: fixed some whitespace damage] Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-10 15:22:16 -05:00
James Morris	579fc0dc09	selinux: constify nlmsg permission tables Constify nlmsg permission tables, which are initialized once and then do not change. Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-03-06 11:58:08 -05:00
James Morris	ca97d939db	security: mark LSM hooks as __ro_after_init Mark all of the registration hooks as __ro_after_init (via the __lsm_ro_after_init macro). Signed-off-by: James Morris <james.l.morris@oracle.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Kees Cook <keescook@chromium.org>	2017-03-06 11:00:15 +11:00
James Morris	dd0859dccb	security: introduce CONFIG_SECURITY_WRITABLE_HOOKS Subsequent patches will add RO hardening to LSM hooks, however, SELinux still needs to be able to perform runtime disablement after init to handle architectures where init-time disablement via boot parameters is not feasible. Introduce a new kernel configuration parameter CONFIG_SECURITY_WRITABLE_HOOKS, and a helper macro __lsm_ro_after_init, to handle this case. Signed-off-by: James Morris <james.l.morris@oracle.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Kees Cook <keescook@chromium.org>	2017-03-06 11:00:12 +11:00
Stephen Smalley	84e6885e9e	selinux: fix kernel BUG on prlimit(..., NULL, NULL) commit 79bcf325e6b32b3c ("prlimit,security,selinux: add a security hook for prlimit") introduced a security hook for prlimit() and implemented it for SELinux. However, if prlimit() is called with NULL arguments for both the new limit and the old limit, then the hook is called with 0 for the read/write flags, since the prlimit() will neither read nor write the process' limits. This would in turn lead to calling avc_has_perm() with 0 for the requested permissions, which triggers a BUG_ON() in avc_has_perm_noaudit() since the kernel should never be invoking avc_has_perm() with no permissions. Fix this in the SELinux hook by returning immediately if the flags are 0. Arguably prlimit64() itself ought to return immediately if both old_rlim and new_rlim are NULL since it is effectively a no-op in that case. Reported by the lkp-robot based on trinity testing. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-03-06 10:44:16 +11:00
Stephen Smalley	791ec491c3	prlimit,security,selinux: add a security hook for prlimit When SELinux was first added to the kernel, a process could only get and set its own resource limits via getrlimit(2) and setrlimit(2), so no MAC checks were required for those operations, and thus no security hooks were defined for them. Later, SELinux introduced a hook for setlimit(2) with a check if the hard limit was being changed in order to be able to rely on the hard limit value as a safe reset point upon context transitions. Later on, when prlimit(2) was added to the kernel with the ability to get or set resource limits (hard or soft) of another process, LSM/SELinux was not updated other than to pass the target process to the setrlimit hook. This resulted in incomplete control over both getting and setting the resource limits of another process. Add a new security_task_prlimit() hook to the check_prlimit_permission() function to provide complete mediation. The hook is only called when acting on another task, and only if the existing DAC/capability checks would allow access. Pass flags down to the hook to indicate whether the prlimit(2) call will read, write, or both read and write the resource limits of the target process. The existing security_task_setrlimit() hook is left alone; it continues to serve a purpose in supporting the ability to make decisions based on the old and/or new resource limit values when setting limits. This is consistent with the DAC/capability logic, where check_prlimit_permission() performs generic DAC/capability checks for acting on another task, while do_prlimit() performs a capability check based on a comparison of the old and new resource limits. Fix the inline documentation for the hook to match the code. Implement the new hook for SELinux. For setting resource limits, we reuse the existing setrlimit permission. Note that this does overload the setrlimit permission to mean the ability to set the resource limit (soft or hard) of another process or the ability to change one's own hard limit. For getting resource limits, a new getrlimit permission is defined. This was not originally defined since getrlimit(2) could only be used to obtain a process' own limits. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-03-06 10:43:47 +11:00
Linus Torvalds	1827adb11a	Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched.h split-up from Ingo Molnar: "The point of these changes is to significantly reduce the <linux/sched.h> header footprint, to speed up the kernel build and to have a cleaner header structure. After these changes the new <linux/sched.h>'s typical preprocessed size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K lines), which is around 40% faster to build on typical configs. Not much changed from the last version (-v2) posted three weeks ago: I eliminated quirks, backmerged fixes plus I rebased it to an upstream SHA1 from yesterday that includes most changes queued up in -next plus all sched.h changes that were pending from Andrew. I've re-tested the series both on x86 and on cross-arch defconfigs, and did a bisectability test at a number of random points. I tried to test as many build configurations as possible, but some build breakage is probably still left - but it should be mostly limited to architectures that have no cross-compiler binaries available on kernel.org, and non-default configurations" * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits) sched/headers: Clean up <linux/sched.h> sched/headers: Remove #ifdefs from <linux/sched.h> sched/headers: Remove the <linux/topology.h> include from <linux/sched.h> sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h> sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h> sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h> sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h> sched/headers: Remove <linux/sched.h> from <linux/sched/init.h> sched/core: Remove unused prefetch_stack() sched/headers: Remove <linux/rculist.h> from <linux/sched.h> sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h> sched/headers: Remove <linux/signal.h> from <linux/sched.h> sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> sched/headers: Remove the runqueue_is_locked() prototype sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h> sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h> sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h> sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h> ...	2017-03-03 10:16:38 -08:00
Ingo Molnar	299300258d	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> We are going to split <linux/sched/task.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:35 +01:00
Ingo Molnar	3f07c01441	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:29 +01:00
Stephen Smalley	2651225b5e	selinux: wrap cgroup seclabel support with its own policy capability commit `1ea0ce4069` ("selinux: allow changing labels for cgroupfs") broke the Android init program, which looks up security contexts whenever creating directories and attempts to assign them via setfscreatecon(). When creating subdirectories in cgroup mounts, this would previously be ignored since cgroup did not support userspace setting of security contexts. However, after the commit, SELinux would attempt to honor the requested context on cgroup directories and fail due to permission denial. Avoid breaking existing userspace/policy by wrapping this change with a conditional on a new cgroup_seclabel policy capability. This preserves existing behavior until/unless a new policy explicitly enables this capability. Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-03-02 10:27:40 +11:00
Alexey Dobriyan	5b5e0928f7	lib/vsprintf.c: remove %Z support Now that %z is standartised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice lib/vsprintf.o is about half of SLUB which is in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Dave Jiang	11bac80004	mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:54 -08:00
Linus Torvalds	f1ef09fde1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "There is a lot here. A lot of these changes result in subtle user visible differences in kernel behavior. I don't expect anything will care but I will revert/fix things immediately if any regressions show up. From Seth Forshee there is a continuation of the work to make the vfs ready for unpriviled mounts. We had thought the previous changes prevented the creation of files outside of s_user_ns of a filesystem, but it turns we missed the O_CREAT path. Ooops. Pavel Tikhomirov and Oleg Nesterov worked together to fix a long standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only children that are forked after the prctl are considered and not children forked before the prctl. The only known user of this prctl systemd forks all children after the prctl. So no userspace regressions will occur. Holding earlier forked children to the same rules as later forked children creates a semantic that is sane enough to allow checkpoing of processes that use this feature. There is a long delayed change by Nikolay Borisov to limit inotify instances inside a user namespace. Michael Kerrisk extends the API for files used to maniuplate namespaces with two new trivial ioctls to allow discovery of the hierachy and properties of namespaces. Konstantin Khlebnikov with the help of Al Viro adds code that when a network namespace exits purges it's sysctl entries from the dcache. As in some circumstances this could use a lot of memory. Vivek Goyal fixed a bug with stacked filesystems where the permissions on the wrong inode were being checked. I continue previous work on ptracing across exec. Allowing a file to be setuid across exec while being ptraced if the tracer has enough credentials in the user namespace, and if the process has CAP_SETUID in it's own namespace. Proc files for setuid or otherwise undumpable executables are now owned by the root in the user namespace of their mm. Allowing debugging of setuid applications in containers to work better. A bug I introduced with permission checking and automount is now fixed. The big change is to mark the mounts that the kernel initiates as a result of an automount. This allows the permission checks in sget to be safely suppressed for this kind of mount. As the permission check happened when the original filesystem was mounted. Finally a special case in the mount namespace is removed preventing unbounded chains in the mount hash table, and making the semantics simpler which benefits CRIU. The vfs fix along with related work in ima and evm I believe makes us ready to finish developing and merge fully unprivileged mounts of the fuse filesystem. The cleanups of the mount namespace makes discussing how to fix the worst case complexity of umount. The stacked filesystem fixes pave the way for adding multiple mappings for the filesystem uids so that efficient and safer containers can be implemented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc/sysctl: Don't grab i_lock under sysctl_lock. vfs: Use upper filesystem inode in bprm_fill_uid() proc/sysctl: prune stale dentries during unregistering mnt: Tuck mounts under others instead of creating shadow/side mounts. prctl: propagate has_child_subreaper flag to every descendant introduce the walk_process_tree() helper nsfs: Add an ioctl() to return owner UID of a userns fs: Better permission checking for submounts exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction vfs: open() with O_CREAT should not create inodes with unknown ids nsfs: Add an ioctl() to return the namespace type proc: Better ownership of files for non-dumpable tasks in user namespaces exec: Remove LSM_UNSAFE_PTRACE_CAP exec: Test the ptracer's saved cred to see if the tracee can gain caps exec: Don't reset euid and egid when the tracee has CAP_SETUID inotify: Convert to using per-namespace limits	2017-02-23 20:33:51 -08:00
Linus Torvalds	3051bf36c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini Varadhan. 2) Simplify classifier state on sk_buff in order to shrink it a bit. From Willem de Bruijn. 3) Introduce SIPHASH and it's usage for secure sequence numbers and syncookies. From Jason A. Donenfeld. 4) Reduce CPU usage for ICMP replies we are going to limit or suppress, from Jesper Dangaard Brouer. 5) Introduce Shared Memory Communications socket layer, from Ursula Braun. 6) Add RACK loss detection and allow it to actually trigger fast recovery instead of just assisting after other algorithms have triggered it. From Yuchung Cheng. 7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot. 8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert. 9) Export MPLS packet stats via netlink, from Robert Shearman. 10) Significantly improve inet port bind conflict handling, especially when an application is restarted and changes it's setting of reuseport. From Josef Bacik. 11) Implement TX batching in vhost_net, from Jason Wang. 12) Extend the dummy device so that VF (virtual function) features, such as configuration, can be more easily tested. From Phil Sutter. 13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric Dumazet. 14) Add new bpf MAP, implementing a longest prefix match trie. From Daniel Mack. 15) Packet sample offloading support in mlxsw driver, from Yotam Gigi. 16) Add new aquantia driver, from David VomLehn. 17) Add bpf tracepoints, from Daniel Borkmann. 18) Add support for port mirroring to b53 and bcm_sf2 drivers, from Florian Fainelli. 19) Remove custom busy polling in many drivers, it is done in the core networking since 4.5 times. From Eric Dumazet. 20) Support XDP adjust_head in virtio_net, from John Fastabend. 21) Fix several major holes in neighbour entry confirmation, from Julian Anastasov. 22) Add XDP support to bnxt_en driver, from Michael Chan. 23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan. 24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi. 25) Support GRO in IPSEC protocols, from Steffen Klassert" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits) Revert "ath10k: Search SMBIOS for OEM board file extension" net: socket: fix recvmmsg not returning error from sock_error bnxt_en: use eth_hw_addr_random() bpf: fix unlocking of jited image when module ronx not set arch: add ARCH_HAS_SET_MEMORY config net: napi_watchdog() can use napi_schedule_irqoff() tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" net/hsr: use eth_hw_addr_random() net: mvpp2: enable building on 64-bit platforms net: mvpp2: switch to build_skb() in the RX path net: mvpp2: simplify MVPP2_PRS_RI_* definitions net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT net: mvpp2: remove unused register definitions net: mvpp2: simplify mvpp2_bm_bufs_add() net: mvpp2: drop useless fields in mvpp2_bm_pool and related code net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue' net: mvpp2: release reference to txq_cpu[] entry after unmapping net: mvpp2: handle too large value in mvpp2_rx_time_coal_set() net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set() net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set ...	2017-02-22 10:15:09 -08:00
David S. Miller	35eeacf182	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-11 02:31:11 -05:00
James Morris	a2a15479d6	Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/selinux into next	2017-02-10 10:28:49 +11:00
Stephen Smalley	0c461cb727	selinux: fix off-by-one in setprocattr SELinux tries to support setting/clearing of /proc/pid/attr attributes from the shell by ignoring terminating newlines and treating an attribute value that begins with a NUL or newline as an attempt to clear the attribute. However, the test for clearing attributes has always been wrong; it has an off-by-one error, and this could further lead to reading past the end of the allocated buffer since commit `bb646cdb12` ("proc_pid_attr_write(): switch to memdup_user()"). Fix the off-by-one error. Even with this fix, setting and clearing /proc/pid/attr attributes from the shell is not straightforward since the interface does not support multiple write() calls (so shells that write the value and newline separately will set and then immediately clear the attribute, requiring use of echo -n to set the attribute), whereas trying to use echo -n "" to clear the attribute causes the shell to skip the write() call altogether since POSIX says that a zero-length write causes no side effects. Thus, one must use echo -n to set and echo without -n to clear, as in the following example: $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate unconfined_u:object_r:user_home_t:s0 $ echo "" > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate Note the use of /proc/$$ rather than /proc/self, as otherwise the cat command will read its own attribute value, not that of the shell. There are no users of this facility to my knowledge; possibly we should just get rid of it. UPDATE: Upon further investigation it appears that a local process with the process:setfscreate permission can cause a kernel panic as a result of this bug. This patch fixes CVE-2017-2618. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the update about CVE-2017-2618 to the commit description] Cc: stable@vger.kernel.org # 3.5: `d6ea83ec68` Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-02-08 19:09:53 +11:00
James Morris	e2241be62d	Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next	2017-02-08 19:01:07 +11:00
Antonio Murdaca	1ea0ce4069	selinux: allow changing labels for cgroupfs This patch allows changing labels for cgroup mounts. Previously, running chcon on cgroupfs would throw an "Operation not supported". This patch specifically whitelist cgroupfs. The patch could also allow containers to write only to the systemd cgroup for instance, while the other cgroups are kept with cgroup_t label. Signed-off-by: Antonio Murdaca <runcom@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-02-07 22:17:47 -05:00
Stephen Smalley	a050a570db	selinux: fix off-by-one in setprocattr SELinux tries to support setting/clearing of /proc/pid/attr attributes from the shell by ignoring terminating newlines and treating an attribute value that begins with a NUL or newline as an attempt to clear the attribute. However, the test for clearing attributes has always been wrong; it has an off-by-one error, and this could further lead to reading past the end of the allocated buffer since commit `bb646cdb12` ("proc_pid_attr_write(): switch to memdup_user()"). Fix the off-by-one error. Even with this fix, setting and clearing /proc/pid/attr attributes from the shell is not straightforward since the interface does not support multiple write() calls (so shells that write the value and newline separately will set and then immediately clear the attribute, requiring use of echo -n to set the attribute), whereas trying to use echo -n "" to clear the attribute causes the shell to skip the write() call altogether since POSIX says that a zero-length write causes no side effects. Thus, one must use echo -n to set and echo without -n to clear, as in the following example: $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate unconfined_u:object_r:user_home_t:s0 $ echo "" > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate Note the use of /proc/$$ rather than /proc/self, as otherwise the cat command will read its own attribute value, not that of the shell. There are no users of this facility to my knowledge; possibly we should just get rid of it. UPDATE: Upon further investigation it appears that a local process with the process:setfscreate permission can cause a kernel panic as a result of this bug. This patch fixes CVE-2017-2618. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the update about CVE-2017-2618 to the commit description] Cc: stable@vger.kernel.org # 3.5: `d6ea83ec68` Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-02-07 17:31:38 -05:00
Krister Johansen	4548b683b7	Introduce a sysctl that modifies the value of PROT_SOCK. Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl that denotes the first unprivileged inet port in the namespace. To disable all privileged ports set this to zero. It also checks for overlap with the local port range. The privileged and local range may not overlap. The use case for this change is to allow containerized processes to bind to priviliged ports, but prevent them from ever being allowed to modify their container's network configuration. The latter is accomplished by ensuring that the network namespace is not a child of the user namespace. This modification was needed to allow the container manager to disable a namespace's priviliged port restrictions without exposing control of the network namespace to processes in the user namespace. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 12:10:51 -05:00
Eric W. Biederman	9227dd2a84	exec: Remove LSM_UNSAFE_PTRACE_CAP With previous changes every location that tests for LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the LSM_UNSAFE_PTRACE_CAP redundant, so remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:08 +13:00
Casey Schaufler	d69dece5f5	LSM: Add /sys/kernel/security/lsm I am still tired of having to find indirect ways to determine what security modules are active on a system. I have added /sys/kernel/security/lsm, which contains a comma separated list of the active security modules. No more groping around in /proc/filesystems or other clever hacks. Unchanged from previous versions except for being updated to the latest security next branch. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-01-19 13:18:29 +11:00
Stephen Smalley	3a2f5a59a6	security,selinux,smack: kill security_task_wait hook As reported by yangshukui, a permission denial from security_task_wait() can lead to a soft lockup in zap_pid_ns_processes() since it only expects sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can in general lead to zombies; in the absence of some way to automatically reparent a child process upon a denial, the hook is not useful. Remove the security hook and its implementations in SELinux and Smack. Smack already removed its check from its hook. Reported-by: yangshukui <yangshukui@huawei.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-12 11:10:57 -05:00
Stephen Smalley	b4ba35c75a	selinux: drop unused socket security classes Several of the extended socket classes introduced by commit `da69a5306a` ("selinux: support distinctions among all network address families") are never used because sockets can never be created with the associated address family. Remove these unused socket security classes. The removed classes are bridge_socket for PF_BRIDGE, ib_socket for PF_IB, and mpls_socket for PF_MPLS. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-12 11:10:24 -05:00
Gary Tierney	900fde06cb	selinux: default to security isid in sel_make_bools() if no sid is found Use SECINITSID_SECURITY as the default SID for booleans which don't have a matching SID returned from security_genfs_sid(), also update the error message to a warning which matches this. This prevents the policy failing to load (and consequently the system failing to boot) when there is no default genfscon statement matched for the selinuxfs in the new policy. Signed-off-by: Gary Tierney <gary.tierney@gmx.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:32 -05:00
Gary Tierney	4262fb51c9	selinux: log errors when loading new policy Adds error logging to the code paths which can fail when loading a new policy in sel_write_load(). If the policy fails to be loaded from userspace then a warning message is printed, whereas if a failure occurs after loading policy from userspace an error message will be printed with details on where policy loading failed (recreating one of /classes/, /policy_capabilities/, /booleans/ in the SELinux fs). Also, if sel_make_bools() fails to obtain an SID for an entry in /booleans/* an error will be printed indicating the path of the boolean. Signed-off-by: Gary Tierney <gary.tierney@gmx.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	b21507e272	proc,security: move restriction on writing /proc/pid/attr nodes to proc Processes can only alter their own security attributes via /proc/pid/attr nodes. This is presently enforced by each individual security module and is also imposed by the Linux credentials implementation, which only allows a task to alter its own credentials. Move the check enforcing this restriction from the individual security modules to proc_pid_attr_write() before calling the security hook, and drop the unnecessary task argument to the security hook since it can only ever be the current task. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	be0554c9bf	selinux: clean up cred usage and simplify SELinux was sometimes using the task "objective" credentials when it could/should use the "subjective" credentials. This was sometimes hidden by the fact that we were unnecessarily passing around pointers to the current task, making it appear as if the task could be something other than current, so eliminate all such passing of current. Inline various permission checking helper functions that can be reduced to a single avc_has_perm() call. Since the credentials infrastructure only allows a task to alter its own credentials, we can always assume that current must be the same as the target task in selinux_setprocattr after the check. We likely should move this check from selinux_setprocattr() to proc_pid_attr_write() and drop the task argument to the security hook altogether; it can only serve to confuse things. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	01593d3299	selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces commit `aad82892af` ("selinux: Add support for unprivileged mounts from user namespaces") prohibited any use of context mount options within non-init user namespaces. However, this breaks use of context mount options for tmpfs mounts within user namespaces, which are being used by Docker/runc. There is no reason to block such usage for tmpfs, ramfs or devpts. Exempt these filesystem types from this restriction. Before: sh$ userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp mount: tmpfs is write-protected, mounting read-only mount: cannot mount tmpfs read-only After: sh$ userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp sh# ls -Zd /tmp unconfined_u:object_r:user_tmp_t:s0:c13 /tmp Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	ef37979a2c	selinux: handle ICMPv6 consistently with ICMP commit 79c8b348f215 ("selinux: support distinctions among all network address families") mapped datagram ICMP sockets to the new icmp_socket security class, but left ICMPv6 sockets unchanged. This change fixes that oversight to handle both kinds of sockets consistently. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Yongqin Liu	a2c7c6fbe5	selinux: add security in-core xattr support for tracefs Since kernel 4.1 ftrace is supported as a new separate filesystem. It gets automatically mounted by the kernel under the old path /sys/kernel/debug/tracing. Because it lives now on a separate filesystem SELinux needs to be updated to also support setting SELinux labels on tracefs inodes. This is required for compatibility in Android when moving to Linux 4.1 or newer. Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org> Signed-off-by: William Roberts <william.c.roberts@intel.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:30 -05:00
Stephen Smalley	da69a5306a	selinux: support distinctions among all network address families Extend SELinux to support distinctions among all network address families implemented by the kernel by defining new socket security classes and mapping to them. Otherwise, many sockets are mapped to the generic socket class and are indistinguishable in policy. This has come up previously with regard to selectively allowing access to bluetooth sockets, and more recently with regard to selectively allowing access to AF_ALG sockets. Guido Trentalancia submitted a patch that took a similar approach to add only support for distinguishing AF_ALG sockets, but this generalizes his approach to handle all address families implemented by the kernel. Socket security classes are also added for ICMP and SCTP sockets. Socket security classes were not defined for AF_* values that are reserved but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH, AF_ECONET, AF_SNA, AF_WANPIPE. Backward compatibility is provided by only enabling the finer-grained socket classes if a new policy capability is set in the policy; older policies will behave as before. The legacy redhat1 policy capability that was only ever used in testing within Fedora for ptrace_child is reclaimed for this purpose; as far as I can tell, this policy capability is not enabled in any supported distro policy. Add a pair of conditional compilation guards to detect when new AF_* values are added so that we can update SELinux accordingly rather than having to belatedly update it long after new address families are introduced. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:30 -05:00
Linus Torvalds	67327145c4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull SElinux fix from James Morris: "From Paul: 'A small SELinux patch to fix some clang/llvm compiler warnings and ensure the tools under scripts work well in the face of kernel changes'" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: selinux: use the kernel headers when building scripts/selinux	2016-12-22 10:03:52 -08:00
Paul Moore	bfc5e3a6af	selinux: use the kernel headers when building scripts/selinux Commit `3322d0d64f` ("selinux: keep SELinux in sync with new capability definitions") added a check on the defined capabilities without explicitly including the capability header file which caused problems when building genheaders for users of clang/llvm. Resolve this by using the kernel headers when building genheaders, which is arguably the right thing to do regardless, and explicitly including the kernel's capability.h header file in classmap.h. We also update the mdp build, even though it wasn't causing an error we really should be using the headers from the kernel we are building. Reported-by: Nicolas Iooss <nicolas.iooss@m4x.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-12-21 10:39:25 -05:00
Linus Torvalds	683b96f4d1	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Generally pretty quiet for this release. Highlights: Yama: - allow ptrace access for original parent after re-parenting TPM: - add documentation - many bugfixes & cleanups - define a generic open() method for ascii & bios measurements Integrity: - Harden against malformed xattrs SELinux: - bugfixes & cleanups Smack: - Remove unnecessary smack_known_invalid label - Do not apply star label in smack_setprocattr hook - parse mnt opts after privileges check (fixes unpriv DoS vuln)" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits) Yama: allow access for the current ptrace parent tpm: adjust return value of tpm_read_log tpm: vtpm_proxy: conditionally call tpm_chip_unregister tpm: Fix handling of missing event log tpm: Check the bios_dir entry for NULL before accessing it tpm: return -ENODEV if np is not set tpm: cleanup of printk error messages tpm: replace of_find_node_by_name() with dev of_node property tpm: redefine read_log() to handle ACPI/OF at runtime tpm: fix the missing .owner in tpm_bios_measurements_ops tpm: have event log use the tpm_chip tpm: drop tpm1_chip_register(/unregister) tpm: replace dynamically allocated bios_dir with a static array tpm: replace symbolic permission with octal for securityfs files char: tpm: fix kerneldoc tpm2_unseal_trusted name typo tpm_tis: Allow tpm_tis to be bound using DT tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV tpm: Only call pm_runtime_get_sync if device has a parent tpm: define a generic open() method for ascii & bios measurements Documentation: tpm: add the Physical TPM device tree binding documentation ...	2016-12-14 13:57:44 -08:00
Andreas Gruenbacher	9287aed2ad	selinux: Convert isec->lock into a spinlock Convert isec->lock from a mutex into a spinlock. Instead of holding the lock while sleeping in inode_doinit_with_dentry, set isec->initialized to LABEL_PENDING and release the lock. Then, when the sid has been determined, re-acquire the lock. If isec->initialized is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has been set by another task (LABEL_INITIALIZED) or invalidated (LABEL_INVALID) in the meantime. This fixes a deadlock on gfs2 where * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds isec->lock, and tries to acquire the inode's glock, and * another task is in do_xmote -> inode_go_inval -> selinux_inode_invalidate_secctx, holds the inode's glock, and tries to acquire isec->lock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> [PM: minor tweaks to keep checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-22 17:44:02 -05:00
Stephen Smalley	3322d0d64f	selinux: keep SELinux in sync with new capability definitions When a new capability is defined, SELinux needs to be updated. Trigger a build error if a new capability is defined without corresponding update to security/selinux/include/classmap.h's COMMON_CAP2_PERMS. This is similar to BUILD_BUG_ON() guards in the SELinux nlmsgtab code to ensure that SELinux tracks new netlink message types as needed. Note that there is already a similar build guard in security/selinux/hooks.c to detect when more than 64 capabilities are defined, since that will require adding a third capability class to SELinux. A nicer way to do this would be to extend scripts/selinux/genheaders or a similar tool to auto-generate the necessary definitions and code for SELinux capability checking from include/uapi/linux/capability.h. AppArmor does something similar in its Makefile, although it only needs to generate a single table of names. That is left as future work. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: reformat the description to keep checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-21 15:37:24 -05:00
Stephen Smalley	ea49d10eee	selinux: normalize input to /sys/fs/selinux/enforce At present, one can write any signed integer value to /sys/fs/selinux/enforce and it will be stored, e.g. echo -1 > /sys/fs/selinux/enforce or echo 2 > /sys/fs/selinux/enforce. This makes no real difference to the kernel, since it only ever cares if it is zero or non-zero, but some userspace code compares it with 1 to decide if SELinux is enforcing, and this could confuse it. Only a process that is already root and is allowed the setenforce permission in SELinux policy can write to /sys/fs/selinux/enforce, so this is not considered to be a security issue, but it should be fixed. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-20 17:13:19 -05:00
Nicolas Pitre	baa73d9e47	posix-timers: Make them configurable Some embedded systems have no use for them. This removes about 25KB from the kernel binary size when configured out. Corresponding syscalls are routed to a stub logging the attempt to use those syscalls which should be enough of a clue if they were disabled without proper consideration. They are: timer_create, timer_gettime: timer_getoverrun, timer_settime, timer_delete, clock_adjtime, setitimer, getitimer, alarm. The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast majority of use cases with very little code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: linux-kbuild@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Michal Marek <mmarek@suse.com> Cc: Edward Cree <ecree@solarflare.com> Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-16 09:26:35 +01:00
Andreas Gruenbacher	13457d073c	selinux: Clean up initialization of isec->sclass Now that isec->initialized == LABEL_INITIALIZED implies that isec->sclass is valid, skip such inodes immediately in inode_doinit_with_dentry. For the remaining inodes, initialize isec->sclass at the beginning of inode_doinit_with_dentry to simplify the code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:53:04 -05:00
Andreas Gruenbacher	db978da8fa	proc: Pass file mode to proc_pid_make_inode Pass the file mode of the proc inode to be created to proc_pid_make_inode. In proc_pid_make_inode, initialize inode->i_mode before calling security_task_to_inode. This allows selinux to set isec->sclass right away without introducing "half-initialized" inode security structs. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:39:48 -05:00
Andreas Gruenbacher	420591128c	selinux: Minor cleanups Fix the comment for function __inode_security_revalidate, which returns an integer. Use the LABEL_* constants consistently for isec->initialized. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:25:07 -05:00
Tetsuo Handa	8931c3bdb3	SELinux: Use GFP_KERNEL for selinux_parse_opts_str(). Since selinux_parse_opts_str() is calling match_strdup() which uses GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is called by selinux_parse_opts_str(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:03:38 -05:00
Andy Lutomirski	d17af5056c	mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() Asking for a non-current task's stack can't be done without races unless the task is frozen in kernel mode. As far as I know, vm_is_stack_for_task() never had a safe non-current use case. The __unused annotation is because some KSTK_ESP implementations ignore their parameter, which IMO is further justification for this patch. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Jann Horn <jann@thejh.net> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux API <linux-api@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tycho Andersen <tycho.andersen@canonical.com> Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-20 09:21:41 +02:00
Linus Torvalds	101105b171	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning	2016-10-10 20:16:43 -07:00
Linus Torvalds	97d2116708	Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr updates from Al Viro: "xattr stuff from Andreas This completes the switch to xattr_handler ->get()/->set() from ->getxattr/->setxattr/->removexattr" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Remove {get,set,remove}xattr inode operations xattr: Stop calling {get,set,remove}xattr inode operations vfs: Check for the IOP_XATTR flag in listxattr xattr: Add __vfs_{get,set,remove}xattr helpers libfs: Use IOP_XATTR flag for empty directory handling vfs: Use IOP_XATTR flag for bad-inode handling vfs: Add IOP_XATTR inode operations flag vfs: Move xattr_resolve_name to the front of fs/xattr.c ecryptfs: Switch to generic xattr handlers sockfs: Get rid of getxattr iop sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names kernfs: Switch to generic xattr handlers hfs: Switch to generic xattr handlers jffs2: Remove jffs2_{get,set,remove}xattr macros xattr: Remove unnecessary NULL attribute name check	2016-10-10 17:11:50 -07:00
Linus Torvalds	563873318d	Merge branch 'printk-cleanups' Merge my system logging cleanups, triggered by the broken '\n' patches. The line continuation handling has been broken basically forever, and the code to handle the system log records was both confusing and dubious. And it would do entirely the wrong thing unless you always had a terminating newline, partly because it couldn't actually see whether a message was marked KERN_CONT or not (but partly because the LOG_CONT handling in the recording code was rather confusing too). This re-introduces a real semantically meaningful KERN_CONT, and fixes the few places I noticed where it was missing. There are probably more missing cases, since KERN_CONT hasn't actually had any semantic meaning for at least four years (other than the checkpatch meaning of "no log level necessary, this is a continuation line"). This also allows the combination of KERN_CONT and a log level. In that case the log level will be ignored if the merging with a previous line is successful, but if a new record is needed, that new record will now get the right log level. That also means that you can at least in theory combine KERN_CONT with the "pr_info()" style helpers, although any use of pr_fmt() prefixing would make that just result in a mess, of course (the prefix would end up in the middle of a continuing line). * printk-cleanups: printk: make reading the kernel log flush pending lines printk: re-organize log_output() to be more legible printk: split out core logging code into helper function printk: reinstate KERN_CONT for printing continuation lines	2016-10-10 09:29:50 -07:00
Linus Torvalds	4bcc595ccd	printk: reinstate KERN_CONT for printing continuation lines Long long ago the kernel log buffer was a buffered stream of bytes, very much like stdio in user space. It supported log levels by scanning the stream and noticing the log level markers at the beginning of each line, but if you wanted to print a partial line in multiple chunks, you just did multiple printk() calls, and it just automatically worked. Except when it didn't, and you had very confusing output when different lines got all mixed up with each other. Then you got fragment lines mixing with each other, or with non-fragment lines, because it was traditionally impossible to tell whether a printk() call was a continuation or not. To at least help clarify the issue of continuation lines, we added a KERN_CONT marker back in 2007 to mark continuation lines: `4749252776` ("printk: add KERN_CONT annotation"). That continuation marker was initially an empty string, and didn't actuall make any semantic difference. But it at least made it possible to annotate the source code, and have check-patch notice that a printk() didn't need or want a log level marker, because it was a continuation of a previous line. To avoid the ambiguity between a continuation line that had that KERN_CONT marker, and a printk with no level information at all, we then in 2009 made KERN_CONT be a real log level marker which meant that we could now reliably tell the difference between the two cases. `5fd29d6ccb` ("printk: clean up handling of log-levels and newlines") and we could take advantage of that to make sure we didn't mix up continuation lines with lines that just didn't have any loglevel at all. Then, in 2012, the kernel log buffer was changed to be a "record" based log, where each line was a record that has a loglevel and a timestamp. You can see the beginning of that conversion in commits `e11fea92e1` ("kmsg: export printk records to the /dev/kmsg interface") `7ff9554bb5` ("printk: convert byte-buffer to variable-length record buffer") with a number of follow-up commits to fix some painful fallout from that conversion. Over all, it took a couple of months to sort out most of it. But the upside was that you could have concurrent readers (and writers) of the kernel log and not have lines with mixed output in them. And one particular pain-point for the record-based kernel logging was exactly the fragmentary lines that are generated in smaller chunks. In order to still log them as one recrod, the continuation lines need to be attached to the previous record properly. However the explicit continuation record marker that is actually useful for this exact case was actually removed in aroundm the same time by commit `61e99ab8e3` ("printk: remove the now unnecessary "C" annotation for KERN_CONT") due to the incorrect belief that KERN_CONT wasn't meaningful. The ambiguity between "is this a continuation line" or "is this a plain printk with no log level information" was reintroduced, and in fact became an even bigger pain point because there was now the whole record-level merging of kernel messages going on. This patch reinstates the KERN_CONT as a real non-empty string marker, so that the ambiguity is fixed once again. But it's not a plain revert of that original removal: in the four years since we made KERN_CONT an empty string again, not only has the format of the log level markers changed, we've also had some usage changes in this area. For example, some ACPI code seems to use KERN_CONT _together_ with a log level, and now uses both the KERN_CONT marker and (for example) a KERN_INFO marker to show that it's an informational continuation of a line. Which is actually not a bad idea - if the continuation line cannot be attached to its predecessor, without the log level information we don't know what log level to assign to it (and we traditionally just assigned it the default loglevel). So having both a log level and the KERN_CONT marker is not necessarily a bad idea, but it does mean that we need to actually iterate over potentially multiple markers, rather than just a single one. Also, since KERN_CONT was still conceptually needed, and encouraged, but didn't actually _do_ anything, we've also had the reverse problem: rather than having too many annotations it has too few, and there is bit rot with code that no longer marks the continuation lines with the KERN_CONT marker. So this patch not only re-instates the non-empty KERN_CONT marker, it also fixes up the cases of bit-rot I noticed in my own logs. There are probably other cases where KERN_CONT will be needed to be added, either because it is new code that never dealt with the need for KERN_CONT, or old code that has bitrotted without anybody noticing. That said, we should strive to avoid the need for KERN_CONT. It does result in real problems for logging, and should generally not be seen as a good feature. If we some day can get rid of the feature entirely, because nobody does any fragmented printk calls, that would be lovely. But until that point, let's at mark the code that relies on the hacky multi-fragment kernel printk's. Not only does it avoid the ambiguity, it also annotates code as "maybe this would be good to fix some day". (That said, particularly during single-threaded bootup, the downsides of KERN_CONT are very limited. Things get much hairier when you have multiple threads going on and user level reading and writing logs too). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-09 12:23:38 -07:00
Andreas Gruenbacher	5d6c31910b	xattr: Add __vfs_{get,set,remove}xattr helpers Right now, various places in the kernel check for the existence of getxattr, setxattr, and removexattr inode operations and directly call those operations. Switch to helper functions and test for the IOP_XATTR flag instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-10-07 20:10:44 -04:00
Deepa Dinamani	078cd8279e	fs: Replace CURRENT_TIME with current_time() for inode timestamps CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_time() instead. CURRENT_TIME is also not y2038 safe. This is also in preparation for the patch that transitions vfs timestamps to use 64 bit time and hence make them y2038 safe. As part of the effort current_time() will be extended to do range checks. Hence, it is necessary for all file system timestamps to use current_time(). Also, current_time() will be transitioned along with vfs to be y2038 safe. Note that whenever a single call to current_time() is used to change timestamps in different inodes, it is because they share the same time granularity. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Felipe Balbi <balbi@kernel.org> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-09-27 21:06:21 -04:00
Vivek Goyal	43af5de742	lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u" of common_audit_data. This information is used to print path of file at the same time it is also used to get to dentry and inode. And this inode information is used to get to superblock and device and print device information. This does not work well for layered filesystems like overlay where dentry contained in path is overlay dentry and not the real dentry of underlying file system. That means inode retrieved from dentry is also overlay inode and not the real inode. SELinux helpers like file_path_has_perm() are doing checks on inode retrieved from file_inode(). This returns the real inode and not the overlay inode. That means we are doing check on real inode but for audit purposes we are printing details of overlay inode and that can be confusing while debugging. Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file information and inode retrieved is real inode using file_inode(). That way right avc denied information is given to user. For example, following is one example avc before the patch. type=AVC msg=audit(1473360868.399:214): avc: denied { read open } for pid=1765 comm="cat" path="/root/.../overlay/container1/merged/readfile" dev="overlay" ino=21443 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0 It looks as follows after the patch. type=AVC msg=audit(1473360017.388:282): avc: denied { read open } for pid=2530 comm="cat" path="/root/.../overlay/container1/merged/readfile" dev="dm-0" ino=2377915 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0 Notice that now dev information points to "dm-0" device instead of "overlay" device. This makes it clear that check failed on underlying inode and not on the overlay inode. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> [PM: slight tweaks to the description to make checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-09-19 13:42:38 -04:00
Wei Yongjun	9b6a9ecc2d	selinux: fix error return code in policydb_read() Fix to return error code -EINVAL from the error handling case instead of 0 (rc is overwrite to 0 when policyvers >= POLICYDB_VERSION_ROLETRANS), as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> [PM: normalize "selinux" in patch subject, description line wrap] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-09-13 17:14:43 -04:00
William Roberts	7c686af071	selinux: fix overflow and 0 length allocations Throughout the SELinux LSM, values taken from sepolicy are used in places where length == 0 or length == <saturated> matter, find and fix these. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-30 15:45:50 -04:00
William Roberts	3bc7bcf69b	selinux: initialize structures libsepol pointed out an issue where its possible to have an unitialized jmp and invalid dereference, fix this. While we're here, zero allocate all the *_val_to_struct structures. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-29 19:22:10 -04:00
William Roberts	74d977b65e	selinux: detect invalid ebitmap When count is 0 and the highbit is not zero, the ebitmap is not valid and the internal node is not allocated. This causes issues when routines, like mls_context_isvalid() attempt to use the ebitmap_for_each_bit() and ebitmap_node_get_bit() as they assume a highbit > 0 will have a node allocated. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-29 19:19:50 -04:00
William Roberts	348a0db9e6	selinux: drop SECURITY_SELINUX_POLICYDB_VERSION_MAX Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo This was only needed on Fedora 3 and 4 and just causes issues now, so drop it. The MAX and MIN should just be whatever the kernel can support. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-18 20:01:15 -04:00
Vivek Goyal	a518b0a5b0	selinux: Implement dentry_create_files_as() hook Calculate what would be the label of newly created file and set that secid in the passed creds. Context of the task which is actually creating file is retrieved from set of creds passed in. (old->security). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-10 08:25:22 -04:00
Vivek Goyal	c957f6df52	selinux: Pass security pointer to determine_inode_label() Right now selinux_determine_inode_label() works on security pointer of current task. Soon I need this to work on a security pointer retrieved from a set of creds. So start passing in a pointer and caller can decide where to fetch security pointer from. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 20:45:29 -04:00
Vivek Goyal	19472b69d6	selinux: Implementation for inode_copy_up_xattr() hook When a file is copied up in overlay, we have already created file on upper/ with right label and there is no need to copy up selinux label/xattr from lower file to upper file. In fact in case of context mount, we don't want to copy up label as newly created file got its label from context= option. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 20:43:59 -04:00
Vivek Goyal	56909eb3f5	selinux: Implementation for inode_copy_up() hook A file is being copied up for overlay file system. Prepare a new set of creds and set create_sid appropriately so that new file is created with appropriate label. Overlay inode has right label for both context and non-context mount cases. In case of non-context mount, overlay inode will have the label of lower file and in case of context mount, overlay inode will have the label from context= mount option. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 20:41:52 -04:00
Javier Martinez Canillas	1a93a6eac3	security: Use IS_ENABLED() instead of checking for built-in or module The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either built-in or as a module, use that macro instead of open coding the same. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 13:08:25 -04:00
Linus Torvalds	835c92d43b	Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull qstr constification updates from Al Viro: "Fairly self-contained bunch - surprising lot of places passes struct qstr * as an argument when const struct qstr * would suffice; it complicates analysis for no good reason. I'd prefer to feed that separately from the assorted fixes (those are in #for-linus and with somewhat trickier topology)" * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qstr: constify instances in adfs qstr: constify instances in lustre qstr: constify instances in f2fs qstr: constify instances in ext2 qstr: constify instances in vfat qstr: constify instances in procfs qstr: constify instances in fuse qstr constify instances in fs/dcache.c qstr: constify instances in nfs qstr: constify instances in ocfs2 qstr: constify instances in autofs4 qstr: constify instances in hfs qstr: constify instances in hfsplus qstr: constify instances in logfs qstr: constify dentry_init_security	2016-08-06 09:49:02 -04:00
Linus Torvalds	7a1e8b80fb	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - TPM core and driver updates/fixes - IPv6 security labeling (CALIPSO) - Lots of Apparmor fixes - Seccomp: remove 2-phase API, close hole where ptrace can change syscall #" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits) apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family) tpm: Factor out common startup code tpm: use devm_add_action_or_reset tpm2_i2c_nuvoton: add irq validity check tpm: read burstcount from TPM_STS in one 32-bit transaction tpm: fix byte-order for the value read by tpm2_get_tpm_pt tpm_tis_core: convert max timeouts from msec to jiffies apparmor: fix arg_size computation for when setprocattr is null terminated apparmor: fix oops, validate buffer size in apparmor_setprocattr() apparmor: do not expose kernel stack apparmor: fix module parameters can be changed after policy is locked apparmor: fix oops in profile_unpack() when policy_db is not present apparmor: don't check for vmalloc_addr if kvzalloc() failed apparmor: add missing id bounds check on dfa verification apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task apparmor: use list_next_entry instead of list_entry_next apparmor: fix refcount race when finding a child profile apparmor: fix ref count leak when profile sha1 hash is read apparmor: check that xindex is in trans_table bounds ...	2016-07-29 17:38:46 -07:00
Al Viro	4f3ccd7657	qstr: constify dentry_init_security Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-07-20 23:30:06 -04:00
James Morris	d011a4d861	Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next	2016-07-07 10:15:34 +10:00
Huw Davies	4fee5242bf	calipso: Add a label cache. This works in exactly the same way as the CIPSO label cache. The idea is to allow the lsm to cache the result of a secattr lookup so that it doesn't need to perform the lookup for every skbuff. It introduces two sysctl controls: calipso_cache_enable - enables/disables the cache. calipso_cache_bucket_size - sets the size of a cache bucket. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:06:17 -04:00
Huw Davies	a04e71f631	netlabel: Pass a family parameter to netlbl_skbuff_err(). This makes it possible to route the error to the appropriate labelling engine. CALIPSO is far less verbose than CIPSO when encountering a bogus packet, so there is no need for a CALIPSO error handler. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:06:16 -04:00
Huw Davies	2917f57b6b	calipso: Allow the lsm to label the skbuff directly. In some cases, the lsm needs to add the label to the skbuff directly. A NF_INET_LOCAL_OUT IPv6 hook is added to selinux to match the IPv4 behaviour. This allows selinux to label the skbuffs that it requires. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:06:15 -04:00
Huw Davies	e1adea9270	calipso: Allow request sockets to be relabelled by the lsm. Request sockets need to have a label that takes into account the incoming connection as well as their parent's label. This is used for the outgoing SYN-ACK and for their child full-socket. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:05:29 -04:00
Huw Davies	1f440c99d3	netlabel: Prevent setsockopt() from changing the hop-by-hop option. If a socket has a netlabel in place then don't let setsockopt() alter the socket's IPv6 hop-by-hop option. This is in the same spirit as the existing check for IPv4. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:05:27 -04:00
Huw Davies	ceba1832b1	calipso: Set the calipso socket label to match the secattr. CALIPSO is a hop-by-hop IPv6 option. A lot of this patch is based on the equivalent CISPO code. The main difference is due to manipulating the options in the hop-by-hop header. Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-27 15:02:51 -04:00
Seth Forshee	aad82892af	selinux: Add support for unprivileged mounts from user namespaces Security labels from unprivileged mounts in user namespaces must be ignored. Force superblocks from user namespaces whose labeling behavior is to use xattrs to use mountpoint labeling instead. For the mountpoint label, default to converting the current task context into a form suitable for file objects, but also allow the policy writer to specify a different label through policy transition rules. Pieced together from code snippets provided by Stephen Smalley. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2016-06-24 11:02:54 -05:00
Andy Lutomirski	380cf5ba6b	fs: Treat foreign mounts as nosuid If a process gets access to a mount from a different user namespace, that process should not be able to take advantage of setuid files or selinux entrypoints from that filesystem. Prevent this by treating mounts from other mount namespaces and those not owned by current_user_ns() or an ancestor as nosuid. This will make it safer to allow more complex filesystems to be mounted in non-root user namespaces. This does not remove the need for MNT_LOCK_NOSUID. The setuid, setgid, and file capability bits can no longer be abused if code in a user namespace were to clear nosuid on an untrusted filesystem, but this patch, by itself, is insufficient to protect the system from abuse of files that, when execed, would increase MAC privilege. As a more concrete explanation, any task that can manipulate a vfsmount associated with a given user namespace already has capabilities in that namespace and all of its descendents. If they can cause a malicious setuid, setgid, or file-caps executable to appear in that mount, then that executable will only allow them to elevate privileges in exactly the set of namespaces in which they are already privileges. On the other hand, if they can cause a malicious executable to appear with a dangerous MAC label, running it could change the caller's security context in a way that should not have been possible, even inside the namespace in which the task is confined. As a hardening measure, this would have made CVE-2014-5207 much more difficult to exploit. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2016-06-24 10:40:41 -05:00
Heinrich Schuchardt	309c5fad5d	selinux: fix type mismatch avc_cache_threshold is of type unsigned int. Do not use a signed new_value in sscanf(page, "%u", &new_value). Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> [PM: subject prefix fix, description cleanup] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-15 16:20:28 -04:00
Paul Moore	8bebe88c09	selinux: import NetLabel category bitmaps correctly The existing ebitmap_netlbl_import() code didn't correctly handle the case where the ebitmap_node was not aligned/sized to a power of two, this patch fixes this (on x86_64 ebitmap_node contains six bitmaps making a range of 0..383). Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-06-09 10:40:37 -04:00
Stephen Smalley	7ea59202db	selinux: Only apply bounds checking to source types The current bounds checking of both source and target types requires allowing any domain that has access to the child domain to also have the same permissions to the parent, which is undesirable. Drop the target bounds checking. KaiGai Kohei originally removed all use of target bounds in commit `7d52a155e3` ("selinux: remove dead code in type_attribute_bounds_av()") but this was reverted in commit `2ae3ba3938` ("selinux: libsepol: remove dead code in check_avtab_hierarchy_callback()") because it would have required explicitly allowing the parent any permissions to the child that the child is allowed to itself. This change in contrast retains the logic for the case where both source and target types are bounded, thereby allowing access if the parent of the source is allowed the corresponding permissions to the parent of the target. Further, this change reworks the logic such that we only perform a single computation for each case and there is no ambiguity as to how to resolve a bounds violation. Under the new logic, if the source type and target types are both bounded, then the parent of the source type must be allowed the same permissions to the parent of the target type. If only the source type is bounded, then the parent of the source type must be allowed the same permissions to the target type. Examples of the new logic and comparisons with the old logic: 1. If we have: typebounds A B; then: allow B self:process <permissions>; will satisfy the bounds constraint iff: allow A self:process <permissions>; is also allowed in policy. Under the old logic, the allow rule on B satisfies the bounds constraint if any of the following three are allowed: allow A B:process <permissions>; or allow B A:process <permissions>; or allow A self:process <permissions>; However, either of the first two ultimately require the third to satisfy the bounds constraint under the old logic, and therefore this degenerates to the same result (but is more efficient - we only need to perform one compute_av call). 2. If we have: typebounds A B; typebounds A_exec B_exec; then: allow B B_exec:file <permissions>; will satisfy the bounds constraint iff: allow A A_exec:file <permissions>; is also allowed in policy. This is essentially the same as #1; it is merely included as an example of dealing with object types related to a bounded domain in a manner that satisfies the bounds relationship. Note that this approach is preferable to leaving B_exec unbounded and having: allow A B_exec:file <permissions>; in policy because that would allow B's entrypoints to be used to enter A. Similarly for _tmp or other related types. 3. If we have: typebounds A B; and an unbounded type T, then: allow B T:file <permissions>; will satisfy the bounds constraint iff: allow A T:file <permissions>; is allowed in policy. The old logic would have been identical for this example. 4. If we have: typebounds A B; and an unbounded domain D, then: allow D B:unix_stream_socket <permissions>; is not subject to any bounds constraints under the new logic because D is not bounded. This is desirable so that we can allow a domain to e.g. connectto a child domain without having to allow it to do the same to its parent. The old logic would have required: allow D A:unix_stream_socket <permissions>; to also be allowed in policy. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: re-wrapped description to appease checkpatch.pl] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-05-31 12:01:59 -04:00
Linus Torvalds	f4f27d0028	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing of modules and firmware to be loaded from a specific device (this is from ChromeOS, where the device as a whole is verified cryptographically via dm-verity). This is disabled by default but can be configured to be enabled by default (don't do this if you don't know what you're doing). - Keys: allow authentication data to be stored in an asymmetric key. Lots of general fixes and updates. - SELinux: add restrictions for loading of kernel modules via finit_module(). Distinguish non-init user namespace capability checks. Apply execstack check on thread stacks" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits) LSM: LoadPin: provide enablement CONFIG Yama: use atomic allocations when reporting seccomp: Fix comment typo ima: add support for creating files using the mknodat syscall ima: fix ima_inode_post_setattr vfs: forbid write access when reading a file into memory fs: fix over-zealous use of "const" selinux: apply execstack check on thread stacks selinux: distinguish non-init user namespace capability checks LSM: LoadPin for kernel file loading restrictions fs: define a string representation of the kernel_read_file_id enumeration Yama: consolidate error reporting string_helpers: add kstrdup_quotable_file string_helpers: add kstrdup_quotable_cmdline string_helpers: add kstrdup_quotable selinux: check ss_initialized before revalidating an inode label selinux: delay inode label lookup as long as possible selinux: don't revalidate an inode's label when explicitly setting it selinux: Change bool variable name to index. KEYS: Add KEYCTL_DH_COMPUTE command ...	2016-05-19 09:21:36 -07:00
Linus Torvalds	a7fd20d1c4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support SPI based w5100 devices, from Akinobu Mita. 2) Partial Segmentation Offload, from Alexander Duyck. 3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE. 4) Allow cls_flower stats offload, from Amir Vadai. 5) Implement bpf blinding, from Daniel Borkmann. 6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is actually using FASYNC these atomics are superfluous. From Eric Dumazet. 7) Run TCP more preemptibly, also from Eric Dumazet. 8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e driver, from Gal Pressman. 9) Allow creating ppp devices via rtnetlink, from Guillaume Nault. 10) Improve BPF usage documentation, from Jesper Dangaard Brouer. 11) Support tunneling offloads in qed, from Manish Chopra. 12) aRFS offloading in mlx5e, from Maor Gottlieb. 13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo Leitner. 14) Add MSG_EOR support to TCP, this allows controlling packet coalescing on application record boundaries for more accurate socket timestamp sampling. From Martin KaFai Lau. 15) Fix alignment of 64-bit netlink attributes across the board, from Nicolas Dichtel. 16) Per-vlan stats in bridging, from Nikolay Aleksandrov. 17) Several conversions of drivers to ethtool ksettings, from Philippe Reynes. 18) Checksum neutral ILA in ipv6, from Tom Herbert. 19) Factorize all of the various marvell dsa drivers into one, from Vivien Didelot 20) Add VF support to qed driver, from Yuval Mintz" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits) Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m" Revert "phy dp83867: Make rgmii parameters optional" r8169: default to 64-bit DMA on recent PCIe chips phy dp83867: Make rgmii parameters optional phy dp83867: Fix compilation with CONFIG_OF_MDIO=m bpf: arm64: remove callee-save registers use for tmp registers asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions switchdev: pass pointer to fib_info instead of copy net_sched: close another race condition in tcf_mirred_release() tipc: fix nametable publication field in nl compat drivers: net: Don't print unpopulated net_device name qed: add support for dcbx. ravb: Add missing free_irq() calls to ravb_close() qed: Remove a stray tab net: ethernet: fec-mpc52xx: use phy_ethtool_{get\|set}_link_ksettings net: ethernet: fec-mpc52xx: use phydev from struct net_device bpf, doc: fix typo on bpf_asm descriptions stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set net: ethernet: fs-enet: use phy_ethtool_{get\|set}_link_ksettings net: ethernet: fs-enet: use phydev from struct net_device ...	2016-05-17 16:26:30 -07:00
Linus Torvalds	c52b76185b	Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull 'struct path' constification update from Al Viro: "'struct path' is passed by reference to a bunch of Linux security methods; in theory, there's nothing to stop them from modifying the damn thing and LSM community being what it is, sooner or later some enterprising soul is going to decide that it's a good idea. Let's remove the temptation and constify all of those..." * 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: constify ima_d_path() constify security_sb_pivotroot() constify security_path_chroot() constify security_path_{link,rename} apparmor: remove useless checks for NULL ->mnt constify security_path_{mkdir,mknod,symlink} constify security_path_{unlink,rmdir} apparmor: constify common_perm_...() apparmor: constify aa_path_link() apparmor: new helper - common_path_perm() constify chmod_common/security_path_chmod constify security_sb_mount() constify chown_common/security_path_chown tomoyo: constify assorted struct path * apparmor_path_truncate(): path->mnt is never NULL constify vfs_truncate() constify security_path_truncate() [apparmor] constify struct path * in a bunch of helpers	2016-05-17 14:41:03 -07:00
Stephen Smalley	c2316dbf12	selinux: apply execstack check on thread stacks The execstack check was only being applied on the main process stack. Thread stacks allocated via mmap were only subject to the execmem permission check. Augment the check to apply to the current thread stack as well. Note that this does NOT prevent making a different thread's stack executable. Suggested-by: Nick Kralevich <nnk@google.com> Acked-by: Nick Kralevich <nnk@google.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-26 15:47:57 -04:00
Stephen Smalley	8e4ff6f228	selinux: distinguish non-init user namespace capability checks Distinguish capability checks against a target associated with the init user namespace versus capability checks against a target associated with a non-init user namespace by defining and using separate security classes for the latter. This is needed to support e.g. Chrome usage of user namespaces for the Chrome sandbox without needing to allow Chrome to also exercise capabilities on targets in the init user namespace. Suggested-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-26 15:41:43 -04:00
Roopa Prabhu	10c9ead9f3	rtnetlink: add new RTM_GETSTATS message to dump link stats This patch adds a new RTM_GETSTATS message to query link stats via netlink from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK returns a lot more than just stats and is expensive in some cases when frequent polling for stats from userspace is a common operation. RTM_GETSTATS is an attempt to provide a light weight netlink message to explicity query only link stats from the kernel on an interface. The idea is to also keep it extensible so that new kinds of stats can be added to it in the future. This patch adds the following attribute for NETDEV stats: struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = { [IFLA_STATS_LINK_64] = { .len = sizeof(struct rtnl_link_stats64) }, }; Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of a single interface or all interfaces with NLM_F_DUMP. Future possible new types of stat attributes: link af stats: - IFLA_STATS_LINK_IPV6 (nested. for ipv6 stats) - IFLA_STATS_LINK_MPLS (nested. for mpls/mdev stats) extended stats: - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge, vlan, vxlan etc) - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are available via ethtool today) This patch also declares a filter mask for all stat attributes. User has to provide a mask of stats attributes to query. filter mask can be specified in the new hdr 'struct if_stats_msg' for stats messages. Other important field in the header is the ifindex. This api can also include attributes for global stats (eg tcp) in the future. When global stats are included in a stats msg, the ifindex in the header must be zero. A single stats message cannot contain both global and netdev specific stats. To easily distinguish them, netdev specific stat attributes name are prefixed with IFLA_STATS_LINK_ Without any attributes in the filter_mask, no stats will be returned. This patch has been tested with mofified iproute2 ifstat. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-20 15:43:42 -04:00
Paul Moore	1ac4247626	selinux: check ss_initialized before revalidating an inode label There is no point in trying to revalidate an inode's security label if the security server is not yet initialized. Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-19 16:37:27 -04:00
Paul Moore	20cdef8d57	selinux: delay inode label lookup as long as possible Since looking up an inode's label can result in revalidation, delay the lookup as long as possible to limit the performance impact. Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-19 16:37:07 -04:00
Paul Moore	2c97165bef	selinux: don't revalidate an inode's label when explicitly setting it There is no point in attempting to revalidate an inode's security label when we are in the process of setting it. Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-19 16:36:28 -04:00
Prarit Bhargava	0fd71a620b	selinux: Change bool variable name to index. security_get_bool_value(int bool) argument "bool" conflicts with in-kernel macros such as BUILD_BUG(). This patch changes this to index which isn't a type. Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@parisplace.org> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andrew Perepechko <anserper@ya.ru> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: selinux@tycho.nsa.gov Cc: Eric Paris <eparis@redhat.com> Cc: Paul Moore <pmoore@redhat.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: David Howells <dhowells@redhat.com> [PM: wrapped description for checkpatch.pl, use "selinux:..." as subj] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-04-14 11:24:50 -04:00

1 2 3 4 5 ...

1414 Commits