Commit Graph

691876 Commits

Author SHA1 Message Date
Alexander Potapenko beaec533fc llist: clang: introduce member_address_is_nonnull()
Currently llist_for_each_entry() and llist_for_each_entry_safe() iterate
until &pos->member != NULL.  But when building the kernel with Clang,
the compiler assumes &pos->member cannot be NULL if the member's offset
is greater than 0 (which would be equivalent to the object being
non-contiguous in memory).  Therefore the loop condition is always true,
and the loops become infinite.

To work around this, introduce the member_address_is_nonnull() macro,
which casts object pointer to uintptr_t, thus letting the member pointer
to be NULL.

Signed-off-by: Alexander Potapenko <glider@google.com>
Tested-by: Sodagudi Prasad <psodagud@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-19 15:33:50 -07:00
Eugeniy Paltsev 90f522a20e NET: dwmac: Make dwmac reset unconditional
Unconditional reset dwmac before HW init if reset controller is present.

In existing implementation we reset dwmac only after second module
probing:
(module load -> unload -> load again [reset happens])

Now we reset dwmac at every module load:
(module load [reset happens] -> unload -> load again [reset happens])

Also some reset controllers have only reset callback instead of
assert + deassert callbacks pair, so handle this case.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:52:19 -07:00
Tonghao Zhang c4b2bf6b4a openvswitch: Optimize operations for OvS flow_stats.
When calling the flow_free() to free the flow, we call many times
(cpu_possible_mask, eg. 128 as default) cpumask_next(). That will
take up our CPU usage if we call the flow_free() frequently.
When we put all packets to userspace via upcall, and OvS will send
them back via netlink to ovs_packet_cmd_execute(will call flow_free).

The test topo is shown as below. VM01 sends TCP packets to VM02,
and OvS forward packtets. When testing, we use perf to report the
system performance.

VM01 --- OvS-VM --- VM02

Without this patch, perf-top show as below: The flow_free() is
3.02% CPU usage.

	4.23%  [kernel]            [k] _raw_spin_unlock_irqrestore
	3.62%  [kernel]            [k] __do_softirq
	3.16%  [kernel]            [k] __memcpy
	3.02%  [kernel]            [k] flow_free
	2.42%  libc-2.17.so        [.] __memcpy_ssse3_back
	2.18%  [kernel]            [k] copy_user_generic_unrolled
	2.17%  [kernel]            [k] find_next_bit

When applied this patch, perf-top show as below: Not shown on
the list anymore.

	4.11%  [kernel]            [k] _raw_spin_unlock_irqrestore
	3.79%  [kernel]            [k] __do_softirq
	3.46%  [kernel]            [k] __memcpy
	2.73%  libc-2.17.so        [.] __memcpy_ssse3_back
	2.25%  [kernel]            [k] copy_user_generic_unrolled
	1.89%  libc-2.17.so        [.] _int_malloc
	1.53%  ovs-vswitchd        [.] xlate_actions

With this patch, the TCP throughput(we dont use Megaflow Cache
+ Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec
(maybe ~10% performance imporve).

This patch adds cpumask struct, the cpu_used_mask stores the cpu_id
that the flow used. And we only check the flow_stats on the cpu we
used, and it is unncessary to check all possible cpu when getting,
cleaning, and updating the flow_stats. Adding the cpu_used_mask to
sw_flow struct does’t increase the cacheline number.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:49:39 -07:00
Tonghao Zhang c57c054eb5 openvswitch: Optimize updating for OvS flow_stats.
In the ovs_flow_stats_update(), we only use the node
var to alloc flow_stats struct. But this is not a
common case, it is unnecessary to call the numa_node_id()
everytime. This patch is not a bugfix, but there maybe
a small increase.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:49:39 -07:00
David S. Miller 63679112c5 net: Zero terminate ifr_name in dev_ifname().
The ifr.ifr_name is passed around and assumed to be NULL terminated.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:33:24 -07:00
Levin, Alexander 98de4e0ea4 wireless: wext: terminate ifr name coming from userspace
ifr name is assumed to be a valid string by the kernel, but nothing
was forcing username to pass a valid string.

In turn, this would cause panics as we tried to access the string
past it's valid memory.

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:32:11 -07:00
David S. Miller 6b098a08ad Merge branch 'liquidio-lowmem-fixes'
Rick Farrington says:

====================
liquidio: avoid vm low memory crashes

This patchset addresses issues brought about by low memory conditions
in a VM.  These conditions were not seen when the driver was exercised
normally.  Rather, they were brought about through manual fault injection.
They are being included in the interest of hardening the driver against
unforeseen circumstances.

1. Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'.
   This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry
   in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
2. Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
3. For defensive programming, zero the allocated block 'oct->droq' in
   octeon_setup_output_queues() and 'oct->instr_queue' in
   octeon_setup_instr_queues().

change log:
V1 -> V2:
1. Corrected syntax in 'Subject' lines; no functional or code changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:24:47 -07:00
Rick Farrington 2c4aac74a9 liquidio: lowmem: init allocated memory to 0
For defensive programming, zero the allocated block 'oct->droq[0]' in
octeon_setup_output_queues() and 'oct->instr_queue[0]' in
octeon_setup_instr_queues().

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:24:46 -07:00
Rick Farrington 689062a18c liquidio: lowmem: do not dereference null ptr
Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:24:46 -07:00
Rick Farrington 00587f2fa7 liquidio: lowmem: init allocated memory to 0
Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'.
This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry
in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:24:46 -07:00
Rick Farrington 741912c553 liquidio: support new firmware statistic fw_err_pki
Added support for new firmware statistic 'tx_err_pki'.

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19 13:22:29 -07:00
Linus Torvalds e06fdaf40a Now that IPC and other changes have landed, enable manual markings for
randstruct plugin, including the task_struct.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZbRgGAAoJEIly9N/cbcAmk2AQAIL60aQ+9RIcFAXriFhnd7Z2
 x9Jqi9JNc8NgPFXx8GhE4J4eTZ5PwcjgXBpNRWY/laBkRyoBHn24ku09YxrJjmHz
 ZSUsP+/iO9lVeEfbmU9Tnk50afkfwx6bHXBwkiVGQWHtybNVUqA19JbqkHeg8ubx
 myKLGeUv5PPCodRIcBDD0+HaAANcsqtgbDpgmWU8s+IXWwvWCE2p7PuBw7v3HHgH
 qzlPDHYQCRDw+LWsSqPaHj+9mbRO18P/ydMoZHGH4Hl3YYNtty8ZbxnraI3A7zBL
 6mLUVcZ+/l88DqHc5I05T8MmLU1yl2VRxi8/jpMAkg9wkvZ5iNAtlEKIWU6eqsvk
 vaImNOkViLKlWKF+oUD1YdG16d8Segrc6m4MGdI021tb+LoGuUbkY7Tl4ee+3dl/
 9FM+jPv95HjJnyfRNGidh2TKTa9KJkh6DYM9aUnktMFy3ca1h/LuszOiN0LTDiHt
 k5xoFURk98XslJJyXM8FPwXCXiRivrXMZbg5ixNoS4aYSBLv7Cn1M6cPnSOs7UPh
 FqdNPXLRZ+vabSxvEg5+41Ioe0SHqACQIfaSsV5BfF2rrRRdaAxK4h7DBcI6owV2
 7ziBN1nBBq2onYGbARN6ApyCqLcchsKtQfiZ0iFsvW7ZawnkVOOObDTCgPl3tdkr
 403YXzphQVzJtpT5eRV6
 =ngAW
 -----END PGP SIGNATURE-----

Merge tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull structure randomization updates from Kees Cook:
 "Now that IPC and other changes have landed, enable manual markings for
  randstruct plugin, including the task_struct.

  This is the rest of what was staged in -next for the gcc-plugins, and
  comes in three patches, largest first:

   - mark "easy" structs with __randomize_layout

   - mark task_struct with an optional anonymous struct to isolate the
     __randomize_layout section

   - mark structs to opt _out_ of automated marking (which will come
     later)

  And, FWIW, this continues to pass allmodconfig (normal and patched to
  enable gcc-plugins) builds of x86_64, i386, arm64, arm, powerpc, and
  s390 for me"

* tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randstruct: opt-out externally exposed function pointer structs
  task_struct: Allow randomized layout
  randstruct: Mark various structs for randomization
2017-07-19 08:55:18 -07:00
Linus Torvalds a90c6ac2b5 A number of small fixes for -rc1 Luminous changes plus a readdir race
fix, marked for stable.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJZb3AkAAoJEEp/3jgCEfOLMAUH/RRRxbY4KL/PUhDXVPf+a+Pf
 groC365undvuCmHCkT1ufrlrh56KE0XUvEKgXJp+r84WS4SC6lxaebD6QvzVtyMM
 KPVnbpCNfKw5KtLB1upMteYY6MGfTk4VTPCav69aNGPrvUxJQB8obvWenPi0rWk/
 knALvlJZbSiZeUDK3Id9cjntTGkClYuUHYJQ1JaZeieB/Xwnr+ZvV4on8ul7gkGX
 B6zdqaM43ZomSl/rJrV/G/MOMNV5uVjBNJmVpfH7KkZQGipW7O+8aDwFaMFAAN7r
 4TQcLf+d3SDjcjVspikCMYr0r0VnbL8hLPGkd7Cus/3jei9GWQHGaQqbZZmcKl8=
 =TPyV
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.13-rc2' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A number of small fixes for -rc1 Luminous changes plus a readdir race
  fix, marked for stable"

* tag 'ceph-for-4.13-rc2' of git://github.com/ceph/ceph-client:
  libceph: potential NULL dereference in ceph_msg_data_create()
  ceph: fix race in concurrent readdir
  libceph: don't call encode_request_finish() on MOSDBackoff messages
  libceph: use alloc_pg_mapping() in __decode_pg_upmap_items()
  libceph: set -EINVAL in one place in crush_decode()
  libceph: NULL deref on osdmap_apply_incremental() error path
  libceph: fix old style declaration warnings
2017-07-19 08:49:46 -07:00
Shu Wang b0659ae5e3 audit: fix memleak in auditd_send_unicast_skb.
Found this issue by kmemleak report, auditd_send_unicast_skb
did not free skb if rcu_dereference(auditd_conn) returns null.

unreferenced object 0xffff88082568ce00 (size 256):
comm "auditd", pid 1119, jiffies 4294708499
backtrace:
[<ffffffff8176166a>] kmemleak_alloc+0x4a/0xa0
[<ffffffff8121820c>] kmem_cache_alloc_node+0xcc/0x210
[<ffffffff8161b99d>] __alloc_skb+0x5d/0x290
[<ffffffff8113c614>] audit_make_reply+0x54/0xd0
[<ffffffff8113dfa7>] audit_receive_msg+0x967/0xd70
----------------
(gdb) list *audit_receive_msg+0x967
0xffffffff8113dff7 is in audit_receive_msg (kernel/audit.c:1133).
1132    skb = audit_make_reply(0, AUDIT_REPLACE, 0,
                                0, &pvnr, sizeof(pvnr));
---------------
[<ffffffff8113e402>] audit_receive+0x52/0xa0
[<ffffffff8166c561>] netlink_unicast+0x181/0x240
[<ffffffff8166c8e2>] netlink_sendmsg+0x2c2/0x3b0
[<ffffffff816112e8>] sock_sendmsg+0x38/0x50
[<ffffffff816117a2>] SYSC_sendto+0x102/0x190
[<ffffffff81612f4e>] SyS_sendto+0xe/0x10
[<ffffffff8176d337>] entry_SYSCALL_64_fastpath+0x1a/0xa5
[<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Shu Wang <shuwang@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-07-19 10:28:54 -04:00
Sudeep Holla 975e83cfb8 PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present
If the genpd->attach_dev or genpd->power_on fails, genpd_dev_pm_attach
may return -EPROBE_DEFER initially. However genpd_alloc_dev_data sets
the PM domain for the device unconditionally.

When subsequent attempts are made to call genpd_dev_pm_attach, it may
return -EEXISTS checking dev->pm_domain without re-attempting to call
attach_dev or power_on.

platform_drv_probe then attempts to call drv->probe as the return value
-EEXIST != -EPROBE_DEFER, which may end up in a situation where the
device is accessed without it's power domain switched on.

Fixes: f104e1e5ef (PM / Domains: Re-order initialization of generic_pm_domain_data)
Cc: 4.4+ <stable@vger.kernel.org> # v4.4+
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:11 +02:00
Dan Williams bbb3be170a device-dax: fix sysfs duplicate warnings
Fix warnings of the form...

     WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
     sysfs: cannot create duplicate filename '/class/dax/dax12.0'
     Call Trace:
      dump_stack+0x63/0x86
      __warn+0xcb/0xf0
      warn_slowpath_fmt+0x5a/0x80
      ? kernfs_path_from_node+0x4f/0x60
      sysfs_warn_dup+0x62/0x80
      sysfs_do_create_link_sd.isra.2+0x97/0xb0
      sysfs_create_link+0x25/0x40
      device_add+0x266/0x630
      devm_create_dax_dev+0x2cf/0x340 [dax]
      dax_pmem_probe+0x1f5/0x26e [dax_pmem]
      nvdimm_bus_probe+0x71/0x120

...by reusing the namespace id for the device-dax instance name.

Now that we have decided that there will never by more than one
device-dax instance per libnvdimm-namespace parent device [1], we can
directly reuse the namepace ids. There are some possible follow-on
cleanups, but those are saved for a later patch to simplify the -stable
backport.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-December/008266.html

Fixes: 98a29c39dc ("libnvdimm, namespace: allow creation of multiple pmem...")
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-07-18 17:49:14 -07:00
Dan Carpenter 073dd5ad34 netfilter: fix netfilter_net_init() return
We accidentally return an uninitialized variable.

Fixes: cf56c2f892 ("netfilter: remove old pre-netns era hook api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 14:50:28 -07:00
David S. Miller 8c5e9fb8ac Merge branch 'net-attribute_group-const'
Arvind Yadav says:

====================
constify net attribute_group structures.

attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work with const
attribute_group. So mark the non-const structs as const.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:57 -07:00
Arvind Yadav 98dc8373db net: chelsio: cxgb3: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
  28720	    985	     12	  29717	   7415	net/.../cxgb3/cxgb3_main.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
  28848	    857	     12	  29717	   7415	net/.../cxgb3/cxgb3_main.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:57 -07:00
Arvind Yadav 02dbbef054 net: bonding: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
   4512	   1472	      0	   5984	   1760	drivers/net/bonding/bond_sysfs.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
   4576	   1408	      0	   5984	   1760	drivers/net/bonding/bond_sysfs.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav c5567669fc arcnet: com20020-pci: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
   3409	    948	     28	   4385	   1121	drivers/net/arcnet/com20020-pci.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
   3473	    884	     28	   4385	   1121	drivers/net/arcnet/com20020-pci.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav 6cbbd7ec06 wireless: iwlegacy: Constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav 64571ca71b wireless: iwlegacy: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav e00b6c6d98 wireless: ipw2100: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav d7979553ef wireless: ipw2200: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav 7eaf0d93f9 net: can: janz-ican3: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
  11800	    368	      0	  12168	   2f88	drivers/net/can/janz-ican3.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
  11864	    304	      0	  12168	   2f88	drivers/net/can/janz-ican3.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav 7ec2796e63 net: can: at91_can: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
   6164	    304	      0	   6468	   1944	drivers/net/can/at91_can.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
   6228	    240	      0	   6468	   1944	drivers/net/can/at91_can.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
Arvind Yadav 43f51ef119 net: cdc_ncm: constify attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
  13275	    928	      1	  14204	   377c	drivers/net/usb/cdc_ncm.o

File size After adding 'const':
   text	   data	    bss	    dec	    hex	filename
  13339	    864	      1	  14204	   377c	drivers/net/usb/cdc_ncm.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:04:56 -07:00
David S. Miller 3e16afd33f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Missing netlink message sanity check in nfnetlink, patch from
   Mateusz Jurczyk.

2) We now have netfilter per-netns hooks, so let's kill global hook
   infrastructure, this infrastructure is known to be racy with netns.
   We don't care about out of tree modules. Patch from Florian Westphal.

3) find_appropriate_src() is buggy when colissions happens after the
   conversion of the nat bysource to rhashtable. Also from Florian.

4) Remove forward chain in nf_tables arp family, it's useless and it is
   causing quite a bit of confusion, from Florian Westphal.

5) nf_ct_remove_expect() is called with the wrong parameter, causing
   kernel oops, patch from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:01:39 -07:00
Paolo Abeni 0ddf3fb2c4 udp: preserve skb->dst if required for IP options processing
Eric noticed that in udp_recvmsg() we still need to access
skb->dst while processing the IP options.
Since commit 0a463c78d2 ("udp: avoid a cache miss on dequeue")
skb->dst is no more available at recvmsg() time and bad things
will happen if we enter the relevant code path.

This commit address the issue, avoid clearing skb->dst if
any IP options are present into the relevant skb.
Since the IP CB is contained in the first skb cacheline, we can
test it to decide to leverage the consume_stateless_skb()
optimization, without measurable additional cost in the faster
path.

v1 -> v2: updated commit message tags

Fixes: 0a463c78d2 ("udp: avoid a cache miss on dequeue")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 12:00:13 -07:00
David S. Miller 7d2a055b23 Merge branch 'mlxsw-Preparations-for-IPv6-UC-router'
Jiri Pirko says:

====================
mlxsw: Preparations for IPv6 UC router

Ido says:

The purpose of this set is to prepare the driver for the introduction of
IPv6 FIB offload. It's mainly composed of small and non-functional
changes, that either add the IPv6 equivalent of existing IPv4 code or
aimed at making the introduction of IPv6-specific code easier.

The first five patches enable IPv6 forwarding in the device and allow us
to configure router interfaces (RIFs) based on inet6addr notifications.

The next six patches add support for programming IPv6 neighbours into
the device's table as well as dumping their activity and updating the
kernel accordingly.

The last 11 patches extend current infrastructure to allow us to program
IPv6 routes, set catch-all IPv6 trap in case of abort and make the code
more receptive towards up-coming changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:34 -07:00
Ido Schimmel 7dcc18adad mlxsw: spectrum_router: Update prefix count for IPv6
The number of possible prefix lengths for IPv6 is 129 and not 128.

Fixes following warning from UBSAN when /128 routes are offloaded:

 UBSAN: Undefined behaviour in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27 index 128 is out
of range for type 'long unsigned int [128]'

Fixes: 5e9c16cc83 ("mlxsw: spectrum_router: Implement private fib")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:34 -07:00
Ido Schimmel 80c238f91b mlxsw: spectrum_router: Rename functions to add / delete a FIB entry
These functions aren't specific to IPv4 and can be re-used for IPv6.

Drop the '4' designation from their name.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:34 -07:00
Ido Schimmel 9efbee6fea mlxsw: spectrum_router: Drop unnecessary parameter
Functions that take as argument a FIB entry don't need to take FIB node
as well, as it can be extracted from the entry.

Remove unnecessary FIB node parameter.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:34 -07:00
Ido Schimmel 0e6ea2a4ea mlxsw: spectrum_router: Mark IPv4 specific function accordingly
The functions to create and destroy a nexthop group are IPv4 specific
and should be renamed accordingly, so that they won't be confused with
the IPv6 specific functions in follow-up patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 4f1c7f1f2e mlxsw: spectrum_router: Create IPv4 specific entry struct
Some of the parameters stored in the FIB entry structure are specific to
IPv4 and therefore better placed in an IPv4 specific structure.

Create an IPv4 specific structure that encapsulates the common FIB entry
structure and contains IPv4 specific parameters.

In a follow-up patchset an IPv6 specific structure will be introduced.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel bc65a8a4f4 mlxsw: spectrum_router: Set abort trap for IPv6
When we fail to insert a route we invoke the abort mechanism which
flushes all the tables and inserts a default route in each, so that all
packets incoming to the router will be trapped to the CPU.

Upon abort, add an IPv6 default route to the IPv6 tables.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 9dbf4d76d0 mlxsw: spectrum_router: Allow IPv6 routes to be programmed
Take advantage of previous patch and allow the RALUE register to be
called with IPv6 routes.

In order to re-use as much code as possible between IPv4 and IPv6, only
the lowest-level function that actually does the register packing is
demuxed based on the passed protocol.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 62547f407f mlxsw: reg: Update RALUE register with IPv6 support
Update the register so that IPv6 LPM entries could be programmed to the
device's table.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel a3d9bc506d mlxsw: spectrum_router: Extend virtual routers with IPv6 support
A Virtual Router (VR) is an entity which corresponds to a VRF and
performs FIB lookup in an LPM tree according to the {VR, IP Proto} ->
Tree binding.

Extend the virtual router data structure towards IPv6 FIB offload.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 731ea1ca42 mlxsw: spectrum_router: Make FIB node retrieval family agnostic
A FIB node is an entity which stores routes sharing the same prefix and
length. The data structure itself is already family agnostic, but we
make some of its operations agnostic as well and thus re-use them for
IPv6 offload.

Instead of passing an IPv4-specific structure to fib4_node_get(), pass
general routing parameters and rename the function accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 160e22aa26 mlxsw: spectrum_router: Don't create FIB node during lookup
When looking up a FIB entry we shouldn't create the FIB node where it's
supposed to be linked in case the node doesn't already exist.

Instead, lookup the node and fail if it doesn't exist.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Ido Schimmel 58adf2c480 mlxsw: spectrum_router: Don't assume neighbour type
Thankfully, the neighbour subsystem is agnostic to the upper protocol
and used by both IPv4 and IPv6. By removing assumptions regarding the
neighbour type we can thus re-use much of the neighbour-related code for
both IPv4 and IPv6.

For each nexthop, store its gateway IP and for nexthop group store the
neighbour table used by its nexthops.

Use this information throughout the code and remove assumption about the
neighbour type.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky a6c9b5d199 mlxsw: spectrum_router: Set activity interval according to both neighbour tables
The neighbours' activity is currently dumped according to the ARP
table's DELAY_PROBE time, but with the introduction of IPv6 offload we
should set the interval according to the minimum between the ARP and
ndisc tables.

Signed-off-by: Arkadi Sharshvesky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky 60f040ca11 mlxsw: spectrum_router: Periodically dump active IPv6 neighbours
In addition to IPv4, periodically dump IPv6 neighbours and update the
kernel about them.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky 72e8ebe1b3 mlxsw: reg: Update RAUHTD register with IPv6 support
Update the register so that the active IPv6 neighbours could be dumped
from the device's neighbour table.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky d5eb89cf68 mlxsw: spectrum_router: Reflect IPv6 neighbours to the device
As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and
program relevant neighbours to the device's neighbour table.

Note that neighbours with a link-local IP address aren't programmed, as
packets with a link-local destination IP are trapped after LPM lookup
and never reach the neighbour table.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky 6929e50736 mlxsw: reg: Update RAUHT register with IPv6 support
Update the register, so the IPv6 neighbours could be programmed to the
device's neighbour table.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky 5ea1237f94 mlxsw: spectrum_router: Configure RIFs based on IPv6 addresses
When a netdev is configured with an IP address a router interface (RIF)
should be configured for it in the device. Allow configuration of RIFs
based on IPv6 address notifications as well as IPv4.

Note that the RIF exists as long as an IP address is configured on the
netdev, regardless of the address family.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:32 -07:00
Ido Schimmel 0d284818af mlxsw: spectrum_router: Flood unregistered multicast packets to router
Up until now we only flooded broadcast packets to the router when an L3
interface was configured on top of a bridge. However, IPv6 Neighbour
Discovery packets are trapped to the CPU inside the router and these can
be sent with a multicast address.

Flood unregistered multicast packets to the router port, so that
relevant packets could be trapped there.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:57:32 -07:00