Commit Graph

48780 Commits

Author SHA1 Message Date
Andrew Lunn 5bcbe0f35f phy: fixed: Fix removal of phys.
The fixed phys delete function simply removed the fixed phy from the
internal linked list and freed the memory. It however did not
unregister the associated phy device. This meant it was still possible
to find the phy device on the mdio bus.

Make fixed_phy_del() an internal function and add a
fixed_phy_unregister() to unregisters the phy device and then uses
fixed_phy_del() to free resources.

Modify DSA to use this new API function, so we don't leak phys.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 15:43:11 -04:00
Martin KaFai Lau a44d6eacda tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In
Per RFC4898, they count segments sent/received
containing a positive length data segment (that includes
retransmission segments carrying data).  Unlike
tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
carrying no data (e.g. pure ack).

The patch also updates the segs_in in tcp_fastopen_add_skb()
so that segs_in >= data_segs_in property is kept.

Together with retransmission data, tcpi_data_segs_out
gives a better signal on the rxmit rate.

v6: Rebase on the latest net-next

v5: Eric pointed out that checking skb->len is still needed in
tcp_fastopen_add_skb() because skb can carry a FIN without data.
Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
helper is used.  Comment is added to the fastopen case to explain why
segs_in has to be reset and tcp_segs_in() has to be called before
__skb_pull().

v4: Add comment to the changes in tcp_fastopen_add_skb()
and also add remark on this case in the commit message.

v3: Add const modifier to the skb parameter in tcp_segs_in()

v2: Rework based on recent fix by Eric:
commit a9d99ce28e ("tcp: fix tcpi_segs_in after connection establishment")

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Chris Rapier <rapier@psc.edu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 14:55:26 -04:00
Marcin Wojtas f2900acea8 bus: mvebu-mbus: provide api for obtaining IO and DRAM window information
This commit enables finding appropriate mbus window and obtaining its
target id and attribute for given physical address in two separate
routines, both for IO and DRAM windows. This functionality
is needed for Armada XP/38x Network Controller's Buffer Manager and
PnC configuration.

[gregory.clement@free-electrons.com: Fix size test for
mvebu_mbus_get_dram_win_info]

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
[DRAM window information reference in LKv3.10]
Signed-off-by: Evan Wang <xswang@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:19:46 -04:00
David S. Miller f4fa6e6d88 NFC 4.6 pull request
This is a very small one this time, with only 5 patches.
 There are a couple of big items that could not be merged/finished
 on time.
 
 We have:
 
 - 2 LLCP fixes for a race and a potential OOM.
 - 2 cleanups for the pn544 and microread drivers.
 - 1 Maintainer addition for the s3fwrn5 driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW4ygDAAoJEIqAPN1PVmxKQ5gQAIN5MaT6gcu2/p1kd7FzPKyC
 KRb2EU907uVQObNLyZgU6liLYQ2prAvK7Dr9SuL10sl+zJkdU2TUuPfK+G4I50qQ
 nY7dHN94S0d1F6vBZVKFslh9+mrJKZCqhyN7WrCdXT2i7W5bYO8lw419nADImAq9
 fF4k7Ww+t9ZHGlzYun9WmNl8lGYz27w/sjjqnEbD8G+SZlkrqRNwccILP7rmOso2
 orGvzxouGiWmp4XcXno2X3ydfFVTl5LdVeaX6oyKxEBpcikz0QFJUmTcqqy0GqGo
 NeAGUip36MaJETXVgqKqeZERrST2Eminps3nTEwftd/ukOE+Sa8tX3spe2seN7mm
 o9ZBk8a0VChlhfGmIqnYZfksRyacEmVaLFV2W5EQdumL+G+jelCmpPgQKo0ZxEEX
 WTYULG5JiSIu34RjtTAP9tOPQ9XTV9hcArRYR355aki5EXGh/ntY1eVtV2nCSHjA
 vNZ0hpJNwfcx/0UlVULEJqxwDYYtl/QzJFTQO4l0zFYgf5SK9DXbOGDcZfoKgFf8
 7jzLn/hDzHNjCPocHDkXOq50H/pPHwdV65U4hPMnemcGVuVrNzHr5CyKBDgj7nXN
 NYP3S6fphg8W/DIlQFaTRAuBCqanZE5DIqFUgptSSdAmBLKagZKkaypUjCscb7Ee
 GinQKHm7YgZWG+YeAs4O
 =W1gM
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC 4.6 pull request

This is a very small one this time, with only 5 patches.
There are a couple of big items that could not be merged/finished
on time.

We have:

- 2 LLCP fixes for a race and a potential OOM.
- 2 cleanups for the pn544 and microread drivers.
- 1 Maintainer addition for the s3fwrn5 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 22:43:01 -04:00
Sabrina Dubroca 3c17578473 net: add MACsec netdevice priv_flags and helper
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 22:40:24 -04:00
LABBE Corentin 470c3822d2 phy: remove documentation of removed members of phy_device structure
Commit e5a03bfd87 ("phy: Add an mdio_device structure") removed addr,
bus and dev member of the phy_device structure.
This patch remove the documentation about those members.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 22:11:43 -04:00
David S. Miller 00e3d2ef18 wireless-drivers patches for 4.6
Major changes:
 
 ath10k
 
 * dt: add bindings for ipq4019 wifi block
 * start adding support for qca4019 chip
 
 ath9k
 
 * add device ID for Toshiba WLM-20U2/GN-1080
 * allow more than one interface on DFS channels
 
 bcma
 
 * move flash detection code to ChipCommon core driver
 
 brcmfmac
 
 * IPv6 Neighbor discovery offload
 * driver settings that can be populated from different sources
 * country code setting in firmware
 * length checks to validate firmware events
 * new way to determine device memory size needed for BCM4366
 * various offloads during Wake on Wireless LAN (WoWLAN)
 * full Management Frame Protection (MFP) support
 
 iwlwifi
 
 * add support for thermal device / cooling device
 * improvements in scheduled scan without profiles
 * new firmware support (-21.ucode)
 * add MSIX support for 9000 devices
 * enable MU-MIMO and take care of firmware restart
 * add support for large SKBs in mvm to reach A-MSDU
 * add support for filtering frames from a BA session
 * start implementing the new Rx path for 9000 devices
 * enable the new Radio Resource Management (RRM) nl80211 feature flag
 * add a new module paramater to disable VHT
 * build infrastructure for Dynamic Queue Allocation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJW4HwrAAoJEG4XJFUm622bvFUH/ArZD53Jh8btu8ukmkgKOPkc
 hCnvR639TURCNkC/e1lR0MFjO1QLLZ2m1tdRoZQfLiZm63HUuQzPDmaVnTeVfjrI
 4p3LmYriTECvgLoqVJgmBjNWiC61fMbWTJ91YqQiw2ZhvuKbcsu6oz/jU9MyCLyJ
 7WSk+HUqAnwtj7z515vAYQYapdUbxU1u7m/NgYdiYKTXfBR2ozUbfDR18Ey2EBWC
 KkDpFXyxo7ZByXzVA1B1UogB9NteV7IV1+WHphIX/XGdVQPpwRV3KxmLqKjIWW5E
 1Cv3q05vWapev9V5ZYghLpAGUQTu8h0nH2v0bJa9nSiQX23/Vkz7xxA/hi5gBOo=
 =aqji
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2016-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers patches for 4.6

Major changes:

ath10k

* dt: add bindings for ipq4019 wifi block
* start adding support for qca4019 chip

ath9k

* add device ID for Toshiba WLM-20U2/GN-1080
* allow more than one interface on DFS channels

bcma

* move flash detection code to ChipCommon core driver

brcmfmac

* IPv6 Neighbor discovery offload
* driver settings that can be populated from different sources
* country code setting in firmware
* length checks to validate firmware events
* new way to determine device memory size needed for BCM4366
* various offloads during Wake on Wireless LAN (WoWLAN)
* full Management Frame Protection (MFP) support

iwlwifi

* add support for thermal device / cooling device
* improvements in scheduled scan without profiles
* new firmware support (-21.ucode)
* add MSIX support for 9000 devices
* enable MU-MIMO and take care of firmware restart
* add support for large SKBs in mvm to reach A-MSDU
* add support for filtering frames from a BA session
* start implementing the new Rx path for 9000 devices
* enable the new Radio Resource Management (RRM) nl80211 feature flag
* add a new module paramater to disable VHT
* build infrastructure for Dynamic Queue Allocation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 15:03:34 -04:00
Stephen Hemminger 4c656c13b2 bridge: allow zero ageing time
This fixes a regression in the bridge ageing time caused by:
commit c62987bbd8 ("bridge: push bridge setting ageing_time down to switchdev")

There are users of Linux bridge which use the feature that if ageing time
is set to 0 it causes entries to never expire. See:
  https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge

For a pure software bridge, it is unnecessary for the code to have
arbitrary restrictions on what values are allowable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11 14:58:58 -05:00
Amir Vadai 5b33f48842 net/flower: Introduce hardware offload support
This patch is based on a patch made by John Fastabend.
It adds support for offloading cls_flower.
when NETIF_F_HW_TC is on:
  flags = 0       => Rule will be processed twice - by hardware, and if
                     still relevant, by software.
  flags = SKIP_HW => Rull will be processed by software only

If hardware fail/not capabale to apply the rule, operation will NOT
fail. Filter will be processed by SW only.

Acked-by: Jiri Pirko <jiri@mellanox.com>
Suggested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10 16:24:02 -05:00
Willem de Bruijn 2793a23aac net: validate variable length ll headers
Netdevice parameter hard_header_len is variously interpreted both as
an upper and lower bound on link layer header length. The field is
used as upper bound when reserving room at allocation, as lower bound
when validating user input in PF_PACKET.

Clarify the definition to be maximum header length. For validation
of untrusted headers, add an optional validate member to header_ops.

Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance
for deliberate testing of corrupt input. In this case, pad trailing
bytes, as some device drivers expect completely initialized headers.

See also http://comments.gmane.org/gmane.linux.network/401064

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 22:13:01 -05:00
Jean Delvare 87aca73737 NFC: microread: Drop platform data header file
Originally I only wanted to drop the unneeded inclusion of
<linux/i2c.h>, but then noticed that struct
microread_nfc_platform_data isn't actually used, and
MICROREAD_DRIVER_NAME is redefined in the only file where it is used,
so we can get rid of the header file and dead code altogether.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-03-09 23:25:45 +01:00
Tom Herbert ab7ac4eb98 kcm: Kernel Connection Multiplexor module
This module implements the Kernel Connection Multiplexor.

Kernel Connection Multiplexor (KCM) is a facility that provides a
message based interface over TCP for generic application protocols.
With KCM an application can efficiently send and receive application
protocol messages over TCP using datagram sockets.

For more information see the included Documentation/networking/kcm.txt

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:14 -05:00
Tom Herbert f092276d85 net: Add MSG_BATCH flag
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to
indicate that more messages will follow (i.e. a batch of messages is
being sent). This is similar to MSG_MORE except that the following
messages are not merged into one packet, they are sent individually.
sendmmsg is updated so that each contained message except for the
last one is marked as MSG_BATCH.

MSG_BATCH is a performance optimization in cases where a socket
implementation can benefit by transmitting packets in a batch.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert f4a00aacdb net: Make sock_alloc exportable
Export it for cases where we want to create sockets by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert ff3c44e675 rcu: Add list_next_or_null_rcu
This is a convenience function that returns the next entry in an RCU
list or NULL if at the end of the list.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Alexei Starovoitov 557c0c6e7d bpf: convert stackmap to pre-allocation
It was observed that calling bpf_get_stackid() from a kprobe inside
slub or from spin_unlock causes similar deadlock as with hashmap,
therefore convert stackmap to use pre-allocated memory.

The call_rcu is no longer feasible mechanism, since delayed freeing
causes bpf_get_stackid() to fail unpredictably when number of actual
stacks is significantly less than user requested max_entries.
Since elements are no longer freed into slub, we can push elements into
freelist immediately and let them be recycled.
However the very unlikley race between user space map_lookup() and
program-side recycling is possible:
     cpu0                          cpu1
     ----                          ----
user does lookup(stackidX)
starts copying ips into buffer
                                   delete(stackidX)
                                   calls bpf_get_stackid()
				   which recyles the element and
                                   overwrites with new stack trace

To avoid user space seeing a partial stack trace consisting of two
merged stack traces, do bucket = xchg(, NULL); copy; xchg(,bucket);
to preserve consistent stack trace delivery to user space.
Now we can move memset(,0) of left-over element value from critical
path of bpf_get_stackid() into slow-path of user space lookup.
Also disallow lookup() from bpf program, since it's useless and
program shouldn't be messing with collected stack trace.

Note that similar race between user space lookup and kernel side updates
is also present in hashmap, but it's not a new race. bpf programs were
always allowed to modify hash and array map elements while user space
is copying them.

Fixes: d5a3b1f691 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 15:28:31 -05:00
Alexei Starovoitov 6c90598174 bpf: pre-allocate hash map elements
If kprobe is placed on spin_unlock then calling kmalloc/kfree from
bpf programs is not safe, since the following dead lock is possible:
kfree->spin_lock(kmem_cache_node->lock)...spin_unlock->kprobe->
bpf_prog->map_update->kmalloc->spin_lock(of the same kmem_cache_node->lock)
and deadlocks.

The following solutions were considered and some implemented, but
eventually discarded
- kmem_cache_create for every map
- add recursion check to slow-path of slub
- use reserved memory in bpf_map_update for in_irq or in preempt_disabled
- kmalloc via irq_work

At the end pre-allocation of all map elements turned out to be the simplest
solution and since the user is charged upfront for all the memory, such
pre-allocation doesn't affect the user space visible behavior.

Since it's impossible to tell whether kprobe is triggered in a safe
location from kmalloc point of view, use pre-allocation by default
and introduce new BPF_F_NO_PREALLOC flag.

While testing of per-cpu hash maps it was discovered
that alloc_percpu(GFP_ATOMIC) has odd corner cases and often
fails to allocate memory even when 90% of it is free.
The pre-allocation of per-cpu hash elements solves this problem as well.

Turned out that bpf_map_update() quickly followed by
bpf_map_lookup()+bpf_map_delete() is very common pattern used
in many of iovisor/bcc/tools, so there is additional benefit of
pre-allocation, since such use cases are must faster.

Since all hash map elements are now pre-allocated we can remove
atomic increment of htab->count and save few more cycles.

Also add bpf_map_precharge_memlock() to check rlimit_memlock early to avoid
large malloc/free done by users who don't have sufficient limits.

Pre-allocation is done with vmalloc and alloc/free is done
via percpu_freelist. Here are performance numbers for different
pre-allocation algorithms that were implemented, but discarded
in favor of percpu_freelist:

1 cpu:
pcpu_ida	2.1M
pcpu_ida nolock	2.3M
bt		2.4M
kmalloc		1.8M
hlist+spinlock	2.3M
pcpu_freelist	2.6M

4 cpu:
pcpu_ida	1.5M
pcpu_ida nolock	1.8M
bt w/smp_align	1.7M
bt no/smp_align	1.1M
kmalloc		0.7M
hlist+spinlock	0.2M
pcpu_freelist	2.0M

8 cpu:
pcpu_ida	0.7M
bt w/smp_align	0.8M
kmalloc		0.4M
pcpu_freelist	1.5M

32 cpu:
kmalloc		0.13M
pcpu_freelist	0.49M

pcpu_ida nolock is a modified percpu_ida algorithm without
percpu_ida_cpu locks and without cross-cpu tag stealing.
It's faster than existing percpu_ida, but not as fast as pcpu_freelist.

bt is a variant of block/blk-mq-tag.c simlified and customized
for bpf use case. bt w/smp_align is using cache line for every 'long'
(similar to blk-mq-tag). bt no/smp_align allocates 'long'
bitmasks continuously to save memory. It's comparable to percpu_ida
and in some cases faster, but slower than percpu_freelist

hlist+spinlock is the simplest free list with single spinlock.
As expeceted it has very bad scaling in SMP.

kmalloc is existing implementation which is still available via
BPF_F_NO_PREALLOC flag. It's significantly slower in single cpu and
in 8 cpu setup it's 3 times slower than pre-allocation with pcpu_freelist,
but saves memory, so in cases where map->max_entries can be large
and number of map update/delete per second is low, it may make
sense to use it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 15:28:31 -05:00
Alexei Starovoitov b121d1e74d bpf: prevent kprobe+bpf deadlocks
if kprobe is placed within update or delete hash map helpers
that hold bucket spin lock and triggered bpf program is trying to
grab the spinlock for the same bucket on the same cpu, it will
deadlock.
Fix it by extending existing recursion prevention mechanism.

Note, map_lookup and other tracing helpers don't have this problem,
since they don't hold any locks and don't modify global data.
bpf_trace_printk has its own recursive check and ok as well.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 15:28:30 -05:00
David S. Miller 4c38cd61ae Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter updates for your net-next tree,
they are:

1) Remove useless debug message when deleting IPVS service, from
   Yannick Brosseau.

2) Get rid of compilation warning when CONFIG_PROC_FS is unset in
   several spots of the IPVS code, from Arnd Bergmann.

3) Add prandom_u32 support to nft_meta, from Florian Westphal.

4) Remove unused variable in xt_osf, from Sudip Mukherjee.

5) Don't calculate IP checksum twice from netfilter ipv4 defrag hook
   since fixing af_packet defragmentation issues, from Joe Stringer.

6) On-demand hook registration for iptables from netns. Instead of
   registering the hooks for every available netns whenever we need
   one of the support tables, we register this on the specific netns
   that needs it, patchset from Florian Westphal.

7) Add missing port range selection to nf_tables masquerading support.

BTW, just for the record, there is a typo in the description of
5f6c253ebe ("netfilter: bridge: register hooks only when bridge
interface is added") that refers to the cluster match as deprecated, but
it is actually the CLUSTERIP target (which registers hooks
inconditionally) the one that is scheduled for removal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 14:25:20 -05:00
David S. Miller 810813c47a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 12:34:12 -05:00
Linus Torvalds e2857b8f11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix ordering of WEXT netlink messages so we don't see a newlink
    after a dellink, from Johannes Berg.

 2) Out of bounds access in minstrel_ht_set_best_prob_rage, from
    Konstantin Khlebnikov.

 3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb.

 4) Wrong units used to set initial TCP rto from cached metrics, also
    from Konstantin Khlebnikov.

 5) Fix stale IP options data in the SKB control block from leaking
    through layers of encapsulation, from Bernie Harris.

 6) Zero padding len miscalculated in bnxt_en, from Michael Chan.

 7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix
    from Hannes Frederic Sowa.

 8) Fix suspend/resume with JME networking devices, from Diego Violat
    and Guo-Fu Tseng.

 9) Checksums not validated properly in bridge multicast support due to
    the placement of the SKB header pointers at the time of the check,
    fix from Álvaro Fernández Rojas.

10) Fix hang/tiemout with r8169 if a stats fetch is done while the
    device is runtime suspended.  From Chun-Hao Lin.

11) The forwarding database netlink dump facilities don't track the
    state of the dump properly, resulting in skipped/missed entries.
    From Minoura Makoto.

12) Fix regression from a recent 3c59x bug fix, from Neil Horman.

13) Fix list corruption in bna driver, from Ivan Vecera.

14) Big endian machines crash on vlan add in bnx2x, fix from Michal
    Schmidt.

15) Ethtool RSS configuration not propagated properly in mlx5 driver,
    from Tariq Toukan.

16) Fix regression in PHY probing in stmmac driver, from Gabriel
    Fernandez.

17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin
    Poirier.

18) A past change to skip empty routing headers in ipv6 extention header
    parsing accidently caused fragment headers to not be matched any
    longer.  Fix from Florian Westphal.

19) eTSEC-106 erratum needs to be applied to more gianfar chips, from
    Atsushi Nemoto.

20) Fix netdev reference after free via workqueues in usb networking
    drivers, from Oliver Neukum and Bjørn Mork.

21) mdio->irq is now an array rather than a pointer to dynamic memory,
    but several drivers were still trying to free it :-/ Fixes from
    Colin Ian King.

22) act_ipt iptables action forgets to set the family field, thus LOG
    netfilter targets don't work with it.  Fix from Phil Sutter.

23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon.

24) pskb_may_pull() cannot be called with interrupts disabled, fix code
    that tries to do this in vmxnet3 driver, from Neil Horman.

25) be2net driver leaks iomap'd memory on removal, fix from Douglas
    Miller.

26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths,
    from Guillaume Nault.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits)
  ppp: release rtnl mutex when interface creation fails
  cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  tcp: fix tcpi_segs_in after connection establishment
  net: hns: fix the bug about loopback
  jme: Fix device PM wakeup API usage
  jme: Do not enable NIC WoL functions on S0
  udp6: fix UDP/IPv6 encap resubmit path
  be2net: Don't leak iomapped memory on removal.
  vmxnet3: avoid calling pskb_may_pull with interrupts disabled
  net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
  ibmveth: check return of skb_linearize in ibmveth_start_xmit
  cdc_ncm: toggle altsetting to force reset before setup
  usbnet: cleanup after bind() in probe()
  mlxsw: pci: Correctly determine if descriptor queue is full
  mlxsw: spectrum: Always decrement bridge's ref count
  tipc: fix nullptr crash during subscription cancel
  net: eth: altera: do not free array priv->mdio->irq
  net/ethoc: do not free array priv->mdio->irq
  net: sched: fix act_ipt for LOG target
  asix: do not free array priv->mdio->irq
  ...
2016-03-07 15:41:10 -08:00
Manish Chopra 088c861830 qed/qede: Add infrastructure support for hardware GRO
This patch adds mainly structures and APIs prototype changes
in order to give support for qede slowpath/fastpath support
for the same.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07 15:01:39 -05:00
Rafał Miłecki d6a3b51ada bcma: move parallel flash support to separated file
This follows the way of handling other flashes and cleans code a bit. As
next task we will want to move flash code to ChipCommon driver as:
1) Flash controllers are accesible using ChipCommon registers
2) This code isn't MIPS specific
This change prepares bcma for that.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-07 14:41:08 +02:00
Rafał Miłecki 2e62f9b2a4 bcma: drop unneeded fields from bcma_pflash struct
Most of info stored in this struct wasn't really used anywhere as we put
all that data in platform data & resource as well.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-07 14:41:08 +02:00
Hante Meuleman 4d79289598 brcmfmac: switch to new platform data
Platform data is only available for sdio. With this patch a new
platform data structure is being used which allows for platform
data for any device and configurable per device. This patch only
switches to the new structure and adds support for SDIO devices.

Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-03-07 14:15:50 +02:00
Linus Torvalds 21b27a74ec Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fix from Sage Weil:
 "This is a final commit we missed to align the protocol compatibility
  with the feature bits.

  It decodes a few extra fields in two different messages and reports
  EIO when they are used (not yet supported)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
2016-03-06 11:31:13 -08:00
Linus Torvalds fab3e94a62 Merge branch 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Assorted fixes for libata drivers.

   - Turns out HDIO_GET_32BIT ioctl was subtly broken all along.

   - Recent update to ahci external port handling was incorrectly
     marking hotpluggable ports as external making userland handle
     devices connected to those ports incorrectly.

   - ahci_xgene needs its own irq handler to work around a hardware
     erratum.  libahci updated to allow irq handler override.

   - Misc driver specific updates"

* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata: ahci: don't mark HotPlugCapable Ports as external/removable
  ahci: Workaround for ThunderX Errata#22536
  libata: Align ata_device's id on a cacheline
  Adding Intel Lewisburg device IDs for SATA
  pata-rb532-cf: get rid of the irq_to_gpio() call
  libata: fix HDIO_GET_32BIT ioctl
  ahci_xgene: Implement the workaround to fix the missing of the edge interrupt for the HOST_IRQ_STAT.
  ata: Remove the AHCI_HFLAG_EDGE_IRQ support from libahci.
  libahci: Implement the capability to override the generic ahci interrupt handler.
2016-03-04 18:31:36 -08:00
Linus Torvalds e5322c5406 Merge branch 'for-linus2' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Round 2 of this.  I cut back to the bare necessities, the patch is
  still larger than it usually would be at this time, due to the number
  of NVMe fixes in there.  This pull request contains:

   - The 4 core fixes from Ming, that fix both problems with exceeding
     the virtual boundary limit in case of merging, and the gap checking
     for cloned bio's.

   - NVMe fixes from Keith and Christoph:

        - Regression on larger user commands, causing problems with
          reading log pages (for instance). This touches both NVMe,
          and the block core since that is now generally utilized also
          for these types of commands.

        - Hot removal fixes.

        - User exploitable issue with passthrough IO commands, if !length
          is given, causing us to fault on writing to the zero
          page.

        - Fix for a hang under error conditions

   - And finally, the current series regression for umount with cgroup
     writeback, where the final flush would happen async and hence open
     up window after umount where the device wasn't consistent.  fsck
     right after umount would show this.  From Tejun"

* 'for-linus2' of git://git.kernel.dk/linux-block:
  block: support large requests in blk_rq_map_user_iov
  block: fix blk_rq_get_max_sectors for driver private requests
  nvme: fix max_segments integer truncation
  nvme: set queue limits for the admin queue
  writeback: flush inode cgroup wb switches instead of pinning super_block
  NVMe: Fix 0-length integrity payload
  NVMe: Don't allow unsupported flags
  NVMe: Move error handling to failed reset handler
  NVMe: Simplify device reset failure
  NVMe: Fix namespace removal deadlock
  NVMe: Use IDA for namespace disk naming
  NVMe: Don't unmap controller registers on reset
  block: merge: get the 1st and last bvec via helpers
  block: get the 1st and last bvec via helpers
  block: check virt boundary in bio_will_gap()
  block: bio: introduce helpers to get the 1st and last bvec
2016-03-04 18:17:17 -08:00
Linus Torvalds 78baab7aa8 A feature was added in 4.3 that allowed users to filter trace points on
a tasks "comm" field. But this prevented filtering on a comm field that
 is within a trace event (like sched_migrate_task).
 
 When trying to filter on when a program migrated, this change prevented
 the filtering of the sched_migrate_task.
 
 To fix this, the event fields are examined first, and then the extra fields
 like "comm" and "cpu" are examined. Also, instead of testing to assign
 the comm filter function based on the field's name, the generic comm field
 is given a new filter type (FILTER_COMM). When this field is used to filter
 the type is checked. The same is done for the cpu filter field.
 
 Two new special filter types are added: "COMM" and "CPU". This allows users
 to still filter the tasks comm for events that have "comm" as one of their
 fields, in cases that users would like to filter sched_migrate_task on the
 comm of the task that called the event, and not the comm of the task that
 is being migrated.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW2argAAoJEKKk/i67LK/8b78H/32nYPizDIsK/p2bL1mgbtMl
 vrkcfb+maPOC7cjB+CdQmyV4EIVpSn06XFouYghGprdoVocVyBuIflxn0j3Gbymy
 zLCg8lR70KTATTqst1wsWMbnh+UvAKNEiXj8jf2qcK2xhgalXMDwsTC4+LDlLugu
 YAx89lmsjK1YpP/wIzMww2jQG+07Nhm9gHWXF2MC3egZ+sgYxARnfds0yTcGgS8o
 dc/yJGZDCI44JMDNThcCFxNvsmoTa9tpm+JNe2YTht6KCympa+Ht9Jj9MMlD06cq
 M5CqMQlok+mrVsW5LbJPCk1u83ynr6d/PcPQuT2nykRx8bGvKjA7AKMPaxw1Jz4=
 =ixBz
 -----END PGP SIGNATURE-----

Merge tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "A feature was added in 4.3 that allowed users to filter trace points
  on a tasks "comm" field.  But this prevented filtering on a comm field
  that is within a trace event (like sched_migrate_task).

  When trying to filter on when a program migrated, this change
  prevented the filtering of the sched_migrate_task.

  To fix this, the event fields are examined first, and then the extra
  fields like "comm" and "cpu" are examined.  Also, instead of testing
  to assign the comm filter function based on the field's name, the
  generic comm field is given a new filter type (FILTER_COMM).  When
  this field is used to filter the type is checked.  The same is done
  for the cpu filter field.

  Two new special filter types are added: "COMM" and "CPU".  This allows
  users to still filter the tasks comm for events that have "comm" as
  one of their fields, in cases that users would like to filter
  sched_migrate_task on the comm of the task that called the event, and
  not the comm of the task that is being migrated"

* tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not have 'comm' filter override event 'comm' field
2016-03-04 16:57:04 -08:00
Nicolas Dichtel b5d3755a22 uapi: define DIV_ROUND_UP for userland
DIV_ROUND_UP is defined in linux/kernel.h only for the kernel.
When ethtool.h is included by a userland app, we got the following error:

include/linux/ethtool.h:1218:8: error: variably modified 'queue_mask' at file scope
  __u32 queue_mask[DIV_ROUND_UP(MAX_NUM_QUEUE, 32)];
        ^

Let's add a common definition in uapi and use it everywhere.

Fixes: ac2c7ad0e5 ("net/ethtool: introduce a new ioctl for per queue setting")
CC: Kan Liang <kan.liang@intel.com>
Suggested-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-04 16:10:36 -05:00
Yan, Zheng 5ea5c5e0a7 ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
Add support for the format change of MClientReply/MclientCaps.
Also add code that denies access to inodes with pool_ns layouts.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-03-04 21:00:37 +01:00
Steven Rostedt (Red Hat) e57cbaf0eb tracing: Do not have 'comm' filter override event 'comm' field
Commit 9f61668073 "tracing: Allow triggers to filter for CPU ids and
process names" added a 'comm' filter that will filter events based on the
current tasks struct 'comm'. But this now hides the ability to filter events
that have a 'comm' field too. For example, sched_migrate_task trace event.
That has a 'comm' field of the task to be migrated.

 echo 'comm == "bash"' > events/sched_migrate_task/filter

will now filter all sched_migrate_task events for tasks named "bash" that
migrates other tasks (in interrupt context), instead of seeing when "bash"
itself gets migrated.

This fix requires a couple of changes.

1) Change the look up order for filter predicates to look at the events
   fields before looking at the generic filters.

2) Instead of basing the filter function off of the "comm" name, have the
   generic "comm" filter have its own filter_type (FILTER_COMM). Test
   against the type instead of the name to assign the filter function.

3) Add a new "COMM" filter that works just like "comm" but will filter based
   on the current task, even if the trace event contains a "comm" field.

Do the same for "cpu" field, adding a FILTER_CPU and a filter "CPU".

Cc: stable@vger.kernel.org # v4.3+
Fixes: 9f61668073 "tracing: Allow triggers to filter for CPU ids and process names"
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-04 09:57:10 -05:00
Christoph Hellwig f21018427c block: fix blk_rq_get_max_sectors for driver private requests
Driver private request types should not get the artifical cap for the
FS requests.  This is important to use the full device capabilities
for internal command or NVMe pass through commands.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

Updated by me to use an explicit check for the one command type that
does support extended checking, instead of relying on the ordering
of the enum command values - as suggested by Keith.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:43:45 -07:00
Tejun Heo a1a0e23e49 writeback: flush inode cgroup wb switches instead of pinning super_block
If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching.  Before 5ff8eaac16 ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching.  5ff8eaac16 fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.

This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches.  wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.

v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
    check in inode_switch_wbs() as Jan an Al suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: 5ff8eaac16 ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:50 -07:00
Ming Lei 25e71a99f1 block: get the 1st and last bvec via helpers
This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Ming Lei e0af29171a block: check virt boundary in bio_will_gap()
In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Ming Lei 7bcd79ac50 block: bio: introduce helpers to get the 1st and last bvec
The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.

Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.

This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:42:49 -07:00
Benjamin Poirier 1837b2e2bc mld, igmp: Fix reserved tailroom calculation
The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
	reserved_tailroom = skb_tailroom - mtu
else:
	reserved_tailroom = tlen

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 15:41:07 -05:00
Gabriel Fernandez 88f8b1bb41 stmmac: Fix 'eth0: No PHY found' regression
This patch manages the case when you have an Ethernet MAC with
a "fixed link", and not connected to a normal MDIO-managed PHY device.

The test of phy_bus_name was not helpful because it was never affected
and replaced by the mdio test node.

Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 15:02:05 -05:00
Tariq Toukan bdfc028de1 net/mlx5e: Fix ethtool RX hash func configuration change
We should modify TIRs explicitly to apply the new RSS configuration.
The light ndo close/open calls do not "refresh" them.

Fixes: 2d75b2bc8a ('net/mlx5e: Add ethtool RSS configuration options')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:37:25 -05:00
Giuseppe Cavallaro afea03656a stmmac: rework DMA bus setting and introduce new platform AXI structure
This patch restructures the DMA bus settings and this is done
by introducing a new platform structure used for programming
the AXI Bus Mode Register inside the DMA module.
This structure can be populated from device-tree as documented in the
binding txt file.

After initializing the DMA, the AXI register can be optionally tuned
for platform drivers based.
This patch also reworks some parameters to make coherent the DMA
configuration now that AXI register is introduced.
For example, the burst_len is managed by using the mentioned axi
support above; so the snps,burst-len parameter has been removed.
It makes sense to provide the AAL parameter from DT to Address-Aligned
Beats inside the Register0 and review the PBL settings when initialize
the engine.

For PCI glue, rebuilding the story of this setting, it
was added to align a configuration so not for fixing some
known problem. No issue raised after this patch.
It is safe to use the default burst length instead of
tuning it to the maximum value

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:30 -05:00
Florian Westphal af4610c395 netfilter: don't call hooks unless needed
With the previous patches in place, a netns nf_hook_list might be empty,
even if e.g. init_net performs filtering.

Thus change nf_hook_thresh to check the hook_list as well before
initializing hook_state and calling nf_hook_slow().

We still make use of static keys; if no netfilter modules are loaded
list is guaranteed to be empty.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02 20:05:26 +01:00
Florian Westphal b9e69e1273 netfilter: xtables: don't hook tables by default
delay hook registration until the table is being requested inside a
namespace.

Historically, a particular table (iptables mangle, ip6tables filter, etc)
was registered on module load.

When netns support was added to iptables only the ip/ip6tables ruleset was
made namespace aware, not the actual hook points.

This means f.e. that when ipt_filter table/module is loaded on a system,
then each namespace on that system has an (empty) iptables filter ruleset.

In other words, if a namespace sends a packet, such skb is 'caught' by
netfilter machinery and fed to hooking points for that table (i.e. INPUT,
FORWARD, etc).

Thanks to Eric Biederman, hooks are no longer global, but per namespace.

This means that we can avoid allocation of empty ruleset in a namespace and
defer hook registration until we need the functionality.

We register a tables hook entry points ONLY in the initial namespace.
When an iptables get/setockopt is issued inside a given namespace, we check
if the table is found in the per-namespace list.

If not, we attempt to find it in the initial namespace, and, if found,
create an empty default table in the requesting namespace and register the
needed hooks.

Hook points are destroyed only once namespace is deleted, there is no
'usage count' (it makes no sense since there is no 'remove table' operation
in xtables api).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02 20:05:24 +01:00
Florian Westphal a67dd266ad netfilter: xtables: prepare for on-demand hook register
This change prepares for upcoming on-demand xtables hook registration.

We change the protoypes of the register/unregister functions.
A followup patch will then add nf_hook_register/unregister calls
to the iptables one.

Once a hook is registered packets will be picked up, so all assignments
of the form

net->ipv4.iptable_$table = new_table

have to be moved to ip(6)t_register_table, else we can see NULL
net->ipv4.iptable_$table later.

This patch doesn't change functionality; without this the actual change
simply gets too big.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02 20:05:23 +01:00
Linus Torvalds f691b77b1f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull d_inode/d_flags race fix from Al Viro.

I love this fix.  Not only does it fix the race in the dentry type
handling, it entirely gets rid of the nasty and subtle memory ordering
rules for d_type and d_inode, and replaces them with the basic dentry
locking rules (sequence numbers under RCU, d_lock elsewhere).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use ->d_seq to get coherency between ->d_inode and ->d_flags
2016-03-01 15:30:45 -08:00
Yuval Mintz 4ac801b77e qed: Semantic refactoring of interrupt code
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:39:49 -05:00
WANG Cong 64d4e3431e net: remove skb_sender_cpu_clear()
After commit 52bd2d62ce ("net: better skb->sender_cpu and skb->napi_id cohabitation")
skb_sender_cpu_clear() becomes empty and can be removed.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:36:47 -05:00
Moshe Lazer 0ba422410b net/mlx5: Fix global UAR mapping
Avoid double mapping of io mapped memory, Device page may be
mapped to non-cached(NC) or to write-combining(WC).
The code before this fix tries to map it both to WC and NC
contrary to what stated in Intel's software developer manual.

Here we remove the global WC mapping of all UARS
"dev->priv.bf_mapping", since UAR mapping should be decided
per UAR (e.g we want different mappings for EQs, CQs vs QPs).

Caller will now have to choose whether to map via
write-combining API or not.

mlx5e SQs will choose write-combining in order to perform
BlueFlame writes.

Fixes: 88a85f99e5 ('TX latency optimization to save DMA reads')
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Reviewed-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:28:00 -05:00
Or Gerlitz 6b6c07bdcd net/mlx5: Make command timeout way shorter
The command timeout is terribly long, whole two hours. Make it 60s so if
things do go wrong, the user gets feedback in relatively short time, so
they can take corrective actions and/or investigate using tools and such.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:28:00 -05:00
David S. Miller d67703fced Here's another round of updates for -next:
* big A-MSDU RX performance improvement (avoid linearize of paged RX)
  * rfkill changes: cleanups, documentation, platform properties
  * basic PBSS support in cfg80211
  * MU-MIMO action frame processing support
  * BlockAck reordering & duplicate detection offload support
  * various cleanups & little fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJW0LZ0AAoJEGt7eEactAAdde0P/2meIOehuHBuAtL7REVoNhri
 bz9eSHMTg+ozCspL7F6vW1ifDI9AaJEaqJccmriueE/UQVC3VXRPGRJ4SFCwZGo9
 Zrtys2v9wOq0+XhxyN65Ucf41O9F/5FFabR5OFbf/pZhW5b2cubEjD1P4BB76Iya
 8O6wf9oDDjt3zJgYK+sygm3k9wtDVrH3qEbj8IDnCy22P7010qCsfok9swfaq8OB
 DBgb6BVfDOFTNXvJGH5fRuUKZdtovzzxorXnoG+zjmKmFdMVdgIYj9+2QfnMjW03
 B4/W85svcLLH8V3lHZc4G8oKM4J4XtjH1PskKIMF7ThJsKGMf8tL2vpt9rr8iscd
 Y9SwTEGc9JmhL7n2FaQFlY6ScLcp4ML+2rXxDOMpBmgF3Ne3yfBsJhLKZEl8vSfI
 mKhzGXpUKjJxJWIxkR0ylJy4/zHeIXkgRlUEhb8t+jgAqvOBTwiVY+vljHCDUERa
 sH40r1OqnGJtOHkSRqXSpxwXW+eKgyDd7fnnRX/tyttp2Fuew27/fN63SjpsfN6O
 3lfSM5bl3FcCKx7vqTLuqzsoqGvDDYkSq6GDfKDqeZIk0vaXA3SJNEOKgymFWQfR
 rzsaXvTbBT34GYRg3xS2NCxlmcBPemei/q0x6ZOffxhF41Qpqjs1dPB1Yq3AW4jD
 HGF+NdRbWEqEFVIjQa8w
 =JHOe
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2016-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Here's another round of updates for -next:
 * big A-MSDU RX performance improvement (avoid linearize of paged RX)
 * rfkill changes: cleanups, documentation, platform properties
 * basic PBSS support in cfg80211
 * MU-MIMO action frame processing support
 * BlockAck reordering & duplicate detection offload support
 * various cleanups & little fixes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:03:27 -05:00