Commit Graph

39943 Commits

Author SHA1 Message Date
Hannes Frederic Sowa 405c92f7a5 ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those warn about them once and handle them gracefully by recalculating
the checksum.

Fixes: commit 32dce968dd ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01 12:01:28 -05:00
Hannes Frederic Sowa 682b1a9d3f ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets
We cannot reliable calculate packet size on MSG_MORE corked sockets
and thus cannot decide if they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.

The IPv6 code also intended to protect and not use CHECKSUM_PARTIAL in
the existence of IPv6 extension headers, but the condition was wrong. Fix
it up, too. Also the condition to check whether the packet fits into
one fragment was wrong and has been corrected.

Fixes: commit 32dce968dd ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01 12:01:27 -05:00
Hannes Frederic Sowa dbd3393c56 ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those warn about them once and handle them gracefully by recalculating
the checksum.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01 12:01:27 -05:00
Hannes Frederic Sowa d749c9cbff ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets
We cannot reliable calculate packet size on MSG_MORE corked sockets
and thus cannot decide if they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01 12:01:27 -05:00
David S. Miller b75ec3af27 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-01 00:15:30 -04:00
David S. Miller e7b63ff115 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2015-10-30

1) The flow cache is limited by the flow cache limit which
   depends on the number of cpus and the xfrm garbage collector
   threshold which is independent of the number of cpus. This
   leads to the fact that on systems with more than 16 cpus
   we hit the xfrm garbage collector limit and refuse new
   allocations, so new flows are dropped. On systems with 16
   or less cpus, we hit the flowcache limit. In this case, we
   shrink the flow cache instead of refusing new flows.

   We increase the xfrm garbage collector threshold to INT_MAX
   to get the same behaviour, independent of the number of cpus.

2) Fix some unaligned accesses on sparc systems.
   From Sowmini Varadhan.

3) Fix some header checks in _decode_session4. We may call
   pskb_may_pull with a negative value converted to unsigened
   int from pskb_may_pull. This can lead to incorrect policy
   lookups. We fix this by a check of the data pointer position
   before we call pskb_may_pull.

4) Reload skb header pointers after calling pskb_may_pull
   in _decode_session4 as this may change the pointers into
   the packet.

5) Add a missing statistic counter on inner mode errors.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 20:51:56 +09:00
Scott Feldman e258d919b1 switchdev: fix: pass correct obj size when deferring obj add
Fixes: 4d429c5dd ("switchdev: introduce possibility to defer obj_add/del")
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 20:23:37 +09:00
Scott Feldman 3a7bde55a1 switchdev: fix: erasing too much of vlan obj when handling multiple vlan specs
When adding vlans with multiple IFLA_BRIDGE_VLAN_INFO attrs set in AFSPEC,
we would wipe the vlan obj struct after the first IFLA_BRIDGE_VLAN_INFO.
Fix this by only clearing what's necessary on each IFLA_BRIDGE_VLAN_INFO
iteration.

Fixes: 9e8f4a54 ("switchdev: push object ID back to object structure")
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 20:23:35 +09:00
David S. Miller 740215ddb5 NFC 4.4 pull request
This is the NFC pull request for 4.4.
 
 It's a bit bigger than usual, the 3 main culprits being:
 
 - A new driver for Intel's Fields Peak NCI chipset. In order to
   support this chipset we had to export a few NCI routines and
   extend the driver NCI ops to not only support proprietary
   commands but also core ones.
 
 - Support for vendor commands for both STM drivers, st-nci
   and st21nfca. Those vendor commands allow to run factory tests
   through the NFC netlink interface.
 
 - New i2c and SPI support for the Marvell driver, together with
   firmware download support for this driver's core.
 
 Besides that we also have:
 
 - A few file renames in the STM drivers, to keep the naming
   consistent between drivers.
 
 - Some improvements and fixes on the NCI HCI layer, mostly to
   properly reach a secure element over a legacy HCI link.
 
 - A few fixes for the s3fwrn5 and trf7970a drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWMGIaAAoJEIqAPN1PVmxKHjYP/3Q3Y4Vhvw3kTfDP3IlnAuH3
 XjBMGKPLu72MmtSk9jFOr5VuC76YtJzwf+4nKGJybu619NPKfxXN7r83bpsZV1Bk
 +0cS1RpjIQh92a0ElvX1muCFhgH7ax7zeqQ+29OSpA33e67/DlUcwxiqzF15cwWC
 Bk0pUv1FxMoNi5ZkG1JrRqrhx/Yqo1dw2HrnMbKVgwLtLzODBuzoGVKfydTo0b1j
 hkl30DPF3AYMxnwIml3tM8zT96b1LtD0Xgs1yF8IdrIJ+6YLn/6tnw1rUxnE8Ovo
 JFtvtS0OKjGTNFr1NhueG0i5td8TR4MAJKHh0Lz9ISIFWtNtsVjFfGuAeWZcQ/n/
 rQS7xOvzBCndNbe8PS9wBiNQAqLAH5/dzvKRwNboRttkpwIrNOgaBYj2LpuRzpfO
 p+ArwBryAorfxQVOIWl4knc59UsiPUKOK61uMTZ1sU7jCEvUNVChIm8EGRlMnpMQ
 ZFlBa2lqNdgz7ubKLofbnWLiCNY6r0E13MSLHZlJZX61IMjs13ojDeKMvitFBe+b
 1hDwbSWxIRB8xKbcsIA9bPUnEc16Syywz/Q4iAsE8Gy6l5J41MhA/q2QaO9WSrPE
 Leah53l5EwQRd55WjJkCkIKZwvCjIerkESfAS0oprELIYXaxs/1PbVl6C7VYZA4K
 A5tYLw2vS+tTK4Mgi/ym
 =aw/h
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC 4.4 pull request

This is the NFC pull request for 4.4.

It's a bit bigger than usual, the 3 main culprits being:

- A new driver for Intel's Fields Peak NCI chipset. In order to
  support this chipset we had to export a few NCI routines and
  extend the driver NCI ops to not only support proprietary
  commands but also core ones.

- Support for vendor commands for both STM drivers, st-nci
  and st21nfca. Those vendor commands allow to run factory tests
  through the NFC netlink interface.

- New i2c and SPI support for the Marvell driver, together with
  firmware download support for this driver's core.

Besides that we also have:

- A few file renames in the STM drivers, to keep the naming
  consistent between drivers.

- Some improvements and fixes on the NCI HCI layer, mostly to
  properly reach a secure element over a legacy HCI link.

- A few fixes for the s3fwrn5 and trf7970a drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 20:19:43 +09:00
David S. Miller 5bf8921116 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-10-28

Here are a some more Bluetooth patches for 4.4 which collected up during
the past week. The most important ones are from Kuba Pawlak for fixing
locking issues with SCO sockets. There's also a fix from Alexander Aring
for 6lowpan, a memleak fix from Julia Lawall for the btmrvl driver and
some cleanup patches from Marcel.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 19:41:10 +09:00
Alexander Duyck b7b0b1d290 ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU
This change makes it so that we reinitialize the interface if the MTU is
increased back above IPV6_MIN_MTU and the interface is up.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 18:11:07 +09:00
Ido Schimmel 741af0053b switchdev: Add support for flood control
Allow devices supporting this feature to control the flooding of unknown
unicast traffic, by making switchdev infrastructure propagate this setting
to the switch driver.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:26:38 +09:00
Roopa Prabhu b7af1472af bridge: set is_local and is_static before fdb entry is added to the fdb hashtable
Problem Description:
We can add fdbs pointing to the bridge with NULL ->dst but that has a
few race conditions because br_fdb_insert() is used which first creates
the fdb and then, after the fdb has been published/linked, sets
"is_local" to 1 and in that time frame if a packet arrives for that fdb
it may see it as non-local and either do a NULL ptr dereference in
br_forward() or attach the fdb to the port where it arrived, and later
br_fdb_insert() will make it local thus getting a wrong fdb entry.
Call chain br_handle_frame_finish() -> br_forward():
But in br_handle_frame_finish() in order to call br_forward() the dst
should not be local i.e. skb != NULL, whenever the dst is
found to be local skb is set to NULL so we can't forward it,
and here comes the problem since it's running only
with RCU when forwarding packets it can see the entry before "is_local"
is set to 1 and actually try to dereference NULL.
The main issue is that if someone sends a packet to the switch while
it's adding the entry which points to the bridge device, it may
dereference NULL ptr. This is needed now after we can add fdbs
pointing to the bridge.  This poses a problem for
br_fdb_update() as well, while someone's adding a bridge fdb, but
before it has is_local == 1, it might get moved to a port if it comes
as a source mac and then it may get its "is_local" set to 1

This patch changes fdb_create to take is_local and is_static as
arguments to set these values in the fdb entry before it is added to the
hash. Also adds null check for port in br_forward.

Fixes: 3741873b4f ("bridge: allow adding of fdb entries pointing to the bridge device")
Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 12:13:05 +09:00
Hannes Frederic Sowa 89bc7848a9 ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues
Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.

This is the second version of the patch which doesn't use the
overflow_usub function, which got reverted for now.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-29 07:01:50 -07:00
Hannes Frederic Sowa 1e0d69a9cc Revert "Merge branch 'ipv6-overflow-arith'"
Linus dislikes these changes. To not hold up the net-merge let's revert
it for now and fix the bug like Linus suggested.

This reverts commit ec3661b422, reversing
changes made to c80dbe0461.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-29 07:01:48 -07:00
Sagi Grimberg 9ddc87374a RDS/IW: Convert to new memory registration API
Get rid of fast_reg page list and its construction.
Instead, just pass the RDS sg list to ib_map_mr_sg
and post the new ib_reg_wr.

This is done both for server IW RDMA_READ registration
and the client remote key registration.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Sagi Grimberg 412a15c0fe svcrdma: Port to new memory registration API
Instead of maintaining a fastreg page list, keep an sg table
and convert an array of pages to a sg list. Then call ib_map_mr_sg
and construct ib_reg_wr.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Selvin Xavier <selvin.xavier@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Sagi Grimberg 4143f34e01 xprtrdma: Port to new memory registration API
Instead of maintaining a fastreg page list, keep an sg table
and convert an array of pages to a sg list. Then call ib_map_mr_sg
and construct ib_reg_wr.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Selvin Xavier <selvin.xavier@avagotech.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Doug Ledford 63e8790d39 Merge branch 'wr-cleanup' into k.o/for-4.4 2015-10-28 22:23:34 -04:00
Doug Ledford eb14ab3ba1 Merge branch 'wr-cleanup' of git://git.infradead.org/users/hch/rdma into wr-cleanup
Signed-off-by: Doug Ledford <dledford@redhat.com>

Conflicts:
	drivers/infiniband/ulp/isert/ib_isert.c - Commit 4366b19ca5
	(iser-target: Change the recv buffers posting logic) changed the
	logic in isert_put_datain() and had to be hand merged
2015-10-28 22:21:09 -04:00
Guy Shapiro fa20105e09 IB/cma: Add support for network namespaces
Add support for network namespaces in the ib_cma module. This is
accomplished by:

1. Adding network namespace parameter for rdma_create_id. This parameter is
   used to populate the network namespace field in rdma_id_private.
   rdma_create_id keeps a reference on the network namespace.
2. Using the network namespace from the rdma_id instead of init_net inside
   of ib_cma, when listening on an ID and when looking for an ID for an
   incoming request.
3. Decrementing the reference count for the appropriate network namespace
   when calling rdma_destroy_id.

In order to preserve the current behavior init_net is passed when calling
from other modules.

Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:32:48 -04:00
Robert Dolca f11631748e NFC: nci: non-static functions can not be inline
This fixes a build error that seems to be toochain
dependent (Not seen with gcc v5.1):

In file included from net/nfc/nci/rsp.c:36:0:
net/nfc/nci/rsp.c: In function ‘nci_rsp_packet’:
include/net/nfc/nci_core.h:355:12: error: inlining failed in call to
always_inline ‘nci_prop_rsp_packet’: function body not available
 inline int nci_prop_rsp_packet(struct nci_dev *ndev, __u16 opcode,

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-28 06:44:45 +01:00
Robert Shearman cf4b24f002 mpls: reduce memory usage of routes
Nexthops for MPLS routes have a via address field sized for the
largest via address that is expected, which is 32 bytes. This means
that in the most common case of having ipv4 via addresses, 28 bytes of
memory more than required are used per nexthop. In the other common
case of an ipv6 nexthop then 16 bytes more than required are
used. With large numbers of MPLS routes this extra memory usage could
start to become significant.

To avoid allocating memory for a maximum length via address when not
all of it is required and to allow for ease of iterating over
nexthops, then the via addresses are changed to be stored in the same
memory block as the route and nexthops, but in an array after the end
of the array of nexthops. New accessors are provided to retrieve a
pointer to the via address.

To allow for O(1) access without having to store a pointer or offset
per nh, the via address for each nexthop is sized according to the
maximum via address for any nexthop in the route, which is stored in a
new route field, rt_max_alen, but this is in an existing hole in
struct mpls_route so it doesn't increase the size of the
structure. Each via address is ensured to be aligned to VIA_ALEN_ALIGN
to account for architectures that don't allow unaligned accesses.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:52:59 -07:00
Robert Shearman b4e04fc735 mpls: fix forwarding using v4/v6 explicit null
Fill in the via address length for the predefined IPv4 and IPv6
explicit-null label routes.

Fixes: f8efb73c97 ("mpls: multipath route support")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:52:58 -07:00
Sowmini Varadhan 8ce675ff39 RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv
Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
If rds_tcp_data_recv() ignores such failures, the application will
receive corrupted data because the skb has not been correctly
carved to the RDS datagram size.

Avoid this by handling pskb_pull/pskb_trim failure in the same
manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
retry via the deferred call to rds_send_worker() that gets set up on
ENOMEM from rds_tcp_read_sock()

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:46:34 -07:00
Hannes Frederic Sowa 080a270f5a sock: don't enable netstamp for af_unix sockets
netstamp_needed is toggled for all socket families if they request
timestamping. But some protocols don't need the lower-layer timestamping
code at all. This patch starts disabling it for af-unix.

E.g. systemd enables timestamping during boot-up on the journald af-unix
sockets, thus causing the system to globally enable timestamping in the
lower networking stack. Still, it is very probable that timestamping
gets activated, by e.g. dhclient or various NTP implementations.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:39:14 -07:00
Joe Stringer 6f5cadee44 openvswitch: Fix skb leak using IPv6 defrag
nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
reassembly is successful, expects the caller to free all of the original
skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
meaning that the original fragments were never freed (with the exception
of the last fragment to arrive).

Fix this by ensuring that all original fragments except for the last
fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
will be morphed into the head, so it must not be freed yet. Furthermore,
retain the ->next pointer for the head after skb_morph().

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:18 -07:00
Joe Stringer 190b8ffbb7 ipv6: Export nf_ct_frag6_consume_orig()
This is needed in openvswitch to fix an skb leak in the next patch.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:17 -07:00
Joe Stringer 74c1661813 openvswitch: Fix double-free on ip_defrag() errors
If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
freed. When handle_fragments() passes this back up to
do_execute_actions(), it will be freed again. Prevent this double free
by never freeing the skb in do_execute_actions() for errors returned by
ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:14 -07:00
Alexander Duyck c2229fe143 fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key
We were computing the child index in cases where the key value we were
looking for was actually less than the base key of the tnode.  As a result
we were getting incorrect index values that would cause us to skip over
some children.

To fix this I have added a test that will force us to use child index 0 if
the key we are looking for is less than the key of the current tnode.

Fixes: 8be33e955c ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Brian Rak <brak@gameservers.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 18:14:51 -07:00
Alexander Aring 324e786ee3 bluetooth: 6lowpan: fix NOHZ: local_softirq_pending
Jukka reported about the following warning:

"NOHZ: local_softirq_pending 08"

I remember this warning and we had a similar issue when using workqueues
and calling netif_rx. See commit 5ff3fec ("mac802154: fix NOHZ
local_softirq_pending 08 warning").

This warning occurs when calling "netif_rx" inside the wrong context
(non softirq context). The net core api offers "netif_rx_ni" to call
netif_rx inside the correct softirq context.

Reported-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-27 09:53:36 +01:00
emmanuel.grumbach@intel.com 8941faa161 net: tso: add support for IPv6
Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-26 22:24:22 -07:00
Eric Dumazet 7e3b6e7423 ipv6: gre: support SIT encapsulation
gre_gso_segment() chokes if SIT frames were aggregated by GRO engine.

Fixes: 61c1db7fae ("ipv6: sit: add GSO/TSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-26 22:01:18 -07:00
Kuba Pawlak 2c501cdd68 Bluetooth: Fix crash on fast disconnect of SCO
Fix a crash that may happen when a connection is closed before it was fully
established. Mapping conn->hcon was released by shutdown function, but it
is still referenced in (not yet finished) connection established handling
function.

[ 4635.254073] BUG: unable to handle kernel NULL pointer dereference at 00000013
[ 4635.262058] IP: [<c11659f0>] memcmp+0xe/0x25
[ 4635.266835] *pdpt = 0000000024190001 *pde = 0000000000000000
[ 4635.273261] Oops: 0000 [#1] PREEMPT SMP
[ 4635.277652] Modules linked in: evdev ecb vfat fat libcomposite usb2380 isofs zlib_inflate rfcomm(O) udc_core bnep(O) btusb(O) btbcm(O) btintel(O) bluetooth(O) cdc_acm arc4 uinput hid_mule
[ 4635.321761] Pid: 363, comm: kworker/u:2H Tainted: G           O 3.8.0-119.1-plk-adaptation-byt-ivi-brd #1
[ 4635.332642] EIP: 0060:[<c11659f0>] EFLAGS: 00010206 CPU: 0
[ 4635.338767] EIP is at memcmp+0xe/0x25
[ 4635.342852] EAX: e4720678 EBX: 00000000 ECX: 00000006 EDX: 00000013
[ 4635.349849] ESI: 00000000 EDI: fb85366c EBP: e40c7dc0 ESP: e40c7db4
[ 4635.356846]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 4635.362873] CR0: 8005003b CR2: 00000013 CR3: 24191000 CR4: 001007f0
[ 4635.369869] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4635.376865] DR6: ffff0ff0 DR7: 00000400
[ 4635.381143] Process kworker/u:2H (pid: 363, ti=e40c6000 task=e40c5510 task.ti=e40c6000)
[ 4635.390080] Stack:
[ 4635.392319]  e4720400 00000000 fb85366c e40c7df4 fb842285 e40c7de2 fb853200 00000013
[ 4635.401003]  e3f101c4 e4720678 e3f101c0 e403be0a e40c7dfc e416a000 e403be0a fb85366c
[ 4635.409692]  e40c7e1c fb820186 020f6c00 e47c49ac e47c4008 00000000 e416a000 e47c402c
[ 4635.418380] Call Trace:
[ 4635.421153]  [<fb842285>] sco_connect_cfm+0xff/0x236 [bluetooth]
[ 4635.427893]  [<fb820186>] hci_sync_conn_complete_evt.clone.101+0x227/0x268 [bluetooth]
[ 4635.436758]  [<fb82370f>] hci_event_packet+0x1caa/0x21d3 [bluetooth]
[ 4635.443859]  [<c106231f>] ? trace_hardirqs_on+0xb/0xd
[ 4635.449502]  [<c1375b8a>] ? _raw_spin_unlock_irqrestore+0x42/0x59
[ 4635.456340]  [<fb814b67>] hci_rx_work+0xb9/0x350 [bluetooth]
[ 4635.462663]  [<c1039f1e>] ? process_one_work+0x17b/0x2e6
[ 4635.468596]  [<c1039f77>] process_one_work+0x1d4/0x2e6
[ 4635.474333]  [<c1039f1e>] ? process_one_work+0x17b/0x2e6
[ 4635.480294]  [<fb814aae>] ? hci_cmd_work+0xda/0xda [bluetooth]
[ 4635.486810]  [<c103a3fa>] worker_thread+0x171/0x20f
[ 4635.492257]  [<c10456c5>] ? complete+0x34/0x3e
[ 4635.497219]  [<c103ea06>] kthread+0x90/0x95
[ 4635.501888]  [<c103a289>] ? manage_workers+0x1df/0x1df
[ 4635.507628]  [<c1376537>] ret_from_kernel_thread+0x1b/0x28
[ 4635.513755]  [<c103e976>] ? __init_kthread_worker+0x42/0x42
[ 4635.519975] Code: 74 0d 3c 79 74 04 3c 59 75 0c c6 02 01 eb 03 c6 02 00 31 c0 eb 05 b8 ea ff ff ff 5d c3 55 89 e5 57 56 53 31 db eb 0e 0f b6 34 18 <0f> b6 3c 1a 43 29 fe 75 07 49 85 c9 7f
[ 4635.541264] EIP: [<c11659f0>] memcmp+0xe/0x25 SS:ESP 0068:e40c7db4
[ 4635.548166] CR2: 0000000000000013
[ 4635.552177] ---[ end trace e05ce9b8ce6182f6 ]---

Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-27 06:00:07 +01:00
Bjørn Mork 4b3418fba0 ipv6: icmp: include addresses in debug messages
Messages like "icmp6_send: no reply to icmp error" are close
to useless. Adding source and destination addresses to provide
some more clue.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-26 21:59:42 -07:00
Vincent Cuissard 2bd832459a NFC: NCI: allow spi driver to choose transfer clock
In some cases low level drivers might want to update the
SPI transfer clock (e.g. during firmware download).

This patch adds this support. Without any modification the
driver will use the default SPI clock (from pdata or device tree).

Signed-off-by: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 04:23:34 +01:00
Vincent Cuissard fcd9d046fd NFC: NCI: move generic spi driver to a module
SPI driver should be a module.

Signed-off-by: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 04:21:38 +01:00
Vincent Cuissard e5629d2947 NFC: NCI: export nci_send_frame and nci_send_cmd function
Export nci_send_frame and nci_send_cmd symbols to allow drivers
to use it. This is needed for example if NCI is used during
firmware download phase.

Signed-off-by: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 04:16:14 +01:00
Christophe Ricard 15d17170b4 NFC: st21nfca: Add support for proprietary commands
Add support for proprietary commands useful mainly
for factory testings.

Here is a list:

- FACTORY_MODE: Allow to set the driver into a mode where no
  secure element are activated. It does not consider any
  NFC_ATTR_VENDOR_DATA.
- HCI_CLEAR_ALL_PIPES: Allow to execute a HCI clear all pipes
  command. It does not consider any NFC_ATTR_VENDOR_DATA.
- HCI_DM_PUT_DATA: Allow to configure specific CLF registry as
  for example RF trimmings or low level drivers configurations
  (I2C, SPI, SWP).
- HCI_DM_UPDATE_AID: Allow to configure an AID routing into the
  CLF routing table following RF technology, CLF mode or protocol.
- HCI_DM_GET_INFO: Allow to retrieve CLF information.
- HCI_DM_GET_DATA: Allow to retrieve CLF configurable data such as
  low level drivers configurations or RF trimmings.
- HCI_DM_LOAD: Allow to load a firmware into the CLF. A complete
  packet can be more than 8KB.
- HCI_DM_RESET: Allow to run a CLF reset in order to "commit" CLF
  configuration changes without CLF power off.
- HCI_GET_PARAM: Allow to retrieve an HCI CLF parameter (for example
  the white list).
- HCI_DM_FIELD_GENERATOR: Allow to generate different kind of RF
  technology. When using this command to anti-collision is done.
- HCI_LOOPBACK: Allow to echo a command and test the Dh to CLF
  connectivity.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 04:00:24 +01:00
Christophe Ricard 064d004796 NFC: st-nci: Add few code style fixes
Add some few code style fixes.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 03:55:13 +01:00
Christophe Ricard 96d4581f0b NFC: netlink: Add mode parameter to deactivate_target functions
In order to manage in a better way the nci poll mode state machine,
add mode parameter to deactivate_target functions.
This way we can manage different target state.
mode parameter make sense only in nci core.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-27 03:55:12 +01:00
Marcel Holtmann c4297e8f7f Bluetooth: Fix some obvious coding style issues in the SCO module
Lets fix this obvious coding style issues in the SCO module and bring it
in line with the rest of the Bluetooth subsystem.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-26 08:22:00 +02:00
Marcel Holtmann 05fcd4c4f1 Bluetooth: Replace hci_notify with hci_sock_dev_event
There is no point in wrapping hci_sock_dev_event around hci_notify. It
is an empty wrapper which adds no value. So remove it.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-26 08:21:47 +02:00
Marcel Holtmann 242c0ebd37 Bluetooth: Rename bt_cb()->req into bt_cb()->hci
The SKB context buffer for HCI request is really not just for requests,
information in their are preserved for the whole HCI layer. So it makes
more sense to actually rename it into bt_cb()->hci and also call it then
struct hci_ctrl.

In addition that allows moving the decoded opcode for outgoing packets
into that struct. So far it was just consuming valuable space from the
main shared items. And opcode are not valid for L2CAP packets.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-26 08:21:03 +02:00
Marcel Holtmann d94a61040d Bluetooth: Remove unneeded parenthesis around MSG_OOB
There are two checks that are still using (MSG_OOB) instead of just
MSG_OOB and so lets just fix them.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-26 08:20:51 +02:00
Christophe Ricard a1b0b94158 NFC: nci: Create pipe on specific gate in nci_hci_connect_gate
Some gates might need to have their pipes explicitly created.
Add a call to nci_hci_create_pipe in nci_hci_connect_gate for
every gate that is different than NCI_HCI_LINK_MGMT_GATE or
NCI_HCI_ADMIN_GATE.

In case of an error when opening a pipe, like in hci layer,
delete the pipe if it was created.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:13 +01:00
Christophe Ricard 8a49943f5b NFC: nci: Call nci_hci_clear_all_pipes at HCI initial activation.
When session_id is filled to 0xff, the pipe configuration is
probably incorrect and needs to be cleared.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:13 +01:00
Christophe Ricard fa6fbadea5 NFC: nci: add nci_hci_clear_all_pipes functions
nci_hci_clear_all_pipes might be use full in some cases
for example after a firmware update.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:12 +01:00
Christophe Ricard e65917b6d5 NFC: nci: extract pipe value using NCI_HCP_MSG_GET_PIPE
When receiving data in nci_hci_msg_rx_work, extract pipe
value using NCI_HCP_MSG_GET_PIPE macro.

Cc: stable@vger.kernel.org
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:12 +01:00
Christophe Ricard d8cd37ed2f NFC: nci: Fix improper management of HCI return code
When sending HCI data over NCI, HCI return code is part
of the NCI data. In order to get correctly the HCI return
code, we assume the NCI communication is successful and
extract the return code for the nci_hci functions return code.

This is done because nci_to_errno does not match hci return
code value.

Cc: stable@vger.kernel.org
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:12 +01:00
Christophe Ricard 500c4ef022 NFC: nci: Fix incorrect data chaining when sending data
When sending HCI data over NCI, cmd information should be
present only on the first packet.
Each packet shall be specifically allocated and sent to the
NCI layer.

Cc: stable@vger.kernel.org
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-26 06:53:11 +01:00
Kuba Pawlak 1da5537ecc Bluetooth: Fix locking issue during fast SCO reconnection.
When SCO connection is requested and disconnected fast, there is a change
that sco_sock_shutdown is going to preempt thread started in sco_connect_cfm.
When this happens struct sock sk may be removed but a pointer to it is still
held in sco_conn_ready, where embedded spinlock is used. If it is used, but
struct sock has been removed, it will crash.

Block connection object, which will prevent struct sock from being removed
and give connection process chance to finish.

BUG: spinlock bad magic on CPU#0, kworker/u:2H/319
 lock: 0xe3e99434, .magic: f3000000, .owner: (���/0, .owner_cpu: -203804160
Pid: 319, comm: kworker/u:2H Tainted: G           O 3.8.0-115.1-plk-adaptation-byt-ivi-brd #1
Call Trace:
 [<c1155659>] ? do_raw_spin_lock+0x19/0xe9
 [<fb75354f>] ? sco_connect_cfm+0x92/0x236 [bluetooth]
 [<fb731dbc>] ? hci_sync_conn_complete_evt.clone.101+0x18b/0x1cb [bluetooth]
 [<fb734ee7>] ? hci_event_packet+0x1acd/0x21a6 [bluetooth]
 [<c1041095>] ? finish_task_switch+0x50/0x89
 [<c1349a2e>] ? __schedule+0x638/0x6b8
 [<fb727918>] ? hci_rx_work+0xb9/0x2b8 [bluetooth]
 [<c103760a>] ? queue_delayed_work_on+0x21/0x2a
 [<c1035df9>] ? process_one_work+0x157/0x21b
 [<fb72785f>] ? hci_cmd_work+0xef/0xef [bluetooth]
 [<c1036217>] ? worker_thread+0x16e/0x20a
 [<c10360a9>] ? manage_workers+0x1cf/0x1cf
 [<c103a0ef>] ? kthread+0x8d/0x92
 [<c134adf7>] ? ret_from_kernel_thread+0x1b/0x28
 [<c103a062>] ? __init_kthread_worker+0x24/0x24
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<  (null)>]   (null)
*pdpt = 00000000244e1001 *pde = 0000000000000000
Oops: 0010 [#1] PREEMPT SMP
Modules linked in: evdev ecb rfcomm(O) libcomposite usb2380 udc_core bnep(O) btusb(O) btbcm(O) cdc_acm btintel(O) bluetooth(O) arc4 uinput hid_multitouch usbhid hid iwlmvm(O)e
Pid: 319, comm: kworker/u:2H Tainted: G           O 3.8.0-115.1-plk-adaptation-byt-ivi-brd #1
EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
EIP is at 0x0
EAX: e3e99400 EBX: e3e99400 ECX: 00000100 EDX: 00000000
ESI: e3e99434 EDI: fb763ce0 EBP: e49b9e44 ESP: e49b9e14
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 24444000 CR4: 001007f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process kworker/u:2H (pid: 319, ti=e49b8000 task=e4ab9030 task.ti=e49b8000)
Stack:
 fb75355b 00000246 fb763900 22222222 22222222 22222222 e3f94460 e3ca7c0a
 e49b9e4c e3f34c00 e3ca7c0a fb763ce0 e49b9e6c fb731dbc 02000246 e4cec85c
 e4cec008 00000000 e3f34c00 e4cec000 e3c2ce00 0000002c e49b9ed0 fb734ee7
Call Trace:
 [<fb75355b>] ? sco_connect_cfm+0x9e/0x236 [bluetooth]
 [<fb731dbc>] ? hci_sync_conn_complete_evt.clone.101+0x18b/0x1cb [bluetooth]
 [<fb734ee7>] ? hci_event_packet+0x1acd/0x21a6 [bluetooth]
 [<c1041095>] ? finish_task_switch+0x50/0x89
 [<c1349a2e>] ? __schedule+0x638/0x6b8
 [<fb727918>] ? hci_rx_work+0xb9/0x2b8 [bluetooth]
 [<c103760a>] ? queue_delayed_work_on+0x21/0x2a
 [<c1035df9>] ? process_one_work+0x157/0x21b
 [<fb72785f>] ? hci_cmd_work+0xef/0xef [bluetooth]
 [<c1036217>] ? worker_thread+0x16e/0x20a
 [<c10360a9>] ? manage_workers+0x1cf/0x1cf
 [<c103a0ef>] ? kthread+0x8d/0x92
 [<c134adf7>] ? ret_from_kernel_thread+0x1b/0x28
 [<c103a062>] ? __init_kthread_worker+0x24/0x24
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 0068:e49b9e14
CR2: 0000000000000000
---[ end trace 942a6577c0abd725 ]---

Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-25 21:06:39 +01:00
Kuba Pawlak 435c513369 Bluetooth: Fix locking issue on SCO disconnection
Thread handling SCO disconnection may get preempted in '__sco_sock_close'
after dropping a reference to hci_conn but before marking this as NULL
in associated struct sco_conn. When execution returs to this thread,
this connection will possibly be released, resulting in kernel crash

Lock connection before this point.

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<fb770ab9>] __sco_sock_close+0x194/0x1ff [bluetooth]
*pdpt = 0000000023da6001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP
Modules linked in: evdev ecb rfcomm(O) libcomposite usb2380 udc_core bnep(O) btusb(O) btbcm(O) cdc_acm btintel(O) bluetooth(O) arc4 uinput hid_multitouch usbhid iwlmvm(O) hide
Pid: 984, comm: bluetooth Tainted: G           O 3.8.0-115.1-plk-adaptation-byt-ivi-brd #1
EIP: 0060:[<fb770ab9>] EFLAGS: 00010282 CPU: 2
EIP is at __sco_sock_close+0x194/0x1ff [bluetooth]
EAX: 00000000 EBX: e49d7600 ECX: ef1ec3c2 EDX: 000000c3
ESI: e4c12000 EDI: 00000000 EBP: ef1edf5c ESP: ef1edf4c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 00000000 CR3: 23da7000 CR4: 001007f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process bluetooth (pid: 984, ti=ef1ec000 task=e47f2550 task.ti=ef1ec000)
Stack:
 e4c120d0 e49d7600 00000000 08421a40 ef1edf70 fb770b7a 00000002 e8a4cc80
 08421a40 ef1ec000 c12966b1 00000001 00000000 0000000b 084954c8 c1296b6c
 0000001b 00000002 0000001b 00000002 00000000 00000002 b2524880 00000046
Call Trace:
 [<fb770b7a>] ? sco_sock_shutdown+0x56/0x95 [bluetooth]
 [<c12966b1>] ? sys_shutdown+0x37/0x53
 [<c1296b6c>] ? sys_socketcall+0x12e/0x1be
 [<c134ae7e>] ? sysenter_do_call+0x12/0x26
 [<c1340000>] ? ip_vs_control_net_cleanup+0x46/0xb1
Code: e8 90 6b 8c c5 f6 05 72 5d 78 fb 04 74 17 8b 46 08 50 56 68 0a fd 77 fb 68 60 5d 78 fb e8 68 95 9e c5 83 c4 10 8b 83 fc 01 00 00 <c7> 00 00 00 00 00 eb 32 ba 68 00 00 0b
EIP: [<fb770ab9>] __sco_sock_close+0x194/0x1ff [bluetooth] SS:ESP 0068:ef1edf4c
CR2: 0000000000000000
---[ end trace 47fa2f55a9544e69 ]---

Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-25 21:06:39 +01:00
Kuba Pawlak 75e34f5cf6 Bluetooth: Fix crash on SCO disconnect
When disconnecting audio from the phone's side, it may happen, that
a thread handling HCI message 'disconnection complete' will get preempted
in 'sco_conn_del' before calling 'sco_sock_kill', still holding a pointer
to struct sock sk. Interrupting thread started in 'sco_sock_shutdown' will
carry on releasing resources and will eventually release struct sock.
When execution goes back to first thread it will call sco_sock_kill using
now invalid pointer to already destroyed socket.

Fix is to grab a reference to the socket a release it after calling
'sco_sock_kill'.

[  166.358213] BUG: unable to handle kernel paging request at 7541203a
[  166.365228] IP: [<fb6e8bfb>] bt_sock_unlink+0x1a/0x38 [bluetooth]
[  166.372068] *pdpt = 0000000024b19001 *pde = 0000000000000000
[  166.378483] Oops: 0002 [#1] PREEMPT SMP
[  166.382871] Modules linked in: evdev ecb rfcomm(O) libcomposite usb2380 udc_core bnep(O) btusb(O) btbcm(O) btintel(O) cdc_acm bluetooth(O) arc4 uinput hid_multitouch iwlmvm(O) usbhid hide
[  166.424233] Pid: 338, comm: kworker/u:2H Tainted: G           O 3.8.0-115.1-plk-adaptation-byt-ivi-brd #1
[  166.435112] EIP: 0060:[<fb6e8bfb>] EFLAGS: 00010206 CPU: 0
[  166.441259] EIP is at bt_sock_unlink+0x1a/0x38 [bluetooth]
[  166.447382] EAX: 632e6563 EBX: e4bfc600 ECX: e466d4d3 EDX: 7541203a
[  166.454369] ESI: fb7278ac EDI: e4d52000 EBP: e4669e20 ESP: e4669e0c
[  166.461366]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  166.467391] CR0: 8005003b CR2: 7541203a CR3: 24aba000 CR4: 001007f0
[  166.474387] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  166.481375] DR6: ffff0ff0 DR7: 00000400
[  166.485654] Process kworker/u:2H (pid: 338, ti=e4668000 task=e466e030 task.ti=e4668000)
[  166.494591] Stack:
[  166.496830]  e4bfc600 e4bfc600 fb715c28 e4717ee0 e4d52000 e4669e3c fb715cf3 e4bfc634
[  166.505518]  00000068 e4d52000 e4c32000 fb7277c0 e4669e6c fb6f2019 0000004a 00000216
[  166.514205]  e4660101 e4c32008 02000001 00000013 e4d52000 e4c32000 e3dc9240 00000005
[  166.522891] Call Trace:
[  166.525654]  [<fb715c28>] ? sco_sock_kill+0x73/0x9a [bluetooth]
[  166.532295]  [<fb715cf3>] ? sco_conn_del+0xa4/0xbf [bluetooth]
[  166.538836]  [<fb6f2019>] ? hci_disconn_complete_evt.clone.55+0x1bd/0x205 [bluetooth]
[  166.547609]  [<fb6f73d3>] ? hci_event_packet+0x297/0x223c [bluetooth]
[  166.554805]  [<c10416da>] ? dequeue_task+0xaf/0xb7
[  166.560154]  [<c1041095>] ? finish_task_switch+0x50/0x89
[  166.566086]  [<c1349a2e>] ? __schedule+0x638/0x6b8
[  166.571460]  [<fb6eb906>] ? hci_rx_work+0xb9/0x2b8 [bluetooth]
[  166.577975]  [<c1035df9>] ? process_one_work+0x157/0x21b
[  166.583933]  [<fb6eb84d>] ? hci_cmd_work+0xef/0xef [bluetooth]
[  166.590448]  [<c1036217>] ? worker_thread+0x16e/0x20a
[  166.596088]  [<c10360a9>] ? manage_workers+0x1cf/0x1cf
[  166.601826]  [<c103a0ef>] ? kthread+0x8d/0x92
[  166.606691]  [<c134adf7>] ? ret_from_kernel_thread+0x1b/0x28
[  166.613010]  [<c103a062>] ? __init_kthread_worker+0x24/0x24
[  166.619230] Code: 85 63 ff ff ff 31 db 8d 65 f4 89 d8 5b 5e 5f 5d c3 56 8d 70 04 53 89 f0 89 d3 e8 7e 17 c6 c5 8b 53 28 85 d2 74 1a 8b 43 24 85 c0 <89> 02 74 03 89 50 04 c7 43 28 00 00 00
[  166.640501] EIP: [<fb6e8bfb>] bt_sock_unlink+0x1a/0x38 [bluetooth] SS:ESP 0068:e4669e0c
[  166.649474] CR2: 000000007541203a
[  166.653420] ---[ end trace 0181ff2c9e42d51e ]---
[  166.658609] note: kworker/u:2H[338] exited with preempt_count 1

Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-25 21:06:39 +01:00
Robert Dolca 85b9ce9a21 NFC: nci: add nci_get_conn_info_by_id function
This functin takes as a parameter a pointer to the nci_dev
struct and the first byte from the values of the first domain
specific parameter that was used for the connection creation.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 20:29:11 +01:00
Robert Dolca caa575a86e NFC: nci: fix possible crash in nci_core_conn_create
If the number of destination speific parameters supplied is 0
the call will fail. If the first destination specific parameter
does not have a value, curr_id will be set to 0.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 20:29:05 +01:00
Robert Dolca 22e4bd09c4 NFC: nci: rename nci_prop_ops to nci_driver_ops
Initially it was used to create hooks in the driver for
proprietary operations. Currently it is being used for hooks
for both proprietary and generic operations.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 20:28:59 +01:00
Robert Dolca 0a97a3cba2 NFC: nci: Allow the driver to set handler for core nci ops
The driver may be required to act when some responses or
notifications arrive. For example the NCI core does not have a
handler for NCI_OP_CORE_GET_CONFIG_RSP. The NFCC can send a
config response that has to be read by the driver and the packet
may contain vendor specific data.

The Fields Peak driver needs to take certain actions when a reset
notification arrives (packet also not handled by the nfc core).

The driver handlers do not interfere with the core and they are
called after the core processes the packet.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 19:12:57 +01:00
Robert Dolca 7bc4824ed5 NFC: nci: Introduce nci_core_cmd
This allows sending core commands from the driver. The driver
should be able to send NCI core commands like CORE_GET_CONFIG_CMD.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 19:12:13 +01:00
Robert Dolca e4dbd62528 NFC: nci: Do not call post_setup when setup fails
The driver should know that it can continue with post setup where
setup left off. Being able to execute post_setup when setup fails
may force the developer to keep this state in the driver.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 19:11:58 +01:00
Robert Dolca 2663589ce6 NFC: nci: Add function to get max packet size for conn
FDP driver needs to send the firmware as regular packets
(not fragmented). The driver should have a way to
get the max packet size for a given connection.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 19:11:42 +01:00
Robert Dolca ea785c094d NFC: nci: Export nci data send API
For the firmware update the driver may use nci_send_data.

Signed-off-by: Robert Dolca <robert.dolca@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-10-25 19:11:35 +01:00
Eric Dumazet 1586a5877d af_unix: do not report POLLOUT on listeners
poll(POLLOUT) on a listener should not report fd is ready for
a write().

This would break some applications using poll() and pfd.events = -1,
as they would not block in poll()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alan Burlison <Alan.Burlison@oracle.com>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-25 06:37:45 -07:00
Wu Fengguang 742e038330 tipc: link_is_bc_sndlink() can be static
TO: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
CC: Jon Maloy <jon.maloy@ericsson.com>
CC: Ying Xue <ying.xue@windriver.com>
CC: tipc-discussion@lists.sourceforge.net
CC: linux-kernel@vger.kernel.org

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-25 06:31:52 -07:00
Jon Paul Maloy 2af5ae372a tipc: clean up unused code and structures
After the previous changes in this series, we can now remove some
unused code and structures, both in the broadcast, link aggregation
and link code.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:47 -07:00
Jon Paul Maloy c49a0a8439 tipc: ensure binding table initial distribution is sent via first link
Correct synchronization of the broadcast link at first contact between
two nodes is dependent on the assumption that the binding table "bulk"
update passes via the same link as the initial broadcast syncronization
message, i.e., via the first link that is established.

This is not guaranteed in the current implementation. If two link
come up very close to each other in time, the "bulk" may quite well
pass via the second link, and hence void the guarantee of a correct
initial synchronization before the broadcast link is opened.

This commit makes two small changes to strengthen this guarantee.

1) We let the second established link occupy slot 1 of the
   "active_links" array, while the first link will retain slot 0.
   (This is in reality a cosmetic change, we could just as well keep
    the current, opposite order)

2) We let the name distributor always use link selector/slot 0 when
   it sends it binding table updates.

The extra traffic bias on the first link caused by this change should
be negligible, since binding table updates constitutes a very small
fraction of the total traffic.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:46 -07:00
Jon Paul Maloy c72fa872a2 tipc: eliminate link's reference to owner node
With the recent commit series, we have established a one-way dependency
between the link aggregation (struct tipc_node) instances and their
pertaining tipc_link instances. This has enabled quite significant code
and structure simplifications.

In this commit, we eliminate the field 'owner', which points to an
instance of struct tipc_node, from struct tipc_link, and replace it with
a pointer to struct net, which is the only external reference now needed
by a link instance.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:45 -07:00
Jon Paul Maloy 7214bcf875 tipc: eliminate redundant buffer cloning at transmission
Since all packet transmitters (link, bcast, discovery) are now sending
consumable buffer clones to the bearer layer, we can remove the
redundant buffer cloning that is perfomed in the lower level functions
tipc_l2_send_msg() and tipc_udp_send_msg().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:44 -07:00
Jon Paul Maloy 60852d6795 tipc: let neighbor discoverer tranmsit consumable buffers
The neighbor discovery function currently uses the function
tipc_bearer_send() for transmitting packets, assuming that the
sent buffers are not consumed by the called function.

We want to change this, in order to avoid unnecessary buffer cloning
elswhere in the code.

This commit introduces a new function tipc_bearer_skb() which consumes
the sent buffers, and let the discoverer functions use this new call
instead. The discoverer does now itself perform the cloning when
that is necessary.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:44 -07:00
Jon Paul Maloy 959e1781aa tipc: introduce jumbo frame support for broadcast
Until now, we have only been supporting a fix MTU size of 1500 bytes
for all broadcast media, irrespective of their actual capability.

We now make the broadcast MTU adaptable to the carrying media, i.e.,
we use the smallest MTU supported by any of the interfaces attached
to TIPC.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:40 -07:00
Jon Paul Maloy b06b281e79 tipc: simplify bearer level broadcast
Until now, we have been keeping track of the exact set of broadcast
destinations though the help structure tipc_node_map. This leads us to
have to maintain a whole infrastructure for supporting this, including
a pseudo-bearer and a number of functions to manipulate both the bearers
and the node map correctly. Apart from the complexity, this approach is
also limiting, as struct tipc_node_map only can support cluster local
broadcast if we want to avoid it becoming excessively large. We want to
eliminate this limitation, in order to enable introduction of scoped
multicast in the future.

A closer analysis reveals that it is unnecessary maintaining this "full
set" overview; it is sufficient to keep a counter per bearer, indicating
how many nodes can be reached via this bearer at the moment. The protocol
is now robust enough to handle transitional discrepancies between the
nominal number of reachable destinations, as expected by the broadcast
protocol itself, and the number which is actually reachable at the
moment. The initial broadcast synchronization, in conjunction with the
retransmission mechanism, ensures that all packets will eventually be
acknowledged by the correct set of destinations.

This commit introduces these changes.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:39 -07:00
Jon Paul Maloy 5266698661 tipc: let broadcast packet reception use new link receive function
The code path for receiving broadcast packets is currently distinct
from the unicast path. This leads to unnecessary code and data
duplication, something that can be avoided with some effort.

We now introduce separate per-peer tipc_link instances for handling
broadcast packet reception. Each receive link keeps a pointer to the
common, single, broadcast link instance, and can hence handle release
and retransmission of send buffers as if they belonged to the own
instance.

Furthermore, we let each unicast link instance keep a reference to both
the pertaining broadcast receive link, and to the common send link.
This makes it possible for the unicast links to easily access data for
broadcast link synchronization, as well as for carrying acknowledges for
received broadcast packets.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:37 -07:00
Jon Paul Maloy fd556f209a tipc: introduce capability bit for broadcast synchronization
Until now, we have tried to support both the newer, dedicated broadcast
synchronization mechanism along with the older, less safe, RESET_MSG/
ACTIVATE_MSG based one. The latter method has turned out to be a hazard
in a highly dynamic cluster, so we find it safer to disable it completely
when we find that the former mechanism is supported by the peer node.

For this purpose, we now introduce a new capabability bit,
TIPC_BCAST_SYNCH, to inform any peer nodes that dedicated broadcast
syncronization is supported by the present node. The new bit is conveyed
between peers in the 'capabilities' field of neighbor discovery messages.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:35 -07:00
Jon Paul Maloy 2f56612457 tipc: let broadcast transmission use new link transmit function
This commit simplifies the broadcast link transmission function, by
leveraging previous changes to the link transmission function and the
broadcast transmission link life cycle.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:32 -07:00
Jon Paul Maloy c1ab3f1dea tipc: make struct tipc_link generic to support broadcast
Realizing that unicast is just a special case of broadcast, we also see
that we can go in the other direction, i.e., that modest changes to the
current unicast link can make it generic enough to support broadcast.

The following changes are introduced here:

- A new counter ("ackers") in struct tipc_link, to indicate how many
  peers need to ack a packet before it can be released.
- A corresponding counter in the skb user area, to keep track of how
  many peers a are left to ack before a buffer can be released.
- A new counter ("acked"), to keep persistent track of how far a peer
  has acked at the moment, i.e., where in the transmission queue to
  start updating buffers when the next ack arrives. This is to avoid
  double acknowledgements from a peer, with inadvertent relase of
  packets as a result.
- A more generic tipc_link_retrans() function, where retransmit starts
  from a given sequence number, instead of the first packet in the
  transmision queue. This is to minimize the number of retransmitted
  packets on the broadcast media.

When the new functionality is taken into use in the next commits,
we expect it to have minimal effect on unicast mode performance.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:32 -07:00
Jon Paul Maloy 323019069e tipc: use explicit allocation of broadcast send link
The broadcast link instance (struct tipc_link) used for sending is
currently aggregated into struct tipc_bclink. This means that we cannot
use the regular tipc_link_create() function for initiating the link, but
do instead have to initiate numerous fields directly from the
bcast_init() function.

We want to reduce dependencies between the broadcast functionality
and the inner workings of tipc_link. In this commit, we introduce
a new function tipc_bclink_create() to link.c, and allocate the
instance of the link separately using this function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:30 -07:00
Jon Paul Maloy 0e05498e9e tipc: make link implementation independent from struct tipc_bearer
In reality, the link implementation is already independent from
struct tipc_bearer, in that it doesn't store any reference to it.
However, we still pass on a pointer to a bearer instance in the
function tipc_link_create(), just to have it extract some
initialization information from it.

I later commits, we need to create instances of tipc_link without
having any associated struct tipc_bearer. To facilitate this, we
want to extract the initialization data already in the creator
function in node.c, before calling tipc_link_create(), and pass
this info on as individual parameters in the call.

This commit introduces this change.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:30 -07:00
Jon Paul Maloy 5fd9fd6351 tipc: create broadcast transmission link at namespace init
The broadcast transmission link is currently instantiated when the
network subsystem is started, i.e., on order from user space via netlink.

This forces the broadcast transmission code to do unnecessary tests for
the existence of the transmission link, as well in single mode node as
in network mode.

In this commit, we do instead create the link during initialization of
the name space, and remove it when it is stopped. The fact that the
transmission link now has a guaranteed longer life cycle than any of its
potential clients paves the way for further code simplifcations
and optimizations.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:27 -07:00
Jon Paul Maloy 0043550b0a tipc: move broadcast link lock to struct tipc_net
The broadcast lock will need to be acquired outside bcast.c in a later
commit. For this reason, we move the lock to struct tipc_net. Consistent
with the changes in the previous commit, we also introducee two new
functions tipc_bcast_lock() and tipc_bcast_unlock(). The code that is
currently using tipc_bclink_lock()/unlock() will be phased out during
the coming commits in this series.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:25 -07:00
Jon Paul Maloy 6beb19a62a tipc: move bcast definitions to bcast.c
Currently, a number of structure and function definitions related
to the broadcast functionality are unnecessarily exposed in the file
bcast.h. This obscures the fact that the external interface towards
the broadcast link in fact is very narrow, and causes unnecessary
recompilations of other files when anything changes in those
definitions.

In this commit, we move as many of those definitions as is currently
possible to the file bcast.c.

We also rename the structure 'tipc_bclink' to 'tipc_bc_base', both
since the name does not correctly describe the contents of this
struct, and will do so even less in the future, and because we want
to use the term 'link' more appropriately in the functionality
introduced later in this series.

Finally, we rename a couple of functions, such as tipc_bclink_xmit()
and others that will be kept in the future, to include the term 'bcast'
instead.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:56:24 -07:00
David S. Miller ba3e2084f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv6/xfrm6_output.c
	net/openvswitch/flow_netlink.c
	net/openvswitch/vport-gre.c
	net/openvswitch/vport-vxlan.c
	net/openvswitch/vport.c
	net/openvswitch/vport.h

The openvswitch conflicts were overlapping changes.  One was
the egress tunnel info fix in 'net' and the other was the
vport ->send() op simplification in 'net-next'.

The xfrm6_output.c conflicts was also a simplification
overlapping a bug fix.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:54:12 -07:00
David S. Miller a72c9512bf Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-10-22

Here's probably the last bluetooth-next pull request for 4.4. Among
several other changes it contains the rest of the fixes & cleanups from
the Bluetooth UnplugFest (that didn't need to be hurried to 4.3).

 - Refactoring & cleanups to 6lowpan code
 - New USB ids for two Atheros controllers and BCM43142A0 from Broadcom
 - Fix (quirk) for broken Broadcom BCM2045 controllers
 - Support for latest Apple controllers
 - Improvements to the vendor diagnostic message support

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 05:13:16 -07:00
David S. Miller bf7958607d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-10-23

This series contains updates to i40e, i40evf, if_link, ixgbe and ixgbevf.

Anjali adds a workaround to drop any flow control frames from being
transmitted from any VSI, so that a malicious VF cannot send flow control
or PFC packets out on the wire.  Also fixed a bug in debugfs by grabbing
the filter list lock before adding or deleting a filter.

Akeem fixes an issue where we were unconditionally returning VEB bridge
mode before allowing LB in the add VSI routine, resolve by checking if
the bridge is actually in VEB mode first.

Mitch fixed an issue where the incorrect structure was being used for
VLAN filter list, which meant the VLAN filter list did not get
processed correctly and VLAN filters would not be re-enabled after any
kind of reset.

Helin fixed a problem of possibly getting inconsistent flow control
status after a PF reset.  The issue was requested_mode was being set
with a default value during probe, but the hardware state could be a
different value from this mode.

Carolyn fixed a problem where the driver output of the OEM version
string varied from the other tools.

Jean Sacren fixes up kernel documentation by fixing function header
comments to match actual variables used in the functions.  Also
cleaned up variable initialization, when the variable would be
over-written immediately.

Hiroshi Shimanoto provides three patches to add "trusted" VF by adding
netlink directives and an NDO entry.  Then implement these new controls
in ixgbe and ixgbevf.  This series has gone through several iterations
to address all the suggested community changes and concerns.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 06:58:09 -07:00
Robert Shearman 1c78efa831 mpls: flow-based multipath selection
Change the selection of a multipath route to use a flow-based
hash. This more suitable for traffic sensitive to reordering within a
flow (e.g. TCP, L2VPN) and whilst still allowing a good distribution
of traffic given enough flows.

Selection of the path for a multipath route is done using a hash of:
1. Label stack up to MAX_MP_SELECT_LABELS labels or up to and
   including entropy label, whichever is first.
2. 3-tuple of (L3 src, L3 dst, proto) from IPv4/IPv6 header in MPLS
   payload, if present.

Naturally, a 5-tuple hash using L4 information in addition would be
possible and be better in some scenarios, but there is a tradeoff
between looking deeper into the packet to achieve good distribution,
and packet forwarding performance, and I have erred on the side of the
latter as the default.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 06:26:45 -07:00
Roopa Prabhu f8efb73c97 mpls: multipath route support
This patch adds support for MPLS multipath routes.

Includes following changes to support multipath:
- splits struct mpls_route into 'struct mpls_route + struct mpls_nh'

- 'struct mpls_nh' represents a mpls nexthop label forwarding entry

- moves mpls route and nexthop structures into internal.h

- A mpls_route can point to multiple mpls_nh structs

- the nexthops are maintained as a array (similar to ipv4 fib)

- In the process of restructuring, this patch also consistently changes
  all labels to u8

- Adds support to parse/fill RTA_MULTIPATH netlink attribute for
multipath routes similar to ipv4/v6 fib

- In this patch, the multipath route nexthop selection algorithm
simply returns the first nexthop. It is replaced by a
hash based algorithm from Robert Shearman in the next patch

- mpls_route_update cleanup: remove 'dev' handling in mpls_route_update.
mpls_route_update though implemented to update based on dev, it was
never used that way. And the dev handling gets tricky with multiple
nexthops. Cannot match against any single nexthops dev. So, this patch
removes the unused 'dev' handling in mpls_route_update.

- dead route/path handling will be implemented in a subsequent patch

Example:

$ip -f mpls route add 100 nexthop as 200 via inet 10.1.1.2 dev swp1 \
                nexthop as 700 via inet 10.1.1.6 dev swp2 \
                nexthop as 800 via inet 40.1.1.2 dev swp3

$ip  -f mpls route show
100
        nexthop as to 200 via inet 10.1.1.2  dev swp1
        nexthop as to 700 via inet 10.1.1.6  dev swp2
        nexthop as to 800 via inet 40.1.1.2  dev swp3

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 06:26:42 -07:00
Li RongQing ce9d9b8e5c net: sysctl: fix a kmemleak warning
the returned buffer of register_sysctl() is stored into net_header
variable, but net_header is not used after, and compiler maybe
optimise the variable out, and lead kmemleak reported the below warning

	comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s)
	hex dump (first 32 bytes):
	90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8..............
	01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
	backtrace:
	[<ffffffc00020f134>] create_object+0x10c/0x2a0
	[<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0
	[<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8
	[<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0
	[<ffffffc00028eef0>] register_sysctl+0x30/0x40
	[<ffffffc00099c304>] net_sysctl_init+0x20/0x58
	[<ffffffc000994dd8>] sock_init+0x10/0xb0
	[<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8
	[<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0
	[<ffffffc00070ed6c>] kernel_init+0x1c/0xe8
	[<ffffffc000083bfc>] ret_from_fork+0xc/0x50
	[<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>>

Before fix, the objdump result on ARM64:
0000000000000000 <net_sysctl_init>:
   0:   a9be7bfd        stp     x29, x30, [sp,#-32]!
   4:   90000001        adrp    x1, 0 <net_sysctl_init>
   8:   90000000        adrp    x0, 0 <net_sysctl_init>
   c:   910003fd        mov     x29, sp
  10:   91000021        add     x1, x1, #0x0
  14:   91000000        add     x0, x0, #0x0
  18:   a90153f3        stp     x19, x20, [sp,#16]
  1c:   12800174        mov     w20, #0xfffffff4                // #-12
  20:   94000000        bl      0 <register_sysctl>
  24:   b4000120        cbz     x0, 48 <net_sysctl_init+0x48>
  28:   90000013        adrp    x19, 0 <net_sysctl_init>
  2c:   91000273        add     x19, x19, #0x0
  30:   9101a260        add     x0, x19, #0x68
  34:   94000000        bl      0 <register_pernet_subsys>
  38:   2a0003f4        mov     w20, w0
  3c:   35000060        cbnz    w0, 48 <net_sysctl_init+0x48>
  40:   aa1303e0        mov     x0, x19
  44:   94000000        bl      0 <register_sysctl_root>
  48:   2a1403e0        mov     w0, w20
  4c:   a94153f3        ldp     x19, x20, [sp,#16]
  50:   a8c27bfd        ldp     x29, x30, [sp],#32
  54:   d65f03c0        ret
After:
0000000000000000 <net_sysctl_init>:
   0:   a9bd7bfd        stp     x29, x30, [sp,#-48]!
   4:   90000000        adrp    x0, 0 <net_sysctl_init>
   8:   910003fd        mov     x29, sp
   c:   a90153f3        stp     x19, x20, [sp,#16]
  10:   90000013        adrp    x19, 0 <net_sysctl_init>
  14:   91000000        add     x0, x0, #0x0
  18:   91000273        add     x19, x19, #0x0
  1c:   f90013f5        str     x21, [sp,#32]
  20:   aa1303e1        mov     x1, x19
  24:   12800175        mov     w21, #0xfffffff4                // #-12
  28:   94000000        bl      0 <register_sysctl>
  2c:   f9002260        str     x0, [x19,#64]
  30:   b40001a0        cbz     x0, 64 <net_sysctl_init+0x64>
  34:   90000014        adrp    x20, 0 <net_sysctl_init>
  38:   91000294        add     x20, x20, #0x0
  3c:   9101a280        add     x0, x20, #0x68
  40:   94000000        bl      0 <register_pernet_subsys>
  44:   2a0003f5        mov     w21, w0
  48:   35000080        cbnz    w0, 58 <net_sysctl_init+0x58>
  4c:   aa1403e0        mov     x0, x20
  50:   94000000        bl      0 <register_sysctl_root>
  54:   14000004        b       64 <net_sysctl_init+0x64>
  58:   f9402260        ldr     x0, [x19,#64]
  5c:   94000000        bl      0 <unregister_sysctl_table>
  60:   f900227f        str     xzr, [x19,#64]
  64:   2a1503e0        mov     w0, w21
  68:   f94013f5        ldr     x21, [sp,#32]
  6c:   a94153f3        ldp     x19, x20, [sp,#16]
  70:   a8c37bfd        ldp     x29, x30, [sp],#48
  74:   d65f03c0        ret

Add the possible error handle to free the net_header to remove the
kmemleak warning

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 06:22:08 -07:00
Hiroshi Shimamoto dd461d6aa8 if_link: Add control trust VF
Add netlink directives and ndo entry to trust VF user.

This controls the special permission of VF user.
The administrator will dedicatedly trust VF user to use some features
which impacts security and/or performance.

The administrator never turn it on unless VF user is fully trusted.

CC: Sy Jong Choi <sy.jong.choi@intel.com>
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-23 05:44:28 -07:00
Eric Dumazet 5e0724d027 tcp/dccp: fix hashdance race for passive sessions
Multiple cpus can process duplicates of incoming ACK messages
matching a SYN_RECV request socket. This is a rare event under
normal operations, but definitely can happen.

Only one must win the race, otherwise corruption would occur.

To fix this without adding new atomic ops, we use logic in
inet_ehash_nolisten() to detect the request was present in the same
ehash bucket where we try to insert the new child.

If request socket was not found, we have to undo the child creation.

This actually removes a spin_lock()/spin_unlock() pair in
reqsk_queue_unlink() for the fast path.

Fixes: e994b2f0fb ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 05:42:21 -07:00
Li RongQing f6b8dec998 af_key: fix two typos
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 03:05:19 -07:00
Paolo Abeni 7b1311807f ipv4: implement support for NOPREFIXROUTE ifa flag for ipv4 address
Currently adding a new ipv4 address always cause the creation of the
related network route, with default metric. When a host has multiple
interfaces on the same network, multiple routes with the same metric
are created.

If the userspace wants to set specific metric on each routes, i.e.
giving better metric to ethernet links in respect to Wi-Fi ones,
the network routes must be deleted and recreated, which is error-prone.

This patch implements the support for IFA_F_NOPREFIXROUTE for ipv4
address. When an address is added with such flag set, no associated
network route is created, no network route is deleted when
said IP is gone and it's up to the user space manage such route.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 02:54:54 -07:00
Hannes Frederic Sowa b72a2b01b6 ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues
Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 02:49:36 -07:00
Andrew Shewmaker c80dbe0461 tcp: allow dctcp alpha to drop to zero
If alpha is strictly reduced by alpha >> dctcp_shift_g and if alpha is less
than 1 << dctcp_shift_g, then alpha may never reach zero. For example,
given shift_g=4 and alpha=15, alpha >> dctcp_shift_g yields 0 and alpha
remains 15. The effect isn't noticeable in this case below cwnd=137, but
could gradually drive uncongested flows with leftover alpha down to
cwnd=137. A larger dctcp_shift_g would have a greater effect.

This change causes alpha=15 to drop to 0 instead of being decrementing by 1
as it would when alpha=16. However, it requires one less conditional to
implement since it doesn't have to guard against subtracting 1 from 0U. A
decay of 15 is not unreasonable since an equal or greater amount occurs at
alpha >= 240.

Signed-off-by: Andrew G. Shewmaker <agshew@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 02:46:52 -07:00
lucien ab997ad408 ipv6: fix the incorrect return value of throw route
The error condition -EAGAIN, which is signaled by throw routes, tells
the rules framework to walk on searching for next matches. If the walk
ends and we stop walking the rules with the result of a throw route we
have to translate the error conditions to -ENETUNREACH.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 02:38:18 -07:00
Steffen Klassert cb866e3298 xfrm: Increment statistic counter on inner mode error
Increment the LINUX_MIB_XFRMINSTATEMODEERROR statistic counter
to notify about dropped packets if we fail to fetch a inner mode.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-10-23 07:52:58 +02:00
Steffen Klassert ea673a4d3a xfrm4: Reload skb header pointers after calling pskb_may_pull.
A call to pskb_may_pull may change the pointers into the packet,
so reload the pointers after the call.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-10-23 07:32:39 +02:00
Steffen Klassert 1a14f1e555 xfrm4: Fix header checks in _decode_session4.
We skip the header informations if the data pointer points
already behind the header in question for some protocols.
This is because we call pskb_may_pull with a negative value
converted to unsigened int from pskb_may_pull in this case.
Skipping the header informations can lead to incorrect policy
lookups, so fix it by a check of the data pointer position
before we call pskb_may_pull.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-10-23 07:31:23 +02:00
Sowmini Varadhan e33d4f13d2 xfrm: Fix unaligned access to stats in copy_to_user_state()
On sparc, deleting established SAs (e.g., by restarting ipsec)
results in unaligned access messages via xfrm_del_sa ->
km_state_notify -> xfrm_send_state_notify().

Even though struct xfrm_usersa_info is aligned on 8-byte boundaries,
netlink attributes are fundamentally only 4 byte aligned, and this
cannot be changed for nla_data() that is passed up to userspace.
As a result, the put_unaligned() macro needs to be used to
set up potentially unaligned fields such as the xfrm_stats in
copy_to_user_state()

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-10-23 06:49:29 +02:00
Pravin B Shelar fc4099f172 openvswitch: Fix egress tunnel info.
While transitioning to netdev based vport we broke OVS
feature which allows user to retrieve tunnel packet egress
information for lwtunnel devices.  Following patch fixes it
by introducing ndo operation to get the tunnel egress info.
Same ndo operation can be used for lwtunnel devices and compat
ovs-tnl-vport devices. So after adding such device operation
we can remove similar operation from ovs-vport.

Fixes: 614732eaa1 ("openvswitch: Use regular VXLAN net_device device").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22 19:39:25 -07:00
Jorgen Hansen 8566b86ab9 VSOCK: Fix lockdep issue.
The recent fix for the vsock sock_put issue used the wrong
initializer for the transport spin_lock causing an issue when
running with lockdep checking.

Testing: Verified fix on kernel with lockdep enabled.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22 18:26:29 -07:00
David S. Miller 199c655069 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2015-10-22

1) Fix IPsec pre-encap fragmentation for GSO packets.
   From Herbert Xu.

2) Fix some header checks in _decode_session6.
   We skip the header informations if the data pointer points
   already behind the header in question for some protocols.
   This is because we call pskb_may_pull with a negative value
   converted to unsigened int from pskb_may_pull in this case.
   Skipping the header informations can lead to incorrect policy
   lookups. From Mathias Krause.

3) Allow to change the replay threshold and expiry timer of a
   state without having to set other attributes like replay
   counter and byte lifetime. Changing these other attributes
   may break the SA. From Michael Rossberg.

4) Fix pmtu discovery for local generated packets.
   We may fail dispatch to the inner address family.
   As a reault, the local error handler is not called
   and the mtu value is not reported back to userspace.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22 07:46:05 -07:00