Commit Graph

29001 Commits

Author SHA1 Message Date
Michal Kubeček 49a18d86f6 ipv6: update ip6_rt_last_gc every time GC is run
As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 14:16:20 -07:00
Michal Kubeček 2ac3ac8f86 ipv6: prevent fib6_run_gc() contention
On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.

This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.

Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 14:16:20 -07:00
John W. Linville 7546ff9549 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2013-08-01 15:26:52 -04:00
John W. Linville 22e02a0272 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-08-01 14:30:59 -04:00
Simon Wunderlich 73da7d5bab mac80211: add channel switch command and beacon callbacks
The count field in CSA must be decremented with each beacon
transmitted. This patch implements the functionality for drivers
using ieee80211_beacon_get(). Other drivers must call back manually
after reaching count == 0.

This patch also contains the handling and finish worker for the channel
switch command, and mac80211/chanctx code to allow to change a channel
definition of an active channel context.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
[small cleanups, catch identical chandef]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 18:30:33 +02:00
Simon Wunderlich 16ef1fe272 nl80211/cfg80211: add channel switch command
To allow channel switch announcements within beacons, add
the channel switch command to nl80211/cfg80211. This is
implementation is intended for AP and (later) IBSS mode.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 18:30:28 +02:00
Johannes Berg 7cf1f14ecf mac80211: add debugfs for driver-buffered TID bitmap
Add a per-station debugfs file indicating the TIDs (as
a bitmap) that the driver has data buffered on.

Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 15:08:24 +02:00
J. Bruce Fields 7193bd17ea svcrpc: set cr_gss_mech from gss-proxy as well as legacy upcall
The change made to rsc_parse() in
0dc1531aca "svcrpc: store gss mech in
svc_cred" should also have been propagated to the gss-proxy codepath.
This fixes a crash in the gss-proxy case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-01 08:42:01 -04:00
J. Bruce Fields 743e217129 svcrpc: fix kfree oops in gss-proxy code
mech_oid.data is an array, not kmalloc()'d memory.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-01 08:41:29 -04:00
J. Bruce Fields dc43376c26 svcrpc: fix gss-proxy xdr decoding oops
Uninitialized stack data was being used as the destination for memcpy's.

Longer term we'll just delete some of this code; all we're doing is
skipping over xdr that we don't care about.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-01 08:40:42 -04:00
J. Bruce Fields 9f96392b0a svcrpc: fix gss_rpc_upcall create error
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-01 08:39:30 -04:00
NeilBrown 447383d2ba NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure.
Since we enabled auto-tuning for sunrpc TCP connections we do not
guarantee that there is enough write-space on each connection to
queue a reply.

If memory pressure causes the window to shrink too small, the request
throttling in sunrpc/svc will not accept any requests so no more requests
will be handled.  Even when pressure decreases the window will not
grow again until data is sent on the connection.
This means we get a deadlock:  no requests will be handled until there
is more space, and no space will be allocated until a request is
handled.

This can be simulated by modifying svc_tcp_has_wspace to inflate the
number of byte required and removing the 'svc_sock_setbufsize' calls
in svc_setup_socket.

I found that multiplying by 16 was enough to make the requirement
exceed the default allocation.  With this modification in place:
   mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
would block and eventually time out because the nfs server could not
accept any requests.

This patch relaxes the request throttling to always allow at least one
request through per connection.  It does this by checking both
  sk_stream_min_wspace() and xprt->xpt_reserved
are zero.
The first is zero when the TCP transmit queue is empty.
The second is zero when there are no RPC requests being processed.
When both of these are zero the socket is idle and so one more
request can safely be allowed through.

Applying this patch allows the above mount command to succeed cleanly.
Tracing shows that the allocated write buffer space quickly grows and
after a few requests are handled, the extra tests are no longer needed
to permit further requests to be processed.

The main purpose of request throttling is to handle the case when one
client is slow at collecting replies and the send queue gets full of
replies that the client hasn't acknowledged (at the TCP level) yet.
As we only change behaviour when the send queue is empty this main
purpose is still preserved.

Reported-by: Ben Myers <bpm@sgi.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-01 08:39:16 -04:00
Karl Beldan a824131017 mac80211: report some VHT radiotap infos for tx status
The radiotap VHT info is 12 bytes (required to be aligned on 2) :
u16 known - IEEE80211_RADIOTAP_VHT_KNOWN_*
u8 flags - IEEE80211_RADIOTAP_VHT_FLAG_*
u8 bandwidth
u8 mcs_nss[4]
u8 coding
u8 group_id
u16 partial_aid

ATM mac80211 can handle IEEE80211_RADIOTAP_VHT_KNOWN_{GI,BANDWIDTH} and
mcs_nss[0] (i.e single user) in simple cases.
This is more a placeholder to let sniffers give more clues for VHT,
since we don't have yet the proper infrastructure/conventions
in mac80211 for complete feedback (e.g consider dynamic BW).

Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 10:49:04 +02:00
Antonio Quartulli dad9defd28 mac80211: ibss - remove not authorized station earlier
A station which is not authorized has to be purged earlier
to give it a chance to re-try to establish an IBSS/RSN
session soon. Set the timeout to 10 seconds.

Some refactoring has also been done to allow the IBSS
submodule to have its own expiring function.

Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 10:49:02 +02:00
Fabio Baltieri e47f2509e5 mac80211: use oneshot blink API for LED triggers
Change mac80211 LED trigger code to use the generic
led_trigger_blink_oneshot() API for transmit and receive activity
indication.

This gives a better feedback to the user, as with the new API each
activity event results in a visible blink, while a constant traffic
results in a continuous blink at constant rate.

Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
[fix LED disabled build error]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-08-01 10:48:49 +02:00
Hannes Frederic Sowa 46b3a42190 ipv6: fib6_rules should return exact return value
With the addition of the suppress operation
(7764a45a8f ("fib_rules: add .suppress
operation") we rely on accurate error reporting of the fib_rules.actions.

fib6_rule_action always returned -EAGAIN in case we could not find a
matching route and 0 if a rule was matched. This also included a match
for blackhole or prohibited rule actions which could get suppressed by
the new logic.

So adapt fib6_rule_action to always return the correct error code as
its counterpart fib4_rule_action does. This also fixes a possiblity of
nullptr-deref where we don't find a table, thus rt == NULL. Because
the condition rt != ip6_null_entry still holdes it seems we could later
get a nullptr bug on dereference rt->dst.

v2:
a) Fixed a brain fart in the commit msg (the rule => a table, etc). No
   changes to the patch.

Cc: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 00:26:22 -07:00
Linus Lüssing b00589af3b bridge: disable snooping if there is no querier
If there is no querier on a link then we won't get periodic reports and
therefore won't be able to learn about multicast listeners behind ports,
potentially leading to lost multicast packets, especially for multicast
listeners that joined before the creation of the bridge.

These lost multicast packets can appear since c5c2326059
("bridge: Add multicast_querier toggle and disable queries by default")
in particular.

With this patch we are flooding multicast packets if our querier is
disabled and if we didn't detect any other querier.

A grace period of the Maximum Response Delay of the querier is added to
give multicast responses enough time to arrive and to be learned from
before disabling the flooding behaviour again.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 17:40:21 -07:00
Stefan Tomanek 7764a45a8f fib_rules: add .suppress operation
This change adds a new operation to the fib_rules_ops struct; it allows the
suppression of routing decisions if certain criteria are not met by its
results.

The first implemented constraint is a minimum prefix length added to the
structures of routing rules. If a rule is added with a minimum prefix length
>0, only routes meeting this threshold will be considered. Any other (more
general) routing table entries will be ignored.

When configuring a system with multiple network uplinks and default routes, it
is often convinient to reference the main routing table multiple times - but
omitting the default route. Using this patch and a modified "ip" utility, this
can be achieved by using the following command sequence:

  $ ip route add table secuplink default via 10.42.23.1

  $ ip rule add pref 100            table main prefixlength 1
  $ ip rule add pref 150 fwmark 0xA table secuplink

With this setup, packets marked 0xA will be processed by the additional routing
table "secuplink", but only if no suitable route in the main routing table can
be found. By using a minimal prefixlength of 1, the default route (/0) of the
table "main" is hidden to packets processed by rule 100; packets traveling to
destinations with more specific routing entries are processed as usual.

Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 17:27:17 -07:00
Dan Carpenter 8cb3b9c364 net_sched: info leak in atm_tc_dump_class()
The "pvc" struct has a hole after pvc.sap_family which is not cleared.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 15:04:19 -07:00
Eric Dumazet f2f872f927 netem: Introduce skb_orphan_partial() helper
Commit 547669d483 ("tcp: xps: fix reordering issues") added
unexpected reorders in case netem is used in a MQ setup for high
performance test bed.

ETH=eth0
tc qd del dev $ETH root 2>/dev/null
tc qd add dev $ETH root handle 1: mq
for i in `seq 1 32`
do
 tc qd add dev $ETH parent 1:$i netem delay 100ms
done

As all tcp packets are orphaned by netem, TCP stack believes it can
set skb->ooo_okay on all packets.

In order to allow producers to send more packets, we want to
keep sk_wmem_alloc from reaching sk_sndbuf limit.

We can do that by accounting one byte per skb in netem queues,
so that TCP stack is not fooled too much.

Tested:

With above MQ/netem setup, scaling number of concurrent flows gives
linear results and no reorders/retransmits

lpq83:~# for n in 1 10 20 30 40 50 60 70 80 90 100
 do echo -n "n:$n " ; ./super_netperf $n -H 10.7.7.84; done
n:1 198.46
n:10 2002.69
n:20 4000.98
n:30 6006.35
n:40 8020.93
n:50 10032.3
n:60 12081.9
n:70 13971.3
n:80 16009.7
n:90 17117.3
n:100 17425.5

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 14:59:49 -07:00
fan.du ca4c3fc24e net: split rt_genid for ipv4 and ipv6
Current net name space has only one genid for both IPv4 and IPv6, it has below
drawbacks:

- Add/delete an IPv4 address will invalidate all IPv6 routing table entries.
- Insert/remove XFRM policy will also invalidate both IPv4/IPv6 routing table
  entries even when the policy is only applied for one address family.

Thus, this patch attempt to split one genid for two to cater for IPv4 and IPv6
separately in a fine granularity.

Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 14:56:36 -07:00
Linus Torvalds 06693f305e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix association failures not triggering a connect-failure event in
    cfg80211, from Johannes Berg.

 2) Eliminate a potential NULL deref with older iptables tools when
    configuring xt_socket rules, from Eric Dumazet.

 3) Missing RTNL locking in wireless regulatory code, from Johannes
    Berg.

 4) Fix OOPS caused by firmware loading races in ath9k_htc, from Alexey
    Khoroshilov.

 5) Fix usb URB leak in usb_8dev CAN driver, also from Alexey
    Khoroshilov.

 6) VXLAN namespace teardown fails to unregister devices, from Stephen
    Hemminger.

 7) Fix multicast settings getting dropped by firmware in qlcnic driver,
    from Sucheta Chakraborty.

 8) Add sysctl range enforcement for tcp_syn_retries, from Michal Tesar.

 9) Fix a nasty bug in bridging where an active timer would get
    reinitialized with a setup_timer() call.  From Eric Dumazet.

10) Fix use after free in new mlx5 driver, from Dan Carpenter.

11) Fix freed pointer reference in ipv6 multicast routing on namespace
    cleanup, from Hannes Frederic Sowa.

12) Some usbnet drivers report TSO and SG in their feature set, but the
    usbnet layer doesn't really support them.  From Eric Dumazet.

13) Fix crash on EEH errors in tg3 driver, from Gavin Shan.

14) Drop cb_lock when requesting modules in genetlink, from Stanislaw
    Gruszka.

15) Kernel stack leaks in cbq scheduler and af_key pfkey messages, from
    Dan Carpenter.

16) FEC driver erroneously signals NETDEV_TX_BUSY on transmit leading to
    endless loops, from Uwe Kleine-König.

17) Fix hangs from loading mvneta driver, from Arnaud Patard.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (84 commits)
  mlx5: fix error return code in mlx5_alloc_uuars()
  mvneta: Try to fix mvneta when compiled as module
  mvneta: Fix hang when loading the mvneta driver
  atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
  genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE
  af_key: more info leaks in pfkey messages
  net/fec: Don't let ndo_start_xmit return NETDEV_TX_BUSY without link
  net_sched: Fix stack info leak in cbq_dump_wrr().
  igb: fix vlan filtering in promisc mode when not in VT mode
  ixgbe: Fix Tx Hang issue with lldpad on 82598EB
  genetlink: release cb_lock before requesting additional module
  net: fec: workaround stop tx during errata ERR006358
  qlcnic: Fix diagnostic interrupt test for 83xx adapters.
  qlcnic: Fix setting Guest VLAN
  qlcnic: Fix operation type and command type.
  qlcnic: Fix initialization of work function.
  Revert "atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring"
  atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
  net/tg3: Fix warning from pci_disable_device()
  net/tg3: Fix kernel crash
  ...
2013-07-31 12:56:18 -07:00
Johannes Berg ddfe49b42d mac80211: continue using disabled channels while connected
In case the AP has different regulatory information than we do,
it can happen that we connect to an AP based on e.g. the world
roaming regulatory data, and then update our database with the
AP's country information disables the channel the AP is using.
If this happens on an HT AP, the bandwidth tracking code will
hit the WARN_ON() and disconnect. Since that's not very useful,
ignore the channel-disable flag in bandwidth tracking.

Cc: stable@vger.kernel.org
Reported-by: Chris Wright <chrisw@sous-sol.org>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-31 21:18:17 +02:00
Johannes Berg 74418edec9 cfg80211: fix P2P GO interface teardown
When a P2P GO interface goes down, cfg80211 doesn't properly
tear it down, leading to warnings later. Add the GO interface
type to the enumeration to tear it down like AP interfaces.
Otherwise, we leave it pending and mac80211's state can get
very confused, leading to warnings later.

Cc: stable@vger.kernel.org
Reported-by: Ilan Peer <ilan.peer@intel.com>
Tested-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-31 21:18:17 +02:00
Johannes Berg 5cdaed1e87 mac80211: ignore HT primary channel while connected
While we're connected, the AP shouldn't change the primary channel
in the HT information. We checked this, and dropped the connection
if it did change it.

Unfortunately, this is causing problems on some APs, e.g. on the
Netgear WRT610NL: the beacons seem to always contain a bad channel
and if we made a connection using a probe response (correct data)
we drop the connection immediately and can basically not connect
properly at all.

Work around this by ignoring the HT primary channel information in
beacons if we're already connected.

Also print out more verbose messages in the other situations to
help diagnose similar bugs quicker in the future.

Cc: stable@vger.kernel.org [3.10]
Acked-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-31 21:18:10 +02:00
Dmitry Popov c0155b2da4 tcp: Remove unused tcpct declarations and comments
Remove declaration, 4 defines and confusing comment that are no longer used
since 1a2c6181c4 ("tcp: Remove TCPCT").

Signed-off-by: Dmitry Popov <dp@highloadlab.com>
Acked-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 12:16:45 -07:00
Johannes Berg cb236d2d71 mac80211: don't wait for TX status forever
TX status notification can get lost, or the frames could
get stuck on the queue, so don't wait for the callback
from the driver forever and instead time out after half
a second.

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-31 21:16:17 +02:00
John W. Linville 11a45820d0 This is the second NFC fixes pull request for 3.11.
We have:
 
 - A build failure fix for the NCI SPI transport layer due to a
   missing CRC_CCITT Kconfig dependency.
 
 - A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
   merge window but the typical terminology for loading a firmware to a
   target is firmware download rather than upload. In order to avoid any
   confusion in a file exported to userspace, we rename this command into
   CMD_FW_DOWNLOAD.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR+E9jAAoJEIqAPN1PVmxKzkwP/3dy9wpwQG7f8FLv61IhbhhQ
 8gVqY3BX1RJdez1vH5MvqkTK6U3SmlQvtJM8pIPyfPXyR1+Af5AxqQh3vjTP3xUG
 PuMtmQOlz5OJ6ErttxZtYERVtrhFkasMVmVqKrN9ptItPADfOmeC0/hyoEnoYsWQ
 HrhZn1lYsf98zmEbNS2KoRcZUVLClbg4xosTktTVaz56jIGVuM8MAch+FS+tJhhl
 av0MX/VZvAUllSnlWWDmt0Lh9isJOLOMtqIRj6PLBAp2ra9sPNO5TlZ4lz2og2gx
 zesVhBBLyiF9oluuQj/FJft+s5Khcm0R9W969raL5SvehWY77wHoY76ZqHMUE2Qv
 7RPUvFRfOA5LvKJM8MduJ8fMf830mZWD7cByhIfUxtWQZumwPfn2Mbl3xNkPLFZB
 L2x13SwGjU+PdCo70+ybgr8zUvYIxiVULwq5xFynvXJNSpOujIe3nPdQb7QtK8C0
 4d9OudAHmfHsW93PMBE+Zki8i8GDLTR3DOQoXIRi7oPR+EVL2JDsBQvnauXhdSap
 mp9iyuoqAYjgc6e2o8coVqViXWbKmBEa9n7NKrX3dPrI9e5F67WChAyehBCu9KV3
 zZxruhEJBw6PLmIGDETk1XIVd9G6rfMBswnDfSJBjjG5PrUh6Xbfwa1y+KiRKqCh
 FG+IvbfHWZRmdeFX3U4P
 =p4r5
 -----END PGP SIGNATURE-----

Merge tag 'nfc-fixes-3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes

Samuel Ortiz <sameo@linux.intel.com> says:

'This is the second NFC fixes pull request for 3.11.

We have:

- A build failure fix for the NCI SPI transport layer due to a
  missing CRC_CCITT Kconfig dependency.

- A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
  merge window but the typical terminology for loading a firmware to a
  target is firmware download rather than upload. In order to avoid any
  confusion in a file exported to userspace, we rename this command into
  CMD_FW_DOWNLOAD."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-07-31 15:15:50 -04:00
Chris Wright b56e4b857c mac80211: fix infinite loop in ieee80211_determine_chantype
Commit "3d9646d mac80211: fix channel selection bug" introduced a possible
infinite loop by moving the out target above the chandef_downgrade
while loop.  When we downgrade to NL80211_CHAN_WIDTH_20_NOHT, we jump
back up to re-run the while loop...indefinitely.  Replace goto with
break and carry on.  This may not be sufficient to connect to the AP,
but will at least keep the cpu from livelocking.  Thanks to Derek Atkins
as an extra pair of debugging eyes.

Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-31 21:15:36 +02:00
John W. Linville 704278ccb5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Conflicts:
	net/bluetooth/hci_core.c
2013-07-31 15:11:50 -04:00
Dan Carpenter 4299c8a94f net: remove an unneeded check
"ifa->ifa_label" is an array inside the in_ifaddr struct.  It can never
be NULL so we can remove this check.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 19:24:16 -07:00
Tom Herbert b438f940d3 flow_dissector: add support for IPPROTO_IPV6
Support IPPROTO_IPV6 similar to IPPROTO_IPIP

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 19:16:52 -07:00
Tom Herbert fca4189551 flow_dissector: clean up IPIP case
Explicitly set proto to ETH_P_IP and jump directly to ip processing.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 19:16:51 -07:00
Jiri Pirko ff80e519ab net: export physical port id via sysfs
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 17:31:25 -07:00
Jiri Pirko 66cae9ed6b rtnl: export physical port id via RT netlink
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 17:31:24 -07:00
Jiri Pirko 66b52b0dc8 net: add ndo to get id of physical port of the device
This patch adds a ndo for getting physical port of the device. Driver
which is aware of being virtual function of some physical port should
implement this ndo. This is applicable not only for IOV, but for other
solutions (NPAR, multichannel) as well. Basically if there is possible
to have multiple netdevs on the single hw port.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 17:31:24 -07:00
Thomas Graf ffd756b317 pktgen: Require CONFIG_INET due to use of IPv4 checksum function
Unlike for IPv6, the IPv4 checksum functions are only available
if CONFIG_INET is set.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 16:45:09 -07:00
Pablo Neira e1ee3673a8 genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE
Currently, it is not possible to use neither NLM_F_EXCL nor
NLM_F_REPLACE from genetlink. This is due to this checking in
genl_family_rcv_msg:

	if (nlh->nlmsg_flags & NLM_F_DUMP)

NLM_F_DUMP is NLM_F_MATCH|NLM_F_ROOT. Thus, if NLM_F_EXCL or
NLM_F_REPLACE flag is set, genetlink believes that you're
requesting a dump and it calls the .dumpit callback.

The solution that I propose is to refine this checking to
make it stricter:

	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)

And given the combination NLM_F_REPLACE and NLM_F_EXCL does
not make sense to me, it removes the ambiguity.

There was a patch that tried to fix this some time ago (0ab03c2
netlink: test for all flags of the NLM_F_DUMP composite) but it
tried to resolve this ambiguity in *all* existing netlink subsystems,
not only genetlink. That patch was reverted since it broke iproute2,
which is using NLM_F_ROOT to request the dump of the routing cache.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 16:43:19 -07:00
Dan Carpenter ff862a4668 af_key: more info leaks in pfkey messages
This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages".  There are some struct members which don't get initialized
and could disclose small amounts of private information.

Acked-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 16:26:16 -07:00
Samuel Ortiz 9ea7187c53 NFC: netlink: Rename CMD_FW_UPLOAD to CMD_FW_DOWNLOAD
Loading a firmware into a target is typically called firmware
download, not firmware upload. So we rename the netlink API to
NFC_CMD_FW_DOWNLOAD in order to avoid any terminology confusion from
userspace.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-07-31 01:19:43 +02:00
Hannes Frederic Sowa 5ad37d5dee tcp: add tcp_syncookies mode to allow unconditionally generation of syncookies
| If you want to test which effects syncookies have to your
| network connections you can set this knob to 2 to enable
| unconditionally generation of syncookies.

Original idea and first implementation by Eric Dumazet.

Cc: Florian Westphal <fw@strlen.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 16:15:18 -07:00
Andi Shyti 60ff779c4a 9p: client: remove unused code and any reference to "cancelled" function
This patch reverts commit

80b45261a0

which was implementing a 'cancelled' functionality to notify that
a cancelled request will not be replied.

This implementation was not used anywhere and therefore removed.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 15:54:28 -07:00
Johannes Berg c319d50bfc nl80211: fix another nl80211_fam.attrbuf race
This is similar to the race Linus had reported, but in this case
it's an older bug: nl80211_prepare_wdev_dump() uses the wiphy
index in cb->args[0] as it is and thus parses the message over
and over again instead of just once because 0 is the first valid
wiphy index. Similar code in nl80211_testmode_dump() correctly
offsets the wiphy_index by 1, do that here as well.

Cc: stable@vger.kernel.org
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-30 22:40:34 +02:00
David S. Miller a0db856a95 net_sched: Fix stack info leak in cbq_dump_wrr().
Make sure the reserved fields, and padding (if any), are
fully initialized.

Based upon a patch by Dan Carpenter and feedback from
Joe Perches.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-30 00:16:21 -07:00
Johan Hedberg 53e21fbc28 Bluetooth: Fix calling request callback more than once
In certain circumstances, such as an HCI driver using __hci_cmd_sync_ev
with HCI_EV_CMD_COMPLETE as the expected completion event there is the
chance that hci_event_packet will call hci_req_cmd_complete twice (once
for the explicitly looked after event and another time in the actual
handler of cmd_complete).

In the case of __hci_cmd_sync_ev this introduces a race where the first
call wakes up the blocking __hci_cmd_sync_ev and lets it complete.
However, by the time that a second __hci_cmd_sync_ev call is already in
progress the second hci_req_cmd_complete call (from the previous
operation) will wake up the blocking function prematurely and cause it
to fail, as witnessed by the following log:

[  639.232195] hci_rx_work: hci0 Event packet
[  639.232201] hci_req_cmd_complete: opcode 0xfc8e status 0x00
[  639.232205] hci_sent_cmd_data: hci0 opcode 0xfc8e
[  639.232210] hci_req_sync_complete: hci0 result 0x00
[  639.232220] hci_cmd_complete_evt: hci0 opcode 0xfc8e
[  639.232225] hci_req_cmd_complete: opcode 0xfc8e status 0x00
[  639.232228] __hci_cmd_sync_ev: hci0 end: err 0
[  639.232234] __hci_cmd_sync_ev: hci0
[  639.232238] hci_req_add_ev: hci0 opcode 0xfc8e plen 250
[  639.232242] hci_prepare_cmd: skb len 253
[  639.232246] hci_req_run: length 1
[  639.232250] hci_sent_cmd_data: hci0 opcode 0xfc8e
[  639.232255] hci_req_sync_complete: hci0 result 0x00
[  639.232266] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 1
[  639.232271] __hci_cmd_sync_ev: hci0 end: err 0
[  639.232276] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-61)

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-07-29 12:28:04 +01:00
Johan Hedberg 3f8e2d75c1 Bluetooth: Fix HCI init for BlueFRITZ! devices
None of the BlueFRITZ! devices with manufacurer ID 31 (AVM Berlin)
support HCI_Read_Local_Supported_Commands. It is safe to use the
manufacturer ID (instead of e.g. a USB ID specific quirk) because the
company never created any newer controllers.

< HCI Command: Read Local Supported Comm.. (0x04|0x0002) plen 0 [hci0] 0.210014
> HCI Event: Command Status (0x0f) plen 4 [hci0] 0.217361
      Read Local Supported Commands (0x04|0x0002) ncmd 1
        Status: Unknown HCI Command (0x01)

Reported-by: Jörg Esser <jackfritt@boh.de>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Tested-by: Jörg Esser <jackfritt@boh.de>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-07-29 12:12:27 +01:00
Stephen Rothwell 73d94e9481 pktgen: add needed include file
Fixes this on PowerPC (at least):

net/core/pktgen.c: In function 'fill_packet_ipv6':
net/core/pktgen.c:2906:3: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
   udph->check = ~csum_ipv6_magic(&iph->saddr, &iph->daddr, udplen, IPPROTO_UDP, 0);
   ^

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-29 00:47:14 -07:00
Hannes Frederic Sowa 9d4a031464 ipv4, ipv6: send igmpv3/mld packets with TC_PRIO_CONTROL
v2:
a) Also send ipv4 igmp messages with TC_PRIO_CONTROL

Cc: William Manley <william.manley@youview.com>
Cc: Lukas Tribus <luky-37@hotmail.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-28 11:13:55 -07:00
Stanislaw Gruszka c74f2b2678 genetlink: release cb_lock before requesting additional module
Requesting external module with cb_lock taken can result in
the deadlock like showed below:

[ 2458.111347] Showing all locks held in the system:
[ 2458.111347] 1 lock held by NetworkManager/582:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40
[ 2458.111347] 1 lock held by modprobe/603:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30

[ 2461.579457] SysRq : Show Blocked State
[ 2461.580103]   task                        PC stack   pid father
[ 2461.580103] NetworkManager  D ffff880034b84500  4040   582      1 0x00000080
[ 2461.580103]  ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8
[ 2461.580103]  ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff
[ 2461.580103]  ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360
[ 2461.580103]  [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140
[ 2461.580103]  [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170
[ 2461.580103]  [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20
[ 2461.580103]  [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210
[ 2461.580103]  [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170
[ 2461.580103]  [<ffffffff81095cc3>] __request_module+0x1b3/0x370
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190
[ 2461.580103]  [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0
[ 2461.580103]  [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0
[ 2461.580103]  [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0
[ 2461.580103]  [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0
[ 2461.580103]  [<ffffffff8162bc88>] genl_rcv+0x28/0x40
[ 2461.580103]  [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190
[ 2461.580103]  [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750
[ 2461.580103]  [<ffffffff815db849>] sock_sendmsg+0x99/0xd0
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350
[ 2461.580103]  [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0
[ 2461.580103]  [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50
[ 2461.580103]  [<ffffffff810218b9>] ? sched_clock+0x9/0x10
[ 2461.580103]  [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80
[ 2461.580103]  [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100
[ 2461.580103]  [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[ 2461.580103]  [<ffffffff8120fec9>] ? fget_light+0xf9/0x510
[ 2461.580103]  [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510
[ 2461.580103]  [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80
[ 2461.580103]  [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] modprobe        D ffff88000f2c8000  4632   603    602 0x00000080
[ 2461.580103]  ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8
[ 2461.580103]  ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500
[ 2461.580103]  ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0
[ 2461.580103]  [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0
[ 2461.580103]  [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20
[ 2461.580103]  [<ffffffff8173492d>] ? down_write+0x9d/0xb2
[ 2461.580103]  [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211]
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211]
[ 2461.580103]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
[ 2461.580103]  [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50
[ 2461.580103]  [<ffffffff810f75af>] load_module+0x1c6f/0x27f0
[ 2461.580103]  [<ffffffff810f2c90>] ? store_uevent+0x40/0x40
[ 2461.580103]  [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1

Problem start to happen after adding net-pf-16-proto-16-family-nl80211
alias name to cfg80211 module by below commit (though that commit
itself is perfectly fine):

commit fb4e156886
Author: Marcel Holtmann <marcel@holtmann.org>
Date:   Sun Apr 28 16:22:06 2013 -0700

    nl80211: Add generic netlink module alias for cfg80211/nl80211

Reported-and-tested-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-27 22:19:20 -07:00
Thomas Graf 03c633e733 pktgen: Use ip_send_check() to compute checksum
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-27 22:18:00 -07:00
Thomas Graf c26bf4a513 pktgen: Add UDPCSUM flag to support UDP checksums
UDP checksums are optional, hence pktgen has been omitting them in
favour of performance. The optional flag UDPCSUM enables UDP
checksumming. If the output device supports hardware checksumming
the skb is prepared and marked CHECKSUM_PARTIAL, otherwise the
checksum is generated in software.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-27 22:16:36 -07:00
Asias He 82a54d0ebb VSOCK: Move af_vsock.h and vsock_addr.h to include/net
This is useful for other VSOCK transport implemented outside the
net/vmw_vsock/ directory to use these headers.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-27 22:14:06 -07:00
Joe Stringer 024ec3deac net/sctp: Refactor SCTP skb checksum computation
This patch consolidates the SCTP checksum calculation code from various
places to a single new function, sctp_compute_cksum(skb, offset).

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-27 20:07:15 -07:00
stephen hemminger 93d8bf9fb8 bridge: cleanup netpoll code
This started out with fixing a sparse warning, then I realized that
the wrapper function br_netpoll_info could just be collapsed away
by rolling it into the enable code.

Also, eliminate unnecessary goto's

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-26 15:24:32 -07:00
Francesco Fusco 555445cd11 neigh: prevent overflowing params in /proc/sys/net/ipv4/neigh/
Without this patch, the fields app_solicit, gc_thresh1, gc_thresh2,
gc_thresh3, proxy_qlen, ucast_solicit, mcast_solicit could have
assumed negative values when setting large numbers.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-26 14:22:10 -07:00
Gustavo Padovan fcee337704 Bluetooth: Fix race between hci_register_dev() and hci_dev_open()
If hci_dev_open() is called after hci_register_dev() added the device to
the hci_dev_list but before the workqueue are created we could run into a
NULL pointer dereference (see below).

This bug is very unlikely to happen, systems using bluetoothd to
manage their bluetooth devices will never see this happen.

BUG: unable to handle kernel NULL pointer dereference
0100
IP: [<ffffffff81077502>] __queue_work+0x32/0x3d0
(...)
Call Trace:
 [<ffffffff81077be5>] queue_work_on+0x45/0x50
 [<ffffffffa016e8ff>] hci_req_run+0xbf/0xf0 [bluetooth]
 [<ffffffffa01709b0>] ? hci_init2_req+0x720/0x720 [bluetooth]
 [<ffffffffa016ea06>] __hci_req_sync+0xd6/0x1c0 [bluetooth]
 [<ffffffff8108ee10>] ? try_to_wake_up+0x2b0/0x2b0
 [<ffffffff8150e3f0>] ? usb_autopm_put_interface+0x30/0x40
 [<ffffffffa016fad5>] hci_dev_open+0x275/0x2e0 [bluetooth]
 [<ffffffffa0182752>] hci_sock_ioctl+0x1f2/0x3f0 [bluetooth]
 [<ffffffff815c6050>] sock_do_ioctl+0x30/0x70
 [<ffffffff815c75f9>] sock_ioctl+0x79/0x2f0
 [<ffffffff811a8046>] do_vfs_ioctl+0x96/0x560
 [<ffffffff811a85a1>] SyS_ioctl+0x91/0xb0
 [<ffffffff816d989d>] system_call_fastpath+0x1a/0x1f

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-07-25 19:52:36 +01:00
Jaganath Kanakkassery da9910ac4a Bluetooth: Fix invalid length check in l2cap_information_rsp()
The length check is invalid since the length varies with type of
info response.

This was introduced by the commit cb3b3152b2

Because of this, l2cap info rsp is not handled and command reject is sent.

> ACL data: handle 11 flags 0x02 dlen 16
        L2CAP(s): Info rsp: type 2 result 0
          Extended feature mask 0x00b8
            Enhanced Retransmission mode
            Streaming mode
            FCS Option
            Fixed Channels
< ACL data: handle 11 flags 0x00 dlen 10
        L2CAP(s): Command rej: reason 0
          Command not understood

Cc: stable@vger.kernel.org
Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-07-25 19:52:30 +01:00
Arik Nemtsov 23df0b7319 regulatory: use correct regulatory initiator on wiphy register
The current regdomain was not always set by the core. This causes
cards with a custom regulatory domain to ignore user initiated changes
if done before the card was registered.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-25 09:52:46 +02:00
David S. Miller 1df86b4cee Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
This is another batch of fixes intended for the 3.11 stream.  FWIW,
this is the first request with fixes from the mac80211 and iwlwifi
trees as well.

Regarding the mac80211 bits, Johannes says:

"Here I have a fix for RSSI thresholds in mesh, two minstrel fixes from
Felix, an nl80211 fix from Michal and four various fixes I did myself."

As for the iwlwifi bits, Johannes says:

"Here I have a fix for debugfs directory creation (causing a spurious
error message), two scanning fixes from David Spinadel, an LED fix and
two patches related to a BA session problem that eventually caused
firmware crashes from Emmanuel and a small BT fix for older devices as
well as a workaround for a firmware problem with APs with very small
beacon intervals from myself."

Along with those:

Arend van Spriel addresses a lock-up and a NULL pointer dereference
in brcmfmac.

Daniel Drake fixes an unhandled interrupt during device tear down
in mwifiex.

Larry Finger corrects a wil6210 build error.

Oleksij Rempel fixes two ath9k_htc problems related to keeping the
driver and firmware in sync.

Solomon Peachy gives us a cw1200 fix to avoid an oops in monitor mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-25 00:06:59 -07:00
Florian Fainelli deceb4c062 net: fix comment above build_skb()
build_skb() specifies that the data parameter must come from a kmalloc'd
area, this is only true if frag_size equals 0, because then build_skb()
will use kzsize(data) to figure out the actual data size. Update the
comment to reflect that special condition.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:59:07 -07:00
Thomas Gleixner 18afa4b028 net: Make devnet_rename_seq static
No users outside net/core/dev.c.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:57:26 -07:00
Eric Dumazet c9bee3b7fd tcp: TCP_NOTSENT_LOWAT socket option
Idea of this patch is to add optional limitation of number of
unsent bytes in TCP sockets, to reduce usage of kernel memory.

TCP receiver might announce a big window, and TCP sender autotuning
might allow a large amount of bytes in write queue, but this has little
performance impact if a large part of this buffering is wasted :

Write queue needs to be large only to deal with large BDP, not
necessarily to cope with scheduling delays (incoming ACKS make room
for the application to queue more bytes)

For most workloads, using a value of 128 KB or less is OK to give
applications enough time to react to POLLOUT events in time
(or being awaken in a blocking sendmsg())

This patch adds two ways to set the limit :

1) Per socket option TCP_NOTSENT_LOWAT

2) A sysctl (/proc/sys/net/ipv4/tcp_notsent_lowat) for sockets
not using TCP_NOTSENT_LOWAT socket option (or setting a zero value)
Default value being UINT_MAX (0xFFFFFFFF), meaning this has no effect.

This changes poll()/select()/epoll() to report POLLOUT
only if number of unsent bytes is below tp->nosent_lowat

Note this might increase number of sendmsg()/sendfile() calls
when using non blocking sockets,
and increase number of context switches for blocking sockets.

Note this is not related to SO_SNDLOWAT (as SO_SNDLOWAT is
defined as :
 Specify the minimum number of bytes in the buffer until
 the socket layer will pass the data to the protocol)

Tested:

netperf sessions, and watching /proc/net/protocols "memory" column for TCP

With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory
used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458)

lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
TCPv6     1880      2   45458   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
TCP       1696    508   45458   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y

lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
TCPv6     1880      2   20567   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
TCP       1696    508   20567   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y

Using 128KB has no bad effect on the throughput or cpu usage
of a single flow, although there is an increase of context switches.

A bonus is that we hold socket lock for a shorter amount
of time and should improve latencies of ACK processing.

lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
Final       Final                                             %     Method %      Method
1651584     6291456     16384  20.00   17447.90   10^6bits/s  3.13  S      -1.00  U      0.353   -1.000  usec/KB

 Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':

           412,514 context-switches

     200.034645535 seconds time elapsed

lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
Final       Final                                             %     Method %      Method
1593240     6291456     16384  20.00   17321.16   10^6bits/s  3.35  S      -1.00  U      0.381   -1.000  usec/KB

 Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':

         2,675,818 context-switches

     200.029651391 seconds time elapsed

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-By: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:54:48 -07:00
Eric Dumazet 64dc61306c net: add sk_stream_is_writeable() helper
Several call sites use the hardcoded following condition :

sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)

Lets use a helper because TCP_NOTSENT_LOWAT support will change this
condition for TCP sockets.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:54:48 -07:00
Daniel Borkmann 91705c61b5 net: sctp: trivial: update mailing list address
The SCTP mailing list address to send patches or questions
to is linux-sctp@vger.kernel.org and not
lksctp-developers@lists.sourceforge.net anymore. Therefore,
update all occurences.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:53:38 -07:00
Hannes Frederic Sowa 905a6f96a1 ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup
Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer
which leads to a panic (from Srivatsa S. Bhat):

BUG: unable to handle kernel paging request at ffff882018552020
IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter
+ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a #4
Hardware name: IBM  -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012
Workqueue: netns cleanup_net
task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000
RIP: 0010:[<ffffffffa0366b02>]  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
RSP: 0018:ffff881039367bd8  EFLAGS: 00010286
RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200
RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68
RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222
R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040
R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0
Stack:
 ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000
 ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0
 ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0
Call Trace:
 [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6]
 [<ffffffff815bdecb>] inet_release+0xfb/0x220
 [<ffffffff815bddf2>] ? inet_release+0x22/0x220
 [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6]
 [<ffffffff8151c1d9>] sock_release+0x29/0xa0
 [<ffffffff81525520>] sk_release_kernel+0x30/0x70
 [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6]
 [<ffffffff8152fff9>] ops_exit_list+0x39/0x60
 [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0
 [<ffffffff81075e3a>] process_one_work+0x1da/0x610
 [<ffffffff81075dc9>] ? process_one_work+0x169/0x610
 [<ffffffff81076390>] worker_thread+0x120/0x3a0
 [<ffffffff81076270>] ? process_one_work+0x610/0x610
 [<ffffffff8107da2e>] kthread+0xee/0x100
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65
RIP  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
 RSP <ffff881039367bd8>
CR2: ffff882018552020

Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 17:02:13 -07:00
Jerry Snitselaar f585a991e1 fib_trie: potential out of bounds access in trie_show_stats()
With the <= max condition in the for loop, it will be always go 1
element further than needed. If the condition for the while loop is
never met, then max is MAX_STAT_DEPTH, and for loop will walk off the
end of nodesizes[].

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 16:05:14 -07:00
Andi Shyti 59ea52dc46 net: trans_rdma: remove unused function
This patch gets rid of the following warning:

net/9p/trans_rdma.c:594:12: warning: ‘rdma_cancelled’ defined but not used [-Wunused-function]
 static int rdma_cancelled(struct p9_client *client, struct p9_req_t *req)

The rdma_cancelled function is not called anywhere in the kernel

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 15:46:27 -07:00
fan.du 9225b23057 net: ipv6 eliminate parameter "int addrlen" in function fib6_add_1
The "int addrlen" in fib6_add_1 is rebundant, as we can get it from
parameter "struct in6_addr *addr" once we modified its type.
And also fix some coding style issues in fib6_add_1

Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 14:24:56 -07:00
fan.du 86a37def2b net ipv6: Remove rebundant rt6i_nsiblings initialization
Seting rt->rt6i_nsiblings to zero is rebundant, because above memset
zeroed the rest of rt excluding the first dst memember.

Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 14:24:56 -07:00
John W. Linville 18e1ccb6ca Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-07-24 11:50:38 -04:00
Amerigo Wang b9959fd3b0 vti: switch to new ip tunnel code
GRE tunnel and IPIP tunnel already switched to the new
ip tunnel code, VTI tunnel can use it too.

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Saurabh Mohan <saurabh.mohan@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-23 17:27:32 -07:00
Rami Rosen 2b52c3ada8 ip6mr: change the prototype of ip6_mr_forward().
This patch changes the prototpye of the ip6_mr_forward() method to return void
instead of int.

The ip6_mr_forward() method always returns 0; moreover, the return value of this
method is not checked anywhere.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-23 17:20:37 -07:00
Rami Rosen c4854ec8c4 ipmr: change the prototype of ip_mr_forward().
This patch changes the prototpye of the ip_mr_forward() method to return void
instead of int.

The ip_mr_forward() method always returns 0; moreover, the return value of this
method is not checked anywhere.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-23 17:01:05 -07:00
Jiri Pirko 4aa5dee4d9 net: convert resend IGMP to notifier event
Until now, bond_resend_igmp_join_requests() looks for vlans attached to
bonding device, bridge where bonding act as port manually. It does not
care of other scenarios, like stacked bonds or team device above. Make
this more generic and use netdev notifier to propagate the event to
upper devices and to actually call ip_mc_rejoin_groups().

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-23 16:52:47 -07:00
Stanislaw Gruszka cd34f647a7 mac80211: fix monitor interface suspend crash regression
My commit:

commit 12e7f51702
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Thu Feb 28 10:55:26 2013 +0100

    mac80211: cleanup generic suspend/resume procedures

removed check for deleting MONITOR and AP_VLAN when suspend. That can
cause a crash (i.e. in iwlagn_mac_remove_interface()) since we remove
interface in the driver that we did not add before.

Reference:
http://marc.info/?l=linux-kernel&m=137391815113860&w=2

Bisected-by: Ortwin Glück <odi@odi.ch>
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-23 14:02:08 +02:00
Yuchung Cheng ed08495c31 tcp: use RTT from SACK for RTO
If RTT is not available because Karn's check has failed or no
new packet is acked, use the RTT measured from SACK to estimate
the RTO. The sender can continue to estimate the RTO during loss
recovery or reordering event upon receiving non-partial ACKs.

This also changes when the RTO is re-armed. Previously it is
only re-armed when some data is cummulatively acknowledged (i.e.,
SND.UNA advances), but now it is re-armed whenever RTT estimator
is updated. This feature is particularly useful to reduce spurious
timeout for buffer bloat including cellular carriers [1], and
RTT estimation on reordering events.

[1] "An In-depth Study of LTE: Effect of Network Protocol and
 Application Behavior on Performance", In Proc. of SIGCOMM 2013

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 17:53:42 -07:00
Yuchung Cheng 59c9af4234 tcp: measure RTT from new SACK
Take RTT sample if an ACK selectively acks some sequences that
have never been retransmitted. The Karn's algorithm does not apply
even if that ACK (s)acks other retransmitted sequences, because it
must been generated by an original but perhaps out-of-order packet.
There is no ambiguity. In case when multiple blocks are newly
sacked because of ACK losses the earliest block is used to
measure RTT, similar to cummulative ACKs.

Such RTT samples allow the sender to estimate the RTO during loss
recovery and packet reordering events. It is still useful even with
TCP timestamps. That's because during these events the SND.UNA may
not advance preventing RTT samples from TS ECR (thus the FLAG_ACKED
check before calling tcp_ack_update_rtt()).  Therefore this new
RTT source is complementary to existing ACK and TS RTT mechanisms.

This patch does not update the RTO. It is done in the next patch.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 17:53:42 -07:00
Yuchung Cheng 5b08e47caf tcp: prefer packet timing to TS-ECR for RTT
Prefer packet timings to TS-ecr for RTT measurements when both
sources are available. That's because broken middle-boxes and remote
peer can return packets with corrupted TS ECR fields. Similarly most
congestion controls that require RTT signals favor timing-based
sources as well. Also check for bad TS ECR values to avoid RTT
blow-ups. It has happened on production Web servers.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 17:53:42 -07:00
Yuchung Cheng 375fe02c91 tcp: consolidate SYNACK RTT sampling
The first patch consolidates SYNACK and other RTT measurement to use a
central function tcp_ack_update_rtt(). A (small) bonus is now SYNACK
RTT measurement happens after PAWS check, potentially reducing the
impact of RTO seeding on bad TCP timestamps values.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 17:53:42 -07:00
Richard Cochran cb820f8e4b net: Provide a generic socket error queue delivery method for Tx time stamps.
This patch moves the private error queue delivery function from the
af_packet code to the core socket method. In this way, network layers
only needing the error queue for transmit time stamping can share common
code.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 14:58:19 -07:00
David S. Miller 7bd04bcf91 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes for your net tree,
they are:

* Fix potential NULL dereference in the socket match if revision 0
  is used, from Eric Dumazet.

* Fix missing expectation NAT initialization that results in dumping
  the NAT part via ctnetlink, thus leading to problems in expectation
  synchronization through conntrackd, from myself.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-22 14:32:39 -07:00
Chun-Yeow Yeoh 40d18ff959 mac80211: prevent the buffering or frame transmission to non-assoc mesh STA
This patch is intended to avoid the buffering to non-assoc mesh STA
and also to avoid the triggering of frame to non-assoc mesh STA which
could cause kernel panic in specific hw.

One of the examples, is kernel panic happens to ath9k if user space
inserts the mesh STA and not proceed with the SAE and AMPE, and later
the same mesh STA is detected again. The sta_state of the mesh STA remains
at IEEE80211_STA_NONE and if the ieee80211_sta_ps_deliver_wakeup is called
and subsequently the ath_tx_aggr_wakeup, the kernel panic due to
ath_tx_node_init is not called before to initialize the require data
structures.

This issue is reported by Cedric Voncken before.
http://www.spinics.net/lists/linux-wireless/msg106342.html

[<831ea6b4>] ath_tx_aggr_wakeup+0x44/0xcc [ath9k]
[<83084214>] ieee80211_sta_ps_deliver_wakeup+0xb8/0x208 [mac80211]
[<830b9824>] ieee80211_mps_sta_status_update+0x94/0x108 [mac80211]
[<83099398>] ieee80211_sta_ps_transition+0xc94/0x34d8 [mac80211]
[<8022399c>] nf_iterate+0x98/0x104
[<8309bb60>] ieee80211_sta_ps_transition+0x345c/0x34d8 [mac80211]

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-22 15:32:47 +02:00
Linus Torvalds 3be542d464 NFS client bugfixes for 3.11
- Fix a regression against NFSv4 FreeBSD servers when creating a new file
 - Fix another regression in rpc_client_register()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0
 TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z
 6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu
 oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg
 /Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e
 a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3
 t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4
 G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b
 4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd
 rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS
 vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0
 ZxkEgSQKLZSXYb5ab770
 =XE+7
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix a regression against NFSv4 FreeBSD servers when creating a new
   file
 - Fix another regression in rpc_client_register()

* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Fix a regression against the FreeBSD server
  SUNRPC: Fix another issue with rpc_client_register()
2013-07-20 10:48:24 -07:00
Eric Dumazet 1faabf2aab bridge: do not call setup_timer() multiple times
commit 9f00b2e7cf ("bridge: only expire the mdb entry when query is
received") added a nasty bug as an active timer can be reinitialized.

setup_timer() must be done once, no matter how many time mod_timer()
is called. br_multicast_new_group() is the right place to do this.

Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Diagnosed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-19 22:12:42 -07:00
Michal Tesar 651e92716a sysctl net: Keep tcp_syn_retries inside the boundary
Limit the min/max value passed to the
/proc/sys/net/ipv4/tcp_syn_retries.

Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-19 17:36:03 -07:00
Dragos Foianu aafee33423 net/irda: fixed style issues in irttp
Applied error fixes suggested by checpatch.pl

Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-19 17:34:40 -07:00
Frederic Danis 7427b370e0 NFC: Fix NCI over SPI build
kbuild test robot found following error:

     net/built-in.o: In function `nci_spi_send':
  >> spi.c:(.text+0x19a76f): undefined reference to `crc_ccitt'

Add CRC_CCITT module to Kconfig to fix it

Reported-by: kbuild test robot.
Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-07-19 16:55:26 +02:00
Linus Torvalds ecb2cf1a6b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A couple interesting SKB fragment handling fixes, plus the usual small
  bits here and there:

   1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from
      Tim Gardner.

   2) Get rid of a stupid reimplementation on "%*phC" in our sysfs MAC
      address printing helper.

   3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the
      device can't do checksumming offloads then it shouldn't say it can
      do SG either.  From Haiyang Zhang.

   4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens.

   5) Don't leak DMA mappings on mapping failures, from Neil Horman.

   6) We need to reset the transport header of SKBs in ipv4 before we
      attempt to perform early socket demux, just like ipv6 does.  From
      Eric Dumazet.

   7) Add missing locking on vxlan device removal, from Stephen
      Hemminger.

   8) xen-netfront has to make two passes over an SKB to prepare it for
      transfer.  One pass calculates the number of slots needed, the
      second massages the SKB and fills the slots.  Unfortunately, the
      first pass doesn't calculate the number of slots properly so we
      can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't
      work out so well.  Fix from Jan Beulich with help and discussion
      with several others.

   9) Fix a similar problem in tun and macvtap, which have to split up
      scatter-gather elements at PAGE_SIZE boundaries.  Don't do
      zerocopy if it would result in a > MAX_SKB_FRAGS skb.  Fixes from
      Jason Wang.

  10) On receive, once we've decoded the VLAN state completely, clear
      skb->vlan_tci.  Otherwise demuxed tunnels underneath can trigger
      the VLAN code again, corrupting the packet.  Fix from Eric
      Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  vlan: fix a race in egress prio management
  vlan: mask vlan prio bits
  macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
  tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
  pkt_sched: sch_qfq: remove a source of high packet delay/jitter
  xen-netfront: pull on receive skb may need to happen earlier
  vxlan: add necessary locking on device removal
  hyperv: Fix the NETIF_F_SG flag setting in netvsc
  net: Fix sysfs_format_mac() code duplication.
  be2net: Fix to avoid hardware workaround when not needed
  macvtap: do not assume 802.1Q when send vlan packets
  macvtap: fix the missing ret value of TUNSETQUEUE
  ipv4: set transport header earlier
  mlx5 core: Fix __udivdi3 when compiling for 32 bit arches
  bgmac: add dependency to phylib
  net/irda: fixed style issues in irlan_eth
  ethtool: fixed trailing statements in ethtool
  ndisc: bool initializations should use true and false
  atl1e: unmap partially mapped skb on dma error and free skb
2013-07-18 20:08:47 -07:00
Eric Dumazet 3e3aac4975 vlan: fix a race in egress prio management
egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.

We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:07:13 -07:00
Eric Dumazet d4b812dea4 vlan: mask vlan prio bits
In commit 48cc32d38a
("vlan: don't deliver frames for unknown vlans to protocols")
Florian made sure we set pkt_type to PACKET_OTHERHOST
if the vlan id is set and we could find a vlan device for this
particular id.

But we also have a problem if prio bits are set.

Steinar reported an issue on a router receiving IPv6 frames with a
vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device,
because skb->vlan_tci is set.

Forwarded frame is completely corrupted : We can see (8100:4000)
being inserted in the middle of IPv6 source address :

16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 >
9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64
       0x0000:  0000 0029 8000 c7c3 7103 0001 a0ae e651
       0x0010:  0000 0000 ccce 0b00 0000 0000 1011 1213
       0x0020:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
       0x0030:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233

It seems we are not really ready to properly cope with this right now.

We can probably do better in future kernels :
vlan_get_ingress_priority() should be a netdev property instead of
a per vlan_dev one.

For stable kernels, lets clear vlan_tci to fix the bugs.

Reported-by: Steinar H. Gunderson <sesse@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:05:23 -07:00
Paolo Valente 87f40dd6ce pkt_sched: sch_qfq: remove a source of high packet delay/jitter
QFQ+ inherits from QFQ a design choice that may cause a high packet
delay/jitter and a severe short-term unfairness. As QFQ, QFQ+ uses a
special quantity, the system virtual time, to track the service
provided by the ideal system it approximates. When a packet is
dequeued, this quantity must be incremented by the size of the packet,
divided by the sum of the weights of the aggregates waiting to be
served. Tracking this sum correctly is a non-trivial task, because, to
preserve tight service guarantees, the decrement of this sum must be
delayed in a special way [1]: this sum can be decremented only after
that its value would decrease also in the ideal system approximated by
QFQ+. For efficiency, QFQ+ keeps track only of the 'instantaneous'
weight sum, increased and decreased immediately as the weight of an
aggregate changes, and as an aggregate is created or destroyed (which,
in its turn, happens as a consequence of some class being
created/destroyed/changed). However, to avoid the problems caused to
service guarantees by these immediate decreases, QFQ+ increments the
system virtual time using the maximum value allowed for the weight
sum, 2^10, in place of the dynamic, instantaneous value. The
instantaneous value of the weight sum is used only to check whether a
request of weight increase or a class creation can be satisfied.

Unfortunately, the problems caused by this choice are worse than the
temporary degradation of the service guarantees that may occur, when a
class is changed or destroyed, if the instantaneous value of the
weight sum was used to update the system virtual time. In fact, the
fraction of the link bandwidth guaranteed by QFQ+ to each aggregate is
equal to the ratio between the weight of the aggregate and the sum of
the weights of the competing aggregates. The packet delay guaranteed
to the aggregate is instead inversely proportional to the guaranteed
bandwidth. By using the maximum possible value, and not the actual
value of the weight sum, QFQ+ provides each aggregate with the worst
possible service guarantees, and not with service guarantees related
to the actual set of competing aggregates. To see the consequences of
this fact, consider the following simple example.

Suppose that only the following aggregates are backlogged, i.e., that
only the classes in the following aggregates have packets to transmit:
one aggregate with weight 10, say A, and ten aggregates with weight 1,
say B1, B2, ..., B10. In particular, suppose that these aggregates are
always backlogged. Given the weight distribution, the smoothest and
fairest service order would be:
A B1 A B2 A B3 A B4 A B5 A B6 A B7 A B8 A B9 A B10 A B1 A B2 ...

QFQ+ would provide exactly this optimal service if it used the actual
value for the weight sum instead of the maximum possible value, i.e.,
11 instead of 2^10. In contrast, since QFQ+ uses the latter value, it
serves aggregates as follows (easy to prove and to reproduce
experimentally):
A B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 A A A A A A A A A A B1 B2 ... B10 A A ...

By replacing 10 with N in the above example, and by increasing N, one
can increase at will the maximum packet delay and the jitter
experienced by the classes in aggregate A.

This patch addresses this issue by just using the above
'instantaneous' value of the weight sum, instead of the maximum
possible value, when updating the system virtual time.  After the
instantaneous weight sum is decreased, QFQ+ may deviate from the ideal
service for a time interval in the order of the time to serve one
maximum-size packet for each backlogged class. The worst-case extent
of the deviation exhibited by QFQ+ during this time interval [1] is
basically the same as of the deviation described above (but, without
this patch, QFQ+ suffers from such a deviation all the time). Finally,
this patch modifies the comment to the function qfq_slot_insert, to
make it coherent with the fact that the weight sum used by QFQ+ can
now be lower than the maximum possible value.

[1] P. Valente, "Extending WF2Q+ to support a dynamic traffic mix",
Proceedings of AAA-IDEA'05, June 2005.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:02:00 -07:00
Linus Torvalds 3f334c2081 Merge branch 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull phase two of __cpuinit removal from Paul Gortmaker:
 "With the __cpuinit infrastructure removed earlier, this group of
  commits only removes the function/data tagging that was done with the
  various (now no-op) __cpuinit related prefixes.

  Now that the dust has settled with yesterday's v3.11-rc1, there
  hopefully shouldn't be any new users leaking back in tree, but I think
  we can leave the harmless no-op stubs there for a release as a
  courtesy to those who still have out of tree stuff and weren't paying
  attention.

  Although the commits are against the recent tag to allow for minor
  context refreshes for things like yesterday's v3.11-rc1~ slab content,
  the patches have been largely unchanged for weeks, aside from such
  trivial updates.

  For detail junkies, the largely boring and mostly irrelevant history
  of the patches can be viewed at:

    http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git

  If nothing else, I guess it does at least demonstrate the level of
  involvement required to shepherd such a treewide change to completion.

  This is the same repository of patches that has been applied to the
  end of the daily linux-next branches for the past several weeks"

* 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (28 commits)
  block: delete __cpuinit usage from all block files
  drivers: delete __cpuinit usage from all remaining drivers files
  kernel: delete __cpuinit usage from all core kernel files
  rcu: delete __cpuinit usage from all rcu files
  net: delete __cpuinit usage from all net files
  acpi: delete __cpuinit usage from all acpi files
  hwmon: delete __cpuinit usage from all hwmon files
  cpufreq: delete __cpuinit usage from all cpufreq files
  clocksource+irqchip: delete __cpuinit usage from all related files
  x86: delete __cpuinit usage from all x86 files
  score: delete __cpuinit usage from all score files
  xtensa: delete __cpuinit usage from all xtensa files
  openrisc: delete __cpuinit usage from all openrisc files
  m32r: delete __cpuinit usage from all m32r files
  hexagon: delete __cpuinit usage from all hexagon files
  frv: delete __cpuinit usage from all frv files
  cris: delete __cpuinit usage from all cris files
  metag: delete __cpuinit usage from all metag files
  tile: delete __cpuinit usage from all tile files
  sh: delete __cpuinit usage from all sh files
  ...
2013-07-18 10:50:26 -07:00
Linus Torvalds 61f98b0fca Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
 "Just three minor bugfixes"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
  svcrdma: underflow issue in decode_write_list()
  nfsd4: fix minorversion support interface
  lockd: protect nlm_blocked access in nlmsvc_retry_blocked
2013-07-17 13:43:55 -07:00
David S. Miller ae8e9c5a1a net: Fix sysfs_format_mac() code duplication.
It's just a duplicate implementation of "%*phC".  Thanks to Joe
Perches for showing that we had exactly this support in the
lib/vsprintf.c code already.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 17:09:22 -07:00
Eric Dumazet 21d1196a35 ipv4: set transport header earlier
commit 45f00f99d6 ("ipv4: tcp: clean up tcp_v4_early_demux()") added a
performance regression for non GRO traffic, basically disabling
IP early demux.

IPv6 stack resets transport header in ip6_rcv() before calling
IP early demux in ip6_rcv_finish(), while IPv4 does this only in
ip_local_deliver_finish(), _after_ IP early demux.

GRO traffic happened to enable IP early demux because transport header
is also set in inet_gro_receive()

Instead of reverting the faulty commit, we can make IPv4/IPv6 behave the
same : transport_header should be set in ip_rcv() instead of
ip_local_deliver_finish()

ip_local_deliver_finish() can also use skb_network_header_len() which is
faster than ip_hdrlen()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:59:28 -07:00
Dragos Foianu b6a82dd233 net/irda: fixed style issues in irlan_eth
Applied fixes suggested by checkpatch.pl

Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:16:03 -07:00
Dragos Foianu 8a73125c36 ethtool: fixed trailing statements in ethtool
Applied fixes suggested by checkpatch.pl.

Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:14:51 -07:00
Daniel Baluta f2f79cca13 ndisc: bool initializations should use true and false
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:13:31 -07:00
Felix Fietkau 5c9fc93bc9 mac80211/minstrel: fix NULL pointer dereference issue
When priv_sta == NULL, mi->prev_sample is dereferenced too early. Move
the assignment further down, after the rate_control_send_low call.

Reported-by: Krzysztof Mazur <krzysiek@podlesie.net>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-16 17:48:14 +03:00
Johannes Berg c82b5a74cc mac80211: make active monitor injection work w/ HW queue
When a driver (like hwsim) uses HW queue control an
active monitor vif needs to be used for the queues,
make the code do that. Otherwise we'd bail out and
drop the frames.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-07-16 09:58:19 +03:00