OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Pablo Neira Ayuso	c7c32e72cb	netfilter: nf_tables: defer all object release via rcu Now that all objects are released in the reverse order via the transaction infrastructure, we can enqueue the release via call_rcu to save one synchronize_rcu. For small rule-sets loaded via nft -f, it now takes around 50ms less here. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso	128ad3322b	netfilter: nf_tables: remove skb and nlh from context structure Instead of caching the original skbuff that contains the netlink messages, this stores the netlink message sequence number, the netlink portID and the report flag. This helps to prepare the introduction of the object release via call_rcu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso	35151d840c	netfilter: nf_tables: simplify nf_tables_*_notify Now that all these functions are called from the commit path, we can pass the context structure to reduce the number of parameters in all of the nf_tables_*_notify functions. This patch also removes unneeded branches to check for skb, nlh and net that should be always set in the context structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	60319eb1ca	netfilter: nf_tables: use new transaction infrastructure to handle elements Leave the set content in consistent state if we fail to load the batch. Use the new generic transaction infrastructure to achieve this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	55dd6f9307	netfilter: nf_tables: use new transaction infrastructure to handle table This patch speeds up rule-set updates and it also provides a way to revert updates and leave things in consistent state in case that the batch needs to be aborted. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	e1aaca93ee	netfilter: nf_tables: pass context to nf_tables_updtable() So nf_tables_updtable() only takes one single parameter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	f75edf5e9c	netfilter: nf_tables: disabling table hooks always succeeds nf_tables_table_disable() always succeeds, make this function void. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	91c7b38dc9	netfilter: nf_tables: use new transaction infrastructure to handle chain This patch speeds up rule-set updates and it also introduces a way to revert chain updates if the batch is aborted. The idea is to store the changes in the transaction to apply that in the commit step. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	ff3cd7b3c9	netfilter: nf_tables: refactor chain statistic routines Add new routines to encapsulate chain statistics allocation and replacement. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	958bee14d0	netfilter: nf_tables: use new transaction infrastructure to handle sets This patch reworks the nf_tables API so set updates are included in the same batch that contains rule updates. This speeds up rule-set updates since we skip a dialog of four messages between kernel and user-space (two in each direction), from: 1) create the set and send netlink message to the kernel. 2) process the response from the kernel that contains the allocated name. 3) add the set elements and send netlink message to the kernel. 4) process the response from the kernel (to check for errors). To: 1) add the set to the batch. 2) add the set elements to the batch. 3) add the rule that points to the set. 4) send batch to the kernel. This also introduces an internal set ID (NFTA_SET_ID) that is unique in the batch so set elements and rules can refer to new sets. Backward compatibility has been retained only in userspace; this means that new nft versions can talk to the kernel both in the new and the old fashion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	b380e5c733	netfilter: nf_tables: add message type to transactions The patch adds message type to the transaction to simplify the commit and abort routines. Yet another step forward in the generalisation of the transaction infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	37082f930b	netfilter: nf_tables: relocate commit and abort routines in the source file Move the commit and abort routines to the bottom of the source code file. This change is required by the follow up patches that add the set, chain and table transaction support. This patch is just a cleanup to access several functions without having to declare their prototypes. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	1081d11b08	netfilter: nf_tables: generalise transaction infrastructure This patch generalises the existing rule transaction infrastructure so it can be used to handle set, table and chain object transactions as well. The transaction provides a data area that stores private information depending on the transaction type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	7c95f6d866	netfilter: nf_tables: deconstify table and chain in context structure The new transaction infrastructure updates the family, table and chain objects in the context structure, so let's deconstify them. While at it, move the context structure initialization routine to the top of the source file as it will be also used from the table and chain routines. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:09 +02:00
Oliver Hartkopp	45c700291a	can: add hash based access to single EFF frame filters In contrast to the direct access to the single SFF frame filters (which are indexed by the SFF CAN ID itself) the single EFF frame filters are arranged in a single linked hlist. To reduce the hlist traversal in the case of many filter subscriptions a hash based access is introduced for single EFF filters. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-05-19 09:38:24 +02:00
Oliver Hartkopp	e3d3917f3d	can: proc: make array printing function independent from sff frames The can_rcvlist_sff_proc_show_one() function which prints the array of filters for the single SFF CAN identifiers is prepared to be used by a second caller. Therefore it is also renamed to properly describe its future functionality. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-05-19 09:38:24 +02:00
David S. Miller	b6052af61a	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - fix codestyle to respect new checkpatch warnings - increase internal version number	2014-05-18 21:27:09 -04:00
Manuel Schölling	71fd762f2e	net: rds: Use time_after() for time comparison To be future-proof and for better readability the time comparisons are modified to use time_after() instead of raw math. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:24:52 -04:00
stephen hemminger	614d056c8e	ipv4: minor spelling fix Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:10:29 -04:00
stephen hemminger	025559eec8	bridge: fix spelling of promiscuous Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:10:08 -04:00
Ben Hutchings	61d88c6811	ethtool: Disallow ETHTOOL_SRSSH with both indir table and hash key unchanged This would be a no-op, so there is no reason to request it. This also allows conversion of the current implementations of ethtool_ops::{get,set}_rxfh_indir to ethtool_ops::{get,set}_rxfh with no change other than their parameters. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:29:42 +01:00
Ben Hutchings	7455fa2422	ethtool: Name the 'no change' value for setting RSS hash key but not indir table We usually allocate special values of u32 fields starting from the top down, so also change the value to 0xffffffff. As these operations haven't been included in a stable release yet, it's not too late to change. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:18:19 +01:00
Ben Hutchings	fb95cd8d14	ethtool: Return immediately on error in ethtool_copy_validate_indir() We must return -EFAULT immediately rather than continuing into the loop. Similarly, we may as well return -EINVAL directly. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:17:32 +01:00
Alexei Starovoitov	d4f0e0958d	net: bridge: fix build fix build when BRIDGE_VLAN_FILTERING is not set Fixes: `2796d0c648` ("bridge: Automatically manage port promiscuous mode") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 20:09:50 -04:00
Simon Wunderlich	871d3d9fdf	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-18 15:04:00 +02:00
Antonio Quartulli	2b64df2058	batman-adv: remove semi-colon after macro definition Reported by checkpatch with the following warning: "WARNING: macros should not use a trailing semicolon" Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-18 15:04:00 +02:00
Antonio Quartulli	f138694b15	batman-adv: add blank line between declarations and the rest of the code Reported by checkpatch with the following message: "WARNING: Missing a blank line after declarations" Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-18 15:03:52 +02:00
Vlad Yasevich	44a4085538	bonding: Fix stacked device detection in arp monitoring Prior to commit `fbd929f2dc` bonding: support QinQ for bond arp interval the arp monitoring code allowed for proper detection of devices stacked on top of vlans. Since the above commit, the code can still detect a device stacked on top of a single vlan, but not a device stacked on top of a Q-in-Q configuration. The search will only set the inner vlan tag if the route device is the vlan device. However, this is not always the case, as it is possible to extend the stacked configuration. With this patch it is possible to provision devices on top of a Q-in-Q vlan configuration that should be used as a source of ARP monitoring information. For example: ip link add link bond0 vlan10 type vlan proto 802.1q id 10 ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 ip link add link vlan100 type macvlan Note: This patch limits the number of stacked VLANs to 2, just like before. The original, however, had another issue in that if we had more than 2 levels of VLANs, we would end up generating incorrectly tagged traffic. This is no longer possible. Fixes: `fbd929f2dc` (bonding: support QinQ for bond arp interval) CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Patric McHardy <kaber@trash.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:29:05 -04:00
Vlad Yasevich	d38569ab2b	vlan: Fix lockdep warning with stacked vlan devices. This reverts commit `dc8eaaa006`. vlan: Fix lockdep warning when vlan dev handle notification Instead we use the new API to find the lock subclass of our vlan device. This way we can support configurations where vlans are interspersed with other devices: bond -> vlan -> macvlan -> vlan Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:14:49 -04:00
Vlad Yasevich	4085ebe8c3	net: Find the nesting level of a given device by type. Multiple devices in the kernel can be stacked/nested and they need to know their nesting level for the purposes of lockdep. This patch provides a generic function that determines a nesting level of a particular device by its type (ex: vlan, macvlan, etc). We only care about nesting of the same type of devices. For example: eth0 <- vlan0.10 <- macvlan0 <- vlan1.20 The nesting level of vlan1.20 would be 1, since there is another vlan in the stack under it. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:14:49 -04:00
Thomas Graf	97dc48e220	pktgen: Use seq_puts() where seq_printf() is not needed Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:30:30 -04:00
Eric Dumazet	29e9824278	net: gro: make sure skb->cb[] initial content has not to be zero Starting from linux-3.13, GRO attempts to build full size skbs. Problem is the commit assumed one particular field in skb->cb[] was clean, but it is not the case on some stacked devices. Timo reported a crash in case traffic is decrypted before reaching a GRE device. Fix this by initializing NAPI_GRO_CB(skb)->last at the right place, this also removes one conditional. Thanks a lot to Timo for providing full reports and bisecting this. Fixes: `8a29111c7c` ("net: gro: allow to build full sized skb") Bisected-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:24:54 -04:00
Phoebe Buckheister	f0f77dc6be	ieee802154, mac802154: implement devkey record option The 802.15.4-2011 standard states that for each key, a list of devices that use this key shall be kept. Previous patches have only considered two options: * a device "uses" (or may use) all keys, rendering the list useless * a device is restricted to a certain set of keys Another option would be that a device may use all keys, but need not do so, and we are interested in the actual set of keys the device uses. Recording keys used by any given device may have a noticeable performance impact and might not be needed as often. The common case, in which a device will not switch keys too often, should still perform well. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:42 -04:00
Phoebe Buckheister	3e9c156e2c	ieee802154: add netlink interfaces for llsec This patch adds user-visible interfaces for the llsec infrastructure. For the added methods, the only major difference between all add/remove implementation lies in how the specific object is parsed, and for dump requests, how objects are written into netlink messages. To save on boilerplate code, table dumps are routed through a helper function that handles netlink dump state, leaving the actual dumping code to care only about iterating over the table to be dumped and filling netlink messages. For add/remove methods, the boilerplate required to work is not quite as large, but still enough to also move into a local helper. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	9b0bb4a83f	mac802154: propagate device address changes to llsec Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	29e023746a	mac802154: add llsec configuration functions Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	af9eed5bbf	ieee802154: add dgram sockopts for security control Allow datagram sockets to override the security settings of the device they send from on a per-socket basis. Requires CAP_NET_ADMIN or CAP_NET_RAW, since raw sockets can send arbitrary packets anyway. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	f30be4d53c	mac802154: integrate llsec with wpan devices Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	4c14a2fb5d	mac802154: add llsec decryption method Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:41 -04:00
Phoebe Buckheister	03556e4d0d	mac802154: add llsec encryption method Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:40 -04:00
Phoebe Buckheister	5d637d5aab	mac802154: add llsec structures and mutators This patch adds containers and mutators for the major ieee802154_llsec structures to mac802154. Most of the (rather simple) ieee802154_llsec structs are wrapped only to provide an rcu_head for orderly disposal, but some structs - llsec keys notably - require more complex bookkeeping. Since each llsec key may be referenced by a number of llsec key table entries (with differing key ids, but the same actual key), we want to save memory and not allocate crypto transforms for each entry in the table. Thus, the mac802154 llsec key is reference-counted instead. Further, each key will have four associated crypto transforms - three CCM transforms for the authsizes 4/8/16 and one CTR transform for unauthenticated encryption. If we had a CCM* transform that allowed authsize 0, and authsize as part of requests instead of transforms, this would not be necessary. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:40 -04:00
Phoebe Buckheister	87de726c9b	mac802154: update Kconfig Link-layer security requires AES CCM for authenticated modes and AES CTR for the unauthenticated encryption mode. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:40 -04:00
David S. Miller	e54740e6d7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A set of OVS changes for net-next/3.16. The major change here is a switch from per-CPU to per-NUMA flow statistics. This improves scalability by reducing kernel overhead in flow setup and maintenance. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:21:51 -04:00
Vlad Yasevich	2796d0c648	bridge: Automatically manage port promiscuous mode. There exist configurations where the administrator or another management entity has the foreknowledge of all the mac addresses of end systems that are being bridged together. In these environments, the administrator can statically configure known addresses in the bridge FDB and disable flooding and learning on ports. This makes it possible to turn off promiscuous mode on the interfaces connected to the bridge. Here is why disabling flooding and learning allows us to control promiscuity: Consider port X. All traffic coming into this port from outside the bridge (ingress) will be either forwarded through other ports of the bridge (egress) or dropped. Forwarding (egress) is defined by FDB entries and by flooding in the event that no FDB entry exists. In the event that flooding is disabled, only FDB entries define the egress. Once learning is disabled, only static FDB entries provided by a management entity define the egress. If we provide information from these static FDBs to the ingress port X, then we'll be able to accept all traffic that can be successfully forwarded and drop all the other traffic sooner without spending CPU cycles to process it. Another way to define the above is as the following equations: ingress = egress + drop; expanding egress: ingress = static FDB + learned FDB + flooding + drop; disabling flooding and learning, we are left with: ingress = static FDB + drop. By adding addresses from the static FDB entries to the MAC address filter of an ingress port X, we fully define what the bridge can process without dropping and can thus turn off promiscuous mode, thus dropping packets sooner. There have been suggestions that we may want to allow learning and update the filters with learned addresses as well. This would require mac-level authentication similar to 802.1x to prevent attacks against the hw filters as they are a limited resource. Additionally, if the user places the bridge device in promiscuous mode, all ports are placed in promiscuous mode regardless of the changes to flooding and learning. Since the above functionality depends on full static configuration, we also require that vlan filtering be enabled to take advantage of this. The reason is that the bridge has to be able to receive and process VLAN-tagged frames and there are only 2 ways to accomplish this right now: promiscuous mode or vlan filtering. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:33 -04:00
Vlad Yasevich	145beee8d6	bridge: Add addresses from static fdbs to non-promisc ports When a static fdb entry is created, add the mac address from this fdb entry to any ports that are currently running in non-promiscuous mode. These ports need this data so that they can receive traffic destined to these addresses. By default ports start in promiscuous mode, so this feature is disabled. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:33 -04:00
Vlad Yasevich	f3a6ddf152	bridge: Introduce BR_PROMISC flag Introduce a BR_PROMISC per-port flag that will help us track if the current port is supposed to be in promiscuous mode or not. For now, always start in promiscuous mode. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:33 -04:00
Vlad Yasevich	8db24af71b	bridge: Add functionality to sync static fdb entries to hw Add code that allows static fdb entries to be synced to the hw list for a specified port. This will be used later to program ports that can function in non-promiscuous mode. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:33 -04:00
Vlad Yasevich	e028e4b8dc	bridge: Keep track of ports capable of automatic discovery. By default, ports on the bridge are capable of automatic discovery of nodes located behind the port. This is accomplished via flooding of unknown traffic (BR_FLOOD) and learning the mac addresses from these packets (BR_LEARNING). If the above functionality is disabled by turning off these flags, the port requires static configuration in the form of static FDB entries to function properly. This patch adds functionality to keep track of all ports capable of automatic discovery. This will later be used to control promiscuity settings. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:33 -04:00
Vlad Yasevich	63c3a622dd	bridge: Turn flag change macro into a function. Turn the flag change macro into a function to allow easier updates and to reduce space. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:06:32 -04:00
Timo Teräs	22fb22eaeb	ipv4: ip_tunnels: disable cache for nbma gre tunnels The connected check fails to check for ip_gre nbma mode tunnels properly. ip_gre creates temporary tnl_params with daddr specified to pass-in the actual target on per-packet basis from neighbor layer. Detect these tunnels by inspecting the actual tunnel configuration. Minimal test case: ip route add 192.168.1.1/32 via 10.0.0.1 ip route add 192.168.1.2/32 via 10.0.0.2 ip tunnel add nbma0 mode gre key 1 tos c0 ip addr add 172.17.0.0/16 dev nbma0 ip link set nbma0 up ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0 ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0 ping 172.17.0.1 ping 172.17.0.2 The second ping should be going to 192.168.1.2 and head 10.0.0.2; but cached gre tunnel level route is used and it's actually going to 192.168.1.1 via 10.0.0.1. The lladdr's need to go to separate dst for the bug to trigger. Test case uses separate route entries, but this can also happen when the route entry is same: if there is a nexthop exception or the GRE tunnel is IPsec'ed in which case the dst points to xfrm bundle unique to the gre lladdr. Fixes: `7d442fab0a` ("ipv4: Cache dst in tunnels") Signed-off-by: Timo Teräs <timo.teras@iki.fi> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:58:41 -04:00
Duan Jiong	ee30ef4d45	ip_tunnel: don't add tunnel twice When using command "ip tunnel add" to add a tunnel, the tunnel will be added twice, through ip_tunnel_create() and ip_tunnel_update(). Because the second is unnecessary, we can just break after adding the tunnel through ip_tunnel_create(). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:57:44 -04:00
Fabian Godehardt	d1c0b471b3	net/dsa/dsa.c: increment chip_index during of_node handling on dsa_of_probe() Adding more than one chip on device-tree currently causes the probing routine to always use the first chip's data pointer. Signed-off-by: Fabian Godehardt <fg@emlix.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:56:33 -04:00
Lorenzo Colitti	2e47b29195	net: ipv6: make "ip -6 route get mark xyz" work. Currently, "ip -6 route get mark xyz" ignores the mark passed in by userspace. Make it honour the mark, just like IPv4 does. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:50:30 -04:00
Monam Agarwal	944df8ae84	net/openvswitch: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Jarno Rajahalme	88d73f6c41	openvswitch: Use TCP flags in the flow key for stats. We already extract the TCP flags for the key, might as well use that for stats. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Jarno Rajahalme	d92ab13558	openvswitch: Fix output of SCTP mask. The 'output' argument of the ovs_nla_put_flow() is the one from which the bits are written to the netlink attributes. For SCTP we accidentally used the bits from the 'swkey' instead. This caused the mask attributes to include the bits from the actual flow key instead of the mask. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Jarno Rajahalme	63e7959c4b	openvswitch: Per NUMA node flow stats. Keep kernel flow stats for each NUMA node rather than each (logical) CPU. This avoids using the per-CPU allocator and removes most of the kernel-side OVS locking overhead otherwise on the top of perf reports and allows OVS to scale better with higher number of threads. With 9 handlers and 4 revalidators netperf TCP_CRR test flow setup rate doubles on a server with two hyper-threaded physical CPUs (16 logical cores each) compared to the current OVS master. Tested with non-trivial flow table with a TCP port match rule forcing all new connections with unique port numbers to OVS userspace. The IP addresses are still wildcarded, so the kernel flows are not considered as exact match 5-tuple flows. This type of flows can be expected to appear in large numbers as the result of more effective wildcarding made possible by improvements in OVS userspace flow classifier. Perf results for this test (master): Events: 305K cycles + 8.43% ovs-vswitchd [kernel.kallsyms] [k] mutex_spin_on_owner + 5.64% ovs-vswitchd [kernel.kallsyms] [k] __ticket_spin_lock + 4.75% ovs-vswitchd ovs-vswitchd [.] find_match_wc + 3.32% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_lock + 2.61% ovs-vswitchd [kernel.kallsyms] [k] pcpu_alloc_area + 2.19% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask_range + 2.03% swapper [kernel.kallsyms] [k] intel_idle + 1.84% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_unlock + 1.64% ovs-vswitchd ovs-vswitchd [.] classifier_lookup + 1.58% ovs-vswitchd libc-2.15.so [.] 0x7f4e6 + 1.07% ovs-vswitchd [kernel.kallsyms] [k] memset + 1.03% netperf [kernel.kallsyms] [k] __ticket_spin_lock + 0.92% swapper [kernel.kallsyms] [k] __ticket_spin_lock ... And after this patch: Events: 356K cycles + 6.85% ovs-vswitchd ovs-vswitchd [.] find_match_wc + 4.63% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_lock + 3.06% ovs-vswitchd [kernel.kallsyms] [k] __ticket_spin_lock + 2.81% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask_range + 2.51% ovs-vswitchd libpthread-2.15.so [.] pthread_mutex_unlock + 2.27% ovs-vswitchd ovs-vswitchd [.] classifier_lookup + 1.84% ovs-vswitchd libc-2.15.so [.] 0x15d30f + 1.74% ovs-vswitchd [kernel.kallsyms] [k] mutex_spin_on_owner + 1.47% swapper [kernel.kallsyms] [k] intel_idle + 1.34% ovs-vswitchd ovs-vswitchd [.] flow_hash_in_minimask + 1.33% ovs-vswitchd ovs-vswitchd [.] rule_actions_unref + 1.16% ovs-vswitchd ovs-vswitchd [.] hindex_node_with_hash + 1.16% ovs-vswitchd ovs-vswitchd [.] do_xlate_actions + 1.09% ovs-vswitchd ovs-vswitchd [.] ofproto_rule_ref + 1.01% netperf [kernel.kallsyms] [k] __ticket_spin_lock ... There is a small increase in kernel spinlock overhead due to the same spinlock being shared between multiple cores of the same physical CPU, but that is barely visible in the netperf TCP_CRR test performance (maybe ~1% performance drop, hard to tell exactly due to variance in the test results), when testing for kernel module throughput (with no userspace activity, handful of kernel flows). On flow setup, a single stats instance is allocated (for the NUMA node 0). As CPUs from multiple NUMA nodes start updating stats, new NUMA-node specific stats instances are allocated. This allocation on the packet processing code path is made to never block or look for emergency memory pools, minimizing the allocation latency. If the allocation fails, the existing preallocated stats instance is used. Also, if only CPUs from one NUMA-node are updating the preallocated stats instance, no additional stats instances are allocated. This eliminates the need to pre-allocate stats instances that will not be used, also relieving the stats reader from the burden of reading stats that are never used. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Jarno Rajahalme	23dabf88ab	openvswitch: Remove 5-tuple optimization. The 5-tuple optimization becomes unnecessary with a later per-NUMA node stats patch. Remove it first to make the changes easier to grasp. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Joe Perches	8c63ff09bd	openvswitch: Use ether_addr_copy It's slightly smaller/faster for some architectures. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:29 -07:00
Joe Perches	2235ad1c3a	openvswitch: flow_netlink: Use pr_fmt to OVS_NLERR output Add "openvswitch: " prefix to OVS_NLERR output to match the other OVS_NLERR output of datapath.c Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
Joe Perches	1815a8831f	openvswitch: Use net_ratelimit in OVS_NLERR Each use of pr_<level>_once has a per-site flag. Some of the OVS_NLERR messages look as if seeing them multiple times could be useful, so use net_ratelimit() instead of pr_info_once. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
Daniele Di Proietto	cc23ebf3bb	openvswitch: Added (unsigned long long) cast in printf This is necessary, since u64 is not unsigned long long in all architectures: u64 could be also uint64_t. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
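The cast described in the commit above can be shown with a small userspace sketch, using `uint64_t` as a stand-in for the kernel's `u64`. The helper name `format_u64` is hypothetical. Since a 64-bit unsigned type may be `unsigned long` on some platforms and `unsigned long long` on others, casting the argument makes it match `%llu` everywhere.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably: the explicit cast guarantees the
 * argument type matches the %llu conversion on every architecture. */
int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%llu", (unsigned long long)value);
}
```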
Daniele Di Proietto	07dc0602c5	openvswitch: avoid cast-qual warning in vport_priv This function must cast a const value to a non const value. By adding an uintptr_t cast the warning is suppressed. To avoid the cast (proper solution) several function signatures must be changed. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
Daniele Di Proietto	d0b4da1375	openvswitch: avoid warnings in vport_from_priv First, this change avoids declaring the formal parameter const, since it is treated as non-const (to avoid -Wcast-qual). Second, it casts the pointer from void* to u8*, since it is used in arithmetic (to avoid -Wpointer-arith). Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
Daniele Di Proietto	7085130bab	openvswitch: use const in some local vars and casts In a few functions, const formal parameters are assigned or cast to non-const. These changes suppress warnings if compiled with -Wcast-qual. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2014-05-16 13:40:28 -07:00
David S. Miller	2f67cc87d6	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Include changes: - fix NULL dereference in batadv_orig_hardif_seq_print_text() - fix reference counting imbalance when using fragmentation - avoid access to orig_node objects after they have been free'd - fix local TT check for outgoing arp requests in DAT	2014-05-16 16:28:53 -04:00
Kirill Tkhai	31ff6aa5c8	net: unix: Align send data_len up to PAGE_SIZE Using the whole of the allocated pages reduces the requested skb->data size. This makes the allocation slightly more thrifty. netperf shows no difference from the current performance. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:04:03 -04:00
David S. Miller	202630b445	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-15 Please pull this batch of fixes for the 3.15 stream... For the mac80211 bits, Johannes says: "One fix is to get better VHT performance and the other fixes tracing garbage or other potential issues with the interface name tracing." And... "This has a fix from Emmanuel for a problem I failed to fix - when association is in progress then it needs to be cancelled while suspending (I had fixed the same for authentication). Also included a fix from myself for a userspace API problem that hit the iw tool and a fix to the remain-on-channel framework." For the iwlwifi bits, Emmanuel says: "Alex fixes the scan by disabling the fragmented scan. David prevents scan offload while associated, the firmware seems not to like it. I fix a stupid bug I made in BT Coex, and fix a bad #ifdef clause in rate scaling. Along with that there is a fix for a NULL pointer exception that can happen if we load the driver and our ISR gets called because the interrupt line is shared. The fix has been tested by the reporter." And... "We have here a fix from David Spinadel that makes a previous fix more complete, and an off-by-one issue fixed by Eliad in the same area. I fix the monitor that broke on the way." Beyond that... Daniel Kim's one-liner fixes a brcmfmac regression caused by a typo in an earlier commit.. Rajkumar Manoharan fixes an ath9k oops reported by David Herrmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 15:45:56 -04:00
Nathaniel W Filardo	fde0133b9c	af_rxrpc: Fix XDR length check in rxrpc key demarshalling. There may be padding on the ticket contained in the key payload, so just ensure that the claimed token length is large enough, rather than exactly the right size. Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 15:24:47 -04:00
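The change from an exact-size check to a lower-bound check in the commit above can be sketched as follows. This is an illustration only, not the af_rxrpc code: the function names and the `header_len`/`ticket_len` split are hypothetical. It relies on the fact that XDR pads opaque data to a multiple of 4 bytes, so a token carrying a padded ticket is longer than the unpadded sum.

```c
#include <stdbool.h>
#include <stddef.h>

/* XDR pads opaque data up to the next multiple of 4 bytes. */
size_t xdr_padded_len(size_t len)
{
    return (len + 3) & ~(size_t)3;
}

/* Strict check: rejects any token that carries padding. */
bool token_len_exact(size_t claimed, size_t header_len, size_t ticket_len)
{
    return claimed == header_len + ticket_len;
}

/* Lenient check: the claimed length only has to be large enough. */
bool token_len_sufficient(size_t claimed, size_t header_len, size_t ticket_len)
{
    return claimed >= header_len + ticket_len;
}
```

With a 5-byte ticket behind a 32-byte header, a padded token claims 40 bytes: the strict check rejects it, while the lenient check accepts it and still rejects a token that is too short.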
Ilya Dryomov	f140662f35	crush: decode and initialize chooseleaf_vary_r Commit `e2b149cc4b` ("crush: add chooseleaf_vary_r tunable") added the crush_map::chooseleaf_vary_r field but missed the decode part. This led to misdirected requests caused by incorrect raw crush mapping sets. Fixes: http://tracker.ceph.com/issues/8226 Reported-and-Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-05-16 21:29:55 +04:00
Chunwei Chen	178eda29ca	libceph: fix corruption when using page_count 0 page in rbd It has been reported that using ZFSonLinux on rbd will result in memory corruption. The bug report can be found here: https://github.com/zfsonlinux/spl/issues/241 http://tracker.ceph.com/issues/7790 The reason is that ZFS will send pages with page_count 0 into rbd, which in turn sends them to tcp_sendpage. However, tcp_sendpage cannot deal with page_count 0, as it will do get_page and put_page, and erroneously free the page. This type of issue has been noted before, and handled in iscsi, drbd, etc. So, rbd should also handle this. This fix addresses the issue by falling back to the slower sendmsg when a page with page_count 0 is detected. Cc: Sage Weil <sage@inktank.com> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: stable@vger.kernel.org Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>	2014-05-16 21:29:26 +04:00
Andrzej Kaczmarek	f4e2dd53d5	Bluetooth: Add missing msecs to jiffies conversion conn_info_age value is calculated in ms, so need to be converted to jiffies. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-16 08:23:01 -07:00
Andrzej Kaczmarek	eed5daf318	Bluetooth: Add support for max_tx_power in Get Conn Info This patch adds support for max_tx_power in Get Connection Information request. Value is read only once for given connection and then always returned in response as parameter. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-15 21:48:07 -07:00
Andrzej Kaczmarek	d0455ed996	Bluetooth: Store max TX power level for connection This patch adds support to store local maximum TX power level for connection when reply for HCI_Read_Transmit_Power_Level is received. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-15 21:48:07 -07:00
Andrzej Kaczmarek	f7faab0c9d	Bluetooth: Avoid polling TX power for LE links TX power for LE links is immutable, so we do not need to query for it if we already have a value. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-15 21:48:06 -07:00
Andrzej Kaczmarek	dd9838087b	Bluetooth: Add support to get connection information This patch adds support for Get Connection Information mgmt command which can be used to query for information about connection, i.e. RSSI and local TX power level. In general values cached in hci_conn are returned as long as they are considered valid, i.e. do not exceed age limit set in hdev. This limit is calculated as random value between min/max values to avoid client trying to guess when to poll for updated information. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-15 21:48:06 -07:00
Andrzej Kaczmarek	31ad169148	Bluetooth: Add conn info lifetime parameters to debugfs This patch adds conn_info_min_age and conn_info_max_age parameters to debugfs which determine lifetime of connection information. Actual lifetime will be random value between min and max age. Default values for min and max age are 1000ms and 3000ms respectively. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-15 21:48:05 -07:00
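The randomized lifetime described in the two commits above (a value between conn_info_min_age and conn_info_max_age, defaulting to 1000 ms and 3000 ms) might be computed along these lines. This is a sketch under assumptions, not the Bluetooth code: the function name is hypothetical, and the random input `rnd` is passed in by the caller so that the logic stays deterministic and testable.

```c
#include <stdint.h>

/* Pick a lifetime in [min_ms, max_ms] from a caller-supplied random
 * value. If the range is empty or inverted, min_ms is returned. */
uint32_t pick_conn_info_age_ms(uint32_t min_ms, uint32_t max_ms, uint32_t rnd)
{
    uint32_t span;

    if (max_ms <= min_ms)
        return min_ms;

    span = max_ms - min_ms;
    if (span == UINT32_MAX)   /* full 32-bit range: span + 1 would wrap */
        return rnd;

    return min_ms + rnd % (span + 1);
}
```

A cached value would then be treated as valid only while its age stays below the picked lifetime, so a client cannot predict exactly when a fresh reading will be taken.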
Duan Jiong	be7a010d6f	ipv6: update Destination Cache entries when gateway turn into host RFC 4861 states in 7.2.5: The IsRouter flag in the cache entry MUST be set based on the Router flag in the received advertisement. In those cases where the IsRouter flag changes from TRUE to FALSE as a result of this update, the node MUST remove that router from the Default Router List and update the Destination Cache entries for all destinations using that neighbor as a router as specified in Section 7.3.3. This is needed to detect when a node that is used as a router stops forwarding packets due to being configured as a host. Currently, when handling an NA message whose IsRouter flag changes from TRUE to FALSE, the kernel only removes the router from the Default Router List and does not update the Destination Cache entries. To update those Destination Cache entries, introduce the function rt6_clean_tohost(). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 23:26:27 -04:00
David S. Miller	f895f0cfbb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Conflicts: net/ipv4/ip_vti.c Steffen Klassert says: ==================== pull request (net): ipsec 2014-05-15 This pull request has a merge conflict in net/ipv4/ip_vti.c between commit `8d89dcdf80` ("vti: don't allow to add the same tunnel twice") and commit `a32452366b` ("vti4:Don't count header length twice"). It can be solved like it is done in linux-next. 1) Fix a ipv6 xfrm output crash when a packet is rerouted by netfilter to not use IPsec. 2) vti4 counts some header lengths twice leading to an incorrect device mtu. Fix this by counting these headers only once. 3) We don't catch the case if an unsupported protocol is submitted to the xfrm protocol handlers, this can lead to NULL pointer dereferences. Fix this by adding the appropriate checks. 4) vti6 may unregister pernet ops twice on init errors. Fix this by removing one of the calls to do it only once. From Mathias Krause. 5) Set the vti tunnel mark before doing a lookup in the error handlers. Otherwise we don't find the correct xfrm state. ==================== The conflict in ip_vti.c was simple, 'net' had a commit removing a line from vti_tunnel_init() and this tree being merged had a commit adding a line to the same location. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 23:23:48 -04:00
Julia Lawall	112a3513b5	vti6: delete unneeded call to netdev_priv Netdev_priv is an accessor function, and has no purpose if its result is not used. A simplified version of the semantic match that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ local idexpression x; @@ -x = netdev_priv(...); ... when != x // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 16:57:47 -04:00
Julia Lawall	4929fd8cb0	ip_tunnel: delete unneeded call to netdev_priv Netdev_priv is an accessor function, and has no purpose if its result is not used. A simplified version of the semantic match that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ local idexpression x; @@ -x = netdev_priv(...); ... when != x // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 16:57:47 -04:00
Alexei Starovoitov	622582786c	net: filter: x86: internal BPF JIT Maps all internal BPF instructions into x86_64 instructions. This patch replaces original BPF x64 JIT with internal BPF x64 JIT. sysctl net.core.bpf_jit_enable is reused as on/off switch. Performance: 1. old BPF JIT and internal BPF JIT generate equivalent x86_64 code. No performance difference is observed for filters that were JIT-able before Example assembler code for BPF filter "tcpdump port 22" original BPF -> old JIT: original BPF -> internal BPF -> new JIT: 0: push %rbp 0: push %rbp 1: mov %rsp,%rbp 1: mov %rsp,%rbp 4: sub $0x60,%rsp 4: sub $0x228,%rsp 8: mov %rbx,-0x8(%rbp) b: mov %rbx,-0x228(%rbp) // prologue 12: mov %r13,-0x220(%rbp) 19: mov %r14,-0x218(%rbp) 20: mov %r15,-0x210(%rbp) 27: xor %eax,%eax // clear A c: xor %ebx,%ebx 29: xor %r13,%r13 // clear X e: mov 0x68(%rdi),%r9d 2c: mov 0x68(%rdi),%r9d 12: sub 0x6c(%rdi),%r9d 30: sub 0x6c(%rdi),%r9d 16: mov 0xd8(%rdi),%r8 34: mov 0xd8(%rdi),%r10 3b: mov %rdi,%rbx 1d: mov $0xc,%esi 3e: mov $0xc,%esi 22: callq 0xffffffffe1021e15 43: callq 0xffffffffe102bd75 27: cmp $0x86dd,%eax 48: cmp $0x86dd,%rax 2c: jne 0x0000000000000069 4f: jne 0x000000000000009a 2e: mov $0x14,%esi 51: mov $0x14,%esi 33: callq 0xffffffffe1021e31 56: callq 0xffffffffe102bd91 38: cmp $0x84,%eax 5b: cmp $0x84,%rax 3d: je 0x0000000000000049 62: je 0x0000000000000074 3f: cmp $0x6,%eax 64: cmp $0x6,%rax 42: je 0x0000000000000049 68: je 0x0000000000000074 44: cmp $0x11,%eax 6a: cmp $0x11,%rax 47: jne 0x00000000000000c6 6e: jne 0x0000000000000117 49: mov $0x36,%esi 74: mov $0x36,%esi 4e: callq 0xffffffffe1021e15 79: callq 0xffffffffe102bd75 53: cmp $0x16,%eax 7e: cmp $0x16,%rax 56: je 0x00000000000000bf 82: je 0x0000000000000110 58: mov $0x38,%esi 88: mov $0x38,%esi 5d: callq 0xffffffffe1021e15 8d: callq 0xffffffffe102bd75 62: cmp $0x16,%eax 92: cmp $0x16,%rax 65: je 0x00000000000000bf 96: je 0x0000000000000110 67: jmp 0x00000000000000c6 98: jmp 0x0000000000000117 69: cmp 
$0x800,%eax 9a: cmp $0x800,%rax 6e: jne 0x00000000000000c6 a1: jne 0x0000000000000117 70: mov $0x17,%esi a3: mov $0x17,%esi 75: callq 0xffffffffe1021e31 a8: callq 0xffffffffe102bd91 7a: cmp $0x84,%eax ad: cmp $0x84,%rax 7f: je 0x000000000000008b b4: je 0x00000000000000c2 81: cmp $0x6,%eax b6: cmp $0x6,%rax 84: je 0x000000000000008b ba: je 0x00000000000000c2 86: cmp $0x11,%eax bc: cmp $0x11,%rax 89: jne 0x00000000000000c6 c0: jne 0x0000000000000117 8b: mov $0x14,%esi c2: mov $0x14,%esi 90: callq 0xffffffffe1021e15 c7: callq 0xffffffffe102bd75 95: test $0x1fff,%ax cc: test $0x1fff,%rax 99: jne 0x00000000000000c6 d3: jne 0x0000000000000117 d5: mov %rax,%r14 9b: mov $0xe,%esi d8: mov $0xe,%esi a0: callq 0xffffffffe1021e44 dd: callq 0xffffffffe102bd91 // MSH e2: and $0xf,%eax e5: shl $0x2,%eax e8: mov %rax,%r13 eb: mov %r14,%rax ee: mov %r13,%rsi a5: lea 0xe(%rbx),%esi f1: add $0xe,%esi a8: callq 0xffffffffe1021e0d f4: callq 0xffffffffe102bd6d ad: cmp $0x16,%eax f9: cmp $0x16,%rax b0: je 0x00000000000000bf fd: je 0x0000000000000110 ff: mov %r13,%rsi b2: lea 0x10(%rbx),%esi 102: add $0x10,%esi b5: callq 0xffffffffe1021e0d 105: callq 0xffffffffe102bd6d ba: cmp $0x16,%eax 10a: cmp $0x16,%rax bd: jne 0x00000000000000c6 10e: jne 0x0000000000000117 bf: mov $0xffff,%eax 110: mov $0xffff,%eax c4: jmp 0x00000000000000c8 115: jmp 0x000000000000011c c6: xor %eax,%eax 117: mov $0x0,%eax c8: mov -0x8(%rbp),%rbx 11c: mov -0x228(%rbp),%rbx // epilogue cc: leaveq 123: mov -0x220(%rbp),%r13 cd: retq 12a: mov -0x218(%rbp),%r14 131: mov -0x210(%rbp),%r15 138: leaveq 139: retq On fully cached SKBs both JITed functions take 12 nsec to execute. BPF interpreter executes the program in 30 nsec. The difference in generated assembler is due to the following: Old BPF imlements LDX_MSH instruction via sk_load_byte_msh() helper function inside bpf_jit.S. New JIT removes the helper and does it explicitly, so ldx_msh cost is the same for both JITs, but generated code looks longer. 
New JIT has 4 registers to save, so prologue/epilogue are larger, but the cost is within noise on x64. Old JIT checks whether first insn clears A and if not emits 'xor %eax,%eax'. New JIT clears %rax unconditionally. 2. old BPF JIT doesn't support ANC_NLATTR, ANC_PAY_OFFSET, ANC_RANDOM extensions. New JIT supports all BPF extensions. Performance of such filters improves 2-4 times depending on a filter. The longer the filter the higher performance gain. Synthetic benchmarks with many ancillary loads see 20x speedup which seems to be the maximum gain from JIT Notes: . net.core.bpf_jit_enable=2 + tools/net/bpf_jit_disasm is still functional and can be used to see generated assembler . there are two jit_compile() functions and code flow for classic filters is: sk_attach_filter() - load classic BPF bpf_jit_compile() - try to JIT from classic BPF sk_convert_filter() - convert classic to internal bpf_int_jit_compile() - JIT from internal BPF seccomp and tracing filters will just call bpf_int_jit_compile() Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 16:31:30 -04:00
Phoebe Buckheister	6ef0023a2e	mac802154: make mac802154_wpan_open static This function is only used within the same translation unit, so mark it static. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:43 -04:00
Phoebe Buckheister	1cc76e3654	ieee802154: fix dgram socket sendmsg() 802.15.4 datagram sockets do not currently have a compliant sendmsg(). The destination address supplied is always ignored, and in unconnected mode, packets are broadcast instead of dropped with -EDESTADDRREQ. This patch fixes 802.15.4 dgram sockets to be compliant, i.e. !conn && !msg_name => -EDESTADDRREQ !conn && msg_name => send to msg_name conn && !msg_name => send to connected conn && msg_name => -EISCONN Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:43 -04:00
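The four-case rule listed in the sendmsg() commit above maps directly onto a small decision function. The sketch below is a userspace illustration, not the ieee802154 socket code: `struct addr` and `dgram_pick_dest` are hypothetical names, and only the destination selection is modeled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address type used only for this illustration. */
struct addr {
    unsigned short pan_id;
    unsigned short short_addr;
};

/* Choose the destination for a datagram send.
 * Returns 0 and sets *dst on success, or a negative errno value. */
int dgram_pick_dest(bool connected, const struct addr *connected_to,
                    const struct addr *msg_name, const struct addr **dst)
{
    if (!connected) {
        if (!msg_name)
            return -EDESTADDRREQ;   /* !conn && !msg_name */
        *dst = msg_name;            /* !conn && msg_name */
        return 0;
    }
    if (msg_name)
        return -EISCONN;            /* conn && msg_name */
    *dst = connected_to;            /* conn && !msg_name */
    return 0;
}
```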
Phoebe Buckheister	d4b2816d67	6lowpan: fix fragmentation Currently, 6lowpan creates one 802.15.4 MAC header for the original packet the device was given by upper layers and reuses this header for all fragments, if fragmentation is required. This also reuses frame sequence numbers, which must not happen. 6lowpan also has issues with fragmentation in the presence of security headers, since those may imply the presence of trailing fields that are not accounted for by the fragmentation code right now. Fix both of these issues by properly allocating fragment skbs with headroom and tailroom as specified by the underlying device, creating one header for each skb instead of reusing the original header, and letting the underlying device do the rest. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:43 -04:00
Phoebe Buckheister	32edc40ae6	ieee802154: change _cb handling slightly The current mac_cb handling of ieee802154 is rather awkward and limited. Decompose the single flags field into multiple fields with the meanings of each subfield of the flags field to make future extensions (for example, link-layer security) easier. Also don't set the frame sequence number in upper layers, since that's a thing the MAC is supposed to set on frame transmit - we set it on header creation, but assuming that upper layers do not blindly duplicate our headers, this is fine. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:42 -04:00
Phoebe Buckheister	8c84296fd2	mac802154: account for all header parts during wpan header creationg The current WPAN header creation code checks for EMSGSIZE conditions, but does not account for the MIC field that link layer security may add at the end of the frame. Now that we can accurately calculate the maximum payload size of packets, use that to check for EMSGSIZE conditions. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:42 -04:00
Phoebe Buckheister	c3a6114f31	ieee802154: add definitions for link-layer security and header functions When dealing with 802.15.4, one often has to know the maximum payload size for a given packet. This depends on many factors, one of which is whether or not a security header is present in the frame. These definitions and functions provide an easy way for any upper layer to calculate the maximum payload size for a packet. The first obvious user for this is 6lowpan, which duplicates this calculation and gets it partially wrong because it ignores security headers. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:51:42 -04:00
Cong Wang	200b916f35	rtnetlink: wait for unregistering devices in rtnl_link_unregister() From: Cong Wang <cwang@twopensource.com> commit `50624c934d` (net: Delay default_device_exit_batch until no devices are unregistering) introduced rtnl_lock_unregistering() for default_device_exit_batch(). The same race could happen when we rmmod a driver which calls rtnl_link_unregister(), as we call dev->destructor without the rtnl lock. In the long term, I think we should clean up the mess of netdev_run_todo() and the net namespace exit code. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 15:30:33 -04:00
Antonio Quartulli	cc2f33860c	batman-adv: fix local TT check for outgoing arp requests in DAT Change introduced by `88e48d7b33` ("batman-adv: make DAT drop ARP requests targeting local clients") implements a check that prevents DAT from using the caching mechanism when the client that is supposed to provide a reply to an arp request is local. However change brought by `be1db4f661` ("batman-adv: make the Distributed ARP Table vlan aware") has not converted the above check into its vlan aware version thus making it useless when the local client is behind a vlan. Fix the behaviour by properly specifying the vlan when checking for a client being local or not. Reported-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-15 20:23:47 +02:00
Antonio Quartulli	377fe0f968	batman-adv: increase orig refcount when storing ref in gw_node A pointer to the orig_node representing a bat-gateway is stored in the gw_node->orig_node member, but the refcount for such orig_node is never increased. This leads to memory faults when gw_node->orig_node is accessed and the originator has already been freed. Fix this by increasing the refcount on gw_node creation and decreasing it on gw_node free. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-15 20:03:17 +02:00
Antonio Quartulli	be181015a1	batman-adv: fix reference counting imbalance while sending fragment In the new fragmentation code the batadv_frag_send_packet() function obtains a reference to the primary_if, but it does not release it upon return. This reference imbalance prevents the primary_if (and then the related netdevice) to be properly released on shut down. Fix this by releasing the primary_if in batadv_frag_send_packet(). Introduced by `ee75ed8887` ("batman-adv: Fragment and send skbs larger than mtu") Cc: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Martin Hundebøll <martin@hundeboll.net>	2014-05-15 20:03:17 +02:00
Marek Lindner	16a4142363	batman-adv: fix indirect hard_iface NULL dereference If hard_iface is NULL and goto out is made batadv_hardif_free_ref() doesn't check for NULL before dereferencing it to get to refcount. Introduced in `cb1c92ec37` ("batman-adv: add debugfs support to view multiif tables"). Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-15 20:03:16 +02:00
Pablo Neira Ayuso	3b084e99a3	netfilter: nf_tables: fix trace of matching non-terminal rule Add the corresponding trace if we have a full match in a non-terminal rule. Note that the traces will look slightly different than in x_tables since the log message is emitted after all expressions have been evaluated (contrary to x_tables, which emits it before the target action). This manifests in two differences in nf_tables wrt. x_tables: 1) The rule that enables the tracing is included in the trace. 2) If the rule emits some log message, that is shown before the trace log message. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-15 19:44:20 +02:00
John W. Linville	025a58fd9d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2014-05-15 10:24:28 -04:00
Andrei Otcheretianski	1af586c911	mac80211: Handle the CSA counters correctly Make the beacon CSA counters part of ieee80211_mutable_offsets and don't decrement CSA counters when generating a beacon template. This permits the driver to offload the CSA counters handling. Since mac80211 updates the probe responses with the correct counter, the driver should sync the counter's value with mac80211 using ieee80211_csa_update_counter function. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 15:01:00 +02:00
Andrei Otcheretianski	6ec8c332a0	mac80211: Provide ieee80211_beacon_get_template API Add a new API ieee80211_beacon_get_template, which doesn't affect DTIM counter and should be used if the device generates beacon frames, and new beacon template is needed. In addition set the offsets to TIM IE for MESH interface. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 15:01:00 +02:00
Andrei Otcheretianski	0d06d9ba93	mac80211: Support multiple CSA counters Support up to IEEE80211_MAX_CSA_COUNTERS_NUM csa counters. This is defined to be 2 now, to support both CSA and eCSA counters. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 15:00:58 +02:00
Andrei Otcheretianski	9a774c78e2	cfg80211: Support multiple CSA counters Change the type of NL80211_ATTR_CSA_C_OFF_BEACON and NL80211_ATTR_CSA_C_OFF_PRESP to be NLA_BINARY which allows userspace to use beacons and probe responses with multiple CSA counters. This isn't breaking the API since userspace can continue to use nla_put_u16 for these attributes, which is equivalent to a single element u16 array. In addition advertise max number of supported CSA counters. This is needed when using CSA and eCSA IEs together. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 15:00:42 +02:00
Andrei Otcheretianski	387910cc79	mac80211: Update CSA counters in mgmt frames Track current csa counter value and use it to update mgmt frames at the provided offsets. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 14:54:32 +02:00
Andrei Otcheretianski	34d22ce22b	cfg80211: Add API to update CSA counters in mgmt frames Add NL80211_ATTR_CSA_C_OFFSETS_TX which holds an array of offsets to the CSA counters which should be updated when sending management frames with NL80211_CMD_FRAME. This API should be used by the drivers that wish to keep the CSA counter updated in probe responses, but do not implement probe response offloading and so, do not use ieee80211_proberesp_get function. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 14:52:44 +02:00
Luciano Coelho	00ec75fc5a	cfg80211: pass the actual iftype when calling cfg80211_chandef_dfs_required() There is no need to pass NL80211_IFTYPE_UNSPECIFIED when calling cfg80211_chandef_dfs_required() since we always already have the interface type. So, pass the actual interface type instead. Additionally, have cfg80211_chandef_dfs_required() WARN if the passed interface type is NL80211_IFTYPE_UNSPECIFIED, so we can detect problems more easily. Tested-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Reported-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-15 14:50:34 +02:00
Joe Perches	c722831744	net: Use a more standard macro for INET_ADDR_COOKIE Missing a colon on definition use is a bit odd so change the macro for the 32 bit case to declare an __attribute__((unused)) and __deprecated variable. The __deprecated attribute will cause gcc to emit an error if the variable is actually used. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 16:07:23 -04:00
John W. Linville	eac94da8b4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2014-05-14 15:39:45 -04:00
Ursula Braun	f5738e2ef8	af_iucv: wrong mapping of sent and confirmed skbs When sending data through IUCV a MESSAGE COMPLETE interrupt signals that sent data memory can be freed or reused again. With commit `f9c41a62bb` "af_iucv: fix recvmsg by replacing skb_pull() function" the MESSAGE COMPLETE callback iucv_callback_txdone() identifies the wrong skb as being confirmed, which leads to data corruption. This patch fixes the skb mapping logic in iucv_callback_txdone(). Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:38:39 -04:00
wangweidong	8ba7e7bfc3	dccp: make the request_retries minimum is 1 Documentation/networking/dccp.txt states that request_retries should be greater than 0, so set extra1 to &one instead of &zero. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:34:16 -04:00
WANG Cong	c9f2dba61b	snmp: fix some left over of snmp stats Fengguang reported the following sparse warning: >> net/ipv6/proc.c:198:41: sparse: incorrect type in argument 1 (different address spaces) net/ipv6/proc.c:198:41: expected void [noderef] <asn:3>mib net/ipv6/proc.c:198:41: got void [noderef] <asn:3>*pcpumib Fixes: commit `698365fa18` (net: clean up snmp stats code) Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:33:47 -04:00
WANG Cong	122ff243f5	ipv4: make ip_local_reserved_ports per netns ip_local_port_range is already per netns, so should ip_local_reserved_ports be. And since it is none by default we don't actually need it when we don't enable CONFIG_SYSCTL. By the way, rename inet_is_reserved_local_port() to inet_is_local_reserved_port() Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:31:45 -04:00
Jon Paul Maloy	9816f0615d	tipc: merge port message reception into socket reception function In order to reduce complexity and save a call level during message reception at port/socket level, we remove the function tipc_port_rcv() and merge its functionality into tipc_sk_rcv(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	c82910e2a8	tipc: clean up neigbor discovery message reception The function tipc_disc_rcv(), which is handling received neighbor discovery messages, is perceived as messy, and it is hard to verify its correctness by code inspection. The fact that the task it is set to resolve is fairly complex does not make the situation better. In this commit we try to take a more systematic approach to the problem. We define a decision machine which takes three state flags as input, and produces three action flags as output. We then walk through all permutations of the state flags, and for each of them we describe verbally what is going on, plus that we set zero or more of the action flags. The action flags indicate what should be done once the decision machine has finished its job, while the last part of the function deals with performing those actions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	38504c28a2	tipc: improve and extend media address conversion functions TIPC currently handles two media specific addresses: Ethernet MAC addresses and InfiniBand addresses. Those are kept in three different formats: 1) A "raw" format as obtained from the device. This format is known only by the media specific adapter code in eth_media.c and ib_media.c. 2) A "generic" internal format, in the form of struct tipc_media_addr, which can be referenced and passed around by the generic media-unaware code. 3) A serialized version of the latter, to be conveyed in neighbor discovery messages. Conversion between the three formats can only be done by the media specific code, so we have function pointers for this purpose in struct tipc_media. Here, the media adapters can install their own conversion functions at startup. We now introduce a new such function, 'raw2addr()', whose purpose is to convert from format 1 to format 2 above. We also try, as far as possible, to make the commenting, variable names and usage of these functions uniform, with the purpose of making them more comprehensible. We can now also remove the function tipc_l2_media_addr_set(), whose job is done better by the new function. Finally, we expand the field for serialized addresses (format 3) in discovery messages from 20 to 32 bytes. This is permitted according to the spec, and reduces the risk of problems when we add new media in the future. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	37e22164a8	tipc: rename and move message reassembly function The function tipc_link_frag_rcv() is in reality a re-entrant generic message reassembly function that has nothing in particular to do with the link, where it is defined now. This becomes obvious when we see the need to call the function from other places in the code. In this commit we rename it to tipc_buf_append() and move it to the file msg.c. We also simplify its signature by moving the tail pointer to the control block of the head buffer, hence making the head buffer self-contained. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	5074ab89c5	tipc: mark head of reassembly buffer as non-linear The message reassembly function does not update the 'len' and 'data_len' fields of the head skbuff correctly when fragments are chained to it. This may sometimes lead to obscure errors, such as fragment reordering when we receive fragments which are cloned buffers. This commit fixes this, by ensuring that the two fields are updated correctly. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	ec37dcd382	tipc: don't record link RESET or ACTIVATE messages as traffic In the current code, all incoming LINK_PROTOCOL messages, irrespective of type, nudge the "last message received" checkpoint, informing the link state machine that a message was received from the peer since last supervision timeout event. This inhibits the link from starting probing the peer unnecessarily. However, not only STATE messages are recorded as legitimate incoming traffic this way, but even RESET and ACTIVATE messages, which in reality are there to inform the link that the peer endpoint has been reset. At the same time, some RESET messages may be dropped instead of causing a link reset. This happens when the link endpoint thinks it is fully up and working, and the session number of the RESET is lower than or equal to the current link session. In such cases the RESET is perceived as a delayed remnant from an earlier session, or the current one, and dropped. Now, if a TIPC module is removed and then immediately reinserted, e.g. when using a script, RESET messages may arrive at the peer link endpoint before this one has had time to discover the failure. The RESET may be dropped because of the session number, but only after it has been recorded as a legitimate traffic event. Hence, the receiving link will not start probing, and not discover that the peer endpoint is down, at the same time ignoring the periodic RESET messages coming from that endpoint. We have ended up in a stale state where a failed link cannot be re-established. In this commit, we remedy this by nudging the checkpoint only for received STATE messages, not for RESET or ACTIVATE messages. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:48 -04:00
Jon Paul Maloy	4f4482dcd9	tipc: compensate for double accounting in socket rcv buffer The function net/core/sock.c::__release_sock() runs a tight loop to move buffers from the socket backlog queue to the receive queue. As a security measure, sk_backlog.len of the receiving socket is not set to zero until after the loop is finished, i.e., until the whole backlog queue has been transferred to the receive queue. During this transfer, the data that has already been moved is counted both in the backlog queue and the receive queue, hence giving an incorrect picture of the available queue space for new arriving buffers. This leads to unnecessary rejection of buffers by sk_add_backlog(), which in TIPC leads to unnecessarily broken connections. In this commit, we compensate for this double accounting by adding a counter that keeps track of it. The function socket.c::backlog_rcv() receives buffers one by one from __release_sock(), and adds them to the socket receive queue. If the transfer is successful, it increases a new atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the transferred buffer. If a new buffer arrives during this transfer and finds the socket busy (owned), we attempt to add it to the backlog. However, when sk_add_backlog() is called, we adjust the 'limit' parameter with the value of the new counter, so that the risk of inadvertent rejection is eliminated. It should be noted that this change does not invalidate the original purpose of zeroing 'sk_backlog.len' after the full transfer. We set an upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that doesn't respect the send window) keeps pumping in buffers to sk_add_backlog(), he will eventually reach an upper limit, (2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added to the backlog, and the connection will be broken. Ordinary, well-behaved senders will never reach this buffer limit at all. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:47 -04:00
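
The double accounting described in the commit above can be illustrated with a toy model. This is a minimal sketch in Python, not the kernel code: the class `SockModel`, its fields, the admission test (`backlog_len + rcv_alloc > limit`) and the point at which the counter is reset are simplifying assumptions made for illustration. Only the idea of widening the limit by a `dupl_rcvcnt`-style counter comes from the commit message, and the upper bound on that counter is not modelled.

```python
class SockModel:
    """Toy model of a socket with a backlog queue and a receive queue."""

    def __init__(self, limit):
        self.limit = limit        # queue space allowed for the socket
        self.backlog_len = 0      # bytes accounted to the backlog
        self.rcv_alloc = 0        # bytes accounted to the receive queue
        self.dupl_rcvcnt = 0      # bytes currently counted in both places

    def add_backlog(self, truesize, compensate):
        """Try to queue a buffer on the backlog; return True if accepted."""
        limit = self.limit + (self.dupl_rcvcnt if compensate else 0)
        if self.backlog_len + self.rcv_alloc > limit:
            return False
        self.backlog_len += truesize
        return True

    def move_one(self, truesize):
        """Move one buffer to the receive queue while the drain loop runs.
        backlog_len is deliberately left untouched until the loop ends."""
        self.rcv_alloc += truesize
        self.dupl_rcvcnt += truesize

    def finish_release(self):
        """End of the drain loop: only now is the backlog length zeroed.
        Resetting the counter here is an assumption of this model."""
        self.backlog_len = 0
        self.dupl_rcvcnt = 0


def demo(compensate):
    sk = SockModel(limit=1000)
    for _ in range(3):
        sk.add_backlog(300, compensate)   # 900 bytes queued on the backlog
    sk.move_one(300)
    sk.move_one(300)                      # 600 bytes are now counted twice
    # Real usage is 900 bytes, but naive accounting sees 1500.
    return sk.add_backlog(300, compensate)
```

In this model `demo(False)` rejects the new buffer even though only 900 of the 1000 allowed bytes are really in use, while `demo(True)` accepts it.
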
Jon Paul Maloy	6163a194e0	tipc: decrease connection flow control window Memory overhead when allocating big buffers for data transfer may be quite significant. E.g., truesize of a 64 KB buffer turns out to be 132 KB, 2 x the requested size. This invalidates the "worst case" calculation we have been using to determine the default socket receive buffer limit, which is based on the assumption that 1024x64KB = 67MB buffers may be queued up on a socket. Since TIPC connections cannot survive hitting the buffer limit, we have to compensate for this overhead. We do that in this commit by reducing the fixed connection flow control window from 1024 (2 x 512) messages to 512 (2 x 256). Since older version nodes send out acks at 512 message intervals, compatibility with such nodes is guaranteed, although performance may be non-optimal in such cases. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 15:19:47 -04:00
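
The arithmetic behind the window change above can be worked through explicitly. This is a small sketch in Python; the helper name `queued_bytes` is made up for illustration, and the 64 KB and 132 KB figures are taken from the commit message.

```python
KB = 1024


def queued_bytes(window_msgs, per_msg_bytes):
    """Worst-case bytes queued on a socket for a given flow control window."""
    return window_msgs * per_msg_bytes


requested = 64 * KB    # size requested for one large buffer
truesize = 132 * KB    # memory actually charged for it, per the commit message

old_assumed = queued_bytes(1024, requested)  # 67_108_864 bytes, about 67 MB
old_actual = queued_bytes(1024, truesize)    # 138_412_032 bytes, about 138 MB
new_actual = queued_bytes(512, truesize)     # 69_206_016 bytes, about 69 MB
```

With the real per-buffer cost, the old window could queue roughly twice what the "worst case" calculation assumed; halving the window brings the worst case back close to the original 67 MB figure.
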
Samuel Ortiz	40b9397a1a	Bluetooth: Fix L2CAP LE debugfs entries permissions 0466 was probably meant to be 0644, there's no reason why everyone except root could write there. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-05-14 09:07:07 -07:00
Janusz Dziedzic	67ae07a109	cfg80211: fix start_radar_detection issue After patch: cfg80211/mac80211: refactor cfg80211_chandef_dfs_required() start_radar_detection always fails with -EINVAL. Acked-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-14 16:42:22 +02:00
Johannes Berg	b4b177a555	mac80211: fix on-channel remain-on-channel Jouni reported that if a remain-on-channel was active on the same channel as the current operating channel, then the ROC would start, but any frames transmitted using mgmt-tx on the same channel would get delayed until after the ROC. The reason for this is that the ROC starts, but doesn't have any handling for "remain on the same channel", so it stops the interface queues. The later mgmt-tx then puts the frame on the interface queues (since it's on the current operating channel) and thus they get delayed until after the ROC. To fix this, add some logic to handle remaining on the same channel specially and not stop the queues etc. in this case. This not only fixes the bug but also improves behaviour in this case as data frames etc. can continue to flow. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Tested-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-14 15:48:38 +02:00
Hannes Frederic Sowa	3a1cebe7e0	ipv6: fix calculation of option len in ip6_append_data tot_len does specify the size of struct ipv6_txoptions. We need opt_flen + opt_nflen to calculate the overall length of additional ipv6 extensions. I found this while auditing the ipv6 output path for a memory corruption reported by Alexey Preobrazhensky while he fuzzed an instrumented AddressSanitizer kernel with trinity. This may or may not be the cause of the original bug. Fixes: `4df98e76cd` ("ipv6: pmtudisc setting not respected with UFO/CORK") Reported-by: Alexey Preobrazhensky <preobr@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 00:40:27 -04:00
Hannes Frederic Sowa	3d4405226d	net: avoid dependency of net_get_random_once on nop patching net_get_random_once depends on the static keys infrastructure to patch up the branch to the slow path during boot. This was realized by abusing the static keys api and defining a new initializer to not enable the call site while still indicating that the branch point should get patched up. This was needed to have the fast path considered likely by gcc. The static key initialization during boot up normally walks through all the registered keys and either patches in ideal nops or enables the jump site but omitted that step on x86 if ideal nops were already placed at static_key branch points. Thus net_get_random_once branches did not always become active. This patch switches net_get_random_once to the ordinary static_key api and thus places the kernel fast path in the - by gcc considered - unlikely path. Microbenchmarks on Intel and AMD x86-64 showed that the unlikely path actually beats the likely path in terms of cycle cost and that different nop patterns did not make much difference, thus this switch should not be noticeable. Fixes: `a48e42920f` ("net: introduce new macro net_get_random_once") Reported-by: Tuomas Räsänen <tuomasjjrasanen@tjjr.fi> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 00:37:34 -04:00
Lorenzo Colitti	84f39b08d7	net: support marking accepting TCP sockets When using mark-based routing, sockets returned from accept() may need to be marked differently depending on the incoming connection request. This is the case, for example, if different socket marks identify different networks: a listening socket may want to accept connections from all networks, but each connection should be marked with the network that the request came in on, so that subsequent packets are sent on the correct network. This patch adds a sysctl to mark TCP sockets based on the fwmark of the incoming SYN packet. If enabled, and an unmarked socket receives a SYN, then the SYN packet's fwmark is written to the connection's inet_request_sock, and later written back to the accepted socket when the connection is established. If the socket already has a nonzero mark, then the behaviour is the same as it is today, i.e., the listening socket's fwmark is used. Black-box tested using user-mode linux: - IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the mark of the incoming SYN packet. - The socket returned by accept() is marked with the mark of the incoming SYN packet. - Tested with syncookies=1 and syncookies=2. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 18:35:09 -04:00
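
The mark selection rule described in the commit above can be written down as a small decision function. This is a sketch in Python under stated assumptions: the function and parameter names are hypothetical, and the model ignores the intermediate step of storing the mark in the request sock; it only captures which mark the accepted socket ends up with.

```python
def accepted_socket_mark(listener_mark, syn_fwmark, fwmark_accept_enabled):
    """Return the mark carried by the socket that accept() hands back.

    listener_mark: mark set on the listening socket (0 means unmarked)
    syn_fwmark: fwmark of the incoming SYN packet
    fwmark_accept_enabled: state of the new sysctl (hypothetical name)
    """
    if fwmark_accept_enabled and listener_mark == 0:
        # Unmarked listener: inherit the mark of the incoming SYN.
        return syn_fwmark
    # Sysctl off, or listener already marked: behaviour is unchanged.
    return listener_mark
```

For example, with the sysctl enabled an unmarked listener that receives a SYN carrying fwmark 17 yields an accepted socket marked 17, whereas a listener already marked 5 keeps 5.
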
Lorenzo Colitti	1b3c61dc1a	net: Use fwmark reflection in PMTU discovery. Currently, routing lookups used for Path PMTU Discovery in absence of a socket or on unmarked sockets use a mark of 0. This causes PMTUD not to work when using routing based on netfilter fwmark mangling and fwmark ip rules, such as: iptables -j MARK --set-mark 17 ip rule add fwmark 17 lookup 100 This patch causes these route lookups to use the fwmark from the received ICMP error when the fwmark_reflect sysctl is enabled. This allows the administrator to make PMTUD work by configuring appropriate fwmark rules to mark the inbound ICMP packets. Black-box tested using user-mode linux by pointing different fwmarks at routing tables egressing on different interfaces, and using iptables mangling to mark packets inbound on each interface with the interface's fwmark. ICMPv4 and ICMPv6 PMTU discovery work as expected when mark reflection is enabled and fail when it is disabled. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 18:35:09 -04:00
Lorenzo Colitti	e110861f86	net: add a sysctl to reflect the fwmark on replies Kernel-originated IP packets that have no user socket associated with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.) are emitted with a mark of zero. Add a sysctl to make them have the same mark as the packet they are replying to. This allows an administrator that wishes to do so to use mark-based routing, firewalling, etc. for these replies by marking the original packets inbound. Tested using user-mode linux: - ICMP/ICMPv6 echo replies and errors. - TCP RST packets (IPv4 and IPv6). Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 18:35:08 -04:00
Daniel Lee	3a19ce0eec	tcp: IPv6 support for fastopen server After all the preparatory works, supporting IPv6 in Fast Open is now easy. We pretty much just mirror v4 code. The only difference is how we generate the Fast Open cookie for IPv6 sockets. Since Fast Open cookie is 128 bits and we use AES 128, we use CBC-MAC to encrypt both the source and destination IPv6 addresses since the cookie is a MAC tag. Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:53:03 -04:00
Yuchung Cheng	0a672f7413	tcp: improve fastopen icmp handling If a fast open socket is already accepted by the user, it should be treated like a connected socket to record the ICMP error in sk_softerr, so the user can fetch it. Do that in both tcp_v4_err and tcp_v6_err. Also refactor the sequence window check to improve readability (e.g., there were two local variables named 'req'). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:53:03 -04:00
Yuchung Cheng	843f4a55e3	tcp: use tcp_v4_send_synack on first SYN-ACK To avoid large code duplication in IPv6, we need to first simplify the complicated SYN-ACK sending code in tcp_v4_conn_request(). To use tcp_v4(6)_send_synack() to send all SYN-ACKs, we need to initialize the mini socket's receive window before trying to create the child socket and/or building the SYN-ACK packet. So we move that initialization from tcp_make_synack() to tcp_v4_conn_request() as a new function tcp_openreq_init_req_rwin(). After this refactoring the SYN-ACK sending code is simpler, and Fast Open for IPv6 is easier to implement. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:53:02 -04:00
Yuchung Cheng	89278c9dc9	tcp: simplify fast open cookie processing Consolidate various cookie checking and generation code to simplify the fast open processing. The main goal is to reduce code duplication in tcp_v4_conn_request() for IPv6 support. Removes two experimental sysctl flags TFO_SERVER_ALWAYS and TFO_SERVER_COOKIE_NOT_CHKD used primarily for developmental debugging purposes. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:53:02 -04:00
Yuchung Cheng	5b7ed0892f	tcp: move fastopen functions to tcp_fastopen.c Move common TFO functions that will be used by both v4 and v6 to tcp_fastopen.c. Create a helper tcp_fastopen_queue_check(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:53:02 -04:00
Wilfried Klaebe	7ad24ea4bf	net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops *ops; struct net_device *dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:43:20 -04:00
John W. Linville	3231d65ffe	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-05-13 15:27:44 -04:00
Mathias Krause	0f49ff0702	net: ptp: mark filter as __initdata sk_unattached_filter_create() will copy the filter's instructions so we don't need to have the master copy hanging around after initialization. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 13:17:24 -04:00
David S. Miller	1268e253a8	net: filter: Fix redefinition warnings on x86-64. Do not collide with the x86-64 PTRACE user API namespace. net/core/filter.c:57:0: warning: "R8" redefined [enabled by default] arch/x86/include/uapi/asm/ptrace-abi.h:38:0: note: this is the location of the previous definition Fix by adding a BPF_ prefix to the register macros. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 13:13:33 -04:00
David S. Miller	6262971a8a	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - properly release neigh_ifinfo in batadv_iv_ogm_process_per_outif() - properly release orig_ifinfo->router when freeing orig_ifinfo - properly release neigh_node objects during periodic check - properly release neigh_info objects when the related hard_iface is free'd These changes are all very important because they fix some reference counting imbalances that lead to the impossibility of releasing the netdev object used by batman-adv on shutdown. The consequence is that such object cannot be destroyed by the networking stack (the refcounter does not reach zero) thus bringing the system in hanging state during a normal reboot operation or a network reconfiguration.	2014-05-13 12:53:36 -04:00
Duan Jiong	2176d5d418	neigh: set nud_state to NUD_INCOMPLETE when probing router reachability Since commit 7e98056964 ("ipv6: router reachability probing"), a router that falls into NUD_FAILED will be probed. Now if function rt6_select() selects a router whose neighbour state is NUD_FAILED, and at the same time function rt6_probe() changes the neighbour state to NUD_PROBE, then function dst_neigh_output() can directly send packets, but actually the neighbour is still unreachable. If we set nud_state to NUD_INCOMPLETE instead of NUD_PROBE, packets will not be sent out until the neighbour is reachable. In addition, because the route should be probed with a single NS, we must set neigh->probes to neigh_max_probes(), so that when the neigh timer times out, function neigh_timer_handler() will not send further NS messages. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 12:43:05 -04:00
Felix Fietkau	8c48b50a1a	cfg80211: allow restricting supported dfs regions At the moment, the ath9k/ath10k DFS module only supports detecting ETSI radar patterns. Add a bitmap in the interface combinations, indicating which DFS regions are supported by the detector. If unset, support for all regions is assumed. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-13 15:50:06 +02:00
Emmanuel Grumbach	c52666aef9	mac80211: fix suspend vs. association race If the association is in progress while we suspend, the stack will be in a messed up state. Clean it before we suspend. This patch completes Johannes's patch: `1a1cb744de` Author: Johannes Berg <johannes.berg@intel.com> mac80211: fix suspend vs. authentication race Cc: <stable@vger.kernel.org> Fixes: `12e7f51702` ("mac80211: cleanup generic suspend/resume procedures") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-13 13:58:16 +02:00
Fabian Frederick	fc68086ce8	net/xfrm/xfrm_output.c: move EXPORT_SYMBOL Fix checkpatch warning: "WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable" Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-05-13 12:44:28 +02:00
Susant Sahani	c8965932a2	ip6_tunnel: fix potential NULL pointer dereference The function ip6_tnl_validate assumes that the rtnl attribute IFLA_IPTUN_PROTO is always filled. If this attribute is not filled by the userspace application, the kernel crashes with a NULL pointer dereference. This patch fixes the potential kernel crash when IFLA_IPTUN_PROTO is missing. Signed-off-by: Susant Sahani <susant@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 00:27:19 -04:00
Yang Yingliang	b2ce49e737	sch_hhf: fix comparison of qlen and limit When I use the following command, eth0 cannot send any packets. #tc qdisc add dev eth0 root handle 1: hhf limit 1 Because qlen needs to be smaller than limit, all packets were dropped. Fix this by using qlen <= limit. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 14:55:21 -04:00
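
The off-by-one in the commit above is easy to reproduce in a toy queue. This is a minimal sketch in Python, assuming an enqueue path that increments the queue length first and then compares it against the limit; the function name and the choice to drop the newly arrived packet are simplifications for illustration and are not taken from the qdisc source.

```python
def enqueue_all(n_packets, limit, fixed):
    """Offer n_packets to an empty queue that is never drained and
    return how many of them are accepted."""
    qlen = 0
    accepted = 0
    for _ in range(n_packets):
        qlen += 1
        # Before the fix the test was strict, so limit 1 admits nothing.
        within_limit = qlen <= limit if fixed else qlen < limit
        if within_limit:
            accepted += 1
        else:
            qlen -= 1  # over the limit: the packet is dropped again
    return accepted
```

With `limit=1`, the strict comparison accepts no packet at all, which matches the reported symptom, while the `<=` comparison lets one packet sit in the queue.
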
dingtianhong	f06c7f9f92	vlan: rename __vlan_find_dev_deep() to __vlan_find_dev_deep_rcu() __vlan_find_dev_deep() should always be called under RCU; following David's suggestion, renaming it to __vlan_find_dev_deep_rcu() looks more reasonable. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 14:39:13 -04:00
John W. Linville	c5e64d6b70	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2014-05-12 14:12:19 -04:00
WANG Cong	60ff746739	net: rename local_df to ignore_df As suggested by several people, rename local_df to ignore_df, since it means "ignore df bit if it is set". Cc: Maciej Żenczykowski <maze@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 14:03:41 -04:00
David S. Miller	5f013c9bc7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 13:19:14 -04:00
Pablo Neira Ayuso	7e9bc10db2	netfilter: nf_tables: fix missing return trace at the end of non-base chain Display "return" for implicit rule at the end of a non-base chain, instead of when popping chain from the stack. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-12 16:33:11 +02:00
Pablo Neira Ayuso	f7e7e39b21	netfilter: nf_tables: fix bogus rulenum after goto action After returning from the chain that we just went to with no matches, we get a bogus rule number in the trace. To fix this, we would need to iterate over the list of remaining rules in the chain to update the rule number counter. Patrick suggested setting this to the maximum value, since the default base chain policy is the very last action when processing of the base chain is over. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-12 16:33:10 +02:00
Pablo Neira Ayuso	7b9d5ef932	netfilter: nf_tables: fix tracing of the goto action Add missing code to trace goto actions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-12 16:33:08 +02:00
Pablo Neira Ayuso	5467a51221	netfilter: nf_tables: fix goto action This patch fixes a crash when trying to access the counters and the default chain policy from the non-base chain that we have reached via the goto chain. Fix this by falling back on the original base chain after returning from the custom chain. While fixing this, kill the inline function to account chain statistics to improve source code readability. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-12 16:32:41 +02:00
Steffen Klassert	6d004d6cc7	vti: Use the tunnel mark for lookup in the error handlers. We need to use the mark we get from the tunnels o_key to lookup the right vti state in the error handlers. This patch ensures that. Fixes: `df3893c1` ("vti: Update the ipv4 side to use it's own receive hook.") Fixes: `fa9ad96d` ("vti6: Update the ipv6 side to use its own receive hook.") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-05-12 09:36:03 +02:00
Mathias Krause	fd71143645	vti6: Don't unregister pernet ops twice on init errors If we fail to register one of the xfrm protocol handlers we will unregister the pernet ops twice on the error exit path. This will probably lead to a kernel panic as the double deregistration leads to a double kfree(). Fix this by removing one of the calls to do it only once. Fixes: `fa9ad96d49` ("vti6: Update the ipv6 side to use its own...") Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-05-12 07:43:21 +02:00
Duan Jiong	163cd4e817	ipv6: remove parameter rt from fib6_prune_clones() the parameter rt will be assigned to c.arg in function fib6_clean_tree(), but function fib6_prune_clone() doesn't use c.arg, so we can remove it safely. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 01:06:42 -04:00
Alexei Starovoitov	9739eef13c	net: filter: make BPF conversion more readable Introduce BPF helper macros to define instructions (similar to old BPF_STMT/BPF_JUMP macros) Use them while converting classic BPF to internal and in BPF testsuite later. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 00:23:55 -04:00
Simon Wunderlich	709de13f0c	batman-adv: fix removing neigh_ifinfo When an interface is removed separately, all neighbors need to be checked if they have a neigh_ifinfo structure for that particular interface. If that is the case, remove that ifinfo so any references to a hard interface can be freed. This is a regression introduced by `89652331c0` ("batman-adv: split tq information in neigh_node struct") Reported-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-11 09:10:58 +02:00
Pablo Neira Ayuso	d088be8042	netfilter: nf_tables: reset rule number counter after jump and goto Otherwise we start incrementing the rule number counter from the previous chain iteration. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-10 19:12:04 +02:00
Simon Wunderlich	7b955a9fc1	batman-adv: always run purge_orig_neighbors The current code will not execute batadv_purge_orig_neighbors() when an orig_ifinfo has already been purged. However we need to run it in any case. Fix that. This is a regression introduced by `7351a4822d` ("batman-adv: split out router from orig_node") Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-10 10:58:58 +02:00
Simon Wunderlich	000c8dff97	batman-adv: fix neigh reference imbalance When an interface is removed from batman-adv, the orig_ifinfo of a orig_node may be removed without releasing the router first. This will prevent the reference for the neighbor pointed at by the orig_ifinfo->router to be released, and this leak may result in reference leaks for the interface used by this neighbor. Fix that. This is a regression introduced by `7351a4822d` ("batman-adv: split out router from orig_node"). Reported-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-10 10:58:45 +02:00
Simon Wunderlich	c1e517fbbc	batman-adv: fix neigh_ifinfo imbalance The neigh_ifinfo object must be freed if it has been used in batadv_iv_ogm_process_per_outif(). This is a regression introduced by `89652331c0` ("batman-adv: split tq information in neigh_node struct") Reported-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-10 10:58:42 +02:00
Andrzej Kaczmarek	5a134faeef	Bluetooth: Store TX power level for connection This patch adds support to store local TX power level for connection when reply for HCI_Read_Transmit_Power_Level is received. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-09 14:16:42 -07:00
David S. Miller	1448eb5669	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-08 This one is all from Johannes: "Here are a few small fixes for the current cycle: radiotap TX flags were wrong (fix by Bob), Chun-Yeow fixes an SMPS issue with mesh interfaces, Eliad fixes a locking bug and a cfg80211 state problem and finally Henning sent me a fix for IBSS rate information." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 16:46:53 -04:00
wangweidong	f66138c847	sctp: add a checking for sctp_sysctl_net_register When register_net_sysctl failed, we should free the sysctl_table. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 16:41:09 -04:00
wangweidong	eb9f37053d	Revert "sctp: optimize the sctp_sysctl_net_register" This revert commit efb842c45("sctp: optimize the sctp_sysctl_net_register"), Since it doesn't kmemdup a sysctl_table for init_net, so the init_net->sctp.sysctl_header->ctl_table_arg points to sctp_net_table which is a static array pointer. So when doing sctp_sysctl_net_unregister, it will free sctp_net_table, then we will get a NULL pointer dereference like that: [ 262.948220] BUG: unable to handle kernel NULL pointer dereference at 000000000000006c [ 262.948232] IP: [<ffffffff81144b70>] kfree+0x80/0x420 [ 262.948260] PGD db80a067 PUD dae12067 PMD 0 [ 262.948268] Oops: 0000 [#1] SMP [ 262.948273] Modules linked in: sctp(-) crc32c_generic libcrc32c ... [ 262.948338] task: ffff8800db830190 ti: ffff8800dad00000 task.ti: ffff8800dad00000 [ 262.948344] RIP: 0010:[<ffffffff81144b70>] [<ffffffff81144b70>] kfree+0x80/0x420 [ 262.948353] RSP: 0018:ffff8800dad01d88 EFLAGS: 00010046 [ 262.948358] RAX: 0100000000000000 RBX: ffffffffa0227940 RCX: ffffea0000707888 [ 262.948363] RDX: ffffea0000707888 RSI: 0000000000000001 RDI: ffffffffa0227940 [ 262.948369] RBP: ffff8800dad01de8 R08: 0000000000000000 R09: ffff8800d9e983a9 [ 262.948374] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0227940 [ 262.948380] R13: ffffffff8187cfc0 R14: 0000000000000000 R15: ffffffff8187da10 [ 262.948386] FS: 00007fa2a2658700(0000) GS:ffff880112800000(0000) knlGS:0000000000000000 [ 262.948394] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 262.948400] CR2: 000000000000006c CR3: 00000000cddc0000 CR4: 00000000000006e0 [ 262.948410] Stack: [ 262.948413] ffff8800dad01da8 0000000000000286 0000000020227940 ffffffffa0227940 [ 262.948422] ffff8800dad01dd8 ffffffff811b7fa1 ffffffffa0227940 ffffffffa0227940 [ 262.948431] ffffffff8187d960 ffffffff8187cfc0 ffffffff8187d960 ffffffff8187da10 [ 262.948440] Call Trace: [ 262.948457] [<ffffffff811b7fa1>] ? 
unregister_sysctl_table+0x51/0xa0 [ 262.948476] [<ffffffffa020d1a1>] sctp_sysctl_net_unregister+0x21/0x30 [sctp] [ 262.948490] [<ffffffffa020ef6d>] sctp_net_exit+0x12d/0x150 [sctp] [ 262.948512] [<ffffffff81394f49>] ops_exit_list+0x39/0x60 [ 262.948522] [<ffffffff813951ed>] unregister_pernet_operations+0x3d/0x70 [ 262.948530] [<ffffffff81395292>] unregister_pernet_subsys+0x22/0x40 [ 262.948544] [<ffffffffa020efcc>] sctp_exit+0x3c/0x12d [sctp] [ 262.948562] [<ffffffff810c5e04>] SyS_delete_module+0x194/0x210 [ 262.948577] [<ffffffff81240fde>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 262.948587] [<ffffffff815217a2>] system_call_fastpath+0x16/0x1b With this revert, it won't occur the Oops. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 16:41:08 -04:00
wangweidong	be7faf7168	rds: remove the unneed NULL checking unregister_net_sysctl_table will check the ctl_table_header, so remove the unneeded check. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 15:59:45 -04:00
David S. Miller	b3d4056632	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains netfilter fixes for your net tree, they are: 1) Fix use after free in nfnetlink when sending a batch for some unsupported subsystem, from Denys Fedoryshchenko. 2) Skip autoload of the nat module if no binding is specified via ctnetlink, from Florian Westphal. 3) Set local_df after netfilter defragmentation to avoid a bogus ICMP fragmentation needed in the forwarding path, also from Florian. 4) Fix potential use after free in ip6_route_me_harder() when returning the error code to the upper layers, from Sergey Popovich. 5) Skip possible bogus ICMP time exceeded emitted from the router (not valid according to RFC) if conntrack zones are used, from Vasily Averin. 6) Fix fragment handling when nf_defrag_ipv4 is loaded but nf_conntrack is not present, also from Vasily. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 13:17:30 -04:00
Eliad Peller	f9ac71bfcc	mac80211: fix vif name tracing If sdata doesn't have a valid dev (e.g. in case of monitor vif), the vif_name field was initialized with (a length of) some short string, but later was set to a different, potentially larger one. This resulted in out-of-bounds write, which usually appeared as garbage in the trace log. Simply trace sdata->name, as it should always have the correct name for both cases. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-09 14:35:40 +02:00
Marcel Holtmann	b75cf9cd16	Bluetooth: Increment management interface revision This patch increments the management interface revision due to the changes with the Device Found management event and other fixes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-05-09 15:05:57 +03:00
Johannes Berg	f6837ba8c9	mac80211: handle failed restart/resume better When the driver fails during HW restart or resume, the whole stack goes into a very confused state with interfaces being up while the hardware is down etc. Address this by shutting down everything; we'll run into a lot of warnings in the process but that's better than having the whole stack get messed up. Reviewed-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-09 12:21:34 +02:00
Johannes Berg	4a817aa78f	mac80211: allow VHT with peers not capable of 40MHz There are two (related) issues with this. One case, reported by Michal, is related to hostap: it unsets the 20/40 capability bit for stations that associate when it's in 20 MHz mode. The other case, reported by Eyal, is that some APs like Netgear R6300v2 and probably others based on the BCM4360 chipset can be configured to do VHT at 20 MHz. In this case the beacon has a VHT IE but the HT cap indicates that the transmitter only supports 20 MHz. In both of these cases, we currently avoid VHT and use only HT. This means we can't use the highest rates (MCS8), so fixing this leads to throughput improvements. Reported-by: Michal Kazior <michal.kazior@tieto.com> Reported-by: Eyal Shapira <eyal@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-09 09:56:53 +02:00
Ying Xue	ca9cf06a06	tipc: don't directly overwrite node action_flags Each node action flag should be set or cleared separately, instead we now set the whole flags variable in one shot, and it's turned out to be hard to see which other flags are affected. Therefore, for instance, we explicitly clear TIPC_WAIT_OWN_LINKS_DOWN bit in node_lost_contact(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 01:41:01 -04:00
Ying Xue	aecb9bb89c	tipc: rename enum names of node flags Rename node flags to action_flags as well as its enum names so that they can reflect its real meanings. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 01:41:01 -04:00
Tom Herbert	58d6085c14	l2tp: Remove UDP checksum verification Validating the UDP checksum is now done in UDP before handing packets to the encapsulation layer. Note that this also eliminates the "feature" where L2TP can ignore a non-zero UDP checksum (doing this was contrary to RFC 1122). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	0a80966b10	net: Verify UDP checksum before handoff to encap Moving validation of UDP checksum to be done in UDP not encap layer. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	39471ac8dd	icmp6: Call skb_checksum_validate Use skb_checksum_validate to verify checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	29a96e1f36	icmp: Call skb_checksum_simple_validate Use skb_checksum_simple_validate to verify checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	de08dc1a8e	igmp: Call skb_checksum_simple_validate Use skb_checksum_simple_validate to verify checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	81249bea1f	gre6: Call skb_checksum_simple_validate Use skb_checksum_simple_validate to verify checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
Tom Herbert	b1036c6a47	gre: Call skb_checksum_simple_validate Use skb_checksum_simple_validate to verify checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:47:50 -04:00
WANG Cong	32a4be4890	ipv4: remove inet_addr_hash_lock in devinet.c All the callers hold RTNL lock, so there is no need to use inet_addr_hash_lock to protect the hash list. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 22:56:56 -04:00
Cong Wang	ba6b918ab2	ping: move ping_group_range out of CONFIG_SYSCTL Similarly, when CONFIG_SYSCTL is not set, ping_group_range should still work, just that no one can change it. Therefore we should move it out of sysctl_net_ipv4.c. And, it should not share the same seqlock with ip_local_port_range. BTW, rename it to ->ping_group_range instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Stefan de Konink <stefan@konink.de> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 22:50:47 -04:00
Cong Wang	c9d8f1a642	ipv4: move local_port_range out of CONFIG_SYSCTL When CONFIG_SYSCTL is not set, ip_local_port_range should still work, just that no one can change it. Therefore we should move it out of sysctl_inet.c. Also, rename it to ->ip_local_ports instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Stefan de Konink <stefan@konink.de> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 22:50:47 -04:00
Sergey Popovich	a8951d5814	netfilter: Fix potential use after free in ip6_route_me_harder() Dst is released one line before we access it again with dst->error. Fixes: `58e35d1471` netfilter: ipv6: propagate routing errors from ip6_route_me_harder() Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-09 02:36:39 +02:00
John W. Linville	6153871f77	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2014-05-08 11:13:41 -04:00
Andrzej Kaczmarek	5ae76a9415	Bluetooth: Store RSSI for connection This patch adds support to store RSSI for connection when reply for HCI_Read_RSSI is received. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-08 08:01:57 -07:00
Johan Hedberg	38e4a91566	Bluetooth: Add support for SMP Invalid Parameters error code The Invalid Parameters error code is used to indicate that the command length is invalid or that a parameter is outside of the specified range. This error code wasn't clearly specified in the Bluetooth 4.0 specification but since 4.1 this has been fixed. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-08 07:50:02 -07:00
Luciano Coelho	f29f58a9e5	mac80211: fix sparse warning caused by __ieee80211_channel_switch() Commit `59af6928` (mac80211: fix CSA tx queue stopping) introduced a sparse warning: net/mac80211/cfg.c:3274:5: warning: symbol '__ieee80211_channel_switch' was not declared. Should it be static? Fix it by declaring the function static. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-08 12:00:12 +02:00
Michal Kazior	cf8767dd76	mac80211: disconnect iface if CSA unexpectedly fails It doesn't make much sense to leave a crippled interface running. As a side effect this will unblock tx queues with CSA reason immediately after failure instead of until after userspace requests interface to stop. This also gives userspace an opportunity to indirectly see CSA failure. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> [small code cleanup] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-08 11:57:27 +02:00
Sergey Popovich	aeefa1ecfc	ipv4: fib_semantics: increment fib_info_cnt after fib_info allocation Increment fib_info_cnt in fib_create_info() right after successfully allocating the fib_info structure; otherwise a fib_metrics allocation failure leads to fib_info_cnt being incorrectly decremented in free_fib_info(), called on the error path from fib_create_info(). Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 17:14:32 -04:00
WANG Cong	698365fa18	net: clean up snmp stats code commit `8f0ea0fe3a` (snmp: reduce percpu needs by 50%) reduced snmp array size to 1, so technically it doesn't have to be an array any more. What's more, after the following commit: commit `933393f58f` Date: Thu Dec 22 11:58:51 2011 -0600 percpu: Remove irqsafe_cpu_xxx variants We simply say that regular this_cpu use must be safe regardless of preemption and interrupt state. That has no material change for x86 and s390 implementations of this_cpu operations. However, arches that do not provide their own implementation for this_cpu operations will now get code generated that disables interrupts instead of preemption. probably no arch wants to have SNMP_ARRAY_SZ == 2. At least after almost 3 years, no one complains. So, just convert the array to a single pointer and remove snmp_mib_init() and snmp_mib_free() as well. Cc: Christoph Lameter <cl@linux.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 16:06:05 -04:00
Florian Westphal	c1e756bfcb	Revert "net: core: introduce netif_skb_dev_features" This reverts commit `d206940319`, there are no more callers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 15:49:07 -04:00
Florian Westphal	c7ba65d7b6	net: ip: push gso skb forwarding handling down the stack Doing the segmentation in the forward path has one major drawback: When using virtio, we may process gso udp packets coming from host network stack. In that case, netfilter POSTROUTING will see one packet with udp header followed by multiple ip fragments. Delay the segmentation and do it after POSTROUTING invocation to avoid this. Fixes: `fe6cc55f3a` ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 15:49:07 -04:00
Florian Westphal	418a31561d	net: ipv6: send pkttoobig immediately if orig frag size > mtu If conntrack defragments incoming ipv6 frags it stores largest original frag size in ip6cb and sets ->local_df. We must thus first test the largest original frag size vs. mtu, and not vice versa. Without this patch PKTTOOBIG is still generated in ip6_fragment() later in the stack, but 1) IPSTATS_MIB_INTOOBIGERRORS won't increment 2) packet did (needlessly) traverse netfilter postrouting hook. Fixes: `fe6cc55f3a` ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 15:27:59 -04:00
Florian Westphal	ca6c5d4ad2	net: ipv4: ip_forward: fix inverted local_df test local_df means 'ignore DF bit if set', so if it's set we're allowed to perform ip fragmentation. This wasn't noticed earlier because the output path also drops such skbs (and emits the needed icmp error) and because netfilter ip defrag did not set local_df until a couple of days ago. The only difference is that DF packets larger than the MTU are now discarded earlier (e.g. we avoid a pointless netfilter postrouting trip). While at it, drop the repeated ip_exceeds_mtu test; checking it once is enough. Fixes: `fe6cc55f3a` ("net: ip, ipv6: handle gso skbs in forwarding path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-07 15:26:09 -04:00
Loic Poulain	771bb3ddc2	rfkill-gpio: Use gpio cansleep version If gpio controller requires waiting for read and write GPIO values, then we have to use the gpio cansleep api. Fix the rfkill_gpio_set_power which calls only the nonsleep version (causing kernel warning). There is no problem to use the cansleep version here because we are not in IRQ handler or similar context (cf rfkill_set_block). Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-07 13:20:59 +02:00
Michal Kazior	e5593f56eb	mac80211: ignore cqm during csa It is not guaranteed that multi-vif channel switching is tightly synchronized. It makes sense to ignore cqm (missing beacons, et al) while csa is progressing and re-check it after it completes. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-07 11:04:51 +02:00
John W. Linville	68b422db3d	Merge branch 'rfkill-gpio-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2014-05-06 14:43:34 -04:00
John W. Linville	cabae81103	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2014-05-06 14:05:51 -04:00
Michal Kazior	f04c22033c	cfg80211: export interface stopping function This exports a new cfg80211_stop_iface() function. This is intended for driver internal interface combination management and channel switching. Due to locking issues (it re-enters driver) the call is asynchronous and uses cfg80211 event list/worker. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-06 15:16:34 +02:00
Michal Kazior	66199506fb	mac80211: split CSA finalize function Improves readability and modularity. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-06 15:10:00 +02:00
Michal Kazior	59af6928d2	mac80211: fix CSA tx queue stopping It was possible for tx queues to be stuck stopped if AP CSA finalization failed. In that case neither stop_ap nor do_stop woke the queues up. This means it was impossible to perform tx at all until driver was reloaded or a successful CSA was performed later. It was possible to solve this in a simpler manner however this is more robust and future proof (having multi-vif CSA in mind). New sdata->csa_block_tx is introduced to keep track of which interfaces requested tx to be blocked for CSA. This is required because mac80211 stops all tx queues for that purpose. This means queues must be awoken only when last tx-blocking CSA interface is finished. It is still possible to have tx queues stopped after CSA failure but as soon as offending interfaces are stopped from userspace (stop_ap or ifdown) tx queues are woken up properly. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-06 15:10:00 +02:00
Steffen Klassert	edb666f07e	xfrm6: Properly handle unsupported protocols We don't catch the case if an unsupported protocol is submitted to the xfrm6 protocol handlers, this can lead to NULL pointer dereferences. Fix this by adding the appropriate checks. Fixes: `7e14ea15` ("xfrm6: Add IPsec protocol multiplexer") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-05-06 07:08:38 +02:00
Tom Herbert	79e0f1c9f2	ipv6: Need to sock_put on csum error Commit `4068579e1e` ("net: Implmement RFC 6936 (zero RX csums for UDP/IPv6)") introduced zero checksums being allowed for IPv6, but in the case that a socket disallows a zero checksum on RX we need to sock_put. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 23:17:16 -04:00
Libor Pechacek	86aae6c7b5	Bluetooth: Convert RFCOMM spinlocks into mutexes Enabling CONFIG_DEBUG_ATOMIC_SLEEP has shown that some rfcomm functions acquiring spinlocks call sleeping locks further in the chain. Converting the offending spinlocks into mutexes makes sleeping safe. Signed-off-by: Libor Pechacek <lpechacek@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-05 19:25:06 -07:00
Linus Torvalds	2080cee435	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) e1000e computes header length incorrectly wrt vlans, fix from Vlad Yasevich. 2) ns_capable() check in sock_diag netlink code, from Andrew Lutomirski. 3) Fix invalid queue pairs handling in virtio_net, from Amos Kong. 4) Checksum offloading busted in sxgbe driver due to incorrect descriptor layout, fix from Byungho An. 5) Fix build failure with SMC_DEBUG set to 2 or larger, from Zi Shen Lim. 6) Fix uninitialized A and X registers in BPF interpreter, from Alexei Starovoitov. 7) Fix arch dependencies of candence driver. 8) Fix netlink capabilities checking tree-wide, from Eric W Biederman. 9) Don't dump IFLA_VF_PORTS if netlink request didn't ask for it in IFLA_EXT_MASK, from David Gibson. 10) IPV6 FIB dump restart doesn't handle table changes that happen meanwhile, causing the code to loop forever or emit dups, fix from Kumar Sandararajan. 11) Memory leak on VF removal in bnx2x, from Yuval Mintz. 12) Bug fixes for new Altera TSE driver from Vince Bridgers. 13) Fix route lookup key in SCTP, from Xugeng Zhang. 14) Use BH blocking spinlocks in SLIP, as per a similar fix to CAN/SLCAN driver. From Oliver Hartkopp. 15) TCP doesn't bump retransmit counters in some code paths, fix from Eric Dumazet. 16) Clamp delayed_ack in tcp_cubic to prevent theoretical divides by zero. Fix from Liu Yu. 17) Fix locking imbalance in error paths of HHF packet scheduler, from John Fastabend. 18) Properly reference the transport module when vsock_core_init() runs, from Andy King. 19) Fix buffer overflow in cdc_ncm driver, from Bjørn Mork. 20) IP_ECN_decapsulate() doesn't see a correct SKB network header in ip_tunnel_rcv(), fix from Ying Cai. 
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits) net: macb: Fix race between HW and driver net: macb: Remove 'unlikely' optimization net: macb: Re-enable RX interrupt only when RX is done net: macb: Clear interrupt flags net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP ip_tunnel: Set network header properly for IP_ECN_decapsulate() e1000e: Restrict MDIO Slow Mode workaround to relevant parts e1000e: Fix issue with link flap on 82579 e1000e: Expand workaround for 10Mb HD throughput bug e1000e: Workaround for dropped packets in Gig/100 speeds on 82579 net/mlx4_core: Don't issue PCIe speed/width checks for VFs net/mlx4_core: Load the Eth driver first net/mlx4_core: Fix slave id computation for single port VF net/mlx4_core: Adjust port number in qp_attach wrapper when detaching net: cdc_ncm: fix buffer overflow Altera TSE: ALTERA_TSE should depend on HAS_DMA vsock: Make transport the proto owner net: sched: lock imbalance in hhf qdisc net: mvmdio: Check for a valid interrupt instead of an error net phy: Check for aneg completion before setting state to PHY_RUNNING ...	2014-05-05 15:59:46 -07:00
Linus Torvalds	5575eeb7b9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "First, there is a critical fix for the new primary-affinity function that went into -rc1. The second batch of patches from Zheng fix a range of problems with directory fragmentation, readdir, and a few odds and ends for cephfs" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: reserve caps for file layout/lock MDS requests ceph: avoid releasing caps that are being used ceph: clear directory's completeness when creating file libceph: fix non-default values check in apply_primary_affinity() ceph: use fpos_cmp() to compare dentry positions ceph: check directory's completeness before emitting directory entry	2014-05-05 15:17:02 -07:00
Ying Xue	52ff872055	tipc: purge signal handler infrastructure In the previous commits of this series, we removed all asynchronous actions which were based on the tasklet handler - "tipc_k_signal()". So the moment has now come when we can completely remove the tasklet handler infrastructure. That is done with this commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:45 -04:00
Ying Xue	3f5a12bd9f	tipc: avoid to asynchronously reset all links Postpone resetting all links until after the bclink lock is released, so that the links are no longer reset asynchronously. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:45 -04:00
Ying Xue	eb8b00f5f2	tipc: convert allocations of global variables associated with bclink Convert allocations of global variables associated with bclink from static way to dynamical way for the convenience of bclink instance initialisation. Meanwhile, this also helps TIPC support name space in the future easily. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:45 -04:00
Ying Xue	d69afc90b8	tipc: define new functions to operate bc_lock As we are going to do more jobs when bc_lock is released, the two operations of holding/releasing the lock should be encapsulated with functions. In addition, we move bc_lock spin lock into tipc_bclink structure avoiding to define the global variable. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	ca0c42732c	tipc: avoid to asynchronously deliver name tables to peer node Postpone the actions of delivering name tables until after node lock is released, avoiding to do it under asynchronous context. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	9d56194968	tipc: remove TIPC_NAMES_GONE node flag Previously, the removal of all publications pertaining to a lost node from the name table was finished asynchronously in tasklet context, so the TIPC_NAMES_GONE flag was needed to indicate whether the node cleanup work had finished. Now that the cleanup work is finished by the time the node lock is released, the flag is meaningless. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	9db9fdd198	tipc: avoid to asynchronously notify subscriptions Postpone the actions of notifying subscriptions until after node lock is released, avoiding to asynchronously execute registered handlers when node is lost. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	10f465c496	tipc: rename setup_blocked variable of node struct to flags Rename setup_blocked variable of node struct to a more common name called "flags", which will be used to represent kinds of node states. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	486f930ac5	tipc: adjust order of variables in tipc_node structure Move more frequently used variables up to the head of tipc_node structure, hopefully improving a bit performance. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:44 -04:00
Ying Xue	5356f3d7d4	tipc: always use tipc_node_lock() to hold node lock Although we obtain node lock with tipc_node_lock() in most time, there are still places where we directly use native spin lock interface to grab node lock. But as we will do more jobs in the future when node lock is released, we should ensure that tipc_node_lock() is always called when node lock is taken. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 17:26:43 -04:00
Ying Cai	e96f2e7c43	ip_tunnel: Set network header properly for IP_ECN_decapsulate() In ip_tunnel_rcv(), set skb->network_header to inner IP header before IP_ECN_decapsulate(). Without the fix, IP_ECN_decapsulate() takes outer IP header as inner IP header, possibly causing error messages or packet drops. Note that this skb_reset_network_header() call was in this spot when the original feature for checking consistency of ECN bits through tunnels was added in `eccc1bb8d4` ("tunnel: drop packet if ECN present with not-ECT"). It was only removed from this spot in `3d7b46cd20` ("ip_tunnel: push generic protocol handling to ip_tunnel module."). Fixes: `3d7b46cd20` ("ip_tunnel: push generic protocol handling to ip_tunnel module.") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Ying Cai <ycai@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 16:32:17 -04:00
WANG Cong	07c8e35a38	ipv6: remove unused function ipv6_inherit_linklocal() It is no longer used after commit `e837735ec4` (ip6_tunnel: ensure to always have a link local address). Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 15:30:20 -04:00
Tom Herbert	4068579e1e	net: Implmement RFC 6936 (zero RX csums for UDP/IPv6) RFC 6936 relaxes the requirement of RFC 2460 that UDP/IPv6 packets which are received with a zero UDP checksum value must be dropped. RFC 6936 allows zero checksums to support tunnels over UDP. When sk_no_check is set on a socket, we allow a zero IPv6 UDP checksum. This applies both to sending a zero checksum and to accepting one on receive. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 15:26:30 -04:00
Tom Herbert	e4f45b7f40	net: Call skb_checksum_init in IPv6 Call skb_checksum_init instead of private functions. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 15:26:30 -04:00
Tom Herbert	ed70fcfcee	net: Call skb_checksum_init in IPv4 Call skb_checksum_init instead of private functions. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 15:26:30 -04:00
David S. Miller	2ad0649687	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-05-02 Please pull this batch of updates intended for the 3.16 stream... For the mac80211 bits, Johannes says: "In this round we have a large number of small features and improvements from people too numerous to list here. The only really bit thing is Michał and Luca's CSA work (including changing how interface combination verification is done)." For the Bluetooth bits, Gustavo says: "Here goes some patches for the -next release. There is nothing really special for this pull request, just a bunch of refactors, fixes and clean ups." For the ath10k/ath6kl bits, Kalle says: "For ath6kl Kalle fixed a bunch of checkpatch warnings. In ath10k we had more changes, major ones being: * fix memory allocation failures after a firmware crash (Michal) * some rework of DFS configuration to enable it correctly in all cases (Michal) * add a new firmware crash option to make it possible to crash 10.1 firmware for testing purposes (Marek P) * fix RTS/CTS protection in certain cases (Marek K) * fix wrong RSSI and rate reporting in some cases (Janusz) * fix firmware stats reporting (Chun, Ben & Bartosz)" For the iwlwifi bits, Emmanuel says: "I have here a bunch of unrelated things. I disabled support for -7.ucode which means that I can removed a lot of code. Eliad has a brand new feature: we reduce the Tx power when the link allows - this reduces our power consumption. The regular changes in power and scan area. One interesting thing though is the patches from Johannes, we have now GRO which allows to increase our throughput in TCP Rx. The main advantage is that it reduces the number of TCP Acks - these TCP Acks are completely useless when we are using A-MPDU since the first packet of the A-MPDU generates a TCP Ack which is made obsolete by the next packets." 
Along with that, there are a variety of updates to b43, mwifiex, rtl8180 and wil6210 drivers and a handful of other updates here and there. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 13:36:26 -04:00
Andy King	2c4a336e0a	vsock: Make transport the proto owner Right now the core vsock module is the owner of the proto family. This means there's nothing preventing the transport module from unloading if there are open sockets, which results in a panic. Fix that by allowing the transport to be the owner, which will refcount it properly. Includes version bump to 1.0.1.0-k Passes checkpatch this time, I swear... Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 13:13:50 -04:00
Roopa Prabhu	56bfa7ee7c	unregister_netdevice : move RTM_DELLINK to until after ndo_uninit This patch fixes ordering of rtnl notifications during unregister_netdevice by moving RTM_DELLINK notification to until after ndo_uninit. The problem was seen with unregistering bond netdevices. bond ndo_uninit callback generates a few RTM_NEWLINK notifications for NETDEV_CHANGEADDR and NETDEV_FEAT_CHANGE. This is seen mostly when the bond is deleted with slaves still enslaved to the bond. During unregister netdevice (rollback_registered_many to be specific) bond ndo_uninit is called after RTM_DELLINK notification goes out. This results in userspace seeing RTM_DELLINK followed by a couple of RTM_NEWLINK's. In userspace problem was seen with libnl. libnl cache deletes the bond when it sees RTM_DELLINK and re-adds the bond with the following RTM_NEWLINK. Resulting in a stale bond entry in libnl cache when the kernel has already deleted the bond. This patch has been tested for bond, bridges and vlan devices. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 13:11:36 -04:00
David S. Miller	b8dff4e60c	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-01 Please pull the following batch of fixes intended for the 3.15 stream! For the Bluetooth bits, Gustavo says: "Some fixes for 3.15. There is a revert for the intel driver, a new device id, and two important SSP fixes from Johan." On top of that... Ben Hutchings gives us a fix for an unbalanced irq enable in an rtl8192cu error path. Colin Ian King provides an rtlwifi fix for an uninitialized variable. Felix Fietkau brings a pair of ath9k fixes, one that corrects a hardware initialization value and another that removes an (unnecessary) flag that was being used in a way that led to a software tx queue hang in ath9k. Gertjan van Wingerde pushes a MAINTAINERS change to remove himself from the rt2x00 maintainer team. Hans de Goede fixes a brcmfmac firmware load hang. Larry Finger changes rtlwifi to use the correct queue for V0 traffic on rtl8192se. Rajkumar Manoharan corrects a race in ath9k driver initialization. Stanislaw Gruszka fixes an rt2x00 bug in which disabling beaconing once on USB devices led to permanently disabling beaconing for those devices. Tim Harvey provides fixes for a pair of ath9k issues that can lead to soft lockups in that driver. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 13:06:01 -04:00
Vasily Averin	aff09ce303	bridge: superfluous skb->nfct check in br_nf_dev_queue_xmit Currently bridge can silently drop ipv4 fragments. If node have loaded nf_defrag_ipv4 module but have no nf_conntrack_ipv4, br_nf_pre_routing defragments incoming ipv4 fragments but nfct check in br_nf_dev_queue_xmit does not allow re-fragment combined packet back, and therefore it is dropped in br_dev_queue_push_xmit without incrementing of any failcounters It seems the only way to hit the ip_fragment code in the bridge xmit path is to have a fragment list whose reassembled fragments go over the mtu. This only happens if nf_defrag is enabled. Thanks to Florian Westphal for providing feedback to clarify this. Defragmentation ipv4 is required not only in conntracks but at least in TPROXY target and socket match, therefore #ifdef is changed from NF_CONNTRACK_IPV4 to NF_DEFRAG_IPV4 Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-05 16:05:43 +02:00
Johannes Berg	33926eb778	mac80211: mark local variable __maybe_unused The 'local' variable in __ieee80211_vif_copy_chanctx_to_vlans() is only used/needed when lockdep is compiled in, mark it as such to avoid compile warnings in the other case. While at it, fix some indentation where it's used. Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 16:03:42 +02:00
Vasily Averin	7c3d5ab1f3	ipv4: fix "conntrack zones" support for defrag user check in ip_expire Defrag user check in ip_expire was not updated after adding support for "conntrack zones". This bug manifests as an RFC violation, since the router will send the icmp time exceeded message when using conntrack zones. Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-05 16:02:59 +02:00
Arik Nemtsov	95224fe83e	mac80211: move TDLS code to another file With new additions planned, this code is getting too big for cfg.c. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 15:56:15 +02:00
Arik Nemtsov	0c4972ccaa	mac80211: set an external flag for TDLS stations Expose a new tdls flag for the public ieee80211_sta struct. This can be used in some rate control decisions. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 15:56:02 +02:00
Eliad Peller	e669ba2d06	mac80211: fix nested rtnl locking on ieee80211_reconfig ieee80211_reconfig already holds rtnl, so calling cfg80211_sched_scan_stopped results in deadlock. Use the rtnl-version of this function instead. Fixes: `d43c6b6` ("mac80211: reschedule sched scan after HW restart") Cc: stable@vger.kernel.org (3.14+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 15:14:58 +02:00
Eliad Peller	792e6aa7a1	cfg80211: add cfg80211_sched_scan_stopped_rtnl Add locked-version for cfg80211_sched_scan_stopped. This is used for some users that might want to call it when rtnl is already locked. Fixes: `d43c6b6` ("mac80211: reschedule sched scan after HW restart") Cc: stable@vger.kernel.org (3.14+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 15:14:57 +02:00
Eliad Peller	c1fbb25884	cfg80211: free sme on connection failures cfg80211 is notified about connection failures by a __cfg80211_connect_result() call. However, this function currently does not free the cfg80211 sme. This results in hanging connection attempts in some cases, e.g. when a mac80211 authentication attempt is denied, we have this function call: ieee80211_rx_mgmt_auth() -> cfg80211_rx_mlme_mgmt() -> cfg80211_process_auth() -> cfg80211_sme_rx_auth() -> __cfg80211_connect_result() but cfg80211_sme_free() never gets called. Fixes: `ceca7b712` ("cfg80211: separate internal SME implementation") Cc: stable@vger.kernel.org (3.10+) Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 14:59:00 +02:00
Henning Rogge	f4ebddf9ab	mac80211: Fix mac80211 station info rx bitrate for IBSS mode Filter out incoming multicast packages before applying their bitrate to the rx bitrate station info field to prevent them from setting the rx bitrate to the basic multicast rate. Signed-off-by: Henning Rogge <hrogge@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-05 14:52:03 +02:00
Daniel Borkmann	eb9672f4a1	net: filter: misc/various cleanups This contains only some minor misc cleanups. We can spare ourselves the extra variable declaration in __skb_get_pay_offset(), the cast in __get_random_u32() is rather unnecessary and in __sk_migrate_realloc() we can remove the memcpy() and do a direct assignment of the structs. The latter was suggested by Fengguang Wu, found with coccinelle. Also, remaining pointer casts of long should be unsigned long instead. Suggested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-04 19:46:31 -04:00
Daniel Borkmann	30743837dd	net: filter: make register naming more comprehensible The current code is a bit hard to parse on which registers can be used, how they are mapped and all play together. It makes much more sense to define this a bit more clearly so that the code is a bit more intuitive. This patch cleans this up, and makes naming a bit more consistent among the code. This also allows for moving some of the defines into the header file. Clearing of A and X registers in __sk_run_filter() do not get a particular register name assigned as they have not an 'official' function, but rather just result from the concrete initial mapping of old BPF programs. Since for BPF helper functions for BPF_CALL we already use small letters, so be consistent here as well. No functional changes. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-04 19:46:31 -04:00
Daniel Borkmann	5bcfedf06f	net: filter: simplify label names from jump-table This patch simplifies label naming for the BPF jump-table. When we define labels via DL(), we just concatenate/textify the combination of instruction opcode which consists of the class, subclass, word size, target register and so on. Each time we leave BPF_ prefix intact, so that e.g. the preprocessor generates a label BPF_ALU_BPF_ADD_BPF_X for DL(BPF_ALU, BPF_ADD, BPF_X) whereas a label name of ALU_ADD_X is much more easy to grasp. Pure cleanup only. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-04 19:46:31 -04:00
John Fastabend	f6a082fed1	net: sched: lock imbalance in hhf qdisc hhf_change() takes the sch_tree_lock and releases it but misses the error cases. Fix the missed case here. To reproduce try a command like this, # tc qdisc change dev p3p2 root hhf quantum 40960 non_hh_weight 300000 Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-04 19:41:45 -04:00
Denys Fedoryshchenko	ecd15dd7e4	netfilter: nfnetlink: Fix use after free when it fails to process batch This bug manifests when calling the nft command line tool without nf_tables kernel support. kernel message: [ 44.071555] Netfilter messages via NETLINK v0.30. [ 44.072253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000119 [ 44.072264] IP: [<ffffffff8171db1f>] netlink_getsockbyportid+0xf/0x70 [ 44.072272] PGD 7f2b74067 PUD 7f2b73067 PMD 0 [ 44.072277] Oops: 0000 [#1] SMP [...] [ 44.072369] Call Trace: [ 44.072373] [<ffffffff8171fd81>] netlink_unicast+0x91/0x200 [ 44.072377] [<ffffffff817206c9>] netlink_ack+0x99/0x110 [ 44.072381] [<ffffffffa004b951>] nfnetlink_rcv+0x3c1/0x408 [nfnetlink] [ 44.072385] [<ffffffff8171fde3>] netlink_unicast+0xf3/0x200 [ 44.072389] [<ffffffff817201ef>] netlink_sendmsg+0x2ff/0x740 [ 44.072394] [<ffffffff81044752>] ? __mmdrop+0x62/0x90 [ 44.072398] [<ffffffff816dafdb>] sock_sendmsg+0x8b/0xc0 [ 44.072403] [<ffffffff812f1af5>] ? copy_user_enhanced_fast_string+0x5/0x10 [ 44.072406] [<ffffffff816dbb6c>] ? move_addr_to_kernel+0x2c/0x50 [ 44.072410] [<ffffffff816db423>] ___sys_sendmsg+0x3c3/0x3d0 [ 44.072415] [<ffffffff811301ba>] ? handle_mm_fault+0xa9a/0xc60 [ 44.072420] [<ffffffff811362d6>] ? mmap_region+0x166/0x5a0 [ 44.072424] [<ffffffff817da84c>] ? __do_page_fault+0x1dc/0x510 [ 44.072428] [<ffffffff812b8b2c>] ? apparmor_capable+0x1c/0x60 [ 44.072435] [<ffffffff817d6e9a>] ? _raw_spin_unlock_bh+0x1a/0x20 [ 44.072439] [<ffffffff816dfc86>] ? release_sock+0x106/0x150 [ 44.072443] [<ffffffff816dc212>] __sys_sendmsg+0x42/0x80 [ 44.072446] [<ffffffff816dc262>] SyS_sendmsg+0x12/0x20 [ 44.072450] [<ffffffff817df616>] system_call_fastpath+0x1a/0x1f Signed-off-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-04 15:14:08 +02:00
Florian Westphal	895162b110	netfilter: ipv4: defrag: set local_df flag on defragmented skb else we may fail to forward skb even if original fragments do fit outgoing link mtu: 1. remote sends 2k packets in two 1000 byte frags, DF set 2. we want to forward but only see '2k > mtu and DF set' 3. we then send icmp error saying that outgoing link is 1500 But original sender never sent a packet that would not fit the outgoing link. Setting local_df makes outgoing path test size vs. IPCB(skb)->frag_max_size, so we will still send the correct error in case the largest original size did not fit outgoing link mtu. Reported-by: Maxime Bizon <mbizon@freebox.fr> Suggested-by: Maxime Bizon <mbizon@freebox.fr> Fixes: `5f2d04f1f9` (ipv4: fix path MTU discovery with connection tracking) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-04 13:23:28 +02:00
Eric Dumazet	249015515f	tcp: remove in_flight parameter from cong_avoid() methods Commit `e114a710aa` ("tcp: fix cwnd limited checking to improve congestion control") obsoleted in_flight parameter from tcp_is_cwnd_limited() and its callers. This patch does the removal as promised. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-03 19:23:07 -04:00
Eric Dumazet	e114a710aa	tcp: fix cwnd limited checking to improve congestion control Yuchung discovered tcp_is_cwnd_limited() was returning false in slow start phase even if the application filled the socket write queue. All congestion modules take into account tcp_is_cwnd_limited() before increasing cwnd, so this behavior limits slow start from probing the bandwidth at full speed. The problem is that even if write queue is full (aka we are _not_ application limited), cwnd can be under utilized if TSO should auto defer or TCP Small queues decided to hold packets. So the in_flight can be kept to smaller value, and we can get to the point tcp_is_cwnd_limited() returns false. With TCP Small Queues and FQ/pacing, this issue is more visible. We fix this by having tcp_cwnd_validate(), which is supposed to track such things, take into account unsent_segs, the number of segs that we are not sending at the moment due to TSO or TSQ, but intend to send real soon. Then when we are cwnd-limited, remember this fact while we are processing the window of ACKs that comes back. For example, suppose we have a brand new connection with cwnd=10; we are in slow start, and we send a flight of 9 packets. By the time we have received ACKs for all 9 packets we want our cwnd to be 18. We implement this by setting tp->lsnd_pending to 9, and considering ourselves to be cwnd-limited while cwnd is less than twice tp->lsnd_pending (2*9 -> 18). This makes tcp_is_cwnd_limited() more understandable, by removing the GSO/TSO kludge, that tried to work around the issue. Note the in_flight parameter can be removed in a followup cleanup patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-02 17:54:35 -04:00
Stéphane Graber	4e8bbb819d	net: Allow tc changes in user namespaces This switches a few remaining capable(CAP_NET_ADMIN) to ns_capable so that root in a user namespace may set tc rules inside that namespace. Signed-off-by: Stéphane Graber <stgraber@ubuntu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-02 17:43:25 -04:00
John W. Linville	406a94d7fa	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-05-02 13:47:50 -04:00
John W. Linville	812e4dafa4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2014-05-01 11:23:21 -04:00
Liu Yu	0cda345d1b	tcp_cubic: fix the range of delayed_ack commit `b9f47a3aae` (tcp_cubic: limit delayed_ack ratio to prevent divide error) try to prevent divide error, but there is still a little chance that delayed_ack can reach zero. In case the param cnt get negative value, then ratio+cnt would overflow and may happen to be zero. As a result, min(ratio, ACK_RATIO_LIMIT) will calculate to be zero. In some old kernels, such as 2.6.32, there is a bug that would pass negative param, which then ultimately leads to this divide error. commit `5b35e1e6e9` (tcp: fix tcp_trim_head() to adjust segment count with skb MSS) fixed the negative param issue. However, it's safe that we fix the range of delayed_ack as well, to make sure we do not hit a divide by zero. CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Liu Yu <allanyuliu@tencent.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-30 16:12:22 -04:00
Eric Dumazet	fc9f350106	tcp: increment retransmit counters in tlp and fast open Both TLP and Fast Open call __tcp_retransmit_skb() instead of tcp_retransmit_skb() to avoid changing tp->retrans_out. This has the side effect of missing SNMP counters increments as well as tcp_info tcpi_total_retrans updates. Fix this by moving the stats increments of into __tcp_retransmit_skb() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-30 16:12:22 -04:00
Ying Xue	1621b94d2a	tipc: fix memory leak of publications Commit `1bb8dce57f` ("tipc: fix memory leak during module removal") introduced a memory leak: when the name table is stopped, publication instances are not freed. Additionally, the useless "continue" statement in tipc_nametbl_stop() is removed as well. Reported-by: Jason <huzhijiang@gmail.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-30 13:31:26 -04:00
Lorenzo Colitti	5c98631cca	net: ipv6: Introduce ip6_sk_dst_hoplimit. This replaces 6 identical code snippets with a call to a new static inline function. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-30 13:31:26 -04:00
John W. Linville	f6595444c1	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: net/mac80211/chan.c	2014-04-30 12:04:27 -04:00
John W. Linville	0006433a5b	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-04-30 11:56:43 -04:00
Florian Westphal	f768e5bdef	netfilter: add helper for adding nat extension Reduce copy-paste a bit by adding a common helper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-04-29 20:56:22 +02:00
Florian Westphal	fe337ac283	netfilter: ctnetlink: don't add null bindings if no nat requested commit `0eba801b64` tried to fix a race where nat initialisation can happen after ctnetlink-created conntrack has been created. However, it causes the nat module(s) to be loaded needlessly on systems that are not using NAT. Fortunately, we do not have to create null bindings in that case. conntracks injected via ctnetlink always have the CONFIRMED bit set, which prevents addition of the nat extension in nf_nat_ipv4/6_fn(). We only need to make sure that either no nat extension is added or that we've created both src and dst manips. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-04-29 20:49:08 +02:00
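
The slow-start rule described in commit e114a710aa ("tcp: fix cwnd limited checking to improve congestion control") in the list above can be sketched in plain userspace C. This is a minimal illustration, not the kernel code: the struct `cwnd_state` and the function `sketch_is_cwnd_limited()` are hypothetical names, and every field except `lsnd_pending` (which the commit message mentions) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the few per-connection fields involved. */
struct cwnd_state {
	uint32_t snd_cwnd;        /* congestion window, in packets (assumed name) */
	uint32_t snd_ssthresh;    /* slow start threshold, in packets (assumed name) */
	uint32_t lsnd_pending;    /* size of the last flight sent; name from the commit message */
	bool     is_cwnd_limited; /* assumed flag recorded for use outside slow start */
};

/* Returns true while the sender should still be treated as cwnd-limited. */
static bool sketch_is_cwnd_limited(const struct cwnd_state *s)
{
	assert(s);

	/* In slow start, stay limited until cwnd reaches twice the last flight. */
	if (s->snd_cwnd <= s->snd_ssthresh)
		return s->snd_cwnd < 2 * s->lsnd_pending;

	/* Outside slow start, fall back to the recorded flag. */
	return s->is_cwnd_limited;
}
```

With the numbers from the commit message (cwnd=10, a flight of 9 packets), the sketch keeps reporting "limited" until cwnd reaches 18, which is the growth target the message describes.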


33035 Commits