linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Mahesh Bandewar	14c9551a32	bonding: Implement port churn-machine (AD standard 43.4.17). The Churn Detection machines detect the situation where a port is operable, but the Actor and Partner have not attached the link to an Aggregator and brought the link into operation within a bound time period. Under normal operation of the LACP, agreement between Actor and Partner should be reached very rapidly. Continued failure to reach agreement can be symptomatic of device failure. Actor-churn-detection state-machine Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> =================================== BEGIN=True + PortEnable=False \| v +------------------------+ ActorPort.Sync=True +------------------+ \| ACTOR_CHURN_MONITOR \| ---------------------> \| NO_ACTOR_CHURN \| \|========================\| \|==================\| \| ActorChurn=False \| ActorPort.Sync=False \| ActorChurn=False \| \| ActorChurn.Timer=Start \| <--------------------- \| \| +------------------------+ +------------------+ \| ^ \| \| ActorChurn.Timer=Expired \| \| ActorPort.Sync=True \| \| \| +-----------------+ \| \| \| ACTOR_CHURN \| \| \| \|=================\| \| +--------------> \| ActorChurn=True \| ------------+ \| \| +-----------------+ Similar for the Partner-churn-detection. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 16:05:48 -05:00
Mahesh Bandewar	bb54e58929	bonding: Verify RX LACPDU has proper dest mac-addr The 802.1AX standard states: "The DA in LACPDUs is the Slow_Protocols_Multicast address." This patch enforces that and drops LACPDUs with destination MAC addresses other than Slow_Protocols_Multicast address Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 16:05:47 -05:00
Mahesh Bandewar	950ddcb1c1	bonding: simple code refactor Remove duplicate code. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 17:43:29 -05:00
Moni Shoua	92e584fe44	net/bonding: Fix potential bad memory access during bonding events When queuing work to send the NETDEV_BONDING_INFO netdev event, it's possible that when the work is executed, the pointer to the slave becomes invalid. This can happen if between queuing the event and the execution of the work, the net-device was un-enslaved and re-enslaved. Fix that by queuing a work with the data of the slave instead of the slave structure. Fixes: `69e6113343` ('net/bonding: Notify state change on slaves') Reported-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:03:53 -08:00
Moni Shoua	69e6113343	net/bonding: Notify state change on slaves Use notifier chain to dispatch an event upon a change in slave state. Event is dispatched with slave specific info. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:24 -08:00
Moni Shoua	69a2338e05	net/bonding: Move slave state changes to a helper function Move slave state changes to a helper function, this is a pre-step for adding functionality of dispatching an event when this helper is called. This commit doesn't add new functionality. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:24 -08:00
Roopa Prabhu	c158cba38c	bonding: handle NETIF_F_HW_SWITCH_OFFLOAD flag and add ndo_bridge_setlink/dellink handlers We want bond to pick up the offload flag if any of its slaves have it. NETIF_F_HW_SWITCH_OFFLOAD flag is added to the mask, so that netdev_increment_features does not ignore it. This also adds ndo_bridge_setlink and ndo_bridge_dellink handlers. These currently point to the default handlers provided by the switchdev api. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-01 23:16:34 -08:00
Jonathan Toppins	303691042d	bonding: cleanup and remove dead code fix sparse warning about non-static function drivers/net/bonding/bond_main.c:3737:5: warning: symbol 'bond_3ad_xor_xmit' was not declared. Should it be static? Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:09:04 -08:00
Satish Ashok	2f6373245a	bonding: fix LACP PDU not sent on slave port sometimes When a slave is added to a bond and it is not in full duplex mode, AD_PORT_LACP_ENABLED flag is cleared, due to this LACP PDU is not sent on slave. When the duplex is changed to full, the flag needs to be set to send LACP PDU. Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:09:04 -08:00
Wilson Kok	63b46242f7	bonding: fix incorrect lacp mux state when agg not active This patch attempts to fix the following problems when an actor or partner's aggregator is not active: 1. a slave's lacp port state is marked as AD_STATE_SYNCHRONIZATION even if it is attached to an inactive aggregator. LACP advertises this state to the partner, making the partner think he can move into COLLECTING_DISTRIBUTING state even though this link will not pass traffic on the local side 2. a slave goes into COLLECTING_DISTRIBUTING state without checking if the aggregator is actually active 3. when in COLLECTING_DISTRIBUTING state, the partner parameters may change, e.g. the partner_oper_port_state.SYNCHRONIZATION. The local mux machine is not reacting to the change and continue to keep the slave and bond up 4. When bond slave leaves an inactive aggregator and joins an active aggregator, the actor oper port state need to update to SYNC state. v2: * fix style issues in bond_3ad.c Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:09:04 -08:00
Wilson Kok	8bbe71a595	bonding: fix bond_open() don't always set slave active flag Mode 802.3ad, fix incorrect bond slave active state when slave is not in active aggregator. During bond_open(), the bonding driver always sets the slave active flag to true if the bond is not in active-backup, alb, or tlb modes. Bonding should let the aggregator selection logic set the active flag when in 802.3ad mode. Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:09:03 -08:00
Jonathan Toppins	2477bc9a3d	bonding: update bond carrier state when min_links option changes Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:09:03 -08:00
Eric Dumazet	24f87d4ce1	bonding: handle more gso types In commit `5a7baa7885` ("bonding: Advertize vxlan offload features when supported"), Or Gerlitz added support conditional vxlan offload. In this patch I also add support for all kind of tunnels, but we allow a bonding device to not require segmentation, as it is always better to make this segmentation at the very last stage, if a particular slave device requires it. Tested: Setup a GRE tunnel, on a physical NIC not having tx-gre-segmentation. Results on bnx2x are even better, as we no longer have to segment in software. ethtool -K bond0 tx-gre-segmentation off super_netperf 50 --google-pacing-rate 30000000 -H 10.7.8.152 -l 15 7538.32 ethtool -K bond0 tx-gre-segmentation on super_netperf 50 --google-pacing-rate 30000000 -H 10.7.8.152 -l 15 10200.5 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 23:34:23 -08:00
Jonathan Toppins	7bfa014500	bonding: cleanup bond_opts array Remove the empty array element initializer and size the array with BOND_OPT_LAST so the compiler will complain if more elements are in there than should be. An interesting unwanted side effect of this initializer is that if one inserts new options into the middle of the array then this initializer will zero out the option that equals BOND_OPT_TLB_DYNAMIC_LB+1. Example: Extend the OPTS enum: enum { ... BOND_OPT_TLB_DYNAMIC_LB, BOND_OPT_LACP_NEW1, BOND_OPT_LAST }; Now insert into bond_opts array: static const struct bond_option bond_opts[] = { ... [BOND_OPT_LACP_RATE] = { .... unchanged stuff .... }, [BOND_OPT_LACP_NEW1] = { ... new stuff ... }, ... [BOND_OPT_TLB_DYNAMIC_LB] = { .... unchanged stuff ....}, { } // MARK A }; Since BOND_OPT_LACP_NEW1 = BOND_OPT_TLB_DYNAMIC_LB+1, the last initializer (MARK A) will overwrite the contents of BOND_OPT_LACP_NEW1 and can be easily viewed with the crash utility. Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Cc: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:39:31 -05:00
Wengang Wang	a22a9e4141	bonding: change error message to debug message in __bond_release_one() In __bond_release_one(), when the interface is not a slave or not a slave of "this" master, it logs an error message. The message actually should be a debug message matching what bond_enslave() does. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:55 -05:00
David S. Miller	22f10923dd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:48:20 -05:00
Thomas Graf	f6c6fda4c9	bond: Check length of IFLA_BOND_ARP_IP_TARGET attributes Fixes: `7f28fa10` ("bonding: add arp_ip_target netlink support") Reported-by: John Fastabend <john.fastabend@gmail.com> Cc: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-29 20:46:32 -08:00
David S. Miller	1459143386	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ieee802154/fakehard.c A bug fix went into 'net' for ieee802154/fakehard.c, which is removed in 'net-next'. Add build fix into the merge from Stephen Rothwell in openvswitch, the logging macros take a new initial 'log' argument, a new call was added in 'net' so when we merge that in here we have to explicitly add the new 'log' arg to it else the build fails. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 22:28:24 -05:00
Jiri Pirko	62749e2cb3	vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto Name fits better. Plus there's going to be introduced __vlan_insert_tag later on. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:17 -05:00
Jiri Pirko	b4bef1b575	vlan: kill vlan_put_tag helper Since both tx and rx paths work with skb->vlan_tci, there's no need for this function anymore. Switch users directly to __vlan_hwaccel_put_tag. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:17 -05:00
Jianhua Xie	424c3232b0	bonding: Introduce 4 AD link speed to fix agg_bandwidth This patch adds [2.5\|20\|40\|56] Gbps enum definition, and fixes aggregated bandwidth calculation based on above slave links. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 19:10:16 -05:00
Jianhua Xie	cb8dda90c2	bonding: change AD_LINK_SPEED_BITMASK to enum to suport more speed Port Key was determined as 16 bits according to the link speed, duplex and user key (which is yet not supported). In the old speed field, 5 bits are for speed [1\|10\|100\|1000\|10000]Mbps as below: -------------------------------------------------------------- Port key :\| User key \| Speed \| Duplex\| -------------------------------------------------------------- 16 6 1 0 This patch keeps the old layout, but changes AD_LINK_SPEED_BITMASK from bit type to an enum type. In this way, the speed field can expand speed type from 5 to 32. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 19:10:16 -05:00
Nikolay Aleksandrov	b8e4500f42	bonding: fix curr_active_slave/carrier with loadbalance arp monitoring Since commit `6fde8f037e` ("bonding: fix locking in bond_loadbalance_arp_mon()") we can have a stale bond carrier state and stale curr_active_slave when using arp monitoring in loadbalance modes. The reason is that in bond_loadbalance_arp_mon() we can't have do_failover == true but slave_state_changed == false, whenever do_failover is true then slave_state_changed is also true. Then the following piece from bond_loadbalance_arp_mon(): if (slave_state_changed) { bond_slave_state_change(bond); if (BOND_MODE(bond) == BOND_MODE_XOR) bond_update_slave_arr(bond, NULL); } else if (do_failover) { block_netpoll_tx(); bond_select_active_slave(bond); unblock_netpoll_tx(); } will execute only the first branch, always and regardless of do_failover. Since these two events aren't related in such way, we need to decouple and consider them separately. For example this issue could lead to the following result: Bonding Mode: load balancing (round-robin) MII Status: down MII Polling Interval (ms): 0 Up Delay (ms): 0 Down Delay (ms): 0 ARP Polling Interval (ms): 100 ARP IP target/s (n.n.n.n form): 192.168.9.2 Slave Interface: ens12 MII Status: up Speed: 10000 Mbps Duplex: full Link Failure Count: 2 Permanent HW addr: 00:0f:53:01:42:2c Slave queue ID: 0 Slave Interface: eth1 MII Status: up Speed: Unknown Duplex: Unknown Link Failure Count: 70 Permanent HW addr: 52:54:00:2f:0f:8e Slave queue ID: 0 Since some interfaces are up, then the status of the bond should also be up, but it will never change unless something invokes bond_set_carrier() (i.e. enslave, bond_select_active_slave etc). Now, if I force the calling of bond_select_active_slave via for example changing primary_reselect (it can change in any mode), then the MII status goes to "up" because it calls bond_select_active_slave() which should've been done from bond_loadbalance_arp_mon() itself. 
CC: Veaceslav Falico <vfalico@gmail.com> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> Fixes: `6fde8f037e` ("bonding: fix locking in bond_loadbalance_arp_mon()") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 15:09:04 -05:00
Michal Kubeček	fbe168ba91	net: generic dev_disable_lro() stacked device handling Large receive offloading is known to cause problems if received packets are passed to other host. Therefore the kernel disables it by calling dev_disable_lro() whenever a network device is enslaved in a bridge or forwarding is enabled for it (or globally). For virtual devices we need to disable LRO on the underlying physical device (which is actually receiving the packets). Current dev_disable_lro() code handles this propagation for a vlan (including 802.1ad nested vlan), macvlan or a vlan on top of a macvlan. It doesn't handle other stacked devices and their combinations, in particular propagation from a bond to its slaves which often causes problems in virtualization setups. As we now have generic data structures describing the upper-lower device relationship, dev_disable_lro() can be generalized to disable LRO also for all lower devices (if any) once it is disabled for the device itself. For bonding and teaming devices, it is necessary to disable LRO not only on current slaves at the moment when dev_disable_lro() is called but also on any slave (port) added later. v2: use lower device links for all devices (including vlan and macvlan) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 14:48:56 -05:00
David S. Miller	1ef8019be8	net: Move bonding headers under include/net This ways drivers like cxgb4 don't need to do ugly relative includes. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 13:27:49 -05:00
Eric Dumazet	31aa860e0a	bonding: add bond_tx_drop() helper Because bonding stats are usually sum of slave stats, it was not easy to account for tx drops at bonding layer. We can use dev->tx_dropped for this, as this counter is later added to the device stats (in dev_get_stats()) This extends the idea we had in commit `ee63771474` ("bonding: Simplify the xmit function for modes that use xmit_hash") for bond_3ad_xor_xmit() to other bonding modes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-31 16:09:03 -04:00
Linus Torvalds	35a9ad8af0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Most notable changes in here: 1) By far the biggest accomplishment, thanks to a large range of contributors, is the addition of multi-send for transmit. This is the result of discussions back in Chicago, and the hard work of several individuals. Now, when the ->ndo_start_xmit() method of a driver sees skb->xmit_more as true, it can choose to defer the doorbell telling the driver to start processing the new TX queue entires. skb->xmit_more means that the generic networking is guaranteed to call the driver immediately with another SKB to send. There is logic added to the qdisc layer to dequeue multiple packets at a time, and the handling mis-predicted offloads in software is now done with no locks held. Finally, pktgen is extended to have a "burst" parameter that can be used to test a multi-send implementation. Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4, virtio_net Adding support is almost trivial, so export more drivers to support this optimization soon. I want to thank, in no particular or implied order, Jesper Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann, David Tat, Hannes Frederic Sowa, and Rusty Russell. 2) PTP and timestamping support in bnx2x, from Michal Kalderon. 3) Allow adjusting the rx_copybreak threshold for a driver via ethtool, and add rx_copybreak support to enic driver. From Govindarajulu Varadarajan. 4) Significant enhancements to the generic PHY layer and the bcm7xxx driver in particular (EEE support, auto power down, etc.) from Florian Fainelli. 5) Allow raw buffers to be used for flow dissection, allowing drivers to determine the optimal "linear pull" size for devices that DMA into pools of pages. The objective is to get exactly the necessary amount of headers into the linear SKB area pre-pulled, but no more. 
The new interface drivers use is eth_get_headlen(). From WANG Cong, with driver conversions (several had their own by-hand duplicated implementations) by Alexander Duyck and Eric Dumazet. 6) Support checksumming more smoothly and efficiently for encapsulations, and add "foo over UDP" facility. From Tom Herbert. 7) Add Broadcom SF2 switch driver to DSA layer, from Florian Fainelli. 8) eBPF now can load programs via a system call and has an extensive testsuite. Alexei Starovoitov and Daniel Borkmann. 9) Major overhaul of the packet scheduler to use RCU in several major areas such as the classifiers and rate estimators. From John Fastabend. 10) Add driver for Intel FM10000 Ethernet Switch, from Alexander Duyck. 11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric Dumazet. 12) Add Datacenter TCP congestion control algorithm support, From Florian Westphal. 13) Reorganize sk_buff so that __copy_skb_header() is significantly faster. From Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits) netlabel: directly return netlbl_unlabel_genl_init() net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers net: description of dma_cookie cause make xmldocs warning cxgb4: clean up a type issue cxgb4: potential shift wrapping bug i40e: skb->xmit_more support net: fs_enet: Add NAPI TX net: fs_enet: Remove non NAPI RX r8169:add support for RTL8168EP net_sched: copy exts->type in tcf_exts_change() wimax: convert printk to pr_foo() af_unix: remove 0 assignment on static ipv6: Do not warn for informational ICMP messages, regardless of type. Update Intel Ethernet Driver maintainers list bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING tipc: fix bug in multicast congestion handling net: better IFF_XMIT_DST_RELEASE support net/mlx4_en: remove NETDEV_TX_BUSY 3c59x: fix bad split of cpu_to_le32(pci_map_single()) net: bcmgenet: fix Tx ring priority programming ...	2014-10-08 21:40:54 -04:00
Linus Torvalds	28596c9722	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull "trivial tree" updates from Jiri Kosina: "Usual pile from trivial tree everyone is so eagerly waiting for" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Remove MN10300_PROC_MN2WS0038 mei: fix comments treewide: Fix typos in Kconfig kprobes: update jprobe_example.c for do_fork() change Documentation: change "&" to "and" in Documentation/applying-patches.txt Documentation: remove obsolete pcmcia-cs from Changes Documentation: update links in Changes Documentation: Docbook: Fix generated DocBook/kernel-api.xml score: Remove GENERIC_HAS_IOMAP gpio: fix 'CONFIG_GPIO_IRQCHIP' comments tty: doc: Fix grammar in serial/tty dma-debug: modify check_for_stack output treewide: fix errors in printk genirq: fix reference in devm_request_threaded_irq comment treewide: fix synchronize_rcu() in comments checkstack.pl: port to AArch64 doc: queue-sysfs: minor fixes init/do_mounts: better syntax description MIPS: fix comment spelling powerpc/simpleboot: fix comment ...	2014-10-07 21:16:26 -04:00
Eric Dumazet	0287587884	net: better IFF_XMIT_DST_RELEASE support Testing xmit_more support with netperf and connected UDP sockets, I found strange dst refcount false sharing. Current handling of IFF_XMIT_DST_RELEASE is not optimal. Dropping dst in validate_xmit_skb() is certainly too late in case packet was queued by cpu X but dequeued by cpu Y The logical point to take care of drop/force is in __dev_queue_xmit() before even taking qdisc lock. As Julian Anastasov pointed out, need for skb_dst() might come from some packet schedulers or classifiers. This patch adds new helper to cleanly express needs of various drivers or qdiscs/classifiers. Drivers that need skb_dst() in their ndo_start_xmit() should call following helper in their setup instead of the prior : dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; -> netif_keep_dst(dev); Instead of using a single bit, we use two bits, one being eventually rebuilt in bonding/team drivers. The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being rebuilt in bonding/team. Eventually, we could add something smarter later. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 13:22:11 -04:00
Mahesh Bandewar	ee63771474	bonding: Simplify the xmit function for modes that use xmit_hash Earlier change to use usable slave array for TLB mode had an additional performance advantage. So extending the same logic to all other modes that use xmit-hash for slave selection (viz 802.3AD, and XOR modes). Also consolidating this with the earlier TLB change. The main idea is to build the usable slaves array in the control path and use that array for slave selection during xmit operation. Measured performance in a setup with a bond of 4x1G NICs with 200 instances of netperf for the modes involved (3ad, xor, tlb) cmd: netperf -t TCP_RR -H <TargetHost> -l 60 -s 5 Mode TPS-Before TPS-After 802.3ad : 468,694 493,101 TLB (lb=0): 392,583 392,965 XOR : 475,696 484,517 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 17:13:07 -04:00
Mahesh Bandewar	d7021325a2	bonding: display xmit_hash_policy for non-dynamic-tlb mode It's a trivial fix to display xmit_hash_policy for this new TLB mode since it uses transmit-hash-policy as part of bonding-master info (/proc/net/bonding/<bonding-interface>). Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 17:13:07 -04:00
Andy Gospodarek	5f0c5f73e5	bonding: make global bonding stats more reliable As the code stands today, bonding stats are based simply on the stats from the member interfaces. If a member was to be removed from a bond, the stats would instantly drop. This would be confusing to an admin who would suddenly see interface stats drop while traffic is still flowing. In addition to preventing the stats drops mentioned above, new members will now be added to the bond and only traffic received after the member was added to the bond will be counted as part of bonding stats. Bonding counters will also be updated when any slaves are dropped to make sure the reported stats are reliable. v2: Changes suggested by Nik to properly allocate/free stats memory. v3: Properly destroy workqueue and fix netlink configuration path. v4: Moved cached stats into bonding and slave structs as there does not seem to be a complexity/performance benefit to using alloc'd memory vs in-struct memory. Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:20:07 -04:00
David S. Miller	1f6d80358d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:09:27 -04:00
dingtianhong	37ab7ddf3f	bonding: remove the unnecessary notes for bond_xmit_broadcast() Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 15:21:01 -04:00
dingtianhong	a64d044e39	bonding: slight optimization for bond_xmit_roundrobin() When the slave is the curr_active_slave, no need to check whether the slave is active or not, it is always active. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 15:21:00 -04:00
Nikolay Aleksandrov	e0974585e7	bonding: consolidate ASSERT_RTNL()s and remove the unnecessary Consolidate the calls to ASSERT_RTNL() before bond_select_active_slave() inside bond_select_active_slave() itself and remove the ASSERT_RTNL() from bond_hw_addr_swap() as it's not exported and its only caller - bond_change_active_slave() already has an ASSERT_RTNL(). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:19:50 -04:00
Nikolay Aleksandrov	547942cace	bonding: trivial: style and comment fixes First adjust a couple of locking comments that were left inaccurate, then adjust comments to use the netdev styling and remove extra new lines where necessary and add a couple of new lines between declarations and code. These are all trivial styling changes, no functional change. Also removed a couple of outdated or obvious comments. This patch is by no means a complete fix of all netdev style violations but it gets the bonding closer. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:19:49 -04:00
Nikolay Aleksandrov	56924c3811	bonding: consolidate the two rlb_next_rx_slave functions into one __rlb_next_rx_slave() is a copy of rlb_next_rx_slave() with the difference that it uses rcu primitives to walk the slave list. We don't need the two functions and can make rlb_next_rx_slave() a wrapper for callers which hold RTNL. So add a comment and ASSERT_RTNL() to make sure what is intended. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:19:49 -04:00
Nikolay Aleksandrov	9a72c2da69	bonding: fix div by zero while enslaving and transmitting The problem is that the slave is first linked and slave_cnt is incremented afterwards leading to a div by zero in the modes that use it as a modulus. What happens is that in bond_start_xmit() bond_has_slaves() is used to evaluate further transmission and it becomes true after the slave is linked in, but when slave_cnt is used in the xmit path it is still 0, so fetch it once and transmit based on that. Since it is used only in round-robin and XOR modes, the fix is only for them. Thanks to Eric Dumazet for pointing out the fault in my first try to fix this. Call trace (took it out of net-next kernel, but it's the same with net): [46934.330038] divide error: 0000 [#1] SMP [46934.330041] Modules linked in: bonding(O) 9p fscache snd_hda_codec_generic crct10dif_pclmul [46934.330041] bond0: Enslaving eth1 as an active interface with an up link [46934.330051] ppdev joydev crc32_pclmul crc32c_intel 9pnet_virtio ghash_clmulni_intel snd_hda_intel 9pnet snd_hda_controller parport_pc serio_raw pcspkr snd_hda_codec parport virtio_balloon virtio_console snd_hwdep snd_pcm pvpanic i2c_piix4 snd_timer i2ccore snd soundcore virtio_blk virtio_net virtio_pci virtio_ring virtio ata_generic pata_acpi floppy [last unloaded: bonding] [46934.330053] CPU: 1 PID: 3382 Comm: ping Tainted: G O 3.17.0-rc4+ #27 [46934.330053] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [46934.330054] task: ffff88005aebf2c0 ti: ffff88005b728000 task.ti: ffff88005b728000 [46934.330059] RIP: 0010:[<ffffffffa0198c33>] [<ffffffffa0198c33>] bond_start_xmit+0x1c3/0x450 [bonding] [46934.330060] RSP: 0018:ffff88005b72b7f8 EFLAGS: 00010246 [46934.330060] RAX: 0000000000000679 RBX: ffff88004b077000 RCX: 000000000000002a [46934.330061] RDX: 0000000000000000 RSI: ffff88004b3f0500 RDI: ffff88004b077940 [46934.330061] RBP: ffff88005b72b830 R08: 00000000000000c0 R09: ffff88004a83e000 [46934.330062] R10: 000000000000ffff R11: 
ffff88004b1f12c0 R12: ffff88004b3f0500 [46934.330062] R13: ffff88004b3f0500 R14: 000000000000002a R15: ffff88004b077940 [46934.330063] FS: 00007fbd91a4c740(0000) GS:ffff88005f080000(0000) knlGS:0000000000000000 [46934.330064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [46934.330064] CR2: 00007f803a8bb000 CR3: 000000004b2c9000 CR4: 00000000000406e0 [46934.330069] Stack: [46934.330071] ffffffff811e6169 00000000e772fa05 ffff88004b077000 ffff88004b3f0500 [46934.330072] ffffffff81d17d18 000000000000002a 0000000000000000 ffff88005b72b8a0 [46934.330073] ffffffff81620108 ffffffff8161fe0e ffff88005b72b8c4 ffff88005b302000 [46934.330073] Call Trace: [46934.330077] [<ffffffff811e6169>] ? __kmalloc_node_track_caller+0x119/0x300 [46934.330084] [<ffffffff81620108>] dev_hard_start_xmit+0x188/0x410 [46934.330086] [<ffffffff8161fe0e>] ? harmonize_features+0x2e/0x90 [46934.330088] [<ffffffff81620b06>] __dev_queue_xmit+0x456/0x590 [46934.330089] [<ffffffff81620c50>] dev_queue_xmit+0x10/0x20 [46934.330090] [<ffffffff8168f022>] arp_xmit+0x22/0x60 [46934.330091] [<ffffffff8168f090>] arp_send.part.16+0x30/0x40 [46934.330092] [<ffffffff8168f1e5>] arp_solicit+0x115/0x2b0 [46934.330094] [<ffffffff8160b5d7>] ? copy_skb_header+0x17/0xa0 [46934.330096] [<ffffffff8162875a>] neigh_probe+0x4a/0x70 [46934.330097] [<ffffffff8162979c>] __neigh_event_send+0xac/0x230 [46934.330098] [<ffffffff8162a00b>] neigh_resolve_output+0x13b/0x220 [46934.330100] [<ffffffff8165f120>] ? ip_forward_options+0x1c0/0x1c0 [46934.330101] [<ffffffff81660478>] ip_finish_output+0x1f8/0x860 [46934.330102] [<ffffffff81661f08>] ip_output+0x58/0x90 [46934.330103] [<ffffffff81661602>] ? __ip_local_out+0xa2/0xb0 [46934.330104] [<ffffffff81661640>] ip_local_out_sk+0x30/0x40 [46934.330105] [<ffffffff81662a66>] ip_send_skb+0x16/0x50 [46934.330106] [<ffffffff81662ad3>] ip_push_pending_frames+0x33/0x40 [46934.330107] [<ffffffff8168854c>] raw_sendmsg+0x88c/0xa30 [46934.330110] [<ffffffff81612b31>] ? 
skb_recv_datagram+0x41/0x60 [46934.330111] [<ffffffff816875a9>] ? raw_recvmsg+0xa9/0x1f0 [46934.330113] [<ffffffff816978d4>] inet_sendmsg+0x74/0xc0 [46934.330114] [<ffffffff81697a9b>] ? inet_recvmsg+0x8b/0xb0 [46934.330115] bond0: Adding slave eth2 [46934.330116] [<ffffffff8160357c>] sock_sendmsg+0x9c/0xe0 [46934.330118] [<ffffffff81603248>] ? move_addr_to_kernel.part.20+0x28/0x80 [46934.330121] [<ffffffff811b4477>] ? might_fault+0x47/0x50 [46934.330122] [<ffffffff816039b9>] ___sys_sendmsg+0x3a9/0x3c0 [46934.330125] [<ffffffff8144a14a>] ? n_tty_write+0x3aa/0x530 [46934.330127] [<ffffffff810d1ae4>] ? __wake_up+0x44/0x50 [46934.330129] [<ffffffff81242b38>] ? fsnotify+0x238/0x310 [46934.330130] [<ffffffff816048a1>] __sys_sendmsg+0x51/0x90 [46934.330131] [<ffffffff816048f2>] SyS_sendmsg+0x12/0x20 [46934.330134] [<ffffffff81738b29>] system_call_fastpath+0x16/0x1b [46934.330144] Code: 48 8b 10 4c 89 ee 4c 89 ff e8 aa bc ff ff 31 c0 e9 1a ff ff ff 0f 1f 00 4c 89 ee 4c 89 ff e8 65 fb ff ff 31 d2 4c 89 ee 4c 89 ff <f7> b3 64 09 00 00 e8 02 bd ff ff 31 c0 e9 f2 fe ff ff 0f 1f 00 [46934.330146] RIP [<ffffffffa0198c33>] bond_start_xmit+0x1c3/0x450 [bonding] [46934.330146] RSP <ffff88005b72b7f8> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> Fixes: `278b208375` ("bonding: initial RCU conversion") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:16:56 -04:00
Nikolay Aleksandrov	8c0bc55028	bonding: adjust locking comments Now that locks have been removed, remove some unnecessary comments and adjust others to reflect reality. Also add a comment to "mode_lock" to describe its current users and give a brief summary why they need it. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:07 -04:00
Nikolay Aleksandrov	e470259fa1	bonding: 3ad: convert to bond->mode_lock Now that we have bond->mode_lock, we can remove the state_machine_lock and use it in its place. There're no fast paths requiring the per-port spinlocks so it should be okay to consolidate them into mode_lock. Also move it inside the unbinding function as we don't want to expose mode_lock outside of the specific modes. Suggested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:07 -04:00
Nikolay Aleksandrov	4bab16d7c9	bonding: alb: convert to bond->mode_lock The ALB/TLB specific spinlocks are no longer necessary as we now have bond->mode_lock for this purpose, so convert them and remove them from struct alb_bond_info. Also remove the unneeded lock/unlock functions and use spin_lock/unlock directly. Suggested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:07 -04:00
Nikolay Aleksandrov	b743562819	bonding: convert curr_slave_lock to a spinlock and rename it curr_slave_lock is now a misleading name, a much better name is mode_lock as it'll be used for each mode's purposes and it's no longer necessary to use a rwlock, a simple spinlock is enough. Suggested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:07 -04:00
Nikolay Aleksandrov	1c72cfdc96	bonding: clean curr_slave_lock use Mostly all users of curr_slave_lock already have RTNL as we've discussed previously so there's no point in using it, the one case where the lock must stay is the 3ad code, in fact it's the only one. It's okay to remove it from bond_do_fail_over_mac() as it's called with RTNL and drops the curr_slave_lock anyway. bond_change_active_slave() is one of the main places where curr_slave_lock was used, it's okay to remove it as all callers use RTNL these days before calling it, that's why we move the ASSERT_RTNL() in the beginning to catch any potential offenders to this rule. The RTNL argument actually applies to all of the places where curr_slave_lock has been removed from in this patch. Also remove the unnecessary bond_deref_active_protected() macro and use rtnl_dereference() instead. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:06 -04:00
Nikolay Aleksandrov	62c5f51853	bonding: alb: remove curr_slave_lock First in rlb_teach_disabled_mac_on_primary() it's okay to remove curr_slave_lock as all callers except bond_alb_monitor() already hold RTNL, and in case bond_alb_monitor() is executing we can at most have a period with bad throughput (very unlikely though). In bond_alb_monitor() it's okay to remove the read_lock as the slave list is walked with RCU and the worst that could happen is another transmitter at the same time and thus for a period which currently is 10 seconds (bond_alb.h: BOND_ALB_LP_TICKS). And bond_alb_handle_active_change() is okay because it's always called with RTNL. Removed the ASSERT_RTNL() because it'll be inserted in the parent function in a following patch. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:06 -04:00
Nikolay Aleksandrov	86e749866d	bonding: 3ad: clean up curr_slave_lock usage Remove the read_lock in bond_3ad_lacpdu_recv() since when the slave is being released its rx_handler is removed before 3ad unbind, so even if packets arrive, they won't see the slave in an inconsistent state. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:29:06 -04:00
Masanari Iida	37b7021d9d	net:bonding: Add missing space in bonding driver parameter description This patch adds missing space between "interface" and "by" in bonding module parameter description. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:38:03 -07:00
Nikolay Aleksandrov	87163ef9cd	bonding: remove last users of bond->lock and bond->lock itself The usage of bond->lock in bond_main.c was completely unnecessary as it didn't help to sync with anything, most of the spots already had RTNL. Since there're no more users of bond->lock, remove it. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:36 -07:00
Nikolay Aleksandrov	246df7b423	bonding: options: remove bond->lock usage We're safe to remove the bond->lock use from the arp targets because arp_rcv_probe no longer acquires bond->lock, only rcu_read_lock. Also setting the primary slave is safe because no one uses the bond->lock as a syncing mechanism for that anymore. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:35 -07:00
Nikolay Aleksandrov	e9fe8efeea	bonding: procfs: clean bond->lock usage and use RCU Use RCU to protect against slave release, the proc show function will sync with the bond destruction by the proc locks and the fact that the bond is released after NETDEV_UNREGISTER which causes the bonding to remove the proc entry. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:35 -07:00
Nikolay Aleksandrov	059b47e8aa	bonding: convert primary_slave to use RCU This is necessary mainly for two bonding call sites: procfs and sysfs as it was dereferenced without any real protection. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:35 -07:00
Nikolay Aleksandrov	ecfede424e	bonding: alb: clean bond->lock We can remove the lock/unlock as it's no longer necessary since RTNL should be held while calling bond_alb_set_mac_address(). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:35 -07:00
Nikolay Aleksandrov	bdbc5f1303	bonding: 3ad: use curr_slave_lock instead of bond->lock In 3ad mode the only syncing needed by bond->lock is for the wq and the recv handler, so change them to use curr_slave_lock. There're no locking dependencies here as 3ad doesn't use curr_slave_lock at all. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:31:35 -07:00
Jiri Pirko	cea6aeb697	bonding: add slave netlink policy and put slave-related ops together Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 21:44:03 -07:00
Nikolay Aleksandrov	0f23124aaa	bonding: add slave_changelink support and use it for queue_id This patch adds support for slave_changelink to the bonding and uses it to give the ability to change the queue_id of the enslaved devices via netlink. It sets slave_maxtype and uses bond_changelink as a prototype for bond_slave_changelink. Example/test command after the iproute2 patch: ip link set eth0 type bond_slave queue_id 10 CC: David S. Miller <davem@davemloft.net> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 18:32:22 -07:00
Masanari Iida	9b13494c91	treewide: Fix typo in printk This patch fixes spelling typos in printk messages within various parts of the code. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-08-26 09:35:54 +02:00
Jiri Pirko	d4261e5650	bonding: create netlink event when bonding option is changed Userspace needs to be notified if one changes some option. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@gmail.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-22 12:33:47 -07:00
Andreea-Cristina Bernat	b5091b552a	bonding: Replace rcu_dereference() with rcu_access_pointer() This "rcu_dereference()" call is used directly in a condition. Since its return value is never dereferenced it is recommended to use "rcu_access_pointer()" instead of "rcu_dereference()". Therefore, this patch makes this replacement. The following Coccinelle semantic patch was used for solving it: @@ @@ ( if( (<+... - rcu_dereference + rcu_access_pointer (...) ...+>)) {...} \| while( (<+... - rcu_dereference + rcu_access_pointer (...) ...+>)) {...} ) Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-22 12:23:10 -07:00
David S. Miller	d247b6ab3c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/Makefile net/ipv6/sysctl_net_ipv6.c Two ipv6_table_template[] additions overlap, so the index of the ipv6_table[x] assignments needed to be adjusted. In the drivers/net/Makefile case, we've gotten rid of the garbage whereby we had to list every single USB networking driver in the top-level Makefile, there is just one "USB_NETWORKING" that guards everything. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-05 18:46:26 -07:00
Veaceslav Falico	7afcaec496	bonding: use kobject_put instead of _del after kobject_add Otherwise the name of the kobject isn't getting freed and other stuff from kobject_cleanup() isn't getting called. kobject_put() will call kobject_del() on its own in kobject_cleanup(). CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-31 11:16:09 -07:00
Dan Carpenter	a67eed571a	bonding: fix a memory leak in bond_arp_send_all() This test is reversed so the memory is always leaked. It's better style to remove the test anyway. Fixes: `3e403a7777` ('bonding: make it possible to have unlimited nested upper vlans') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-28 17:27:47 -07:00
Veaceslav Falico	3e403a7777	bonding: make it possible to have unlimited nested upper vlans Currently we're limited by a constant level of vlan nestings, and fail to find anything beyond that level (currently 2). To fix this - remove the limit of nestings when going through device tree, and when the end device is found - allocate the needed amount of vlan tags and return them, instead of found/not found. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-20 20:35:00 -07:00
Veaceslav Falico	23fa5c2caa	bonding: destroy proc directory only after all bonds are gone Currently we might arrive to bond_net_exit() with some bonds left (that were created while the module is unloading). We take care of that by destroying sysfs (the last possibility to add new bonds) and then destroying all the remaining bonds. However, we destroy the /proc/net/bonding directory before destroying those last bonds, and get a warning that we're trying to destroy a non-empty proc directory (containing /proc/net/bonding/bondX). Fix this by moving bond_destroy_proc_dir() after all the bonds are destroyed, so that we're sure that no bonds exist. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-17 16:33:27 -07:00
Veaceslav Falico	14056e7930	bonding: use rtnl_deref in bond_change_rx_flags() As it's always called with RTNL held, via dev_set_allmulti/promiscuity. Also, remove the wrong comment. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-17 16:06:52 -07:00
Jianhua Xie	ce04d63502	bonding: enhance L2 hash helper with packet type Current L2 hash helper calculates destination eth addr and source ether addr as L2 hash factors. This patch is adding packet type ID field into L2 hash factors. While one of BOND_XMIT_POLICY_LAYER2 or BOND_XMIT_POLICY_{LAYER|ENCAP}23 is applied, for the 2nd level hash, enhanced hash method can help to distribute different types of packets like IPv4/IPv6 packets to different slave devices. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> CC: Pan Jiafei <Jiafei.Pan@freescale.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jianhua Xie <jianhua.xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-17 16:03:27 -07:00
Mahesh Bandewar	6b794c1cd8	bonding: Do not try to send packets over dead link in TLB mode. In TLB mode if tlb_dynamic_lb is NOT set, slaves from the bond group are selected based on the hash distribution. This does not exclude dead links which are part of the bond. Also if there is a temporary link event which brings down the interface, packets hashed on that interface would be dropped too. This patch fixes these issues and distributes flows across the UP links only. Also the array construction of links which are capable of sending packets happens in the control path, leaving only link-selection during the data-path. One possible side effect of this is that at a link event all flows will be shuffled to get good distribution. But the impact of this should be minimal, assuming that a member or members of the bond group being unavailable is a very temporary situation. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-16 23:35:35 -07:00
David S. Miller	1a98c69af1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-16 14:09:34 -07:00
Veaceslav Falico	cb25235860	bonding: remove pr_fmt from bond_options.c To maintain the same message structure as netdev_* functions print. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:16:06 -07:00
Veaceslav Falico	2de390bace	bonding: convert bond_options.c to use netdev_printk instead of pr_ CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:16:06 -07:00
Veaceslav Falico	c735ee6c43	bonding: convert bond_procfs.c to use netdev_printk instead of pr_ CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:59 -07:00
Veaceslav Falico	6a9fc6f14a	bonding: bonding: remove pr_fmt from bond_netlink.c To maintain the same message structure as netdev_* functions print. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:59 -07:00
Veaceslav Falico	a5f24542ab	bonding: convert bond_netlink.c to use netdev_printk instead of pr_ CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	9ef1cf16cf	bonding: convert bond_debugfs.c to use netdev_printk instead of pr_ One occurrence left intact as it's unrelated to net_device. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	abaf98ef80	bonding: remove pr_fmt from bond_alb.c To maintain the same message structure as netdev_* functions print. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	0a111a03f4	bonding: convert bond_alb.c to use netdev_printk instead of pr_ CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	dbfddcfdd0	bonding: remove pr_fmt from bond_3ad.c To maintain the same message structure as netdev_* functions print. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	d4471f5e23	bonding: convert bond_3ad.c to use netdev_printk instead of pr_ Several functions were left out because we might not have a valid bond/slave/port at that time. Also, converted several pr_ratelimited into net_ratelimited. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	f338532327	bonding: remove pr_fmt from bond_main.c To maintain the same message structure as netdev_* functions print. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:58 -07:00
Veaceslav Falico	76444f5052	bonding: convert bond_main.c to use netdev_printk instead of pr_ Converted only the parts where we've had a valid net_device, skipping the init/deinit and options verification. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 23:15:57 -07:00
Nikolay Aleksandrov	ff11d8b27d	bonding: fix bond_option_mode_set warning During the conversion to "static" functions this one got left out, only its prototype was converted, thus resulting in: drivers/net/bonding//bond_options.c:674:5: warning: symbol 'bond_option_mode_set' was not declared. Should it be static? Fix it by making it static and also break the line in two as it was too long. CC: Stephen Hemminger <stephen@networkplumber.org> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 22:55:57 -07:00
Veaceslav Falico	f54424412b	bonding: permit enslaving interfaces without set_mac support Currently we exit if the slave isn't the first slave, doesn't support mac address setting and fail_over_mac isn't FOM_ACTIVE. It's wrong because we only require ndo_set_mac_address in case bonding is in active-backup mode and FOM isn't FOM_ACTIVE. To fix this - only exit with an error if we're in a/b mode and have fail_over_mac != FOM_ACTIVE. Also, maintain current behaviour on the first slave (forcibly change fom to FOM_ACTIVE) to not break anyone's configuration. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 22:54:49 -07:00
Eric Dumazet	8574171833	bonding: add proper __rcu annotation for current_arp_slave Using __rcu annotation actually helps to spot all accesses to bond->current_arp_slave are correctly protected, with LOCKDEP support. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 17:49:42 -07:00
Eric Dumazet	4740d63827	bonding: add proper __rcu annotation for curr_active_slave RCU was added to bonding in linux-3.12 but lacked proper sparse annotations. Using __rcu annotation actually helps to spot all accesses to bond->curr_active_slave are correctly protected, with LOCKDEP support. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 17:49:42 -07:00
Eric Dumazet	c2646b593e	bonding: use rcu_access_pointer() in bonding_show_mii_status() curr_active_slave is rcu protected, and bonding_show_mii_status() only wants to check if pointer is NULL or not. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 17:49:41 -07:00
Eric Dumazet	e965f80494	bonding: get rid of bond_option_active_slave_get() Only keep bond_option_active_slave_get_rcu() helper. bond_fill_info() uses a new bond_option_active_slave_get_ifindex() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 17:49:41 -07:00
Tom Gundersen	c835a67733	net: set name_assign_type in alloc_netdev() Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) \| -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) \| -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 16:12:48 -07:00
Nikolay Aleksandrov	548d28bd0e	bonding: fix ad_select module param check Obvious copy/paste error when I converted the ad_select to the new option API. "lacp_rate" there should be "ad_select" so we can get the proper value. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Fixes: `9e5f5eebe7` ("bonding: convert ad_select to use the new option API") Reported-by: Karim Scheik <karim.scheik@prisma-solutions.at> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 14:36:58 -07:00
Jiri Pirko	e721f87d80	bonding: remove no longer relevant vlan warnings These warnings are no longer relevant. Even when last slave is removed, there is a valid address assigned to bond (random). The correct functionality of vlans is ensured by maintaining unicast list in vlan_sync_address(). Suggested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-07 21:31:54 -07:00
Jiri Pirko	763e0ecd72	bonding: allow to add vlans on top of empty bond This limitation maybe had some reason in the past, but now there is not one -> removing this. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-01 18:57:43 -07:00
Or Gerlitz	5a7baa7885	bonding: Advertize vxlan offload features when supported When the underlying device supports TCP offloads for VXLAN/UDP encapsulated traffic, we need to reflect that through the hw_enc_features field of the bonding net-device. This will cause the xmit path in the core networking stack to provide bonding with encapsulated GSO frames to offload into the HW etc. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-18 16:49:51 -07:00
Vlad Yasevich	14af9963ba	bonding: Support macvlans on top of tlb/rlb mode bonds To make TLB mode work, the patch allows learning packets to be sent using mac addresses assigned to macvlan devices, also taking into account vlans that may be between the bond and macvlan device. To make RLB work, all we have to do is accept ARP packets for addresses added to the bond dev->uc list. Since RLB mode will take care to update the peers directly with correct mac addresses, learning packets for these addresses do not have to be sent to the switch. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 15:13:54 -07:00
Vlad Yasevich	c565b488c6	bonding: Turn on IFF_UNICAST_FLT on bond devices Bonding devices manage the unicast filters of the underlying interfaces, but do not turn on IFF_UNICAST_FLT flag. Thus anytime a unicast address is added to the bond, the bond is placed in promiscuous mode. Turn on IFF_UNICAST_FLT on the bond device so that the bond does not go into promiscuous mode needlessly. If an underlying device does not support unicast filtering, that device will automatically enter promiscuous mode already. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 15:13:54 -07:00
David S. Miller	54e5c4def0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 00:32:30 -04:00
Vlad Yasevich	d0c21d43a5	bonding: Send ALB learning packets using the right source ALB learning packets are currently always sent using the slave mac address for all vlans configured on top of bond. This is not always correct, as vlans may change their mac address. This patch introduced a concept of strict matching where the source of learning packets can either strictly match the address passed in, or it can determine a more correct address to use. There are 3 cases to consider: 1) Switchover. In this case, we have a new active slave and we need to tell the switch about all addresses available on the slave. 2) Monitor. We'll periodically refresh learning info for all slaves. In this case, we refresh all addresses for current active, and just the slave address for other slaves. 3) Teaching of disabled address. This happens as part of the failover and in this case, we always use just the address provided. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:47:58 -04:00
Veaceslav Falico	f6dcf561e6	bonding: remove NULL verification from bond_get_bond_by_slave() Every caller relies on the result being the actual bond, so this verification just masks the real problem. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:46:34 -04:00
Veaceslav Falico	dc73c41f4e	bonding: populate essential new_slave->bond/dev early The new bond_free_slave() needs new_slave->bond to verify if additional structures were allocated, so populate it early so that, in case of failure in bond_enslave(), we would be able to get it. Also populate the new_slave->dev field, as it's too one of the most needed things to assign early. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:46:34 -04:00
Vlad Yasevich	d6b694c0b3	bonding: Don't assume 802.1Q when sending alb learning packets. TLB/ALB learning packets always assume 802.1Q vlan protocol, but that is no longer the case since we now have support for Q-in-Q on top of bonding. Pass the vlan protocol to alb_send_lp_vid() so that the packets are properly tagged. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:44:58 -04:00
Michal Kubeček	a9b3ace44c	bonding: fix vlan_features computing bond_compute_features() uses netdev_increment_features() to combine vlan_features of slaves into vlan_features of the bond. As netdev_increment_features() only adds most features and we start with BOND_VLAN_FEATURES, we can end up with features none of the slaves provided. If there is at least one slave, initialize vlan_features only with the flags in NETIF_F_ALL_FOR_ALL. Right now there is none in BOND_VLAN_FEATURES but stating it explicitly will make the code more future proof. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:07:23 -04:00
Vlad Yasevich	f60c3704e8	bonding: Fix alb mode to only use first level vlans. ALB/TLB learning packets use all vlans configured on top of the bond. This ends up being incorrect if we have a stack of vlans on top of the bond. ALB/TLB should only use first level/outer most vlans in its announcements. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:29:05 -04:00
Vlad Yasevich	44a4085538	bonding: Fix stacked device detection in arp monitoring Prior to commit `fbd929f2dc` bonding: support QinQ for bond arp interval the arp monitoring code allowed for proper detection of devices stacked on top of vlans. Since the above commit, the code can still detect a device stacked on top of single vlan, but not a device stacked on top of Q-in-Q configuration. The search will only set the inner vlan tag if the route device is the vlan device. However, this is not always the case, as it is possible to extend the stacked configuration. With this patch it is possible to provision devices on top of Q-in-Q vlan configuration that should be used as a source of ARP monitoring information. For example: ip link add link bond0 vlan10 type vlan proto 802.1q id 10 ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 ip link add link vlan100 type macvlan Note: This patch limits the number of stacked VLANs to 2, just like before. The original, however had another issue in that if we had more than 2 levels of VLANs, we would end up generating incorrectly tagged traffic. This is no longer possible. Fixes: `fbd929f2dc` (bonding: support QinQ for bond arp interval) CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Patric McHardy <kaber@trash.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:29:05 -04:00
Veaceslav Falico	8557cd74ca	bonding: replace SLAVE_IS_OK() with bond_slave_can_tx() They're verifying the same thing (except of IFF_UP, which is implied for netif_running(), which is also a prerequisite). CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:33 -04:00
Veaceslav Falico	891ab54d66	bonding: rename {, bond_}slave_can_tx and clean it up CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	b6adc610f1	bonding: convert IS_UP(slave->dev) to inline function Also, remove the IFF_UP verification cause we can't be netif_running() with being also IFF_UP. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	2807a9feb2	bonding: make IS_IP_TARGET_UNUSABLE_ADDRESS an inline function Also, use standard IP primitives to check the address. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	01844098ec	bonding: create a macro for bond mode and use it CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	ec0865a949	bonding: make USES_PRIMARY inline functions Change the name a bit to better reflect its scope, and update some comments. Two functions added - one which takes bond as a param and the other which takes the mode. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	267bed777a	bonding: make BOND_NO_USES_ARP an inline function Also, change its name to better reflect its scope, and skip the "no" part. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:32 -04:00
Veaceslav Falico	d1e2e5cd4f	bonding: make TX_QUEUE_OVERRIDE() macro an inline function Also, make it accept bonding as a parameter and change the name a bit to better reflect its scope. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:31 -04:00
Veaceslav Falico	befb903ae6	bonding: remove BOND_MODE_IS_LB macro It's used only in an inline function and is useless. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:34:31 -04:00
Nikolay Aleksandrov	81c708068d	bonding: fix out of range parameters for bond_intmax_tbl I've missed to add a NULL entry to the bond_intmax_tbl when I introduced it with the conversion of arp_interval so add it now. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Fixes: `7bdb04ed0d` ("bonding: convert arp_interval to use the new option API") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 23:36:42 -04:00
dingtianhong	3fdddd859a	bonding: alloc the structure ad_info dynamically in per slave The struct ad_slave_info is very huge, and only be used for 802.3ad mode, so alloc the structure dynamically could save 356 Bits for every slave in non 802.3ad mode. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 14:09:15 -04:00
David S. Miller	5f013c9bc7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 13:19:14 -04:00
Nikolay Aleksandrov	dc3e5d18f2	bonding: make a generic sysfs option store and fix comments Introduce a generic option store function for sysfs and remove the specific ones. The attribute name is used to match against the option which is to be set. Also adjust the "name" of tlb_dynamic_lb option to match the sysfs entry and fix the comments and comment style in bond_sysfs.c The comments which showed obvious behaviour (i.e. behaviour that's seen in the option's entry) are removed, the ones that explained important points about the setting function have been moved above the respective set function in bond_options.c There's only 1 exception: num_unsol_na/num_grat_arp since it has 2 names CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-09 16:12:51 -04:00
dingtianhong	2db2a15abf	bonding: remove the unused macro Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:41:12 -04:00
dingtianhong	bedabf903d	bonding: simplify the slave_do_arp_validate_only() The argument slave is not used for slave_do_arp_validate_only(), so no need to keep it, make the function more simple. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:41:12 -04:00
dingtianhong	31924325f5	bonding: remove the unnecessary struct bond_net Move the structure bond_net forward, and remove the unnecessary structure declaration. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-08 23:41:12 -04:00
Masanari Iida	014f1b2010	net: bonding: Fix format string mismatch in bond_sysfs.c Fix format string mismatch in bonding_show_min_links(). Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-28 14:48:16 -04:00
Mahesh Bandewar	e9f0fb8849	bonding: Add tlb_dynamic_lb parameter for tlb mode The aggressive load balancing causes packet re-ordering as active flows are moved from a slave to another within the group. Sometimes this aggressive lb is not necessary if the preference is for less re-ordering. This parameter if used with value "0" disables this dynamic flow shuffling minimizing packet re-ordering. Of course the side effect is that it has to live with the static load balancing that the hashing distribution provides. This impact is less severe if the correct xmit-hashing-policy is used for the tlb setup. The default value of the parameter is set to "1" mimicking the earlier behavior. Ran the netperf test with 200 stream for 1 min between two hosts with 4x1G trunk (xmit-lb mode with xmit-policy L3+4) before and after these changes. Following was the command used for those 200 instances - netperf -t TCP_RR -l 60 -s 5 -H <host> -- -r81920,81920 Transactions per second: Before change: 1,367.11 After change: 1,470.65 Change-Id: Ie3f75c77282cf602e83a6e833c6eb164e72a0990 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-24 13:04:34 -04:00
Mahesh Bandewar	f05b42eaa2	bonding: Added bond_tlb_xmit() for tlb mode. Re-organized the xmit function for the lb mode separating tlb xmit from the alb mode. This will enable use of the hashing policies like 802.3ad mode. Also extended use of xmit-hash-policy to tlb mode. Now the tlb-mode defaults to BOND_XMIT_POLICY_LAYER2 if the xmit policy module parameter is not set (just like 802.3ad, or Xor mode). Change-Id: I140257403d272df75f477b380207338d0f04963e Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-24 13:04:34 -04:00
Mahesh Bandewar	9a49aba1ad	bonding: Reorg bond_alb_xmit code Separating the actual xmit part from the function in a separate function that can be used in the tlb_xmit in the next patch. Also there is no reason do_tx_balance to be an int so changing it to bool type. Change-Id: I9c48ff30487810f68587e621a191db616f49bd3b Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-24 13:04:34 -04:00
Mahesh Bandewar	ee62e86813	bonding: Changed hashing function to just provide hash Modified the hash function to return just hash separating from the modulo operation that can be performed by the caller. This is to make way for the tlb mode to use the same hashing policies that are used in the 802.3ad and Xor mode. Change-Id: I276609e87e0ca213c4d1b17b79c5e0b0f3d0dd6f Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-24 13:04:34 -04:00
Thomas Richter	db29868653	bonding: Remove debug_fs files when module init fails Remove the bonding debug_fs entries when the module initialization fails. The debug_fs entries should be removed together with all other already allocated resources. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-11 15:04:40 -04:00
zheng.li	7db8df0279	bonding: Inactive slaves should keep inactive flag's value bond_open is not setting the inactive flag correctly for some modes (alb and tlb), resulting in error behavior if the bond has been administratively set down and then back up. This effect should not occur when slaves are added while the bond is up; it's something that only happens after a down/up bounce of the bond. For example, in bond tlb or alb mode, domu send some ARP request which go out from dom0 bond's active slave, then the ARP broadcast request packets go back to inactive slave from switch, because the inactive slave's inactive flag is zero, kernel will receive the packets and pass them to bridge that cause dom0's bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. Signed-off-by: Zheng Li <zheng.x.li@oracle.com> Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-04 10:02:20 -04:00
Eric W. Biederman	a8779ec1c5	netpoll: Remove gfp parameter from __netpoll_setup The gfp parameter was added in: commit `47be03a28c` Author: Amerigo Wang <amwang@redhat.com> Date: Fri Aug 10 01:24:37 2012 +0000 netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup() slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> The reason for the gfp parameter was removed in: commit `c4cdef9b71` Author: dingtianhong <dingtianhong@huawei.com> Date: Tue Jul 23 15:25:27 2013 +0800 bonding: don't call slave_xxx_netpoll under spinlocks The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep, it shouldn't be called under spinlocks. bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by rtnl lock, it is no need to take the read lock, as the slave list couldn't be changed outside rtnl lock. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Nothing else that calls __netpoll_setup or ndo_netpoll_setup requires a gfp parameter, so remove the gfp parameter from both of these functions making the code clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-29 17:58:37 -04:00
Monam Agarwal	8800a244fa	drivers/net: Use RCU_INIT_POINTER(x, NULL) in bonding/bond_options.c This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-27 00:18:09 -04:00
dingtianhong	4873ac3c8e	bonding: add net_ratelimit to avoid spam in arp interval Remove the unnecessary log and add net_ratelimit to the others, in order to avoid spamming the log. Cc: Joe Perches <joe@perches.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:41:28 -04:00
dingtianhong	fbd929f2dc	bonding: support QinQ for bond arp interval The bond send arp request to indicate that the slave is active, and if the bond dev is a vlan dev, it will set the vlan tag in skb to notice the vlan group, but the bond could only send a skb with 802.1q proto, not support for QinQ. So add outer tag for lower vlan tag and inner tag for upper vlan tag to support QinQ, The new skb will be consist of two vlan tag just like this: dst mac \| src mac \| outer vlan tag \| inner vlan tag \| data \| ..... If We don't need QinQ, the inner vlan tag could be set to 0 and use outer vlan tag as a normal vlan group. Using "ip link" to configure the bond for QinQ and add test log: ip link add link bond0 bond0.20 type vlan proto 802.1ad id 20 ip link add link bond0.20 bond0.20.200 type vlan proto 802.1q id 200 ifconfig bond0.20 11.11.20.36/24 ifconfig bond0.20.200 11.11.200.36/24 echo +11.11.200.37 > /sys/class/net/bond0/bonding/arp_ip_target 90:e2:ba:07:4a:5c (oui Unknown) > Broadcast, ethertype 802.1Q-QinQ (0x88a8),length 50: vlan 20, p 0,ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 11.11.200.37 tell 11.11.200.36, length 28 90:e2:ba:06:f9:86 (oui Unknown) > 90:e2:ba:07:4a:5c (oui Unknown), ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 20, p 0, ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 11.11.200.37 is-at 90:e2:ba:06:f9:86 (oui Unknown), length 28 v1->v2: remove the comment "TODO: QinQ?". Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:41:28 -04:00
dingtianhong	9152e26df2	bonding: ratelimit pr_err() for bond xmit broadcast It may spam if the system is out of the memory, add ratelimit for it. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:40:24 -04:00
dingtianhong	054bb88010	bonding: slight optimization for bond xmit path Add unlikely() macro to the unlikely conditions in the bond xmit path for slight optimization. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:40:24 -04:00
Veaceslav Falico	86a2b9cfcc	bonding: ratelimit pr_warn()s in 802.3ad mode Only ratelimit the ones that might spam, omitting the ones from enslave/deslave. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-18 14:50:48 -04:00
David S. Miller	85dcce7a73	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-14 22:31:55 -04:00
Veaceslav Falico	96a0922c23	bonding: use the correct ether type for alb Currently it's using the wrong ETH_P_LOOP type, which is sometimes treated as packet length instead of ether type (because it's 0x0060). Use the new ETH_P_LOOPBACK type. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-14 22:20:57 -04:00
dingtianhong	fb00bc2e6c	bonding: set correct vlan id for alb xmit path The commit `d3ab3ffd1d` (bonding: use rlb_client_info->vlan_id instead of ->tag) remove the rlb_client_info->tag, but occur some issues, The vlan_get_tag() will return 0 for success and -EINVAL for error, so the client_info->vlan_id always be set to 0 if the vlan_get_tag return 0 for success, so the client_info would never get a correct vlan id. We should only set the vlan id to 0 when the vlan_get_tag return error. Fixes: `d3ab3ffd1d` (bonding: use rlb_client_info->vlan_id instead of ->tag) CC: Ding Tianhong <dingtianhong@huawei.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-13 15:45:10 -04:00
Eric W. Biederman	2bb77ab42a	bonding: Call dev_kfree_skb_any instead of kfree_skb. Replace kfree_skb with dev_kfree_skb_any in functions that can be called in hard irq and other contexts. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 16:22:13 -04:00
stephen hemminger	a19a7ec8fc	bonding: force cast of IP address in options The option code is taking IP address and putting it into a generic container. Force cast to silence sparse warnings. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 16:37:14 -04:00
stephen hemminger	28f084cca3	bonding: fix const in options processing This is a fixup patch to resolve issues with const from my earlier patch. Make all the setter functions use const on input parameter. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 17:39:19 -05:00
Sasha Levin	5bd4e4c158	bonding: correctly handle out of range parameters for lp_interval We didn't correctly check cases where the value for lp_interval is not within the legal range due to a missing table terminator. This would let userspace trigger a kernel panic by specifying a value out of range: echo -1 > /sys/devices/virtual/net/bond0/bonding/lp_interval Introduced by commit `4325b374f8` ("bonding: convert lp_interval to use the new option API"). Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 17:06:17 -05:00
stephen hemminger	f3253339a4	bonding: options handling cleanup Make local functions static (ie. only used in bond_options.c) Make bond options parsing tables constant. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 16:08:52 -05:00
stephen hemminger	fca28094cd	bonding: remove dead code These functions are defined but no longer used. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 16:08:52 -05:00
Veaceslav Falico	072256d1f2	bonding: make slave status notifications GFP_ATOMIC Currently we're using GFP_KERNEL, however there are some path(s) where we can hold some spinlocks, specifically bond->curr_slave_lock: [ 4.722916] BUG: sleeping function called from invalid context at mm/slub.c:965 [ 4.724438] in_atomic(): 1, irqs_disabled(): 0, pid: 940, name: ifup-eth [ 4.726034] 5 locks held by ifup-eth/940: ...snip... [ 4.734646] #4: (&bond->curr_slave_lock){+...+.}, at: [<ffffffffa00badc6>] bond_enslave+0xda6/0xdd0 [bonding] ...snip... [ 4.759081] [<ffffffffa00b6f11>] bond_change_active_slave+0x191/0x3b0 [bonding] [ 4.760917] [<ffffffffa00b7227>] bond_select_active_slave+0xf7/0x1d0 [bonding] [ 4.762751] [<ffffffffa00badce>] bond_enslave+0xdae/0xdd0 [bonding] ...snip... As it's out of hot path and is a really rare event - change the gfp_t flags to GFP_ATOMIC to avoid sleeping under spinlock. v2: convert new notify calls to GFP_ATOMIC. CC: Thomas Glanzmann <thomas@glanzmann.de> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 15:19:43 -05:00
David S. Miller	67ddc87f16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-05 20:32:02 -05:00
Veaceslav Falico	285727600f	bonding: send arp requests even if there's no route to them Currently we're only sending arp requests if we have a route to the target (and, thus, can find out the source ip address). There are some use cases, however, where we don't want/need to set an ip address (or set up a specific route) for bonding to use arp monitoring for traffic generation. We can easily send arp probes (arp requests with src ip == 0) to generate arp broadcast responses from the target ip and use them for determining if the target is up. This, obviously, won't work with arp validation - because we don't have the ip address set and, thus, will filter out the responses. So in that case - print a warning. CC: François CACHEREUL <f.cachereul@alphalink.fr> CC: Zhenjie Chen <zhchen@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-02 14:54:10 -05:00
Jiri Bohac	09a89c219b	bonding: disallow enslaving a bond to itself Enslaving a bond to itself leads to an endless loop and hangs the kernel. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 22:37:12 -05:00
Nikolay Aleksandrov	ee6154e11e	bonding: fix a div error caused by the slave release path There's a bug in the slave release function which leads the transmit functions which use the bond->slave_cnt to a div by 0 because we might just have released our last slave and made slave_cnt == 0 but at the same time we may have a transmitter after the check for an empty list which will fetch it and use it in the slave id calculation. Fix it by moving the slave_cnt after synchronize_rcu so if this was our last slave any new transmitters will see an empty slave list which is checked after rcu lock but before calling the mode transmit functions which rely on bond->slave_cnt. Fixes: `278b208375` ("bonding: initial RCU conversion") CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <fubar@us.ibm.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 17:09:09 -05:00
dingtianhong	b0929915e0	bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor Veaceslav has reported and fix this problem by commit `f2ebd477f1` (bonding: restructure locking of bond_ab_arp_probe()). According Jay's opinion, the current solution is not very well, because the notification is to indicate that the interface has actually changed state in a meaningful way, but these calls in the ab ARP monitor are internal settings of the flags to allow the ARP monitor to search for a slave to become active when there are no active slaves. The flag setting to active or backup is to permit the ARP monitor's response logic to do the right thing when deciding if the test slave (current_arp_slave) is up or not. So the best way to fix the problem is that we should not send a notification when the slave is in testing state, and check the state at the end of the monitor, if the slave's state recover, avoid to send pointless notification twice. And RTNL is really a big lock, hold it regardless the slave's state changed or not when the current_active_slave is null will loss performance (every 100ms), so we should hold it only when the slave's state changed and need to notify. I revert the old commit and add new modifications. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:02:56 -05:00
dingtianhong	5e5b066535	bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode The problem was introduced by the commit `1d3ee88ae0` (bonding: add netlink attributes to slave link dev). The bond_set_active_slave() and bond_set_backup_slave() will use rtmsg_ifinfo to send slave's states, so these two functions should be called in RTNL. In 802.3ad mode, acquiring RTNL for the __enable_port and __disable_port cases is difficult, as those calls generally already hold the state machine lock, and cannot unconditionally call rtnl_lock because either they already hold RTNL (for calls via bond_3ad_unbind_slave) or due to the potential for deadlock with bond_3ad_adapter_speed_changed, bond_3ad_adapter_duplex_changed, bond_3ad_link_change, or bond_3ad_update_lacp_rate. All four of those are called with RTNL held, and acquire the state machine lock second. The calling contexts for __enable_port and __disable_port already hold the state machine lock, and may or may not need RTNL. According to the Jay's opinion, I don't think it is a problem that the slave don't send notify message synchronously when the status changed, normally the state machine is running every 100 ms, send the notify message at the end of the state machine if the slave's state changed should be better. I fix the problem through these steps: 1). add a new function bond_set_slave_state() which could change the slave's state and call rtmsg_ifinfo() according to the input parameters called notify. 2). Add a new slave parameter which called should_notify, if the slave's state changed and don't notify yet, the parameter will be set to 1, and then if the slave's state changed again, the param will be set to 0, it indicate that the slave's state has been restored, no need to notify any one. 3). the __enable_port and __disable_port should not call rtmsg_ifinfo in the state machine lock, any change in the state of slave could set a flag in the slave, it will indicated that an rtmsg_ifinfo should be called at the end of the state machine. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:02:56 -05:00
dingtianhong	7a4ddcd92e	bonding: remove no longer needed lock for bond_xxx_info_query() bond_xxx_info_query() is already called under RTNL, so there is no need to use the bond lock to protect the bond slave list; remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:28:23 -05:00
dingtianhong	4335d60e5e	bonding: use rcu_dereference() to access curr_active_slave bond_info_show_master is already in an RCU read-side critical section, and we access curr_active_slave without the curr_slave_lock, so we cannot be sure whether curr_active_slave will change during processing; use RCU to protect the pointer. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:28:23 -05:00
dingtianhong	827418081a	bonding: netpoll: remove unwanted slave_dev_support_netpoll() __netpoll_setup() checks the slave's flag and ndo_poll_controller just as slave_dev_support_netpoll() does, and slave_dev_support_netpoll() is not used anywhere, so remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:28:23 -05:00
Veaceslav Falico	010d3c3989	bonding: fix bond_arp_rcv() race of curr_active_slave bond->curr_active_slave can be changed between its dereferences, even to NULL, and thus we might panic. We're always holding the rcu (rx_handler->bond_handle_frame()->bond_arp_rcv()) so fix this by rcu_dereferencing() it and using the saved pointer. Reported-by: Ding Tianhong <dingtianhong@huawei.com> Fixes: `aeea64a` ("bonding: don't trust arp requests unless active slave really works") CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-20 13:20:55 -05:00


1138 Commits