OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	06f4e926d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits) macvlan: fix panic if lowerdev in a bond tg3: Add braces around 5906 workaround. tg3: Fix NETIF_F_LOOPBACK error macvlan: remove one synchronize_rcu() call networking: NET_CLS_ROUTE4 depends on INET irda: Fix error propagation in ircomm_lmp_connect_response() irda: Kill set but unused variable 'bytes' in irlan_check_command_param() irda: Kill set but unused variable 'clen' in ircomm_connect_indication() rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport() be2net: Kill set but unused variable 'req' in lancer_fw_download() irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication() atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined. rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer(). rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler() rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection() rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window() pkt_sched: Kill set but unused variable 'protocol' in tc_classify() isdn: capi: Use pr_debug() instead of ifdefs. tg3: Update version to 3.119 tg3: Apply rx_discards fix to 5719/5720 ... Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c as per Davem.	2011-05-20 13:43:21 -07:00
Eric Dumazet	d93515611b	macvlan: fix panic if lowerdev in a bond commit `a35e2c1b6d` (macvlan: use rx_handler_data pointer to store macvlan_port pointer V2) added a bug in macvlan_port_create() Steps to reproduce the bug: # ifenslave bond0 eth0 eth1 # ip link add link eth0 up name eth0#1 type macvlan ->error EBUSY # ip link add link eth0 up name eth0#1 type macvlan ->panic Fix: Dont set IFF_MACVLAN_PORT in error case. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-20 14:59:23 -04:00
Eric Dumazet	449f454426	macvlan: remove one synchronize_rcu() call When one macvlan device is dismantled, we can avoid one synchronize_rcu() call done after deletion from hash list, since caller will perform a synchronize_net() call after its ndo_stop() call. Add a new netdev->dismantle field to signal this dismantle intent. Reduces RTNL hold time. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-20 00:33:18 -04:00
Eric Dumazet	226bd34114	net: use batched device unregister in veth and macvlan veth devices dont use the batched device unregisters yet. Since veth are a pair of devices, it makes sense to use a batch of two unregisters, this roughly divides dismantle time by two. Fix this by changing dellink() callers to always provide a non NULL head. (Idea from Michał Mirosław) This patch also handles macvlan case : We now dismantle all macvlans on top of a lower dev at once. Reported-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Michał Mirosław <mirqus@gmail.com> Cc: Jesse Gross <jesse@nicira.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-09 11:41:40 -07:00
Lai Jiangshan	6e070aecd9	macvlan,rcu: convert call_rcu(macvlan_port_rcu_free) to kfree_rcu() The rcu callback macvlan_port_rcu_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(macvlan_port_rcu_free). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-07 22:51:01 -07:00
Michał Mirosław	3918764666	net: macvlan: convert to hw_features Not much of a conversion anyway - macvlan has no way to change the offload settings independently to its base device. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-17 17:40:24 -07:00
Eric W. Biederman	d5cd92448f	macvlan: Fix use after free of struct macvlan_port. When the macvlan driver was extended to call unregisgter_netdevice_queue in `23289a37e2`, a use after free of struct macvlan_port was introduced. The code in dellink relied on unregister_netdevice actually unregistering the net device so it would be safe to free macvlan_port. Since unregister_netdevice_queue can just queue up the unregister instead of performing the unregiser immediately we free the macvlan_port too soon and then the code in macvlan_stop removes the macaddress for the set of macaddress to listen for and uses memory that has already been freed. To fix this add a reference count to track when it is safe to free the macvlan_port and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed to be called after the final macvlan_port_close. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-21 18:22:22 -07:00
Jiri Pirko	8a4eb5734e	net: introduce rx_handler results and logic around that This patch allows rx_handlers to better signalize what to do next to it's caller. That makes skb->deliver_no_wcard no longer needed. kernel-doc for rx_handler_result is taken from Nicolas' patch. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-16 12:53:54 -07:00
Daniel Lezcano	12a2856b60	macvlan : fix checksums error when we are in bridge mode When the lower device has offloading capabilities, the packets checksums are not computed. That leads to have any macvlan port in bridge mode to not work because the packets are dropped due to a bad checksum. If the macvlan is in bridge mode, the packet is forwarded to another macvlan port and reach the network stack where it looks for a checksum but this one was not computed due to the offloading of the lower device. In this case, we have to set the packet with CHECKSUM_UNNECESSARY when it is forwarded to a bridged port and restore the previous value of ip_summed when the packet goes to the lowerdev. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Patrick McHardy <kaber@trash.net> Cc: Andrian Nord <nightnord@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-14 16:54:44 -07:00
Sridhar Samudrala	eb06acdc85	macvlan: Introduce 'passthru' mode to takeover the underlying device With the current default 'vepa' mode, a KVM guest using virtio with macvtap backend has the following limitations. - cannot change/add a mac address on the guest virtio-net - cannot create a vlan device on the guest virtio-net - cannot enable promiscuous mode on guest virtio-net To address these limitations, this patch introduces a new mode called 'passthru' when creating a macvlan device which allows takeover of the underlying device and passing it to a guest using virtio with macvtap backend. Only one macvlan device is allowed in passthru mode and it inherits the mac address from the underlying device and sets it in promiscuous mode to receive and forward all the packets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> ------------------------------------------------------------------------- Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-22 08:24:29 -08:00
Eric Dumazet	8ffab51b3d	macvlan: lockless tx path macvlan is a stacked device, like tunnels. We should use the lockless mechanism we are using in tunnels and loopback. This patch completely removes locking in TX path. tx stat counters are added into existing percpu stat structure, renamed from rx_stats to pcpu_stats. Note : this reverts commit `2c11455321` (macvlan: add multiqueue capability) Note : rx_errors converted to a 32bit counter, like tx_dropped, since they dont need 64bit range. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Ben Greear <greearb@candelatech.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-16 10:58:30 -08:00
David Lamparter	3b27e10555	netns: keep vlan slaves on master netns move previously, if a vlan master device was moved from one network namespace to another, all 802.1q and macvlan slaves were deleted. we can use dev->reg_state to figure out whether dev_change_net_namespace is happening, since that won't set dev->reg_state NETREG_UNREGISTERING. so, this changes 8021q and macvlan to ignore NETDEV_UNREGISTER when reg_state is not NETREG_UNREGISTERING. Signed-off-by: David Lamparter <equinox@diac24.net> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-17 16:46:04 -07:00
Sridhar Samudrala	ba01877f56	macvlan: Fix rx counters update in macvlan_handle_frame() Fix macvlan_handle_frame() to update the rx counters based on the return value of the vlan->receive call. Updated the patch to not do any packet count drops when the interface is down based on Herber'ts comments. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-27 21:02:42 -07:00
David S. Miller	bb7e95c8fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x_main.c Merge bnx2x bug fixes in by hand... :-/ Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-27 21:01:35 -07:00
Herbert Xu	8a35747a5d	macvtap: Limit packet queue length Mark Wagner reported OOM symptoms when sending UDP traffic over a macvtap link to a kvm receiver. This appears to be caused by the fact that macvtap packet queues are unlimited in length. This means that if the receiver can't keep up with the rate of flow, then we will hit OOM. Of course it gets worse if the OOM killer then decides to kill the receiver. This patch imposes a cap on the packet queue length, in the same way as the tuntap driver, using the device TX queue length. Please note that macvtap currently has no way of giving congestion notification, that means the software device TX queue cannot be used and packets will always be dropped once the macvtap driver queue fills up. This shouldn't be a great problem for the scenario where macvtap is used to feed a kvm receiver, as the traffic is most likely external in origin so congestion notification can't be applied anyway. Of course, if anybody decides to complain about guest-to-guest UDP packet loss down the track, then we may have to revisit this. Incidentally, this patch also fixes a real memory leak when macvtap_get_queue fails. Chris Wright noticed that for this patch to work, we need a non-zero TX queue length. This patch includes his work to change the default macvtap TX queue length to 500. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-22 13:08:56 -07:00
Ben Hutchings	3cfde79c6c	net: Get rid of rtnl_link_stats64 / net_device_stats union In commit `be1f3c2c02` "net: Enable 64-bit net device statistics on 32-bit architectures" I redefined struct net_device_stats so that it could be used in a union with struct rtnl_link_stats64, avoiding the need for explicit copying or conversion between the two. However, this is unsafe because there is no locking required and no lock consistently held around calls to dev_get_stats() and use of the statistics structure it returns. In commit `28172739f0` "net: fix 64 bit counters on 32 bit arches" Eric Dumazet dealt with that problem by requiring callers of dev_get_stats() to provide storage for the result. This means that the net_device::stats64 field and the padding in struct net_device_stats are now redundant, so remove them. Update the comment on net_device_ops::ndo_get_stats64 to reflect its new usage. Change dev_txq_stats_fold() to use struct rtnl_link_stats64, since that is what all its callers are really using and it is no longer going to be compatible with struct net_device_stats. Eric Dumazet suggested the separate function for the structure conversion. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-09 17:41:56 -07:00
Eric Dumazet	28172739f0	net: fix 64 bit counters on 32 bit arches There is a small possibility that a reader gets incorrect values on 32 bit arches. SNMP applications could catch incorrect counters when a 32bit high part is changed by another stats consumer/provider. One way to solve this is to add a rtnl_link_stats64 param to all ndo_get_stats64() methods, and also add such a parameter to dev_get_stats(). Rule is that we are not allowed to use dev->stats64 as a temporary storage for 64bit stats, but a caller provided area (usually on stack) Old drivers (only providing get_stats() method) need no changes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-07 14:58:56 -07:00
Eric Dumazet	bc66154efe	macvlan: 64 bit rx counters Use u64_stats_sync infrastructure to implement 64bit stats. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-28 23:24:30 -07:00
Jiri Pirko	a35e2c1b6d	macvlan: use rx_handler_data pointer to store macvlan_port pointer V2 Register macvlan_port pointer as rx_handler data pointer. As macvlan_port is removed from struct net_device, another netdev priv_flag is added to indicate the device serves as a macvlan port. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-15 11:47:12 -07:00
Jiri Pirko	93e2c32b5c	net: add rx_handler data pointer Add possibility to register rx_handler data pointer along with a rx_handler. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-15 11:47:11 -07:00
Jiri Pirko	8b37ef0a1f	macvlan: use call_rcu for port free Use call_rcu rather than synchronize_rcu. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-07 21:25:20 -07:00
Jiri Pirko	ab95bfe01f	net: replace hooks in __netif_receive_skb V5 What this patch does is it removes two receive frame hooks (for bridge and for macvlan) from __netif_receive_skb. These are replaced them with a single hook for both. It only supports one hook per device because it makes no sense to do bridging and macvlan on the same device. Then a network driver (of virtual netdev like macvlan or bridge) can register an rx_handler for needed net device. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-02 07:11:15 -07:00
Jiri Pirko	f16d3d5748	macvlan: do proper cleanup in macvlan_common_newlink() V2 Fixes possible memory leak. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-05-24 18:42:12 -07:00
Eric Dumazet	2d6c9ffcca	net: congestion notifications are not dropped packets vlan/macvlan start_xmit() can inform caller of congestion with NET_XMIT_CN return value. This doesnt mean packet was dropped. Increment normal stat counters instead of tx_dropped. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-05-16 00:42:15 -07:00
Jiri Pirko	a14462f1bd	net: adjust handle_macvlan to pass port struct to hook Now there's null check here and also again in the hook. Looking at bridge bits which are simmilar, port structure is rcu_dereferenced right away in handle_bridge and passed to hook. Looks nicer. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-05-15 23:48:02 -07:00
Jiri Pirko	a748ee2426	net: move address list functions to a separate file +little renaming of unicast functions to be smooth with multicast ones Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:22:11 -07:00
Jiri Pirko	1c01fe14a8	net: forbid underlaying devices to change its type It's not desired for underlaying devices to change type. At the time, there is for example possible to have bond with changed type from Ethernet to Infiniband as a port of a bridge. This patch fixes this. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:00:02 -07:00
Arnd Bergmann	fc0663d6b5	macvlan: allow multiple driver backends This makes it possible to hook into the macvlan driver from another kernel module. In particular, the goal is to extend it with the macvtap backend that provides a tun/tap compatible interface directly on the macvlan device. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-03 20:20:33 -08:00
Arnd Bergmann	8a83a00b07	net: maintain namespace isolation between vlan and real device In the vlan and macvlan drivers, the start_xmit function forwards data to the dev_queue_xmit function for another device, which may potentially belong to a different namespace. To make sure that classification stays within a single namespace, this resets the potentially critical fields. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-03 20:20:32 -08:00
Patrick Mullaney	6eb3a85533	macvlan: add GRO bit to features mask Allow macvlan devices to support GRO. Signed-off-by: Patrick Mullaney <pmullaney@novell.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-16 01:05:38 -08:00
Patrick Mullaney	fc4a748966	netdevice: provide common routine for macvlan and vlan operstate management Provide common routine for the transition of operational state for a leaf device during a root device transition. Signed-off-by: Patrick Mullaney <pmullaney@novell.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 15:59:22 -08:00
David S. Miller	9b963e5d0e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/ieee802154/fakehard.c drivers/net/e1000e/ich8lan.c drivers/net/e1000e/phy.c drivers/net/netxen/netxen_nic_init.c drivers/net/wireless/ath/ath9k/main.c	2009-11-29 00:57:15 -08:00
Arnd Bergmann	27c0b1a850	macvlan: export macvlan mode through netlink In order to support all three modes of macvlan at runtime, extend the existing netlink protocol to allow choosing the mode per macvlan slave interface. This depends on a matching patch to iproute2 in order to become accessible in user land. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-26 15:53:10 -08:00
Arnd Bergmann	618e1b7482	macvlan: implement bridge, VEPA and private mode This allows each macvlan slave device to be in one of three modes, depending on the use case: MACVLAN_PRIVATE: The device never communicates with any other device on the same upper_dev. This even includes frames coming back from a reflective relay, where supported by the adjacent bridge. MACVLAN_VEPA: The new Virtual Ethernet Port Aggregator (VEPA) mode, we assume that the adjacent bridge returns all frames where both source and destination are local to the macvlan port, i.e. the bridge is set up as a reflective relay. Broadcast frames coming in from the upper_dev get flooded to all macvlan interfaces in VEPA mode. We never deliver any frames locally. MACVLAN_BRIDGE: We provide the behavior of a simple bridge between different macvlan interfaces on the same port. Frames from one interface to another one get delivered directly and are not sent out externally. Broadcast frames get flooded to all other bridge ports and to the external interface, but when they come back from a reflective relay, we don't deliver them again. Since we know all the MAC addresses, the macvlan bridge mode does not require learning or STP like the bridge module does. Based on an earlier patch "macvlan: Reflect macvlan packets meant for other macvlan devices" by Eric Biederman. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Patrick McHardy <kaber@trash.net> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-26 15:53:07 -08:00
Arnd Bergmann	a1e514c5d0	macvlan: cleanup rx statistics We have very similar code for rx statistics in two places in the macvlan driver, with a third one being added in the next patch. Consolidate them into one function to improve overall readability of the driver. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-26 15:53:02 -08:00
Patrick McHardy	8c2acc53fd	macvlan: fix gso_max_size setting gso_max_size must be set based on the value of the underlying device to support devices not using the full 64k. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-23 14:18:53 -08:00
Eric Dumazet	fccaf71011	macvlan: Precise RX stats accounting With multi queue devices, its possible that several cpus call macvlan RX routines simultaneously for the same macvlan device. We update RX stats counter without any locking, so we can get slightly wrong counters. One possible fix is to use percpu counters, to get precise accounting and also get guarantee of no cache line ping pongs between cpus. Note: this adds 16 bytes (32 bytes on 64bit arches) of percpu data per macvlan device. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 23:51:57 -08:00
Patrick McHardy	cbbef5e183	vlan/macvlan: propagate transmission state to upper layers Both vlan and macvlan devices usually don't use a qdisc and immediately queue packets to the underlying device. Propagate transmission state of the underlying device to the upper layers so they can react on congestion and/or inform the sending process. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 14:07:33 -08:00
Eric W. Biederman	81adee47df	net: Support specifying the network namespace upon device creation. There is no good reason to not support userspace specifying the network namespace during device creation, and it makes it easier to create a network device and pass it to a child network namespace with a well known name. We have to be careful to ensure that the target network namespace for the new device exists through the life of the call. To keep that logic clear I have factored out the network namespace grabbing logic into rtnl_link_get_net. In addtion we need to continue to pass the source network namespace to the rtnl_link_ops.newlink method so that we can find the base device source network namespace. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>	2009-11-08 00:53:51 -08:00
Eric Dumazet	23289a37e2	net: add a list_head parameter to dellink() method Adding a list_head parameter to rtnl_link_ops->dellink() methods allow us to queue devices on a list, in order to dismantle them all at once. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-28 02:22:07 -07:00
Eric Dumazet	2c11455321	macvlan: add multiqueue capability macvlan devices are currently not multi-queue capable. We can do that defining rtnl_link_ops method, get_tx_queues(), called from rtnl_create_link() This new method gets num_tx_queues/real_num_tx_queues from lower device. macvlan_get_tx_queues() is a copy of vlan_get_tx_queues(). Because macvlan_start_xmit() has to update netdev_queue stats only (and not dev->stats), I chose to change tx_errors/tx_aborted_errors accounting to tx_dropped, since netdev_queue structure doesnt define tx_errors / tx_aborted_errors. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-03 20:02:13 -07:00
Eric Dumazet	ac06713d55	macvlan: Use compare_ether_addr_64bits() To speedup ether addresses compares, we can use compare_ether_addr_64bits() (all operands are guaranteed to be at least 8 bytes long) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-01 17:40:25 -07:00
Stephen Hemminger	424efe9caf	netdev: convert pseudo drivers to netdev_tx_t These are all drivers that don't touch real hardware. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-01 01:13:40 -07:00
sg.tweak@gmail.com	ef5c89967d	drivers/net/macvlan.c: fix cloning of tagged VLAN interfaces Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13348 akpm: the reporter disappeared, so I typed it in again. It is not possible to make clone of tagged VLAN interface to be used as mac-based vlan interfave. How reproducible: Use any 802.1q tagged vlan interface, e.g. eth2.700 and clone it: ip link add link eth2.700 address 00:04:75:cb:38:09 macvlan0 type macvlan ip link set dev macvlan0 up ip addr add 10.195.1.1/24 dev macvlan0 So far, so good. Now try to ping anything via macvlan0: ping 10.195.1.2 Actual results: For every attempted packet tx kernel writes to console: ------------[ cut here ]------------ WARNING: at net/8021q/vlan_dev.c:254 vlan_dev_hard_header+0x36/0x126 [8021q]() Hardware name: M22ES Modules linked in: arptable_filter arp_tables bridge veth macvlan arc4 ecb ppp_mppe ppp_async crc_ccitt ppp_generic slhc autofs4 sunrpc 8021q garp stp ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp x_tables dm_mirror dm_region_hash dm_log dm_multipath dm_mod sbs sbshc lp floppy snd_intel8x0 joydev snd_seq_dummy snd_intel8x0m snd_ac97_codec ide_cd_mod ac97_bus snd_seq_oss cdrom snd_seq_midi_event serio_raw snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss parport_pc snd_pcm parport battery 8139cp snd_timer i2c_sis96x ac button snd rtc_cmos rtc_core 8139too soundcore rtc_lib mii i2c_core pcspkr snd_page_alloc pata_sis libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: ip_tables] Pid: 0, comm: swapper Tainted: G W 2.6.29.3 #1 Call Trace: [<c0425f48>] warn_slowpath+0x60/0x9f [<c0425f6f>] warn_slowpath+0x87/0x9f [<dffb850d>] vlan_dev_hard_header+0x0/0x126 [8021q] [<dffb8543>] vlan_dev_hard_header+0x36/0x126 [8021q] [<dffb850d>] vlan_dev_hard_header+0x0/0x126 [8021q] [<df83155d>] macvlan_hard_header+0x3c/0x47 [macvlan] [<df831521>] macvlan_hard_header+0x0/0x47 [macvlan] [<c062bf3f>] arp_create+0xef/0x1ff [<c062c08c>] arp_send+0x3d/0x54 [<c062c916>] arp_solicit+0x16c/0x177 [<c05fadd2>] neigh_timer_handler+0x227/0x269 [<c05fabab>] neigh_timer_handler+0x0/0x269 [<c042ce4d>] run_timer_softirq+0xf0/0x141 [<c0429e5a>] __do_softirq+0x76/0xf8 [<c0429de4>] __do_softirq+0x0/0xf8 <IRQ> [<c044fb67>] handle_level_irq+0x0/0xad [<c0429db7>] irq_exit+0x35/0x62 [<c04046bb>] do_IRQ+0xdf/0xf4 [<c04035a7>] common_interrupt+0x27/0x2c [<c04079c5>] default_idle+0x2a/0x3d [<c0401bb6>] cpu_idle+0x57/0x70 Macvlan driver always uses standard ethernet header length for all types of interface to which it is linked. This patch fixes this problem. Reported-by: <sg.tweak@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-11 02:32:39 -07:00
Jiri Pirko	ccffad25b5	net: convert unicast addr list This patch converts unicast address list to standard list_head using previously introduced struct netdev_hw_addr. It also relaxes the locking. Original spinlock (still used for multicast addresses) is not needed and is no longer used for a protection of this list. All reading and writing takes place under rtnl (with no changes). I also removed a possibility to specify the length of the address while adding or deleting unicast address. It's always dev->addr_len. The convertion touched especially e1000 and ixgbe codes when the change is not so trivial. Signed-off-by: Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c \| 13 +-- drivers/net/e1000/e1000_main.c \| 24 +++-- drivers/net/ixgbe/ixgbe_common.c \| 14 ++-- drivers/net/ixgbe/ixgbe_common.h \| 4 +- drivers/net/ixgbe/ixgbe_main.c \| 6 +- drivers/net/ixgbe/ixgbe_type.h \| 4 +- drivers/net/macvlan.c \| 11 +- drivers/net/mv643xx_eth.c \| 11 +- drivers/net/niu.c \| 7 +- drivers/net/virtio_net.c \| 7 +- drivers/s390/net/qeth_l2_main.c \| 6 +- drivers/scsi/fcoe/fcoe.c \| 16 ++-- include/linux/netdevice.h \| 18 ++-- net/8021q/vlan.c \| 4 +- net/8021q/vlan_dev.c \| 10 +- net/core/dev.c \| 195 +++++++++++++++++++++++++++----------- net/dsa/slave.c \| 10 +- net/packet/af_packet.c \| 4 +- 18 files changed, 227 insertions(+), 137 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 22:12:32 -07:00
Eric Dumazet	93f154b594	net: release dst entry in dev_hard_start_xmit() One point of contention in high network loads is the dst_release() performed when a transmited skb is freed. This is because NIC tx completion calls dev_kree_skb() long after original call to dev_queue_xmit(skb). CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is quite visible if one CPU is 100% handling softirqs for a network device, since dst_clone() is done by other cpus, involving cache line ping pongs. It seems right place to release dst is in dev_hard_start_xmit(), for most devices but ones that are virtual, and some exceptions. David Miller suggested to define a new device flag, set in alloc_netdev_mq() (so that most devices set it at init time), and carefuly unset in devices which dont want a NULL skb->dst in their ndo_start_xmit(). List of devices that must clear this flag is : - loopback device, because it calls netif_rx() and quoting Patrick : "ip_route_input() doesn't accept loopback addresses, so loopback packets already need to have a dst_entry attached." - appletalk/ipddp.c : needs skb->dst in its xmit function - And all devices that call again dev_queue_xmit() from their xmit function (as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:19:19 -07:00
Patrick McHardy	b1b67dd45a	net: factor out ethtool invocation of vlan/macvlan drivers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-21 02:00:51 -07:00
Patrick McHardy	7816a0a862	vlan/macvlan: fix NULL pointer dereferences in ethtool handlers Check whether the underlying device provides a set of ethtool ops before checking for individual handlers to avoid NULL pointer dereferences. Reported-by: Art van Breemen <ard@telegraafnet.nl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-17 15:59:23 -07:00
Eric Biederman	f9ac30f080	macvlan: Deterministic ingress packet delivery Changing the mac address when a macvlan device is up will leave the device on the wrong hash chain making it impossible to receive packets. There is no checking of the mac address set on the macvlan. Allowing a misconfiguration to grab packets from the the underlying device or another macvlan. To resolve these problems I update the hash table of macvlans when the mac address of a macvlan changes, and when updating the hash table I verify that the new mac address is usable. The result is well defined and predictable if not perfect handling of mac vlan mac addresses. To keep the code clear I have created a set of hash table maintenance in macvlan so I am not open coding the hash function and the logic needed to update the hash table all over the place. Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 13:16:13 -07:00
Eric Biederman	b0832a2961	macvlan: Support creating macvlans from macvlans When running in a network namespace whose only link to the outside world is a macvlan device, not being able to create another macvlan is a real pain. So modify macvlan creation to allow automatically forward a creation of a macvlan on a macvlan to become a creation of a macvlan on the underlying network device. Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 13:15:37 -07:00
David S. Miller	aa2ba5f108	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c drivers/net/smc91x.c	2008-12-02 19:50:27 -08:00
Patrick McHardy	efbbced361	macvlan: don't broadcast PAUSE frames to macvlan devices PAUSE frames are only relevant for the real device, broadcasting them to all macvlan devices can cause a significant load increase. Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-26 15:30:48 -08:00
Stephen Hemminger	008298231a	netdev: add more functions to netdevice ops This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:14:53 -08:00
Stephen Hemminger	54a30c975b	macvlan: convert to net_device_ops Convert to net_device_ops function table. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-19 22:42:40 -08:00
David S. Miller	babcda74e9	drivers/net: Kill now superfluous ->last_rx stores. The generic packet receive code takes care of setting netdev->last_rx when necessary, for the sake of the bonding ARP monitor. Drivers need not do it any more. Some cases had to be skipped over because the drivers were making use of the ->last_rx value themselves. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 21:11:17 -08:00
Stephen Hemminger	9edb8bb68b	macvlan: add support for ethtool get settings If macvlan's are used, it is useful to propgate speed and other settings from underlying device up for application usage. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 15:31:53 -07:00
David S. Miller	cf508b1211	netdev: Handle ->addr_list_lock just like ->_xmit_lock for lockdep. The new address list lock needs to handle the same device layering issues that the _xmit_lock one does. This integrates work done by Patrick McHardy. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-22 14:16:42 -07:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
David S. Miller	e8a0464cc9	netdev: Allocate multiple queues for TX. alloc_netdev_mq() now allocates an array of netdev_queue structures for TX, based upon the queue_count argument. Furthermore, all accesses to the TX queues are now vectored through the netdev_get_tx_queue() and netdev_for_each_tx_queue() interfaces. This makes it easy to grep the tree for all things that want to get to a TX queue of a net device. Problem spots which are not really multiqueue aware yet, and only work with one queue, can easily be spotted by grepping for all netdev_get_tx_queue() calls that pass in a zero index. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:00 -07:00
Wang Chen	b89fb7da2f	macvlan: Check return of dev_set_allmulti allmulti might overflow. Commit: "netdevice: Fix promiscuity and allmulti overflow" in net-next makes dev_set_promiscuity/allmulti return error number if overflow happened. Here, we check the positive increment for allmulti to get error return. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 20:57:07 -07:00
David S. Miller	c773e847ea	netdev: Move _xmit_lock and xmit_lock_owner into netdev_queue. Accesses are mostly structured such that when there are multiple TX queues the code transformations will be a little bit simpler. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:13:53 -07:00
Franck Bui-Huu	82524746c2	rcu: split list.h and move rcu-protected lists into rculist.h Move rcu-protected lists from list.h into a new header file rculist.h. This is done because list are a very used primitive structure all over the kernel and it's currently impossible to include other header files in this list.h without creating some circular dependencies. For example, list.h implements rcu-protected list and uses rcu_dereference() without including rcupdate.h. It actually compiles because users of rcu_dereference() are macros. Others RCU functions could be used too but aren't probably because of this. Therefore this patch creates rculist.h which includes rcupdates without to many changes/troubles. Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Josh Triplett <josh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-19 10:01:37 +02:00
Patrick McHardy	7312096454	macvlan: Fix memleak on device removal/crash on module removal As noticed by Ben Greear, macvlan crashes the kernel when unloading the module. The reason is that it tries to clean up the macvlan_port pointer on the macvlan device itself instead of the underlying device. A non-NULL pointer is taken as indication that the macvlan_handle_frame_hook is valid, when receiving the next packet on the underlying device it tries to call the NULL hook and crashes. Clean up the macvlan_port on the correct device to fix this. Signed-off-by; Patrick McHardy <kaber@trash.net> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-08 01:13:31 -07:00
YOSHIFUJI Hideaki	c346dca108	[NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS. Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:39:53 +09:00
Rami Rosen	52913246e0	[MACVLAN]: Setting macvlan_handle_frame_hook to NULL when rtnl_link_register() fails. In drivers/net/macvlan.c, when rtnl_link_register() fails in macvlan_init_module(), there is no point to set it (second time in this method) to macvlan_handle_frame; macvlan_init_module() will return a negative number, so instead this patch sets macvlan_handle_frame_hook to NULL. Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:25 -08:00
Patrick McHardy	f12ca5f97b	[MACVLAN]: Fix thinko in macvlan_transfer_operstate() When the lower device's carrier is off, the macvlan devices's carrier state should be checked to decide whether it needs to be turned off. Currently the lower device's state is checked a second time. This still works, but unnecessarily tries to turn off the carrier when its already off. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:36 -08:00
Patrick McHardy	ad5d20a639	[MACVLAN]: Allow setting mac address while device is up Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:23 -08:00
Patrick McHardy	59891d53f4	[MACVLAN]: Remove unnecessary IFF_UP check Only devices that are UP are in the hash, so macvlan_broadcast() doesn't need to check for IFF_UP. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:23 -08:00
Patrick McHardy	a6ca5f1dbe	[MACVLAN]: Prevent nesting macvlan devices Don't allow to nest macvlan devices since it will cause lockdep warnings and isn't really useful for anything. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-10 22:39:28 -08:00
Al Viro	47063d6b11	remove duplicate initializer (macvlan) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-14 12:41:51 -07:00
Stephen Hemminger	3b04ddde02	[NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:52 -07:00
Stephen Hemminger	0c4e85813d	[NET]: Wrap netdevice hardware header creation. Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:50 -07:00
Jeff Garzik	88d3aafdae	[ETHTOOL] Provide default behaviors for a few ethtool sub-ioctls For the operations get-tx-csum get-sg get-tso get-ufo the default ethtool_op_xxx behavior is fine for all drivers, so we permit op==NULL to imply the default behavior. This provides a more uniform behavior across all drivers, eliminating ethtool(8) "ioctl not supported" errors on older drivers that had not been updated for the latest sub-ioctls. The ethtool_op_xxx() functions are left exported, in case anyone wishes to call them directly from a driver-private implementation -- a not-uncommon case. Should an ethtool_op_xxx() helper remain unused for a while, except by net/core/ethtool.c, we can un-export it at a later date. [ Resolved conflicts with set/get value ethtool patch... -DaveM ] Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:17 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Patrick McHardy	b863ceb7dd	[NET]: Add macvlan driver Add macvlan driver, which allows to create virtual ethernet devices based on MAC address. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 18:55:06 -07:00

1 2 3

125 Commits