OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Eric W. Biederman	b51642f6d7	net: Enable a userns root rtnl calls that are safe for unprivilged users - Only allow moving network devices to network namespaces you have CAP_NET_ADMIN privileges over. - Enable creating/deleting/modifying interfaces - Enable adding/deleting addresses - Enable adding/setting/deleting neighbour entries - Enable adding/removing routes - Enable adding/removing fib rules - Enable setting the forwarding state - Enable adding/removing ipv6 address labels - Enable setting bridge parameter Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-18 20:33:36 -05:00
Eric W. Biederman	cb99050305	net: Allow userns root to control the network bridge code. Allow an unpriviled user who has created a user namespace, and then created a network namespace to effectively use the new network namespace, by reducing capable(CAP_NET_ADMIN) and capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns, CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls. Allow setting bridge paramters via sysfs. Allow all of the bridge ioctls: BRCTL_ADD_IF BRCTL_DEL_IF BRCTL_SET_BRDIGE_FORWARD_DELAY BRCTL_SET_BRIDGE_HELLO_TIME BRCTL_SET_BRIDGE_MAX_AGE BRCTL_SET_BRIDGE_AGING_TIME BRCTL_SET_BRIDGE_STP_STATE BRCTL_SET_BRIDGE_PRIORITY BRCTL_SET_PORT_PRIORITY BRCTL_SET_PATH_COST BRCTL_ADD_BRIDGE BRCTL_DEL_BRDIGE Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-18 20:33:00 -05:00
Eric W. Biederman	dfc47ef863	net: Push capable(CAP_NET_ADMIN) into the rtnl methods - In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check to ns_capable(net->user-ns, CAP_NET_ADMIN). Allowing unprivileged users to make netlink calls to modify their local network namespace. - In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so that calls that are not safe for unprivileged users are still protected. Later patches will remove the extra capable calls from methods that are safe for unprivilged users. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-18 20:32:44 -05:00
stephen hemminger	1007dd1aa5	bridge: add root port blocking This is Linux bridge implementation of root port guard. If BPDU is received from a leaf (edge) port, it should not be elected as root port. Why would you want to do this? If using STP on a bridge and the downstream bridges are not fully trusted; this prevents a hostile guest for rerouting traffic. Why not just use netfilter? Netfilter does not track of follow spanning tree decisions. It would be difficult and error prone to try and mirror STP resolution in netfilter module. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-14 20:20:44 -05:00
stephen hemminger	a2e01a65cd	bridge: implement BPDU blocking This is Linux bridge implementation of STP protection (Cisco BPDU guard/Juniper BPDU block). BPDU block disables the bridge port if a STP BPDU packet is received. Why would you want to do this? If running Spanning Tree on bridge, hostile devices on the network may send BPDU and cause network failure. Enabling bpdu block will detect and stop this. How to recover the port? The port will be restarted if link is brought down, or removed and reattached. For example: # ip li set dev eth0 down; ip li set dev eth0 up Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-14 20:20:44 -05:00
stephen hemminger	cd7537326e	bridge: add template for bridge port flags Provide macro to build sysfs data structures and functions for accessing flag bits. If flag bits change do netlink notification. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-14 20:20:44 -05:00
stephen hemminger	25c71c75ac	bridge: bridge port parameters over netlink Expose bridge port parameter over netlink. By switching to a nested message, this can be used for other bridge parameters. This changes IFLA_PROTINFO attribute from one byte to a full nested set of attributes. This is safe for application interface because the old message used IFLA_PROTINFO and new one uses IFLA_PROTINFO \| NLA_F_NESTED. The code adapts to old format requests, and therefore stays compatible with user mode RSTP daemon. Since the type field for nested and unnested attributes are different, and the old code in libnetlink doesn't do the mask, it is also safe to use with old versions of bridge monitor command. Note: although mode is only a boolean, treating it as a full byte since in the future someone will probably want to add more values (like macvlan has). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-14 20:20:44 -05:00
Lee Jones	0cb2bbbea0	bridge: Avoid 'statement with no effect' compiler warnings Instead of issuing (0) statements when !CONFIG_SYSFS which will cause 'warning: ', we'll use inline statements instead. This will effectively do the same thing, but suppress any unnecessary warnings. Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: bridge@lists.linux-foundation.org Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-04 01:59:29 -04:00
Ben Hutchings	2bc80059fe	eth: Rename and properly align br_reserved_address array Since this array is no longer part of the bridge driver, it should have an 'eth' prefix not 'br'. We also assume that either it's 16-bit-aligned or the architecture has efficient unaligned access. Ensure the first of these is true by explicitly aligning it. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-02 21:34:05 -04:00
Ben Hutchings	46acc460c0	eth: Make is_link_local() consistent with other address tests Function name should include '_ether_addr'. Return type should be bool. Parameter name should be 'addr' not 'dest' (also matching kernel-doc). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-02 21:34:05 -04:00
Ben Hutchings	4197f24b5b	bridge: Use is_link_local() in store_group_addr() Parse the string into an array of bytes rather than ints, so we can use is_link_local() rather than reimplementing it. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-02 21:34:05 -04:00
David S. Miller	810b6d7638	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to ixgbe, ixgbevf, igbvf, igb and networking core (bridge). Most notably is the addition of support for local link multicast addresses in SR-IOV mode to the networking core. Also note, the ixgbe patch "ixgbe: Add support for pipeline reset" and "ixgbe: Fix return value from macvlan filter function" is revised based on community feedback. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-31 14:26:44 -04:00
John Fastabend	2469ffd723	net: set and query VEB/VEPA bridge mode via PF_BRIDGE Hardware switches may support enabling and disabling the loopback switch which puts the device in a VEPA mode defined in the IEEE 802.1Qbg specification. In this mode frames are not switched in the hardware but sent directly to the switch. SR-IOV capable NICs will likely support this mode I am aware of at least two such devices. Also I am told (but don't have any of this hardware available) that there are devices that only support VEPA modes. In these cases it is important at a minimum to be able to query these attributes. This patch adds an additional IFLA_BRIDGE_MODE attribute that can be set and dumped via the PF_BRIDGE:{SET\|GET}LINK operations. Also anticipating bridge attributes that may be common for both embedded bridges and software bridges this adds a flags attribute IFLA_BRIDGE_FLAGS currently used to determine if the command or event is being generated to/from an embedded bridge or software bridge. Finally, the event generation is pulled out of the bridge module and into rtnetlink proper. For example using the macvlan driver in VEPA mode on top of an embedded switch requires putting the embedded switch into a VEPA mode to get the expected results. -------- -------- \| VEPA \| \| VEPA \| <-- macvlan vepa edge relays -------- -------- \| \| \| \| ------------------ \| VEPA \| <-- embedded switch in NIC ------------------ \| \| ------------------- \| external switch \| <-- shiny new physical ------------------- switch with VEPA support A packet sent from the macvlan VEPA at the top could be loopbacked on the embedded switch and never seen by the external switch. So in order for this to work the embedded switch needs to be set in the VEPA state via the above described commands. By making these attributes nested in IFLA_AF_SPEC we allow future extensions to be made as needed. CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-31 13:18:29 -04:00
John Fastabend	e5a55a8987	net: create generic bridge ops The PF_BRIDGE:RTM_{GET\|SET}LINK nlmsg family and type are currently embedded in the ./net/bridge module. This prohibits them from being used by other bridging devices. One example of this being hardware that has embedded bridging components. In order to use these nlmsg types more generically this patch adds two net_device_ops hooks. One to set link bridge attributes and another to dump the current bride attributes. ndo_bridge_setlink() ndo_bridge_getlink() CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-31 13:18:28 -04:00
John Fastabend	b3343a2a2c	net, ixgbe: handle link local multicast addresses in SR-IOV mode In SR-IOV mode the PF driver acts as the uplink port and is used to send control packets e.g. lldpad, stp, etc. eth0.1 eth0.2 eth0 VF VF PF \| \| \| <-- stand-in for uplink \| \| \| -------------------------- \| Embedded Switch \| -------------------------- \| MAC <-- uplink But the embedded switch is setup to forward multicast addresses to all interfaces both VFs and PF and onto the physical link. This results in reserved MAC addresses used by control protocols to be forwarded over the switch onto the VF. In the LLDP case the PF sends an LLDPDU and it is currently being forwarded to all the VFs who then see the PF as a peer. This is incorrect. This patch adds the multicast addresses to the RAR table in the hardware to prevent this behavior. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2012-10-29 22:31:49 -07:00
Sarveshwar Bandi	6caab7b054	bridge: Pull ip header into skb->data before looking into ip header. If lower layer driver leaves the ip header in the skb fragment, it needs to be first pulled into skb->data before inspecting ip header length or ip version number. Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:50:45 -04:00
stephen hemminger	edc7d57327	netlink: add attributes to fdb interface Later changes need to be able to refer to neighbour attributes when doing fdb_add. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 18:39:44 -04:00
stephen hemminger	6b6e27255f	netdev: make address const in device address management The internal functions for add/deleting addresses don't change their argument. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:35:22 -04:00
David S. Miller	b48b63a1f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-15 11:43:53 -04:00
Joe Perches	16af511a66	netfilter: log: Fix log-level processing auto75914331@hushmail.com reports that iptables does not correctly output the KERN_<level>. $IPTABLES -A RULE_0_in -j LOG --log-level notice --log-prefix "DENY in: " result with linux 3.6-rc5 Sep 12 06:37:29 xxxxx kernel: <5>DENY in: IN=eth0 OUT= MAC=....... result with linux 3.5.3 and older: Sep 9 10:43:01 xxxxx kernel: DENY in: IN=eth0 OUT= MAC...... commit `04d2c8c83d` ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern") updated the syslog header style but did not update netfilter uses. Do so. Use KERN_SOH and string concatenation instead of "%c" KERN_SOH_ASCII as suggested by Eric Dumazet. Signed-off-by: Joe Perches <joe@perches.com> cc: auto75914331@hushmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-12 17:17:35 +02:00
Eric W. Biederman	15e473046c	netlink: Rename pid to portid to avoid confusion It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:30:41 -04:00
Pablo Neira Ayuso	9f00d9776b	netlink: hide struct module parameter in netlink_kernel_create This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:46:30 -04:00
David S. Miller	bf277b0cce	Merge git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== This is the first batch of Netfilter and IPVS updates for your net-next tree. Mostly cleanups for the Netfilter side. They are: * Remove unnecessary RTNL locking now that we have support for namespace in nf_conntrack, from Patrick McHardy. * Cleanup to eliminate unnecessary goto in the initialization path of several Netfilter tables, from Jean Sacren. * Another cleanup from Wu Fengguang, this time to PTR_RET instead of if IS_ERR then return PTR_ERR. * Use list_for_each_entry_continue_rcu in nf_iterate, from Michael Wang. * Add pmtu_disc sysctl option to disable PMTU in their tunneling transmitter, from Julian Anastasov. * Generalize application protocol registration in IPVS and modify IPVS FTP helper to use it, from Julian Anastasov. * update Kconfig. The IPVS FTP helper depends on the Netfilter FTP helper for NAT support, from Julian Anastasov. * Add logic to update PMTU for IPIP packets in IPVS, again from Julian Anastasov. * A couple of sparse warning fixes for IPVS and Netfilter from Claudiu Ghioc and Patrick McHardy respectively. Patrick's IPv6 NAT changes will follow after this batch, I need to flush this batch first before refreshing my tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 18:48:52 -07:00
David S. Miller	1304a7343b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-08-22 14:21:38 -07:00
Stephen Hemminger	c03307eab6	bridge: fix rcu dereference outside of rcu_read_lock Alternative solution for problem found by Linux Driver Verification project (linuxtesting.org). As it noted in the comment before the br_handle_frame_finish function, this function should be called under rcu_read_lock. The problem callgraph: br_dev_xmit -> br_nf_pre_routing_finish_bridge_slow -> -> br_handle_frame_finish -> br_port_get_rcu -> rcu_dereference And in this case there is no read-lock section. Reported-by: Denis Efremov <yefremov.denis@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:09:41 -07:00
Amerigo Wang	e15c3c2294	netpoll: check netpoll tx status on the right device Although this doesn't matter actually, because netpoll_tx_running() doesn't use the parameter, the code will be more readable. For team_dev_queue_xmit() we have to move it down to avoid compile errors. Cc: David Miller <davem@davemloft.net> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:33:32 -07:00
Amerigo Wang	4e3828c4bf	bridge: use list_for_each_entry() in netpoll functions We don't delete 'p' from the list in the loop, so we can just use list_for_each_entry(). Cc: David Miller <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:33:31 -07:00
Amerigo Wang	d30362c071	bridge: add some comments for NETDEV_RELEASE Add comments on why we don't notify NETDEV_RELEASE. Cc: David Miller <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:33:31 -07:00
Amerigo Wang	38e6bc185d	netpoll: make __netpoll_cleanup non-block Like the previous patch, slave_disable_netpoll() and __netpoll_cleanup() may be called with read_lock() held too, so we should make them non-block, by moving the cleanup and kfree() to call_rcu_bh() callbacks. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:33:30 -07:00
Amerigo Wang	47be03a28c	netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup() slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:33:30 -07:00
Wu Fengguang	19e303d67d	netfilter: PTR_RET can be used This quiets the coccinelle warnings: net/bridge/netfilter/ebtable_filter.c:107:1-3: WARNING: PTR_RET can be used net/bridge/netfilter/ebtable_nat.c:107:1-3: WARNING: PTR_RET can be used net/ipv6/netfilter/ip6table_filter.c:65:1-3: WARNING: PTR_RET can be used net/ipv6/netfilter/ip6table_mangle.c💯1-3: WARNING: PTR_RET can be used net/ipv6/netfilter/ip6table_raw.c:44:1-3: WARNING: PTR_RET can be used net/ipv6/netfilter/ip6table_security.c:62:1-3: WARNING: PTR_RET can be used net/ipv4/netfilter/iptable_filter.c:72:1-3: WARNING: PTR_RET can be used net/ipv4/netfilter/iptable_mangle.c:107:1-3: WARNING: PTR_RET can be used net/ipv4/netfilter/iptable_raw.c:51:1-3: WARNING: PTR_RET can be used net/ipv4/netfilter/iptable_security.c:70:1-3: WARNING: PTR_RET can be used Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-14 02:31:47 +02:00
Eric Dumazet	a399a80531	time: jiffies_delta_to_clock_t() helper to the rescue Various /proc/net files sometimes report crazy timer values, expressed in clock_t units. This happens when an expired timer delta (expires - jiffies) is passed to jiffies_to_clock_t(). This function has an overflow in : return div_u64((u64)x * TICK_NSEC, NSEC_PER_SEC / USER_HZ); commit `cbbc719fcc` (time: Change jiffies_to_clock_t() argument type to unsigned long) only got around the problem. As we cant output negative values in /proc/net/tcp without breaking various tools, I suggest adding a jiffies_delta_to_clock_t() wrapper that caps the negative delta to a 0 value. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: hank <pyu@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-09 16:17:03 -07:00
stephen hemminger	5a0d513b62	bridge: make port attributes const Simple table that can be marked const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-30 14:53:22 -07:00
Kevin Groeneveld	e3906486f6	net: fix race condition in several drivers when reading stats Fix race condition in several network drivers when reading stats on 32bit UP architectures. These drivers update their stats in a BH context and therefore should use u64_stats_fetch_begin_bh/u64_stats_fetch_retry_bh instead of u64_stats_fetch_begin/u64_stats_fetch_retry when reading the stats. Signed-off-by: Kevin Groeneveld <kgroeneveld@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-22 12:12:32 -07:00
David S. Miller	a6ff1a2f1e	Merge branch 'nexthop_exceptions' These patches implement the final mechanism necessary to really allow us to go without the route cache in ipv4. We need a place to have long-term storage of PMTU/redirect information which is independent of the routes themselves, yet does not get us back into a situation where we have to write to metrics or anything like that. For this we use an "next-hop exception" table in the FIB nexthops. The one thing I desperately want to avoid is having to create clone routes in the FIB trie for this purpose, because that is very expensive. However, I'm willing to entertain such an idea later if this current scheme proves to have downsides that the FIB trie variant would not have. In order to accomodate this any such scheme, we need to be able to produce a full flow key at PMTU/redirect time. That required an adjustment of the interface call-sites used to propagate these events. For a PMTU/redirect with a fully specified socket, we pass that socket and use it to produce the flow key. Otherwise we use a passed in SKB to formulate the key. There are two cases that need to be distinguished, ICMP message processing (in which case the IP header is at skb->data) and output packet processing (mostly tunnels, and in all such cases the IP header is at ip_hdr(skb)). We also have to make the code able to handle the case where the dst itself passed into the dst_ops->{update_pmtu,redirect} method is invalidated. This matters for calls from sockets that have cached that route. We provide a inet{,6} helper function for this purpose, and edit SCTP specially since it caches routes at the transport rather than socket level. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 10:48:26 -07:00
Jiri Pirko	30fdd8a082	netpoll: move np->dev and np->dev_name init into __netpoll_setup() Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 09:02:36 -07:00
David S. Miller	6700c2709c	net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 03:29:28 -07:00
Thomas Graf	036be6dbcf	bridge: Fix enforcement of multicast hash_max limit The hash size is doubled when it needs to grow and compared against hash_max. The >= comparison will limit the hash table size to half of what is expected i.e. the default 512 hash_max will not allow the hash table to grow larger than 256. Also print the hash table limit instead of the desirable size when the limit is reached. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 22:59:30 -07:00
David S. Miller	b587ee3ba2	net: Add dummy dst_ops->redirect method where needed. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:39:24 -07:00
Li RongQing	4715213d9c	bridge: fix endian mld->mld_maxdelay is net endian, so we should use ntohs, not htons CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 01:31:24 -07:00
David S. Miller	f9d751667f	br_netfilter: Convert to dst_neigh_lookup_skb(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:10:05 -07:00
David S. Miller	f894cbf847	net: Add optional SKB arg to dst_ops->neigh_lookup(). Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:04:01 -07:00
Dan Carpenter	f7eadafb13	netfilter: use kfree_skb() not kfree() This was should be a kfree_skb() here to free the sk_buff pointer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-30 17:26:51 -07:00
Pablo Neira Ayuso	a31f2d17b3	netlink: add netlink_kernel_cfg parameter to netlink_kernel_create This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (input)(struct sk_buff skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 16:46:02 -07:00
David S. Miller	b26d344c6b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/caif/caif_hsi.c drivers/net/usb/qmi_wwan.c The qmi_wwan merge was trivial. The caif_hsi.c, on the other hand, was not. It's a conflict between `1c385f1fdf` ("caif-hsi: Replace platform device with ops structure.") in the net-next tree and commit `39abbaef19` ("caif-hsi: Postpone init of HIS until open()") in the net tree. I did my best with that one and will ask Sjur to check it out. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 17:37:00 -07:00
David S. Miller	62566ca55d	netfilter: ebt_ulog: Move away from NLMSG_PUT(). And use nlmsg_data() while we're here too. Also, free and NULL out skb when nlmsg_put() fails and remove pointless kernel log message. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:23:42 -07:00
stephen hemminger	149ddd83a9	bridge: Assign rtnl_link_ops to bridge devices created via ioctl (v2) This ensures that bridges created with brctl(8) or ioctl(2) directly also carry IFLA_LINKINFO when dumped over netlink. This also allows to create a bridge with ioctl(2) and delete it with RTM_DELLINK. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:12:32 -07:00
Alban Crequy	aa740f46fb	netfilter: bridge: switch hook PFs to nfproto This patch is a cleanup. Use NFPROTO_* for consistency with other netfilter code. Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk> Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:42 +02:00
Eldad Zack	1de5a71c3e	ipv6: correct the ipv6 option name - Pad0 to Pad1 The padding destination or hop-by-hop option is called Pad1 and not Pad0. See RFC2460 (4.2) or the IANA ipv6-parameters registry: http://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xml Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 15:49:51 -04:00
Joe Perches	9a7b6ef9b9	bridge: Convert compare_ether_addr to ether_addr_equal Use the new bool function ether_addr_equal to add some clarity and reduce the likelihood for misuse of compare_ether_addr for sorting. Done via cocci script: $ cat compare_ether_addr.cocci @@ expression a,b; @@ - !compare_ether_addr(a, b) + ether_addr_equal(a, b) @@ expression a,b; @@ - compare_ether_addr(a, b) + !ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) == 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) != 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) == 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) != 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !!ether_addr_equal(a, b) + ether_addr_equal(a, b) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-09 20:49:17 -04:00
Joe Perches	171fe5ef14	bridge: netfilter: Convert compare_ether_addr to ether_addr_equal Use the new bool function ether_addr_equal to add some clarity and reduce the likelihood for misuse of compare_ether_addr for sorting. Done via cocci script: $ cat compare_ether_addr.cocci @@ expression a,b; @@ - !compare_ether_addr(a, b) + ether_addr_equal(a, b) @@ expression a,b; @@ - compare_ether_addr(a, b) + !ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) == 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) != 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) == 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) != 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !!ether_addr_equal(a, b) + ether_addr_equal(a, b) Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-09 20:49:17 -04:00
Pablo Neira Ayuso	4981682cc1	netfilter: bridge: optionally set indev to vlan if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge netfilter removes the vlan header temporarily and then feeds the packet to ip(6)tables. When the new "bridge-nf-pass-vlan-input-device" sysctl is on (default off), then bridge netfilter will also set the in-interface to the vlan interface; if such an interface exists. This is needed to make iptables REDIRECT target work with "vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to match the vlan device name. Also update Documentation with current brnf default settings. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-05-08 19:36:47 +02:00
David S. Miller	0d6c4a2e46	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/e1000e/param.c drivers/net/wireless/iwlwifi/iwl-agn-rx.c drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c drivers/net/wireless/iwlwifi/iwl-trans.h Resolved the iwlwifi conflict with mainline using 3-way diff posted by John Linville and Stephen Rothwell. In 'net' we added a bug fix to make iwlwifi report a more accurate skb->truesize but this conflicted with RX path changes that happened meanwhile in net-next. In e1000e a conflict arose in the validation code for settings of adapter->itr. 'net-next' had more sophisticated logic so that logic was used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-07 23:35:40 -04:00
Herbert Xu	bb63f1f8a0	bridge: Fix fatal typo in setup of multicast_querier_expired Unfortunately it seems that I didn't properly test the case of an expired external querier in the recent multicast bridge series. The setup of the timer in that case is completely broken and leads to a NULL-pointer dereference. This patch fixes it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 13:30:56 -04:00
Peter Huang (Peng)	a881e963c7	set fake_rtable's dst to NULL to avoid kernel Oops bridge: set fake_rtable's dst to NULL to avoid kernel Oops when bridge is deleted before tap/vif device's delete, kernel may encounter an oops because of NULL reference to fake_rtable's dst. Set fake_rtable's dst to NULL before sending packets out can solve this problem. v4 reformat, change br_drop_fake_rtable(skb) to {} v3 enrich commit header v2 introducing new flag DST_FAKE_RTABLE to dst_entry struct. [ Use "do { } while (0)" for nop br_drop_fake_rtable() implementation -DaveM ] Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Peter Huang <peter.huangpeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:16:24 -04:00
Eric W. Biederman	ec8f23ce0f	net: Convert all sysctl registrations to register_net_sysctl This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	5dd3df105b	net: Move all of the network sysctls without a namespace into init_net. This makes it clearer which sysctls are relative to your current network namespace. This makes it a little less error prone by not exposing sysctls for the initial network namespace in other namespaces. This is the same way we handle all of our other network interfaces to userspace and I can't honestly remember why we didn't do this for sysctls right from the start. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:17 -04:00
John Fastabend	77162022ab	net: add generic PF_BRIDGE:RTM_ FDB hooks This adds two new flags NTF_MASTER and NTF_SELF that can now be used to specify where PF_BRIDGE netlink commands should be sent. NTF_MASTER sends the commands to the 'dev->master' device for parsing. Typically this will be the linux net/bridge, or open-vswitch devices. Also without any flags set the command will be handled by the master device as well so that current user space tools continue to work as expected. The NTF_SELF flag will push the PF_BRIDGE commands to the device. In the basic example below the commands are then parsed and programmed in the embedded bridge. Note if both NTF_SELF and NTF_MASTER bits are set then the command will be sent to both 'dev->master' and 'dev' this allows user space to easily keep the embedded bridge and software bridge in sync. There is a slight complication in the case with both flags set when an error occurs. To resolve this the rtnl handler clears the NTF_ flag in the netlink ack to indicate which sets completed successfully. The add/del handlers will abort as soon as any error occurs. To support this new net device ops were added to call into the device and the existing bridging code was refactored to use these. There should be no required changes in user space to support the current bridge behavior. A basic setup with a SR-IOV enabled NIC looks like this, veth0 veth2 \| \| ------------ \| bridge0 \| <---- software bridging ------------ / / ethx.y ethx VF PF \ \ <---- propagate FDB entries to HW \ \ -------------------- \| Embedded Bridge \| <---- hardware offloaded switching -------------------- In this case the embedded bridge must be managed to allow 'veth0' to communicate with 'ethx.y' correctly. At present drivers managing the embedded bridge either send frames onto the network which then get dropped by the switch OR the embedded bridge will flood these frames. With this patch we have a mechanism to manage the embedded bridge correctly from user space. This example is specific to SR-IOV but replacing the VF with another PF or dropping this into the DSA framework generates similar management issues. Examples session using the 'br'[1] tool to add, dump and then delete a mac address with a new "embedded" option and enabled ixgbe driver: # br fdb add 22:35:19:ac:60:59 dev eth3 # br fdb port mac addr flags veth0 22:35:19:ac:60:58 static veth0 9a:5f:81:f7:f6:ec local eth3 00:1b:21:55:23:59 local eth3 22:35:19:ac:60:59 static veth0 22:35:19:ac:60:57 static #br fdb add 22:35:19:ac:60:59 embedded dev eth3 #br fdb port mac addr flags veth0 22:35:19:ac:60:58 static veth0 9a:5f:81:f7:f6:ec local eth3 00:1b:21:55:23:59 local eth3 22:35:19:ac:60:59 static veth0 22:35:19:ac:60:57 static eth3 22:35:19:ac:60:59 local embedded #br fdb del 22:35:19:ac:60:59 embedded dev eth3 I added a couple lines to 'br' to set the flags correctly is all. It is my opinion that the merit of this patch is now embedded and SW bridges can both be modeled correctly in user space using very nearly the same message passing. [1] 'br' tool was published as an RFC here and will be renamed 'bridge' http://patchwork.ozlabs.org/patch/117664/ Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for valuable feedback, suggestions, and review. v2: fixed api descriptions and error case with both NTF_SELF and NTF_MASTER set plus updated patch description. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-15 13:06:04 -04:00
Herbert Xu	c5c2326059	bridge: Add multicast_querier toggle and disable queries by default Sending general queries was implemented as an optimisation to speed up convergence on start-up. In order to prevent interference with multicast routers a zero source address has to be used. Unfortunately these packets appear to cause some multicast-aware switches to misbehave, e.g., by disrupting multicast packets to us. Since the multicast snooping feature still functions without sending our own queries, this patch will change the default to not send queries. For those that need queries in order to speed up convergence on start-up, a toggle is provided to restore the previous behaviour. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-15 12:51:35 -04:00
Herbert Xu	c83b8fab06	bridge: Restart queries when last querier expires As it stands when we discover that a real querier (one that queries with a non-zero source address) we stop querying. However, even after said querier has fallen off the edge of the earth, we will never restart querying (unless the bridge itself is restarted). This patch fixes this by kicking our own querier into gear when the timer for other queriers expire. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-15 12:51:34 -04:00
Herbert Xu	748572162a	bridge: Add br_multicast_start_querier This patch adds the helper br_multicast_start_querier so that the code which starts the queriers in br_multicast_toggle can be reused elsewhere. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-15 12:51:34 -04:00
Eric Dumazet	95c9617472	net: cleanup unsigned to unsigned int Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-15 12:44:40 -04:00
David S. Miller	011e3c6325	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-04-12 19:41:23 -04:00
Herbert Xu	996304bbea	bridge: Do not send queries on multicast group leaves As it stands the bridge IGMP snooping system will respond to group leave messages with queries for remaining membership. This is both unnecessary and undesirable. First of all any multicast routers present should be doing this rather than us. What's more the queries that we send may end up upsetting other multicast snooping swithces in the system that are buggy. In fact, we can simply remove the code that send these queries because the existing membership expiry mechanism doesn't rely on them anyway. So this patch simply removes all code associated with group queries in response to group leave messages. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-11 09:43:13 -04:00
David S. Miller	2eb812e650	bridge: Stop using NLA_PUT*(). These macros contain a hidden goto, and are thus extremely error prone and make code hard to audit. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-02 04:33:44 -04:00
David S. Miller	b2d3298e09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-03-09 14:34:20 -08:00
Paulius Zaleckas	5200959b83	bridge: fix state reporting when port is disabled Now we have: eth0: link down br0: port 1(eth0) entered forwarding state br_log_state(p) should be called after p->state is set to BR_STATE_DISABLED. Reported-by: Zilvinas Valinskas <zilvinas@wilibox.com> Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-08 00:25:25 -08:00
Paulius Zaleckas	d9e179ecec	bridge: br_log_state() s/entering/entered/ When br_log_state() is reporting state it should say "entered" istead of "entering" since state at this point is already changed. Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-08 00:25:25 -08:00
Florian Westphal	739e4505a0	bridge: netfilter: don't call iptables on vlan packets if sysctl is off When net.bridge.bridge-nf-filter-vlan-tagged is 0 (default), vlan packets arriving should not be sent to ip(6)tables by bridge netfilter. However, it turns out that we currently always send VLAN packets to netfilter, if .. a), CONFIG_VLAN_8021Q is enabled ; or b), CONFIG_VLAN_8021Q is not set but rx vlan offload is enabled on the bridge port. This is because bridge netfilter treats skb with skb->protocol == ETH_P_IP{V6} as "non-vlan packet". With rx vlan offload on or CONFIG_VLAN_8021Q=y, the vlan header has already been removed here, and we cannot rely on skb->protocol alone. Fix this by only using skb->protocol if the skb has no vlan tag, or if a vlan tag is present and filter-vlan-tagged bridge netfilter sysctl is enabled. We cannot remove the skb->protocol == htons(ETH_P_8021Q) test because the vlan tag is still around in the CONFIG_VLAN_8021Q=n && "ethtool -K $itf rxvlan off" case. reproducer: iptables -t raw -I PREROUTING -i br0 iptables -t raw -I PREROUTING -i br0.1 Then send packets to an ip address configured on br0.1 interface. Even with net.bridge.bridge-nf-filter-vlan-tagged=0, the 1st rule will match instead of the 2nd one. With this patch applied, the 2nd rule will match instead. In the non-local address case, netfilter won't be consulted after this patch unless the sysctl is switched on. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 14:43:49 -05:00
Pablo Neira Ayuso	a157b9d5b5	netfilter: bridge: fix wrong pointer dereference In adf7ff8, a invalid dereference was added in ebt_make_names. CC [M] net/bridge/netfilter/ebtables.o net/bridge/netfilter/ebtables.c: In function `ebt_make_names': net/bridge/netfilter/ebtables.c:1371:20: warning: `t' may be used uninitialized in this function [-Wuninitialized] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 14:43:49 -05:00
Santosh Nayak	848edc6919	netfilter: ebtables: fix wrong name length while copying to user-space user-space ebtables expects 32 bytes-long names, but xt_match names use 29 bytes. We have to copy less 29 bytes and then, make sure we fill the remaining bytes with zeroes. Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 14:43:49 -05:00
David S. Miller	f6a1ad4295	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vmxnet3/vmxnet3_drv.c Small vmxnet3 conflict with header size bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-05 21:16:26 -05:00
Ulrich Weber	d1d81d4c3d	bridge: check return value of ipv6_dev_get_saddr() otherwise source IPv6 address of ICMPV6_MGM_QUERY packet might be random junk if IPv6 is disabled on interface or link-local address is not yet ready (DAD). Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-05 16:45:34 -05:00
Joakim Tjernlund	709e1b5cd9	bridge: message age needs to increase, not decrease. commit bridge: send proper message_age in config BPDU added this gem: bpdu.message_age = (jiffies - root->designated_age) p->designated_age = jiffies + bpdu->message_age; Notice how bpdu->message_age is negated when reassigned to bpdu.message_age. This causes message age to decrease breaking the STP protocol. Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-04 21:57:40 -05:00
Joakim Tjernlund	aaca735f4f	bridge: Adjust min age inc for HZ > 256 min age increment needs to round up its min age tick for all HZ values to guarantee message age is increasing. Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-04 21:57:39 -05:00
David S. Miller	b4017c5368	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/tg3.c Conflicts in the statistics regression bug fix from 'net', but happily Matt Carlson originally posted the fix against 'net-next' so I used that to resolve this. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-01 17:57:40 -05:00
Florian Westphal	e899b1119f	netfilter: bridge: fix module autoload in compat case We expected 0 if module doesn't exist, which is no longer the case (`42046e2e45`, netfilter: x_tables: return -ENOENT for non-existant matches/targets). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-02-25 14:29:09 +01:00
Danny Kukawka	7ca1e11ab7	br_device: unify return value of .ndo_set_mac_address if address is invalid Unify return value of .ndo_set_mac_address if the given address isn't valid. Return -EADDRNOTAVAIL as eth_mac_addr() already does if is_valid_ether_addr() fails. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-23 17:03:20 -05:00
Danny Kukawka	7ce5d22219	net: use eth_hw_addr_random() and reset addr_assign_type Use eth_hw_addr_random() instead of calling random_ether_addr() to set addr_assign_type correctly to NET_ADDR_RANDOM. Reset the state to NET_ADDR_PERM as soon as the MAC get changed via .ndo_set_mac_address. v2: adapt to renamed eth_hw_addr_random() Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-15 15:34:17 -05:00
Eric Dumazet	27a429383b	bridge: BH already disabled in br_fdb_cleanup() br_fdb_cleanup() is run from timer interrupt, BH already masked. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Štefan Gula <steweg@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-17 10:17:32 -05:00
David S. Miller	abb434cb05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 17:13:56 -05:00
Eric Dumazet	e688a60480	net: introduce DST_NOPEER dst flag Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit `87c48fa3b4` (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-22 22:34:56 -05:00
Eric Dumazet	a13861a28b	bridge: provide a mtu() method for fake_dst_ops Commit `618f9bc74a` (net: Move mtu handling down to the protocol depended handlers) forgot the bridge netfilter case, adding a NULL dereference in ip_fragment(). Reported-by: Chris Boot <bootc@bootc.net> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-22 22:34:56 -05:00
Igor Maravić	e6373c4c0e	net:bridge: use IS_ENABLED Use IS_ENABLED(CONFIG_FOO) instead of defined(CONFIG_FOO) \|\| defined (CONFIG_FOO_MODULE) Signed-off-by: Igor Maravić <igorm@etf.rs> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-16 15:49:52 -05:00
Eric Dumazet	dfd56b8b38	net: use IS_ENABLED(CONFIG_IPV6) Instead of testing defined(CONFIG_IPV6) \|\| defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-11 18:25:16 -05:00
stephen hemminger	4359881338	bridge: add local MAC address to forwarding table (v2) If user has configured a MAC address that is not one of the existing ports of the bridge, then we need to add a special entry in the forwarding table. This forwarding table entry has no outgoing port so it has to be treated a little differently. The special entry is reported by the netlink interface with ifindex of bridge, but ignored by the old interface since there is no usable way to put it in the ABI. Reported-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-08 19:40:28 -05:00
stephen hemminger	31e8a49c16	bridge: rearrange fdb notifications (v2) Pass bridge to fdb_notify so it can determine correct namespace based on namespace of bridge rather than namespace of destination port. Also makes next patch easier. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-08 19:40:28 -05:00
stephen hemminger	f58ee4e1a2	bridge: refactor fdb_notify Move fdb_notify outside of fdb_create. This fixes the problem that notification of local entries are not flagged correctly. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-08 19:40:28 -05:00
David Miller	2721745501	net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>	2011-12-05 15:20:19 -05:00
Jesse Gross	75f2811c64	ipv6: Add fragment reporting to ipv6_skip_exthdr(). While parsing through IPv6 extension headers, fragment headers are skipped making them invisible to the caller. This reports the fragment offset of the last header in order to make it possible to determine whether the packet is fragmented and, if so whether it is a first or last fragment. Signed-off-by: Jesse Gross <jesse@nicira.com>	2011-12-03 09:35:10 -08:00
David S. Miller	b3613118eb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-02 13:49:21 -05:00
Vitalii Demianets	b03b6dd58c	bridge: master device stuck in no-carrier state forever when in user-stp mode When in user-stp mode, bridge master do not follow state of its slaves, so after the following sequence of events it can stuck forever in no-carrier state: 1) turn stp off 2) put all slaves down - master device will follow their state and also go in no-carrier state 3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode) Now bridge master won't follow slaves' state and will never reach running state. This patch solves the problem by making user-stp and kernel-stp behavior similar regarding master following slaves' states. Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-01 14:05:17 -05:00
Alexey Dobriyan	4e3fd7a06d	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 16:43:32 -05:00
David S. Miller	efd0bf97de	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The forcedeth changes had a conflict with the conversion over to atomic u64 statistics in net-next. The libertas cfg.c code had a conflict with the bss reference counting fix by John Linville in net-next. Conflicts: drivers/net/ethernet/nvidia/forcedeth.c drivers/net/wireless/libertas/cfg.c	2011-11-21 13:50:33 -05:00
Michał Mirosław	34324dc2bf	net: remove NETIF_F_NO_CSUM feature bit Only distinct use is checking if NETIF_F_NOCACHE_COPY should be enabled by default. The check heuristics is altered a bit here, so it hits other people than before. The default shouldn't be trusted for performance-critical cases anyway. For all other uses NETIF_F_NO_CSUM is equivalent to NETIF_F_HW_CSUM. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-16 17:43:12 -05:00
Michał Mirosław	c8f44affb7	net: introduce and use netdev_features_t for device features sets v2: add couple missing conversions in drivers split unexporting netdev_fix_features() implemented %pNF convert sock::sk_route_(no?)caps Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-16 17:43:10 -05:00
stephen hemminger	fa2da8cdae	bridge: correct IPv6 checksum after pull Bridge multicast snooping of ICMPv6 would incorrectly report a checksum problem when used with Ethernet devices like sky2 that use CHECKSUM_COMPLETE. When bytes are removed from skb, the computed checksum needs to be adjusted. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Martin Volf <martin.volf.42@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-16 17:32:43 -05:00
stephen hemminger	292d139898	bridge: add NTF_USE support More changes to the recent code to support control of forwarding database via netlink. * Support NTF_USE like neighbour table * Validate state bits from application * Only send notifications (and change bits) if new entry is different. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:41:54 -05:00
Andrey Vagin	ef5e0d8237	bridge: Fix potential deadlock on br->multicast_lock multicast_lock is taken in softirq context, so we should use spin_lock_bh() in userspace. call-chain in softirq context: run_timer_softirq() br_multicast_query_expired() call-chain in userspace: sysfs_write_file() store_multicast_snooping() br_multicast_toggle() Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:38:53 -05:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Joe Perches	0a9ee81349	netfilter: Remove unnecessary OOM logging messages Site specific OOM messages are duplications of a generic MM out of memory message and aren't really useful, so just delete them. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2011-11-01 09:19:49 +01:00
Paul Gortmaker	79bb1ee46a	net: fix implicit kmod.h usage in bridge/br_stp_if.c To fix this, once the implicit presence of module.h is removed: net/bridge/br_stp_if.c: In function ‘br_stp_start’: net/bridge/br_stp_if.c:131: error: implicit declaration of function ‘call_usermodehelper’ net/bridge/br_stp_if.c:131: error: ‘UMH_WAIT_PROC’ undeclared (first use in this function) Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:30 -04:00
Paul Gortmaker	bc3b2d7fb9	net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules These files are non modular, but need to export symbols using the macros now living in export.h -- call out the include so that things won't break when we remove the implicit presence of module.h from everywhere. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:30 -04:00
David S. Miller	1805b2f048	Merge branch 'master' of ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	2011-10-24 18:18:09 -04:00
stephen hemminger	1ce5cce895	bridge: fix hang on removal of bridge via netlink Need to cleanup bridge device timers and ports when being bridge device is being removed via netlink. This fixes the problem of observed when doing: ip link add br0 type bridge ip link set dev eth1 master br0 ip link set br0 up ip link del br0 which would cause br0 to hang in unregister_netdev because of leftover reference count. Reported-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-18 23:24:16 -04:00
David S. Miller	88c5100c28	Merge branch 'master' of github.com:davem330/net Conflicts: net/batman-adv/soft-interface.c	2011-10-07 13:38:43 -04:00
stephen hemminger	515853ccec	bridge: allow forwarding some link local frames This is based on an earlier patch by Nick Carter with comments by David Lamparter but with some refinements. Thanks for their patience this is a confusing area with overlap of standards, user requirements, and compatibility with earlier releases. It adds a new sysfs attribute /sys/class/net/brX/bridge/group_fwd_mask that controls forwarding of frames with address of: 01-80-C2-00-00-0X The default setting has no forwarding to retain compatibility. One change from earlier releases is that forwarding of group addresses is not dependent on STP being enabled or disabled. This choice was made based on interpretation of tie 802.1 standards. I expect complaints will arise because of this, but better to follow the standard than continue acting incorrectly by default. The filtering mask is writeable, but only values that don't forward known control frames are allowed. It intentionally blocks attempts to filter control protocols. For example: writing a 8 allows forwarding 802.1X PAE addresses which is the most common request. Reported-by: David Lamparter <equinox@diac24.net> Original-patch-by: Nick Carter <ncarter100@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-06 15:27:56 -04:00
stephen hemminger	b64b73d7d0	bridge: leave carrier on for empty bridge This resolves a regression seen by some users of bridging. Some users use the bridge like a dummy device. They expect to be able to put an IPv6 address on the device with no ports attached. Although there are better ways of doing this, there is no reason to not allow it. Note: the bridge still will reflect the state of ports in the bridge if there are any added. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-06 15:26:50 -04:00
stephen hemminger	64af1bac9b	bridge: allow updating existing fdb entries Need to allow application to update existing fdb entries that already exist. This makes bridge netlink neighbor API have same flags and semantics as ip neighbor table. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 12:17:33 -04:00
stephen hemminger	77f9859837	bridge: fix ordering of NEWLINK and NEWNEIGH events When port is added to a bridge, the old code would send the new neighbor netlink message before the subsequent new link message. This bug makes it difficult to use the monitoring API in an application. This code changes the ordering to add the forwarding entry after the port is setup. One of the error checks (for invalid address) is moved earlier in the process to avoid having to do unwind. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 12:17:33 -04:00
David S. Miller	8decf86879	Merge branch 'master' of github.com:davem330/net Conflicts: MAINTAINERS drivers/net/Kconfig drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c drivers/net/ethernet/broadcom/tg3.c drivers/net/wireless/iwlwifi/iwl-pci.c drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c drivers/net/wireless/rt2x00/rt2800usb.c drivers/net/wireless/wl12xx/main.c	2011-09-22 03:23:13 -04:00
Jiri Pirko	4bc71cb983	net: consolidate and fix ethtool_ops->get_settings calling This patch does several things: - introduces __ethtool_get_settings which is called from ethtool code and from drivers as well. Put ASSERT_RTNL there. - dev_ethtool_get_settings() is replaced by __ethtool_get_settings() - changes calling in drivers so rtnl locking is respected. In iboe_get_rate was previously ->get_settings() called unlocked. This fixes it. Also prb_calc_retire_blk_tmo() in af_packet.c had the same problem. Also fixed by calling __dev_get_by_index() instead of dev_get_by_index() and holding rtnl_lock for both calls. - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create() so bnx2fc_if_create() and fcoe_if_create() are called locked as they are from other places. - use __ethtool_get_settings() in bonding code Signed-off-by: Jiri Pirko <jpirko@redhat.com> v2->v3: -removed dev_ethtool_get_settings() -added ASSERT_RTNL into __ethtool_get_settings() -prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock around it and __ethtool_get_settings() call v1->v2: add missing export_symbol Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits] Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-15 17:32:26 -04:00
Jiri Pirko	fa3df928e0	br: remove redundant check and init Since these checks and initialization are done in dev_ethtool_get_settings called later on, remove this redundancy. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-15 15:36:34 -04:00
David S. Miller	7858241655	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2011-08-30 17:43:56 -04:00
Eric Dumazet	22df13319d	bridge: fix a possible use after free br_multicast_ipv6_rcv() can call pskb_trim_rcsum() and therefore skb head can be reallocated. Cache icmp6_type field instead of dereferencing twice the struct icmp6hdr pointer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-24 17:49:24 -07:00
Yan, Zheng	4b275d7efa	bridge: Pseudo-header required for the checksum of ICMPv6 Checksum of ICMPv6 is not properly computed because the pseudo header is not used. Thus, the MLD packet gets dropped by the bridge. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Reported-by: Ang Way Chuang <wcang@sfc.wide.ad.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-24 17:49:00 -07:00
Eric Dumazet	11f3a6bdc2	bridge: fix a possible net_device leak Jan Beulich reported a possible net_device leak in bridge code after commit `bb900b27a2` (bridge: allow creating bridge devices with netlink) Reported-by: Jan Beulich <JBeulich@novell.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-22 16:49:56 -07:00
David S. Miller	823dcd2506	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net	2011-08-20 10:39:12 -07:00
Jiri Pirko	afc4b13df1	net: remove use of ndo_set_multicast_list in drivers replace it by ndo_set_rx_mode Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-17 20:22:03 -07:00
Julia Lawall	5189054dd7	net/bridge/netfilter/ebtables.c: use available error handling code Free the locally allocated table and newinfo as done in adjacent error handling code. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-11 05:52:57 -07:00
Andrei Warkentin	9be6dd6510	Bridge: Always send NETDEV_CHANGEADDR up on br MAC change. This ensures the neighbor entries associated with the bridge dev are flushed, also invalidating the associated cached L2 headers. This means we br_add_if/br_del_if ports to implement hand-over and not wind up with bridge packets going out with stale MAC. This means we can also change MAC of port device and also not wind up with bridge packets going out with stale MAC. This builds on Stephen Hemminger's patch, also handling the br_del_if case and the port MAC change case. Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Andrei Warkentin <andreiw@motorola.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-09 21:44:44 -07:00
Stephen Hemminger	a9b3cd7f32	rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER When assigning a NULL value to an RCU protected pointer, no barrier is needed. The rcu_assign_pointer, used to handle that but will soon change to not handle the special case. Convert all rcu_assign_pointer of NULL value. //smpl @@ expression P; @@ - rcu_assign_pointer(P, NULL) + RCU_INIT_POINTER(P, NULL) // </smpl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 04:29:23 -07:00
Bart De Schuymer	9823d9ff48	netfilter: ebtables: fix ebtables build dependency The configuration of ebtables shouldn't depend on CONFIG_BRIDGE_NETFILTER, only on CONFIG_NETFILTER. Reported-by: S�bastien Laveze <slaveze@gmail.com> Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-07-29 16:40:30 +02:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Linus Torvalds	d3ec4844d4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) fs: Merge split strings treewide: fix potentially dangerous trailing ';' in #defined values/expressions uwb: Fix misspelling of neighbourhood in comment net, netfilter: Remove redundant goto in ebt_ulog_packet trivial: don't touch files that are removed in the staging tree lib/vsprintf: replace link to Draft by final RFC number doc: Kconfig: `to be' -> `be' doc: Kconfig: Typo: square -> squared doc: Konfig: Documentation/power/{pm => apm-acpi}.txt drivers/net: static should be at beginning of declaration drivers/media: static should be at beginning of declaration drivers/i2c: static should be at beginning of declaration XTENSA: static should be at beginning of declaration SH: static should be at beginning of declaration MIPS: static should be at beginning of declaration ARM: static should be at beginning of declaration rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check Update my e-mail address PCIe ASPM: forcedly -> forcibly gma500: push through device driver tree ... Fix up trivial conflicts: - arch/arm/mach-ep93xx/dma-m2p.c (deleted) - drivers/gpio/gpio-ep93xx.c (renamed and context nearby) - drivers/net/r8169.c (just context changes)	2011-07-25 13:56:39 -07:00
stephen hemminger	160d73b845	bridge: minor cleanups Some minor cleanups that won't impact code: 1. Remove inline from non-critical functions; compiler will most likely inline them anyway. 2. Make function args const where possible. 3. Whitespace cleanup Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 17:01:13 -07:00
stephen hemminger	4ecb961c8b	bridge: add notification over netlink when STP changes state When STP changes state of interface need to send a new link message to reflect that change. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 17:01:12 -07:00
stephen hemminger	56139fc5bd	bridge: notifier called with the wrong device If a new device is added to a bridge, the ethernet address of the bridge network device may change. When the address changes, the appropriate callback is called, but with the wrong device argument. The address of the bridge device (ie br0) changes not the address of the device being passed to add_if (ie eth0). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 17:01:12 -07:00
stephen hemminger	0652cac22c	bridge: ignore bogus STP config packets If the message_age is already greater than the max_age, then the BPDU is bogus. Linux won't generate BPDU, but conformance tester or buggy implementation might. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 17:01:12 -07:00
stephen hemminger	0c03150e7e	bridge: send proper message_age in config BPDU A bridge topology with three systems: +------+ +------+ \| A(2) \|--\| B(1) \| +------+ +------+ \ / +------+ \| C(3) \| +------+ What is supposed to happen: * bridge with the lowest ID is elected root (for example: B) * C detects that A->C is higher cost path and puts in blocking state What happens. Bridge with lowest id (B) is elected correctly as root and things start out fine initially. But then config BPDU doesn't get transmitted from A -> C. Because of that the link from A-C is transistioned to the forwarding state. The root cause of this is that the configuration messages is generated with bogus message age, and dropped before sending. In the standardmessage_age is supposed to be: the time since the generation of the Configuration BPDU by the Root that instigated the generation of this Configuration BPDU. Reimplement this by recording the timestamp (age + jiffies) when recording config information. The old code incorrectly used the time elapsed on the ageing timer which was incorrect. See also: https://bugzilla.vyatta.com/show_bug.cgi?id=7164 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 17:01:12 -07:00
Jesper Juhl	f0da7ee410	net, netfilter: Remove redundant goto in ebt_ulog_packet In net/bridge/netfilter/ebt_ulog.c:ebt_ulog_packet() the 'goto unlock' before the 'alloc_failure' label is completely redundant. This patch removes it. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-07-21 14:02:17 +02:00
David S. Miller	d3aaeb38c4	net: Add ->neigh_lookup() operation to dst_ops In the future dst entries will be neigh-less. In that environment we need to have an easy transition point for current users of dst->neighbour outside of the packet output fast path. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-18 00:40:17 -07:00
David S. Miller	69cce1d140	net: Abstract dst->neighbour accesses behind helpers. dst_{get,set}_neighbour() Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-17 23:11:35 -07:00
David S. Miller	8f40b161de	neigh: Pass neighbour entry to output ops. This will get us closer to being able to do "neigh stuff" completely independent of the underlying dst_entry for protocols (ipv4/ipv6) that wish to do so. We will also be able to make dst entries neigh-less. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-17 23:11:17 -07:00
David S. Miller	f6b72b6217	net: Embed hh_cache inside of struct neighbour. Now that there is a one-to-one correspondance between neighbour and hh_cache entries, we no longer need: 1) dynamic allocation 2) attachment to dst->hh 3) refcounting Initialization of the hh_cache entry is indicated by hh_len being non-zero, and such initialization is always done with the neighbour's lock held as a writer. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-14 07:53:20 -07:00
David S. Miller	e12fe68ce3	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2011-07-05 23:23:37 -07:00
Herbert Xu	44661462ee	bridge: Always flood broadcast packets As is_multicast_ether_addr returns true on broadcast packets as well, we need to explicitly exclude broadcast packets so that they're always flooded. This wasn't an issue before as broadcast packets were considered to be an unregistered multicast group, which were always flooded. However, as we now only flood such packets to router ports, this is no longer acceptable. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-05 18:39:39 -07:00
Herbert Xu	bd4265fe36	bridge: Only flood unregistered groups to routers The bridge currently floods packets to groups that we have never seen before to all ports. This is not required by RFC4541 and in fact it is not desirable in environment where traffic to unregistered group is always present. This patch changes the behaviour so that we only send traffic to unregistered groups to ports marked as routers. The user can always force flooding behaviour to any given port by marking it as a router. Note that this change does not apply to traffic to 224.0.0.X as traffic to those groups must always be flooded to all ports. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-06-24 17:52:51 -07:00
David S. Miller	9f6ec8d697	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-agn-rxon.c drivers/net/wireless/rtlwifi/pci.c net/netfilter/ipvs/ip_vs_core.c	2011-06-20 22:29:08 -07:00
WANG Cong	cefa9993f1	netpoll: copy dev name of slaves to struct netpoll Otherwise we will not see the name of the slave dev in error message: [ 388.469446] (null): doesn't support polling, aborting. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-06-19 16:13:01 -07:00
Fernando Luis Vázquez Cao	fc2af6c73f	IGMP snooping: set mrouters_only flag for IPv6 traffic properly Upon reception of a MGM report packet the kernel sets the mrouters_only flag in a skb that is a clone of the original skb, which means that the bridge loses track of MGM packets (cb buffers are tied to a specific skb and not shared) and it ends up forwading join requests to the bridge interface. This can cause unexpected membership timeouts and intermitent/permanent loss of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]: A snooping switch should forward IGMP Membership Reports only to those ports where multicast routers are attached. [...] Sending membership reports to other hosts can result, for IGMPv1 and IGMPv2, in unintentionally preventing a host from joining a specific multicast group. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: David S. Miller <davem@conan.davemloft.net>	2011-06-16 23:14:13 -04:00
Fernando Luis Vázquez Cao	62b2bcb49c	IGMP snooping: set mrouters_only flag for IPv4 traffic properly Upon reception of a IGMP/IGMPv2 membership report the kernel sets the mrouters_only flag in a skb that may be a clone of the original skb, which means that sometimes the bridge loses track of membership report packets (cb buffers are tied to a specific skb and not shared) and it ends up forwading join requests to the bridge interface. This can cause unexpected membership timeouts and intermitent/permanent loss of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]: A snooping switch should forward IGMP Membership Reports only to those ports where multicast routers are attached. [...] Sending membership reports to other hosts can result, for IGMPv1 and IGMPv2, in unintentionally preventing a host from joining a specific multicast group. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp> Signed-off-by: David S. Miller <davem@conan.davemloft.net>	2011-06-16 23:14:12 -04:00
Greg Rose	c7ac8679be	rtnetlink: Compute and store minimum ifinfo dump size The message size allocated for rtnl ifinfo dumps was limited to a single page. This is not enough for additional interface info available with devices that support SR-IOV and caused a bug in which VF info would not be displayed if more than approximately 40 VFs were created per interface. Implement a new function pointer for the rtnl_register service that will calculate the amount of data required for the ifinfo dump and allocate enough data to satisfy the request. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2011-06-09 20:38:07 -07:00
Alexander Holler	6407d74c51	bridge: provide a cow_metrics method for fake_ops Like in commit `0972ddb237` (provide cow_metrics() methods to blackhole dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as well. This fixes a regression coming from commits `62fa8a846d` (net: Implement read-only protection and COW'ing of metrics.) and `33eb9873a2` (bridge: initialize fake_rtable metrics) ip link set mybridge mtu 1234 --> [ 136.546243] Pid: 8415, comm: ip Tainted: P 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn /V1Sn [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0 [ 136.546268] EIP is at 0x0 [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1 [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48 [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 task.ti=f15c2000) [ 136.546297] Stack: [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 ffffffa1 f15c3bbc [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 ffffffa6 f15c3be4 [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 00000000 00000000 [ 136.546343] Call Trace: [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge] [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80 [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860 [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70 [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0 [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510 [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100 [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60 [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80 [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210 [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20 [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0 [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30 [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280 [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0 [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100 [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750 [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60 [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0 [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200 [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410 [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0 [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720 [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90 [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180 [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100 [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0 [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60 [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0 [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26 [ 136.546619] Code: Bad EIP value. [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48 [ 136.546645] CR2: 0000000000000000 [ 136.546652] ---[ end trace 6909b560e78934fa ]--- Signed-off-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-06-07 00:51:35 -07:00
David S. Miller	58bf2dbccc	Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6	2011-05-27 13:04:40 -04:00
David Miller	97242c85a2	netfilter: Fix several warnings in compat_mtw_from_user(). Kill set but not used 'entry_offset'. Add a default case to the switch statement so the compiler can see that we always initialize off and size_kern before using them. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2011-05-26 19:09:07 +02:00
Eric Dumazet	33eb9873a2	bridge: initialize fake_rtable metrics bridge netfilter code uses a fake_rtable, and we must init its _metric field or risk NULL dereference later. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-24 13:32:18 -04:00
Eric Dumazet	6df427fe8c	net: remove synchronize_net() from netdev_set_master() In the old days, we used to access dev->master in __netif_receive_skb() in a rcu_read_lock section. So one synchronize_net() call was needed in netdev_set_master() to make sure another cpu could not use old master while/after we release it. We now use netdev_rx_handler infrastructure and added one synchronize_net() call in bond_release()/bond_release_all() Remove the obsolete synchronize_net() from netdev_set_master() and add one in bridge del_nbp() after its netdev_rx_handler_unregister() call. This makes enslave -d a bit faster. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jiri Pirko <jpirko@redhat.com> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-22 21:01:20 -04:00
Amerigo Wang	bb8ed6302b	bridge: call NETDEV_JOIN notifiers when add a slave In the previous patch I added NETDEV_JOIN, now we can notify netconsole when adding a device to a bridge too. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-22 21:01:19 -04:00
David S. Miller	9cbc94eabb	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/vmxnet3/vmxnet3_ethtool.c net/core/dev.c	2011-05-17 17:33:11 -04:00
Stephen Hemminger	cb68552858	bridge: fix forwarding of IPv6 The commit `6b1e960fdb` bridge: Reset IPCB when entering IP stack on NF_FORWARD broke forwarding of IPV6 packets in bridge because it would call bp_parse_ip_options with an IPV6 packet. Reported-by: Noah Meyerhans <noahm@debian.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-13 16:03:24 -04:00
David S. Miller	3c709f8fb4	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-3.6 Conflicts: drivers/net/benet/be_main.c	2011-05-11 14:26:58 -04:00
Florian Westphal	103a9778e0	netfilter: ebtables: only call xt_compat_add_offset once per rule The optimizations in commit `255d0dc340` (netfilter: x_table: speedup compat operations) assume that xt_compat_add_offset is called once per rule. ebtables however called it for each match/target found in a rule. The match/watcher/target parser already returns the needed delta, so it is sufficient to move the xt_compat_add_offset call to a more reasonable location. While at it, also get rid of the unused COMPAT iterator macros. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-05-10 09:52:17 +02:00
Eric Dumazet	5a6351eecf	netfilter: fix ebtables compat support commit `255d0dc340` (netfilter: x_table: speedup compat operations) made ebtables not working anymore. 1) xt_compat_calc_jump() is not an exact match lookup 2) compat_table_info() has a typo in xt_compat_init_offsets() call 3) compat_do_replace() misses a xt_compat_init_offsets() call Reported-by: dann frazier <dannf@dannf.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-05-10 09:48:59 +02:00
Eric Dumazet	e67f88dd12	net: dont hold rtnl mutex during netlink dump callbacks Four years ago, Patrick made a change to hold rtnl mutex during netlink dump callbacks. I believe it was a wrong move. This slows down concurrent dumps, making good old /proc/net/ files faster than rtnetlink in some situations. This occurred to me because one "ip link show dev ..." was _very_ slow on a workload adding/removing network devices in background. All dump callbacks are able to use RCU locking now, so this patch does roughly a revert of commits : `1c2d670f36` : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks `6313c1e099` : [RTNETLINK]: Remove unnecessary locking in dump callbacks This let writers fight for rtnl mutex and readers going full speed. It also takes care of phonet : phonet_route_get() is now called from rcu read section. I renamed it to phonet_route_get_rcu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-02 15:26:28 -07:00
David Decotigny	25db033881	ethtool: Use full 32 bit speed range in ethtool's set_settings This makes sure the ethtool's set_settings() callback of network drivers don't ignore the 16 most significant bits when ethtool calls their set_settings(). All drivers compiled with make allyesconfig on x86_64 have been updated. Signed-off-by: David Decotigny <decot@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-29 14:03:00 -07:00
Michał Mirosław	c4d27ef957	bridge: convert br_features_recompute() to ndo_fix_features Note: netdev_update_features() needs only rtnl_lock as br->port_list is only changed while holding it. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-28 13:33:08 -07:00
David S. Miller	2bd93d7af1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Resolved logic conflicts causing a build failure due to drivers/net/r8169.c changes using a patch from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-26 12:16:46 -07:00
Eric Dumazet	b71d1d426d	inet: constify ip headers and in6_addr Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-22 11:04:14 -07:00
David S. Miller	f01cb5fbea	Revert "bridge: Forward reserved group addresses if !STP" This reverts commit `1e253c3b8a`. It breaks 802.3ad bonding inside of a bridge. The commit was meant to support transport bridging, and specifically virtual machines bridged to an ethernet interface connected to a switch port wiht 802.1x enabled. But this isn't the way to do it, it breaks too many other things. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-21 21:17:25 -07:00
David S. Miller	e1943424e4	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_ethtool.c	2011-04-19 00:21:33 -07:00
Stephen Hemminger	28674b97cf	bridge: fix accidental creation of sysfs directory Commit `bb900b27a2` ("bridge: allow creating bridge devices with netlink") introduced a bug in net-next because of a typo in notifier. Every device would have the sysfs bridge directory (and files). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-17 17:52:51 -07:00
Eric Dumazet	f8e9881c2a	bridge: reset IPCB in br_parse_ip_options Commit `462fb2af97` (bridge : Sanitize skb before it enters the IP stack), missed one IPCB init before calling ip_options_compile() Thanks to Scot Doyle for his tests and bug reports. Reported-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Acked-by: Bandan Das <bandan.das@stratus.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Jan Lübbe <jluebbe@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-12 13:39:14 -07:00
David S. Miller	1c01a80cfe	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/smsc911x.c	2011-04-11 13:44:25 -07:00
Linus Torvalds	42933bac11	Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 * 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings	2011-04-07 11:14:49 -07:00
stephen hemminger	14f98f258f	bridge: range check STP parameters Apply restrictions on STP parameters based 802.1D 1998 standard. * Fixes missing locking in set path cost ioctl * Uses common code for both ioctl and sysfs This is based on an earlier patch Sasikanth V but with overhaul. Note: 1. It does NOT enforce the restriction on the relationship max_age and forward delay or hello time because in existing implementation these are set as independant operations. 2. If STP is disabled, there is no restriction on forward delay 3. No restriction on holding time because users use Linux code to act as hub or be sticky. 4. Although standard allow 0-255, Linux only allows 0-63 for port priority because more bits are reserved for port number. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:29 -07:00
stephen hemminger	bb900b27a2	bridge: allow creating bridge devices with netlink Add netlink device ops to allow creating bridge device via netlink. This works in a manner similar to vlan, macvlan and bonding. Example: # ip link add link dev br0 type bridge # ip link del dev br0 The change required rearranging initializtion code to deal with being called by create link. Most of the initialization happens in br_dev_setup, but allocation of stats is done in ndo_init callback to deal with allocation failure. Sysfs setup has to wait until after the network device kobject is registered. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:28 -07:00
stephen hemminger	36fd2b63e3	bridge: allow creating/deleting fdb entries via netlink Use RTM_NEWNEIGH and RTM_DELNEIGH to allow updating of entries in bridge forwarding table. This allows manipulating static entries which is not possible with existing tools. Example (using bridge extensions to iproute2) # br fdb add 00:02:03:04:05:06 dev eth0 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:28 -07:00
stephen hemminger	b078f0df67	bridge: add netlink notification on forward entry changes This allows applications to query and monitor bridge forwarding table in the same method used for neighbor table. The forward table entries are returned in same structure format as used by the ioctl. If more information is desired in future, the netlink method is extensible. Example (using bridge extensions to iproute2) # br monitor Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:27 -07:00
stephen hemminger	664de48bb6	bridge: split rcu and no-rcu cases of fdb lookup In some cases, look up of forward database entry is done with RCU; and for others no RCU is needed because of locking. Split the two cases into two differnt loops (and take off inline). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:26 -07:00
stephen hemminger	7cd8861ab0	bridge: track last used time in forwarding table Adds tracking the last used time in forwarding table. Rename ageing_timer to updated to better describe it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:26 -07:00
stephen hemminger	03e9b64b89	bridge: change arguments to fdb_create Later patch provides ability to create non-local static entry. To make this easier move the updating of the flag values to after the code that creates entry. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-04 17:22:25 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Linus Lüssing	ff9a57a62a	bridge: mcast snooping, fix length check of snooped MLDv1/2 "len = ntohs(ip6h->payload_len)" does not include the length of the ipv6 header itself, which the rest of this function assumes, though. This leads to a length check less restrictive as it should be in the following line for one thing. For another, it very likely leads to an integer underrun when substracting the offset and therefore to a very high new value of 'len' due to its unsignedness. This will ultimately lead to the pskb_trim_rcsum() practically never being called, even in the cases where it should. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-30 02:28:20 -07:00
Balaji G	1459a3cc51	bridge: Fix compilation warning in function br_stp_recalculate_bridge_id() net/bridge/br_stp_if.c: In function ‘br_stp_recalculate_bridge_id’: net/bridge/br_stp_if.c:216:3: warning: ‘return’ with no value, in function returning non-void Signed-off-by: G.Balaji <balajig81@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-29 23:37:23 -07:00
stephen hemminger	edf947f100	bridge: notify applications if address of bridge device changes The mac address of the bridge device may be changed when a new interface is added to the bridge. If this happens, then the bridge needs to call the network notifiers to tickle any other systems that care. Since bridge can be a module, this also means exporting the notifier function. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-27 23:35:02 -07:00
Linus Lüssing	a7bff75b08	bridge: Fix possibly wrong MLD queries' ethernet source address The ipv6_dev_get_saddr() is currently called with an uninitialized destination address. Although in tests it usually seemed to nevertheless always fetch the right source address, there seems to be a possible race condition. Therefore this commit changes this, first setting the destination address and only after that fetching the source address. Reported-by: Jan Beulich <JBeulich@novell.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-22 19:26:42 -07:00
Herbert Xu	6b1e960fdb	bridge: Reset IPCB when entering IP stack on NF_FORWARD Whenever we enter the IP stack proper from bridge netfilter we need to ensure that the skb is in a form the IP stack expects it to be in. The entry point on NF_FORWARD did not meet the requirements of the IP stack, therefore leading to potential crashes/panics. This patch fixes the problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-18 15:13:12 -07:00
Jiri Pirko	8a4eb5734e	net: introduce rx_handler results and logic around that This patch allows rx_handlers to better signalize what to do next to it's caller. That makes skb->deliver_no_wcard no longer needed. kernel-doc for rx_handler_result is taken from Nicolas' patch. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-16 12:53:54 -07:00
David S. Miller	c337ffb68e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2011-03-15 15:15:17 -07:00
stephen hemminger	a461c0297f	bridge: skip forwarding delay if not using STP If Spanning Tree Protocol is not enabled, there is no good reason for the bridge code to wait for the forwarding delay period before enabling the link. The purpose of the forwarding delay is to allow STP to learn about other bridges before nominating itself. The only possible impact is that when starting up a new port the bridge may flood a packet now, where previously it might have seen traffic from the other host and preseeded the forwarding table. Includes change for local variable br already available in that func. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-14 15:06:49 -07:00
stephen hemminger	1faa4356a3	bridge: control carrier based on ports online This makes the bridge device behave like a physical device. In earlier releases the bridge always asserted carrier. This changes the behavior so that bridge device carrier is on only if one or more ports are in the forwarding state. This should help IPv6 autoconfiguration, DHCP, and routing daemons. I did brief testing with Network and Virt manager and they seem fine, but since this changes behavior of bridge, it should wait until net-next (2.6.39). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Tested-By: Adam Majer <adamm@zombino.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-14 14:29:02 -07:00
David S. Miller	78fbfd8a65	ipv4: Create and use route lookup helpers. The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:42 -08:00
David S. Miller	33175d84ee	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_cmn.c	2011-03-10 14:26:00 -08:00
Randy Dunlap	dcbcdf22f5	net: bridge builtin vs. ipv6 modular When configs BRIDGE=y and IPV6=m, this build error occurs: br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr' BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding depends on IPV6 \|\| IPV6=n to BRIDGE_IGMP_SNOOPING would be a good fix. As it is currently, making BRIDGE depend on the IPV6 config works. Reported-by: Patrick Schaaf <netdev@bof.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-10 13:45:57 -08:00
David S. Miller	0a0e9ae1bd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h	2011-03-03 21:27:42 -08:00
David S. Miller	b23dd4fe42	ipv4: Make output route lookup return rtable directly. Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-02 14:31:35 -08:00
David S. Miller	3872b28408	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2011-03-02 11:30:24 -08:00
Linus Lüssing	fe29ec41aa	bridge: Use IPv6 link-local address for multicast listener queries Currently the bridge multicast snooping feature periodically issues IPv6 general multicast listener queries to sense the absence of a listener. For this, it uses :: as its source address - however RFC 2710 requires: "To be valid, the Query message MUST come from a link-local IPv6 Source Address". Current Linux kernel versions seem to follow this requirement and ignore our bogus MLD queries. With this commit a link local address from the bridge interface is being used to issue the MLD query, resulting in other Linux devices which are multicast listeners in the network to respond with a MLD response (which was not the case before). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:29 -08:00
Linus Lüssing	36cff5a10c	bridge: Fix MLD queries' ethernet source address Map the IPv6 header's destination multicast address to an ethernet source address instead of the MLD queries multicast address. For instance for a general MLD query (multicast address in the MLD query set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although an MLD queries destination MAC should always be 33:33:00:00:00:01 which matches the IPv6 header's multicast destination ff02::1. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:28 -08:00
Linus Lüssing	e4de9f9e83	bridge: Allow mcast snooping for transient link local addresses too Currently the multicast bridge snooping support is not active for link local multicast. I assume this has been done to leave important multicast data untouched, like IPv6 Neighborhood Discovery. In larger, bridged, local networks it could however be desirable to optimize for instance local multicast audio/video streaming too. With the transient flag in IPv6 multicast addresses we have an easy way to optimize such multimedia traffic without tempering with the high priority multicast data from well-known addresses. This patch alters the multicast bridge snooping for IPv6, to take effect for transient multicast addresses instead of non-link-local addresses. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:28 -08:00
Linus Lüssing	d41db9f3f7	bridge: Add missing ntohs()s for MLDv2 report parsing The nsrcs number is 2 Byte wide, therefore we need to call ntohs() before using it. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:27 -08:00
Linus Lüssing	649e984d00	bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report We actually want a pointer to the grec_nsrcr and not the following field. Otherwise we can get very high values for *nsrcs as the first two bytes of the IPv6 multicast address are being used instead, leading to a failing pskb_may_pull() which results in MLDv2 reports not being parsed. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:26 -08:00
Linus Lüssing	9cc6e0c4c4	bridge: Fix IPv6 multicast snooping by storing correct protocol type The protocol type for IPv6 entries in the hash table for multicast bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4 address, instead of setting it to ETH_P_IPV6, which results in negative look-ups in the hash table later. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:26 -08:00
David S. Miller	da935c66ba	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/e1000e/netdev.c net/xfrm/xfrm_policy.c	2011-02-19 19:17:35 -08:00
Vasiliy Kulikov	d846f71195	bridge: netfilter: fix information leak Struct tmp is copied from userspace. It is not checked whether the "name" field is NULL terminated. This may lead to buffer overflow and passing contents of kernel stack as a module name to try_then_request_module() and, consequently, to modprobe commandline. It would be seen by all userspace processes. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-02-14 16:49:23 +01:00
Jiri Pirko	afc6151a78	bridge: implement [add/del]_slave ops add possibility to addif/delif via rtnetlink Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-13 16:58:40 -08:00
Herbert Xu	8a870178c0	bridge: Replace mp->mglist hlist with a bool As it turns out we never need to walk through the list of multicast groups subscribed by the bridge interface itself (the only time we'd want to do that is when we shut down the bridge, in which case we simply walk through all multicast groups), we don't really need to keep an hlist for mp->mglist. This means that we can replace it with just a single bit to indicate whether the bridge interface is subscribed to a group. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-12 01:05:42 -08:00
Herbert Xu	24f9cdcbd7	bridge: Fix timer typo that may render snooping less effective In a couple of spots where we are supposed to modify the port group timer (p->timer) we instead modify the bridge interface group timer (mp->timer). The effect of this is mostly harmless. However, it can cause port subscriptions to be longer than they should be, thus making snooping less effective. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-11 21:59:37 -08:00
Herbert Xu	6b0d6a9b42	bridge: Fix mglist corruption that leads to memory corruption The list mp->mglist is used to indicate whether a multicast group is active on the bridge interface itself as opposed to one of the constituent interfaces in the bridge. Unfortunately the operation that adds the mp->mglist node to the list neglected to check whether it has already been added. This leads to list corruption in the form of nodes pointing to itself. Normally this would be quite obvious as it would cause an infinite loop when walking the list. However, as this list is never actually walked (which means that we don't really need it, I'll get rid of it in a subsequent patch), this instead is hidden until we perform a delete operation on the affected nodes. As the same node may now be pointed to by more than one node, the delete operations can then cause modification of freed memory. This was observed in practice to cause corruption in 512-byte slabs, most commonly leading to crashes in jbd2. Thanks to Josef Bacik for pointing me in the right direction. Reported-by: Ian Page Hands <ihands@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-11 21:59:37 -08:00

... 2 3 4 5 6 ...

923 Commits