Commit Graph

233 Commits

Author SHA1 Message Date
Roopa Prabhu 821f1b21ca bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
This patch adds a new bridge port flag BR_NEIGH_SUPPRESS to
suppress arp and nd flood on bridge ports. It implements
rfc7432, section 10.
https://tools.ietf.org/html/rfc7432#section-10
for ethernet VPN deployments. It is similar to the existing
BR_PROXYARP* flags but has a few semantic differences to conform
to EVPN standard. Unlike the existing flags, this new flag suppresses
flood of all neigh discovery packets (arp and nd) to tunnel ports.
Supports both vlan filtering and non-vlan filtering bridges.

In case of EVPN, it is mainly used to avoid flooding
of arp and nd packets to tunnel ports like vxlan.

This patch adds netlink and sysfs support to set this bridge port
flag.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:12:04 -07:00
Nikolay Aleksandrov 5af48b59f3 net: bridge: add per-port group_fwd_mask with less restrictions
We need to be able to transparently forward most link-local frames via
tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a
mask which restricts the forwarding of STP and LACP, but we need to be able
to forward these over tunnels and control that forwarding on a per-port
basis thus add a new per-port group_fwd_mask option which only disallows
mac pause frames to be forwarded (they're always dropped anyway).
The patch does not change the current default situation - all of the others
are still restricted unless configured for forwarding.
We have successfully tested this patch with LACP and STP forwarding over
VxLAN and qinq tunnels.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:02:55 +01:00
Matthias Schiffer 17dd0ec470 net: add netlink_ext_ack argument to rtnl_link_ops.slave_changelink
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:22 -04:00
Matthias Schiffer a8b8a889e3 net: add netlink_ext_ack argument to rtnl_link_ops.validate
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:22 -04:00
Matthias Schiffer ad744b223c net: add netlink_ext_ack argument to rtnl_link_ops.changelink
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:22 -04:00
Matthias Schiffer 7a3f4a1851 net: add netlink_ext_ack argument to rtnl_link_ops.newlink
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:21 -04:00
Arkadi Sharshevsky 3922285d96 net: bridge: Add support for offloading port attributes
Currently the flood, learning and learning_sync port attributes are
offloaded by setting the SELF flag. Add support for offloading the
flood and learning attribute through the bridge code. In case of
setting an unsupported flag on a offloded port the operation will
fail.

The learning_sync attribute doesn't have any software representation
and cannot be offloaded through the bridge code.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 14:16:24 -04:00
David S. Miller 216fe8f021 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 22:20:08 -04:00
Nikolay Aleksandrov 1020ce3108 net: bridge: fix a null pointer dereference in br_afspec
We might call br_afspec() with p == NULL which is a valid use case if
the action is on the bridge device itself, but the bridge tunnel code
dereferences the p pointer without checking, so check if p is null
first.

Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Fixes: efa5356b0d ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 16:05:31 -04:00
Ido Schimmel 1f51445af3 bridge: Export VLAN filtering state
It's useful for drivers supporting bridge offload to be able to query
the bridge's VLAN filtering state.

Currently, upon enslavement to a bridge master, the offloading driver
will only learn about the bridge's VLAN filtering state after the bridge
device was already linked with its slave.

Being able to query the bridge's VLAN filtering state allows such
drivers to forbid enslavement in case resource couldn't be allocated for
a VLAN-aware bridge and also choose the correct initialization routine
for the enslaved port, which is dependent on the bridge type.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-26 15:18:44 -04:00
Tobias Jungel a285860211 bridge: netlink: check vlan_default_pvid range
Currently it is allowed to set the default pvid of a bridge to a value
above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and
returns -EINVAL in case the pvid is out of bounds.

Reproduce by calling:

[root@test ~]# ip l a type bridge
[root@test ~]# ip l a type dummy
[root@test ~]# ip l s bridge0 type bridge vlan_filtering 1
[root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 9999
[root@test ~]# ip l s dummy0 master bridge0
[root@test ~]# bridge vlan
port	vlan ids
bridge0	 9999 PVID Egress Untagged

dummy0	 9999 PVID Egress Untagged

Fixes: 0f963b7592 ("bridge: netlink: add support for default_pvid")
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18 10:15:00 -04:00
Tobias Klauser 9051247dcf bridge: netlink: account for IFLA_BRPORT_{B, M}CAST_FLOOD size and policy
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and
IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size()
when they were added. Do so now and also add the corresponding policy
entries:

Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Mike Manning <mmanning@brocade.com>
Fixes: b6cb5ac833 ("net: bridge: add per-port multicast flood flag")
Fixes: 99f906e9ad ("bridge: add per-port broadcast flood flag")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-05 11:21:03 -04:00
Mike Manning 99f906e9ad bridge: add per-port broadcast flood flag
Support for l2 multicast flood control was added in commit b6cb5ac833
("net: bridge: add per-port multicast flood flag"). It allows broadcast
as it was introduced specifically for unknown multicast flood control.
But as broadcast is a special case of multicast, this may also need to
be disabled. For this purpose, introduce a flag to disable the flooding
of received l2 broadcasts. This approach is backwards compatible and
provides flexibility in filtering for the desired packet types.

Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27 16:34:29 -04:00
David S. Miller 6b6cbc1471 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were simply overlapping changes.  In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.

In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15 21:16:30 -04:00
Johannes Berg fceb6435e8 netlink: pass extended ACK struct to parsing functions
Pass the new extended ACK reporting struct to all of the generic
netlink parsing functions. For now, pass NULL in almost all callers
(except for some in the core.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:22 -04:00
Ido Schimmel 5b8d5429da bridge: netlink: register netdevice before executing changelink
Peter reported a kernel oops when executing the following command:

$ ip link add name test type bridge vlan_default_pvid 1

[13634.939408] BUG: unable to handle kernel NULL pointer dereference at
0000000000000190
[13634.939436] IP: __vlan_add+0x73/0x5f0
[...]
[13634.939783] Call Trace:
[13634.939791]  ? pcpu_next_unpop+0x3b/0x50
[13634.939801]  ? pcpu_alloc+0x3d2/0x680
[13634.939810]  ? br_vlan_add+0x135/0x1b0
[13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
[13634.939834]  ? br_changelink+0x120/0x4e0
[13634.939844]  ? br_dev_newlink+0x50/0x70
[13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
[13634.939864]  ? rtnl_newlink+0x176/0x8a0
[13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
[13634.939896]  ? lookup_fast+0x52/0x370
[13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
[13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
[13634.939925]  ? rtnetlink_rcv+0x24/0x30
[13634.939934]  ? netlink_unicast+0x177/0x220
[13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
[13634.939954]  ? _copy_from_user+0x39/0x40
[13634.939964]  ? sock_sendmsg+0x30/0x40
[13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
[13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
[13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
[13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
[13634.940809]  ? __sys_sendmsg+0x51/0x90
[13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad

The problem is that the bridge's VLAN group is created after setting the
default PVID, when registering the netdevice and executing its
ndo_init().

Fix this by changing the order of both operations, so that
br_changelink() is only processed after the netdevice is registered,
when the VLAN group is already initialized.

Fixes: b6677449df ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Peter V. Saveliev <peter@svinota.eu>
Tested-by: Peter V. Saveliev <peter@svinota.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 22:22:44 -04:00
Colin Ian King 1f02b5f42f net: bridge: remove redundant check to see if err is set
The error check on err is redundant as it is being checked
previously each time it has been updated.  Remove this redundant
check.

Detected with CoverityScan, CID#140030("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07 14:04:29 -05:00
Nikolay Aleksandrov f7cdee8a79 bridge: move to workqueue gc
Move the fdb garbage collector to a workqueue which fires at least 10
milliseconds apart and cleans chain by chain allowing for other tasks
to run in the meantime. When having thousands of fdbs the system is much
more responsive. Most importantly remove the need to check if the
matched entry has expired in __br_fdb_get that causes false-sharing and
is completely unnecessary if we cleanup entries, at worst we'll get 10ms
of traffic for that entry before it gets deleted.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06 22:53:13 -05:00
Roopa Prabhu efa5356b0d bridge: per vlan dst_metadata netlink support
This patch adds support to attach per vlan tunnel info dst
metadata. This enables bridge driver to map vlan to tunnel_info
at ingress and egress. It uses the kernel dst_metadata infrastructure.

The initial use case is vlan to vni bridging, but the api is generic
to extend to any tunnel_info in the future:
    - Uapi to configure/unconfigure/dump per vlan tunnel data
    - netlink functions to configure vlan and tunnel_info mapping
    - Introduces bridge port flag BR_LWT_VLAN to enable attach/detach
    dst_metadata to bridged packets on ports. off by default.
    - changes to existing code is mainly refactor some existing vlan
    handling netlink code + hooks for new vlan tunnel code
    - I have kept the vlan tunnel code isolated in separate files.
    - most of the netlink vlan tunnel code is handling of vlan-tunid
    ranges (follows the vlan range handling code). To conserve space
    vlan-tunid by default are always dumped in ranges if applicable.

Use case:
example use for this is a vxlan bridging gateway or vtep
which maps vlans to vn-segments (or vnis).

iproute2 example (patched and pruned iproute2 output to just show
relevant fdb entries):
example shows same host mac learnt on two vni's and
vlan 100 maps to vni 1000, vlan 101 maps to vni 1001

before (netdev per vni):
$bridge fdb show | grep "00:02:00:00:00:03"
00:02:00:00:00:03 dev vxlan1001 vlan 101 master bridge
00:02:00:00:00:03 dev vxlan1001 dst 12.0.0.8 self
00:02:00:00:00:03 dev vxlan1000 vlan 100 master bridge
00:02:00:00:00:03 dev vxlan1000 dst 12.0.0.8 self

after this patch with collect metdata in bridged mode (single netdev):
$bridge fdb show | grep "00:02:00:00:00:03"
00:02:00:00:00:03 dev vxlan0 vlan 101 master bridge
00:02:00:00:00:03 dev vxlan0 src_vni 1001 dst 12.0.0.8 self
00:02:00:00:00:03 dev vxlan0 vlan 100 master bridge
00:02:00:00:00:03 dev vxlan0 src_vni 1000 dst 12.0.0.8 self

CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03 15:21:22 -05:00
David S. Miller 4e8f2fc1a5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two trivial overlapping changes conflicts in MPLS and mlx5.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-28 10:33:06 -05:00
Felix Fietkau 6db6f0eae6 bridge: multicast to unicast
Implements an optional, per bridge port flag and feature to deliver
multicast packets to any host on the according port via unicast
individually. This is done by copying the packet per host and
changing the multicast destination MAC to a unicast one accordingly.

multicast-to-unicast works on top of the multicast snooping feature of
the bridge. Which means unicast copies are only delivered to hosts which
are interested in it and signalized this via IGMP/MLD reports
previously.

This feature is intended for interface types which have a more reliable
and/or efficient way to deliver unicast packets than broadcast ones
(e.g. wifi).

However, it should only be enabled on interfaces where no IGMPv2/MLDv1
report suppression takes place. This feature is disabled by default.

The initial patch and idea is from Felix Fietkau.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[linus.luessing@c0d3.blue: various bug + style fixes, commit message]
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 12:39:52 -05:00
Ivan Vecera b6677449df bridge: netlink: call br_changelink() during br_dev_newlink()
Any bridge options specified during link creation (e.g. ip link add)
are ignored as br_dev_newlink() does not process them.
Use br_changelink() to do it.

Fixes: 1332351617 ("bridge: implement rtnl_link_ops->changelink")
Signed-off-by: Ivan Vecera <cera@cera.cz>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20 15:07:27 -05:00
Nikolay Aleksandrov aa2ae3e71c bridge: mcast: add MLDv2 querier support
This patch adds basic support for MLDv2 queries, the default is MLDv1
as before. A new multicast option - multicast_mld_version, adds the
ability to change it between 1 and 2 via netlink and sysfs.
The MLD option is disabled if CONFIG_IPV6 is disabled.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Nikolay Aleksandrov 5e9235853d bridge: mcast: add IGMPv3 query support
This patch adds basic support for IGMPv3 queries, the default is IGMPv2
as before. A new multicast option - multicast_igmp_version, adds the
ability to change it between 2 and 3 via netlink and sysfs. The option
struct member is in a 4 byte hole in net_bridge.

There also a few minor style adjustments in br_multicast_new_group and
br_multicast_add_group.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Nikolay Aleksandrov b6cb5ac833 net: bridge: add per-port multicast flood flag
Add a per-port flag to control the unknown multicast flood, similar to the
unknown unicast flood flag and break a few long lines in the netlink flag
exports.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-01 22:48:33 -07:00
Nikolay Aleksandrov 72f4af4e47 net: bridge: export also pvid flag in the xstats flags
When I added support to export the vlan entry flags via xstats I forgot to
add support for the pvid since it is manually matched, so check if the
entry matches the vlan_group's pvid and set the flag appropriately.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 11:45:28 -07:00
Nikolay Aleksandrov 61ba1a2da9 net: bridge: export vlan flags with the stats
Use one of the vlan xstats padding fields to export the vlan flags. This is
needed in order to be able to distinguish between master (bridge) and port
vlan entries in user-space when dumping the bridge vlan stats.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:18:42 -07:00
Nikolay Aleksandrov d5ff8c41b5 net: bridge: consolidate bridge and port linkxstats calls
In the bridge driver we usually have the same function working for both
port and bridge. In order to follow that logic and also avoid code
duplication, consolidate the bridge_ and brport_ linkxstats calls into
one since they share most of their code. As a side effect this allows us
to dump the vlan stats also via the slave call which is in preparation for
the upcoming per-port vlan stats and vlan flag dumping.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:18:42 -07:00
Nikolay Aleksandrov 1080ab95e3 net: bridge: add support for IGMP/MLD stats and export them via netlink
This patch adds stats support for the currently used IGMP/MLD types by the
bridge. The stats are per-port (plus one stat per-bridge) and per-direction
(RX/TX). The stats are exported via netlink via the new linkxstats API
(RTM_GETSTATS). In order to minimize the performance impact, a new option
is used to enable/disable the stats - multicast_stats_enabled, similar to
the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
lookups and checks, we make use of the current "igmp" member of the bridge
private skb->cb region to record the type on Rx (both host-generated and
external packets pass by multicast_rcv()). We can do that since the igmp
member was used as a boolean and all the valid IGMP/MLD types are positive
values. The normal bridge fast-path is not affected at all, the only
affected paths are the flooding ones and since we make use of the IGMP/MLD
type, we can quickly determine if the packet should be counted using
cache-hot data (cb's igmp member). We add counters for:
* IGMP Queries
* IGMP Leaves
* IGMP v1/v2/v3 reports

* MLD Queries
* MLD Leaves
* MLD v1/v2 reports

These are invaluable when monitoring or debugging complex multicast setups
with bridges.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:18:24 -04:00
Nikolay Aleksandrov 80e73cc563 net: rtnetlink: add support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
This patch adds support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
which allows to export per-slave statistics if the master device supports
the linkxstats callback. The attribute is passed down to the linkxstats
callback and it is up to the callback user to use it (an example has been
added to the only current user - the bridge). This allows us to query only
specific slaves of master devices like bridge ports and export only what
we're interested in instead of having to dump all ports and searching only
for a single one. This will be used to export per-port IGMP/MLD stats and
also per-port vlan stats in the future, possibly other statistics as well.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:15:04 -04:00
Nikolay Aleksandrov 565ce8f32a net: bridge: fix vlan stats continue counter
I made a dumb off-by-one mistake when I added the vlan stats counter
dumping code. The increment should happen before the check, not after
otherwise we miss one entry when we continue dumping.

Fixes: a60c090361 ("bridge: netlink: export per-vlan stats")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:33:35 -04:00
Nikolay Aleksandrov a60c090361 bridge: netlink: export per-vlan stats
Add a new LINK_XSTATS_TYPE_BRIDGE attribute and implement the
RTM_GETSTATS callbacks for IFLA_STATS_LINK_XSTATS (fill_linkxstats and
get_linkxstats_size) in order to export the per-vlan stats.
The paddings were added because soon these fields will be needed for
per-port per-vlan stats (or something else if someone beats me to it) so
avoiding at least a few more netlink attributes.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 22:27:06 -04:00
Nikolay Aleksandrov 6dada9b10a bridge: vlan: learn to count
Add support for per-VLAN Tx/Rx statistics. Every global vlan context gets
allocated a per-cpu stats which is then set in each per-port vlan context
for quick access. The br_allowed_ingress() common function is used to
account for Rx packets and the br_handle_vlan() common function is used
to account for Tx packets. Stats accounting is performed only if the
bridge-wide vlan_stats_enabled option is set either via sysfs or netlink.
A struct hole between vlan_enabled and vlan_proto is used for the new
option so it is in the same cache line. Currently it is binary (on/off)
but it is intentionally restricted to exactly 0 and 1 since other values
will be used in the future for different purposes (e.g. per-port stats).

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 22:27:06 -04:00
Nicolas Dichtel 12a0faa3bd bridge: use nla_put_u64_64bit()
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-25 15:09:10 -04:00
Vivien Didelot 7c25b16dbb net: bridge: log port STP state on change
Remove the shared br_log_state function and print the info directly in
br_set_state, where the net_bridge_port state is actually changed.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18 14:20:08 -05:00
Arad, Ronen b1974ed05e netlink: Rightsize IFLA_AF_SPEC size calculation
if_nlmsg_size() overestimates the minimum allocation size of netlink
dump request (when called from rtnl_calcit()) or the size of the
message (when called from rtnl_getlink()). This is because
ext_filter_mask is not supported by rtnl_link_get_af_size() and
rtnl_link_get_size().

The over-estimation is significant when at least one netdev has many
VLANs configured (8 bytes for each configured VLAN).

This patch-set "rightsizes" the protocol specific attribute size
calculation by propagating ext_filter_mask to rtnl_link_get_af_size()
and adding this a argument to get_link_af_size op in rtnl_af_ops.

Bridge module already used filtering aware sizing for notifications.
br_get_link_af_size_filtered() is consistent with the modified
get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops.
br_get_link_af_size() becomes unused and thus removed.

Signed-off-by: Ronen Arad <ronen.arad@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21 19:15:20 -07:00
Nikolay Aleksandrov e9c953eff7 bridge: vlan: use rcu for vlan_list traversal in br_fill_ifinfo
br_fill_ifinfo is called by br_ifinfo_notify which can be called from
many contexts with different locks held, sometimes it relies upon
bridge's spinlock only which is a problem for the vlan code, so use
explicitly rcu for that to avoid problems.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:54 -07:00
Nikolay Aleksandrov 907b1e6e83 bridge: vlan: use proper rcu for the vlgrp member
The bridge and port's vlgrp member is already used in RCU way, currently
we rely on the fact that it cannot disappear while the port exists but
that is error-prone and we might miss places with improper locking
(either RCU or RTNL must be held to walk the vlan_list). So make it
official and use RCU for vlgrp to catch offenders. Introduce proper vlgrp
accessors and use them consistently throughout the code.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:52 -07:00
Nikolay Aleksandrov 6623c60dc2 bridge: vlan: enforce no pvid flag in vlan ranges
Currently it's possible for someone to send a vlan range to the kernel
with the pvid flag set which will result in the pvid bouncing from a
vlan to vlan and isn't correct, it also introduces problems for hardware
where it doesn't make sense having more than 1 pvid. iproute2 already
enforces this, so let's enforce it on kernel-side as well.

Reported-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:59:15 -07:00
Scott Feldman c62987bbd8 bridge: push bridge setting ageing_time down to switchdev
Use SWITCHDEV_F_SKIP_EOPNOTSUPP to skip over ports in bridge that don't
support setting ageing_time (or setting bridge attrs in general).

If push fails, don't update ageing_time in bridge and return err to user.

If push succeeds, update ageing_time in bridge and run gc_timer now to
recalabrate when to run gc_timer next, based on new ageing_time.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:20 -07:00
Nikolay Aleksandrov 5d6ae479ab bridge: netlink: add support for port's multicast_router attribute
Add IFLA_BRPORT_MULTICAST_ROUTER to allow setting/getting port's
multicast_router via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:34 -07:00
Nikolay Aleksandrov 9b0c6e4deb bridge: netlink: allow to flush port's fdb
Add IFLA_BRPORT_FLUSH to allow flushing port's fdb similar to sysfs's
flush.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:32 -07:00
Nikolay Aleksandrov 61c0a9a83e bridge: netlink: export port's timer values
Add the following attributes in order to export port's timer values:
IFLA_BRPORT_MESSAGE_AGE_TIMER, IFLA_BRPORT_FORWARD_DELAY_TIMER and
IFLA_BRPORT_HOLD_TIMER.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:31 -07:00
Nikolay Aleksandrov e08e838ac5 bridge: netlink: export port's topology_change_ack and config_pending
Add IFLA_BRPORT_TOPOLOGY_CHANGE_ACK and IFLA_BRPORT_CONFIG_PENDING to
allow getting port's topology_change_ack and config_pending respectively
via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:30 -07:00
Nikolay Aleksandrov 42d452c4b5 bridge: netlink: export port's id and number
Add IFLA_BRPORT_(ID|NO) to allow getting port's port_id and port_no
respectively via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:29 -07:00
Nikolay Aleksandrov 96f94e7f4a bridge: netlink: export port's designated cost and port
Add IFLA_BRPORT_DESIGNATED_(COST|PORT) to allow getting the port's
designated cost and port respectively via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:29 -07:00
Nikolay Aleksandrov 80df9a2692 bridge: netlink: export port's bridge id
Add IFLA_BRPORT_BRIDGE_ID to allow getting the designated bridge id via
netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:28 -07:00
Nikolay Aleksandrov 4ebc7660ab bridge: netlink: export port's root id
Add IFLA_BRPORT_ROOT_ID to allow getting the designated root id via
netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:49:27 -07:00
Nikolay Aleksandrov 4917a1548f bridge: netlink: make br_fill_info's frame size smaller
When KASAN is enabled the frame size grows > 2048 bytes and we get a
warning, so make it smaller.
net/bridge/br_netlink.c: In function 'br_fill_info':
>> net/bridge/br_netlink.c:1110:1: warning: the frame size of 2160 bytes
>> is larger than 2048 bytes [-Wframe-larger-than=]

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-07 04:15:57 -07:00
Nikolay Aleksandrov 0f963b7592 bridge: netlink: add support for default_pvid
Add IFLA_BR_VLAN_DEFAULT_PVID to allow setting/getting bridge's
default_pvid via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:07 -07:00
Nikolay Aleksandrov 93870cc02a bridge: netlink: add support for netfilter tables config
Add support to allow getting/setting netfilter tables settings.
Currently these are IFLA_BR_NF_CALL_IPTABLES, IFLA_BR_NF_CALL_IP6TABLES
and IFLA_BR_NF_CALL_ARPTABLES.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:07 -07:00
Nikolay Aleksandrov 7e4df51eb3 bridge: netlink: add support for igmp's intervals
Add support to set/get all of the igmp's configurable intervals via
netlink. These currently are:
IFLA_BR_MCAST_LAST_MEMBER_INTVL
IFLA_BR_MCAST_MEMBERSHIP_INTVL
IFLA_BR_MCAST_QUERIER_INTVL
IFLA_BR_MCAST_QUERY_INTVL
IFLA_BR_MCAST_QUERY_RESPONSE_INTVL
IFLA_BR_MCAST_STARTUP_QUERY_INTVL

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:06 -07:00
Nikolay Aleksandrov b89e6babad bridge: netlink: add support for multicast_startup_query_count
Add IFLA_BR_MCAST_STARTUP_QUERY_CNT to allow setting/getting
br->multicast_startup_query_count via netlink. Also align the ifla
comments.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:06 -07:00
Nikolay Aleksandrov 79b859f573 bridge: netlink: add support for multicast_last_member_count
Add IFLA_BR_MCAST_LAST_MEMBER_CNT to allow setting/getting
br->multicast_last_member_count via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:06 -07:00
Nikolay Aleksandrov 858079fdae bridge: netlink: add support for igmp's hash_max
Add IFLA_BR_MCAST_HASH_MAX to allow setting/getting br->hash_max via
netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:06 -07:00
Nikolay Aleksandrov 431db3c050 bridge: netlink: add support for igmp's hash_elasticity
Add IFLA_BR_MCAST_HASH_ELASTICITY to allow setting/getting
br->hash_elasticity via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:05 -07:00
Nikolay Aleksandrov ba062d7cc6 bridge: netlink: add support for multicast_querier
Add IFLA_BR_MCAST_QUERIER to allow setting/getting br->multicast_querier
via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:04 -07:00
Nikolay Aleksandrov 295141d904 bridge: netlink: add support for multicast_query_use_ifaddr
Add IFLA_BR_MCAST_QUERY_USE_IFADDR to allow setting/getting
br->multicast_query_use_ifaddr via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:03 -07:00
Nikolay Aleksandrov 89126327f9 bridge: netlink: add support for multicast_snooping
Add IFLA_BR_MCAST_SNOOPING to allow enabling/disabling multicast
snooping via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:02 -07:00
Nikolay Aleksandrov a9a6bc70f5 bridge: netlink: add support for multicast_router
Add IFLA_BR_MCAST_ROUTER to allow setting and retrieving
br->multicast_router when igmp snooping is enabled.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:01 -07:00
Nikolay Aleksandrov 150217c688 bridge: netlink: add fdb flush
Simple attribute that flushes the bridge's fdb.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:46:01 -07:00
Nikolay Aleksandrov 111189abc5 bridge: netlink: add group_addr support
Add IFLA_BR_GROUP_ADDR attribute to allow setting and retrieving the
group_addr via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:59 -07:00
Nikolay Aleksandrov d76bd14e0f bridge: netlink: export all timers
Export the following bridge timers (also exported via sysfs):
IFLA_BR_HELLO_TIMER, IFLA_BR_TCN_TIMER, IFLA_BR_TOPOLOGY_CHANGE_TIMER,
IFLA_BR_GC_TIMER via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:59 -07:00
Nikolay Aleksandrov ed4163098e bridge: netlink: export topology_change and topology_change_detected
Add IFLA_BR_TOPOLOGY_CHANGE and IFLA_BR_TOPOLOGY_CHANGE_DETECTED and
export them via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:58 -07:00
Nikolay Aleksandrov 684dd248be bridge: netlink: export root path cost
Add IFLA_BR_ROOT_PATH_COST and export it via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:57 -07:00
Nikolay Aleksandrov 8762ba680f bridge: netlink: export root port
Add IFLA_BR_ROOT_PORT and export it via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:56 -07:00
Nikolay Aleksandrov 7599a2201f bridge: netlink: export bridge id
Add IFLA_BR_BRIDGE_ID and export br->bridge_id via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:55 -07:00
Nikolay Aleksandrov 5127c81f84 bridge: netlink: export root id
Add IFLA_BR_ROOT_ID and export br->designated_root via netlink. For this
purpose add struct ifla_bridge_id that would represent struct bridge_id.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:54 -07:00
Nikolay Aleksandrov 7910228b6b bridge: netlink: add group_fwd_mask support
Add IFLA_BR_GROUP_FWD_MASK attribute to allow setting and retrieving the
group_fwd_mask via netlink.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:45:53 -07:00
Nikolay Aleksandrov 586c2b573e bridge: vlan: use rcu list for the ordered vlan list
When I did the conversion to rhashtable I missed the required locking of
one important user of the vlan list - br_get_link_af_size_filtered()
which is called:
br_ifinfo_notify() -> br_nlmsg_size() -> br_get_link_af_size_filtered()
and the notifications can be sent without holding rtnl. Before this
conversion the function relied on using rcu and since we already use rcu to
destroy the vlans, we can simply migrate the list to use the rcu helpers.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-04 16:43:47 -07:00
Nikolay Aleksandrov 77751ee8ae bridge: vlan: move pvid inside net_bridge_vlan_group
One obvious way to converge more code (which was also used by the
previous vlan code) is to move pvid inside net_bridge_vlan_group. This
allows us to simplify some and remove other port-specific functions.
Also gives us the ability to simply pass the vlan group and use all of the
contained information.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-01 18:24:04 -07:00
Nikolay Aleksandrov 2594e9064a bridge: vlan: add per-vlan struct and move to rhashtables
This patch changes the bridge vlan implementation to use rhashtables
instead of bitmaps. The main motivation behind this change is that we
need extensible per-vlan structures (both per-port and global) so more
advanced features can be introduced and the vlan support can be
extended. I've tried to break this up but the moment net_port_vlans is
changed and the whole API goes away, thus this is a larger patch.
A few short goals of this patch are:
- Extensible per-vlan structs stored in rhashtables and a sorted list
- Keep user-visible behaviour (compressed vlans etc)
- Keep fastpath ingress/egress logic the same (optimizations to come
  later)

Here's a brief list of some of the new features we'd like to introduce:
- per-vlan counters
- vlan ingress/egress mapping
- per-vlan igmp configuration
- vlan priorities
- avoid fdb entries replication (e.g. local fdb scaling issues)

The structure is kept single for both global and per-port entries so to
avoid code duplication where possible and also because we'll soon introduce
"port0 / aka bridge as port" which should simplify things further
(thanks to Vlad for the suggestion!).

Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port
rhashtable, if an entry is added to a port it'll get a pointer to its
global context so it can be quickly accessed later. There's also a
sorted vlan list which is used for stable walks and some user-visible
behaviour such as the vlan ranges, also for error paths.
VLANs are stored in a "vlan group" which currently contains the
rhashtable, sorted vlan list and the number of "real" vlan entries.
A good side-effect of this change is that it resembles how hw keeps
per-vlan data.
One important note after this change is that if a VLAN is being looked up
in the bridge's rhashtable for filtering purposes (or to check if it's an
existing usable entry, not just a global context) then the new helper
br_vlan_should_use() needs to be used if the vlan is found. In case the
lookup is done only with a port's vlan group, then this check can be
skipped.

Things tested so far:
- basic vlan ingress/egress
- pvids
- untagged vlans
- undef CONFIG_BRIDGE_VLAN_FILTERING
- adding/deleting vlans in different scenarios (with/without global ctx,
  while transmitting traffic, in ranges etc)
- loading/removing the module while having/adding/deleting vlans
- extracting bridge vlan information (user ABI), compressed requests
- adding/deleting fdbs on vlans
- bridge mac change, promisc mode
- default pvid change
- kmemleak ON during the whole time

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 13:36:06 -07:00
Vivien Didelot 7a577f013d net: bridge: remove unnecessary switchdev include
Remove the unnecessary switchdev.h include from br_netlink.c.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-08 22:33:14 -07:00
Toshiaki Makita d2d427b392 bridge: Add netlink support for vlan_protocol attribute
This enables bridge vlan_protocol to be configured through netlink.

When CONFIG_BRIDGE_VLAN_FILTERING is disabled, kernel behaves the
same way as this feature is not implemented.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 15:35:33 -07:00
Scott Feldman eb4cb85180 bridge: fix netlink max attr size
.maxtype should match .policy.  Probably just been getting lucky here
because IFLA_BRPORT_MAX > IFLA_BR_MAX.

Fixes: 13323516 ("bridge: implement rtnl_link_ops->changelink")
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:36:25 -07:00
David S. Miller 182ad468e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cavium/Kconfig

The cavium conflict was overlapping dependency
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:23:11 -07:00
Nikolay Aleksandrov a7854037da bridge: netlink: add support for vlan_filtering attribute
This patch adds the ability to toggle the vlan filtering support via
netlink. Since we're already running with rtnl in .changelink() we don't
need to take any additional locks.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:36:43 -07:00
Nikolay Aleksandrov 786c2077ec bridge: netlink: account for the IFLA_BRPORT_PROXYARP_WIFI attribute size and policy
The attribute size wasn't accounted for in the get_slave_size() callback
(br_port_get_slave_size) when it was introduced, so fix it now. Also add
a policy entry for it in br_port_policy.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 842a9ae08a ("bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 23:54:42 -07:00
Nikolay Aleksandrov 355b9f9df1 bridge: netlink: account for the IFLA_BRPORT_PROXYARP attribute size and policy
The attribute size wasn't accounted for in the get_slave_size() callback
(br_port_get_slave_size) when it was introduced, so fix it now. Also add
a policy entry for it in br_port_policy.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 958501163d ("bridge: Add support for IEEE 802.11 Proxy ARP")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 23:54:42 -07:00
David S. Miller 5510b3c2a1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/s390/net/bpf_jit_comp.c
	drivers/net/ethernet/ti/netcp_ethss.c
	net/bridge/br_multicast.c
	net/ipv4/ip_fragment.c

All four conflicts were cases of simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-31 23:52:20 -07:00
Nikolay Aleksandrov 963ad94853 bridge: netlink: fix slave_changelink/br_setport race conditions
Since slave_changelink support was added there have been a few race
conditions when using br_setport() since some of the port functions it
uses require the bridge lock. It is very easy to trigger a lockup due to
some internal spin_lock() usage without bh disabled, also it's possible to
get the bridge into an inconsistent state.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 3ac636b859 ("bridge: implement rtnl_link_ops->slave_changelink")
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-26 16:27:22 -07:00
Rosen, Rami 6ca91c6040 bridge: Fix setting a flag in br_fill_ifvlaninfo_range().
This patch fixes setting of vinfo.flags in the br_fill_ifvlaninfo_range() method. The
assignment of vinfo.flags &= ~BRIDGE_VLAN_INFO_RANGE_BEGIN has no effect and is
unneeded, as vinfo.flags value is overriden by the  immediately following
vinfo.flags = flags | BRIDGE_VLAN_INFO_RANGE_END assignement.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-24 22:56:22 -07:00
Nikolay Aleksandrov 462e1ead92 bridge: vlan: fix usage of vlan 0 and 4095 again
Vlan ids 0 and 4095 were disallowed by commit:
8adff41c3d ("bridge: Don't use VID 0 and 4095 in vlan filtering")
but then the check was removed when vlan ranges were introduced by:
bdced7ef78 ("bridge: support for multiple vlans and vlan ranges in setlink and dellink requests")
So reintroduce the vlan range check.
Before patch:
[root@testvm ~]# bridge vlan add vid 0 dev eth0 master
(succeeds)
After Patch:
[root@testvm ~]# bridge vlan add vid 0 dev eth0 master
RTNETLINK answers: Invalid argument

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: bdced7ef78 ("bridge: support for multiple vlans and vlan ranges in setlink and dellink requests")
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-02 12:19:17 -07:00
Scott Feldman 8508025c59 bridge: revert br_dellink change back to original
This is revert of:

commit 68e331c785 ("bridge: offload bridge port attributes to switch asic
if feature flag set")

Restore br_dellink back to original and don't call into SELF port driver.
rtnetlink.c:bridge_dellink() already does a call into port driver for SELF.

bridge vlan add/del cmd defaults to MASTER.  From man page for bridge vlan
add/del cmd:

       self   the vlan is configured on the specified physical device.
              Required if the device is the bridge device.

       master the vlan is configured on the software bridge (default).

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman 41c498b935 bridge: restore br_setlink back to original
This is revert of:

commit 68e331c785 ("bridge: offload bridge port attributes to switch asic
if feature flag set")

Restore br_setlink back to original and don't call into SELF port driver.
rtnetlink.c:bridge_setlink() already does a call into port driver for SELF.

bridge set link cmd defaults to MASTER.  From man page for bridge link set
cmd:

       self   link setting is configured on specified physical device

       master link setting is configured on the software bridge (default)

The link setting has two values: the device-side value and the software
bridge-side value.  These are independent and settable using the bridge
link set cmd by specifying some combination of [master] | [self].
Furthermore, the device-side and bridge-side settings have their own
initial value, viewable from bridge -d link show cmd.

Restoring br_setlink back to original makes rocker (the only in-kernel user
of SELF link settings) work as first implement: two-sided values.

It's true that when both MASTER and SELF are specified from the command,
two netlink notifications are generated, one for each side of the settings.
The user-space app can distiquish between the two notifications by
observing the MASTER or SELF flag.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Jiri Pirko ebb9a03a59 switchdev: s/netdev_switch_/switchdev_/ and s/NETDEV_SWITCH_/SWITCHDEV_/
Turned out that "switchdev" sticks. So just unify all related terms to use
this prefix.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:52 -04:00
Nicolas Dichtel 46c264daaa bridge/nl: remove wrong use of NLM_F_MULTI
NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
it is sent only at the end of a dump.

Libraries like libnl will wait forever for NLMSG_DONE.

Fixes: e5a55a8987 ("net: create generic bridge ops")
Fixes: 815cccbf10 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
CC: John Fastabend <john.r.fastabend@intel.com>
CC: Sathya Perla <sathya.perla@emulex.com>
CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
CC: Ajit Khaparde <ajit.khaparde@emulex.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: intel-wired-lan@lists.osuosl.org
CC: Jiri Pirko <jiri@resnulli.us>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: bridge@lists.linux-foundation.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-29 14:59:16 -04:00
Nicolas Dichtel a54acb3a6f dev: introduce dev_get_iflink()
The goal of this patch is to prepare the removal of the iflink field. It
introduces a new ndo function, which will be implemented by virtual interfaces.

There is no functional change into this patch. All readers of iflink field
now call dev_get_iflink().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:04:59 -04:00
Jörg Thalheim af615762e9 bridge: add ageing_time, stp_state, priority over netlink
Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 23:21:06 -04:00
Jouni Malinen 842a9ae08a bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi
This extends the design in commit 958501163d ("bridge: Add support for
IEEE 802.11 Proxy ARP") with optional set of rules that are needed to
meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
previously added BR_PROXYARP behavior is left as-is and a new
BR_PROXYARP_WIFI alternative is added so that this behavior can be
configured from user space when required.

In addition, this enables proxyarp functionality for unicast ARP
requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
to use unicast as well as broadcast for these frames.

The key differences in functionality:

BR_PROXYARP:
- uses the flag on the bridge port on which the request frame was
  received to determine whether to reply
- block bridge port flooding completely on ports that enable proxy ARP

BR_PROXYARP_WIFI:
- uses the flag on the bridge port to which the target device of the
  request belongs
- block bridge port flooding selectively based on whether the proxyarp
  functionality replied

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:52:23 -05:00
Johannes Berg 2f56f6be47 bridge: fix bridge netlink RCU usage
When the STP timer fires, it can call br_ifinfo_notify(),
which in turn ends up in the new br_get_link_af_size().
This function is annotated to be using RTNL locking, which
clearly isn't the case here, and thus lockdep warns:

  ===============================
  [ INFO: suspicious RCU usage. ]
  3.19.0+ #569 Not tainted
  -------------------------------
  net/bridge/br_private.h:204 suspicious rcu_dereference_protected() usage!

Fix this by doing RCU locking here.

Fixes: b7853d73e3 ("bridge: add vlan info to bridge setlink and dellink notification messages")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-04 00:20:22 -05:00
Roopa Prabhu fed0a159c8 bridge: fix link notification skb size calculation to include vlan ranges
my previous patch skipped vlan range optimizations during skb size
calculations for simplicity.

This incremental patch considers vlan ranges during
skb size calculations. This leads to a bit of code duplication
in the fill and size calculation functions. But, I could not find a
prettier way to do this. will take any suggestions.

Previously, I had reused the existing br_get_link_af_size size calculation
function to calculate skb size for notifications. Reusing it this time
around creates some change in behaviour issues for the usual
.get_link_af_size callback.

This patch adds a new br_get_link_af_size_filtered() function to
base the size calculation on the incoming filter flag and include
vlan ranges.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-26 11:25:43 -05:00
Roopa Prabhu b7853d73e3 bridge: add vlan info to bridge setlink and dellink notification messages
vlan add/deletes are not notified to userspace today. This patch adds
vlan info to bridge newlink/dellink notifications generated from the
bridge driver. Notifications use the RTEXT_FILTER_BRVLAN_COMPRESSED
flag to compress vlans into ranges whereever applicable.

The size calculations does not take ranges into account for
simplicity.  This has the potential for allocating a larger skb than
required.

There is an existing inconsistency with bridge NEWLINK and DELLINK
change notifications. Both generate NEWLINK notifications.  Since its
always a NEWLINK notification, this patch includes all vlans the port
belongs to in the notification. The NEWLINK and DELLINK request
messages however only include the vlans to be added and deleted.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 23:00:45 -05:00
Roopa Prabhu 1fd0bddb61 bridge: add missing bridge port check for offloads
This patch fixes a missing bridge port check caught by smatch.

setlink/dellink of attributes like vlans can come for a bridge device
and there is no need to offload those today. So, this patch adds a bridge
port check. (In these cases however, the BRIDGE_SELF flags will always be set
and we may not hit a problem with the current code).

smatch complaint:

The patch 68e331c785b8: "bridge: offload bridge port attributes to
switch asic if feature flag set" from Jan 29, 2015, leads to the
following Smatch complaint:

net/bridge/br_netlink.c:552 br_setlink()
	 error: we previously assumed 'p' could be null (see line 518)

net/bridge/br_netlink.c
   517
   518		if (p && protinfo) {
                    ^
Check for NULL.

Reported-By: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-07 22:49:47 -08:00
Roopa Prabhu 68e331c785 bridge: offload bridge port attributes to switch asic if feature flag set
This patch adds support to set/del bridge port attributes in hardware from
the bridge driver.

With this, when the user sends a bridge setlink message with no flags or
master flags set,
   - the bridge driver ndo_bridge_setlink handler sets settings in the kernel
   - calls the swicthdev api to propagate the attrs to the switchdev
	hardware

   You can still use the self flag to go to the switch hw or switch port
   driver directly.

With this, it also makes sure a notification goes out only after the
attributes are set both in the kernel and hw.

The patch calls switchdev api only if BRIDGE_FLAGS_SELF is not set.
This is because the offload cases with BRIDGE_FLAGS_SELF are handled in
the caller (in rtnetlink.c).

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Roopa Prabhu add511b382 bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink
bridge flags are needed inside ndo_bridge_setlink/dellink handlers to
avoid another call to parse IFLA_AF_SPEC inside these handlers

This is used later in this series

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:33 -08:00
Daniel Borkmann 207895fd38 net: mark some potential candidates __read_mostly
They are all either written once or extremly rarely (e.g. from init
code), so we can move them to the .data..read_mostly section.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30 17:58:39 -08:00
Dan Carpenter 1b846f9282 bridge: simplify br_getlink() a bit
Static checkers complain that we should maybe set "ret" before we do the
"goto out;".  They interpret the NULL return from br_port_get_rtnl() as
a failure and forgetting to set the error code is a common bug in this
situation.

The code is confusing but it's actually correct.  We are returning zero
deliberately.  Let's re-write it a bit to be more clear.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-25 23:32:49 -08:00
Johannes Berg 053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
Roopa Prabhu 02dba4388d bridge: fix setlink/dellink notifications
problems with bridge getlink/setlink notifications today:
        - bridge setlink generates two notifications to userspace
                - one from the bridge driver
                - one from rtnetlink.c (rtnl_bridge_notify)
        - dellink generates one notification from rtnetlink.c. Which
	means bridge setlink and dellink notifications are not
	consistent

        - Looking at the code it appears,
	If both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF were set,
        the size calculation in rtnl_bridge_notify can be wrong.
        Example: if you set both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF
        in a setlink request to rocker dev, rtnl_bridge_notify will
	allocate skb for one set of bridge attributes, but,
	both the bridge driver and rocker dev will try to add
	attributes resulting in twice the number of attributes
	being added to the skb.  (rocker dev calls ndo_dflt_bridge_getlink)

There are multiple options:
1) Generate one notification including all attributes from master and self:
   But, I don't think it will work, because both master and self may use
   the same attributes/policy. Cannot pack the same set of attributes in a
   single notification from both master and slave (duplicate attributes).

2) Generate one notification from master and the other notification from
   self (This seems to be ideal):
     For master: the master driver will send notification (bridge in this
	example)
     For self: the self driver will send notification (rocker in the above
	example. It can use helpers from rtnetlink.c to do so. Like the
	ndo_dflt_bridge_getlink api).

This patch implements 2) (leaving the 'rtnl_bridge_notify' around to be used
with 'self').

v1->v2 :
	- rtnl_bridge_notify is now called only for self,
	so, remove 'BRIDGE_FLAGS_SELF' check and cleanup a few things
	- rtnl_bridge_dellink used to always send a RTM_NEWLINK msg
	earlier. So, I have changed the notification from br_dellink to
	go as RTM_NEWLINK

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17 23:49:51 -05:00