Commit Graph

706585 Commits

Author SHA1 Message Date
John Hurley 2d9ad71a8c nfp: offload vxlan IPv4 endpoints of flower rules
Maintain a list of IPv4 addresses used as the tunnel destination IP match
fields in currently active flower rules. Offload the entire list of
NFP_FL_IPV4_ADDRS_MAX (even if some are unused) when new IPs are added or
removed. The NFP should only be aware of tunnel end points that are
currently used by rules on the device

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 21:27:36 -07:00
John Hurley fd0dd1ab1e nfp: offload flower vxlan endpoint MAC addresses
Generate a list of MAC addresses of netdevs that could be used as VXLAN
tunnel end points. Give offloaded MACs an index for storage on the NFP in
the ranges:
0x100-0x1ff physical port representors
0x200-0x2ff VF port representors
0x300-0x3ff other offloads (e.g. vxlan netdevs, ovs bridges)

Assign phys and vf indexes based on unique 8 bit values in the port num.
Maintain list of other netdevs to ensure same netdev is not offloaded
twice and each gets a unique ID without exhausting the entries. Because
the IDs are unique but constant for a netdev, any changes are implemented
by overwriting the index on NFP.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 21:27:36 -07:00
John Hurley b27d6a95a7 nfp: compile flower vxlan tunnel set actions
Compile set tunnel actions for tc flower. Only support VXLAN and ensure a
tunnel destination port of 4789 is used.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 21:27:35 -07:00
John Hurley 611aec101a nfp: compile flower vxlan tunnel metadata match fields
Compile ovs-tc flower vxlan metadata match fields for offloading. Only
support offload of tunnel data when the VXLAN port specifically matches
well known port 4789.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 21:27:35 -07:00
John Hurley 79ede4ae2d nfp: add helper to get flower cmsg length
Add a helper function that returns the length of the cmsg data when given
the cmsg skb

Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 21:27:35 -07:00
David S. Miller 14a0d032f4 Merge branch 'mlxsw-pass-gact'
Jiri Pirko says:

====================
mlxsw: Introduce support for "pass" gact action offloading

Very simple patchset adds ability for user to insert filters with "pass"
gact action and offload it. That allows scenarios like this:

$ tc filter add dev enp3s0np19 ingress protocol ip pref 10 flower skip_sw dst_ip 192.168.101.0/24 action drop
$ tc filter add dev enp3s0np19 ingress protocol ip pref 9 flower skip_sw dst_ip 192.168.101.1 action pass
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:26:45 -07:00
Jiri Pirko b2925957ec mlxsw: spectrum_flower: Offload "ok" termination action
If action is "gact_ok", offload it to HW.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:26:45 -07:00
Jiri Pirko 3b8e9238a8 net: sched: introduce helper to identify gact pass action
Introduce a helper called is_tcf_gact_pass which could be used to
tell if the action is gact pass or not.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:26:45 -07:00
Jiri Pirko 2a52a8c6e5 mlxsw: spectrum_acl: Propagate errors from mlxsw_afa_block_jump/continue
Propagate error instead of doing WARN_ON right away.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:26:45 -07:00
David S. Miller 5b52e57c3d Merge branch 'net-dsa-use-generic-slave-phydev'
Vivien Didelot says:

====================
net: dsa: use generic slave phydev

DSA currently stores a phy_device pointer in each slave private
structure. This requires to implement our own ethtool ksettings
accessors and such.

This patchset removes the private phy_device in favor of the one
provided in the net_device structure, and thus allows us to use the
generic phy_ethtool_* functions.
====================

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:20:46 -07:00
Vivien Didelot 69b2c16296 net: dsa: use phy_ethtool_nway_reset
Use phy_ethtool_nway_reset now that dsa_slave_nway_reset does exactly
the same.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:06:34 -07:00
Vivien Didelot aa62a8ca85 net: dsa: use phy_ethtool_set_link_ksettings
Use phy_ethtool_set_link_ksettings now that dsa_slave_set_link_ksettings
does exactly the same.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:06:34 -07:00
Vivien Didelot 771df31ace net: dsa: use phy_ethtool_get_link_ksettings
Use phy_ethtool_get_link_ksettings now that dsa_slave_get_link_ksettings
does exactly the same.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:06:34 -07:00
Vivien Didelot 0115dcd178 net: dsa: use slave device phydev
There is no need to store a phy_device in dsa_slave_priv since
net_device already provides one. Simply s/p->phy/dev->phydev/.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:06:34 -07:00
Vivien Didelot f4344e0a48 net: dsa: return -ENODEV is there is no slave PHY
Instead of returning -EOPNOTSUPP when a slave device has no PHY,
directly return -ENODEV as ethtool and phylib do.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:06:34 -07:00
David S. Miller f6fc5b494d Merge branch 'mlxsw-Add-router-adjacency-dpipe-table'
Jiri Pirko says:

====================
mlxsw: Add router adjacency dpipe table

Arkadi says:

This patchset adds router adjacency dpipe table support. This will provide
the ability to observe the hardware offloaded IPv4/6 nexthops.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:36 -07:00
Arkadi Sharshevsky 427e652aa3 mlxsw: spectrum_dpipe: Add support for controlling nexthop counters
Add support for controlling nexthop counters via dpipe.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky 190d38a52a mlxsw: spectrum_dpipe: Add support for adjacency table dump
Add support for adjacency table dump.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky a5390278a5 mlxsw: spectrum: Add support for setting counters on nexthops
Add support for setting counters on nexthops based on dpipe's adjacency
table counter status. This patch also adds the ability for getting the
counter value, which will be used by the dpipe adjacency table dump
implementation in the next patches.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky f4de25fb53 mlxsw: reg: Add support for counters on RATR
In order to add the ability for setting counters on nexthops the RATR
register should be extended.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky c538adb3c6 mlxsw: spectrum_dpipe: Add initial support for the router adjacency table
Add initial support for router adjacency table. The table does lookup
based on the nexthop-group index and the local nexthop offset. After
locating the nexthop entry it sets the destination MAC address and the
egress RIF.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky c556cd2893 mlxsw: spectrum_router: Add helpers for nexthop access
This is done as a preparation before introducing the ability to dump the
adjacency table via dpipe, and to count the table size. The current table
implementation avoids tunnel entries, thus a helper for checking if
the nexthop group contains tunnel entries is also provided. The mlxsw's
nexthop representative struct stays private to the router module.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky ec2437f42b mlxsw: spectrum_router: Use helper to check for last neighbor
Use list_is_last helper to check for last neighbor.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky dbe4598c1e mlxsw: spectrum_router: Keep nexthops in a linked list
Keep nexthops in a linked list for easy access.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky c0859d697c mlxsw: Add fields for mlxsw's meta header for adjacency table
This patch adds field for mlxsw's meta header which will be used to
describe the match/action behavior of the adjacency table.

The fields are:
1. Adj_index - The global index of the nexthop group in the adjacency
   table.

2. Adj_hash_index - Local index offset which is based on packets hash
   mod the nexthop group size.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
Arkadi Sharshevsky be2336ebfd mlxsw: spectrum_dpipe: Fix indentation in header description
Fix indentation in mlxsw_meta header's description.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 20:04:35 -07:00
David S. Miller 390e96ec8e Merge branch 'bpf-metadata-direct-access'
Daniel Borkmann says:

====================
BPF metadata for direct access

This work enables generic transfer of metadata from XDP into skb,
meaning the packet has a flexible and programmable room for meta
data, which can later be used by BPF to set various skb members
when passing up the stack. For details, please see second patch.
Support has been implemented and tested with two drivers, and
should be straight forward to add to other drivers as well which
properly support head adjustment already.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:45 -07:00
Daniel Borkmann 366a88fe2f bpf, ixgbe: add meta data support
Implement support for transferring XDP meta data into skb for
ixgbe driver; before calling into the program, xdp.data_meta points
to xdp.data, where on program return with pass verdict, we call
into skb_metadata_set().

We implement this for the default ixgbe_build_skb() variant. For the
ixgbe_construct_skb() that is used when legacy-rx buffer mananagement
mode is turned on via ethtool, I found that XDP gets 0 headroom, so
neither xdp_adjust_head() nor xdp_adjust_meta() can be used with this.
Just add a comment with explanation for this operating mode.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann 65d88fd0ba bpf, nfp: add meta data support
Implement support for transferring XDP meta data into skb for
nfp driver; before calling into the program, xdp.data_meta points
to xdp.data, where on program return with pass verdict, we call
into skb_metadata_set().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann 22c8852624 bpf: improve selftests and add tests for meta pointer
Add various test_verifier selftests, and a simple xdp/tc functional
test that is being attached to veths. Also let new versions of clang
use the recently added -mcpu=probe support [1] for the BPF target,
so that it can probe the underlying kernel for BPF insn set extensions.
We could also just set this options always, where older versions just
ignore it and give a note to the user that the -mcpu value is not
supported, but given emitting the note cannot be turned off from clang
side lets not confuse users running selftests with it, thus fallback
to the default generic one when we see that clang doesn't support it.
Also allow CPU option to be overridden in the Makefile from command
line.

  [1] d7276a40d8

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann ac29991ba1 bpf: update bpf.h uapi header for tools
Looks like a couple of updates missed to get carried into tools/include/uapi/,
so copy the bpf.h header as usual to pull in latest updates.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann 6aaae2b6c4 bpf: rename bpf_compute_data_end into bpf_compute_data_pointers
Just do the rename into bpf_compute_data_pointers() as we'll add
one more pointer here to recompute.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Himanshu Jha 3bd3b9ed1b net: bcm63xx_enet: Use setup_timer and mod_timer
Use setup_timer and mod_timer API instead of structure assignments.

This is done using Coccinelle and semantic patch used
for this as follows:

@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);

Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:33:25 -07:00
David S. Miller 5b2ef20df9 Merge branch 'qed-iWARP-fixes-and-enhancements'
Michal Kalderon says:

====================
qed: iWARP fixes and enhancements

This patch series includes several fixes and enhancements
related to iWARP.

Patch #1 is actually the last of the initial iWARP submission.
It has been delayed until now as I wanted to make sure that qedr
supports iWARP prior to enabling iWARP device detection.

iWARP changes in RDMA tree have been accepted and targeted at
kernel 4.15, therefore, all iWARP fixes for this cycle are
submitted to net-next.

Changes from v1->v2
  - Added "Fixes:" tag to commit message of patch #3
====================

Signed-off by: Michal.Kalderon@cavium.com
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Michal Kalderon 1e99c49701 qed: iWARP - Add check for errors on a SYN packet
A SYN packet which arrives with errors from FW should be dropped.
This required adding an additional field to the ll2
rx completion data.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Michal Kalderon 471115ab98 qed: Fix maximum number of CQs for iWARP
The maximum number of CQs supported is bound to the number
of connections supported, which differs between RoCE and iWARP.

This fixes a crash that occurred in iWARP when running 1000 sessions
using perftest.

Fixes: 67b40dccc4 ("qed: Implement iWARP initialization, teardown and qp operations")

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Michal Kalderon d1abfd0b4e qed: Add iWARP out of order support
iWARP requires OOO support which is already provided by the ll2
interface (until now was used only for iSCSI offload).
The changes mostly include opening a ll2 dedicated connection for
OOO and notifiying the FW about the handle id.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Michal Kalderon e0a8f9de16 qed: Add iWARP enablement support
This patch is the last of the initial iWARP patch series. It
adds the possiblity to actually detect iWARP from the device and enable
it in the critical locations which basically make iWARP available.

It wasn't submitted until now as iWARP hadn't been accepted into
the rdma tree.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Tobias Klauser 2091c227fa ldmvsw: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 10:15:44 -07:00
Tobias Klauser 92978ee801 net/mlx5: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 10:15:44 -07:00
Tobias Klauser 1fac4b2fdb bnxt_en: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 10:15:44 -07:00
Tobias Klauser d9db5e3680 kcm: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 09:54:06 -07:00
Tobias Klauser 63a4e80be4 ipv6: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 09:54:06 -07:00
Tobias Klauser 98e4fcff3e datagram: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 09:54:06 -07:00
Tobias Klauser 1f4cf93b13 net: ena: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 09:54:06 -07:00
Alexey Dobriyan 01ccdf126c neigh: make strucrt neigh_table::entry_size unsigned int
Key length can't be negative.

Leave comparisons against nla_len() signed just in case truncated attribute
can sneak in there.

Space savings:

	add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-7 (-7)
	function                                     old     new   delta
	pneigh_delete                                273     272      -1
	mlx5e_rep_netevent_event                    1415    1414      -1
	mlx5e_create_encap_header_ipv6              1194    1193      -1
	mlx5e_create_encap_header_ipv4              1071    1070      -1
	cxgb4_l2t_get                               1104    1103      -1
	__pneigh_lookup                               69      68      -1
	__neigh_create                              2452    2451      -1

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:36:17 -07:00
Alexey Dobriyan e451ae8e4f neigh: make struct neigh_table::entry_size unsigned int
Neigh entry size can't be negative.

Space savings:

	add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-7 (-7)
	function                                     old     new   delta
	lowpan_neigh_construct                        25      24      -1
	clip_seq_sub_iter                            152     151      -1
	clip_ioctl                                  1475    1474      -1
	clip_constructor                              93      92      -1
	__neigh_create                              2455    2452      -3

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:36:17 -07:00
Eric Dumazet 7c90584c66 net: speed up skb_rbtree_purge()
As measured in my prior patch ("sch_netem: faster rb tree removal"),
rbtree_postorder_for_each_entry_safe() is nice looking but much slower
than using rb_next() directly, except when tree is small enough
to fit in CPU caches (then the cost is the same)

Also note that there is not even an increase of text size :
$ size net/core/skbuff.o.before net/core/skbuff.o
   text	   data	    bss	    dec	    hex	filename
  40711	   1298	      0	  42009	   a419	net/core/skbuff.o.before
  40711	   1298	      0	  42009	   a419	net/core/skbuff.o

From: Eric Dumazet <edumazet@google.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:35:11 -07:00
Eric Dumazet 3aa605f28b sch_netem: faster rb tree removal
While running TCP tests involving netem storing millions of packets,
I had the idea to speed up tfifo_reset() and did experiments.

I tried the rbtree_postorder_for_each_entry_safe() method that is
used in skb_rbtree_purge() but discovered it was slower than the
current tfifo_reset() method.

I measured time taken to release skbs with three occupation levels :
10^4, 10^5 and 10^6 skbs with three methods :

1) (current 'naive' method)

	while ((p = rb_first(&q->t_root))) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		rb_erase(p, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

2) Use rb_next() instead of rb_first() in the loop :

	p = rb_first(&q->t_root);
	while (p) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		p = rb_next(p);
		rb_erase(&skb->rbnode, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

3) "optimized" method using rbtree_postorder_for_each_entry_safe()

	struct sk_buff *skb, *next;

	rbtree_postorder_for_each_entry_safe(skb, next,
					     &q->t_root, rbnode) {
               rtnl_kfree_skbs(skb, skb);
	}
	q->t_root = RB_ROOT;

Results :

method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  10000 skbs in 541846 ns (54 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)

method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  100000 skbs in 5942456 ns (59 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)

method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  1000000 skbs in 82619635 ns (82 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)

Method 2) is simply faster, probably because it maintains a smaller
working size set.

Note that this is the method we use in tcp_ofo_queue() already.

I will also change skb_rbtree_purge() in a second patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:31:32 -07:00