Heiner Kallweit says:
====================
net: phy: improve PM handling of PHY/MDIO
Current implementation of MDIO bus PM ops doesn't actually implement
bus-specific PM ops but just calls PM ops defined on a device level
what doesn't seem to be fully in line with the core PM model.
When looking e.g. at __device_suspend() the PM core looks for PM ops
of a device in a specific order:
1. device PM domain
2. device type
3. device class
4. device bus
I think it has good reason that there's no PM ops on device level.
The situation can be improved by modeling PHY's as device type of
a MDIO device. If for some other type of MDIO device PM ops are
needed, it could be modeled as struct device_type as well.
====================
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation of MDIO bus PM ops doesn't actually implement
bus-specific PM ops but just calls PM ops defined on a device level
what doesn't seem to be fully in line with the core PM model.
When looking e.g. at __device_suspend() the PM core looks for PM ops
of a device in a specific order:
1. device PM domain
2. device type
3. device class
4. device bus
I think it has good reason that there's no PM ops on device level.
Now that a device type representation of PHY's as special type of MDIO
devices was added (only user of MDIO bus PM ops), the MDIO bus
PM ops can be removed including member pm of struct mdio_device.
If for some other type of MDIO device PM ops are needed, it should be
modeled as struct device_type as well.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A PHY is a type of MDIO device, so let's model it as struct device_type
and place PM ops, attribute groups and release callback on device type
level. For this the attribute definitions have to be moved.
This change allows us to get rid of the PM ops on a bus level in a second
step.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-06-04
This series contains a smorgasbord of updates to documentation, e1000e,
igb, ixgbe, ixgbevf and i40e.
Benjamin Poirier fixes a potential kernel crash due to NULL pointer
dereference in e1000e.
Jeff updates the kernel documentation for e100 and e1000 to correct
default values and URLs which were incorrect in the documentation. Also
took the time to update these to the new reStructured text format for
kernel documentation.
Joanna Yurdal fixes a missing PTP transmit timestamp by ensuring that
TSICR gets cleared when ICR is cleared.
Sergey updates igb to reset all the transmit queues at one time so that
we only have to wait once for all the queues to be reset.
Alex fixes ixgbevf so that malicious driver detection (MDD) can co-exist
with XDP.
Emil and Tony extend the RTNL lock to ensure we get the most up-to-date
values for the bits and avoid a possible race condition when going down.
YueHaibing from Huawei introduces a helper function in ixgbe for
operation reads to simplify the code a bit more.
Daniel Borkmann adds support for XDP meta data when using build SKB
for i40e.
Shannon Nelson provides twp fixes for the IPSec code in ixgbe, first is
to make sure we do not try to offload the decryption of any incoming
packet that is destined for the management engine. The other fix is to
resolve a cast problem introduced by a sparse cleanup patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If the multicast mask value in device tree is configured not all
0xff, the broadcast mac will be lost from tcam table after the
execution of command 'ifconfig up'. The address is appended by
hns_ae_start, but will be clear later by hns_nic_set_rx_mode
called in dev_open process.
This patch fixed it by not use the multicast mask when add a
broadcast address.
Fixes: b5996f11ea ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If requested tcf proto is not found, get and del filter netlink protocol
handlers output error message to extack, but do not return actual error
code. Add check to return ENOENT when result of tp find function is NULL
pointer.
Fixes: c431f89b18 ("net: sched: split tc_ctl_tfilter into three handlers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type. These days a u32 is not big enough to
hold all the potential features.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The features mask needs to be a netdev_features_t (u64) because a u32
is not big enough.
Fixes: cfc80d9a11 ("net: Introduce net_failover driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth-next 2018-06-04
Here's one last bluetooth-next pull request for the 4.18 kernel:
- New USB device IDs for Realtek 8822BE and 8723DE
- reset/resume fix for Dell Inspiron 5565
- Fix HCI_UART_INIT_PENDING flag behavior
- Fix patching behavior for some ATH3012 models
- A few other minor cleanups & fixes
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes some typos/misspelling errors in the
Documentation/networking files.
Signed-off-by: Olivier Gayot <olivier.gayot@sigexec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not safe to do so because such sockets are already in the
hash tables and changing these options can result in invalidating
the tb->fastreuse(port) caching.
This can have later far reaching consequences wrt. bind conflict checks
which rely on these caches (for optimization purposes).
Not to mention that you can currently end up with two identical
non-reuseport listening sockets bound to the same local ip:port
by clearing reuseport on them after they've already both been bound.
There is unfortunately no EISBOUND error or anything similar,
and EISCONN seems to be misleading for a bound-but-not-connected
socket, so use EUCLEAN 'Structure needs cleaning' which AFAICT
is the closest you can get to meaning 'socket in bad state'.
(although perhaps EINVAL wouldn't be a bad choice either?)
This does unfortunately run the risk of breaking buggy
userspace programs...
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Change-Id: I77c2b3429b2fdf42671eee0fa7a8ba721c94963b
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
to an integer.
It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
while 2 enables timewait socket reuse only for sockets that we can
prove are loopback connections:
ie. bound to 'lo' interface or where one of source or destination
IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.
This enables quicker reuse of ephemeral ports for loopback connections
- where tcp_tw_reuse is 100% safe from a protocol perspective
(this assumes no artificially induced packet loss on 'lo').
This also makes estblishing many loopback connections *much* faster
(allocating ports out of the first half of the ephemeral port range
is significantly faster, then allocating from the second half)
Without this change in a 32K ephemeral port space my sample program
(it just establishes and closes [::1]:ephemeral -> [::1]:server_port
connections in a tight loop) fails after 32765 connections in 24 seconds.
With it enabled 50000 connections only take 4.7 seconds.
This is particularly problematic for IPv6 where we only have one local
address and cannot play tricks with varying source IP from 127.0.0.0/8
pool.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Wei Wang <weiwan@google.com>
Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for configuring SRQ and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the SRQ feature.
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Yuval Bason <yuval.bason@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Varsha Rao says:
====================
net: bnx2: Fix checkpatch and clang warnings
This patchset fixes NULL comparison and extra parentheses, checkpatch
and clang warnings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the checkpatch issue of NULL comparison. Replace x == NULL
with !x, by using the following coccinelle script:
@disable is_null@
expression e;
@@
-e==NULL
+!e
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following coccinelle script removes extra parentheses to fix the
clang warning of extraneous parentheses.
@disable paren@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
)
s
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in gemini dev_warn message
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We incorrectly compare the mask and the result is that we can't modify
an already existing rule.
Fix that by comparing correctly.
Fixes: 05cd271fd6 ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When destroying the instance, destroy the head rhashtable.
Fixes: 05cd271fd6 ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
<linux/skbuff.h> does not use nor need <linux/slab.h>, so drop this
header file from skbuff.h.
<linux/skbuff.h> is currently #included in around 1200 C source and
header files, making it the 31st most-used header file.
Build tested [allmodconfig] on 20 arch-es.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_zalloc_coherent for allocating zeroed
memory and remove unnecessary memset function.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes an in-progress call will stop responding on the fileserver when
the fileserver quietly cancels the call with an internally marked abort
(RX_CALL_DEAD), without sending an ABORT to the client.
This causes the client's call to eventually expire from lack of incoming
packets directed its way, which currently leads to it being cancelled
locally with ETIME. Note that it's not currently clear as to why this
happens as it's really hard to reproduce.
The rotation policy implement by kAFS, however, doesn't differentiate
between ETIME meaning we didn't get any response from the server and ETIME
meaning the call got cancelled mid-flow. The latter leads to an oops when
fetching data as the rotation partially resets the afs_read descriptor,
which can result in a cleared page pointer being dereferenced because that
page has already been filled.
Handle this by the following means:
(1) Set a flag on a call when we receive a packet for it.
(2) Store the highest packet serial number so far received for a call
(bearing in mind this may wrap).
(3) If, when the "not received anything recently" timeout expires on a
call, we've received at least one packet for a call and the connection
as a whole has received packets more recently than that call, then
cancel the call locally with ECONNRESET rather than ETIME.
This indicates that the call was definitely in progress on the server.
(4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME,
don't try the next server, but rather abort the call.
This avoids the oops as we don't try to reuse the afs_read struct.
Rather, as-yet ungotten pages will be reread at a later data.
Also:
(5) Add an rxrpc tracepoint to log detection of the call being reset.
Without this, I occasionally see an oops like the following:
general protection fault: 0000 [#1] SMP PTI
...
RIP: 0010:_copy_to_iter+0x204/0x310
RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206
RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560
RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000
RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400
R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958
R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560
FS: 00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0
Call Trace:
skb_copy_datagram_iter+0x14e/0x289
rxrpc_recvmsg_data.isra.0+0x6f3/0xf68
? trace_buffer_unlock_commit_regs+0x4f/0x89
rxrpc_kernel_recv_data+0x149/0x421
afs_extract_data+0x1e0/0x798
? afs_wait_for_call_to_complete+0xc9/0x52e
afs_deliver_fs_fetch_data+0x33a/0x5ab
afs_deliver_to_call+0x1ee/0x5e0
? afs_wait_for_call_to_complete+0xc9/0x52e
afs_wait_for_call_to_complete+0x12b/0x52e
? wake_up_q+0x54/0x54
afs_make_call+0x287/0x462
? afs_fs_fetch_data+0x3e6/0x3ed
? rcu_read_lock_sched_held+0x5d/0x63
afs_fs_fetch_data+0x3e6/0x3ed
afs_fetch_data+0xbb/0x14a
afs_readpages+0x317/0x40d
__do_page_cache_readahead+0x203/0x2ba
? ondemand_readahead+0x3a7/0x3c1
ondemand_readahead+0x3a7/0x3c1
generic_file_buffered_read+0x18b/0x62f
__vfs_read+0xdb/0xfe
vfs_read+0xb2/0x137
ksys_read+0x50/0x8c
do_syscall_64+0x7d/0x1a0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Note the weird value in RDI which is a result of trying to kmap() a NULL
page pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let user space set whatever it would like to advertise for the
tun interface. Preserve the existing defaults.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov says:
====================
sh_eth: fix & clean up sh_eth_soft_swap()
Here's a set of 3 patches against DaveM's 'net-next.git' repo. First one fixes an
old buffer endiannes issue (luckily, the ARM SoCs are smart enough to not actually
care) plus couple clean ups around sh_eth_soft_swap()...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When initializing 'maxp' in sh_eth_soft_swap(), the buffer length needs
to be rounded up -- that's just asking for DIV_ROUND_UP()!
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
sh_eth_tsu_soft_swap() is called twice by the driver, remove *inline* and
move that function from the header to the driver itself to let gcc decide
whether to expand it inline or not...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Browsing thru the driver disassembly, I noticed that ARM gcc generated
no code whatsoever for sh_eth_soft_swap() while building a little-endian
kernel -- apparently __LITTLE_ENDIAN__ was not being #define'd, however
it got implicitly #define'd when building with the SH gcc (I could only
find the explicit #define __LITTLE_ENDIAN that was #include'd when building
a little-endian kernel). Luckily, the Ether controller only doing big-
endian DMA is encountered on the early SH771x SoCs only and all ARM SoCs
implement EDMR.DE and thus set 'sh_eth_cpu_data::hw_swap'. But anyway, we
need to fix the #ifdef inside sh_eth_soft_swap() to something that would
work on all architectures...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up a cast problem introduced by a sparse cleanup patch. This fixes
a problem where the encrypted packets were not recognized on Rx and
subsequently dropped.
Fixes: 9cfbfa701b ("ixgbe: cleanup sparse warnings")
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make sure we don't try to offload the decryption of an incoming
packet that should get delivered to the management engine. This
is a corner case that will likely be very seldom seen, but could
really confuse someone if they were to hit it.
Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ido Schimmel says:
====================
mlxsw: Fixes in offloading of mirror-to-gretap
Petr says:
These two patches fix issues in offloading of mirror-to-gretap when
bridge is present in the underlay.
In patch #1, reconsideration of SPAN configuration is not done right at
the point that SWITCHDEV_OBJ_ID_PORT_VLAN deletion notification is
distributed, but is postponed, because the notifications are actually
distributed before the relevant change is implemented in the bridge.
In patch #2, a problem in configuring VLAN tagging in situations when a
VLAN device is on top of an 802.1Q bridge whose egress port is marked as
"egress untagged". In that case, mlxsw would neglect to suppress the
tagging implicitly assumed after the VLAN device was seen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When offloading mirroring to gretap or ip6gretap netdevices, an 802.1q
bridge is one of the soft devices permissible in the underlay when
resolving the packet path. After the packet path is resolved to a
particular bridge egress device, flags on packet VLAN determine whether
the egressed packet should be tagged.
The current logic however only ever sets the VLAN tag, never suppresses
it. Thus if there's a VLAN netdevice above the bridge that determines
the packet VLAN, that VLAN is never unset, and mirroring is configured
with VLAN tagging.
Fix by setting the packet VLAN on both branches: set to zero (for unset)
when BRIDGE_VLAN_INFO_UNTAGGED, copy the resolved VLAN (e.g. from bridge
PVID) otherwise.
Fixes: 946a11e740 ("mlxsw: spectrum_span: Allow bridge for gretap mirror")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN deletion notifications are emitted before the relevant change is
projected to bridge configuration. Thus, like with VLAN addition,
schedule SPAN respin for later.
Fixes: c520bc6986 ("mlxsw: Respin SPAN on switchdev events")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to ixgbevf, the same possibility for race exists. Extend the RTNL
lock in ixgbe_reset_subtask() to protect the state bits; this is to make
sure that we get the most up-to-date values for the bits and avoid a
possible race when going down.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for XDP meta data when using build skb variant of
the i40e driver. Implementation is analogous to the existing
ixgbe and ixgbevf support for meta data from 366a88fe2f ("bpf,
ixgbe: add meta data support") and be8333322e ("ixgbevf: Add
support for meta data"). With the build skb variant we get
192 bytes of extra headroom which can be used for encaps or
meta data.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also inconsistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.
Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().
Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe_dbg_reg_ops_read and ixgbe_dbg_netdev_ops_read copy-pasting
the same code except for ixgbe_dbg_netdev_ops_buf/ixgbe_dbg_reg_ops_buf,
so introduce a helper ixgbe_dbg_common_ops_read to remove redundant code.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Extend the RTNL lock in ixgbevf_reset_subtask() to protect the state bits
check in addition to the call to ixgbevf_reinit_locked().
This is to make sure that we get the most up-to-date values for the bits
and avoid a possible race when going down.
Suggested-by: Zhiping du <zhipingdu@tencent.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also incosistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.
Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().
Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of the VF driver it is supposed to provide a context descriptor
that allows us to provide information about the header offsets inside of
the frame. However in the case of XDP we don't really have any of that
information since the data is minimally processed. As a result we were
seeing malicious driver detection (MDD) events being triggered when the PF
had that functionality enabled.
To address this I have added a bit of new code that will "prime" the XDP
ring by providing one context descriptor that assumes the minimal setup of
an Ethernet frame which is an L2 header length of 14. With just that we can
provide enough information to make the hardware happy so that we don't
trigger MDD events.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move 10ms sleep out of function resetting TX queue.
Reset all the TX queues in one turn and
wait for all of them just once.
Use usleep_range() instead of mdelay() in order not to
affect transmission on other interfaces.
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Issuing "ip link set up/down" can block TSICR interrupts, what results in
missing PTP Tx timestamp and no PPS pulse generation.
Problem happens when the link is set up with the TSICR interrupts pending.
ICR is cleared before enabling interrupts, while TSICR is not. When all TSICR
interrupts are pending at this moment, time_sync interrupt will never
be generated. TSICR should be cleared as well.
In order to reproduce the issue:
1. Setup linux with IEEE 1588 grandmaster and PPS output enabled
2. Continue setting link up/down with random intervals between commands
3. Wait until PPS is not generated ( only one pulse is generated and PPS
dies), and ptp4l complains constantly about Tx timeout.
Signed-off-by: Joanna Yurdal <jyu@trackman.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Updated the e1000.txt kernel documentation with the latest information.
Also convert the text file to reStructuredText (RST) format, since the
Linux kernel documentation now uses this format for documentation.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Over the years, several of the links have changed or are no longer valid
so update them. In addition, the default values were incorrect for a
couple of parameters.
Converted the text file to the reStructuredText (RST) format, since the
Linux kernel documentation now uses this format for documentation.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
There have been multiple reports of crashes that look like
kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50
[...]
kernel: Call Trace:
kernel: [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e]
kernel: [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80 [e1000e]
kernel: [<ffffffff810992c5>] process_one_work+0x155/0x440
kernel: [<ffffffff81099e16>] worker_thread+0x116/0x4b0
kernel: [<ffffffff8109f422>] kthread+0xd2/0xf0
kernel: [<ffffffff8163184f>] ret_from_fork+0x3f/0x70
These can be traced back to the fact that e1000e_systim_reset() skips the
timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL, which
leads to a null deref in timecounter_read().
Commit 83129b37ef ("e1000e: fix systim issues", v4.2-rc1) reworked
e1000e_get_base_timinca() in such a way that it can return -EINVAL for
e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL.
Some experimentation has shown that on I219 (e1000_pch_spt, "MAC: 12")
adapters, the E1000_TSYNCRXCTL_SYSCFI flag is unstable; TSYNCRXCTL reads
sometimes don't have the SYSCFI bit set. Retrying the read shortly after
finds the bit to be set. This was observed at boot (probe) but also link up
and link down.
Moreover, the phc (PTP Hardware Clock) seems to operate normally even after
reads where SYSCFI=0. Therefore, remove this register read and
unconditionally set the clock parameters.
Reported-by: Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de>
Message-Id: <20180425065243.g5mqewg5irkwgwgv@f2>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1075876
Fixes: 83129b37ef ("e1000e: fix systim issues")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixes: db9d7d36ee ("net: mvpp2: Split the PPv2 driver to a dedicated directory")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>