Or Gerlitz says:
====================
mlx4 driver encapsulation/steering fixes
The 1st patch fixes a bug in the TX path that supports offloading the
TX checksum of (VXLAN) encapsulated TCP packets. It turns out that the
bug is revealed only when the receiver runs in non-offloaded mode, so
we somehow missed it so far... please queue it for -stable >= 3.14
The 2nd patch makes sure not to leak steering entry on error flow,
please queue it to 3.17-stable
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If mlx4_ib_create_flow() attempts to create > 1 rules with the
firmware, and one of these registrations fail, we leaked the
already created flow rules.
One example of the leak is when the registration of the VXLAN ghost
steering rule fails, we didn't unregister the original rule requested
by the user, introduced in commit d2fce8a906 "mlx4: Set
user-space raw Ethernet QPs to properly handle VXLAN traffic".
While here, add dump of the VXLAN portion of steering rules
so it can actually be seen when flow creation fails.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For VXLAN/NVGRE encapsulation, the current HW doesn't support offloading
both the outer UDP TX checksum and the inner TCP/UDP TX checksum.
The driver doesn't advertize SKB_GSO_UDP_TUNNEL_CSUM, however we are wrongly
telling the HW to offload the outer UDP checksum for encapsulated packets,
fix that.
Fixes: 837052d0cc ('net/mlx4_en: Add netdev support for TCP/IP
offloads of vxlan tunneling')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-10-30
This series contains updates to e1000, igb and ixgbe.
Francesco Ruggeri fixes an issue with e1000 where in a VM the driver did
not support unicast filtering.
Roman Gushchin fixes an issue with igb where the driver was re-using
mapped pages so that packets were still getting dropped even if all
the memory issues are gone and there is free memory.
Junwei Zhang found where in the ixgbe_clean_rx_ring() we were repeating
the assignment of NULL to the receive buffer skb and fixes it.
Emil fixes a race condition between setup_link and SFP detection routine
in the watchdog when setting the advertised speed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes checkpatch warning:
"WARNING: Prefer seq_puts to seq_printf"
Signed-off-by: Michele Baldessari <michele@acksyn.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is often quite helpful to be able to know the state of a transport
outside of the application itself (for troubleshooting purposes or for
monitoring purposes). Add it under /proc/net/sctp/remaddr.
Signed-off-by: Michele Baldessari <michele@acksyn.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we cache them, the kernel will reuse them, independently of
whether forwarding is enabled or not. Which means that if forwarding is
disabled on the input interface where the first routing request comes
from, then that unreachable result will be cached and reused for
other interfaces, even if forwarding is enabled on them. The opposite
is also true.
This can be verified with two interfaces A and B and an output interface
C, where B has forwarding enabled, but not A and trying
ip route get $dst iif A from $src && ip route get $dst iif B from $src
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
The man page for open(2) indicates that when O_CREAT is specified, the
'mode' argument applies only to future accesses to the file:
Note that this mode applies only to future accesses of the newly
created file; the open() call that creates a read-only file
may well return a read/write file descriptor.
The man page for open(2) implies that 'mode' is treated identically by
O_CREAT and O_TMPFILE.
O_TMPFILE, however, behaves differently:
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
assert(fd == -1);
assert(errno == EACCES);
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
assert(fd > 0);
For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:
if (*opened & FILE_CREATED) {
/* Don't check for write permission, don't truncate */
open_flag &= ~O_TRUNC;
will_truncate = false;
acc_mode = MAY_OPEN;
path_to_nameidata(path, nd);
goto finish_open_created;
}
But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
may_open().
This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
inode is created, may_open() is called with acc_mode = MAY_OPEN, in
do_tmpfile().
A different, but related glibc bug revealed the discrepancy:
https://sourceware.org/bugzilla/show_bug.cgi?id=17523
The glibc lazily loads the 'mode' argument of open() and openat() using
va_arg() only if O_CREAT is present in 'flags' (to support both the 2
argument and the 3 argument forms of open; same idea for openat()).
However, the glibc ignores the 'mode' argument if O_TMPFILE is in
'flags'.
On x86_64, for open(), it magically works anyway, as 'mode' is in
RDX when entering open(), and is still in RDX on SYSCALL, which is where
the kernel looks for the 3rd argument of a syscall.
But openat() is not quite so lucky: 'mode' is in RCX when entering the
glibc wrapper for openat(), while the kernel looks for the 4th argument
of a syscall in R10. Indeed, the syscall calling convention differs from
the regular calling convention in this respect on x86_64. So the kernel
sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
fails with EACCES.
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Set RTL8152_UNPLUG when finding -ENODEV. This could accelerate
unloading the driver when the device is unplugged.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only count packets that failed cookie-authentication.
We can get SYNCOOKIESFAILED > 0 while we never even sent a single cookie.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
win0_lock was being used un-initialized, resulting in warning traces
being seen when lock debugging is enabled (and just wrong)
Fixes : fc5ab02096 ('cxgb4: Replaced the backdoor mechanism to access the HW
memory with PCIe Window method')
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2x_msix_fp_int() and bnx2x_interrupt() run from hard interrupt
context.
They can use napi_schedule_irqoff() instead of napi_schedule()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx4_en_rx_irq() and mlx4_en_tx_irq() run from hard interrupt context.
They can use napi_schedule_irqoff() instead of napi_schedule()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow the workload spreading via vRSS for IPv6.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fallback device is in ipv6 mode by default.
The mode can not be changed in runtime, so there
is no way to decapsulate ip4in6 packets coming from
various sources without creating the specific tunnel
ifaces for each peer.
This allows to update the fallback tunnel device, but only
the mode could be changed. Usual command should work for the
fallback device: `ip -6 tun change ip6tnl0 mode any`
The fallback device can not be hidden from the packet receiver
as a regular tunnel, but there is no need for synchronization
as long as we do single assignment.
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexey Andriyanov <alan@al-an.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do assignment before if condition and test !skb like in rawv6_recvmsg()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove __inline__ / inline and let compiler decide what to do
with static functions
Inspired-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apply commit e0f36310f7
("ipx: remove unnecessary casting on ntohl")
to all seq_printf/08lX
Inspired-by: "David S. Miller" <davem@davemloft.net>
Inspired-by: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152: patches for autosuspend
There are unexpected processes when enabling autosuspend.
These patches are used to fix them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid unnecessary behavior when autosuspend occurs during open().
The relative processes should only be run after finishing open().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If (tp->speed & LINK_STATUS) is not zero, the rtl8152_resume()
would call rtl_start_rx() before enabling the tx/rx. Avoid this
by resetting it to zero.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flag of SELECTIVE_SUSPEND should be cleared when autoresuming.
Otherwise, when the system suspend and resume occur, it may have
the wrong flow.
Besides, because the flag of SELECTIVE_SUSPEND couldn't be used
to check if the hw enables the relative feature, it should alwayes
be disabled in close().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
bpf: reduce verifier memory consumption and add tests
Small set of cleanups:
- reduce verifier memory consumption
- add verifier test to check register state propagation and state equivalence
- add JIT test reduced from recent nmap triggered crash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
nmap generates classic BPF programs to filter ARP packets with given target MAC
which triggered a bug in eBPF x64 JIT. The bug was fixed in
commit e0ee9c1215 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
This patch is adding a testcase in eBPF instructions (those that
were generated by classic->eBPF converter) to be processed by JIT.
The test is primarily targeting JIT compiler.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- add a test specifically targeting verifier state pruning.
It checks state propagation between registers, storing that
state into stack and state pruning algorithm recognizing
equivalent stack and register states.
- add summary line to spot failures easier
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
verifier keeps track of register state spilled to stack.
registers are 8-byte wide and always aligned, so instead of tracking them
in every byte-sized stack slot, use MAX_BPF_STACK / 8 array to track
spilled register state.
Though verifier runs in user context and its state freed immediately
after verification, it makes sense to reduce its memory usage.
This optimization reduces sizeof(struct verifier_state)
from 12464 to 1712 on 64-bit and from 6232 to 1112 on 32-bit.
Note, this patch doesn't change existing limits, which are there to bound
time and memory during verification: 4k total number of insns in a program,
1k number of jumps (states to visit) and 32k number of processed insn
(since an insn may be visited multiple times). Theoretical worst case memory
during verification is 1712 * 1k = 17Mbyte. Out-of-memory situation triggers
cleanup and rejects the program.
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An error in the code makes the allocated space for firmware to be too
small.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The new version of rtlwifi needs code in rtl92ce_get_desc() that returns
the buffer address for read operations.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The new version of rtlwifi needs code in rtl92se_get_desc() that returns
the buffer address for read operations.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Driver rtlwifi has been modified to call ieee80211_register_hw()
from the probe routine; however, the existing call in the callback
routine for deferred firmware loading was not removed.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The recent changes in checking for Bluetooth status added some callbacks to code
in rtlwifi. To make certain that all callbacks are defined, a dummy routine has been
added to rtlwifi, and the drivers that need to use it are modified.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
During 11n RX reordering, if there is a hole in RX table,
driver will not send packets to kernel until the rxreorder
timer expires or the table is full.
However, currently driver always restarts rxreorder timer when
receiving a packet, which causes the timer hardly to expire.
So while connected with to 11n AP in a busy environment,
ping packets may get blocked for about 30 seconds.
This patch fixes this timer restarting by ensuring rxreorder timer
would only be restarted either timer is not set or start_win
has changed.
Signed-off-by: Chin-Ran Lo <crlo@marvell.com>
Signed-off-by: Plus Chen <pchen@marvell.com>
Signed-off-by: Marc Yang <yangyang@marvell.com>
Signed-off-by: Cathy Luo <cluo@marvell.com>
Signed-off-by: Avinash Patil <patila@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The right shift operation has higher precedence than the mask so we
left shift by "(i * 3)" and then immediately right shift by "(i * 3)"
then we mask. It should be left shift, mask, and then right shift.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Guenter Roeck says:
====================
net: dsa: Fixes and enhancements
Patch 01/15 addresses a bug indicated by an an annoying and unhelpful
log message.
Patches 02/15 and 03/15 are minor enhancements, adding support for
known switch revisions.
Patches 04/15 and 05/15 add support for MV88E6352 and MV88E6176.
Patch 06/15 adds support for hardware monitoring, specifically for
reporting the chip temperature, to the dsa subsystem.
Patches 07/15 and 08/15 implement hardware monitoring for MV88E6352,
MV88E6176, MV88E6123, MV88E6161, and MV88E6165.
Patch 09/15 and 10/15 add support for EEPROM access to the DSA subsystem.
Patch 11/15 implements EEPROM access for MV88E6352 and MV88E6176.
Patch 12/15 adds support for reading switch registers to the DSA
subsystem.
Patches 13/15 amd 14/15 implement support for reading switch registers
to the drivers for MV88E6352, MV88E6176, MV88E6123, MV88E6161, and MV88E6165.
Patch 15/15 adds support for reading additional RMON registers to the drivers
for MV88E6352, MV88E6176, MV88E6123, MV88E6161, and MV88E6165.
The series was tested on top of v3.18-rc2 in an x86 system with MV88E6352.
Testing in systems with 88E6131, 88E6060 and MV88E6165 was done earlier
(I don't have access to those systems right now). The series was also build
tested using my build system at http://server.roeck-us.net:8010/builders.
Look into the 'dsa' column for build results.
The series merges cleanly into net-next as of today (10/29).
v3:
- Fix bug in eeprom patches seen if devicetree is enabled:
eeprom-length property is attached to switch devicetree node,
not to dsa node, and there was a compile error.
v2:
- Made reporting chip temperatures through the hwmon subsystem optional
with new Kconfig option
- Changed the hwmon chip name to <network device name>_dsa<index>
- Made EEPROM presence and size configurable through platform and devicetree
data
- Various minor changes and fixes (see individual patches for details)
====================
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Display sw_in_discards, sw_in_filtered, and sw_out_filtered for chips
supported by mv88e6123_61_65 and mv88e6352 drivers.
The variables are provided in port registers, not the normal status registers.
Mark by adding 0x100 to the register offset and add special handling code
to mv88e6xxx_get_ethtool_stats.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The infrastructure can now report switch registers to ethtool.
Add support for it to the mv88e6123_61_65 driver.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for reading switch registers with 'ethtool -d'.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
MV88E6352 supports read and write access to its configuration eeprom.
There is no means to detect if an EEPROM is connected to the switch.
Also, the switch supports EEPROMs with different sizes, but can not detect
or report the type or size of connected EEPROMs. Therefore, do not implement
the get_eeprom_len callback but depend on platform or devicetree data to
provide information about EEPROM presence and size.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dsa core now supports reading from and writing to a switch EEPROM
if connected. Describe optional devicetree property indicating that
an EEPROM is present and its size.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some chips it is possible to access the switch eeprom.
Add infrastructure support for it.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
MV88E6123 and compatible chips support reading the chip temperature
from PHY register 6:26.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
MV88E6352 supports reading the chip temperature from two PHY registers,
6:26 and 6:27. Report it using the more accurate register 6:27.
Also report temperature limit and alarm.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some switches provide chip temperature data.
Add support for reporting it through the hwmon subsystem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
MV88E6176 is mostly compatible to MV88E6352 and is documented
in the same functional specification. Add support for it.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marvell 88E6352 is mostly compatible to MV88E6123/61/65,
but requires indirect phy access. Also, its configuration
registers are a bit different.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>