linux-sg2042

Commit Graph

Author	SHA1	Message	Date
David S. Miller	457740a903	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-26 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) A number of extensions to tcp-bpf, from Lawrence. - direct R or R/W access to many tcp_sock fields via bpf_sock_ops - passing up to 3 arguments to bpf_sock_ops functions - tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks - optionally calling bpf_sock_ops program when RTO fires - optionally calling bpf_sock_ops program when packet is retransmitted - optionally calling bpf_sock_ops program when TCP state changes - access to tclass and sk_txhash - new selftest 2) div/mod exception handling, from Daniel. One of the ugly leftovers from the early eBPF days is that div/mod operations based on registers have a hard-coded src_reg == 0 test in the interpreter as well as in JIT code generators that would return from the BPF program with exit code 0. This was basically adopted from cBPF interpreter for historical reasons. There are multiple reasons why this is very suboptimal and prone to bugs. To name one: the return code mapping for such abnormal program exit of 0 does not always match with a suitable program type's exit code mapping. For example, '0' in tc means action 'ok' where the packet gets passed further up the stack, which is just undesirable for such cases (e.g. when implementing policy) and also does not match with other program types. After considering _four_ different ways to address the problem, we adapt the same behavior as on some major archs like ARMv8: X div 0 results in 0, and X mod 0 results in X. aarch64 and aarch32 ISA do not generate any traps or otherwise aborts of program execution for unsigned divides. Given the options, it seems the most suitable from all of them, also since major archs have similar schemes in place. Given this is all in the realm of undefined behavior, we still have the option to adapt if deemed necessary. 3) sockmap sample refactoring, from John. 4) lpm map get_next_key fixes, from Yonghong. 5) test cleanups, from Alexei and Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-28 21:22:46 -05:00
David S. Miller	6b2e2829c1	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2018-01-26 This series contains updates to ixgbe and ixgbevf. Emil updates ixgbevf to match ixgbe functionality, starting with the consolidating of functions that represent logical steps in the receive process so we can later update them more easily. Updated ixgbevf to only synchronize the length of the frame, which will typically be the MTU or smaller. Updated the VF driver to use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed, which saves on reads and we can save time on initialization. Added support for DMA_ATTR_SKIP_CPU_SYNC/WEAK_ORDERING to help improve performance on some platforms. Updated the VF driver to do bulk updates of the page reference count instead of just incrementing it by one reference at a time. Updated the VF driver to only go through the region of the receive ring that was designated to be cleaned up, rather than process the entire ring. Colin Ian King adds the use of ARRAY_SIZE() on various arrays. Miroslav Lichvar fixes an issue where ethtool was reporting timestamping filters unsupported for X550, which is incorrect. Paul adds support for reporting 5G link speed for some devices. Dan Carpenter fixes a typo where && was used when it should have been \|\|. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-28 10:19:48 -05:00
Leon Romanovsky	751c45bd82	net/rocker: Remove unreachable return instruction The "return 0" instruction follows other return instruction and it makes it impossible to execute, hence remove it. Fixes: `00fc0c51e3` ("rocker: Change world_ops API and implementation to be switchdev independant") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-28 10:13:40 -05:00
Alexei Starovoitov	8223967fe0	Merge branch 'fix-lpm-map' Yonghong Song says: ==================== A kernel page fault which happens in lpm map trie_get_next_key is reported by syzbot and Eric. The issue was introduced by commit `b471f2f1de` ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map"). Patch #1 fixed the issue in the kernel and patch #2 adds a multithreaded test case in tools/testing/selftests/bpf/test_lpm_map. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 17:06:24 -08:00
Yonghong Song	af32efeede	tools/bpf: add a multithreaded stress test in bpf selftests test_lpm_map The new test will spawn four threads, doing map update, delete, lookup and get_next_key in parallel. It is able to reproduce the issue in the previous commit found by syzbot and Eric Dumazet. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 17:06:22 -08:00
Yonghong Song	6dd1ec6c7a	bpf: fix kernel page fault in lpm map trie_get_next_key Commit `b471f2f1de` ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") introduces a bug likes below: if (!rcu_dereference(trie->root)) return -ENOENT; if (!key \|\| key->prefixlen > trie->max_prefixlen) { root = &trie->root; goto find_leftmost; } ...... find_leftmost: for (node = rcu_dereference(root); node;) { In the code after label find_leftmost, it is assumed that root should not be NULL, but it is not true as it is possbile trie->root is changed to NULL by an asynchronous delete operation. The issue is reported by syzbot and Eric Dumazet with the below error log: ...... kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 8033 Comm: syz-executor3 Not tainted 4.15.0-rc8+ #4 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:trie_get_next_key+0x3c2/0xf10 kernel/bpf/lpm_trie.c:682 ...... This patch fixed the issue by use local rcu_dereferenced pointer instead of *(&trie->root) later on. Fixes: `b471f2f1de` ("bpf: implement MAP_GET_NEXT_KEY command or LPM_TRIE map") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 17:06:22 -08:00
Alexei Starovoitov	1651e39e4a	Merge branch 'bpf-improvements-and-fixes' Daniel Borkmann says: ==================== This set contains a small cleanup in cBPF prologue generation and otherwise fixes an outstanding issue related to BPF to BPF calls and exception handling. For details please see related patches. Last but not least, BPF selftests is extended with several new test cases. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:08 -08:00
Daniel Borkmann	21ccaf2149	bpf: add further test cases around div/mod and others Update selftests to relfect recent changes and add various new test cases. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:07 -08:00
Daniel Borkmann	73ae3c0426	bpf, arm: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from arm32 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Shubham Bansal <illusionist.neo@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:07 -08:00
Daniel Borkmann	e472d5d8af	bpf, mips64: remove unneeded zero check from div/mod with k The verifier in both cBPF and eBPF reject div/mod by 0 imm, so this can never load. Remove emitting such test and reject it from being JITed instead (the latter is actually also not needed, but given practice in sparc64, ppc64 today, so doesn't hurt to add it here either). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Daney <david.daney@cavium.com> Reviewed-by: David Daney <david.daney@cavium.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	1fb5c9c622	bpf, mips64: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from mips64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Daney <david.daney@cavium.com> Reviewed-by: David Daney <david.daney@cavium.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	740d52c646	bpf, sparc64: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from sparc64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	53fbf5719a	bpf, ppc64: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from ppc64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	a3212b8f15	bpf, s390x: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from s390x JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	96a71005bd	bpf, arm64: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from arm64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	3e5b1a39d7	bpf, x86_64: remove obsolete exception handling from div/mod Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from x86_64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:06 -08:00
Daniel Borkmann	f6b1b3bf0d	bpf: fix subprog verifier bypass by div/mod by 0 exception One of the ugly leftovers from the early eBPF days is that div/mod operations based on registers have a hard-coded src_reg == 0 test in the interpreter as well as in JIT code generators that would return from the BPF program with exit code 0. This was basically adopted from cBPF interpreter for historical reasons. There are multiple reasons why this is very suboptimal and prone to bugs. To name one: the return code mapping for such abnormal program exit of 0 does not always match with a suitable program type's exit code mapping. For example, '0' in tc means action 'ok' where the packet gets passed further up the stack, which is just undesirable for such cases (e.g. when implementing policy) and also does not match with other program types. While trying to work out an exception handling scheme, I also noticed that programs crafted like the following will currently pass the verifier: 0: (bf) r6 = r1 1: (85) call pc+8 caller: R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 callee: frame1: R1=ctx(id=0,off=0,imm=0) R10=fp0,call_1 10: (b4) (u32) r2 = (u32) 0 11: (b4) (u32) r3 = (u32) 1 12: (3c) (u32) r3 /= (u32) r2 13: (61) r0 = (u32 )(r1 +76) 14: (95) exit returning from callee: frame1: R0_w=pkt(id=0,off=0,r=0,imm=0) R1=ctx(id=0,off=0,imm=0) R2_w=inv0 R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0,call_1 to caller at 2: R0_w=pkt(id=0,off=0,r=0,imm=0) R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 from 14 to 2: R0=pkt(id=0,off=0,r=0,imm=0) R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 2: (bf) r1 = r6 3: (61) r1 = (u32 )(r1 +80) 4: (bf) r2 = r0 5: (07) r2 += 8 6: (2d) if r2 > r1 goto pc+1 R0=pkt(id=0,off=0,r=8,imm=0) R1=pkt_end(id=0,off=0,imm=0) R2=pkt(id=0,off=8,r=8,imm=0) R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 7: (71) r0 = (u8 )(r0 +0) 8: (b7) r0 = 1 9: (95) exit from 6 to 8: safe processed 16 insns (limit 131072), stack depth 0+0 Basically what happens is that in the subprog we make use of a div/mod by 0 exception and in the 'normal' subprog's exit path we just return skb->data back to the main prog. This has the implication that the verifier thinks we always get a pkt pointer in R0 while we still have the implicit 'return 0' from the div as an alternative unconditional return path earlier. Thus, R0 then contains 0, meaning back in the parent prog we get the address range of [0x0, skb->data_end] as read and writeable. Similar can be crafted with other pointer register types. Since i) BPF_ABS/IND is not allowed in programs that contain BPF to BPF calls (and generally it's also disadvised to use in native eBPF context), ii) unknown opcodes don't return zero anymore, iii) we don't return an exception code in dead branches, the only last missing case affected and to fix is the div/mod handling. What we would really need is some infrastructure to propagate exceptions all the way to the original prog unwinding the current stack and returning that code to the caller of the BPF program. In user space such exception handling for similar runtimes is typically implemented with setjmp(3) and longjmp(3) as one possibility which is not available in the kernel, though (kgdb used to implement it in kernel long time ago). I implemented a PoC exception handling mechanism into the BPF interpreter with porting setjmp()/longjmp() into x86_64 and adding a new internal BPF_ABRT opcode that can use a program specific exception code for all exception cases we have (e.g. div/mod by 0, unknown opcodes, etc). While this seems to work in the constrained BPF environment (meaning, here, we don't need to deal with state e.g. from memory allocations that we would need to undo before going into exception state), it still has various drawbacks: i) we would need to implement the setjmp()/longjmp() for every arch supported in the kernel and for x86_64, arm64, sparc64 JITs currently supporting calls, ii) it has unconditional additional cost on main program entry to store CPU register state in initial setjmp() call, and we would need some way to pass the jmp_buf down into ___bpf_prog_run() for main prog and all subprogs, but also storing on stack is not really nice (other option would be per-cpu storage for this, but it also has the drawback that we need to disable preemption for every BPF program types). All in all this approach would add a lot of complexity. Another poor-man's solution would be to have some sort of additional shared register or scratch buffer to hold state for exceptions, and test that after every call return to chain returns and pass R0 all the way down to BPF prog caller. This is also problematic in various ways: i) an additional register doesn't map well into JITs, and some other scratch space could only be on per-cpu storage, which, again has the side-effect that this only works when we disable preemption, or somewhere in the input context which is not available everywhere either, and ii) this adds significant runtime overhead by putting conditionals after each and every call, as well as implementation complexity. Yet another option is to teach verifier that div/mod can return an integer, which however is also complex to implement as verifier would need to walk such fake 'mov r0,<code>; exit;' sequeuence and there would still be no guarantee for having propagation of this further down to the BPF caller as proper exception code. For parent prog, it is also is not distinguishable from a normal return of a constant scalar value. The approach taken here is a completely different one with little complexity and no additional overhead involved in that we make use of the fact that a div/mod by 0 is undefined behavior. Instead of bailing out, we adapt the same behavior as on some major archs like ARMv8 [0] into eBPF as well: X div 0 results in 0, and X mod 0 results in X. aarch64 and aarch32 ISA do not generate any traps or otherwise aborts of program execution for unsigned divides. I verified this also with a test program compiled by gcc and clang, and the behavior matches with the spec. Going forward we adapt the eBPF verifier to emit such rewrites once div/mod by register was seen. cBPF is not touched and will keep existing 'return 0' semantics. Given the options, it seems the most suitable from all of them, also since major archs have similar schemes in place. Given this is all in the realm of undefined behavior, we still have the option to adapt if deemed necessary and this way we would also have the option of more flexibility from LLVM code generation side (which is then fully visible to verifier). Thus, this patch i) fixes the panic seen in above program and ii) doesn't bypass the verifier observations. [0] ARM Architecture Reference Manual, ARMv8 [ARM DDI 0487B.b] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487b.b/DDI0487B_b_armv8_arm.pdf 1) aarch64 instruction set: section C3.4.7 and C6.2.279 (UDIV) "A division by zero results in a zero being written to the destination register, without any indication that the division by zero occurred." 2) aarch32 instruction set: section F1.4.8 and F5.1.263 (UDIV) "For the SDIV and UDIV instructions, division by zero always returns a zero result." Fixes: `f4d7e40a5b` ("bpf: introduce function calls (verification)") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:05 -08:00
Daniel Borkmann	5e581dad4f	bpf: make unknown opcode handling more robust Recent findings by syzcaller fixed in `7891a87efc` ("bpf: arsh is not supported in 32 bit alu thus reject it") triggered a warning in the interpreter due to unknown opcode not being rejected by the verifier. The 'return 0' for an unknown opcode is really not optimal, since with BPF to BPF calls, this would go untracked by the verifier. Do two things here to improve the situation: i) perform basic insn sanity check early on in the verification phase and reject every non-uapi insn right there. The bpf_opcode_in_insntable() table reuses the same mapping as the jumptable in ___bpf_prog_run() sans the non-public mappings. And ii) in ___bpf_prog_run() we do need to BUG in the case where the verifier would ever create an unknown opcode due to some rewrites. Note that JITs do not have such issues since they would punt to interpreter in these situations. Moreover, the BPF_JIT_ALWAYS_ON would also help to avoid such unknown opcodes in the first place. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:05 -08:00
Daniel Borkmann	2a5418a13f	bpf: improve dead code sanitizing Given we recently had `c131187db2` ("bpf: fix branch pruning logic") and `95a762e2c8` ("bpf: fix incorrect sign extension in check_alu_op()") in particular where before verifier skipped verification of the wrongly assumed dead branch, we should not just replace the dead code parts with nops (mov r0,r0). If there is a bug such as fixed in `95a762e2c8` in future again, where runtime could execute those insns, then one of the potential issues with the current setting would be that given the nops would be at the end of the program, we could execute out of bounds at some point. The best in such case would be to just exit the BPF program altogether and return an exception code. However, given this would require two instructions, and such a dead code gap could just be a single insn long, we would need to place 'r0 = X; ret' snippet at the very end after the user program or at the start before the program (where we'd skip that region on prog entry), and then place unconditional ja's into the dead code gap. While more complex but possible, there's still another block in the road that currently prevents from this, namely BPF to BPF calls. The issue here is that such exception could be returned from a callee, but the caller would not know that it's an exception that needs to be propagated further down. Alternative that has little complexity is to just use a ja-1 code for now which will trap the execution here instead of silently doing bad things if we ever get there due to bugs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:05 -08:00
Daniel Borkmann	1d621674d9	bpf: xor of a/x in cbpf can be done in 32 bit alu Very minor optimization; saves 1 byte per program in x86_64 JIT in cBPF prologue. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-26 16:42:05 -08:00
Mickaël Salaün	c25ef6a5e6	samples/bpf: Partially fixes the bpf.o build Do not build lib/bpf/bpf.o with this Makefile but use the one from the library directory. This avoid making a buggy bpf.o file (e.g. missing symbols). This patch is useful if some code (e.g. Landlock tests) needs both the bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf). Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-26 23:57:10 +01:00
Lawrence Brakmo	771fc607e6	bpf: clean up from test_tcpbpf_kern.c Removed commented lines from test_tcpbpf_kern.c Fixes: `d6d4f60c3a` bpf: add selftest for tcpbpf Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-26 23:39:05 +01:00
Emil Tantilov	2bafa8fac1	ixgbe: don't set RXDCTL.RLPML for 82599 commit `2de6aa3a66` ("ixgbe: Add support for padding packet") Uses RXDCTL.RLPML to limit the maximum frame size on Rx when using build_skb. Unfortunately that register does not work on 82599. Added an explicit check to avoid setting this register on 82599 MAC. Extended the comment related to the setting of RXDCTL.RLPML to better explain its purpose. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:35 -08:00
Dan Carpenter	fd492228d4	ixgbe: Fix && vs \|\| typo "offset" can't be both 0x0 and 0xFFFF so presumably \|\| was intended instead of &&. That matches with how this check is done in other functions. Fixes: `73834aec71` ("ixgbe: extend firmware version support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:32 -08:00
Paul Greenwalt	06e3f94947	ixgbe: add support for reporting 5G link speed Since 5G link speed is supported by some devices, add reporting of 5G link speed. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:28 -08:00
Miroslav Lichvar	838200482d	ixgbe: Don't report unsupported timestamping filters for X550 The current code enables on X550 timestamping of all packets for any filter, which means ethtool should not report any PTP-specific filters as unsupported. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:23 -08:00
Colin Ian King	f0e49dc3f9	ixgbe: use ARRAY_SIZE for array sizing calculation on array buf Use the ARRAY_SIZE macro on array buf to determine size of the array. Improvement suggested by coccinelle. Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:19 -08:00
Colin Ian King	4078ea3756	ixgbevf: use ARRAY_SIZE for various array sizing calculations Use the ARRAY_SIZE macro on various arrays to determine size of the arrays. Improvement suggested by coccinelle. Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:15 -08:00
Emil Tantilov	865a4d987b	ixgbevf: don't bother clearing tx_buffer_info in ixgbevf_clean_tx_ring() In the case of the Tx rings we need to only clear the Tx buffer_info when we are resetting the rings. Ideally we do this when we configure the ring to bring it back up instead of when we are taking it down in order to avoid dirtying pages we don't need to. In addition we don't need to clear the Tx descriptor ring since we will fully repopulate it when we begin transmitting frames and next_to_watch can be cleared to prevent the ring from being cleaned beyond that point instead of needing to touch anything in the Tx descriptor ring. Finally with these changes we can avoid having to reset the skb member of the Tx buffer_info structure in the cleanup path since the skb will always be associated with the first buffer which has next_to_watch set. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 10:25:02 -08:00
David S. Miller	6bb46bc57c	Merge branch 'cxgb4-fix-dump-collection-when-firmware-crashed' Rahul Lakkireddy says: ==================== cxgb4: fix dump collection when firmware crashed Patch 1 resets FW_OK flag, if firmware reports error. Patch 2 fixes incorrect condition for using firmware LDST commands. Patch 3 fixes dump collection logic to use backdoor register access to collect dumps when firmware is crashed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 11:00:23 -05:00
Rahul Lakkireddy	770ca3477a	cxgb4: use backdoor access to collect dumps when firmware crashed Fallback to backdoor register access to collect dumps if firmware is crashed. Fixes TID, SGE Queue Context, and MPS TCAM dump collection. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 11:00:22 -05:00
Rahul Lakkireddy	ebb5568fe2	cxgb4: fix incorrect condition for using firmware LDST commands Only contact firmware if it's alive _AND_ if use_bd (use backdoor access) is not set when issuing FW_LDST_CMD. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 11:00:22 -05:00
Rahul Lakkireddy	825b2b6fd9	cxgb4: reset FW_OK flag on firmware crash If firmware reports error, reset FW_OK flag. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 11:00:22 -05:00
David S. Miller	2d0a527ed3	Merge branch 'hns3-next' Peng Li says: ==================== net: hns3: add support ethtool_ops.{set\|get}_coalesce for VF This patch-set adds ethtool_ops.{get\|set}_coalesce to VF and fix one related bug. HNS3 PF and VF driver use the common enet layer, as the ethtool_ops.{get\|set}_coalesce to PF have upstreamed, just need add the ops to hns3vf_ethtool_ops. [Patch 1/2] fix a related bug for the VF ethtool_ops.{set\| get}_coalesce. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:58:30 -05:00
Fuyun Liang	79eee41085	net: hns3: add int_gl_idx setup for VF Just like PF, if the int_gl_idx of VF does not be set, the default interrupt coalesce index of VF is 0. But it should be GL1 for TX queues and GL0 for RX queues. This patch adds the int_gl_idx setup for VF. Fixes: 200ecda42598 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:58:30 -05:00
Fuyun Liang	ad31c73201	net: hns3: add get/set_coalesce support to VF This patch adds ethtool_ops.get/set_coalesce support to VF. Since PF and VF share the same get/set_coalesce interface, we only need to set hns3_get/set_coalesce to the ethtool_ops when supporting get/set_coalesce for VF. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:58:29 -05:00
David S. Miller	e2d6e64bc3	linux-can-next-for-4.16-20180126 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlpq+ZkTHG1rbEBwZW5n dXRyb25peC5kZQAKCRAe/sog7Dgc9mFcB/wPSu30a664/+wjUvXM7Zdw4ko/PRdS deSRnjGj3epkHRyGJkdGSuPx9iGg3pqR8poMCZZmFUG+kGBmEcGQX+eyaR41zIUz iyEgZSufYDjsW47eGBsNE01xQjoL1jcF9JM7NHmRrw4+2YF75cGE3BOGcmcV6Hjc O5HDIpLmbeMHI4NcujgD4UG/VPnZQw3+oN9eyYUEbY5Aa2XQyW76DIJ3SyKsHQz0 K/s0uxAGo+Ap7xuoBUJpx6BBYoHYM171DTgXfH9pUB0MwqyDCq3hAyYGR+UEdIXb IDhIcN/l5wFU8VICjYmSKgKyjjHqlixgoki2snmJxVWu0KeVl5LJ1Edv =7jiC -----END PGP SIGNATURE----- Merge tag 'linux-can-next-for-4.16-20180126' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2018-01-26 this is a pull request for net-next/master consisting of 3 patches. The first two patches target the CAN documentation. The first is by me and fixes pointer to location of fsl,mpc5200-mscan node in the mpc5200 documentation. The second patch is by Robert Schwebel and it converts the plain ASCII documentation to restructured text. The third patch is by Fabrizio Castro add the r8a774[35] support to the rcar_can dt-bindings documentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:49:12 -05:00
Emil Tantilov	6f3554548e	ixgbevf: improve performance and reduce size of ixgbevf_tx_map() Based on commit `ec718254cb` ("ixgbe: Improve performance and reduce size of ixgbe_tx_map") This change is meant to both improve the performance and reduce the size of ixgbevf_tx_map(). Expand the work done in the main loop by pushing first into tx_buffer. This allows us to pull in the dma_mapping_error check, the tx_buffer value assignment, and the initial DMA value assignment to the Tx descriptor. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:51 -08:00
Emil Tantilov	40b8178bc9	ixgbevf: clear rx_buffer_info in configure instead of clean Based on commit `d2bead576e` ("igb: Clear Rx buffer_info in configure instead of clean") This change makes it so that instead of going through the entire ring on Rx cleanup we only go through the region that was designated to be cleaned up and stop when we reach the region where new allocations should start. In addition we can avoid having to perform a memset on the Rx buffer_info structures until we are about to start using the ring again. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:51 -08:00
Emil Tantilov	2a35efe582	ixgbevf: add counters for Rx page allocations We already had placehloders for failed page and buffer allocations. Added alloc_rx_page and made sure the stats are properly updated and exposed in ethtool. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:51 -08:00
Emil Tantilov	35074d698d	ixgbevf: update code to better handle incrementing page count Based on commit `bd4171a5d4` ("igb: update code to better handle incrementing page count") Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:50 -08:00
Emil Tantilov	16b359498b	ixgbevf: add support for DMA_ATTR_SKIP_CPU_SYNC/WEAK_ORDERING Based on commit `5be5955425` ("igb: update driver to make use of DMA_ATTR_SKIP_CPU_SYNC") and commit `7bd1759282` ("igb: Add support for DMA_ATTR_WEAK_ORDERING") Convert the calls to dma_map/unmap_page() to the attributes version and add DMA_ATTR_SKIP_CPU_SYNC/WEAK_ORDERING which should help improve performance on some platforms. Move sync_for_cpu call before we perform a prefetch to avoid invalidating the first 128 bytes of the packet on architectures where that call may invalidate the cache. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:50 -08:00
Emil Tantilov	24bff091d7	ixgbevf: use length to determine if descriptor is done Based on: commit `7ec0116c91` ("igb: Use length to determine if descriptor is done") This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. In addition we only reset the Rx descriptor length for descriptor zero when resetting a ring instead of having to do a memset with 0 over the entire ring. By doing this we can save some time on initialization. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:50 -08:00
Emil Tantilov	68b6ff5825	ixgbevf: only DMA sync frame length Based on commit `64f2525ca4` ("igb: Only DMA sync frame length") On some architectures synching a buffer for DMA may be expensive. Instead of the entire 2K receive buffer only synchronize the length of the frame, which will typically be the MTU or smaller. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:50 -08:00
Emil Tantilov	a355fd9a1b	ixgbevf: add function for checking if we can reuse page Introduce ixgbevf_can_reuse_page() similar to the change in ixgbe from commit `af43da0dba` ("ixgbe: Add function for checking to see if we can reuse page") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-01-26 07:46:50 -08:00
David S. Miller	72a231b7a7	Merge branch 'net-smc-fixes-2018-01-26' Ursula Braun says: ==================== net/smc: fixes 2018-01-26 here are some more smc patches. The first 4 patches take care about different aspects of smc socket closing, the 5th patch improves coding style. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:57 -05:00
Gustavo A. R. Silva	a8fbf8e7ec	net/smc: return booleans instead of integers Return statements in functions returning bool should use true/false instead of 1/0. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
Ursula Braun	127f497058	net/smc: release clcsock from tcp_listen_worker Closing a listen socket may hit the warning WARN_ON(sock_owned_by_user(sk)) of tcp_close(), if the wake up of the smc_tcp_listen_worker has not yet finished. This patch introduces smc_close_wait_listen_clcsock() making sure the listening internal clcsock has been closed in smc_tcp_listen_work(), before the listening external SMC socket finishes closing. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
Ursula Braun	51f1de79ad	net/smc: replace sock_put worker by socket refcounting Proper socket refcounting makes the sock_put worker obsolete. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
Ursula Braun	8dce2786a2	net/smc: smc_poll improvements Increase the socket refcount during poll wait. Take the socket lock before checking socket state. For a listening socket return a mask independent of state SMC_ACTIVE and cover errors or closed state as well. Get rid of the accept_q loop in smc_accept_poll(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00

1 2 3 4 5 ...

726439 Commits All Branches Search

726439 Commits

All Branches