OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Daniel Borkmann	b426ce83ba	bpf: Add classid helper only based on skb->sk Similarly to `5a52ae4e32` ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks"), add a helper to retrieve cgroup v1 classid solely based on the skb->sk, so it can be used as key as part of BPF map lookups out of tc from host ns, in particular given the skb->sk is retained these days when crossing net ns thanks to `9c4c325252` ("skbuff: preserve sock reference when scrubbing the skb."). This is similar to bpf_skb_cgroup_id() which implements the same for v2. Kubernetes ecosystem is still operating on v1 however, hence net_cls needs to be used there until this can be dropped in with the v2 helper of bpf_skb_cgroup_id(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net	2020-09-30 11:50:34 -07:00
Song Liu	963ec27a10	bpf: fix raw_tp test run in preempt kernel In preempt kernel, BPF_PROG_TEST_RUN on raw_tp triggers: [ 35.874974] BUG: using smp_processor_id() in preemptible [00000000] code: new_name/87 [ 35.893983] caller is bpf_prog_test_run_raw_tp+0xd4/0x1b0 [ 35.900124] CPU: 1 PID: 87 Comm: new_name Not tainted 5.9.0-rc6-g615bd02bf #1 [ 35.907358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 35.916941] Call Trace: [ 35.919660] dump_stack+0x77/0x9b [ 35.923273] check_preemption_disabled+0xb4/0xc0 [ 35.928376] bpf_prog_test_run_raw_tp+0xd4/0x1b0 [ 35.933872] ? selinux_bpf+0xd/0x70 [ 35.937532] __do_sys_bpf+0x6bb/0x21e0 [ 35.941570] ? find_held_lock+0x2d/0x90 [ 35.945687] ? vfs_write+0x150/0x220 [ 35.949586] do_syscall_64+0x2d/0x40 [ 35.953443] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by calling migrate_disable() before smp_processor_id(). Fixes: `1b4d60ec16` ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-30 08:34:08 -07:00
Thomas Kopp	dba1572c23	can: mcp25xxfd: narrow down wildcards in device tree bindings to "microchip,mcp251xfd" The wildcard should be narrowed down to prevent existing and future devices that are not compatible from matching. It is very unlikely that incompatible devices will be released that do not match the wildcard. Discussion Reference: https://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com> Link: https://lore.kernel.org/r/20200930091423.755-1-thomas.kopp@microchip.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 11:23:53 +02:00
Thomas Kopp	0e051294c0	dt-binding: can: mcp251xfd: narrow down wildcards in device tree bindings to "microchip,mcp251xfd" The wildcard should be narrowed down to prevent existing and future devices that are not compatible from matching. It is very unlikely that incompatible devices will be released that do not match the wildcard. This is the documentation part of the commit. Discussion Reference: https://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com> Link: https://lore.kernel.org/r/20200930091423.755-2-thomas.kopp@microchip.com [mkl: rename file, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 11:22:26 +02:00
Oleksij Rempel	9d5c8df1b9	dt-binding: can: mcp25xxfd: documentation fixes Apply following fixes: - Use 'interrupts'. (interrupts-extended will automagically be supported by the tools) - *-supply is always a single item. So, drop maxItems=1 - add "additionalProperties: false" flag to detect unneeded properties. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20200923125301.27200-1-o.rempel@pengutronix.de Reported-by: Rob Herring <robh@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Fixes: `1b5a78e69c` ("dt-binding: can: mcp25xxfd: document device tree bindings") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 10:34:31 +02:00
Marc Kleine-Budde	727fba74b5	can: mcp25xxfd: mcp25xxfd_irq(): add missing initialization of variable set_normal mode This patch fixes the following warning: drivers/net/can/spi/mcp25xxfd/mcp25xxfd-core.c:2155 mcp25xxfd_irq() error: uninitialized symbol 'set_normal_mode'. by adding the missing initialization. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Fixes: `55e5b97f00` ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Link: https://lore.kernel.org/r/20200923114726.2704426-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 10:34:30 +02:00
Dan Carpenter	8cffc6fe65	can: mcp25xxfd: mcp25xxfd_ring_free(): fix memory leak during cleanup This loop doesn't free the first element of the array. The "i > 0" has to be changed to "i >= 0". Fixes: `55e5b97f00` ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200923112752.GA1473821@mwanda Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 10:34:30 +02:00
Thomas Kopp	f5b84dedf7	can: mcp25xxfd: mcp25xxfd_probe(): add SPI clk limit related errata information This patch adds a reference to the recent released MCP2517FD and MCP2518FD errata sheets and paste the explanation. The driver already implements the proposed fix. Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com> Link: https://lore.kernel.org/r/20200925065606.358-1-thomas.kopp@microchip.com [mkl: split into two patches, adjust subject and commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 10:34:30 +02:00
Thomas Kopp	788b83ea2c	can: mcp25xxfd: mcp25xxfd_handle_eccif(): add ECC related errata and update log messages This patch adds a reference to the recent released MCP2517FD and MCP2518FD errata sheets and paste the explanation. The single error correction does not always work, so always indicate that a single error occurred. If the location of the ECC error is outside of the TX-RAM always use netdev_notice() to log the problem. For ECC errors in the TX-RAM, there is a recovery procedure. Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com> Link: https://lore.kernel.org/r/20200925065606.358-1-thomas.kopp@microchip.com [mkl: split into two patches, adjust subject and commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-09-30 10:34:30 +02:00
David S. Miller	611ba7536e	Merge branch 'HW-support-for-VCAP-IS1-and-ES0-in-mscc_ocelot' Vladimir Oltean says: ==================== HW support for VCAP IS1 and ES0 in mscc_ocelot The patches from RFC series "Offload tc-flower to mscc_ocelot switch using VCAP chains" have been split into 2: https://patchwork.ozlabs.org/project/netdev/list/?series=204810&state=* This is the boring part, that deals with the prerequisites, and not with tc-flower integration. Apart from the initialization of some hardware blocks, which at this point still don't do anything, no new functionality is introduced. - Key and action field offsets are defined for the supported switches. - VCAP properties are added to the driver for the new TCAM blocks. But instead of adding them manually as was done for IS2, which is error prone, the driver is refactored to read these parameters from hardware, which is possible. - Some improvements regarding the processing of struct ocelot_vcap_filter. - Extending the code to be compatible with full and quarter keys. This series was tested, along with other patches not yet submitted, on the Felix and Seville switches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:42 -07:00
Vladimir Oltean	98642d1aa2	net: mscc: ocelot: look up the filters in flower_stats() and flower_destroy() Currently a new filter is created, containing just enough correct information to be able to call ocelot_vcap_block_find_filter_by_index() on it. This will be limiting us in the future, when we'll have more metadata associated with a filter, which will matter in the stats() and destroy() callbacks, and which we can't make up on the spot. For example, we'll start "offloading" some dummy tc filter entries for the TCAM skeleton, but we won't actually be adding them to the hardware, or to block->rules. So, it makes sense to avoid deleting those rules too. That's the kind of thing which is difficult to determine unless we look up the real filter. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	085f5b9162	net: mscc: ocelot: add a new ocelot_vcap_block_find_filter_by_id function And rename the existing find to ocelot_vcap_block_find_filter_by_index. The index is the position in the TCAM, and the id is the flow cookie given by tc. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	642942637c	net: mscc: ocelot: rename variable 'cnt' in vcap_data_offset_get() The 'cnt' variable is actually used for 2 purposes, to hold the number of sub-words per VCAP entry, and the number of sub-words per VCAP action. In fact, I'm pretty sure these 2 numbers can never be different from one another. By hardware definition, the entry (key) TCAM rows are divided into the same number of sub-words as its associated action RAM rows. But nonetheless, let's at least rename the variables such that observations like this one are easier to make in the future. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	5963083a31	net: mscc: ocelot: rename variable 'count' in vcap_data_offset_get() This gets rid of one of the 2 variables named, very generically, "count". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Xiaoliang Yang	e6ae7c506f	net: mscc: ocelot: calculate vcap offsets correctly for full and quarter entries When calculating the offsets for the current entry within the row and placing them inside struct vcap_data, the function assumes half key entry (2 keys per row). This patch modifies the vcap_data_offset_get() function to calculate a correct data offset when the setting VCAP Type-Group of a key to VCAP_TG_FULL or VCAP_TG_QUARTER. This is needed because, for example, VCAP ES0 only supports full keys. Also rename the 'count' variable to 'num_entries_per_row' to make the function just one tiny bit easier to follow. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	7a155fa3d8	net: mscc: ocelot: parse flower action before key When we'll make the switch to multiple chain offloading, we'll want to know first what VCAP block the rule is offloaded to. This impacts what keys are available. Since the VCAP block is determined by what actions are used, parse the action first. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	d732e9cef0	net: mscc: ocelot: remove unneeded VCAP parameters for IS2 Now that we are deriving these from the constants exposed by the hardware, we can delete the static info we're keeping in the driver. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	2096805497	net: mscc: ocelot: automatically detect VCAP constants The numbers in struct vcap_props are not intuitive to derive, because they are not a straightforward copy-and-paste from the reference manual but instead rely on a fairly detailed level of understanding of the layout of an entry in the TCAM and in the action RAM. For this reason, bugs are very easy to introduce here. Ease the work of hardware porters and read from hardware the constants that were exported for this particular purpose. Note that this implies that struct vcap_props can no longer be const. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:24 -07:00
Vladimir Oltean	e3aea296d8	net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target As a preparation step for the offloading to ES0, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	a61e365d7c	net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target As a preparation step for the offloading to IS1, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	c1c3993edb	net: mscc: ocelot: generalize existing code for VCAP In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which have the same configuration interface, but different sets of keys and actions. The driver currently only supports VCAP IS2. In preparation of VCAP IS1 and ES0 support, the existing code must be generalized to work with any VCAP. In that direction, we should move the structures that depend upon VCAP instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct ocelot and into struct vcap_props .keys and .actions, a structure that is replicated 3 times, once per VCAP. We'll pass that structure as an argument to each function that does the key and action packing - only the control logic needs to distinguish between ocelot->vcap[VCAP_IS2] or IS1 or ES0. Another change is to make use of the newly introduced ocelot_target_read and ocelot_target_write API, since the 3 VCAPs have the same registers but put at different addresses. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Xiaoliang Yang	ed5672d82c	net: mscc: ocelot: return error if VCAP filter is not found Although it doesn't look like it is possible to hit these conditions from user space, there are 2 separate, but related, issues. First, the ocelot_vcap_block_get_filter_index function, née ocelot_ace_rule_get_index_id prior to the `aae4e500e1` ("net: mscc: ocelot: generalize the "ACE/ACL" names") rename, does not do what the author probably intended. If the desired filter entry is not present in the ACL block, this function returns an index equal to the total number of filters, instead of -1, which is maybe what was intended, judging from the curious initialization with -1, and the "++index" idioms. Either way, none of the callers seems to expect this behavior. Second issue, the callers don't actually check the return value at all. So in case the filter is not found in the rule list, propagate the return code. So update the callers and also take the opportunity to get rid of the odd coding idioms that appear to work but don't. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Vladimir Oltean	3c0e37a9e4	net: mscc: ocelot: introduce a new ocelot_target_{read,write} API There are some targets (register blocks) in the Ocelot switch that are instantiated more than once. For example, the VCAP IS1, IS2 and ES0 blocks all share the same register layout for interacting with the cache for the TCAM and the action RAM. For the VCAPs, the procedure for servicing them is actually common. We just need an API specifying which VCAP we are talking to, and we do that via these raw ocelot_target_read and ocelot_target_write accessors. In plain ocelot_read, the target is encoded into the register enum itself: u16 target = reg >> TARGET_OFFSET; For the VCAPs, the registers are currently defined like this: enum ocelot_reg { [...] S2_CORE_UPDATE_CTRL = S2 << TARGET_OFFSET, S2_CORE_MV_CFG, S2_CACHE_ENTRY_DAT, S2_CACHE_MASK_DAT, S2_CACHE_ACTION_DAT, S2_CACHE_CNT_DAT, S2_CACHE_TG_DAT, [...] }; which is precisely what we want to avoid, because we'd have to duplicate the same register map for S1 and for S0, and then figure out how to pass VCAP instance-specific registers to the ocelot_read calls (basically another lookup table that undoes the effect of shifting with TARGET_OFFSET). So for some targets, propose a more raw API, similar to what is currently done with ocelot_port_readl and ocelot_port_writel. Those targets can only be accessed with ocelot_target_{read,write} and not with ocelot_{read,write} after the conversion, which is fine. The VCAP registers are not actually modified to use this new API as of this patch. They will be modified in the next one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:26:12 -07:00
Lorenzo Bianconi	879456bedb	net: mvneta: avoid possible cache misses in mvneta_rx_swbm Do not use rx_desc pointers if possible since rx descriptors are stored in uncached memory and dereferencing rx_desc pointers generate extra loads. This patch improves XDP_DROP performance of ~ 110Kpps (700Kpps vs 590Kpps) on Marvell Espressobin Analyzed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 18:10:07 -07:00
Andrii Nakryiko	b0efc216f5	libbpf: Compile in PIC mode only for shared library case Libbpf compiles .o's for static and shared library modes separately, so no need to specify -fPIC for both. Keep it only for shared library mode. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-3-andriin@fb.com	2020-09-29 17:05:31 -07:00
Andrii Nakryiko	0a62291d69	libbpf: Compile libbpf under -O2 level by default and catch extra warnings For some reason compiler doesn't complain about uninitialized variable, fixed in previous patch, if libbpf is compiled without -O2 optimization level. So do compile it with -O2 and never let similar issue slip by again. -Wall is added unconditionally, so no need to specify it again. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-2-andriin@fb.com	2020-09-29 17:05:31 -07:00
Andrii Nakryiko	3343391345	libbpf: Fix uninitialized variable in btf_parse_type_sec Fix obvious unitialized variable use that wasn't reported by compiler. libbpf Makefile changes to catch such errors are added separately. Fixes: `3289959b97` ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-1-andriin@fb.com	2020-09-29 17:05:31 -07:00
Alexei Starovoitov	67e4ca7495	Merge branch 'bpf, x64: optimize JIT's pro/epilogue' Maciej Fijalkowski says: ==================== Hi! This small set can be considered as a followup after recent addition of support for tailcalls in bpf subprograms and is focused on optimizing x64 JIT prologue and epilogue sections. Turns out the popping tail call counter is not needed anymore and %rsp handling when stack depth is 0 can be skipped. For longer explanations, please see commit messages. Thank you, Maciej ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-29 16:47:39 -07:00
Maciej Fijalkowski	4d0b8c0b46	bpf: x64: Do not emit sub/add 0, %rsp when !stack_depth There is no particular reason for keeping the "sub 0, %rsp" insn within the BPF's x64 JIT prologue. When tail call code was skipping the whole prologue section these 7 bytes that represent the rsp subtraction could not be simply discarded as the jump target address would be broken. An option to address that would be to substitute it with nop7. Right now tail call is skipping only first 11 bytes of target program's prologue and "sub X, %rsp" is the first insn that is processed, so if stack depth is zero then this insn could be omitted without the need for nop7 swap. Therefore, do not emit the "sub 0, %rsp" in prologue when program is not making use of R10 register. Also, make the emission of "add X, %rsp" conditional in tail call code logic and take into account the presence of mentioned insn when calculating the jump offsets. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929204653.4325-3-maciej.fijalkowski@intel.com	2020-09-29 16:47:39 -07:00
Maciej Fijalkowski	d207929d97	bpf, x64: Drop "pop %rcx" instruction on BPF JIT epilogue Back when all of the callee-saved registers where always pushed to stack in x64 JIT prologue, tail call counter was placed at the bottom of the BPF program's stack frame that had a following layout: +-------------+ \| ret addr \| +-------------+ \| rbp \| <- rbp +-------------+ \| \| \| free space \| \| from: \| \| sub $x,%rsp \| \| \| +-------------+ \| rbx \| +-------------+ \| r13 \| +-------------+ \| r14 \| +-------------+ \| r15 \| +-------------+ \| tail call \| <- rsp \| counter \| +-------------+ In order to restore the callee saved registers, epilogue needed to explicitly toss away the tail call counter via "pop %rbx" insn, so that %rsp would be back at the place where %r15 was stored. Currently, the tail call counter is placed on stack before the callee saved registers (brackets on rbx through r15 mean that they are now pushed to stack only if they are used): +-------------+ \| ret addr \| +-------------+ \| rbp \| <- rbp +-------------+ \| \| \| free space \| \| from: \| \| sub $x,%rsp \| \| \| +-------------+ \| tail call \| \| counter \| +-------------+ ( rbx ) +-------------+ ( r13 ) +-------------+ ( r14 ) +-------------+ ( r15 ) <- rsp +-------------+ For the record, the epilogue insns consist of (assuming all of the callee saved registers are used by program): pop %r15 pop %r14 pop %r13 pop %rbx pop %rcx leaveq retq "pop %rbx" for getting rid of tail call counter was not an option anymore as it would overwrite the restored value of %rbx register, so it was changed to use the %rcx register. Since epilogue can start popping the callee saved registers right away without any additional work, the "pop %rcx" could be dropped altogether as "leave" insn will simply move the %rbp to %rsp. IOW, tail call counter does not need the explicit handling. Having in mind the explanation above and the actual reason for that, let's piggy back on "leave" insn for discarding the tail call counter from stack and remove the "pop %rcx" from epilogue. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929204653.4325-2-maciej.fijalkowski@intel.com	2020-09-29 16:47:39 -07:00
Ilya Leoshkevich	6458bde368	selftests/bpf: Fix endianness issues in sk_lookup/ctx_narrow_access This test makes a lot of narrow load checks while assuming little endian architecture, and therefore fails on s390. Fix by introducing LSB and LSW macros and using them to perform narrow loads. Fixes: `0ab5539f85` ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929201814.44360-1-iii@linux.ibm.com	2020-09-29 16:28:34 -07:00
Armin Wolf	2b2706aaae	lib8390: Replace panic() call with BUILD_BUG_ON Replace panic() call in lib8390.c with BUILD_BUG_ON() since checking the size of struct e8390_pkt_hdr should happen at compile-time. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:04:09 -07:00
David S. Miller	e6b6be53ec	Merge branch 'net-in_interrupt-cleanup-and-fixes' Thomas Gleixner says: ==================== net: in_interrupt() cleanup and fixes in the discussion about preempt count consistency accross kernel configurations: https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/ Linus clearly requested that code in drivers and libraries which changes behaviour based on execution context should either be split up so that e.g. task context invocations and BH invocations have different interfaces or if that's not possible the context information has to be provided by the caller which knows in which context it is executing. This includes conditional locking, allocation mode (GFP_) decisions and avoidance of code paths which might sleep. In the long run, usage of 'preemptible, in_irq etc.' should be banned from driver code completely. This is the second version of the first batch of related changes. V1 can be found here: https://lore.kernel.org/r/20200927194846.045411263@linutronix.de Changes vs. V1: - Rebased to net-next - Fixed the half done rename sillyness in the ENIC patch. - Fixed the IONIC driver fallout. - Picked up the SFC fix from Edward and adjusted the GFP_KERNEL change accordingly. - Addressed the review comments vs. BCRFMAC. - Collected Reviewed/Acked-by tags as appropriate. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	920872e083	net: rtlwifi: Replace in_interrupt() for context detection rtl_lps_enter() and rtl_lps_leave() are using in_interrupt() to detect whether it is safe to acquire a mutex or if it is required to defer to a workqueue. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be seperated or the context be conveyed in an argument passed by the caller, which usually knows the context. in_interrupt() also is only partially correct because it fails to chose the correct code path when just preemption or interrupts are disabled. Add an argument 'may_block' to both functions and adjust the callers to pass the context information. The following call chains were analyzed to be safe to block: rtl_watchdog_wq_callback() rlf_lps_leave/enter() rtl_op_suspend() rtl_lps_leave() rtl_op_bss_info_changed() rtl_lps_leave() rtl_op_sw_scan_start() rtl_lps_leave() The following call chains were analyzed to be unsafe to block: _rtl_pci_interrupt() _rtl_pci_rx_interrupt() rtl_lps_leave() _rtl_pci_interrupt() _rtl_pci_rx_interrupt() rtl_is_special_data() rtl_lps_leave() _rtl_pci_interrupt() _rtl_pci_rx_interrupt() rtl_is_special_data() setup_special_tx() rtl_lps_leave() _rtl_pci_interrupt() _rtl_pci_tx_isr rtl_lps_leave() halbtc_leave_lps() rtl_lps_leave() This leaves four callers of rtl_lps_enter/leave() where the analyzis stopped dead in the maze of several nested pointer based callchains and lack of rtlwifi hardware to debug this via tracing: halbtc_leave_lps(), halbtc_enter_lps(), halbtc_normal_lps(), halbtc_pre_normal_lps() These four have been cautionally marked to be unable to block which is the safe option, but the rtwifi wizards should be able to clarify that. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	e741751bda	net: rtlwifi: Remove in_interrupt() from debug macro The usage of in_interrupt() in drivers in is phased out. rtl_dbg() a printk based debug aid is using in_interrupt() in the underlying C function _rtl_dbg_out() which is almost identical to _rtl_dbg_print(). The only difference is the printout of in_interrupt(). The decoding of in_interrupt() as hexvalue is non-trivial and aside of being phased out for driver usage the return value is just by chance the masked preempt count value and not a boolean. These home brewn printk debug aids are tedious to work with and provide only minimal context. They should be replaced by trace_printk() or a debug tracepoint which automatically records all context information. To make progress on the in_interrupt() cleanup, make rtl_dbg() use _rtl_dbg_print() and remove _rtl_dbg_out(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	a3b7b227f1	net: rtlwifi: Remove void* casts related to delayed work INIT_DELAYED_WORK() takes two arguments: A pointer to the delayed work and a function reference for the callback. The rtl code casts all function references to (void *) because the callbacks in use are not matching the required function signature. That's error prone and bad pratice. Some of the callback functions are also global, but only used in a single file. Clean the mess up by: - Adding the proper arguments to the callback functions and using them in the container_of() constructs correctly which removes the hideous container_of_dwork_rtl() macro as well. - Removing the type cast at the initializers - Making the unnecessary global functions static Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	021b58ef51	net: libertas: Use netif_rx_any_context() The usage of in_interrupt() in non-core code is phased out. Ideally the information of the calling context should be passed by the callers or the functions be split as appropriate. libertas uses in_interupt() to select the netif_rx*() variant which matches the calling context. The attempt to consolidate the code by passing an arguemnt or by distangling it failed due lack of knowledge about this driver and because the call chains are hard to follow. As a stop gap use netif_rx_any_context() which invokes the correct code path depending on context and confines the in_interrupt() usage to core code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	8faee70181	net: libertas libertas_tf: Remove in_interrupt() from debug macro. The debug macro prints (INT) when in_interrupt() returns true. The value of this information is dubious as it does not distinguish between the various contexts which are covered by in_interrupt(). As the usage of in_interrupt() in drivers is phased out and the same information can be more precisely obtained with tracing, remove the in_interrupt() conditional from this debug printk. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	d36981e0bd	net: mwifiex: Use netif_rx_any_context(). The usage of in_interrupt() in non-core code is phased out. Ideally the information of the calling context should be passed by the callers or the functions be split as appropriate. mwifiex uses in_interupt() to select the netif_rx*() variant which matches the calling context. The attempt to consolidate the code by passing an arguemnt or by distangling it failed due lack of knowledge about this driver and because the call chains are hard to follow. As a stop gap use netif_rx_any_context() which invokes the correct code path depending on context and confines the in_interrupt() usage to core code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	75fd296398	net: hostap: Remove in_interrupt() usage in_interrupt() is ill defined and does not provide what the name suggests. The usage especially in driver code is deprecated and a tree wide effort to clean up and consolidate the (ab)usage of in_interrupt() and related checks is happening. hfa384x_cmd() and prism2_hw_reset() check in_interrupt() at function entry and if true emit a printk at debug loglevel and return. This is clearly debug code. Both functions invoke functions which can sleep. These functions already have appropriate debug checks which cover all invalid contexts, while in_interrupt() fails to detect context which just has preemption or interrupts disabled. Remove both checks as they are incomplete, debug only and already covered by the subsequently invoked functions properly. If called from invalid context the resulting back trace is definitely more helpful to analyze the problem than a printk at debug loglevel. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	bd63bca5e0	net: iwlwifi: Remove in_interrupt() from tracing macro. The usage of in_interrupt) in driver code is phased out. The iwlwifi_dbg tracepoint records in_interrupt() seperately, but that's superfluous because the trace header already records all kind of state and context information like hardirq status, softirq status, preemption count etc. Aside of that the recording of in_interrupt() as boolean does not allow to distinguish between the possible contexts (hard interrupt, soft interrupt, bottom half disabled) while the trace header gives precise information. Remove the duplicate information from the tracepoint and fixup the caller. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Luca Coelho <luca@coelho.fi> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	e4ff7d6b8c	net: ipw2x00,iwlegacy,iwlwifi: Remove in_interrupt() from debug macros The usage of in_interrupt() in non-core code is phased out. The debugging macros in these drivers use in_interrupt() to print 'I' or 'U' depending on the return value of in_interrupt(). While 'U' is confusing at best and 'I' is not really describing the actual context (hard interupt, soft interrupt, bottom half disabled section) these debug macros originate from the pre ftrace kernel era and their value today is questionable. They probably should be removed completely. The macros weere added initially for ipw2100 and then spreaded when the driver was forked. Remove the in_interrupt() usage at least.. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	c597ede403	net: brcmfmac: Convey allocation mode as argument The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be seperated or the context be conveyed in an argument passed by the caller, which usually knows the context. brcmf_fweh_process_event() uses in_interrupt() to select the allocation mode GFP_KERNEL/GFP_ATOMIC. Aside of the above reasons this check is incomplete as it cannot detect contexts which just have preemption or interrupts disabled. All callchains leading to brcmf_fweh_process_event() can clearly identify the calling context. Convey a 'gfp' argument through the callchains and let the callers hand in the appropriate GFP mode. This has also the advantage that any change of execution context or preemption/interrupt state in these callchains will be detected by the memory allocator for all GFP_KERNEL allocations. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Thomas Gleixner	687006e20c	net: brcmfmac: Convey execution context via argument to brcmf_netif_rx() bcrmgf_netif_rx() uses in_interrupt to chose between netif_rx() and netif_rx_ni(). in_interrupt() usage in drivers is phased out. Convey the execution mode via an 'inirq' argument through the various callchains leading to brcmf_netif_rx(): brcmf_pcie_isr_thread() <- Task context brcmf_proto_msgbuf_rx_trigger() brcmf_msgbuf_process_rx() brcmf_msgbuf_process_msgtype() brcmf_msgbuf_process_rx_complete() brcmf_netif_mon_rx() brcmf_netif_rx(isirq = false) brcmf_netif_rx(isirq = false) brcmf_sdio_readframes() <- Task context sdio_claim_host() might sleep brcmf_rx_frame(isirq = false) brcmf_sdio_rxglom() <- Task context sdio_claim_host() might sleep brcmf_rx_frame(isirq = false) brcmf_usb_rx_complete() <- Interrupt context brcmf_rx_frame(isirq = true) brcmf_rx_frame() brcmf_proto_rxreorder() brcmf_proto_bcdc_rxreorder() brcmf_fws_rxreorder() brcmf_netif_rx() brcmf_netif_rx() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	d067c0fa29	net: brcmfmac: Replace in_interrupt() brcmf_sdio_isr() is using in_interrupt() to distinguish if it is called from a interrupt service routine or from a worker thread. Passing such information from the calling context is preferred and requested by Linus, so add an argument `in_isr' to brcmf_sdio_isr() and let the callers pass the information about the calling context. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:55 -07:00
Sebastian Andrzej Siewior	c2f8c90079	net: wan/lmc: Remove lmc_trace() lmc_trace() was first introduced in commit e7a392d5158af ("Import 2.3.99pre6-5") and was not touched ever since. The reason for looking at this was to get rid of the in_interrupt() usage, but while looking at it the following observations were made: - At least lmc_get_stats() (->ndo_get_stats()) is invoked with disabled preemption which is not detected by the in_interrupt() check, which would cause schedule() to be called from invalid context. - The code is hidden behind #ifdef LMC_TRACE which is not defined within the kernel and wasn't at the time it was introduced. - Three jiffies don't match 50ms. msleep() would be a better match which would also avoid the schedule() invocation. But why have it to begin with? - Nobody would do something like this today. Either netdev_dbg() or trace_printk() or a trace event would be used. If only the functions related to this driver are interesting then ftrace can be used with filtering. As it is obviously broken for years, simply remove it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Sebastian Andrzej Siewior	cfa1b49319	net: usb: net1080: Remove in_interrupt() comment The comment above nc_vendor_write() suggests that the function could become async so that is usable in `in_interrupt()' context or that it already is safe to be called from such a context. Eitherway: The function did not become async since v2.4.9.2 (2002) and it must be not be called from `in_interrupt()' context because it sleeps on mutltiple occations. Remove the misleading comment. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Sebastian Andrzej Siewior	a19c261901	net: usb: kaweth: Remove last user of kaweth_control() kaweth_async_set_rx_mode() invokes kaweth_contol() and has two callers: - kaweth_open() which is invoked from preemptible context . - kaweth_start_xmit() which holds a spinlock and has bottom halfs disabled. If called from kaweth_start_xmit() kaweth_async_set_rx_mode() obviously cannot block, which means it can't call kaweth_control(). This is detected with an in_interrupt() check. Replace the in_interrupt() check in kaweth_async_set_rx_mode() with an argument which is set true by the caller if the context is safe to sleep, otherwise false. Now kaweth_control() is only called from preemptible context which means there is no need for GFP_ATOMIC allocations anymore. Replace it with usb_control_msg(). Cleanup the code a bit while at it. Finally remove kaweth_control() since the last user is gone. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Sebastian Andrzej Siewior	af3563be9d	net: usb: kaweth: Replace kaweth_control() with usb_control_msg() kaweth_control() is almost the same as usb_control_msg() except for the memory allocation mode (GFP_ATOMIC vs GFP_NOIO) and the in_interrupt() check. All the invocations of kaweth_control() are within the probe function in fully preemtible context so there is no reason to use atomic allocations, GFP_NOIO which is used by usb_control_msg() is perfectly fine. Replace kaweth_control() invocations from probe with usb_control_msg(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00
Sebastian Andrzej Siewior	911b8eacd7	net: zd1211rw: Remove ZD_ASSERT(in_interrupt()) in_interrupt() is ill defined and does not provide what the name suggests. The usage especially in driver code is deprecated and a tree wide effort to clean up and consolidate the (ab)usage of in_interrupt() and related checks is happening. handle_regs_int() is always invoked as part of URB callback which is either invoked from hard or soft interrupt context. Remove the magic assertion. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-29 14:02:54 -07:00

... 3 4 5 6 7 ...

953143 Commits All Branches Search

953143 Commits

All Branches