Commit Graph

889829 Commits

Author SHA1 Message Date
Paul Blakey 82270e1254 net/mlx5: ft: Check prio and chain sanity for ft offload
Before changing the chain from original chain to ft offload chain,
make sure user doesn't actually use chains.

While here, normalize the prio range to that which we support.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:58 -08:00
Paul Blakey e66cbc961c net/mlx5: ft: Use getter function to get ft chain
FT chain is defined as the next chain after tc.

To prepare for next patches that will increase the number of tc
chains available at runtime, use a getter function to get this
value.

The define is still used in static fs_core allocation,
to calculate the number of chains. This static allocation
will be used if the relevant capabilities won't be available
to support dynamic chains.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:58 -08:00
Paul Blakey 79cdb0aaea net/mlx5: Allow creating autogroups with reserved entries
Exclude the last n entries for an autogrouped flow table.

Reserving entries at the end of the FT will ensure that this FG will be
the last to be evaluated. This will be used in the next patch to create
a miss group enabling custom actions on FT miss.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:58 -08:00
Paul Blakey ff189b4356 net/mlx5: Add ignore level support fwd to table rules
If user sets ignore flow level flag on a rule, that rule can point to
a flow table of any level, including those with levels equal or less
than the level of the flow table it is added on.

This with unamanged tables will be used to create a FDB chain/prio
hierarchy much larger than currently supported level range.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:58 -08:00
Paul Blakey 5281a0c909 net/mlx5: fs_core: Introduce unmanaged flow tables
Currently, Most of the steering tree is statically declared ahead of time,
with steering prios instances allocated for each fdb chain to assign max
number of levels for each of them. This allows fs_core to manage the
connections and  levels of the flow tables hierarcy to prevent loops, but
restricts us with the number of supported chains and priorities.

Introduce unmananged flow tables, allowing the user to manage the flow
table connections. A unamanged table is detached from the fs_core flow
table hierarcy, and is only connected back to the hierarchy by explicit
FTEs forward actions.

This will be used together with firmware that supports ignoring the flow
table levels to increase the number of supported chains and prios.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:58 -08:00
Saeed Mahameed 12e9e0d0d9 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
This merge syncs with mlx5-next latest HW bits and layout updates for next
features, in addition one patch that improves
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
  net/mlx5e: Add discard counters per priority
  net/mlx5e: Expose FEC feilds and related capability bit
  net/mlx5: Add mlx5_ifc definitions for connection tracking support
  net/mlx5: Add copy header action struct layout
  net/mlx5: Expose resource dump register mapping
  net/mlx5: Add structures and defines for MIRC register
  net/mlx5: Read MCAM register groups 1 and 2
  net/mlx5: Add structures layout for new MCAM access reg groups
  net/mlx5: Expose vDPA emulation device capabilities
  net/mlx5: Add Virtio Emulation related device capabilities

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:48:24 -08:00
Paul Blakey 61dc7b0141 net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param
which already carries the max_fte, prio and flags memebers, and is
used the same in similar mlx5_create_flow_table() function.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 15:41:59 -08:00
Aharon Landau 827a8cb2dd net/mlx5e: Add discard counters per priority
Add counters that count (per priority) the number of received
packets that dropped due to lack of buffers on a physical port. If
this counter is increasing, it implies that the adapter is
congested and cannot absorb the traffic coming from the network.

Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:33 -08:00
Aya Levin a58837f52d net/mlx5e: Expose FEC feilds and related capability bit
Introduce 50G per lane FEC modes capability bit and newly supported
fields in PPLM register which allow this configuration.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:30 -08:00
Paul Blakey 822e114b50 net/mlx5: Add mlx5_ifc definitions for connection tracking support
Add the required hardware definitions to mlx5_ifc:
ignore_flow_level, registers, copy_header, and fwd_and_modify cap.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Sholomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:28 -08:00
Hamdan Igbaria 31d8bde1c8 net/mlx5: Add copy header action struct layout
Add definition for copy header action, copy action is used
to copy header fields from source to destination.

Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:26 -08:00
Aya Levin 609b82727f net/mlx5: Expose resource dump register mapping
Add new register enumeration for resource dump. Add layout mapping for
resource dump: access command and response.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:23 -08:00
Eran Ben Elisha bab58ba10e net/mlx5: Add structures and defines for MIRC register
Add needed structures, layouts and defines for MIRC (Management Image
Re-activation Control) register. This structure will be used for the FSM
reactivation flow in the downstream patches.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:21 -08:00
Eran Ben Elisha 932ef15511 net/mlx5: Read MCAM register groups 1 and 2
On load, Driver caches MCAM (Management Capabilities Mask Register)
registers. in addition to the only MCAM register group (0) the driver
already reads, here we add support for reading groups 1 and 2.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:19 -08:00
Eran Ben Elisha f397464eb7 net/mlx5: Add structures layout for new MCAM access reg groups
MCAM has 3 access_reg_groups (0-2). Defines data structures in order to
read and parse access_reg_groups #1 and #2.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16 14:11:16 -08:00
Tom Lendacky a006483b2f x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
If the SME and SEV features are present via CPUID, but memory encryption
support is not enabled (MSR 0xC001_0010[23]), the feature flags are cleared
using clear_cpu_cap(). However, if get_cpu_cap() is later called, these
feature flags will be reset back to present, which is not desired.

Change from using clear_cpu_cap() to setup_clear_cpu_cap() so that the
clearing of the flags is maintained.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.16.x-
Link: https://lkml.kernel.org/r/226de90a703c3c0be5a49565047905ac4e94e8f3.1579125915.git.thomas.lendacky@amd.com
2020-01-16 20:23:20 +01:00
Linus Torvalds 0c99ee44b8 platform/chrome fixes for v5.5-rc7.
One fix in the wilco_ec keyboard backlight driver
 to allow the EC driver to continue loading in the absence
 of a backlight module.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQCtZK6p/AktxXfkOlzbaomhzOwwgUCXiCicQAKCRBzbaomhzOw
 ws7lAP9JZTQi2nqmYOfink9JAEQbNSMO1tdrvQiZpYPXZ/j4LgD/c1hg2ae9MMGX
 at2A9ffaLAXEKEJ16U7A3vUlDOO9bQY=
 =2Otp
 -----END PGP SIGNATURE-----

Merge tag 'tag-chrome-platform-fixes-for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform fix from Benson Leung:
 "One fix in the wilco_ec keyboard backlight driver to allow the EC
  driver to continue loading in the absence of a backlight module"

* tag 'tag-chrome-platform-fixes-for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: wilco_ec: Fix keyboard backlight probing
2020-01-16 10:26:40 -08:00
Reinhard Speyerer f3eaabbfd0 USB: serial: option: add support for Quectel RM500Q in QDL mode
Add support for Quectel RM500Q in QDL mode.

T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 24 Spd=480  MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0800 Rev= 0.00
S:  Manufacturer=Qualcomm CDMA Technologies MSM
S:  Product=QUSB_BULK_SN:xxxxxxxx
S:  SerialNumber=xxxxxxxx
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=  2mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=10 Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

It is assumed that the ZLP flag required for other Qualcomm-based
5G devices also applies to Quectel RM500Q.

Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-16 16:54:34 +01:00
Jeremy Sowden 567d746b55 netfilter: bitwise: add support for shifts.
Hitherto nft_bitwise has only supported boolean operations: NOT, AND, OR
and XOR.  Extend it to do shifts as well.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:52:02 +01:00
Jeremy Sowden 779f725e14 netfilter: bitwise: add NFTA_BITWISE_DATA attribute.
Add a new bitwise netlink attribute that will be used by shift
operations to store the size of the shift.  It is not used by boolean
operations.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:52:02 +01:00
Jeremy Sowden ed991d4363 netfilter: bitwise: only offload boolean operations.
Only boolean operations supports offloading, so check the type of the
operation and return an error for other types.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:52:01 +01:00
Jeremy Sowden 4d57ca2be1 netfilter: bitwise: add helper for dumping boolean operations.
Split the code specific to dumping bitwise boolean operations out into a
separate function.  A similar function will be added later for shift
operations.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:52:00 +01:00
Jeremy Sowden 71d6ded3ac netfilter: bitwise: add helper for evaluating boolean operations.
Split the code specific to evaluating bitwise boolean operations out
into a separate function.  Similar functions will be added later for
shift operations.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:59 +01:00
Jeremy Sowden 3f8d9eb032 netfilter: bitwise: add helper for initializing boolean operations.
Split the code specific to initializing bitwise boolean operations out
into a separate function.  A similar function will be added later for
shift operations.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:58 +01:00
Jeremy Sowden 9d1f979986 netfilter: bitwise: add NFTA_BITWISE_OP netlink attribute.
Add a new bitwise netlink attribute, NFTA_BITWISE_OP, which is set to a
value of a new enum, nft_bitwise_ops.  It describes the type of
operation an expression contains.  Currently, it only has one value:
NFT_BITWISE_BOOL.  More values will be added later to implement shifts.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:57 +01:00
Jeremy Sowden 577c734a81 netfilter: bitwise: replace gotos with returns.
When dumping a bitwise expression, if any of the puts fails, we use goto
to jump to a label.  However, no clean-up is required and the only
statement at the label is a return.  Drop the goto's and return
immediately instead.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:57 +01:00
Jeremy Sowden 265ec7b0e8 netfilter: bitwise: remove NULL comparisons from attribute checks.
In later patches, we will be adding more checks.  In order to be
consistent and prevent complaints from checkpatch.pl, replace the
existing comparisons with NULL with logical NOT operators.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:56 +01:00
Jeremy Sowden fbf19ddf39 netfilter: nf_tables: white-space fixes.
Indentation fixes for the parameters of a few nft functions.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:55 +01:00
Pablo Neira Ayuso a7965d58dd netfilter: flowtable: add nf_flow_table_offload_cmd()
Split nf_flow_table_offload_setup() in two functions to make it more
maintainable.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:54 +01:00
Pablo Neira Ayuso ae29045018 netfilter: flowtable: add nf_flow_offload_tuple() helper
Consolidate code to configure the flow_cls_offload structure into one
helper function.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:53 +01:00
Florian Westphal 28b3a4270c netfilter: hashlimit: do not use indirect calls during gc
no need, just use a simple boolean to indicate we want to reap all
entries.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:52 +01:00
Pablo Neira Ayuso f698fe4082 netfilter: flowtable: refresh flow if hardware offload fails
If nf_flow_offload_add() fails to add the flow to hardware, then the
NF_FLOW_HW_REFRESH flag bit is set and the flow remains in the flowtable
software path.

If flowtable hardware offload is enabled, this patch enqueues a new
request to offload this flow to hardware.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:52 +01:00
Pablo Neira Ayuso a5449cdcaa netfilter: flowtable: add nf_flowtable_hw_offload() helper function
This function checks for the NF_FLOWTABLE_HW_OFFLOAD flag, meaning that
the flowtable hardware offload is enabled.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:51 +01:00
Pablo Neira Ayuso 355a8b13f8 netfilter: flowtable: use atomic bitwise operations for flow flags
Originally, all flow flag bits were set on only from the workqueue. With
the introduction of the flow teardown state and hardware offload this is
no longer true. Let's be safe and use atomic bitwise operation to
operation with flow flags.

Fixes: 59c466dd68 ("netfilter: nf_flow_table: add a new flow state for tearing down offloading")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:50 +01:00
Pablo Neira Ayuso 445db8d096 netfilter: flowtable: remove dying bit, use teardown bit instead
The dying bit removes the conntrack entry if the netdev that owns this
flow is going down. Instead, use the teardown mechanism to push back the
flow to conntrack to let the classic software path decide what to do
with it.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:49 +01:00
Pablo Neira Ayuso 87265d842c netfilter: flowtable: add nf_flow_offload_work_alloc()
Add helper function to allocate and initialize flow offload work and use
it to consolidate existing code.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:48 +01:00
Pablo Neira Ayuso a7521a60a5 netfilter: flowtable: restrict flow dissector match on meta ingress device
Set on FLOW_DISSECTOR_KEY_META meta key using flow tuple ingress interface.

Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:48 +01:00
Pablo Neira Ayuso 79b9b685dd netfilter: flowtable: fetch stats only if flow is still alive
Do not fetch statistics if flow has expired since it might not in
hardware anymore. After this update, remove the FLOW_OFFLOAD_HW_DYING
check from nf_flow_offload_stats() since this flag is never set on.

Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: wenxu <wenxu@ucloud.cn>
2020-01-16 15:51:47 +01:00
Jeremy Sowden 4a7faaf4ad netfilter: nft_bitwise: correct uapi header comment.
The comment documenting how bitwise expressions work includes a table
which summarizes the mask and xor arguments combined to express the
supported boolean operations.  However, the row for OR:

 mask    xor
 0       x

is incorrect.

  dreg = (sreg & 0) ^ x

is not equivalent to:

  dreg = sreg | x

What the code actually does is:

  dreg = (sreg & ~x) ^ x

Update the documentation to match.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:51:14 +01:00
Hans Westgaard Ry c4c86abb3f net/rds: Detect need of On-Demand-Paging memory registration
Add code to check if memory intended for RDMA is FS-DAX-memory. RDS
will fail with error code EOPNOTSUPP if FS-DAX-memory is detected.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:15:25 +02:00
Jason Gunthorpe 8ffc324851 RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
Till recently it was not possible for userspace to specify a different
IOVA, but with the new ibv_reg_mr_iova() library call this can be done.

To compute the user_va we must compute:
  user_va = (iova - iova_start) + user_va_start

while being cautious of overflow and other math problems.

The iova is not reliably stored in the mmkey when the MR is created. Only
the cached creation path (the common one) set it, so it must also be set
when creating uncached MRs.

Fix the weird use of iova when computing the starting page index in the
MR. In the normal case, when iova == umem.address:
  iova & (~(BIT(page_shift) - 1)) ==
  ALIGN_DOWN(umem.address, odp->page_size) ==
  ib_umem_start(odp)

And when iova is different using it in math with a user_va is wrong.

Finally, do not allow an implicit ODP to be created with a non-zero IOVA
as we have no support for that.

Fixes: 7bdf65d411 ("IB/mlx5: Handle page faults")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:15:07 +02:00
Moni Shoua a73a895588 IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
The ODP handler for WQEs in RQ or SRQ is not implented for kernel QPs.
Therefore don't report support in these if query comes from a kernel user.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:15:00 +02:00
Leon Romanovsky 4835709176 RDMA/mlx5: Don't fake udata for kernel path
Kernel paths must not set udata and provide NULL pointer,
instead of faking zeroed udata struct.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:53 +02:00
Moni Shoua da9ee9d8a8 IB/mlx5: Add ODP WQE handlers for kernel QPs
One of the steps in ODP page fault handler for WQEs is to read a WQE
from a QP send queue or receive queue buffer at a specific index.

Since the implementation of this buffer is different between kernel and
user QP the implementation of the handler needs to be aware of that and
handle it in a different way.

ODP for kernel MRs is currently supported only for RDMA_READ
and RDMA_WRITE operations so change the handler to
- read a WQE from a kernel QP send queue
- fail if access to receive queue or shared receive queue is
  required for a kernel QP

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:47 +02:00
Moni Shoua 87d8069f6b IB/core: Add interface to advise_mr for kernel users
Allow ULPs to call advise_mr, so they can control ODP regions
in the same way as user space applications.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:42 +02:00
Moni Shoua 33006bd4f3 IB/core: Introduce ib_reg_user_mr
Add ib_reg_user_mr() for kernel ULPs to register user MRs.

The common use case that uses this function is a userspace application
that allocates memory for HCA access but the responsibility to register
the memory at the HCA is on an kernel ULP. This ULP that acts as an agent
for the userspace application.

This function is intended to be used without a user context so vendor
drivers need to be aware of calling reg_user_mr() device operation with
udata equal to NULL.

Among all drivers, i40iw is the only driver which relies on presence
of udata, so check udata existence for that driver.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:36 +02:00
Moni Shoua c320e527e1 IB: Allow calls to ib_umem_get from kernel ULPs
So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.

This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.

Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:28 +02:00
Eyal Birger 61177e911d netfilter: nat: fix ICMP header corruption on ICMP errors
Commit 8303b7e8f0 ("netfilter: nat: fix spurious connection timeouts")
made nf_nat_icmp_reply_translation() use icmp_manip_pkt() as the l4
manipulation function for the outer packet on ICMP errors.

However, icmp_manip_pkt() assumes the packet has an 'id' field which
is not correct for all types of ICMP messages.

This is not correct for ICMP error packets, and leads to bogus bytes
being written the ICMP header, which can be wrongfully regarded as
'length' bytes by RFC 4884 compliant receivers.

Fix by assigning the 'id' field only for ICMP messages that have this
semantic.

Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Fixes: 8303b7e8f0 ("netfilter: nat: fix spurious connection timeouts")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 15:08:25 +01:00
Madhuparna Bhowmik 93ad0f969f net: wan: lapbether.c: Use built-in RCU list checking
The only callers of the function lapbeth_get_x25_dev()
are lapbeth_rcv() and lapbeth_device_event().

lapbeth_rcv() uses rcu_read_lock() whereas lapbeth_device_event()
is called with RTNL held (As mentioned in the comments).

Therefore, pass lockdep_rtnl_is_held() as cond argument in
list_for_each_entry_rcu();

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-16 14:30:52 +01:00
Florian Westphal 335178d542 netfilter: nf_tables: fix flowtable list del corruption
syzbot reported following crash:

  list_del corruption, ffff88808c9bb000->prev is LIST_POISON2 (dead000000000122)
  [..]
  Call Trace:
   __list_del_entry include/linux/list.h:131 [inline]
   list_del_rcu include/linux/rculist.h:148 [inline]
   nf_tables_commit+0x1068/0x3b30 net/netfilter/nf_tables_api.c:7183
   [..]

The commit transaction list has:

NFT_MSG_NEWTABLE
NFT_MSG_NEWFLOWTABLE
NFT_MSG_DELFLOWTABLE
NFT_MSG_DELTABLE

A missing generation check during DELTABLE processing causes it to queue
the DELFLOWTABLE operation a second time, so we corrupt the list here:

  case NFT_MSG_DELFLOWTABLE:
     list_del_rcu(&nft_trans_flowtable(trans)->list);
     nf_tables_flowtable_notify(&trans->ctx,

because we have two different DELFLOWTABLE transactions for the same
flowtable.  We then call list_del_rcu() twice for the same flowtable->list.

The object handling seems to suffer from the same bug so add a generation
check too and only queue delete transactions for flowtables/objects that
are still active in the next generation.

Reported-by: syzbot+37a6804945a3a13b1572@syzkaller.appspotmail.com
Fixes: 3b49e2e94e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16 14:22:33 +01:00