Commit Graph

877021 Commits

Author SHA1 Message Date
Thomas Bogendoerfer e2ffe3ff6f net: ipconfig: Wait for deferred device probes
If network device drivers are using deferred probing, it was possible
that ipconfig's wait for devices to show up was already over by the
time the device eventually appeared. By calling wait_for_device_probe()
we now make sure deferred probing is done before checking for available
devices.
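
For illustration, a minimal sketch of the idea (context simplified; the
exact hunk in net/ipv4/ipconfig.c may differ):

	static int __init ic_open_devs(void)
	{
		struct net_device *dev;

		/* let deferred probes finish so late devices are visible */
		wait_for_device_probe();

		rtnl_lock();
		for_each_netdev(&init_net, dev) {
			/* ... consider dev as an autoconfig candidate ... */
		}
		rtnl_unlock();
		return 0;
	}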

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:39:53 -08:00
Mao Wenan 2be8ca97d0 vsock/vmci: make vmci_vsock_cb_host_called static
When compiling with make C=2 drivers/misc/vmw_vmci/vmci_driver.o,
the following warning can be seen:
drivers/misc/vmw_vmci/vmci_driver.c:33:6: warning:
symbol 'vmci_vsock_cb_host_called' was not declared. Should it be static?

This patch makes the symbol vmci_vsock_cb_host_called static.
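
The change itself is a one-line scope fix (sketch of the diff):

	-bool vmci_vsock_cb_host_called;
	+static bool vmci_vsock_cb_host_called;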

Fixes: b1bba80a43 ("vsock/vmci: register vmci_transport only when VMCI guest/host are active")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:39:29 -08:00
David S. Miller aee024f610 Merge branch 'ibmvnic-regression'
Juliet Kim says:

====================
Support both XIVE and XICS modes in ibmvnic

This series aims to support both XICS and XIVE while avoiding
a regression in behavior when a system runs in XICS mode.

Patch 1 reverts commit 11d49ce9f7
(“net/ibmvnic: Fix EOI when running in XIVE mode.”)

Patch 2 ignores the H_FUNCTION return from H_EOI to tolerate XIVE mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:37:15 -08:00
Juliet Kim 2df5c60e19 net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
Reversion of commit 11d49ce9f7
(“net/ibmvnic: Fix EOI when running in XIVE mode.”) leaves us
calling H_EOI even in XIVE mode. That will fail with H_FUNCTION
because H_EOI is not supported in that mode. That failure is
harmless. Ignore it so we can use common code for both XICS and
XIVE.
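
A minimal sketch of the resulting logic (variable names assumed from
context):

	rc = plpar_hcall_norets(H_EOI, val);
	/* H_EOI fails with H_FUNCTION in XIVE mode; that is expected
	 * and harmless, so only complain about other return codes.
	 */
	if (rc && rc != H_FUNCTION)
		dev_err(dev, "H_EOI failed, rc=%ld\n", rc);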

Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:37:15 -08:00
Juliet Kim 284f87d2f3 Revert "net/ibmvnic: Fix EOI when running in XIVE mode"
This reverts commit 11d49ce9f7
(“net/ibmvnic: Fix EOI when running in XIVE mode.”) since it
has the unintended effect of changing the interrupt priority
and emits a warning when running in legacy XICS mode.

Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:37:15 -08:00
David S. Miller e07e75412b Merge branch 'page_pool-DMA-sync'
Lorenzo Bianconi says:

====================
add DMA-sync-for-device capability to page_pool API

Introduce the possibility to sync DMA memory for device in the page_pool API.
This feature allows syncing only the required DMA size rather than always
the full buffer (dma_sync_single_for_device can be very costly).
Please note DMA-sync-for-CPU is still the device driver's responsibility.
Relying on page_pool DMA sync, the mvneta driver improves XDP_DROP pps by
about 170Kpps:

- XDP_DROP DMA sync managed by mvneta driver:	~420Kpps
- XDP_DROP DMA sync managed by page_pool API:	~585Kpps

Do not change the naming convention for the moment since the changes will
hit other drivers as well. I will address it in another series.

Changes since v4:
- do not allow the driver to set max_len to 0
- convert PP_FLAG_DMA_MAP/PP_FLAG_DMA_SYNC_DEV to BIT() macro

Changes since v3:
- move dma_sync_for_device before putting the page in ptr_ring in
  __page_pool_recycle_into_ring since ptr_ring can be consumed
  concurrently. Simplify the code moving dma_sync_for_device
  before running __page_pool_recycle_direct/__page_pool_recycle_into_ring

Changes since v2:
- rely on PP_FLAG_DMA_SYNC_DEV flag instead of dma_sync

Changes since v1:
- rename sync in dma_sync
- set dma_sync_size to 0xFFFFFFFF in page_pool_recycle_direct and
  page_pool_put_page routines
- Improve documentation
====================

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:34:37 -08:00
Lorenzo Bianconi 07e13edbb6 net: mvneta: get rid of huge dma sync in mvneta_rx_refill
Get rid of the costly dma_sync_single_for_device in mvneta_rx_refill
since the driver can now let the page_pool API manage the needed DMA
sync with a proper size.

- XDP_DROP DMA sync managed by mvneta driver:	~420Kpps
- XDP_DROP DMA sync managed by page_pool API:	~585Kpps

Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:34:29 -08:00
Lorenzo Bianconi e68bc75691 net: page_pool: add the possibility to sync DMA memory for device
Introduce the following parameters in order to add the possibility to sync
DMA memory for device before putting allocated pages in the page_pool
caches:
- PP_FLAG_DMA_SYNC_DEV: if set in page_pool_params flags, all pages that
  the driver gets from page_pool will be DMA-synced-for-device according
  to the length provided by the device driver. Please note DMA-sync-for-CPU
  is still the device driver's responsibility
- offset: DMA address offset where the DMA engine starts copying rx data
- max_len: maximum DMA memory size page_pool is allowed to flush. This
  is currently used in __page_pool_alloc_pages_slow routine when pages
  are allocated from page allocator
These parameters are supposed to be set by device drivers.

This optimization reduces the length of the DMA-sync-for-device.
The optimization is valid because pages are initially
DMA-synced-for-device as defined via max_len. At RX time, the driver
will perform a DMA-sync-for-CPU on the memory for the packet length.
What is important is the memory occupied by packet payload, because
this is the area CPU is allowed to read and modify. As we don't track
cache-lines written into by the CPU, simply use the packet payload length
as dma_sync_size at page_pool recycle time. This also takes into account
any tail-extend.
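
A sketch of how a driver would fill the new parameters (values are
illustrative; num_rx_descs and rx_buf_size are placeholders):

	struct page_pool_params pp_params = {
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.pool_size	= num_rx_descs,
		.nid		= numa_mem_id(),
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.offset		= XDP_PACKET_HEADROOM,	/* where rx data starts */
		.max_len	= rx_buf_size,		/* max size to DMA-sync */
	};

	pool = page_pool_create(&pp_params);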

Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:34:28 -08:00
Lorenzo Bianconi f383b29500 net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp
Rely on page_pool_recycle_direct and not on xdp_return_buff in
mvneta_run_xdp. This is a preliminary patch to limit the dma sync length
to the one strictly necessary.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:34:17 -08:00
Eran Ben Elisha 30e9e0550b net/mlxfw: Verify FSM error code translation doesn't exceed array size
The array mlxfw_fsm_state_err_str contains value-to-string translations
for values provided by mlxfw_dev. If a value is larger than
MLXFW_FSM_STATE_ERR_MAX, return "unknown error" as expected instead of
reading an address that exceeds the array size.
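
A sketch of the bounds check (close to the described clamp):

	fsm_state_err = min_t(enum mlxfw_fsm_state_err, fsm_state_err,
			      MLXFW_FSM_STATE_ERR_MAX);
	pr_err("firmware flash failed: %s\n",
	       mlxfw_fsm_state_err_str[fsm_state_err]);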

Fixes: 410ed13cae ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:06 -08:00
Shani Shapp b7eca94032 net/mlx5: Update the list of the PCI supported devices
Add the upcoming ConnectX-6 LX device ID.

Fixes: 85327a9c41 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Shani Shapp <shanish@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:06 -08:00
Maor Gottlieb 97fd8da281 net/mlx5: Fix auto group size calculation
Once all the large flow groups (defined by the user when the flow table
is created via max_num_groups) have been created, all subsequent new
flow groups get only one flow table entry, even though the flow table
has room for larger groups.
Fix the condition to prefer large flow groups.

Fixes: f0d22d1874 ("net/mlx5_core: Introduce flow steering autogrouped flow table")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:06 -08:00
Marina Varshaver 9c98f7ec01 net/mlx5e: Add missing capability bit check for IP-in-IP
A device that doesn't support IP-in-IP offloads has to filter out csum and
gso offload support; otherwise the kernel will conclude that the device is
capable of offloading csum and gso for IP-in-IP tunnels, which might result
in IP-in-IP tunnels not functioning.

Fixes: 25948b87dd ("net/mlx5e: Support TSO and TX checksum offloads for IP-in-IP")
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:06 -08:00
Eran Ben Elisha 2496057450 net/mlx5e: Do not use non-EXT link modes in EXT mode
On some old firmware, the connector type value was not supported, and the
value read from FW was 0. For those, the driver used the link mode in
order to set the connector type in link_ksetting.

After FW exposed the connector type, the driver translated the value to
ethtool definitions. However, as 0 is a valid value, before returning
PORT_OTHER the driver ran the link mode check in order to maintain
backward compatibility.

The cited patch added support for EXT mode. With both features (connector
type and EXT link modes), if the connector_type read from FW is 0 and EXT
mode is set, the driver mistakenly compares EXT link modes to non-EXT link
modes. Fix that by skipping this comparison when in EXT mode, as the
connector type value is valid in this scenario.

Fixes: 6a89737241 ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:05 -08:00
Roi Dayan 751021218f net/mlx5e: Fix set vf link state error flow
Before this commit the ndo always returned success, even on failure.
Fix that.

Fixes: 1ab2068a4c ("net/mlx5: Implement vports admin state backup/restore")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:05 -08:00
Alex Vesker 21586a0fc4 net/mlx5: DR, Limit STE hash table enlarge based on bytemask
When an STE hash table has too many collisions, we enlarge it
into a bigger hash table (rehash). The collision improvement from
rehashing depends on the byte mask value: the more 1 bits the byte
mask has, the better the spreading in the table.

Without this fix, tables can grow in size without providing any
improvement, which can lead to memory depletion and failures.

This patch limits table rehash to reduce memory consumption and
improve performance.

Fixes: 41d0707415 ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:05 -08:00
Alex Vesker 83e7948913 net/mlx5: DR, Skip rehash for tables with byte mask zero
The byte mask fields affect the hash index distribution; when the
byte mask is zero, the hash calculation always yields the same index.

To avoid unneeded rehashing of such hash tables, mark the table to
skip rehash.

This is needed by the next patch which will limit table rehash
to reduce memory consumption.

Fixes: 41d0707415 ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:05 -08:00
Alex Vesker 829969956f net/mlx5: DR, Fix invalid EQ vector number on CQ creation
When creating a CQ, the CPU id is used as the vector value.
This would fail in case the CPU id was higher than the maximum
vector value.
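
A minimal sketch of the kind of fix involved, assuming the vector is
clamped into the valid range with a modulo (context simplified):

	/* never pass a vector beyond the number of completion vectors */
	vector = smp_processor_id() % mlx5_comp_vectors_count(mdev);
	err = mlx5_vector2eqn(mdev, vector, &eqn, &irqn);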

Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:05 -08:00
Vlad Buslov b6a4ac24c1 net/mlx5e: Reorder mirred action parsing to check for encap first
Mirred action parsing code in parse_tc_fdb_actions() first checks whether
out_dev has the same parent id, and only then verifies that there is a
pending encap action parsed earlier. A recent change in the vxlan module
made netdev_port_same_parent_id() return true when called for an mlx5
eswitch representor and a vxlan device created explicitly on an mlx5
representor device (vxlan devices created with the "external" flag without
explicitly specifying a parent interface are not affected). With
netdev_port_same_parent_id() returning true, an incorrect code path is
chosen and encap rules fail to offload because the vxlan dev is not a
valid eswitch forwarding dev. Dmesg log of the error:

[ 1784.389797] devices ens1f0_0 vxlan1 not on same switch HW, can't offload forwarding

In order to fix the issue, rearrange the conditional in
parse_tc_fdb_actions() to check for a pending encap action before checking
if out_dev has the same parent id.

Fixes: 0ce1822c2a ("vxlan: add adjacent link to limit depth level")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:04 -08:00
Eli Cohen 7b83355f6d net/mlx5e: Fix ingress rate configuration for representors
Current code uses the old method of prio encoding in
flow_cls_common_offload. Fix it to follow the changes introduced in
commit ef01adae0e ("net: sched: use major priority number as hardware priority").

Fixes: fcb64c0f56 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:04 -08:00
Eli Cohen a86db2269f net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6
Be sure to release the neighbour in case of failures after successful
route lookup.

Fixes: 101f4de9dd ("net/mlx5e: Move TC tunnel offloading code to separate source file")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-20 12:33:04 -08:00
Gautam Ramakrishnan cec2975f2b net: sched: pie: enable timestamp based delay calculation
RFC 8033 suggests an alternative approach to calculate the queue
delay in PIE by using a timestamp on every enqueued packet. This
patch adds an implementation of that approach and sets it as the
default method to calculate queue delay. The previous method (based
on Little's law) to calculate queue delay is set as optional.
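
A sketch of the timestamp-based approach (helper names illustrative):

	/* enqueue: stamp the packet */
	pie_set_enqueue_time(skb);	/* stores psched_get_time() in skb->cb */

	/* dequeue: queue delay is measured directly from the stamp,
	 * instead of being estimated via Little's law (qlen / avg_dq_rate)
	 */
	qdelay = psched_get_time() - pie_get_enqueue_time(skb);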

Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Acked-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:31:45 -08:00
Krzysztof Kozlowski f01b437d12 isdn: Fix Kconfig indentation
Adjust indentation from spaces to tabs (+optional two spaces) as per the
coding style, with a command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:30:47 -08:00
Krzysztof Kozlowski 041ccdb620 nfc: Fix Kconfig indentation
Adjust indentation from spaces to tabs (+optional two spaces) as per the
coding style, with a command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:30:40 -08:00
David S. Miller a20ee510a9 Merge branch 's390-fixes'
Julian Wiedmann says:

====================
s390/qeth: fixes 2019-11-20

please apply two late qeth fixes to your net tree.

The first fixes a deadlock that can occur if a qeth device is set
offline while in the middle of processing deferred HW events.
The second patch converts the return value of an error path to
use -EIO, so that it can be passed back to userspace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:29:47 -08:00
Julian Wiedmann 2f3c269d96 s390/qeth: return proper errno on IO error
When propagating IO errors back to userspace, one error path in
qeth_irq() currently returns '1' instead of a proper errno.

Fixes: 54daaca702 ("s390/qeth: cancel cmd on early error")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:29:47 -08:00
Julian Wiedmann c8183f5489 s390/qeth: fix potential deadlock on workqueue flush
The L2 bridgeport code uses the coarse 'conf_mutex' for guarding access
to its configuration state.
This can result in a deadlock when qeth_l2_stop_card() - called under the
conf_mutex - blocks on flush_workqueue() to wait for the completion of
pending bridgeport workers. Such workers would also need to acquire
the conf_mutex, stalling indefinitely.

Introduce a lock that specifically guards the bridgeport configuration,
so that the workers no longer need the conf_mutex.
Wrapping qeth_l2_promisc_to_bridge() in this fine-grained lock then also
fixes a theoretical race against a concurrent qeth_bridge_port_role_store()
operation.
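
A sketch of the resulting scheme (lock member name assumed):

	/* workers take only the fine-grained bridgeport lock, not conf_mutex */
	mutex_lock(&card->sbp_lock);
	qeth_l2_promisc_to_bridge(card);
	mutex_unlock(&card->sbp_lock);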

Fixes: c0a2e4d10d ("s390/qeth: conclude all event processing before offlining a card")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:29:47 -08:00
Hangbin Liu 004b39427f ipv6/route: return if there is no fib_nh_gw_family
Previously we would return directly if (!rt || !rt->fib6_nh.fib_nh_gw_family)
in rt6_probe(), but after commit cc3a86c802
("ipv6: Change rt6_probe to take a fib6_nh"), the logic mistakenly changed
to return if there is a fib_nh_gw_family.
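
The fix restores the inverted check (sketch):

	static void rt6_probe(struct fib6_nh *fib6_nh)
	{
		if (!fib6_nh->fib_nh_gw_family)
			return;	/* nothing to probe without a gateway */
		/* ... */
	}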

Fixes: cc3a86c802 ("ipv6: Change rt6_probe to take a fib6_nh")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:17:39 -08:00
Jouni Hogander b8eb718348 net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject
kobject_init_and_add() takes a reference even when it fails. This
reference has to be given up by the caller in error handling; otherwise
memory allocated by kobject_init_and_add() is never freed. Originally
found by Syzkaller:

BUG: memory leak
unreferenced object 0xffff8880679f8b08 (size 8):
  comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s)
  hex dump (first 8 bytes):
    72 78 2d 30 00 36 20 d4                          rx-0.6 .
  backtrace:
    [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290
    [<000000001f2e4e49>] kvasprintf+0xb1/0x140
    [<000000007f313394>] kvasprintf_const+0x56/0x160
    [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140
    [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170
    [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560
    [<000000006be5f104>] netdev_register_kobject+0x210/0x380
    [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00
    [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0
    [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40
    [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510
    [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0
    [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0
    [<0000000038d946e5>] do_syscall_64+0x16f/0x580
    [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<00000000285b3d1a>] 0xffffffffffffffff
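
A minimal sketch of the required error handling (ktype and name taken from
the report; context simplified):

	error = kobject_init_and_add(kobj, &rx_queue_ktype, NULL,
				     "rx-%u", index);
	if (error) {
		kobject_put(kobj);	/* drop the ref taken even on failure */
		return error;
	}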

Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:09:50 -08:00
David S. Miller 07def46382 Merge branch 'cxgb4-add-TC-MATCHALL-classifier-offload'
Rahul Lakkireddy says:

====================
cxgb4: add TC-MATCHALL classifier offload

This series of patches adds support for offloading the TC-MATCHALL
classifier to hardware, to classify all outgoing and incoming traffic on
the underlying port. Only 1 egress and 1 ingress rule each can be
offloaded on the underlying port.

Patch 1 adds support for TC-MATCHALL classifier offload on the egress
side. TC-POLICE is the only action that can be offloaded on the egress
side and is used to rate limit all outgoing traffic to specified max
rate.

Patch 2 adds logic to reject the current rule offload if its priority
conflicts with existing rules in the TCAM.

Patch 3 adds support for TC-MATCHALL classifier offload on the ingress
side. The same set of actions supported by existing TC-FLOWER
classifier offload can be applied on all the incoming traffic.

v5:
- Fixed commit message and comment to include comparison for equal
  priority in patch 2.

v4:
- Removed check in patch 1 to reject police offload if prio is not 1.
- Moved TC_SETUP_BLOCK code to separate function in patch 1.
- Added logic to ensure the prio passed by TC doesn't conflict with
  other rules in TCAM in patch 2.
- Higher index has lower priority than lower index in TCAM. So, rework
  cxgb4_get_free_ftid() to search free index from end of TCAM in
  descending order in patch 2.
- Added check to ensure the matchall rule's prio doesn't conflict with
  other rules in TCAM in patch 3.
- Added logic to fill default mask for VIID, if none has been
  provided, to prevent conflict with duplicate VIID rules in patch 3.
- Used existing variables in private structure to fill VIID info,
  instead of extracting the info manually in patch 3.

v3:
- Added check in patch 1 to reject police offload if prio is not 1.
- Assign block_shared variable only for TC_SETUP_BLOCK in patch 1.

v2:
- Added check to reject flow block sharing for policers in patch 1.
- Removed logic to fetch free index from end of TCAM in patch 2.
  Must maintain the same ordering as in kernel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:05:24 -08:00
Rahul Lakkireddy 21c4c60b76 cxgb4: add TC-MATCHALL classifier ingress offload
Add TC-MATCHALL classifier ingress offload support. The same actions
supported by existing TC-FLOWER offload can be applied to all incoming
traffic on the underlying interface.

Ensure the rule priority doesn't conflict with existing rules in the
TCAM. Only 1 ingress matchall rule can be active at a time on the
underlying interface.

v5:
- No change.

v4:
- Added check to ensure the matchall rule's prio doesn't conflict with
  other rules in TCAM.
- Added logic to fill default mask for VIID, if none has been
  provided, to prevent conflict with duplicate VIID rules.
- Used existing variables in private structure to fill VIID info,
  instead of extracting the info manually.

v3:
- No change.

v2:
- Removed logic to fetch free index from end of TCAM. Must maintain
  same ordering as in kernel.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:05:23 -08:00
Rahul Lakkireddy 41ec03e534 cxgb4: check rule prio conflicts before offload
Only offload the rule if it satisfies both of the following conditions:
1. The immediate previous rule has priority <= current rule's priority.
2. The immediate next rule has priority >= current rule's priority.

Also rework free entry fetch logic to search from end of TCAM, instead
of beginning, because higher indices have lower priority than lower
indices. This is similar to how TC auto generates priority values.
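
A sketch of the two checks (hypothetical prev/next lookup helpers):

	/* reject the rule if a neighbour breaks TCAM priority ordering */
	if (prev_entry && prev_entry->prio > prio)
		return -EINVAL;
	if (next_entry && next_entry->prio < prio)
		return -EINVAL;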

v5:
- Fixed commit message and comment to include comparison for equal
  priority.

v4:
- Patch added in this version.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:05:23 -08:00
Rahul Lakkireddy 4ec4762d8e cxgb4: add TC-MATCHALL classifier egress offload
Add TC-MATCHALL classifier offload with TC-POLICE action applied for
all outgoing traffic on the underlying interface. Split flow block
offload to support both egress and ingress classification.

For example, to rate limit all outgoing traffic to 1 Gbps:

$ tc qdisc add dev enp2s0f4 clsact
$ tc filter add dev enp2s0f4 egress matchall skip_sw \
	action police rate 1Gbit burst 8Kbit

Note that skip_sw is important. Otherwise, both stack and hardware
will end up doing policing. Policing can't be shared across flow
blocks. Only 1 egress matchall rule can be active at a time on the
underlying interface.

v5:
- No change.

v4:
- Removed check to reject police offload if prio is not 1.
- Moved TC_SETUP_BLOCK code to separate function.

v3:
- Added check to reject police offload if prio is not 1.
- Assign block_shared variable only for TC_SETUP_BLOCK.

v2:
- Added check to reject flow block sharing for policers.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 12:05:23 -08:00
David S. Miller 77c05d2f73 Merge branch 'page_pool-API-for-numa-node-change-handling'
Saeed Mahameed says:

====================
page_pool: API for numa node change handling

This series extends the page pool API to allow page pool consumers to
update the page pool numa node on the fly. This is required since on some
systems rx ring irqs can migrate between numa nodes, due to the irq
balancer or user-defined scripts; the current page pool has no way to know
of such migration and will keep allocating and holding on to pages from
the wrong numa node, which is bad for consumer performance.

1) Add API to update numa node id of the page pool
Consumers will call this API to update the page pool numa node id.

2) Don't recycle non-reusable pages:
Page pool will check upon page return whether a page is suitable for
recycling or not.
 2.1) when it belongs to a different numa node.
 2.2) when it was allocated under memory pressure.

3) mlx5 will use the new API to update page pool numa id on demand.

The series is joint work between me and Jonathan; we tested it and it
proved itself worthy, avoiding page allocator bottlenecks and improving
packet rate and cpu utilization significantly for the scenarios described
above.

Performance testing:
XDP drop/tx rate and TCP single/multi stream, on mlx5 driver
while migrating rx ring irq from close to far numa:

mlx5 internal page cache was locally disabled to get pure page pool
results.

CPU: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
NIC: Mellanox Technologies MT27700 Family [ConnectX-4] (100G)

XDP Drop/TX single core:
NUMA  | XDP  | Before    | After
---------------------------------------
Close | Drop | 11   Mpps | 10.9 Mpps
Far   | Drop | 4.4  Mpps | 5.8  Mpps

Close | TX   | 6.5 Mpps  | 6.5 Mpps
Far   | TX   | 3.5 Mpps  | 4   Mpps

The improvement is about 30% in drop packet rate and 15% in tx packet
rate for the numa-far test.
No degradation for the numa-close tests.

TCP single/multi cpu/stream:
NUMA  | #cpu | Before  | After
--------------------------------------
Close | 1    | 18 Gbps | 18 Gbps
Far   | 1    | 15 Gbps | 18 Gbps
Close | 12   | 80 Gbps | 80 Gbps
Far   | 12   | 68 Gbps | 80 Gbps

In all test cases we see improvement for the far numa case, and no
impact on the close numa case.

==================

Performance analysis and conclusions by Jesper [1]:
The impact on XDP drop on x86_64 is inconclusive, showing only a 0.3459ns
slow-down, which is below the measurement accuracy of the system.

v2->v3:
 - Rebase on top of latest net-next and Jesper's page pool object
   release patchset [2]
 - No code changes
 - Performance analysis by Jesper added to the cover letter.

v1->v2:
  - Drop last patch, as requested by Ilias and Jesper.
  - Fix documentation's performance numbers order.

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool04_inflight_changes.org#performance-notes
[2] https://patchwork.ozlabs.org/cover/1192098/
====================

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:47:36 -08:00
Saeed Mahameed 6849c6d86b net/mlx5e: Rx, Update page pool numa node when changed
Once every napi poll cycle, check if the numa node is different from
the page pool's numa id, and if so update it using page_pool_update_nid().

Alternatively, we could have registered an irq affinity change handler,
but page_pool_update_nid() must be called from napi context anyways, so
the handler won't actually help.
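
A sketch of the per-poll check (field access simplified):

	/* once per napi poll: follow the cpu's node if it changed */
	if (rq->page_pool->p.nid != numa_mem_id())
		page_pool_update_nid(rq->page_pool, numa_mem_id());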

Performance testing:
XDP drop/tx rate and TCP single/multi stream, on mlx5 driver
while migrating rx ring irq from close to far numa:

mlx5 internal page cache was locally disabled to get pure page pool
results.

CPU: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
NIC: Mellanox Technologies MT27700 Family [ConnectX-4] (100G)

XDP Drop/TX single core:
NUMA  | XDP  | Before    | After
---------------------------------------
Close | Drop | 11   Mpps | 10.9 Mpps
Far   | Drop | 4.4  Mpps | 5.8  Mpps

Close | TX   | 6.5 Mpps  | 6.5 Mpps
Far   | TX   | 3.5 Mpps  | 4  Mpps

The improvement is about 30% in drop packet rate and 15% in tx packet
rate for the numa-far test.
No degradation for the numa-close tests.

TCP single/multi cpu/stream:
NUMA  | #cpu | Before  | After
--------------------------------------
Close | 1    | 18 Gbps | 18 Gbps
Far   | 1    | 15 Gbps | 18 Gbps
Close | 12   | 80 Gbps | 80 Gbps
Far   | 12   | 68 Gbps | 80 Gbps

In all test cases we see improvement for the far numa case, and no
impact on the close numa case.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:47:36 -08:00
Saeed Mahameed d5394610b1 page_pool: Don't recycle non-reusable pages
A page is NOT reusable when at least one of the following is true:
1) it was allocated when the system was under some pressure (page_is_pfmemalloc).
2) it belongs to a different NUMA node than pool->p.nid.

To update pool->p.nid users should call page_pool_update_nid().

Holding on to such pages in the pool will hurt the consumer performance
when the pool migrates to a different numa node.
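
The check amounts to (sketch, following the two rules above):

	static bool pool_page_reusable(struct page_pool *pool,
				       struct page *page)
	{
		return !page_is_pfmemalloc(page) &&
		       page_to_nid(page) == pool->p.nid;
	}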

Performance testing:
XDP drop/tx rate and TCP single/multi stream, on mlx5 driver
while migrating rx ring irq from close to far numa:

mlx5 internal page cache was locally disabled to get pure page pool
results.

CPU: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
NIC: Mellanox Technologies MT27700 Family [ConnectX-4] (100G)

XDP Drop/TX single core:
NUMA  | XDP  | Before    | After
---------------------------------------
Close | Drop | 11   Mpps | 10.9 Mpps
Far   | Drop | 4.4  Mpps | 5.8  Mpps

Close | TX   | 6.5 Mpps  | 6.5 Mpps
Far   | TX   | 3.5 Mpps  | 4  Mpps

The improvement is about 30% in drop packet rate and 15% in tx packet
rate for the numa-far test.
No degradation for the numa-close tests.

TCP single/multi cpu/stream:
NUMA  | #cpu | Before  | After
--------------------------------------
Close | 1    | 18 Gbps | 18 Gbps
Far   | 1    | 15 Gbps | 18 Gbps
Close | 12   | 80 Gbps | 80 Gbps
Far   | 12   | 68 Gbps | 80 Gbps

In all test cases we see improvement for the far numa case, and no
impact on the close numa case.

The impact of adding a per-page check is negligible and shows no
performance degradation whatsoever. Functionality-wise it also seems more
correct and more robust for the page pool to verify when pages should be
recycled, since the page pool can't guarantee where pages are coming
from.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:47:36 -08:00
Saeed Mahameed bc83674870 page_pool: Add API to update numa node
Add page_pool_update_nid() to be called by page pool consumers when they
detect numa node changes.

It will update the page pool nid value to start allocating from the new
effective numa node.

This is to mitigate the page pool allocating pages from the wrong numa
node (the one where the pool was originally allocated) and holding on to
pages that belong to a different numa node, which causes performance
degradation.

For pages that are already being consumed and could be returned to the
pool by the consumer, the next patch will add a per-page check to avoid
recycling them back to the pool, returning them to the page allocator
instead.
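
A sketch of the new API (tracing omitted):

	void page_pool_update_nid(struct page_pool *pool, int new_nid)
	{
		/* future allocations in the slow path will target new_nid */
		pool->p.nid = new_nid;
	}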

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:47:36 -08:00
David S. Miller 1f12177b32 Merge branch 'cpsw-switchdev'
Grygorii Strashko says:

====================
net: ethernet: ti: introduce new cpsw switchdev based driver

Thank you All for review of v6.

There are no significant changes in this version, just fixes for review
comments on v6.

--- v6
The major change in this version is the DT bindings conversion to
json-schema, plus fixes for other comments on v5. Also added a patch to
clean up the ALE on init and netif restart.

--- v5
The major part of the work done in this iteration is rebasing on top of
net-next with the XDP series from Ivan Khoronzhuk [3], and enabling XDP
support in the new CPSW switchdev driver (it was a little bit painful ;().
There are mostly no functional changes in the new CPSW driver, just a few
fixes, a sync with the old driver, and cleanups/optimizations. So, I've
kept the rest of the cover letter unchanged.

---
This series originally based on work [1][2] done by
Ilias Apalodimas <ilias.apalodimas@linaro.org>.

This is the RFC v5 which introduces a new CPSW switchdev based driver
operating in dual-emac mode by default, thus working as 2 individual
network interfaces. The switch mode can be enabled by configuring the
devlink driver parameter "switch_mode" to 1/true:
	devlink dev param set platform/48484000.switch \
	name switch_mode value 1 cmode runtime
This can be done regardless of the state of the Port's netdev devices
(UP/DOWN), but the Port's netdev devices have to be UP before joining the
bridge to avoid overwriting the bridge configuration, as the CPSW switch
driver completely reloads its configuration when the first Port changes
its state to UP.
When both interfaces have joined the bridge, the CPSW switch driver will
start marking packets with the offload_fwd_mark flag unless "ale_bypass=0".
All configuration is implemented via the switchdev API.

The previous solution of tracking (from a netdevice_notifier) when both
Ports had joined the bridge proved to be incorrect, as changing the CPSW
switch driver mode required a cleanup of the ALE table and CPSW settings,
which happens while the second Port joins the bridge; as a result the
configuration loaded by the bridge for the first Port became corrupted.

The introduction of the new CPSW switchdev based driver (cpsw_new.c) is
split into two parts: Part 1 - basic dual-emac driver; Part 2 - switchdev
support. Such an approach has simplified code development and testing a
lot. And, I hope, it will help with better review.

patches #1 - 5: preparation patches which also move common code to cpsw_priv.c
patches #6 - 9: Introduce TI CPSW switch driver based on switchdev and new
 DT bindings
patch #10: new CPSW switchdev driver documentation
patch #11: adds DT nodes for new CPSW switchdev driver added for DRA7 SoC
patch #12: adds DT nodes for new cpsw switchdev driver for am571x-idk board
patch #13: enables build of TI CPSW driver

Most of the contents of the previous cover-letter have been added in
new driver documentation, so please refer to that for configuration,
testing and future work.

These patches can be found at (branch contains some additional patches required
for testing on top of net-next):
 https://github.com/grygoriyS/linux.git
 branch: lkml-5.4-switch-tbd-v7

changes in v7:
 - patch 2: added check for devm_kmalloc_array() return value
 - patch 6: fixed comments

changes in v6: https://lkml.org/lkml/2019/11/9/108
 - DT bindings converted to json-schema
 - netdev initialization is split into creation and registration.
   The netdev registration now happens at the end of the probe.
 - reworked cpsw_set_pauseparam() to use PHYlib APIs.
 - other comments for v5 fixed

v5: https://patchwork.kernel.org/cover/11208785/
 - rebase on top of net-next with XDP series from Ivan Khoronzhuk [3],
   and enable XDP support in the new CPSW switchdev driver
   (tested XDP_DROP only)
 - sync with old cpsw driver
 - implement comments from  Ivan Khoronzhuk and Rob Herring
 - fixed "NETDEV WATCHDOG: .." warning after interface after interface UP/DOWN,
   missed TX wake in cpsw_adjust_link()

v4: https://patchwork.kernel.org/cover/11010523/
 - finished split of common CPSW code
 - added devlink support
 - changed CPSW mode configuration approach: from netdevice_notifier to devlink
   parameter
 - refactor and clean up ALE changes which allow modifying VLAN/MDB entries
 - added missed support for port QDISC_CBS and QDISC_MQPRIO
 - the CPSW is split on two parts: basic dual_mac driver and switchdev support
 - added missed callback .ndo_get_port_parent_id()
 - reworked ingress frames marking in switch mode (offload_fwd_mark)
 - applied comments from Andrew Lunn

v3: https://lwn.net/Articles/786677/
Changes in v3:
- a lot of work done to properly split common code between the legacy and
  switchdev CPSW drivers and clean up the code
- CPSW switchdev interface updated to the current LKML switchdev interface
- actually new CPSW switchdev based driver introduced
- optimized dual_mac mode in the new driver. The main change is that in
promiscuous mode P0_UNI_FLOOD (both ports) is enabled in addition to
ALLMULTI (current port) instead of ALE_BYPASS. So, a port in
non-promiscuous mode will keep the possibility of mcast and vlan filtering.
- changed the bridge join sequence: now switch mode will be enabled only
when both ports have joined the bridge. CPSW will be switched to dual_mac
mode if any port leaves the bridge. The ALE table is completely cleared and
then refilled while switching to switch mode - this simplifies the code a
lot, but introduces some limitations on the bridge setup sequence:
 ip link add name br0 type bridge
 ip link set dev br0 type bridge ageing_time 1000
 ip link set dev br0 type bridge vlan_filtering 0 <- disable
 echo 0 > /sys/class/net/br0/bridge/default_vlan

 ip link set dev sw0p1 up <- add ports
 ip link set dev sw0p2 up
 ip link set dev sw0p1 master br0
 ip link set dev sw0p2 master br0

 echo 1 > /sys/class/net/br0/bridge/default_vlan <- enable
 ip link set dev br0 type bridge vlan_filtering 1
 bridge vlan add dev br0 vid 1 pvid untagged self
- STP tested with vlan_filtering 1/0. To make STP work I've had to set
  NO_SA_UPDATE for all slave ports (see comment in code). It also required
  statically registering the STP mcast address {0x01, 0x80, 0xc2, 0x0, 0x0, 0x0};
- allowed build both TI_CPSW and TI_CPSW_SWITCHDEV drivers
- PTP can be enabled on both ports in dual_mac mode

[1] https://patchwork.ozlabs.org/cover/929367/
[2] https://patches.linaro.org/cover/136709/
[3] https://patchwork.kernel.org/cover/11035813/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko 3727d259dd arm: omap2plus_defconfig: enable new cpsw switchdev driver
Add the CONFIG_TI_CPSW_SWITCHDEV option to enable the new cpsw switchdev driver.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko 15b991ade4 ARM: dts: am571x-idk: enable new cpsw switchdev driver
Add DT nodes for the new cpsw switchdev driver for the am571x-idk board,
for now to enable testing of the new solution.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko 39331a49c4 ARM: dts: dra7: add dt nodes for new cpsw switchdev driver
Add DT nodes for the new cpsw switchdev driver.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Ilias Apalodimas 14c815a9d1 Documentation: networking: add cpsw switchdev based driver documentation
A new cpsw driver based on switchdev was added. Add documentation about
basic configuration and future features.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko da84e50c8e phy: ti: phy-gmii-sel: dependency from ti cpsw-switchdev driver
Add a dependency on TI_CPSW_SWITCHDEV.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Ilias Apalodimas 111cf1ab4d net: ethernet: ti: introduce cpsw switchdev based driver part 2 - switch
Introduce a CPSW switchdev based driver operating in dual-emac mode by
default, thus working as 2 individual network interfaces. The switch mode
can be enabled by configuring the devlink driver parameter "switch_mode" to 1:

	devlink dev param set platform/48484000.switch \
	name switch_mode value 1 cmode runtime

This can be done regardless of the state of the Port's netdevs (UP/DOWN),
but the Port's netdev devices have to be UP before joining the bridge to
avoid overwriting the bridge configuration, as the CPSW switch driver
completely reloads its configuration when the first Port changes its state
to UP.
When both interfaces have joined the bridge, the CPSW switch driver will
start marking packets with the offload_fwd_mark flag unless "ale_bypass=0".

All configuration is implemented via switchdev API and notifiers.
Supported:
 - SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS
 - SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: BR_MCAST_FLOOD
 - SWITCHDEV_ATTR_ID_PORT_STP_STATE
 - SWITCHDEV_OBJ_ID_PORT_VLAN
 - SWITCHDEV_OBJ_ID_PORT_MDB
 - SWITCHDEV_OBJ_ID_HOST_MDB

Hence CPSW switchdev driver supports:
- FDB offloading
- MDB offloading
- VLAN filtering and offloading
- STP

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Ilias Apalodimas ed3525eda4 net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac
Part 1:
 Introduce a basic CPSW dual_mac driver (cpsw_new.c) which operates in
dual-emac mode by default, thus working as 2 individual network interfaces.
The main differences from the legacy CPSW driver are:

 - optimized promiscuous mode: P0_UNI_FLOOD (both ports) is enabled in
addition to ALLMULTI (current port) instead of ALE_BYPASS. So, ports in
promiscuous mode will keep the possibility of mcast and vlan filtering,
which provides significant benefits when ports are joined to the same
bridge, but without enabling "switch" mode, or to different bridges.
 - learning disabled on ports, as it makes little sense for
   segregated ports - no forwarding in HW.
 - enabled basic support for devlink.

	devlink dev show
		platform/48484000.switch

	devlink dev param show
	 platform/48484000.switch:
	name ale_bypass type driver-specific
	 values:
		cmode runtime value false

 - "ale_bypass" devlink driver parameter allows to enable
ALE_CONTROL(4).BYPASS mode for debug purposes.
 - updated DT bindings.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko ef63fe72f6 dt-bindings: net: ti: add new cpsw switch driver bindings
Add bindings for the new TI CPSW switch driver. Compared to the legacy
bindings (net/cpsw.txt):
- the ports definition follows the DSA bindings (net/dsa/dsa.txt) and ports
can be marked as "disabled" if not physically wired;
- all deprecated properties dropped;
- all legacy properties dropped which represent constant HW capabilities
(cpdma_channels, ale_entries, bd_ram_size, mac_control, slaves,
active_slave);
- TI CPTS DT properties are reused as is, but grouped in a "cpts" sub-node
- TI Davinci MDIO DT bindings are reused as is, because Davinci MDIO is
reused.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko c5013ac1dd net: ethernet: ti: cpsw: move set of common functions in cpsw_priv
As a preparatory patch to add support for a switchdev based cpsw driver,
move common functions to cpsw_priv.c so that they can be used across both
drivers.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko 51a9533797 net: ethernet: ti: cpsw: resolve build deps of cpsw drivers
The following patches introduce a new CPSW switchdev driver which shares
common code with the legacy CPSW driver. This introduces a build dependency
between the CPSW switchdev and CPSW legacy drivers related to
for_each_slave() and cpsw_slave_index() - both can be compiled, but only
one of them will be functional depending on Kconfig settings, due to
differences in Slave Port index calculation.

To fix this, make for_each_slave() local (it's now used only by the legacy
CPSW driver) and convert cpsw_slave_index() to a function pointer which is
assigned at probe time. The driver to probe is defined by DT.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Ilias Apalodimas e85c143707 net: ethernet: ti: ale: modify vlan/mdb api for switchdev
A following patch introduces switchdev functionality, so modify the
ALE engine VLAN/MDB API:
- cpsw_ale_del_mcast(): update so it will remove only selected ports from
mcast port_mask or delete whole mcast record if !port_mask
- cpsw_ale_del_vlan(): update so it will remove only selected ports from
all VLAN record's masks or delete whole VLAN record if !port_mask
- add cpsw_ale_vlan_add_modify() to add or modify existing VLAN record's
masks
- add cpsw_ale_set_unreg_mcast() for enabling unreg mcast on port VLANs

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00
Grygorii Strashko 4b41d34367 net: ethernet: ti: cpsw: allow untagged traffic on host port
Currently, untagged VLAN traffic is not supported on the Host P0 port.
This patch adds to the ALE context a bitmap of VLANs for which the Host P0
port bit is set in the Force Untagged Packet Egress bitmask of the VLAN
ALE entries, and adds a corresponding check in the VLAN encapsulation
header parsing function cpsw_rx_vlan_encap().

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:23 -08:00