Allwinner V3s SoC has a syscon like the one in H3.
Add its compatible string.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allwinner V3s SoC has a Ethernet MAC like the one in Allwinner H3, but
have no external MII capability. That means that it can only use the
EPHY and cannot do Gbps transmission.
Add binding for it.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lucas Bates says:
====================
net: Introduction of the tc tests
Apologies for sending this as one big patch. I've been sitting on this a little
too long, but it's ready and I wanted to get it out.
There are a limited number of tests to start - I plan to add more on a regular
basis.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the beginnings of a testsuite for tc functionality in the kernel.
These are a series of unit tests that use the tc executable and verify
the success of those commands by checking both the exit codes and the
output from tc's 'show' operation.
To run the tests:
# cd tools/testing/selftests/tc-testing
# sudo ./tdc.py
You can specify the tc executable to use with the -p argument on the command
line or editing the 'TC' variable in tdc_config.py. Refer to the README for
full details on how to run.
The initial complement of test cases are limited mostly to tc actions. Test
cases are most welcome; see the creating-testcases subdirectory for help
in creating them.
Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz says:
====================
qed*: RDMA and infrastructure for iWARP
This series focuses on RDMA in general with emphasis on required changes
toward adding iWARP support. The vast majority of the changes introduced
are in qed/qede, with a couple of small changes to qedr
[mentioned below].
The infrastructure changes:
- Patch #1 adds the ability to pass PBL memory externally for a newly
created chain.
- Patches #4, #5 rename qede_roce.[ch] into qede_rdma.[ch] + change
prefixes from _roce_ to _rdma_, as the API between qede and qedr is
agnostic to the variant of the RDMA protocol used. These patches also
touch qedr [basically to align it with the renaming, nothing more].
- Patch #7 replaces the current SPQ async mechanism into serving
registered callbacks [before adding iWARP which would add another client
in need of this sort of functionallity].
The non-infrastrucutre changes:
- Patches #2, #3 contain DCB-related changes to better align RDMA with
configured DCB.
- Patch #6 contains a minor [mostly theoretical fix] to release flow.
Changes from previous versions
------------------------------
- V4: This is actually a repost of V3 due to some confusion regarding
the sent cover-letter
- V3: Add commit log message in #4 indicating change in header inclusion
- V2: Add several inclusion into qede_rdma.h to have proper declarations
of all variable types used in it
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever firmware indicates that there's an async indication it needs
to handle, there's a switch-case where the right functionality is called
based on function's personality and information.
Before iWARP is added [as yet another client], switch over the SPQ into
a callback-registered mechanism, allowing registration of the relevant
event-processing logic based on the function's personality. This allows
us to tidy the code by removing protocol-specifics from a common file.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver needs to wait for all resources to return from FW before it can send
the FUNC_CLOSE ramrod.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the functions common to both iWARP and RoCE to have a prefix of
_rdma_ instead of _roce_.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once we have iWARP support, the qede portion of the qedr<->qede would
serve all the RDMA protocols - so rename the file to be appropriate
to its function.
While we're at it, we're also moving a couple of inclusions to it into
.h files and adding includes to make sure it contains all type
definitions it requires.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If DCBx update occurs while QPs are open, stop sending edpms until all
QPs are closed.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Configure device according to DCBx results so that EDPMs
made by RoCE would honor flow-control.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
iWARP would require the chains to allocate/free their PBL memory
independently, so add the infrastructure to provide it externally.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace first padding in the tcp_md5sig structure with a new flag field
and address prefix length so it can be specified when configuring a new
key for TCP MD5 signature. The tcpm_flags field will only be used if the
socket option is TCP_MD5SIG_EXT to avoid breaking existing programs, and
tcpm_prefixlen only when the TCP_MD5SIG_FLAG_PREFIX flag is set.
Signed-off-by: Bob Gilligan <gilligan@arista.com>
Signed-off-by: Eric Mowat <mowat@arista.com>
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows the keys used for TCP MD5 signature to be used for whole
range of addresses, specified with a prefix length, instead of only one
address as it currently is.
Signed-off-by: Bob Gilligan <gilligan@arista.com>
Signed-off-by: Eric Mowat <mowat@arista.com>
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ioctl support to IPoIB device driver. For now, this
ioctl will support timestamp get and set.
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Enable PTP for IPoIB rdma_netdev and add the ability
to get the time stamping parameters using ethtool.
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the ndo that supports change mtu for IPoIB.
The callback called from the ipoib ULP driver, that gives the ability to
change the SW and HW resources accordingly in the lower driver.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The mtu extra space that kept for the HW is specific for each link type,
and it is different in mlx5e and mlx5i modules.
Now it is kept in the priv structures, set by the mlx5e/mlx5i driver
accordingly.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add function that sets the default values for ipoib, setting/clearing
abilities that IPoIB doesn't support, like RQ size in this case.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Updating the carrier involves specific HW setting, each profile should
use its own function for that.
Both IPoIB and VF representor don't need carrier update function, since
VF representor has only a logical link to VF and IPoIB manages its own
link via ib_core upper layer.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Port flow control supported only for ethernet ports,
therefore, prevent any call if the port type differs from
MLX5_CAP_PORT_TYPE_ETH.
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
IPoIB netdevice driver was only introduced in previous kernel release
and it is growing in terms of features and LOC, move it to a separate
directory.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
During the module initialisation there is a possible race
(basically race between uld and lld) where neither the uld
nor lld notifies the uP about where to route the ctrl queue
completions. LLD skips notifying uP as the rdma queues were
not created by then (will leave it to ULD to notify the uP).
As the ULD comes up, it also skips notifying the uP as the
flag FULL_INIT_DONE is not set yet (ULD assumes that the
interface is not up yet).
Consequently, this race between uld and lld leaves uP
unnotified about where to send the ctrl queue completions
to, leading to iwarp RI_RES WR failure.
Here is the race:
CPU 0 CPU1
- allocates nic rx queus
- t4_sge_alloc_ctrl_txq()
(if rdma rsp queues exists,
tell uP to route ctrl queue
compl to rdma rspq)
- acquires the mutex_lock
- allocates rdma response queues
- if FULL_INIT_DONE set,
tell uP to route ctrl queue compl
to rdma rspq
- relinquishes mutex_lock
- acquires the mutex_lock
- enable_rx()
- set FULL_INIT_DONE
- relinquishes mutex_lock
This patch fixes the above issue.
Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue')
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
CC: Stable <stable@vger.kernel.org> # 4.9+
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add general use per-vNIC mailbox area and use it for VLAN filtering
support. Initially proto is hardcoded to 802.1q.
Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid NULL dereference in setup_sge_queues() when the adapter is
in non offload mode.
Fixes: 0fbc81b3ad ('chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's')
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each Octeon output ring can DMA packets to host memory in two modes: info-
pointer mode and buffer-pointer-only mode. In info-pointer mode, Octeon
takes two buffer pointers for each packet and places the length of the
packet along with specified number of bytes from the beginning of the
packet into one buffer and the rest of the packet in a separate buffer. In
buffer-pointer-only mode, Octeon takes single buffer pointer and places the
length of the packet at the beginning of the buffer followed by the packet
data.
This patch switches all Octeon output rings from info-pointer mode to
buffer-pointer-only mode. This results in fewer DMA setups and cache line
snoops.
Signed-off-by: Prasad Kanneganti <pkanneganti@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Variable opt in pptp_release() is set but never used, thus needs to be
removed.
Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add implementation to support ethtool -K ethX rx-vlan-filter on/off.
Rename OCTNET_CMD_ENABLE_VLAN_FILTER command to OCTNET_CMD_VLAN_FILTER_CTL
and add OCTNET_CMD_VLAN_FILTER_ENABLE and OCTNET_CMD_VLAN_FILTER_DISABLE
parameters so that it can be used to enable or disable the filter.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 6d3c8c0dd8 ("net: dsa: Remove master_netdev and
use dst->cpu_dp->netdev") and a29342e739 ("net: dsa: Associate
slave network device with CPU port") we would be seeing NULL pointer
dereferences when accessing dst->cpu_dp->netdev too early. In the legacy
code, we actually know early in advance the master network device, so
pass it down to the relevant functions.
Fixes: 6d3c8c0dd8 ("net: dsa: Remove master_netdev and use dst->cpu_dp->netdev")
Fixes: a29342e739 ("net: dsa: Associate slave network device with CPU port")
Reported-by: Jason Cobham <jcobham@questertangent.com>
Tested-by: Jason Cobham <jcobham@questertangent.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Missing crypto deps for some platforms.
Default to n for new module.
config: m68k-amcore_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
make.cross ARCH=m68k
All errors (new ones prefixed by >>):
net/built-in.o: In function `tls_set_sw_offload':
>> (.text+0x732f8): undefined reference to `crypto_alloc_aead'
net/built-in.o: In function `tls_set_sw_offload':
>> (.text+0x7333c): undefined reference to `crypto_aead_setkey'
net/built-in.o: In function `tls_set_sw_offload':
>> (.text+0x73354): undefined reference to `crypto_aead_setauthsize'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang says:
====================
remove dst garbage collector logic
The current mechanism of dst release is a bit complicated. It is because
the users of dst get divided into 2 situations:
1. Most users take the reference count when using a dst and release the
reference count when done.
2. Exceptional users like IPv4/IPv6/decnet/xfrm routing code do not take
reference count when referencing to a dst due to some histotic reasons.
Due to those exceptional use cases in 2, reference count being 0 is not an
adequate evidence to indicate that no user is using this dst. So users in 1
can't free the dst simply based on reference count being 0 because users in
2 might still hold reference to it.
Instead, a dst garbage list is needed to hold the dst entries that already
get removed by the users in 2 but are still held by users in 1. And a periodic
garbage collector task is run to check all the dst entries in the list to see
if the users in 1 have released the reference to those dst entries.
If so, the dst is now ready to be freed.
This logic introduces unnecessary complications in the dst code which makes it
hard to understand and to debug.
In order to get rid of the whole dst garbage collector (gc) and make the dst
code more unified and simplified, we can make the users in 2 also take reference
count on the dst and release it properly when done.
This way, dst can be safely freed once the refcount drops to 0 and no gc
thread is needed anymore.
This patch series' target is to completely get rid of dst gc logic and free
dst based on reference count only.
Patch 1-3 are preparation patches to do some cleanup/improvement on the existing
code to make later work easier.
Patch 4-21 are real implementations.
In these patches, a temporary flag DST_NOGC is used to help transition
those exceptional users one by one. Once every component is transitioned,
this temporary flag is removed.
By the end of this patch series, all dst are refcounted when being used
and released when done. And dst will be freed when its refcount drops to 0.
No dst gc task is running anymore.
Note: This patch series depends on the decnet fix that was sent right before:
"decnet: always not take dst->__refcnt when inserting dst into hash table"
v2:
add curly braces in udp_v4/6_early_demux() in patch 02
add EXPORT_SYMBOL() for dst_dev_put() in patch 05
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is meant to add a debug warning on the situation where dst is
being held during its destroy phase. This could potentially cause double
free issue on the dst.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As some dst flags are removed, reorder the dst flags to fill in the
blanks.
Note: these flags are not exposed into user space. So it is safe to
reorder.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DST_NOCACHE flag check has been removed from dst_release() and
dst_hold_safe() in a previous patch because all the dst are now ref
counted properly and can be released based on refcnt only.
Looking at the rest of the DST_NOCACHE use, all of them can now be
removed or replaced with other checks.
So this patch gets rid of all the DST_NOCACHE usage and remove this flag
completely.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all the components have been changed to release dst based on
refcnt only and not depend on dst gc anymore, we can remove the
temporary flag DST_NOGC.
Note that we also need to remove the DST_NOCACHE check in dst_release()
and dst_hold_safe() because now all the dst are released based on refcnt
and behaves as DST_NOCACHE.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes all dst gc related code and all the dst free
functions
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct dn_route is inserted into dn_rt_hash_table but no dst->__refcnt
is taken.
This patch makes sure the dn_rt_hash_table's reference to the dst is ref
counted.
As the dst is always ref counted properly, we can safely mark
DST_NOGC flag so dst_release() will release dst based on refcnt only.
And dst gc is no longer needed and all dst_free() or its related
function calls should be replaced with dst_release() or
dst_release_immediate(). And dst_dev_put() is called when removing dst
from the hash table to release the reference on dst->dev before we lose
pointer to it.
Also, correct the logic in dn_dst_check_expire() and dn_dst_gc() to
check dst->__refcnt to be > 1 to indicate it is referenced by other
users.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the creation of xfrm_dst bundle, always take ref count when
allocating the dst. This way, xfrm_bundle_create() will form a linked
list of dst with dst->child pointing to a ref counted dst child. And
the returned dst pointer is also ref counted. This makes the link from
the flow cache to this dst now ref counted properly.
As the dst is always ref counted properly, we can safely mark
DST_NOGC flag so dst_release() will release dst based on refcnt only.
And dst gc is no longer needed and all dst_free() and its related
function calls should be replaced with dst_release() or
dst_release_immediate().
The special handling logic for dst->child in dst_destroy() can be
replaced with a simple dst_release_immediate() call on the child to
release the whole list linked by dst->child pointer.
Previously used DST_NOHASH flag is not needed anymore as well. The
reason that DST_NOHASH is used in the existing code is mainly to prevent
the dst inserted in the fib tree to be wrongly destroyed during the
deletion of the xfrm_dst bundle. So in the existing code, DST_NOHASH
flag is marked in all the dst children except the one which is in the
fib tree.
However, with this patch series to remove dst gc logic and release dst
only based on ref count, it is safe to release all the children from a
xfrm_dst bundle as long as the dst children are all ref counted
properly which is already the case in the existing code.
So, this patch removes the use of DST_NOHASH flag.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
icmp6 dst route is currently ref counted during creation and will be
freed by user during its call of dst_release(). So no need of a garbage
collector for it.
Remove all icmp6 dst garbage collector related code.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the previous preparation patches, we are ready to get rid of the
dst gc operation in ipv6 code and release dst based on refcnt only.
So this patch adds DST_NOGC flag for all IPv6 dst and remove the calls
to dst_free() and its related functions.
At this point, all dst created in ipv6 code do not use the dst gc
anymore and will be destroyed at the point when refcnt drops to 0.
Also, as icmp6 dst route is refcounted during creation and will be freed
by user during its call of dst_release(), there is no need to add this
dst to the icmp6 gc list as well.
Instead, we need to add it into uncached list so that when a
NETDEV_DOWN/NETDEV_UNREGISRER event comes, we can properly go through
these icmp6 dst as well and release the net device properly.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar as ipv4, ipv6 path also needs to call dst_hold_safe() when
necessary to avoid double free issue on the dst.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the intend of this patch series is to completely remove dst gc,
we need to call dst_dev_put() to release the reference to dst->dev
when removing routes from fib because we won't keep the gc list anymore
and will lose the dst pointer right after removing the routes.
Without the gc list, there is no way to find all the dst's that have
dst->dev pointing to the going-down dev.
Hence, we are doing dst_dev_put() immediately before we lose the last
reference of the dst from the routing code. The next dst_check() will
trigger a route re-lookup to find another route (if there is any).
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In IPv6 routing code, struct rt6_info is created for each static route
and RTF_CACHE route and inserted into fib6 tree. In both cases, dst
ref count is not taken.
As explained in the previous patch, this leads to the need of the dst
garbage collector.
This patch holds ref count of dst before inserting the route into fib6
tree and properly releases the dst when deleting it from the fib6 tree
as a preparation in order to fully get rid of dst gc later.
Also, correct fib6_age() logic to check dst->__refcnt to be 1 to indicate
no user is referencing the dst.
And remove dst_hold() in vrf_rt6_create() as ip6_dst_alloc() already puts
dst->__refcnt to 1.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the previous preparation patches, we are ready to get rid of the
dst gc operation in ipv4 code and release dst based on refcnt only.
So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls
to dst_free().
At this point, all dst created in ipv4 code do not use the dst gc
anymore and will be destroyed at the point when refcnt drops to 0.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch checks all the calls to
dst_hold()/skb_dst_force()/dst_clone()/dst_use() to see if
dst_hold_safe() is needed to avoid double free issue if dst
gc is removed and dst_release() directly destroys dst when
dst->__refcnt drops to 0.
In tx path, TCP hold sk->sk_rx_dst ref count and also hold sock_lock().
UDP and other similar protocols always hold refcount for
skb->_skb_refdst. So both paths seem to be safe.
In rx path, as it is lockless and skb_dst_set_noref() is likely to be
used, dst_hold_safe() should always be used when trying to hold dst.
In the routing code, if dst is held during an rcu protected session, it
is necessary to call dst_hold_safe() as the current dst might be in its
rcu grace period.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the intend of this patch series is to completely remove dst gc,
we need to call dst_dev_put() to release the reference to dst->dev
when removing routes from fib because we won't keep the gc list anymore
and will lose the dst pointer right after removing the routes.
Without the gc list, there is no way to find all the dst's that have
dst->dev pointing to the going-down dev.
Hence, we are doing dst_dev_put() immediately before we lose the last
reference of the dst from the routing code. The next dst_check() will
trigger a route re-lookup to find another route (if there is any).
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In IPv4 routing code, fib_nh and fib_nh_exception can hold pointers
to struct rtable but they never increment dst->__refcnt.
This leads to the need of the dst garbage collector because when user
is done with this dst and calls dst_release(), it can only decrement
dst->__refcnt and can not free the dst even it sees dst->__refcnt
drops from 1 to 0 (unless DST_NOCACHE flag is set) because the routing
code might still hold reference to it.
And when the routing code tries to delete a route, it has to put the
dst to the gc_list if dst->__refcnt is not yet 0 and have a gc thread
running periodically to check on dst->__refcnt and finally to free dst
when refcnt becomes 0.
This patch increments dst->__refcnt when
fib_nh/fib_nh_exception holds reference to this dst and properly release
the dst when fib_nh/fib_nh_exception has been updated with a new dst.
This patch is a preparation in order to fully get rid of dst gc later.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>