Stephen Hemminger says:
====================
hv_netvsc: fix races during shutdown and changes
This set of patches fixes issues identified by Vitaly Kuznetsov and
Mohammed Gamal related to state changes in Hyper-v network driver.
A lot of the issues are because setting up the netvsc device requires
a second step (in work queue) to get all the sub-channels running.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Make common function for detaching internals of device
during changes to MTU and RSS. Make sure no more packets
are transmitted and all packets have been received before
doing device teardown.
Change the wait logic to be common and use usleep_range().
Changes transmit enabling logic so that transmit queues are disabled
during the period when lower device is being changed. And enabled
only after sub channels are setup. This avoids issue where it could
be that a packet was being sent while subchannel was not initialized.
Fixes: 8195b1396e ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On older versions of Windows, the host ignores messages after
vmbus channel is closed.
Workaround this by doing what Windows does and send the teardown
before close on older versions of NVSP protocol.
Reported-by: Mohammed Gamal <mgamal@redhat.com>
Fixes: 0cf737808a ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The receive processing may continue to happen while the
internal network device state is in RCU grace period.
The internal RNDIS structure is associated with the
internal netvsc_device structure; both have the same
RCU lifetime.
Defer freeing all associated parts until after grace
period.
Fixes: 0cf737808a ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes sure that no CPU is still process packets when
the channel is closed.
Fixes: 76bb5db5c7 ("netvsc: fix use after free on module removal")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For multipath routes the ONLINK flag can be specified per nexthop in
rtnh_flags or globally in rtm_flags. Update ip6_route_multipath_add
to consider the ONLINK setting coming from rtnh_flags. Each loop over
nexthops the config for the sibling route is initialized to the global
config and then per nexthop settings overlayed. The flag is 'or'ed into
fib6_config to handle the ONLINK flag coming from either rtm_flags or
rtnh_flags.
Fixes: fc1e64e109 ("net/ipv6: Add support for onlink flag")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already detect situations where a PPP channel sends packets back to
its upper PPP device. While this is enough to avoid deadlocking on xmit
locks, this doesn't prevent packets from looping between the channel
and the unit.
The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq
before checking for xmit recursion. Therefore, __ppp_xmit_process()
might dequeue a packet from ppp->file.xq and send it on the channel
which, in turn, loops it back on the unit. Then ppp_start_xmit()
queues the packet back to ppp->file.xq and __ppp_xmit_process() picks
it up and sends it again through the channel. Therefore, the packet
will loop between __ppp_xmit_process() and ppp_start_xmit() until some
other part of the xmit path drops it.
For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops
the packet after a few iterations. But PPTP reallocates the headroom
if necessary, letting the loop run and exhaust the machine resources
(as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
Fix this by letting __ppp_xmit_process() enqueue the skb to
ppp->file.xq, so that we can check for recursion before adding it to
the queue. Now ppp_xmit_process() can drop the packet when recursion is
detected.
__ppp_channel_push() is a bit special. It calls __ppp_xmit_process()
without having any actual packet to send. This is used by
ppp_output_wakeup() to re-enable transmission on the parent unit (for
implementations like ppp_async.c, where the .start_xmit() function
might not consume the skb, leaving it in ppp->xmit_pending and
disabling transmission).
Therefore, __ppp_xmit_process() needs to handle the case where skb is
NULL, dequeuing as many packets as possible from ppp->file.xq.
Reported-by: xu heng <xuheng333@zoho.com>
Fixes: 55454a5658 ("ppp: avoid dealock on recursive xmit")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh says:
====================
Aquantia atlantic hot fixes 03-2018
This is a set of atlantic driver hot fixes for various areas:
Some issues with hardware reset covered,
Fixed napi_poll flood happening on some traffic conditions,
Allow system to change MAC address on live device,
Add pci shutdown handler.
patch v2:
- reverse christmas tree
- remove driver private parameter, replacing it with define.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We should close link and all NIC operations during shutdown.
On some systems graceful reboot never closes NIC interface on its own,
but only indicates pci device shutdown. Without explicit handler, NIC
rx rings continued to transfer DMA data into prepared buffers while CPU
rebooted already. That caused memory corruptions on soft reboot.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is nothing prevents us from changing MAC on the running interface.
Allow this with ndev priv flag.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should report to napi full budget only when we have more job to do.
Before this fix, on any tx queue cleanup we forced napi to do poll again.
Thats a waste of cpu resources and caused storming with napi polls when
there was at least one tx on each interrupt.
With this fix we report full budget only when there is more job on TX
to do. Or, as before, when rx budget was fully consumed.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
B1 hardware changes behavior of mailbox interface, it has busy bit
always raised. Data ready condition should be detected by increment
of address register.
Old code has empty `for` loop, and that caused cpu overloads on B1
hardware. aq_nic_service_timer_cb consumed ~100ms because of that.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW 1.5.58 and below needs a fixed delay even after 0x18 register
is filled. Otherwise, setting MPI_INIT state too fast causes
traffic hang.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under some circumstances (notably using thunderbolt interface) SPI
on chip reset may be in active transaction.
Here we forcibly cleanup SPI to prevent possible hangups.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-03-20
Please apply one final set of qeth patches for 4.16.
All of these fix long-standing bugs, so please queue them up for -stable
as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When the IRQ handler determines that one of the cmd IO channels has
failed and schedules recovery, block any further cmd requests from
being submitted. The request would inevitably stall, and prevent the
recovery from making progress until the request times out.
This sort of error was observed after Live Guest Relocation, where
the pending IO on the READ channel intentionally gets terminated to
kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
triggering qeth to issue a QUERY CARD INFO command. The command
then stalled in the inoperabel WRITE channel.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For calling ccw_device_start(), issue_next_read() needs to hold the
device's ccwlock.
This is satisfied for the IRQ handler path (where qeth_irq() gets called
under the ccwlock), but we need explicit locking for the initial call by
the MPC initialization.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qeth_wait_for_threads() is potentially called by multiple users, make
sure to notify all of them after qeth_clear_thread_running_bit()
adjusted the thread_running_mask. With no timeout, callers would
otherwise stall.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On removal, a qeth card's netdevice is currently not properly freed
because the call chain looks as follows:
qeth_core_remove_device(card)
lx_remove_device(card)
unregister_netdev(card->dev)
card->dev = NULL !!!
qeth_core_free_card(card)
if (card->dev) !!!
free_netdev(card->dev)
Fix it by free'ing the netdev straight after unregistering. This also
fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
where the need to free the current netdevice was not considered at all.
Note that free_netdev() takes care of the netif_napi_del() for us too.
Fixes: 4a71df5004 ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kevin Hao says:
====================
net: phy: Add general dummy stubs for MMD register access
v2:
As suggested by Andrew:
- Add general dummy stubs
- Also use that for the micrel phy
This patch series fix the Ethernet broken on the mpc8315erdb board introduced
by commit b6b5e8a691 ("gianfar: Disable EEE autoneg by default").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The new general dummy stubs for MMD register access were introduced.
Use that for the codes reuse.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Ethernet on mpc8315erdb is broken since commit b6b5e8a691
("gianfar: Disable EEE autoneg by default"). The reason is that
even though the rtl8211b doesn't support the MMD extended registers
access, it does return some random values if we trying to access
the MMD register via indirect method. This makes it seem that the
EEE is supported by this phy device. And the subsequent writing to
the MMD registers does cause the phy malfunction. So use the dummy
stubs for the MMD register access to fix this issue.
Fixes: b6b5e8a691 ("gianfar: Disable EEE autoneg by default")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For some phy devices, even though they don't support the MMD extended
register access, it does have some side effect if we are trying to
read/write the MMD registers via indirect method. So introduce general
dummy stubs for MMD register access which these devices can use to avoid
such side effect.
Fixes: b6b5e8a691 ("gianfar: Disable EEE autoneg by default")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix possible IPv6 packet loss when multicast extension is used, by Linus Luessing
- fix SKB handling issues for TTVN and DAT, by Matthias Schiffer (two patches)
- fix include for eventpoll, by Sven Eckelmann
- fix skb checksum for ttvn reroutes, by Sven Eckelmann
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlqv5tcWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeob7PD/wPnjVmFl6uQlRHfOfUBzGvW9hU
ASakGylzfZTXzP2VviMJ+7JehXQgediM4caTr/8jWhJFeVxu5jmeYzv8nzOlwZJ/
PjNlX+ChuIAWv1wyxHI3YWwj7Y1Ox9Lp8Gs7m5UbmfyiZEtb+ybHjvPzZ/FMfhWo
hOgbndSpFKfwFuXFkXyitektBcKTiFUyjk9U22v91XCQZZzMb8KqougVBE+bdg0w
N8LfgKXVC448sUNKTEczhji7bFuJwR+Sogx1TwIIsyfT2Tp4JXY3A7M1LDwAL/Rc
nTu9L58vk4qUouRjIQJ7rhhoxdPSrD5p3oinzVnLvJBxgOS3pxegY3asX8sRMXsj
bgolIGCZ9vDfA21WXqa3RyjUiBEx9UId6W++h+22kVqVHt5tqIFK2nVQ6IInOXwB
kzd3UDrRBLgRdBpwlu/ii9rn68MOEdLNpL3Seo9DkViJESfLn1ODnFA1Y2rFoK6S
cVeYl1DVj4SQSXjNilqYhFZoTEo/kxN5KhgHRXzujTv6MCHVa2SnCZaXS5EXp8n6
u4gpPc363Isj17A+UVlmHmmzA6d8Jqe8veVvET0xSvQPHmE7eepMkJU9hIUNXfcv
fENmQnU10pS1keJqS6xuR6k2RxMoofYRUPpKEigCcH7z/0GytWQc6QVGmo19LD9b
GNcWAhKI453pc1IuUA==
=hsM8
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-for-davem-20180319' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix possible IPv6 packet loss when multicast extension is used, by Linus Luessing
- fix SKB handling issues for TTVN and DAT, by Matthias Schiffer (two patches)
- fix include for eventpoll, by Sven Eckelmann
- fix skb checksum for ttvn reroutes, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tegra124_(primary|overlay)_formats for Tegra124, otherwise the count
specified in the Tegra124 SoC info structure will be different from the
array size and cause a crash.
Fixes: 511c7023cf ("drm/tegra: dc: Support more formats")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The netfilter netdevice event handler hold the nfnl_lock mutex, this
avoids races with a device going away while such device is being
attached to hooks from the netlink control plane. Therefore, either
control plane bails out with ENOENT or netdevice event path waits until
the hook that is attached to net_device is registered.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Devices going away have to grab the nfnl_lock from the netdev event path
to avoid races with control plane updates.
However, netlink dumps in netfilter do not hold nfnl_lock mutex. Cache
the device name into the objects to avoid an use-after-free situation
for a device that is going away.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The clockid argument of clockid_to_kclock() comes straight from user space
via various syscalls and is used as index into the posix_clocks array.
Protect it against spectre v1 array out of bounds speculation. Remove the
redundant check for !posix_clock[id] as this is another source for
speculation and does not provide any advantage over the return
posix_clock[id] path which returns NULL in that case anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1802151718320.1296@nanos.tec.linutronix.de
In loopback_open() and loopback_close(), we assign and release the
substream object to the corresponding cable in a racy way. It's
neither locked nor done in the right position. The open callback
assigns the substream before its preparation finishes, hence the other
side of the cable may pick it up, which may lead to the invalid memory
access.
This patch addresses these: move the assignment to the end of the open
callback, and wrap with cable->lock for avoiding concurrent accesses.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The aloop driver tries to stop the pending timer via timer_del() in
the trigger callback and in the close callback. The former is
correct, as it's an atomic operation, while the latter expects that
the timer gets really removed and proceeds the resource releases after
that. But timer_del() doesn't synchronize, hence the running timer
may still access the released resources.
A similar situation can be also seen in the prepare callback after
trigger(STOP) where the prepare tries to re-initialize the things
while a timer is still running.
The problems like the above are seen indirectly in some syzkaller
reports (although it's not 100% clear whether this is the only cause,
as the race condition is quite narrow and not always easy to
trigger).
For addressing these issues, this patch adds the explicit alls of
timer_del_sync() in some places, so that the pending timer is properly
killed / synced.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
It will have a chance speaker no sound after system resume.
To toggle NID 0x53 index 0x2 bit 15 will solve this issue.
This usage will also suitable with ALC256.
Fixes: 4a219ef8f3 ("ALSA: hda/realtek - Add ALC256 HP depop function")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This platform was hardware fixed type for CTIA type for headset port.
Assigned 0x19 verb will fix can't record issue.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The memmap options sent to the udl framebuffer driver were not being
checked for all sets of possible crazy values. Fix this up by properly
bounding the allowed values.
Reported-by: Eyal Itkin <eyalit@checkpoint.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180321154553.GA18454@kroah.com
The bitfield dma_inuse is allocated of size dma_requests bits, thus a
valid bit address is from 0 to (dma_requests - 1).
When find_first_zero_bit() fails, it returns dma_requests as invalid
address.
Using such address for the following set_bit() is incorrect and, if
dma_requests is a multiple of BITS_PER_LONG, it will cause a buffer
overflow.
Currently this driver is only used in DT stm32h743.dtsi where a safe value
dma_requests=16 is not triggering the buffer overflow.
Fixed by checking the return value of find_first_zero_bit() _before_
using it.
Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This reverts commit 2170dd0431
The intent of commit 2170dd0431 ("vfio-pci: Mask INTx if a device is
not capabable of enabling it") was to disallow the user from seeing
that the device supports INTx if the platform is incapable of enabling
it. The detection of this case however incorrectly includes devices
which natively do not support INTx, such as SR-IOV VFs, and further
discussions reveal gaps even for the target use case.
Reported-by: Arjun Vynipadath <arjun@chelsio.com>
Fixes: 2170dd0431 ("vfio-pci: Mask INTx if a device is not capabable of enabling it")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Since commit 3af5a67c86 ("MIPS: Fix early CM probing") the MT7621 has
not been able to boot.
This commit caused mips_cm_probe() to be called before
mt7621.c::proc_soc_init().
prom_soc_init() has a comment explaining that mips_cm_probe() "wipes out
the bootloader config" and means that configuration registers are no
longer available. It has some code to re-enable this config.
Before this re-enable code is run, the sysc register cannot be read, so
when SYSC_REG_CHIP_NAME0 is read, a garbage value is returned and
panic() is called.
If we move the config-repair code to the top of prom_soc_init(), the
registers can be read and boot can proceed.
Very occasionally, the first register read after the reconfiguration
returns garbage, so add a call to __sync().
Fixes: 3af5a67c86 ("MIPS: Fix early CM probing")
Signed-off-by: NeilBrown <neil@brown.name>
Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.5+
Patchwork: https://patchwork.linux-mips.org/patch/18859/
Signed-off-by: James Hogan <jhogan@kernel.org>
ralink_halt() does nothing that machine_halt() doesn't already do, so it
adds no value.
It actually causes incorrect behaviour due to the "unreachable()" at the
end. This tells the compiler that the end of the function will never be
reached, which isn't true. The compiler responds by not adding a
'return' instruction, so control simply moves on to whatever bytes come
afterwards in memory. In my tested, that was the ralink_restart()
function. This means that an attempt to 'halt' the machine would
actually cause a reboot.
So remove ralink_halt() so that a 'halt' really does halt.
Fixes: c06e836ada ("MIPS: ralink: adds reset code")
Signed-off-by: NeilBrown <neil@brown.name>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.9+
Patchwork: https://patchwork.linux-mips.org/patch/18851/
Signed-off-by: James Hogan <jhogan@kernel.org>
A few more fixes for 4.16. Mostly for displays:
- A fix for DP handling on radeon
- Fix banding on eDP panels
- Fix HBR audio
- Fix for disabling VGA mode on Raven that leads to a corrupt or
blank display on some platforms
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/display: Add one to EDID's audio channel count when passing to DC
drm/amd/display: We shouldn't set format_default on plane as atomic driver
drm/amd/display: Fix FMT truncation programming
drm/amd/display: Allow truncation to 10 bits
drm/amd/display: fix dereferencing possible ERR_PTR()
drm/amd/display: Refine disable VGA
drm/amdgpu: Use atomic function to disable crtcs with dc enabled
drm/radeon: Don't turn off DP sink when disconnected
Davide Caratti says:
====================
fix idr leak in actions
This series fixes situations where a temporary failure to install a TC
action results in the permanent impossibility to reuse the configured
value of 'index'.
Thanks to Cong Wang for the initial review.
v2: fix build error in act_ipt.c, reported by kbuild test robot
====================
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_skbmod_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure skbmod rules
using the same idr value will systematically fail with -ENOSPC, unless
the first attempt was done using the 'replace' keyword:
# tc action add action skbmod swap mac index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
on the error path when the idr has been reserved, but not yet inserted.
Also, don't test 'ovr' in the error path, to avoid a 'replace' failure
implicitly become a 'delete' that leaks refcount in act_skbmod module:
# rmmod act_skbmod; modprobe act_skbmod
# tc action add action skbmod swap mac index 100
# tc action add action skbmod swap mac continue index 100
RTNETLINK answers: File exists
We have an error talking to the kernel
# tc action replace action skbmod swap mac continue index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action skbmod
#
# rmmod act_skbmod
rmmod: ERROR: Module act_skbmod is in use
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_vlan_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure vlan rules using
the same idr value will systematically fail with -ENOSPC, unless the first
attempt was done using the 'replace' keyword.
# tc action add action vlan pop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_vlan_init(), ensuring that tcf_idr_release() is called on
the error path when the idr has been reserved, but not yet inserted. Also,
don't test 'ovr' in the error path, to avoid a 'replace' failure implicitly
become a 'delete' that leaks refcount in act_vlan module:
# rmmod act_vlan; modprobe act_vlan
# tc action add action vlan push id 5 index 100
# tc action replace action vlan push id 7 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action vlan
#
# rmmod act_vlan
rmmod: ERROR: Module act_vlan is in use
Fixes: 4c5b9d9642 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update")
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__tcf_ipt_init() can fail after the idr has been successfully reserved.
When this happens, subsequent attempts to configure xt/ipt rules using
the same idr value systematically fail with -ENOSPC:
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
to avoid NULL pointer dereference.
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_pedit_init() can fail to allocate 'keys' after the idr has been
successfully reserved. When this happens, subsequent attempts to configure
a pedit rule using the same idr value systematically fail with -ENOSPC:
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_act_pedit_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The persistence domain is a point in the platform where once writes
reach that destination the platform claims it will make them persistent
relative to power loss. In the ACPI NFIT this is currently communicated
as 2 bits in the "NFIT - Platform Capabilities Structure". The bits
comprise a hierarchy, i.e. bit0 "CPU Cache Flush to NVDIMM Durability on
Power Loss Capable" implies bit1 "Memory Controller Flush to NVDIMM
Durability on Power Loss Capable".
Commit 96c3a23905 "libnvdimm: expose platform persistence attr..."
shows the persistence domain as flags, but it's really an enumerated
hierarchy.
Fix this newly introduced user ABI to show the closest available
persistence domain before userspace develops dependencies on seeing, or
needing to develop code to tolerate, the raw NFIT flags communicated
through the libnvdimm-generic region attribute.
Fixes: 96c3a23905 ("libnvdimm: expose platform persistence attr...")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
tcf_act_police_init() can fail after the idr has been successfully
reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
subsequent attempts to configure a police rule using the same idr value
systematiclly fail with -ENOSPC:
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
...
Fix this in the error path of tcf_act_police_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if the kernel fails to duplicate 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' systematically fail with -ENOSPC, and
that value of 'index' will no more be usable by act_simple, until rmmod /
insmod of act_simple.ko is done:
# tc actions add action simple sdata hello index 100
# tc actions list action simple
action order 0: Simple <hello>
index 100 ref 1 bind 0
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>