It is possible that on a (partially) unsuccessful page reclaim,
kref_put() called in z3fold_reclaim_page() does not yield page release,
but the page is released shortly afterwards by another thread. Then
z3fold_reclaim_page() would try to list_add() that (released) page again
which is obviously a bug.
To avoid that, spin_lock() has to be taken earlier, before the
kref_put() call mentioned earlier.
Link: http://lkml.kernel.org/r/20170913162937.bfff21c7d12b12a5f47639fd@gmail.com
Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: <Oleksiy.Avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22
Remove GPIO_PH[0-7] from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-5-git-send-email-geert+renesas@glider.be
Fixes: ef0fa5331a ("sh: Add pinmux for sh7269")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22
Remove GPIO_PH[0-7] from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-4-git-send-email-geert+renesas@glider.be
Fixes: 41797f7548 ("sh: Add pinmux for sh7264")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 3810e96056 ("sh: modify pinmux for SH7757 2nd cut") renamed
GPIO_PT[JLNQ]7 to GPIO_PT[JLNQ]7_RESV, and removed the existing users
from the pinmux_pins[] array.
However, pinmux_pins[] is initialized through PINMUX_GPIO(), using
designated array initializers, where the GPIO_* enums serve as indices.
Hence entries were not really removed, but replaced by (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22
Remove GPIO_PT[JLNQ]7_RESV from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-3-git-send-email-geert+renesas@glider.be
Fixes: 3810e96056 ("sh: modify pinmux for SH7757 2nd cut")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "sh: sh7722/sh7757i/sh7264/sh7269: Fix pinctrl registration",
v2.
Magnus Damm reported that on sh7722/Migo-R, pinctrl registration fails
with:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22
pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. Apparently
GPIO_PTQ7 was defined in the enum, but never used. If enum values are
defined, but never used, pinmux_pins[] contains (zero-filled) holes.
Hence such entries are treated as pin zero, which was registered before,
and pinctrl registration fails.
I can't see how this ever worked, as at the time of commit f5e25ae52f
("sh-pfc: Add sh7722 pinmux support"), pinmux_gpios[] in
drivers/pinctrl/sh-pfc/pfc-sh7722.c already had the hole, and
drivers/pinctrl/core.c already had the check.
Some scripting revealed a few more broken drivers:
- sh7757 has four holes, due to nonexistent GPIO_PT[JLNQ]7_RESV.
- sh7264 and sh7269 define GPIO_PH[0-7], but don't use it with
PINMUX_GPIO().
Patch 1 fixes the issue on sh7722, and was tested. Patches 3-4 should
fix the issue on the other 3 SoCs, but was untested due to lack of
hardware.
This patch (of 4):
On sh7722/Migo-R, pinctrl registration fails with:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22
pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array
initializers, where the GPIO_* enums serve as indices. As GPIO_PTQ7 is
defined in the enum, but never used, pinmux_pins[] contains a
(zero-filled) hole. Hence this entry is treated as pin zero, which was
registered before, and pinctrl registration fails.
According to the datasheet, port PTQ7 does not exist. Hence remove
GPIO_PTQ7 from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-2-git-send-email-geert+renesas@glider.be
Fixes: 8d7b5b0af7 ("sh: Add sh7722 pinmux code")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: Magnus Damm <magnus.damm@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a bug in madvise() where if you'd try to soft offline a
hugepage via madvise(), while walking the address range you'd end up,
using the wrong page offset due to attempting to get the compound order
of a former but presently not compound page, due to dissolving the huge
page (since commit c3114a84f7f9: "mm: hugetlb: soft-offline: dissolve
source hugepage after successful migration").
As a result I ended up with all my free pages except one being offlined.
Link: http://lkml.kernel.org/r/20170912204306.GA12053@gmail.com
Fixes: c3114a84f7 ("mm: hugetlb: soft-offline: dissolve source hugepage after successful migration")
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In this place mm is unlocked, so vmas or list may change. Down read
mmap_sem to protect them from modifications.
Link: http://lkml.kernel.org/r/150512788393.10691.8868381099691121308.stgit@localhost.localdomain
Fixes: e86c59b1b1 ("mm/ksm: improve deduplication of zero pages with colouring")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a typo in recent change of VM_MPX definition. We want it to be
VM_HIGH_ARCH_4, not VM_HIGH_ARCH_BIT_4.
This bug does cause visible regressions. In arch_vma_name the vmflags
are tested against VM_MPX. With the incorrect value of VM_MPX, a number
of vmas (such as the stack) test positive and end up being marked as
"[mpx]" in /proc/N/maps instead of their correct names.
This confuses tools like rr which expect to be able to find familiar
vmas.
Fixes: df3735c5b4 ("x86,mpx: make mpx depend on x86-64 to free up VMA flag")
Link: http://lkml.kernel.org/r/20170918140253.36856-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kyle Huey <me@kylehuey.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are some of the more spelling mistakes and typos that I've found
while fixing up spelling mistakes in kernel error message text over the
past eight weeks.
[akpm@linux-foundation.org: s/|/||/, per Joe]
Link: http://lkml.kernel.org/r/20170919090818.5989-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Joe Perches <joe@perches.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This parameter is named kp, so the documentation should use that.
Fixes: 9b473de872 ("param: Fix duplicate module prefixes")
Link: http://lkml.kernel.org/r/20170919142656.64aea59e@endymion
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The build of alpha allmodconfig is giving error:
arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm':
arch/alpha/include/asm/mmu_context.h:160:2: error:
implicit declaration of function 'task_thread_info';
did you mean 'init_thread_info'? [-Werror=implicit-function-declaration]
The file 'mmu_context.h' needed an extra header file.
Link: http://lkml.kernel.org/r/1505668810-7497-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marcelo Ricardo Leitner says:
====================
Introduce SCTP Stream Schedulers
This patchset introduces the SCTP Stream Schedulers are defined by
https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
It provides 3 schedulers at the moment: FCFS, Priority and Round Robin.
The other 3, Round Robin per packet, Fair Capacity and Weighted Fair
Capacity will be added later. More specifically, WFQ is required by
WebRTC Datachannels.
The draft also defines the idata chunk, allowing a usermsg to be
interrupted by another piece of idata from another stream. This patchset
*doesn't* include it. It will be posted later by Xin Long. Its
integration with this patchset is very simple and it basically only
requires a tweak in sctp_sched_dequeue_done(), to ignore datamsg
boundaries.
The first 5 patches are a preparation for the next ones. The most
relevant patches are the 4th and 6th ones. More details are available on
each patch.
v2: changelog update on patch 3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces RFC Draft ndata section 3.2 Priority Based
Scheduler (SCTP_SS_RR).
Works by maintaining a list of enqueued streams and tracking the last
one used to send data. When the datamsg is done, it switches to the next
stream.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces RFC Draft ndata section 3.4 Priority Based
Scheduler (SCTP_SS_PRIO).
It works by having a struct sctp_stream_priority for each priority
configured. This struct is then enlisted on a queue ordered per priority
if, and only if, there is a stream with data queued, so that dequeueing
is very straightforward: either finish current datamsg or simply dequeue
from the highest priority queued, which is the next stream pointed, and
that's it.
If there are multiple streams assigned with the same priority and with
data queued, it will do round robin amongst them while respecting
datamsgs boundaries (when not using idata chunks), to be reasonably
fair.
We intentionally don't maintain a list of priorities nor a list of all
streams with the same priority to save memory. The first would mean at
least 2 other pointers per priority (which, for 1000 priorities, that
can mean 16kB) and the second would also mean 2 other pointers but per
stream. As SCTP supports up to 65535 streams on a given asoc, that's
1MB. This impacts when giving a priority to some stream, as we have to
find out if the new priority is already being used and if we can free
the old one, and also when tearing down.
The new fields in struct sctp_stream_out_ext and sctp_stream are added
under a union because that memory is to be shared with other schedulers.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As defined per RFC Draft ndata Section 4.3.3, named as
SCTP_STREAM_SCHEDULER_VALUE.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As defined per RFC Draft ndata Section 4.3.2, named as
SCTP_STREAM_SCHEDULER.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces the hooks necessary to do stream scheduling, as
per RFC Draft ndata. It also introduces the first scheduler, which is
what we do today but now factored out: first come first served (FCFS).
With stream scheduling now we have to track which chunk was enqueued on
which stream and be able to select another other than the in front of
the main outqueue. So we introduce a list on sctp_stream_out_ext
structure for this purpose.
We reuse sctp_chunk->transmitted_list space for the list above, as the
chunk cannot belong to the two lists at the same time. By using the
union in there, we can have distinct names for these moments.
sctp_sched_ops are the operations expected to be implemented by each
scheduler. The dequeueing is a bit particular to this implementation but
it is to match how we dequeue packets today. We first dequeue and then
check if it fits the packet and if not, we requeue it at head. Thus why
we don't have a peek operation but have dequeue_done instead, which is
called once the chunk can be safely considered as transmitted.
The check removed from sctp_outq_flush is now performed by
sctp_stream_outq_migrate, which is only called during assoc setup.
(sctp_sendmsg() also checks for it)
The only operation that is foreseen but not yet added here is a way to
signalize that a new packet is starting or that the packet is done, for
round robin scheduler per packet, but is intentionally left to the
patch that actually implements it.
Support for I-DATA chunks, also described in this RFC, with user message
interleaving is straightforward as it just requires the schedulers to
probe for the feature and ignore datamsg boundaries when dequeueing.
See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to fetch the stream number from a given chunk.
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the stream schedulers, sctp_stream_out will become too big to be
allocated by kmalloc and as we need to allocate with BH disabled, we
cannot use __vmalloc in sctp_stream_init().
This patch moves out the stats from sctp_stream_out to
sctp_stream_out_ext, which will be allocated only when the application
tries to sendmsg something on it.
Just the introduction of sctp_stream_out_ext would already fix the issue
described above by splitting the allocation in two. Moving the stats
to it also reduces the pressure on the allocator as we will ask for less
memory atomically when creating the socket and we will use GFP_KERNEL
later.
Then, for stream schedulers, we will just use sctp_stream_out_ext.
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is 1 place allocating it and another reallocating. Move such
procedures to a common function.
v2: updated changelog
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is 1 place allocating it and 2 other reallocating. Move such
procedures to a common function.
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As SCTP supports up to 65535 streams, that can lead to very large
allocations in sctp_stream_init(). As Xin Long noticed, systems with
small amounts of memory are more prone to not have enough memory and
dump warnings on dmesg initiated by user actions. Thus, silence them.
Also, if the reallocation of stream->out is not necessary, skip it and
keep the memory we already have.
Reported-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2017-10-03
This series contains updates to fm10k only.
Jake provides majority of the changes in this series, starting with using
fm10k_prepare_for_reset() if we lose PCIe link. Before we would detach
the device and close the netdev, which left a lot of items still active,
such as the Tx/Rx resources. This could cause problems where register
reads would return potentially invalid values and would result in unknown
driver behavior, so call fm10k_prepare_for_reset() much like we do for
suspend/resume cycles. This will attempt to shutdown as much as possible
to prevent possible issues. Then replaced the PCI specific legacy power
management hooks with the new generic power management hooks for both
suspend and hibernate. Introduced a workqueue item which monitors a
queue of MAC and VLAN requests since a large number of MAC address or
VLAN updates at once can overload the mailbox with too many messages at
once. Fixed a cppcheck warning by properly declaring the min_rate and
max_rate variables in the declaration and definition for .ndo_set_vf_bw,
rather than using "unused" for the minimum rates.
Joe Perches fixes the backward logic when using net_ratelimit().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- bpf prog_array just like all other types of bpf array accepts 32-bit index.
Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent
The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40 in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 96eabe7a40 ("bpf: Allow selecting numa node during map creation")
Fixes: b52f00e6a7 ("x86: bpf_jit: implement bpf_tail_call() helper")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device alias can be set by either rtnetlink (rtnl is held) or sysfs.
rtnetlink hold the rtnl mutex, sysfs acquires it for this purpose.
Add an extra mutex for it and use rcu to protect concurrent accesses.
This allows the sysfs path to not take rtnl and would later allow
to not hold it when dumping ifalias.
Based on suggestion from Eric Dumazet.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add constants and callback functions for the dwmac on rk3128 soc.
As can be seen, the base structure is the same, only registers
and the bits in them moved slightly.
Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some NIC drivers don't have correct speed/duplex settings at the
time they send NETDEV_UP notification and that messes up the
bonding state. Especially 802.3ad mode which is very sensitive
to these settings. In the current implementation we invoke
bond_update_speed_duplex() when we receive NETDEV_UP, however,
ignore the return value. If the values we get are invalid
(UNKNOWN), then slave gets removed from the aggregator with
speed and duplex set to UNKNOWN while link is still marked as UP.
This patch fixes this scenario. Also 802.3ad mode is sensitive to
these conditions while other modes are not, so making sure that it
doesn't change the behavior for other modes.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Treat the ef/04/01 interface class/subclass/protocol combination used
by the Novatel Verizon USB730L (1410:9030) as a possible RNDIS
interface.
T: Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#= 17 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 3
P: Vendor=1410 ProdID=9030 Rev=03.10
S: Manufacturer=Novatel Wireless
S: Product=MiFi USB730L
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 3 Cfg#= 1 Atr=80 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host
I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
I: If#= 2 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid
Once the network interface is brought up, the user just needs to run a
DHCP client to get IP address and routing setup.
As a side note, other Novatel Verizon USB730L models with the same
vid:pid end up exposing a standard ECM interface which doesn't require
any other kernel update to make it work.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cgroup fix from Tejun Heo:
"The recent migration code updates assumed that migrations always
execute from the top to the bottom once and didn't clean up internal
states after each migration round; however, cgroup_transfer_tasks()
repeats the inner steps multiple times and the garbage internal states
from the previous iteration led to OOPS.
Waiman fixed the bug by reinitializing the relevant states at the end
of each migration round"
* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Reinit cgroup_taskset structure before cgroup_migrate_execute() returns
We accidentally return success if the kmalloc_array() call fails.
Fixes: 0e14c7777a ("mlxsw: spectrum: Add the multicast routing hardware logic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw_afa_block_create() doesn't return error pointers, it returns NULL
on error.
Fixes: 0e14c7777a ("mlxsw: spectrum: Add the multicast routing hardware logic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mt7530_phy_write is local to the source and does not need to
be in global scope, so make it static.
Cleans up sparse warnings:
symbol 'mt7530_phy_write' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions lan9303_mdio_phy_write and lan9303_mdio_phy_read are local
to the source and do not need to be in global scope, so make them static.
Cleans up sparse warnings:
symbol 'lan9303_mdio_phy_write' was not declared. Should it be static?
symbol 'lan9303_mdio_phy_read' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When RTM_GETSTATS was added the fields of its header struct were not all
initialized when returning the result thus leaking 4 bytes of information
to user-space per rtnl_fill_statsinfo call, so initialize them now. Thanks
to Alexander Potapenko for the detailed report and bisection.
Reported-by: Alexander Potapenko <glider@google.com>
Fixes: 10c9ead9f3 ("rtnetlink: add new RTM_GETSTATS message to dump link stats")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
mlxsw: Add support for partial multicast route offload
Yotam says:
Previous patchset introduced support for offloading multicast MFC routes to
the Spectrum hardware. As described in that patchset, no partial offloading
is supported, i.e if a route has one output interface which is not a valid
offloadable device (e.g. pimreg device, dummy device, management NIC), the
route is trapped to the CPU and the forwarding is done in slow-path.
Add support for partial offloading of multicast routes, by letting the
hardware to forward the packet to all the in-hardware devices, while the
kernel ipmr module will continue forwarding to all other interfaces.
Similarly to the bridge, the kernel ipmr module will forward a marked
packet to an interface only if the interface has a different parent ID than
the packet's ingress interfaces.
The first patch introduces the offload_mr_fwd_mark skb field, which can be
used by offloading drivers to indicate that a packet had already gone
through multicast forwarding in hardware, similarly to the offload_fwd_mark
field that indicates that a packet had already gone through L2 forwarding
in hardware.
Patches 2 and 3 change the ipmr module to not forward packets that had
already been forwarded by the hardware, i.e. packets that are marked with
offload_mr_fwd_mark and the ingress VIF shares the same parent ID with the
egress VIF.
Patches 4, 5, 6 and 7 add the support in the mlxsw Spectrum driver for trap
and forward routes, while marking the trapped packets with the
offload_mr_fwd_mark.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the support of trap-and-forward route action in the multicast routing
offloading logic. A route will be set to trap-and-forward action if one (or
more) of its output interfaces is not offload-able, i.e. does not have a
valid Spectrum RIF.
This way, a route with mixed output VIFs list, which contains both
offload-able and un-offload-able devices can go through partial offloading
in hardware, and the rest will be done in the kernel ipmr module.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to the current multicast route actions, which include trap
route action and a forward route action, add the trap-and-forward multicast
route action, and implement it in the multicast routing hardware logic.
To implement that, add a trap-and-forward ACL action as the last action in
the route flexible action set. The used trap is the ACL2 trap, which marks
the packets with offload_mr_forward_mark, to prevent the packet from being
forwarded again by the kernel.
Note: At that stage the offloading logic does not support trap-and-forward
multicast routes. This patch adds the support only in the hardware logic.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a multicast route is configured with trap-and-forward action, the
packets should be marked with skb->offload_mr_fwd_mark, in order to prevent
the packets from being forwarded again by the kernel ipmr module.
Due to this, it is not possible to use the already existing multicast trap
(MLXSW_TRAP_ID_ACL1) as the packet should be marked differently. Add the
MLXSW_TRAP_ID_ACL2 which is for trap-and-forward multicast routes, and set
the offload_mr_fwd_mark skb field in its handler.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use trap/discard flex action to implement trap and forward. The action will
later be used for multicast routing, as the multicast routing mechanism is
done using ACL flexible actions in Spectrum hardware. Using that action, it
will be possible to implement a trap-and-forward route.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the ipmr module to not forward packets if:
- The packet is marked with the offload_mr_fwd_mark, and
- Both input interface and output interface share the same parent ID.
This way, a packet can go through partial multicast forwarding in the
hardware, where it will be forwarded only to the devices that share the
same parent ID (AKA, reside inside the same hardware). The kernel will
forward the packet to all other interfaces.
To do this, add the ipmr_offload_forward helper, which per skb, ingress VIF
and egress VIF, returns whether the forwarding was offloaded to hardware.
The ipmr_queue_xmit frees the skb and does not forward it if the result is
a true value.
All the forwarding path code compiles out when the CONFIG_NET_SWITCHDEV is
not set.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to allow the ipmr module to do partial multicast forwarding
according to the device parent ID, add the device parent ID field to the
VIF struct. This way, the forwarding path can use the parent ID field
without invoking switchdev calls, which requires the RTNL lock.
When a new VIF is added, set the device parent ID field in it by invoking
the switchdev_port_attr_get call.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly to the offload_fwd_mark field, the offload_mr_fwd_mark field is
used to allow partial offloading of MFC multicast routes.
Switchdev drivers can offload MFC multicast routes to the hardware by
registering to the FIB notification chain. When one of the route output
interfaces is not offload-able, i.e. has different parent ID, the route
cannot be fully offloaded by the hardware. Examples to non-offload-able
devices are a management NIC, dummy device, pimreg device, etc.
Similar problem exists in the bridge module, as one bridge can hold
interfaces with different parent IDs. At the bridge, the problem is solved
by the offload_fwd_mark skb field.
Currently, when a route cannot go through full offload, the only solution
for a switchdev driver is not to offload it at all and let the packet go
through slow path.
Using the offload_mr_fwd_mark field, a driver can indicate that a packet
was already forwarded by hardware to all the devices with the same parent
ID as the input device. Further patches in this patch-set are going to
enhance ipmr to skip multicast forwarding to devices with the same parent
ID if a packets is marked with that field.
The reason why the already existing "offload_fwd_mark" bit cannot be used
is that a switchdev driver would want to make the distinction between a
packet that has already gone through L2 forwarding but did not go through
multicast forwarding, and a packet that has already gone through both L2
and multicast forwarding.
For example: when a packet is ingressing from a switchport enslaved to a
bridge, which is configured with multicast forwarding, the following
scenarios are possible:
- The packet can be trapped to the CPU due to exception while multicast
forwarding (for example, MTU error). In that case, it had already gone
through L2 forwarding in the hardware, thus A switchdev driver would
want to set the skb->offload_fwd_mark and not the
skb->offload_mr_fwd_mark.
- The packet can also be trapped due to a pimreg/dummy device used as one
of the output interfaces. In that case, it can go through both L2 and
(partial) multicast forwarding inside the hardware, thus a switchdev
driver would want to set both the skb->offload_fwd_mark and
skb->offload_mr_fwd_mark.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellaox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull percpu fixes from Tejun Heo:
"Rather important fixes this time.
- The new percpu area allocator had a subtle bug in how it iterates
the memory regions and could skip viable areas, which led to
allocation failures for module static percpu variables. Dennis
fixed the bug and another non-critical one in stat calculation.
- Mark noticed that the generic implementations of percpu local
atomic reads aren't properly protected against irqs and there's a
(slim) chance for split reads on some 32bit systems. Generic
implementations are updated to disable irq when read size is larger
than ulong size. This may have made some 32bit archs which can do
atomic local 64bit accesses generate sub-optimal code. We need to
find them out and implement arch-specific overrides"
* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix iteration to prevent skipping over block
percpu: fix starting offset for chunk statistics traversal
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts
We have lost a comment for minimum mtu value set for netdevice with
'commit d894be57ca ("ethernet: use net core MTU range checking in
more drivers"). Updating it accordingly.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull libata fixes from Tejun Heo:
"Nothing too interesting.
Arnd's gcc-7 warning fixes that slipped through the cracks for two
release cycles (my bad), and two minor low level driver updates"
* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: don't ignore result code of ahci_reset_controller()
ata_piix: Add Fujitsu-Siemens Lifebook S6120 to short cable IDs
ata: avoid gcc-7 warning in ata_timing_quantize
Here are a number of USB fixes for 4.14-rc4 to resolved reported issue.
There's a bunch of stuff in here based on the great work Andrey
Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
dealing with corrupted USB descriptors that we've never seen in "normal"
operation, but is now ensuring the stack is much more hardened overall.
There's also the usual XHCI and gadget driver fixes as well, and a build
error fix, and a few other minor things, full details in the shortlog.
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN/yw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl6pQCdGY+nPJhzj9EIeFj5QUpSuS4b1pYAoKrbNn+V
CMpg4iG1oXUtVL8jBbKa
=fVpl
-----END PGP SIGNATURE-----
Merge tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.14-rc4 to resolved reported
issues.
There's a bunch of stuff in here based on the great work Andrey
Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
dealing with corrupted USB descriptors that we've never seen in
"normal" operation, but is now ensuring the stack is much more
hardened overall.
There's also the usual XHCI and gadget driver fixes as well, and a
build error fix, and a few other minor things, full details in the
shortlog.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
usb: gadget: udc: atmel: set vbus irqflags explicitly
usb: gadget: ffs: handle I/O completion in-order
usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
usb: gadget: udc: renesas_usb3: fix for no-data control transfer
USB: dummy-hcd: Fix erroneous synchronization change
USB: dummy-hcd: fix infinite-loop resubmission bug
USB: dummy-hcd: fix connection failures (wrong speed)
USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
USB: devio: Don't corrupt user memory
USB: devio: Prevent integer overflow in proc_do_submiturb()
USB: g_mass_storage: Fix deadlock when driver is unbound
USB: gadgetfs: Fix crash caused by inadequate synchronization
USB: gadgetfs: fix copy_to_user while holding spinlock
USB: uas: fix bug in handling of alternate settings
usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
usb-storage: fix bogus hardware error messages for ATA pass-thru devices
...
Here are a small number (5) of patches for some reported TTY and serial
issues. Nothing major, a documentation update, timing fix, error
handling fix, name reporting fix, and a timeout issue resolved.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN+uw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykZmgCbBSJmwcbVhuhZ64Fx4OE0eprjOgoAoMLmHaT2
jTjQTxM/Gaz108t3o9rt
=5ve+
-----END PGP SIGNATURE-----
Merge tag 'tty-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are a small number (5) of patches for some reported TTY and
serial issues. Nothing major, a documentation update, timing fix,
error handling fix, name reporting fix, and a timeout issue resolved.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: sccnxp: Fix error handling in sccnxp_probe()
tty: serial: lpuart: avoid report NULL interrupt
serial: bcm63xx: fix timing issue.
mxser: fix timeout calculation for low rates
serial: sh-sci: document R8A77970 bindings
Here are some small staging/IIO driver fixes for 4.14-rc4
Most of these have been in my tree for a while due to travels, sorry for
the delay. They resolve a number of small issues reported by people,
mostly for the iio drivers. Nothing major in here, full details are in
the shortlog.
All have been linux-next for a few weeks with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN+KQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yluygCgneh7i/okOfsmt/p75eCA4ClWVLwAoIE7BZzt
1WdBcY/Zxv1ANIoY7ZTQ
=K+FX
-----END PGP SIGNATURE-----
Merge tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are some small staging/IIO driver fixes for 4.14-rc4
Most of these have been in my tree for a while due to travels, sorry
for the delay. They resolve a number of small issues reported by
people, mostly for the iio drivers. Nothing major in here, full
details are in the shortlog.
All have been linux-next for a few weeks with no reported issues"
* tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack.
iio: core: Return error for failed read_reg
iio: ad7793: Fix the serial interface reset
iio: ad_sigma_delta: Implement a dedicated reset function
IIO: BME280: Updates to Humidity readings need ctrl_reg write!
iio: adc: mcp320x: Fix readout of negative voltages
iio: adc: mcp320x: Fix oops on module unload
iio: adc: stm32: fix bad error check on max_channels
iio: trigger: stm32-timer: fix a corner case to write preset
iio: trigger: stm32-timer: preset shouldn't be buffered
iio: adc: twl4030: Return an error if we can not enable the vusb3v1 regulator in 'twl4030_madc_probe()'
iio: adc: twl4030: Disable the vusb3v1 rugulator in the error handling path of 'twl4030_madc_probe()'
iio: adc: twl4030: Fix an error handling path in 'twl4030_madc_probe()'
staging: rtl8723bs: avoid null pointer dereference on pmlmepriv
staging: rtl8723bs: add missing range check on id
staging: vchiq_2835_arm: Fix NULL ptr dereference in free_pagelist
staging: speakup: fix speakup-r empty line lockup
staging: pi433: Move limit check to switch default to kill warning
staging: r8822be: fix null pointer dereferences with a null driver_adapter
staging: mt29f_spinand: Enable the read ECC before program the page
...
We've had support for setting both a minimum and maximum bandwidth via
.ndo_set_vf_bw since commit 883a9ccbae ("fm10k: Add support for SR-IOV
to driver", 2014-09-20).
Likely because we do not support minimum rates, the declaration
mis-ordered the "unused" parameter, which causes warnings when analyzed
with cppcheck.
Fix this warning by properly declaring the min_rate and max_rate
variables in the declaration and definition (rather than using
"unused"). Also rename "rate" to max_rate so as to clarify that we only
support setting the maximum rate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>