As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():
vsock_stream_has_space
vmci_transport_stream_has_space
vmci_qpair_produce_free_space
qp_lock
qp_acquire_queue_mutex
mutex_lock
Just switch to the new wait API like we did for commit
d9dc8b0f8b ("net: fix sleeping for sk_wait_event()").
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit dc9c4d0fe0, the arp_target array moved from a static global
to a local variable. By the nature of static globals, the array used to
be initialized to all 0. At present, it's full of random data, which
that gets interpreted as arp_target values, when none have actually been
specified. Systems end up booting with spew along these lines:
[ 32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
[ 32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.204892] lacp0: Setting MII monitoring interval to 100
[ 32.211071] lacp0: Removing ARP target 216.124.228.17
[ 32.216824] lacp0: Removing ARP target 218.160.255.255
[ 32.222646] lacp0: Removing ARP target 185.170.136.184
[ 32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.243987] lacp0: Removing ARP target 56.125.228.17
[ 32.249625] lacp0: Removing ARP target 218.160.255.255
[ 32.255432] lacp0: Removing ARP target 15.157.233.184
[ 32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.276632] lacp0: Removing ARP target 16.0.0.0
[ 32.281755] lacp0: Removing ARP target 218.160.255.255
[ 32.287567] lacp0: Removing ARP target 72.125.228.17
[ 32.293165] lacp0: Removing ARP target 218.160.255.255
[ 32.298970] lacp0: Removing ARP target 8.125.228.17
[ 32.304458] lacp0: Removing ARP target 218.160.255.255
None of these were actually specified as ARP targets, and the driver does
seem to clean up the mess okay, but it's rather noisy and confusing, leaks
values to userspace, and the 255.255.255.255 spew shows up even when debug
prints are disabled.
The fix: just zero out arp_target at init time.
While we're in here, init arp_all_targets_value in the right place.
Fixes: dc9c4d0fe0 ("bonding: reduce scope of some global variables")
CC: Mahesh Bandewar <maheshb@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* intel_pstate:
cpufreq: intel_pstate: Document the current behavior and user interface
* pm-cpufreq:
cpufreq: dbx500: add a Kconfig symbol
* pm-cpufreq-sched:
cpufreq: schedutil: use now as reference when aggregating shared policy requests
DM-Verity has an (undocumented) mode where no salt is used. This was
never handled directly by the DM-Verity code, instead working due to the
fact that calling crypto_shash_update() with a zero length data is an
implicit noop.
This is no longer the case now that we have switched to
crypto_ahash_update(). Fix the issue by introducing explicit handling
of the no salt use case to DM-Verity.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reported-by: Marian Csontos <mcsontos@redhat.com>
Fixes: d1ac3ff ("dm verity: switch to using asynchronous hash crypto API")
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The assignmnet:
ip_align = strict ? 2 : NET_IP_ALIGN;
in compare_pkt_ptr_alignment() trips up Coverity because we can only
get to this code when strict is true, therefore ip_align will always
be 2 regardless of NET_IP_ALIGN's value.
So just assign directly to '2' and explain the situation in the
comment above.
Reported-by: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stingray SDHCI hardware supports ACMD12 and automatically
issues after multi block transfer completed.
If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen
on multi block read command with below error message:
Got data interrupt 0x00000002 even though no data
operation was in progress.
This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable
ACM12 support in SDHCI hardware and suppress spurious interrupt.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: b580c52d58 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As of 7bb11dc9f5 and 0622cab034, bond slaves in a 3ad bond are not
removed from the aggregator when they are down, and the active slave count
is NOT equal to number of ports in the aggregator, but rather the number
of ports in the aggregator that are still enabled. The sysfs spew for
bonding_show_ad_num_ports() has a comment that says "Show number of active
802.3ad ports.", but it's currently showing total number of ports, both
active and inactive. Remedy it by using the same logic introduced in
0622cab034 in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
netlink all report the number of active ports. Note that this means that
IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
NUM_PORTS, and thus perhaps should be renamed for clarity.
Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
link set dev <slave2> down, was able to produce the state where I could
see both in the same aggregator, but a number of ports count of 1.
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2 <---
Slave Interface: ens10
MII Status: up <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1 <---
Slave Interface: ens10
MII Status: down <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If dma mask checks fail in atl2_probe(), it breaks off initialization,
deallocates all resources, but returns zero.
The patch adds proper error code return value and
make error code setup unified.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the regulator probing is not yet finished this driver
might catch a -EPROBE_DEFER. Returning after this condition
did not remove the created platform device. On a repeated
call to the probe function the of_platform_device_create
fails.
Calling of_platform_device_destroy after EPROBE_DEFER resolves
this bug.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
of_platform_device_destroy is the counterpart to
of_platform_device_create which is a non-static function.
After creating a platform device it might be neccessary
to destroy it to deal with -EPROBE_DEFER where a
repeated of_platform_device_create call would fail otherwise.
Therefore also make of_platform_device_destroy globally visible.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case the DT specifies neither a regulator nor a gpio
for the shared power the driver will crash accessing the regulator.
Prevent the crash by checking the regulator before use.
Use mmc_regulator_get_supply() instead of open coding the same
logic.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Andrey Konovalov and idaifish@gmail.com reported crashes caused by
one skb shared_info being overwritten from __ip6_append_data()
Andrey program lead to following state :
copy -4200 datalen 2000 fraglen 2040
maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200
The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
fraggap, 0); is overwriting skb->head and skb_shared_info
Since we apparently detect this rare condition too late, move the
code earlier to even avoid allocating skb and risking crashes.
Once again, many thanks to Andrey and syzkaller team.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: <idaifish@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Przywara <andre.przywara@arm.com> noticed that we can get the
following warning with -EPROBE_DEFER:
"WARNING: CPU: 1 PID: 89 at drivers/base/dd.c:349
driver_probe_device+0x2ac/0x2e8"
Let's fix the issue by removing the indices as suggested by
Tejun Heo <tj@kernel.org>. All we have to do here is kill the radix
tree.
I probably ended up with the indices after grepping for removal
of all entries using radix_tree_for_each_slot() and the first
match found was gmap_radix_tree_free(). Anyways, no need for
indices here, and we can just do remove all the entries using
radix_tree_for_each_slot() along how the item_kill_tree() test
case does.
Fixes: c7059c5ac7 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Fixes: a76edc89b1 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
I've forgotten to sync the documentation with the actually available
options for some time. Now all updated.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Recently some laptops and mobos are equipped with the dual Realtek
codecs that require special quirks. For making the debugging easier,
add the model "dual-codecs" to be passed via module option.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
MSI Z270-Gamin mobo has also two ALC1220 codecs like Gigabyte AZ370-
Gaming mobo. Apply the same quirk to this one.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The keyboard dock used with the Asus Transformer T100 series, uses
the same vendor-defined 0xff31 usage-page as some other Asus
keyboards. But with a small twist, it has a small descriptor bug which
needs to be fixed up for things to work.
This commit adds the USB-ID for this keyboard to the hid-asus driver
and makes asus_report_fixup fix the descriptor issue, fixing
various special function keys on this keyboard not working.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
for the !GPIOLIB case to prevent build errors.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This reverts commit 8c58f1a7a4.
It turns out that applying these generic properties was
premature: the properties used in the driver using this
are of unclear electrical nature and the subject need to
be discussed.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Make sure dmi_system_id tables are NULL terminated.
Fixes: 7036502783 ("pinctrl: cherryview: Add a quirk to make Acer
Chromebook keyboard work again")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We need to initializes those variables to 0 for platforms that do not
provide ACPI parameters. Otherwise, we set sda_hold_time to random
values, breaking e.g. Galileo and IOT2000 boards.
Fixes: 9d64084330 ("i2c: designware: don't infer timings described by ACPI from clock rate")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
nand_ooblayout_lp_hamming_ops can be made static as it does not need to be
in global scope.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
According to Boris, some user-space tools expect MTD drivers to
update ecc_stats.corrected, and it's better to provide a lower
bound than to provide no information at all.
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The device table is required to load modules based on
modaliases. After adding MODULE_DEVICE_TABLE, below entries
for example will be added to module.alias:
alias: of:N*T*Csigma,smp8758-nandC*
alias: of:N*T*Csigma,smp8758-nand
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Signed-off-by: Andres Galacho <andresgalacho@gmail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
We don't handle cases larger than 7. We probably shouldn't pretend we
know the ECC step size in this case, and it's probably also good to
WARN() like we do in many other similar cases.
Fixes: 8fc82d456e ("mtd: nand: samsung: Retrieve ECC requirements from extended ID")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
If we fail any time after calling nand_detect(), then we don't call the
vendor-specific ->cleanup() callback, and we'll leak any resources the
vendor-specific code might have allocated.
Mark the "fix" against the first commit that started allocating anything
in ->init().
Fixes: 626994e074 ("mtd: nand: hynix: Add read-retry support for 1x nm MLC NANDs")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
nand_ids isn't a separate module anymore and doesn't need this header.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
This bug seems to have been here forever, although we came close to
fixing all of them in [1]!
[1] 11eaf6df1c ("mtd: nand: Remove BUG() abuse in nand_scan_tail")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The add_new_disk returns with communication locked if
__sendmsg returns failure, fix it with call unlock_comm
before return.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
The code to fetch a 64-bit value from user space was entirely buggered,
and has been since the code was merged in early 2016 in commit
b2f680380d ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
kernels").
Happily the buggered routine is almost certainly entirely unused, since
the normal way to access user space memory is just with the non-inlined
"get_user()", and the inlined version didn't even historically exist.
The normal "get_user()" case is handled by external hand-written asm in
arch/x86/lib/getuser.S that doesn't have either of these issues.
There were two independent bugs in __get_user_asm_u64():
- it still did the STAC/CLAC user space access marking, even though
that is now done by the wrapper macros, see commit 11f1a4b975
("x86: reorganize SMAP handling in user space accesses").
This didn't result in a semantic error, it just means that the
inlined optimized version was hugely less efficient than the
allegedly slower standard version, since the CLAC/STAC overhead is
quite high on modern Intel CPU's.
- the double register %eax/%edx was marked as an output, but the %eax
part of it was touched early in the asm, and could thus clobber other
inputs to the asm that gcc didn't expect it to touch.
In particular, that meant that the generated code could look like
this:
mov (%eax),%eax
mov 0x4(%eax),%edx
where the load of %edx obviously was _supposed_ to be from the 32-bit
word that followed the source of %eax, but because %eax was
overwritten by the first instruction, the source of %edx was
basically random garbage.
The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
the 64-bit output as early-clobber to let gcc know that no inputs should
alias with the output register.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@kernel.org # v4.8+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al noticed that unsafe_put_user() had type problems, and fixed them in
commit a7cc722fff ("fix unsafe_put_user()"), which made me look more
at those functions.
It turns out that unsafe_get_user() had a type issue too: it limited the
largest size of the type it could handle to "unsigned long". Which is
fine with the current users, but doesn't match our existing normal
get_user() semantics, which can also handle "u64" even when that does
not fit in a long.
While at it, also clean up the type cast in unsafe_put_user(). We
actually want to just make it an assignment to the expected type of the
pointer, because we actually do want warnings from types that don't
convert silently. And it makes the code more readable by not having
that one very long and complex line.
[ This patch might become stable material if we ever end up back-porting
any new users of the unsafe uaccess code, but as things stand now this
doesn't matter for any current existing uses. ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The check for an MCE being a memory error in the NFIT mce handler was
bogus. Use the new mce_is_memory_error() helper to detect the error
properly.
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.
Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).
The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull misc uaccess fixes from Al Viro:
"Fix for unsafe_put_user() (no callers currently in mainline, but
anyone starting to use it will step into that) + alpha osf_wait4()
infoleak fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
osf_wait4(): fix infoleak
fix unsafe_put_user()
Pull scheduler fix from Thomas Gleixner:
"A single scheduler fix:
Prevent idle task from ever being preempted. That makes sure that
synchronize_rcu_tasks() which is ignoring idle task does not pretend
that no task is stuck in preempted state. If that happens and idle was
preempted on a ftrace trampoline the machine crashes due to
inconsistent state"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Call __schedule() from do_idle() without enabling preemption
Pull irq fixes from Thomas Gleixner:
"A set of small fixes for the irq subsystem:
- Cure a data ordering problem with chained interrupts
- Three small fixlets for the mbigen irq chip"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Fix chained interrupt data ordering
irqchip/mbigen: Fix the clear register offset calculation
irqchip/mbigen: Fix potential NULL dereferencing
irqchip/mbigen: Fix memory mapping code
Since commit 76b91c32dd ("bridge: stp: when using userspace stp stop
kernel hello and hold timers"), bridge would not start hello_timer if
stp_enabled is not KERNEL_STP when br_dev_open.
The problem is even if users set stp_enabled with KERNEL_STP later,
the timer will still not be started. It causes that KERNEL_STP can
not really work. Users have to re-ifup the bridge to avoid this.
This patch is to fix it by starting br->hello_timer when enabling
KERNEL_STP in br_stp_start.
As an improvement, it's also to start hello_timer again only when
br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
no reason to start the timer again when it's NO_STP.
Fixes: 76b91c32dd ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TX checksum offload is used, if the computed checksum is 0 the
LAN95xx device do not alter the checksum to 0xffff. In the case of ipv4
UDP checksum, it indicates to receiver that no checksum is calculated.
Under ipv6, UDP checksum yields a result of zero must be changed to
0xffff. Hence disabling checksum offload for ipv6 packets.
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
Reported-by: popcorn mix <popcornmix@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ihar Hrachyshka says:
====================
arp: always override existing neigh entries with gratuitous ARP
This patchset is spurred by discussion started at
https://patchwork.ozlabs.org/patch/760372/ where we figured that there is no
real reason for enforcing override by gratuitous ARP packets only when
arp_accept is 1. Same should happen when it's 0 (the default value).
changelog v2: handled review comments by Julian Anastasov
- fixed a mistake in a comment;
- postponed addr_type calculation to as late as possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when arp_accept is 1, we always override existing neigh
entries with incoming gratuitous ARP replies. Otherwise, we override
them only if new replies satisfy _locktime_ conditional (packets arrive
not earlier than _locktime_ seconds since the last update to the neigh
entry).
The idea behind locktime is to pick the very first (=> close) reply
received in a unicast burst when ARP proxies are used. This helps to
avoid ARP thrashing where Linux would switch back and forth from one
proxy to another.
This logic has nothing to do with gratuitous ARP replies that are
generally not aligned in time when multiple IP address carriers send
them into network.
This patch enforces overriding of existing neigh entries by all incoming
gratuitous ARP packets, irrespective of their time of arrival. This will
make the kernel honour all incoming gratuitous ARP packets.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The addr_type retrieval can be costly, so it's worth trying to avoid its
calculation as much as possible. This patch makes it calculated only
for gratuitous ARP packets. This is especially important since later we
may want to move is_garp calculation outside of arp_accept block, at
which point the costly operation will be executed for all setups.
The patch is the result of a discussion in net-dev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code is quite involving already to earn a separate function for
itself. If anything, it helps arp_process readability.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the is_garp code deals just with gratuitous ARP packets, not every
unsolicited packet.
This patch is a result of a discussion in netdev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcp_disconnect() is called, inet_csk_delack_init() sets
icsk->icsk_ack.rcv_mss to 0.
This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() =>
__tcp_select_window() call path to have division by 0 issue.
So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>