If a user did:
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
we would (this a build with DEBUG enabled) get to:
smpboot: ++++++++++++++++++++=_---CPU UP 1
.. snip..
smpboot: Stack at about ffff880074c0ff44
smpboot: CPU1: has booted.
and hang. The RCU mechanism would kick in an try to IPI the CPU1
but the IPIs (and all other interrupts) would never arrive at the
CPU1. At first glance at least. A bit digging in the hypervisor
trace shows that (using xenanalyze):
[vla] d4v1 vec 243 injecting
0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
] 0.043163639 --|x d4v1 vmentry cycles 1468
] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
0.043164913 --|x d4v1 inj_virq vec 243 real
[vla] d4v1 vec 243 injecting
0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
] 0.043165526 --|x d4v1 vmentry cycles 1472
] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
0.043166800 --|x d4v1 inj_virq vec 243 real
[vla] d4v1 vec 243 injecting
there is a pending event (subsequent debugging shows it is the IPI
from the VCPU0 when smpboot.c on VCPU1 has done
"set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
interrupted with the callback IPI (0xf3 aka 243) which ends up calling
__xen_evtchn_do_upcall.
The __xen_evtchn_do_upcall seems to do *something* but not acknowledge
the pending events. And the moment the guest does a 'cli' (that is the
ffffffff81673254 in the log above) the hypervisor is invoked again to
inject the IPI (0xf3) to tell the guest it has pending interrupts.
This repeats itself forever.
The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup
we set each per_cpu(xen_vcpu, cpu) to point to the
shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info
to register per-CPU structures (xen_vcpu_setup).
This is used to allow events for more than 32 VCPUs and for performance
optimizations reasons.
When the user performs the VCPU hotplug we end up calling the
the xen_vcpu_setup once more. We make the hypercall which returns
-EINVAL as it does not allow multiple registration calls (and
already has re-assigned where the events are being set). We pick
the fallback case and set per_cpu(xen_vcpu, cpu) to point to the
shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
However the hypervisor is still setting events in the register
per-cpu structure (per_cpu(xen_vcpu_info, cpu)).
As such when the events are set by the hypervisor (such as timer one),
and when we iterate in __xen_evtchn_do_upcall we end up reading stale
events from the shared_info->vcpu_info[vcpu] instead of the
per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the
events that the hypervisor has set and the hypervisor keeps on reminding
us to ack the events which we never do.
The fix is simple. Don't on the second time when xen_vcpu_setup is
called over-write the per_cpu(xen_vcpu, cpu) if it points to
per_cpu(xen_vcpu_info).
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The following resolves a section mismatch warning below in xen-acpi-processor introduced by
3fac10145b [13/13] xen: Re-upload processor PM data to hypervisor after S3 resume (v2)
Warning:
WARNING: drivers/xen/built-in.o(.text+0x2056a): Section mismatch in reference from the function xen_upload_processor_pm_data() to the function .init.text:read_acpi_id()
The function xen_upload_processor_pm_data() references
the function __init read_acpi_id().
This is often because xen_upload_processor_pm_data lacks a __init
annotation or the annotation of read_acpi_id is wrong.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Upon resume, it was found that ACPI C-states were missing from non-boot CPUs.
This change registers a syscore_ops handler for this case, and re-uploads the
PM information to the hypervisor to properly reset the C-state on these
processors.
v2:
v1 did not go through the check_acpi_ids() code-path, and missed some cases when
xen was running with the dom0_max_vcpus= command line parameter.
Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
[v3: Ate some tabs, s/printk/pr_info/]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
There is no need to use the PV version of the IRQ_WORKER mechanism
as under PVHVM we are using the native version. The native
version is using the SMP API.
They just sit around unused:
69: 0 0 xen-percpu-ipi irqwork0
83: 0 0 xen-percpu-ipi irqwork1
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
See git commit f10cd522c5
(xen: disable PV spinlocks on HVM) for details.
But we did not disable it everywhere - which means that when
we boot as PVHVM we end up allocating per-CPU irq line for
spinlock. This fixes that.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The default (uninitialized) value of the IRQ line is -1.
Check if we already have allocated an spinlock interrupt line
and if somebody is trying to do it again. Also set it to -1
when we offline the CPU.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the timer interrupt has been de-init or is just now being
initialized, the default value of -1 should be preset as
interrupt line. Check for that and if something is odd
WARN us.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We naively assume that the IRQ value passed in is correct.
If it is not, then any dereference operation for the 'info'
structure will result in crash - so might as well guard ourselves
and sprinkle copious amounts of WARN_ON.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When we online the CPU, we get this splat:
smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
[<ffffffff810c1fea>] __might_sleep+0xda/0x100
[<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff813036eb>] kvasprintf+0x5b/0x90
[<ffffffff81303758>] kasprintf+0x38/0x40
[<ffffffff81044510>] xen_setup_timer+0x30/0xb0
[<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
[<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
The solution to that is use kasprintf in the CPU hotplug path
that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
and remove the call to in xen_hvm_setup_cpu_clockevents.
Unfortunatly the later is not a good idea as the bootup path
does not use xen_hvm_cpu_notify so we would end up never allocating
timer%d interrupt lines when booting. As such add the check for
atomic() to continue.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
While we don't use the spinlock interrupt line (see for details
commit f10cd522c5 -
xen: disable PV spinlocks on HVM) - we should still do the proper
init / deinit sequence. We did not do that correctly and for the
CPU init for PVHVM guest we would allocate an interrupt line - but
failed to deallocate the old interrupt line.
This resulted in leakage of an irq_desc but more importantly this splat
as we online an offlined CPU:
genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
[<ffffffff811156de>] __setup_irq+0x23e/0x4a0
[<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
[<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff810cad35>] ? update_max_interval+0x15/0x40
[<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
[<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
[<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
[<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff8109402b>] __cpu_notify+0x1b/0x30
[<ffffffff8166834a>] _cpu_up+0xa0/0x14b
[<ffffffff816684ce>] cpu_up+0xd9/0xec
[<ffffffff8165f754>] store_online+0x94/0xd0
[<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
[<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
[<ffffffff811a2864>] vfs_write+0xb4/0x130
[<ffffffff811a302a>] sys_write+0x5a/0xa0
[<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2
And if one looks at the /proc/interrupts right after
offlining (CPU1):
70: 0 0 xen-percpu-ipi spinlock0
71: 0 0 xen-percpu-ipi spinlock1
77: 0 0 xen-percpu-ipi spinlock2
There is the oddity of the 'spinlock1' still being present.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In the PVHVM path when we do CPU online/offline path we would
leak the timer%d IRQ line everytime we do a offline event. The
online path (xen_hvm_setup_cpu_clockevents via
x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
line for the timer%d.
But we would still use the old interrupt line leading to:
kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
.. snip..
<IRQ>
[<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
[<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
[<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
[<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
[<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
[<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
[<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
<EOI>
[<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
[<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
There is also the oddity (timer1) in the /proc/interrupts after
offlining CPU1:
64: 1121 0 xen-percpu-virq timer0
78: 0 0 xen-percpu-virq timer1
84: 0 2483 xen-percpu-virq timer2
This patch fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
A randconfig compile test discovered that we can select
INPUT_XEN_KBDDEV_FRONTEND without all of its dependencies being met. Fix
this by adding the dependency to the select line.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
For quite a few Xen versions, this wasn't the IRQ vector anymore
anyway, and it is not being used by the kernel for anything. Hence
drop the field from struct irq_info, and respective function
parameters.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
During early setup of a dom0 kernel, populate boot_params with the
Enhanced Disk Drive (EDD) and MBR signature data. This makes
information on the BIOS boot device available in /sys/firmware/edd/.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull KVM fix from Gleb Natapov:
"Bugfix for the regression introduced by commit c300aa64ddf5"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Allow cross page reads and writes from cached translations.
Pull x86 fixes from Peter Anvin:
"Two quite small fixes: one a build problem, and the other fixes
seccomp filters on x32."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Fix rebuild with EFI_STUB enabled
x86: remove the x32 syscall bitmask from syscall_get_nr()
Interrupt handlers are always invoked with interrupts disabled, so
remove all uses of the deprecated IRQF_DISABLED flag.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linux has expected that interrupt handlers are executed with local
interrupts disabled for a while now, so ensure that this is the case on
Alpha even for non-device interrupts such as IPIs.
Without this patch, secondary boot results in the following backtrace:
warning: at kernel/softirq.c:139 __local_bh_enable+0xb8/0xd0()
trace:
__local_bh_enable+0xb8/0xd0
irq_enter+0x74/0xa0
scheduler_ipi+0x50/0x100
handle_ipi+0x84/0x260
do_entint+0x1ac/0x2e0
irq_exit+0x60/0xa0
handle_irq+0x98/0x100
do_entint+0x2c8/0x2e0
ret_from_sys_call+0x0/0x10
load_balance+0x3e4/0x870
cpu_idle+0x24/0x80
rcu_eqs_enter_common.isra.38+0x0/0x120
cpu_idle+0x40/0x80
rest_init+0xc0/0xe0
_stext+0x1c/0x20
A similar dump occurs if you try to reboot using magic-sysrq.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Due to all of the goodness being packed into today's kernels, the
resulting image isn't as slim as it once was.
In light of this, don't pass -msmall-data to gcc, which otherwise results
in link failures due to impossible relocations when compiling anything but
the most trivial configurations.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes a NULL pointer dereference at boot on UP1500.
Cc: stable@vger.kernel.org
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jay Estabrook <jay.estabrook@gmail.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for kvm_gfn_to_hva_cache_init functions for
reads and writes that will cross a page. If the range falls within
the same memslot, then this will be a fast operation. If the range
is split between two memslots, then the slower kvm_read_guest and
kvm_write_guest are used.
Tested: Test against kvm_clock unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
cache target when the device being cached is not itself wrapped with
device-mapper.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRXuMlAAoJEK2W1qbAHj1n44cQAJYD4CFKQn3Mv1YkpkTeuuy6
N6kLmKl5leFNk8y6dwGtcRMtn7ZkVJLrDFXoi+pa46geG1WfCuAVhqm1y9t+Fa+m
mWJgSutkvEdI5OiU7SfoK0m0zGsB8B5k5G2rp0vT7sRmIWmPKPH6GUP7Fw0QRDYl
kXkxO9P6GM8kU5H/7SqVqtbvD1tNgEeME8vqAFJU3xC7PrfcknpnFD5kJeBMcddD
NZUXPDmUv20G2WySxLWbX4eBNNsqpI7QwX0Mthab8OusCOaLLID3P388ZLRyAqLx
/ja8rIKPdjRBOgyJ/kkemybffle69Bg5M+mfPygGATyUsPeVXmi/6u2AHLTcVj9e
tsF2MjHNYfEZE7Bq4rmEKrRk/pec+2UbgYj41mhQ06ICK+FlSDxIJWHN8fyrVpvu
0bT2f9Ddqwd6fZX8Wwd6xIaH+Nv7EZRNChdF0uUsHkjZbBWkojdFsdUSuDJWN87p
TjYCZJuRmStyDPTvnJeikuaXJ0DAs+OZ0w9I4HCNWNNF1rUbUEH5ixOcNwGA54IN
EGvbnGNfZlwYNCVpBRB+srlGPTTP8y63m3IZsCYc6VL2G9VBBZiIUBSNQMqUD3d4
Cnfr1crEC2EqPPd6XjCeRPv4dHuDMVwH4MA3SSvjL+j1jkxnz6prZPsBHu70aRWh
OFnV+n4YX8VJnAT5Rn0N
=LUeu
-----END PGP SIGNATURE-----
Merge tag 'dm-3.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Pull device-mapper fixes from Alasdair Kergon:
"A pair of patches to fix the writethrough mode of the device-mapper
cache target when the device being cached is not itself wrapped with
device-mapper."
* tag 'dm-3.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm cache: reduce bio front_pad size in writeback mode
dm cache: fix writes to cache device in writethrough mode
ASPM
Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"
kexec
PCI: Don't try to disable Bus Master on disconnected PCI devices
Platform ROM images
PCI: Add PCI ROM helper for platform-provided ROM images
nouveau: Attempt to use platform-provided ROM image
radeon: Attempt to use platform-provided ROM image
Hotplug
PCI/ACPI: Always resume devices on ACPI wakeup notifications
PCI/PM: Disable runtime PM of PCIe ports
EISA
EISA/PCI: Fix bus res reference
EISA/PCI: Init EISA early, before PNP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRXzmPAAoJEFmIoMA60/r8svoQAIb1Z11DAdPfTaXThwEIURZb
KC4wzfLMAZCRvfY6j6qZw6Qi3NT6ZqgrZxSXgV0myB6Luwcv80+D2BHmDnRoxRYD
ywAm2m8vwoUq5s7MsaGHRLViBafL5TDs1KQLVEPkJ+B1gCzo61UWXPVDu+z9u+Fx
t9fNV5Xc/uaxciXZYVnPGTCFjCVTqa95x16bF6NgmG1j1DHYuX7fBVSsJc2CS77m
P2oyrfLmsKUA7jKHD4ntbIslYcqUC5NF8wcRad324Pzi93IwdktUSn8tqqNFCX9T
EE8hxCp24CPm6ngyCFIIkqYM/MMDM8YVqBwuZjngm3+tjaS3hBIrr1jBDenUgncY
2WRomgRzBwaa8nTr5xhin3kYGRkQAEf3M8RWG70YBppTidvvoUt0bfxSfSlN9R/t
g/7Eu4u35+lJ9F+Gg4unc0JrydkSJG1tfE+0LaEOhqXgCwloPR3kk08qcHPA2NHQ
dlEMJ3FSUHvINoRE7pm6+E2lcbGvZjCa9Cvd9IbdJ6r4wVYI7pHl8Zlp/sf3HZdV
sKRAw7fSazoFCX73lHR0f1zx7l1OdTmDSxEHv/pQA+xVt7K0DcOq+Mj0B84pVYcl
WSRUcr6jAi91syuKX+8TRpbfACb25PkPrEFEsjsGcBq+I/a15e8a+sIW31Z66PNu
ICejTqh/2/aOlgFg9OBs
=gIGr
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"PCI updates for v3.9:
ASPM
Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"
kexec
PCI: Don't try to disable Bus Master on disconnected PCI devices
Platform ROM images
PCI: Add PCI ROM helper for platform-provided ROM images
nouveau: Attempt to use platform-provided ROM image
radeon: Attempt to use platform-provided ROM image
Hotplug
PCI/ACPI: Always resume devices on ACPI wakeup notifications
PCI/PM: Disable runtime PM of PCIe ports
EISA
EISA/PCI: Fix bus res reference
EISA/PCI: Init EISA early, before PNP"
* tag 'pci-v3.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Disable runtime PM of PCIe ports
PCI/ACPI: Always resume devices on ACPI wakeup notifications
PCI: Don't try to disable Bus Master on disconnected PCI devices
Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"
radeon: Attempt to use platform-provided ROM image
nouveau: Attempt to use platform-provided ROM image
EISA/PCI: Init EISA early, before PNP
EISA/PCI: Fix bus res reference
PCI: Add PCI ROM helper for platform-provided ROM images
Pull networking fixes from David Miller:
1) Fix erroneous sock_orphan() leading to crashes and double
kfree_skb() in NFC protocol. From Thierry Escande and Samuel Ortiz.
2) Fix use after free in remain-on-channel mac80211 code, from Johannes
Berg.
3) nf_reset() needs to reset the NF tracing cookie, otherwise we can
leak it from one namespace into another. Fix from Gao Feng and
Patrick McHardy.
4) Fix overflow in channel scanning array of mwifiex driver, from Stone
Piao.
5) Fix loss of link after suspend/shutdown in r8169, from Hayes Wang.
6) Synchronization of unicast address lists to the undelying device
doesn't work because whether to sync is maintained as a boolean
rather than a true count. Fix from Vlad Yasevich.
7) Fix corruption of TSO packets in atl1e by limiting the segmented
packet length. From Hannes Frederic Sowa.
8) Revert bogus AF_UNIX credential passing change and fix the
coalescing issue properly, from Eric W Biederman.
9) Changes of ipv4 address lifetime settings needs to generate a
notification, from Jiri Pirko.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
netfilter: don't reset nf_trace in nf_reset()
net: ipv4: notify when address lifetime changes
ixgbe: fix registration order of driver and DCA nofitication
af_unix: If we don't care about credentials coallesce all messages
Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL"
bonding: remove sysfs before removing devices
atl1e: limit gso segment size to prevent generation of wrong ip length fields
net: count hw_addr syncs so that unsync works properly.
r8169: fix auto speed down issue
netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths
mwifiex: limit channel number not to overflow memory
NFC: microread: Fix build failure due to a new MEI bus API
iwlwifi: dvm: fix the passive-no-RX workaround
netfilter: nf_conntrack: fix error return code
NFC: llcp: Keep the connected socket parent pointer alive
mac80211: fix idle handling sequence
netfilter: nfnetlink_acct: return -EINVAL if object name is empty
netfilter: nfnetlink_queue: fix error return code in nfnetlink_queue_init()
netfilter: reset nf_trace in nf_reset
mac80211: fix remain-on-channel cancel crash
...
eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence
their .cmd files don't get included by the build machinery, leading to
the files always getting rebuilt.
Rather than adding the two files individually, take the opportunity and
add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment
at the top of the file to be shrunk quite a bit.
At the same time, remove a pointless flags override line - the variable
assigned to was misspelled anyway, and the options added are
meaningless for assembly sources.
[ hpa: the patch is not minimal, but I am taking it for -urgent anyway
since the excess impact of the patch seems to be small enough. ]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/515C5D2502000078000CA6AD@nat28.tlf.novell.com
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
to reset nf_trace in nf_reset(). This is wrong and unnecessary.
nf_reset() is used in the following cases:
- when passing packets up the the socket layer, at which point we want to
release all netfilter references that might keep modules pinned while
the packet is queued. nf_trace doesn't matter anymore at this point.
- when encapsulating or decapsulating IPsec packets. We want to continue
tracing these packets after IPsec processing.
- when passing packets through virtual network devices. Only devices on
that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
used anymore. Its not entirely clear whether those packets should
be traced after that, however we've always done that.
- when passing packets through virtual network devices that make the
packet cross network namespace boundaries. This is the only cases
where we clearly want to reset nf_trace and is also what the
original patch intended to fix.
Add a new function nf_reset_trace() and use it in dev_forward_skb() to
fix this properly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS fixes from Ralf Baechle:
"Fixes for a number of small glitches in various corners of the MIPS
tree. No particular areas is standing out.
With this applied all MIPS defconfigs are building fine. No merge
conflicts are expected."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Delete definition of SA_RESTORER.
MIPS: Fix ISA level which causes secondary cache init bypassing and more
MIPS: Fix build error cavium-octeon without CONFIG_SMP
MIPS: Kconfig: Rename SNIPROM too
MIPS: Alchemy: Fix typo "CONFIG_DEBUG_PCI"
MIPS: Unbreak function tracer for 64-bit kernel.
Pull GFS2 fixes from Steven Whitehouse:
"There are two patches which fix up a couple of minor issues in the DLM
interface code, a missing error path in gfs2_rs_alloc(), one patch
which fixes a problem during "withdraw" and a fix for discards/FITRIM
when using 4k sector sized devices."
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Issue discards in 512b sectors
GFS2: Fix unlock of fcntl locks during withdrawn state
GFS2: return error if malloc failed in gfs2_rs_alloc()
GFS2: use memchr_inv
GFS2: use kmalloc for lvb bitmap
Commit e2eed58b4f ("IB/qib: change QLogic to Intel") moved a firmware
file potentially breaking the ABI.
This patch reverts that aspect of the fix as well as reverting the
firmware name as used in qib.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A bunch of small driver fixes plus a fix for error handling in the core
- nothing too exciting overall.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRXt12AAoJELSic+t+oim9PNIP/RPojoOyrzSr2knNwgFHtxo7
rcQqxq/Z0Ah6B/BKZ5qomzIaUPzLoKGM2ND4KhCJjBrXY3PR6hJQK6b4M2ocXbpK
whC5lt36M9/3b/eLGH9lIfKGXPKT+n2qfvbALdlxWYTcLbeoiMfakwcGhZngihEq
H96x7+xOuBheF8dx46+zG10uUm0lP3hpELzXKqHi/K173FEfgIeq9iELzfTiQbIw
tdXUl7jrUwifvBlUx3/tRM334K4pAGeq+hLtftxvX/qFBqNIZ6bO3ZguZLaPNgXd
NvQ1AtsArf2KmL+/093Zesz+b7fqfLEjsBqecWjhT7UWEs1y8qmmTBNM6cgAOzyg
OZnFzZ8fl3G9YuLgktAFr1pWT9T43B/ZFbQ4NQzkxVOh1QvOz1Xkx9JS0hCigVBu
pen6n5L2OWzxcGsDiB38b4HSCzLgNt4QtsXCtQZaDKUJ6Q8pmsCWClyPfqJwbEWR
5QNW/8x2BCuQMSa4aKvPhjw9bIfOhOYX/+JQezHdS/pu6nwAGHHM5Yohtjn9/hm/
6hXniAzctT6rbk4ZDr5ux1CBxyBzjcTAL0tdiqFglAYcXwRNpzqm714jKN57yH6p
YXD3V2aRF7h7honDJqav18M9opuIo4rNg0lBCreA+5FJ+MmQHLIU/DvfO6qUV2/0
XO540x4NdS1PMAX02P1h
=eqgv
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc
Pull spi fixes from Mark Brown:
"A bunch of small driver fixes plus a fix for error handling in the
core - nothing too exciting overall."
* tag 'spi-fix-v3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc:
spi/mpc512x-psc: optionally keep PSC SS asserted across xfer segmensts
spi: Unlock a spinlock before calling into the controller driver.
spi/s3c64xx: modified error interrupt handling and init
spi/bcm63xx: don't disable non enabled clocks in probe error path
spi/bcm63xx: Remove unused variable
spi: slink-tegra20: move runtime pm calls to transfer_one_message
This patch changes GFS2's discard issuing code so that it calls
function sb_issue_discard rather than blkdev_issue_discard. The
code was calling blkdev_issue_discard and specifying the correct
sector offset and sector size, but blkdev_issue_discard expects
these values to be in terms of 512 byte sectors, even if the native
sector size for the device is different. Calling sb_issue_discard
with the BLOCK size instead ensures the correct block-to-512b-sector
translation. I verified that "minlen" is specified in blocks, so
comparing it to a number of blocks is correct.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This reverts commit 0ef1594c01.
This patch introduced a few races which cannot be easily fixed with a
small follow-up patch. Furthermore, the SoC with the broken hardware
register, which this patch intended to add support for, can only be used
with device trees, which this driver currently does not support.
[ Here is the discussion that led to this "revert" patch:
https://lkml.org/lkml/2013/4/3/176 ]
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRXrj5AAoJEPo9qoy8lh71MVEQAIpbZnMGuzhPbVdDfVZCkNt+
GNGhRaYNODoVhAaiIJ4kcCtCpM36EHXZozw/Jlc0hjVuWBptZI2W8vnoH/utrdCD
Dh6SxKVc8QwGz2Ybhjqy8U1Bpw3GVGHbPzLz/6gmZQy34ownOeM5qqSLXXuxF8NW
LiALmD1nUZwg+Mn9aF8Hvu8BMM4lxK90oM3XTGfru3LmgN2ING5sgb6eLaTPAxPd
+vDmeynekPWd+yDjmHC0Es5Xx1J0tuOi/qepWYH1pok9R+dtsyfAOmeXoUjcbKct
5Vpx/3Y/exzGQeiZuFYlpKYCtDqYh26CwJjPKpbMPztTJMgdmCRFJxMiISmNdu9k
cVQ0V3oJr92uAtX6gl5/5YM1nmXeH0Y9LwHVdgN6Tjiwe54FhGq8url2HqZINjQI
B19Jwx6XHsZ2P7Yr9LDqmrGFEMjv/ufdV/qhrLt+bxjxu/8DIxrPG37N7beFPZbE
KbPnxy6zDD2ROBtheTVFNdL5FkZbplJKaqhUMpfuBsDo/w0mpQADOJOPEQCgOC8I
CaaGfuBqbRBlO6R02bCmnLaeFwOSUIKjqLmug3IOcpZ+uZym3RPMZuMsRnHTIoYh
tD8+zfQ6z9iEYaoQO6CjzHLzvqCKdG1t0ksDp+FmhCfFlityuzXhvHvP9rI72ys2
/6ECxV0W4m3zLLKofxOM
=lKy5
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-3.9-rc6' of git://gitorious.org/linux-omap-dss2/linux
Pull fbdev fixes from Tomi Valkeinen:
"Fix uvesafb crash bug and typoed flag name in fbmon's new videomode
code"
* tag 'fbdev-fixes-3.9-rc6' of git://gitorious.org/linux-omap-dss2/linux:
video:uvesafb: Fix dereference NULL pointer code path
fbmon: use VESA_DMT_VSYNC_HIGH to fix typo
This contains slightly more volumes than usual at this stage, mostly
because of my vacation in the last week.
Nothing to scare, all small and/or trivial fixes:
- Fix loop path handling in ASoC DAPM
- Some memory handling fixes in ASoC core
- Fix spear_pcm to adapt to the updated API
- HD-audio HDMI ELD handling fixes
- Fix for CM6331 USB-audio SRC change bugs
- Revert power_save_controller option change due to user-space usage
- A few other small ASoC and HD-audio fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJRXl85AAoJEGwxgFQ9KSmkDzAQALJU/3Bn/tRJMFDwmVPYbuoi
/mJ8iTBLz7uvKTnGaOIPg+vrt2SOkpcfE/bkv2d8GL/YMbXC7RY2j+37D8LYlnZQ
rhUlM7VbkFsmoXR9GtgeWdZyrQSVNi7KFCfNLSNZmZAkDkopq0VzW4Ll+q4wyPWQ
taZDQ0g3fnbAegiri246cb8nKH8gTzKRG16I/5N1S7fxLinQ6u5W4IPu9jHxNhiW
sOyQOXwYp7CpgmqKBbjES8oncbgwSnk55YPY3Si+vfYIHy18yR9yCNyGsrGN8M0W
c4oTU8EoNzGrwE4/+MSvkWwA4qYBEqRkEwS8eL2QlcyNpeZq1mROWnEa1BZz3PJ/
uw4M5GrVYFD/w95XRX/hJmenQelSV5S2EfCmxAbMQwXX9je2q9RzM0AeaORNUPYT
Iy8S4+d9KBWBcLxmAASLUPTk5nl4bTPW/yy/Cb5ICQdnhHGTmKnJjQdIbP9tiiDD
jfnftPUDBONRnCe24AtxQzDmj2wfikkMkQoR/nENKKR+u56zy6Le+m4zmv7OlfLX
lqcWHh/Sgsg+B960GUPZSEqpDaXA5hkpXTz/foWbeu8Go0JVQTV9/+HDomrfP0VE
EoRrApZf35PBSo2rqC6lyKhAp+OgWKmXoWelzCY3//4kYXl2cEK1voa5zFhgVRGD
tccXSAxPyW4HsIg+4hqF
=L1Ya
-----END PGP SIGNATURE-----
Merge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This contains slightly more volumes than usual at this stage, mostly
because of my vacation in the last week. Nothing to scare, all small
and/or trivial fixes:
- Fix loop path handling in ASoC DAPM
- Some memory handling fixes in ASoC core
- Fix spear_pcm to adapt to the updated API
- HD-audio HDMI ELD handling fixes
- Fix for CM6331 USB-audio SRC change bugs
- Revert power_save_controller option change due to user-space usage
- A few other small ASoC and HD-audio fixes"
* tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/generic - fix uninitialized variable
Revert "ALSA: hda - Allow power_save_controller option override DCAPS"
ALSA: hda - fix typo in proc output
ALSA: hda - Enabling Realtek ALC 671 codec
ALSA: usb: Work around CM6631 sample rate change bug
ALSA: hda - bug fix on HDMI ELD debug message
ALSA: hda - bug fix on return value when getting HDMI ELD info
ASoC: dma-sh7760: Fix compile error
ASoC: core: fix invalid free of devm_ allocated data
ASoC: spear_pcm: Update to new pcm_new() API
ASoC:: max98090: Remove executable bit
ASoC: dapm: Fix pointer dereference in is_connected_output_ep()
ASoC: pcm030 audio fabric: remove __init from probe
ASoC: imx-ssi: Fix occasional AC97 reset failure
ASoC: core: fix possible memory leak in snd_soc_bytes_put()
ASoC: wm_adsp: fix possible memory leak in wm_adsp_load_coeff()
ASoC: dapm: Fix handling of loops
ASoC: si476x: Add missing break for SNDRV_PCM_FORMAT_S8 switch case
A recent patch to fix the dm cache target's writethrough mode extended
the bio's front_pad to include a 1056-byte struct dm_bio_details.
Writeback mode doesn't need this, so this patch reduces the
per_bio_data_size to 16 bytes in this case instead of 1096.
The dm_bio_details structure was added in "dm cache: fix writes to
cache device in writethrough mode" which fixed commit e2e74d617e ("dm
cache: fix race in writethrough implementation"). In writeback mode
we avoid allocating the writethrough-specific members of the
per_bio_data structure (the dm_bio_details structure included).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The dm-cache writethrough strategy introduced by commit e2e74d617e
("dm cache: fix race in writethrough implementation") issues a bio to
the origin device, remaps and then issues the bio to the cache device.
This more conservative in-series approach was selected to favor
correctness over performance (of the previous parallel writethrough).
However, this in-series implementation that reuses the same bio to write
both the origin and cache device didn't take into account that the block
layer's req_bio_endio() modifies a completing bio's bi_sector and
bi_size. So the new writethrough strategy needs to preserve these bio
fields, and restore them before submission to the cache device,
otherwise nothing gets written to the cache (because bi_size is 0).
This patch adds a struct dm_bio_details field to struct per_bio_data,
and uses dm_bio_record() and dm_bio_restore() to ensure the bio is
restored before reissuing to the cache device. Adding such a large
structure to the per_bio_data is not ideal but we can improve this
later, for now correctness is the important thing.
This problem initially went unnoticed because the dm-cache test-suite
uses a linear DM device for the dm-cache device's origin device.
Writethrough worked as expected because DM submits a *clone* of the
original bio, so the original bio which was reused for the cache was
never touched.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
SA_RESTORER used to be defined as 0x04000000 but only the O32 ABI ever
supported its use and no libc was using it, so the entire sa-restorer
functionality was removed with lmo commit 39bffc12c3580ab [Zap sa_restorer.]
for 2.5.48 retaining only the SA_RESTORER definition as a reminder to avoid
accidental reuse of the mask bit.
Upstream cdef9602fb [signal: always clear
sa_restorer on execve] adds code that assumes sa_sigaction has an
sa_restorer field, if SA_RESTORER is defined which would break MIPS.
So remove the SA_RESTORER definition before the v3.8.4 merge.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit 17da8d63add23830892ac4dc2cbb3b5d4ffb79a8)
The commit a96102be70 introduced set_isa() where compatible ISA info is
also set aside from the one gets passed in. It means, for example, 1004K
will have MIPS_CPU_ISA_M32R2/M32R1/II/I flags. This leads to things like
the following inappropriate:
if (c->isa_level == MIPS_CPU_ISA_M32R1 ||
c->isa_level == MIPS_CPU_ISA_M32R2 ||
c->isa_level == MIPS_CPU_ISA_M64R1 ||
c->isa_level == MIPS_CPU_ISA_M64R2)
This patch fixes it.
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 7517de3486 ("MIPS: Alchemy: Redo
PCI as platform driver") added a reference to CONFIG_DEBUG_PCI. Change
it to CONFIG_PCI_DEBUG, as that is a valid Kconfig macro.
Also add a newline to a debugging printk that this fix enables.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 58b69401c7 [MIPS: Function tracer: Fix broken function tracing]
completely broke the function tracer for 64-bit kernels. The symptom is
a system hang very early in the boot process.
The fix: Remove/fix $sp adjustments for 64-bit case.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Al Cooper <alcooperx@gmail.com>
Cc: viric@viric.name
Cc: stable@vger.kernel.org # 3.8.x
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
changed is not initialized in path_power_down_sync, but it is expected
to be false in case no change happened in the loop. So set it to
false.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
if userspace changes lifetime of address, send netlink notification and
call notifier.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe_notify_dca cannot be called before driver registration
because it expects driver's klist_devices to be allocated and
initialized. While on it make sure debugfs files are removed
when registration fails.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was reported that the following LSB test case failed
https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we
were not coallescing unix stream messages when the application was
expecting us to.
The problem was that the first send was before the socket was accepted
and thus sock->sk_socket was NULL in maybe_add_creds, and the second
send after the socket was accepted had a non-NULL value for sk->socket
and thus we could tell the credentials were not needed so we did not
bother.
The unnecessary credentials on the first message cause
unix_stream_recvmsg to start verifying that all messages had the same
credentials before coallescing and then the coallescing failed because
the second message had no credentials.
Ignoring credentials when we don't care in unix_stream_recvmsg fixes a
long standing pessimization which would fail to coallesce messages when
reading from a unix stream socket if the senders were different even if
we did not care about their credentials.
I have tested this and verified that the in the LSB test case mentioned
above that the messages do coallesce now, while the were failing to
coallesce without this change.
Reported-by: Karel Srot <ksrot@redhat.com>
Reported-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 14134f6584.
The problem that the above patch was meant to address is that af_unix
messages are not being coallesced because we are sending unnecesarry
credentials. Not sending credentials in maybe_add_creds totally
breaks unconnected unix domain sockets that wish to send credentails
to other sockets.
In practice this break some versions of udev because they receive a
message and the sending uid is bogus so they drop the message.
Reported-by: Sven Joachim <svenjoac@gmx.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a race condition if we try to rmmod bonding and simultaneously add
a bond master through sysfs. In bonding_exit() we first remove the devices
(through rtnl_link_unregister() ) and only after that we remove the sysfs.
If we manage to add a device through sysfs after that the devices were
removed - we'll end up with that device/sysfs structure and with the module
unloaded.
Fix this by first removing the sysfs and only after that calling
rtnl_link_unregister().
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>