Commit Graph

483 Commits

Author SHA1 Message Date
Vitaly Kuznetsov 940b68e2c3 Drivers: hv: ring_buffer: eliminate hv_ringbuffer_peek()
Currently, there is only one user for hv_ringbuffer_read()/
hv_ringbuffer_peak() functions and the usage of these functions is:
- insecure as we drop ring_lock between them, someone else (in theory
  only) can acquire it in between;
- non-optimal as we do a number of things (acquire/release the above
  mentioned lock, calculate available space on the ring, ...) twice and
  this path is performance-critical.

Remove hv_ringbuffer_peek() moving the logic from __vmbus_recvpacket() to
hv_ringbuffer_read().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov 667d374064 Drivers: hv: remove code duplication between vmbus_recvpacket()/vmbus_recvpacket_raw()
vmbus_recvpacket() and vmbus_recvpacket_raw() are almost identical but
there are two discrepancies:
1) vmbus_recvpacket() doesn't propagate errors from hv_ringbuffer_read()
   which looks like it is not desired.
2) There is an error message printed in packetlen > bufferlen case in
   vmbus_recvpacket(). I'm removing it as it is usless for users to see
   such messages and /vmbus_recvpacket_raw() doesn't have it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov b5f53dde8d Drivers: hv: ring_buffer: remove code duplication from hv_ringbuffer_peek/read()
hv_ringbuffer_peek() does the same as hv_ringbuffer_read() without
advancing the read index. The only functional change this patch brings
is moving hv_need_to_signal_on_read() call under the ring_lock but this
function is just a couple of comparisons.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov 822f18d4d3 Drivers: hv: ring_buffer.c: fix comment style
Convert 6+-string comments repeating function names to normal kernel-style
comments and fix a couple of other comment style issues. No textual or
functional changes intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov 9420098adc Drivers: hv: utils: fix crash when device is removed from host side
The crash is observed when a service is being disabled host side while
userspace daemon is connected to the device:

[   90.244859] general protection fault: 0000 [#1] SMP
...
[   90.800082] Call Trace:
[   90.800082]  [<ffffffff81187008>] __fput+0xc8/0x1f0
[   90.800082]  [<ffffffff8118716e>] ____fput+0xe/0x10
...
[   90.800082]  [<ffffffff81015278>] do_signal+0x28/0x580
[   90.800082]  [<ffffffff81086656>] ? finish_task_switch+0xa6/0x180
[   90.800082]  [<ffffffff81443ebf>] ? __schedule+0x28f/0x870
[   90.800082]  [<ffffffffa01ebbaa>] ? hvt_op_read+0x12a/0x140 [hv_utils]
...

The problem is that hvutil_transport_destroy() which does misc_deregister()
freeing the appropriate device is reachable by two paths: module unload
and from util_remove(). While module unload path is protected by .owner in
struct file_operations util_remove() path is not. Freeing the device while
someone holds an open fd for it is a show stopper.

In general, it is not possible to revoke an fd from all users so the only
way to solve the issue is to defer freeing the hvutil_transport structure.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov a15025660d Drivers: hv: utils: introduce HVUTIL_TRANSPORT_DESTROY mode
When Hyper-V host asks us to remove some util driver by closing the
appropriate channel there is no easy way to force the current file
descriptor holder to hang up but we can start to respond -EBADF to all
operations asking it to exit gracefully.

As we're setting hvt->mode from two separate contexts now we need to use
a proper locking.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov a72f3a4ccf Drivers: hv: utils: rename outmsg_lock
As a preparation to reusing outmsg_lock to protect test-and-set openrations
on 'mode' rename it the more general 'lock'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov 1f75338b6f Drivers: hv: utils: fix memory leak on on_msg() failure
inmsg should be freed in case of on_msg() failure to avoid memory leak.
Preserve the error code from on_msg().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:27:30 -08:00
K. Y. Srinivasan 2d0c3b5ad7 Drivers: hv: utils: Invoke the poll function after handshake
When the handshake with daemon is complete, we should poll the channel since
during the handshake, we will not be processing any messages. This is a
potential bug if the host is waiting for a response from the guest.
I would like to thank Dexuan for pointing this out.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
K. Y. Srinivasan b282e4c06f Drivers: hv: vmbus: Force all channel messages to be delivered on CPU 0
Force all channel messages to be delivered on CPU0. These messages are not
performance critical and are used during the setup and teardown of the
channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Andrey Smetanin c35b82ef02 drivers/hv: correct tsc page sequence invalid value
Hypervisor Top Level Functional Specification v3/4 says
that TSC page sequence value = -1(0xFFFFFFFF) is used to
indicate that TSC page no longer reliable source of reference
timer. Unfortunately, we found that Windows Hyper-V guest
side implementation uses sequence value = 0 to indicate
that Tsc page no longer valid. This is clearly visible
inside Windows 2012R2 ntoskrnl.exe HvlGetReferenceTime()
function dissassembly:

HvlGetReferenceTime proc near
                 xchg    ax, ax
loc_1401C3132:
                 mov     rax, cs:HvlpReferenceTscPage
                 mov     r9d, [rax]
                 test    r9d, r9d
                 jz      short loc_1401C3176
                 rdtsc
                 mov     rcx, cs:HvlpReferenceTscPage
                 shl     rdx, 20h
                 or      rdx, rax
                 mov     rax, [rcx+8]
                 mov     rcx, cs:HvlpReferenceTscPage
                 mov     r8, [rcx+10h]
                 mul     rdx
                 mov     rax, cs:HvlpReferenceTscPage
                 add     rdx, r8
                 mov     ecx, [rax]
                 cmp     ecx, r9d
                 jnz     short loc_1401C3132
                 jmp     short loc_1401C3184
loc_1401C3176:
                 mov     ecx, 40000020h
                 rdmsr
                 shl     rdx, 20h
                 or      rdx, rax
loc_1401C3184:
                 mov     rax, rdx
                 retn
HvlGetReferenceTime endp

This patch aligns Tsc page invalid sequence value with
Windows Hyper-V guest implementation which is more
compatible with both Hyper-V hypervisor and KVM hypervisor.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
K. Y. Srinivasan 8599846d73 Drivers: hv: vmbus: Fix a Host signaling bug
Currently we have two policies for deciding when to signal the host:
One based on the ring buffer state and the other based on what the
VMBUS client driver wants to do. Consider the case when the client
wants to explicitly control when to signal the host. In this case,
if the client were to defer signaling, we will not be able to signal
the host subsequently when the client does want to signal since the
ring buffer state will prevent the signaling. Implement logic to
have only one signaling policy in force for a given channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Tested-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Jake Oshins 40f26f3168 drivers:hv: Allow for MMIO claims that span ACPI _CRS records
This patch makes 16GB GPUs work in Hyper-V VMs, since, for
compatibility reasons, the Hyper-V BIOS lists MMIO ranges in 2GB
chunks in its root bus's _CRS object.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Dexuan Cui d6f591e339 Drivers: hv: vmbus: channge vmbus_connection.channel_lock to mutex
spinlock is unnecessary here.
mutex is enough.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Dexuan Cui f52078cf57 Drivers: hv: vmbus: release relid on error in vmbus_process_offer()
We want to simplify vmbus_onoffer_rescind() by not invoking
hv_process_channel_removal(NULL, ...).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Dexuan Cui 34c6801e33 Drivers: hv: vmbus: fix rescind-offer handling for device without a driver
In the path vmbus_onoffer_rescind() -> vmbus_device_unregister()  ->
device_unregister() -> ... -> __device_release_driver(), we can see for a
device without a driver loaded: dev->driver is NULL, so
dev->bus->remove(dev), namely vmbus_remove(), isn't invoked.

As a result, vmbus_remove() -> hv_process_channel_removal() isn't invoked
and some cleanups(like sending a CHANNELMSG_RELID_RELEASED message to the
host) aren't done.

We can demo the issue this way:
1. rmmod hv_utils;
2. disable the Heartbeat Integration Service in Hyper-V Manager and lsvmbus
shows the device disappears.
3. re-enable the Heartbeat in Hyper-V Manager and modprobe hv_utils, but
lsvmbus shows the device can't appear again.
This is because, the host thinks the VM hasn't released the relid, so can't
re-offer the device to the VM.

We can fix the issue by moving hv_process_channel_removal()
from vmbus_close_internal() to vmbus_device_release(), since the latter is
always invoked on device_unregister(), whether or not the dev has a driver
loaded.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Dexuan Cui 64b7faf903 Drivers: hv: vmbus: do sanity check of channel state in vmbus_close_internal()
This fixes an incorrect assumption of channel state in the function.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Dexuan Cui 63d55b2aeb Drivers: hv: vmbus: serialize process_chn_event() and vmbus_close_internal()
process_chn_event(), running in the tasklet, can race with
vmbus_close_internal() in the case of SMP guest, e.g., when the former is
accessing channel->inbound.ring_buffer, the latter could be freeing the
ring_buffer pages.

To resolve the race, we can serialize them by disabling the tasklet when
the latter is running here.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
K. Y. Srinivasan efc267226b Drivers: hv: vmbus: Get rid of the unused irq variable
The irq we extract from ACPI is not used - we deliver hypervisor
interrupts on a special vector. Make the necessary adjustments.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
K. Y. Srinivasan 4ae9250893 Drivers: hv: vmbus: Use uuid_le_cmp() for comparing GUIDs
Use uuid_le_cmp() for comparing GUIDs.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
K. Y. Srinivasan af3ff643ea Drivers: hv: vmbus: Use uuid_le type consistently
Consistently use uuid_le type in the Hyper-V driver code.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Olaf Hering ed9ba608e4 Drivers: hv: vss: run only on supported host versions
The Backup integration service on WS2012 has appearently trouble to
negotiate with a guest which does not support the provided util version.
Currently the VSS driver supports only version 5/0. A WS2012 offers only
version 1/x and 3/x, and vmbus_prep_negotiate_resp correctly returns an
empty icframe_vercnt/icmsg_vercnt. But the host ignores that and
continues to send ICMSGTYPE_NEGOTIATE messages. The result are weird
errors during boot and general misbehaviour.

Check the Windows version to work around the host bug, skip hv_vss_init
on WS2012 and older.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:15:05 -08:00
Jake Oshins 3053c76244 drivers:hv: Define the channel type for Hyper-V PCI Express pass-through
This defines the channel type for PCI front-ends in Hyper-V VMs.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Jake Oshins a108393dbf drivers:hv: Export the API to invoke a hypercall on Hyper-V
This patch exposes the function that hv_vmbus.ko uses to make hypercalls.  This
is necessary for retargeting an interrupt when it is given a new affinity.

Since we are exporting this API, rename the API as it will be visible outside
the hv.c file.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Jake Oshins 619848bd07 drivers:hv: Export a function that maps Linux CPU num onto Hyper-V proc num
This patch exposes the mapping between Linux CPU number and Hyper-V virtual
processor number.  This is necessary because the hypervisor needs to know which
virtual processors to target when making a mapping in the Interrupt Redirection
Table in the I/O MMU.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Andrey Smetanin 17efbee8ba drivers/hv: cleanup synic msrs if vmbus connect failed
Before vmbus_connect() synic is setup per vcpu - this means
hypervisor receives writes at synic msr's and probably allocate
hypervisor resources per synic setup.

If vmbus_connect() failed for some reason it's neccessary to cleanup
synic setup by call hv_synic_cleanup() at each vcpu to get a chance
to free allocated resources by hypervisor per synic.

This patch does appropriate cleanup in case of vmbus_connect() failure.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Olaf Hering b00359642c Drivers: hv: utils: use memdup_user in hvt_op_write
Use memdup_user to handle OOM.

Fixes: 14b50f80c3 ('Drivers: hv: util: introduce hv_utils_transport abstraction')

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Olaf Hering cdc0c0c94e Drivers: hv: util: catch allocation errors
Catch allocation errors in hvutil_transport_send.

Fixes: 14b50f80c3 ('Drivers: hv: util: introduce hv_utils_transport abstraction')

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Olaf Hering 3cace4a616 Drivers: hv: utils: run polling callback always in interrupt context
All channel interrupts are bound to specific VCPUs in the guest
at the point channel is created. While currently, we invoke the
polling function on the correct CPU (the CPU to which the channel
is bound to) in some cases we may run the polling function in
a non-interrupt context. This  potentially can cause an issue as the
polling function can be interrupted by the channel callback function.
Fix the issue by running the polling function on the appropriate CPU
at interrupt level. Additional details of the issue being addressed by
this patch are given below:

Currently hv_fcopy_onchannelcallback is called from interrupts and also
via the ->write function of hv_utils. Since the used global variables to
maintain state are not thread safe the state can get out of sync.
This affects the variable state as well as the channel inbound buffer.

As suggested by KY adjust hv_poll_channel to always run the given
callback on the cpu which the channel is bound to. This avoids the need
for locking because all the util services are single threaded and only
one transaction is active at any given point in time.

Additionally, remove the context variable, they will always be the same as
recv_channel.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
K. Y. Srinivasan c0b200cfb0 Drivers: hv: util: Increase the timeout for util services
Util services such as KVP and FCOPY need assistance from daemon's running
in user space. Increase the timeout so we don't prematurely terminate
the transaction in the kernel. Host sets up a 60 second timeout for
all util driver transactions. The host will retry the transaction if it
times out. Set the guest timeout at 30 seconds.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 19:12:21 -08:00
Sudip Mukherjee 9220e39b5c Drivers: hv: vmbus: fix build warning
We were getting build warning about unused variable "tsc_msr" and
"va_tsc" while building for i386 allmodconfig.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14 13:34:50 -08:00
Andrey Smetanin c75efa974e drivers/hv: share Hyper-V SynIC constants with userspace
Moved Hyper-V synic contants from guest Hyper-V drivers private
header into x86 arch uapi Hyper-V header.

Added Hyper-V synic msr's flags into x86 arch uapi Hyper-V header.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-04 16:24:33 +01:00
Dexuan Cui ca1c4b7457 Drivers: hv: vmbus: fix init_vp_index() for reloading hv_netvsc
This fixes the recent commit 3b71107d73b16074afa7658f3f0fcf837aabfe24:
Drivers: hv: vmbus: Further improve CPU affiliation logic

Without the fix, reloading hv_netvsc hangs the guest.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-20 22:44:51 -07:00
Vitaly Kuznetsov f39c4280a3 Drivers: hv: vmbus: use cpu_hotplug_enable/disable
Commit e513229b4c ("Drivers: hv: vmbus: prevent cpu offlining on newer
hypervisors") was altering smp_ops.cpu_disable to prevent CPU offlining.
We can bo better by using cpu_hotplug_enable/disable functions instead of
such hard-coding.

Reported-by: Radim Kr.má  <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:46:44 -07:00
Dexuan Cui 042ab0313b Drivers: hv: vmbus: add a sysfs attr to show the binding of channel/VP
This is useful to analyze performance issue.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
K. Y. Srinivasan ca9357bd26 Drivers: hv: vmbus: Implement a clocksource based on the TSC page
The current Hyper-V clock source is based on the per-partition reference counter
and this counter is being accessed via s synthetic MSR - HV_X64_MSR_TIME_REF_COUNT.
Hyper-V has a more efficient way of computing the per-partition reference
counter value that does not involve reading a synthetic MSR. We implement
a time source based on this mechanism.

Tested-by: Vivek Yadav <vyadav@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Viresh Kumar bc609cb47f drivers/hv: Migrate to new 'set-state' interface
Migrate hv driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: devel@linuxdriverproject.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Christopher Oo a5cca686ce Drivers: hv_vmbus: Fix signal to host condition
Fixes a bug where previously hv_ringbuffer_read would pass in the old
number of bytes available to read instead of the expected old read index
when calculating when to signal to the host that the ringbuffer is empty.
Since the previous write size is already saved, also changes the
hv_need_to_signal_on_read to use the previously read value rather than
recalculating it.

Signed-off-by: Christopher Oo <t-chriso@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Dexuan Cui 3b71107d73 Drivers: hv: vmbus: Further improve CPU affiliation logic
Keep track of CPU affiliations of sub-channels within the scope of the primary
channel. This will allow us to better distribute the load amongst available
CPUs.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:28 -07:00
K. Y. Srinivasan 9f01ec5345 Drivers: hv: vmbus: Improve the CPU affiliation for channels
The current code tracks the assigned CPUs within a NUMA node in the context of
the primary channel. So, if we have a VM with a single NUMA node with 8 VCPUs, we may
end up unevenly distributing the channel load. Fix the issue by tracking affiliations
globally.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 3546448338 drivers:hv: Move MMIO range picking from hyper_fb to hv_vmbus
This patch deletes the logic from hyperv_fb which picked a range of MMIO space
for the frame buffer and adds new logic to hv_vmbus which picks ranges for
child drivers.  The new logic isn't quite the same as the old, as it considers
more possible ranges.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 7f163a6fd9 drivers:hv: Modify hv_vmbus to search for all MMIO ranges available.
This patch changes the logic in hv_vmbus to record all of the ranges in the
VM's firmware (BIOS or UEFI) that offer regions of memory-mapped I/O space for
use by paravirtual front-end drivers.  The old logic just found one range
above 4GB and called it good.  This logic will find any ranges above 1MB.

It would have been possible with this patch to just use existing resource
allocation functions, rather than keep track of the entire set of Hyper-V
related MMIO regions in VMBus.  This strategy, however, is not sufficient
when the resource allocator needs to be aware of the constraints of a
Hyper-V virtual machine, which is what happens in the next patch in the series.
So this first patch exists to show the first steps in reworking the MMIO
allocation paths for Hyper-V front-end drivers.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:30 -07:00
K. Y. Srinivasan 379e4f756b Drivers: hv: vmbus: Consider ND NIC in binding channels to CPUs
We cycle through all the "high performance" channels to distribute
load across the available CPUs. Process the NetworkDirect as a
high performance device.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Denis V. Lunev cc2dd4027a mshyperv: fix recognition of Hyper-V guest crash MSR's
Hypervisor Top Level Functional Specification v3.1/4.0 notes that cpuid
(0x40000003) EDX's 10th bit should be used to check that Hyper-V guest
crash MSR's functionality available.

This patch should fix this recognition. Currently the code checks EAX
register instead of EDX.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Vitaly Kuznetsov 4a54243fc0 Drivers: hv: vmbus: don't send CHANNELMSG_UNLOAD on pre-Win2012R2 hosts
Pre-Win2012R2 hosts don't properly handle CHANNELMSG_UNLOAD and
wait_for_completion() hangs. Avoid sending such request on old hosts.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Nik Nyby e26009aad0 Drivers: hv: vmbus: fix typo in hv_port_info struct
This fixes a typo: base_flag_bumber to base_flag_number

Signed-off-by: Nik Nyby <nikolas@gnu.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Dan Carpenter 9dd6a06430 hv: util: checking the wrong variable
We don't catch this allocation failure because there is a typo and we
check the wrong variable.

Fixes: 14b50f80c3 ('Drivers: hv: util: introduce hv_utils_transport abstraction')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
K. Y. Srinivasan b81658cf5d Drivers: hv: vmbus: Permit sending of packets without payload
The guest may have to send a completion packet back to the host.
To support this usage, permit sending a packet without a payload -
we would be only sending the descriptor in this case.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Alex Ng b6ddeae160 Drivers: hv: balloon: Enable dynamic memory protocol negotiation with Windows 10 hosts
Support Win10 protocol for Dynamic Memory. Thia patch allows guests on Win10 hosts
to hot-add memory even when dynamic memory is not enabled on the guest.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Vitaly Kuznetsov 25ef06fe27 Drivers: hv: fcopy: dynamically allocate smsg_out in fcopy_send_data()
struct hv_start_fcopy is too big to be on stack on i386, the following
warning is reported:

>> drivers/hv/hv_fcopy.c:159:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b36fda3397 Drivers: hv: kvp: check kzalloc return value
kzalloc() return value check was accidentally lost in 11bc3a5fa91f:
"Drivers: hv: kvp: convert to hv_utils_transport" commit.

We don't need to reset kvp_transaction.state here as we have the
kvp_timeout_func() timeout function and in case we're in OOM situation
it is preferable to wait.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov 510f7aef65 Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
current_pt_regs() sometimes returns regs of the userspace process and in
case of a kernel crash this is not what we need to report. E.g. when we
trigger crash with sysrq we see the following:
...
 RIP: 0010:[<ffffffff815b8696>]  [<ffffffff815b8696>] sysrq_handle_crash+0x16/0x20
 RSP: 0018:ffff8800db0a7d88  EFLAGS: 00010246
 RAX: 000000000000000f RBX: ffffffff820a0660 RCX: 0000000000000000
...
at the same time current_pt_regs() give us:
ip=7f899ea7e9e0, ax=ffffffffffffffda, bx=26c81a0, cx=7f899ea7e9e0, ...
These registers come from the userspace process triggered the crash. As we
don't even know which process it was this information is rather useless.

When kernel crash happens through 'die' proper regs are being passed to
all receivers on the die_chain (and panic_notifier_list is being notified
with the string passed to panic() only). If panic() is called manually
(e.g. on BUG()) we won't get 'die' notification so keep the 'panic'
notification reporter as well but guard against double reporting.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b4370df2b1 Drivers: hv: vmbus: add special crash handler
Full kernel hang is observed when kdump kernel starts after a crash. This
hang happens in vmbus_negotiate_version() function on
wait_for_completion() as Hyper-V host (Win2012R2 in my testing) never
responds to CHANNELMSG_INITIATE_CONTACT as it thinks the connection is
already established. We need to perform some mandatory minimalistic
cleanup before we start new kernel.

Reported-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov d7646eaa76 Drivers: hv: don't do hypercalls when hypercall_page is NULL
At the very late stage of kexec a driver (which are not being unloaded) can
try to post a message or signal an event. This will crash the kernel as we
already did hv_cleanup() and the hypercall page is NULL.

Move all common (between 32 and 64 bit code) declarations to the beginning
of the do_hypercall() function. Unfortunately we have to write the
!hypercall_page check twice to not mix declarations and code.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 2517281d63 Drivers: hv: vmbus: add special kexec handler
When general-purpose kexec (not kdump) is being performed in Hyper-V guest
the newly booted kernel fails with an MCE error coming from the host. It
is the same error which was fixed in the "Drivers: hv: vmbus: Implement
the protocol for tearing down vmbus state" commit - monitor pages remain
special and when they're being written to (as the new kernel doesn't know
these pages are special) bad things happen. We need to perform some
minimalistic cleanup before booting a new kernel on kexec. To do so we
need to register a special machine_ops.shutdown handler to be executed
before the native_machine_shutdown(). Registering a shutdown notification
handler via the register_reboot_notifier() call is not sufficient as it
happens to early for our purposes. machine_ops is not being exported to
modules (and I don't think we want to export it) so let's do this in
mshyperv.c

The minimalistic cleanup consists of cleaning up clockevents, synic MSRs,
guest os id MSR, and hypercall MSR.

Kdump doesn't require all this stuff as it lives in a separate memory
space.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 06210b42f3 Drivers: hv: vmbus: remove hv_synic_free_cpu() call from hv_synic_cleanup()
We already have hv_synic_free() which frees all per-cpu pages for all
CPUs, let's remove the hv_synic_free_cpu() call from hv_synic_cleanup()
so it will be possible to do separate cleanup (writing to MSRs) and final
freeing. This is going to be used to assist kexec.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:28 -07:00
K. Y. Srinivasan 294409d205 Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
Allocate ring buffer memory from the NUMA node assigned to the channel.
Since this is a performance and not a correctness issue, if the node specific
allocation were to fail, fall back and allocate without specifying the node.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-12 16:58:33 -07:00
K. Y. Srinivasan 1f656ff3fd Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels
Channels/sub-channels can be affinitized to VCPUs in the guest. Implement
this affinity in a way that is NUMA aware. The current protocol distributed
the primary channels uniformly across all available CPUs. The new protocol
is NUMA aware: primary channels are distributed across the available NUMA
nodes while the sub-channels within a primary channel are distributed amongst
CPUs within the NUMA node assigned to the primary channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 10:56:31 +09:00
K. Y. Srinivasan 9c6e64adf2 Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
Map target_cpu to target_vcpu using the mapping table.
We should use the mapping table to transform guest CPU ID to VP Index
as is done for the non-performance critical channels.
While the value CPU 0 is special and will
map to VP index 0, it is good to be consistent.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 10:56:31 +09:00
Vitaly Kuznetsov 4e4bd36f97 Drivers: hv: balloon: check if ha_region_mutex was acquired in MEM_CANCEL_ONLINE case
Memory notifiers are being executed in a sequential order and when one of
them fails returning something different from NOTIFY_OK the remainder of
the notification chain is not being executed. When a memory block is being
onlined in online_pages() we do memory_notify(MEM_GOING_ONLINE, ) and if
one of the notifiers in the chain fails we end up doing
memory_notify(MEM_CANCEL_ONLINE, ) so it is possible for a notifier to see
MEM_CANCEL_ONLINE without seeing the corresponding MEM_GOING_ONLINE event.
E.g. when CONFIG_KASAN is enabled the kasan_mem_notifier() is being used
to prevent memory hotplug, it returns NOTIFY_BAD for all MEM_GOING_ONLINE
events. As kasan_mem_notifier() comes before the hv_memory_notifier() in
the notification chain we don't see the MEM_GOING_ONLINE event and we do
not take the ha_region_mutex. We, however, see the MEM_CANCEL_ONLINE event
and unconditionally try to release the lock, the following is observed:

[  110.850927] =====================================
[  110.850927] [ BUG: bad unlock balance detected! ]
[  110.850927] 4.1.0-rc3_bugxxxxxxx_test_xxxx #595 Not tainted
[  110.850927] -------------------------------------
[  110.850927] systemd-udevd/920 is trying to release lock
(&dm_device.ha_region_mutex) at:
[  110.850927] [<ffffffff81acda0e>] mutex_unlock+0xe/0x10
[  110.850927] but there are no more locks to release!

At the same time we can have the ha_region_mutex taken when we get the
MEM_CANCEL_ONLINE event in case one of the memory notifiers after the
hv_memory_notifier() in the notification chain failed so we need to add
the mutex_is_locked() check. In case of MEM_ONLINE we are always supposed
to have the mutex locked.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 06:38:21 +09:00
Keith Mange 6c4e5f9c9f Drivers: hv: vmbus:Update preferred vmbus protocol version to windows 10.
Add support for Windows 10.

Signed-off-by: Keith Mange <keith.mange@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 06:38:21 +09:00
Vitaly Kuznetsov ce59fec836 Drivers: hv: vmbus: distribute subchannels among all vcpus
Primary channels are distributed evenly across all vcpus we have. When the host
asks us to create subchannels it usually makes us num_cpus-1 offers and we are
supposed to distribute the work evenly among the channel itself and all its
subchannels. Make sure they are all assigned to different vcpus.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov f38e7dd723 Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer()
We need to call init_vp_index() after we added the channel to the appropriate
list (global or subchannel) to be able to use this information when assigning
the channel to the particular vcpu. To do so we need to move a couple of
functions around. The only real change is the init_vp_index() call. This is a
small refactoring without a functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 357e836a60 Drivers: hv: vmbus: decrease num_sc on subchannel removal
It is unlikely that that host will ask us to close only one subchannel for a
device but let's be consistent. Do both num_sc++ and num_sc-- with
channel->lock to be on the safe side.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 8dfd332674 Drivers: hv: vmbus: unify calls to percpu_channel_enq()
Remove some code duplication, no functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 1959a28e26 Drivers: hv: vmbus: kill tasklets on module unload
Explicitly kill tasklets we create on module unload.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov ffc151f3c8 Drivers: hv: vmbus: do cleanup on all vmbus_open() failure paths
In case there was an error reported in the response to the CHANNELMSG_OPENCHANNEL
call we need to do the cleanup as a vmbus_open() user won't be doing it after
receiving an error. The cleanup should be done on all failure paths. We also need
to avoid returning open_info->response.open_result.status as the return value as
all other errors we return from vmbus_open() are -EXXX and vmbus_open() callers
are not supposed to analyze host error codes.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
K. Y. Srinivasan 2db84eff12 Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
Implement the protocol for tearing down the monitor state established with
the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:24 -07:00
Dexuan Cui 813c5b7958 hv: vmbus_free_channels(): remove the redundant free_channel()
free_channel() has been invoked in
vmbus_remove() -> hv_process_channel_removal(), or vmbus_remove() ->
... -> vmbus_close_internal() -> hv_process_channel_removal().

We also change to use list_for_each_entry_safe(), because the entry
is removed in hv_process_channel_removal().

This patch fixes a bug in the vmbus unload path.

Thank Dan Carpenter for finding the issue!

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov 096c605feb Drivers: hv: vmbus: unregister panic notifier on module unload
Commit 96c1d0581d ("Drivers: hv: vmbus: Add
support for VMBus panic notifier handler") introduced
atomic_notifier_chain_register() call on module load. We also need to call
atomic_notifier_chain_unregister() on module unload as otherwise the following
crash is observed when we bring hv_vmbus back:

[   39.788877] BUG: unable to handle kernel paging request at ffffffffa00078a8
[   39.788877] IP: [<ffffffff8109d63f>] notifier_call_chain+0x3f/0x80
...
[   39.788877] Call Trace:
[   39.788877]  [<ffffffff8109de7d>] __atomic_notifier_call_chain+0x5d/0x90
...
[   39.788877]  [<ffffffff8109d788>] ? atomic_notifier_chain_register+0x38/0x70
[   39.788877]  [<ffffffff8109d767>] ? atomic_notifier_chain_register+0x17/0x70
[   39.788877]  [<ffffffffa002814f>] hv_acpi_init+0x14f/0x1000 [hv_vmbus]
[   39.788877]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov e4ecb41cf9 Drivers: hv: vmbus: introduce vmbus_acpi_remove
In case we do request_resource() in vmbus_acpi_add() we need to tear it down
to be able to load the driver again. Otherwise the following crash in observed
when hv_vmbus unload/load sequence is performed on a Generation2 instance:

[   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
[   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
[   38.166315] Oops: 0000 [#1] SMP
[   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
[   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486
[   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti: ffff88003f60c000
[   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov 7c95912758 Drivers: hv: utils: unify driver registration reporting
Unify driver registration reporting and move it to debug level as normally daemons write to syslog themselves
and these kernel messages are useless.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov a4d1ee5b02 Drivers: hv: fcopy: full handshake support
Introduce FCOPY_VERSION_1 to support kernel replying to the negotiation
message with its own version.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov cd8dc05485 Drivers: hv: vss: full handshake support
Introduce VSS_OP_REGISTER1 to support kernel replying to the negotiation
message with its own version.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 11bc3a5fa9 Drivers: hv: kvp: convert to hv_utils_transport
Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_kvp communication methods.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov c7e490fc23 Drivers: hv: fcopy: convert to hv_utils_transport
Unify the code with the recently introduced hv_utils_transport. Netlink
communication is disabled for fcopy.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 6472f80a2e Drivers: hv: vss: convert to hv_utils_transport
Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_vss communication methods.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 14b50f80c3 Drivers: hv: util: introduce hv_utils_transport abstraction
The intention is to make KVP/VSS drivers work through misc char devices.
Introduce an abstraction for kernel/userspace communication to make the
migration smoother. Transport operational mode (netlink or char device)
is determined by the first received message. To support driver upgrades
the switch from netlink to chardev operational mode is supported.

Every hv_util daemon is supposed to register 2 callbacks:
1) on_msg() to get notified when the userspace daemon sent a message;
2) on_reset() to get notified when the userspace daemon drops the connection.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 3f0dccf86c Drivers: hv: fcopy: set .owner reference for file operations
Get an additional reference otherwise a crash is observed when hv_utils module is being unloaded while
fcopy daemon is still running. .owner gives us an additional reference when
someone holds a descriptor for the device.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 4c93ccccf4 Drivers: hv: fcopy: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using 3 different state variables:
fcopy_transaction.active, opened, and in_hand_shake.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after userspace daemon read the message
   -> HVUTIL_USERSPACE_RECV after/if userspace has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_fcopy_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support FCOPY and disable the service completely.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 086a6f68d6 Drivers: hv: vss: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using kvp_transaction.active.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon
   -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_vss_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support VSS and disable the service completely.

Unfortunately there is no good way we can figure out that the userspace daemon
has died (unless we start treating all timeouts as such), add a protection
against processing new VSS_OP_REGISTER messages while being in the middle of a
transaction (HVUTIL_USERSPACE_REQ or HVUTIL_USERSPACE_RECV state).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 97bf16cd30 Drivers: hv: kvp: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using 2 different state variables: kvp_transaction.active and
in_hand_shake.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon
   -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_kvp_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support KVP and disable the service completely.

Unfortunately there is no good way we can figure out that the userspace daemon
has died (unless we start treating all timeouts as such). In case the daemon
restarts we skip the negotiation procedure (so the daemon is supposed to has
the same version). This behavior is unchanged from in_handshake approach.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 636c88da6d Drivers: hv: util: introduce state machine for util drivers
KVP/VSS/FCOPY drivers work in fully serialized mode: we wait till userspace
daemon registers, wait for a message from the host, send this message to the
daemon, get the reply, send it back to host, wait for another message.
Introduce enum hvutil_device_state to represend this state in all 3 drivers.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 1d072339fa Drivers: hv: fcopy: rename fcopy_work -> fcopy_timeout_work
'fcopy_work' (and fcopy_work_func) is a misnomer as it sounds like we expect
this useful work to happen and in reality it is just an emergency escape when
timeout happens.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 68c8b39a0d Drivers: hv: kvp: rename kvp_work -> kvp_timeout_work
'kvp_work' (and kvp_work_func) is a misnomer as it sounds like we expect
this useful work to happen and in reality it is just an emergency escape when
timeout happens.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 38c06c29ba Drivers: hv: vss: process deferred messages when we complete the transaction
In theory, the host is not supposed to issue any requests before be reply to
the previous one. In KVP we, however, support the following scenarios:
1) A message was received before userspace daemon registered;
2) A message was received while the previous one is still being processed.
In VSS we support only the former. Add support for the later, use
hv_poll_channel() to do the job.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 242f31221d Drivers: hv: fcopy: process deferred messages when we complete the transaction
In theory, the host is not supposed to issue any requests before be reply to
the previous one. In KVP we, however, support the following scenarios:
1) A message was received before userspace daemon registered;
2) A message was received while the previous one is still being processed.
In FCOPY we support only the former. Add support for the later, use
hv_poll_channel() to do the job.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 8efe78fdb1 Drivers: hv: kvp: move poll_channel() to hyperv_vmbus.h
Move poll_channel() to hyperv_vmbus.h and make it inline and rename it to hv_poll_channel() so it can be reused
in other hv_util modules.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 5fa97480b9 Drivers: hv: kvp: reset kvp_context
We set kvp_context when we want to postpone receiving a packet from vmbus due
to the previous transaction being unfinished. We, however, never reset this
state, all consequent kvp_respond_to_host() calls will result in poll_channel()
calling hv_kvp_onchannelcallback(). This doesn't cause real issues as:
1) Host is supposed to serialize transactions as well
2) If no message is pending vmbus_recvpacket() will return 0 recvlen.
This is just a cleanup.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 3647a83d9d Drivers: hv: util: move kvp/vss function declarations to hyperv_vmbus.h
These declarations are internal to hv_util module and hv_fcopy_* declarations
already reside there.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov 797f88c987 Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case
balloon_wrk.num_pages is __u32 and it comes from host in struct dm_balloon
where it is also __u32. We, however, use 'int' in balloon_up() and in case
we happen to receive num_pages>INT_MAX request we'll end up allocating zero
pages as 'num_pages < alloc_unit' check in alloc_balloon_pages() will pass.
Change num_pages type to unsigned int.

In real life ballooning request come with num_pages in [512, 32768] range so
this is more a future-proof/cleanup.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:20:12 +02:00
Vitaly Kuznetsov ba0c444153 Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case
'Drivers: hv: hv_balloon: refuse to balloon below the floor' fix does not
correctly handle the case when val.freeram < num_pages as val.freeram is
__kernel_ulong_t and the 'val.freeram - num_pages' value will be a huge
positive value instead of being negative.

Usually host doesn't ask us to balloon more than val.freeram but in case
he have a memory hog started after we post the last pressure report we
can get into troubles.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:20:12 +02:00
Haiyang Zhang e1c0d82dab hv_vmbus: Add gradually increased delay for retries in vmbus_post_msg()
Most of the retries can be done within a millisecond successfully, so we
sleep 1ms before the first retry, then gradually increase the retry
interval to 2^n with max value of 2048ms. Doing so, we will have shorter
overall delay time, because most of the cases succeed within 1-2 attempts.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov 0a1a86ac04 Drivers: hv: hv_balloon: survive ballooning request with num_pages=0
... and simplify alloc_balloon_pages() interface by removing redundant
alloc_error from it.

If we happen to enter balloon_up() with balloon_wrk.num_pages = 0 we will enter
infinite 'while (!done)' loop as alloc_balloon_pages() will be always returning
0 and not setting alloc_error. We will also be sending a meaningless message to
the host on every iteration.

The 'alloc_unit == 1 && alloc_error -> num_ballooned == 0' change and
alloc_error elimination requires a special comment. We do alloc_balloon_pages()
with 2 different alloc_unit values and there are 4 different
alloc_balloon_pages() results, let's check them all.

alloc_unit = 512:
1) num_ballooned = 0, alloc_error = 0: we do 'alloc_unit=1' and retry pre- and
  post-patch.
2) num_ballooned > 0, alloc_error = 0: we check 'num_ballooned == num_pages'
  and act accordingly,  pre- and post-patch.
3) num_ballooned > 0, alloc_error > 0: we report this chunk and remain within
  the loop, no changes here.
4) num_ballooned = 0, alloc_error > 0: we do 'alloc_unit=1' and retry pre- and
  post-patch.

alloc_unit = 1:
1) num_ballooned = 0, alloc_error = 0: this can happen in two cases: when we
  passed 'num_pages=0' to alloc_balloon_pages() or when there was no space in
  bl_resp to place a single response. The second option is not possible as
  bl_resp is of PAGE_SIZE size and single response 'union dm_mem_page_range' is
  8 bytes, but the first one is (in theory, I think that Hyper-V host never
  places such requests). Pre-patch code loops forever, post-patch code sends
  a reply with more_pages = 0 and finishes.
2) num_ballooned > 0, alloc_error = 0: we ran out of space in bl_resp, we
  report partial success and remain within the loop, no changes pre- and
  post-patch.
3) num_ballooned > 0, alloc_error > 0: pre-patch code finishes, post-patch code
  does one more try and if there is no progress (we finish with
  'num_ballooned = 0') we finish. So we try a bit harder with this patch.
4) num_ballooned = 0, alloc_error > 0: both pre- and post-patch code enter
 'more_pages = 0' branch and finish.

So this patch has two real effects:
1) We reply with an empty response to 'num_pages=0' request.
2) We try a bit harder on alloc_unit=1 allocations (and reply with an empty
   tail reply in case we fail).

An empty reply should be supported by host as we were able to send it even with
pre-patch code when we were not able to allocate a single page.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov 7fb0e1a650 Drivers: hv: hv_balloon: eliminate jumps in piecewiese linear floor function
Commit 79208c57da ("Drivers: hv: hv_balloon: Make adjustments in computing
the floor") was inacurate as it introduced a jump in our piecewiese linear
'floor' function:

At 2048MB we have:
Left limit:
104 + 2048/8 = 360
Right limit:
256 + 2048/16 = 384 (so the right value is 232)

We now have to make an adjustment at 8192 boundary:
232 + 8192/16 = 744
512 + 8192/32 = 768 (so the right value is 488)

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov d6cbd2c3a3 Drivers: hv: hv_balloon: do not online pages in offline blocks
Currently we add memory in 128Mb blocks but the request from host can be
aligned differently. In such case we add a partially backed block and
when this block goes online we skip onlining pages which are not backed
(hv_online_page() callback serves this purpose). When we receive next
request for the same host add region we online pages which were not backed
before with hv_bring_pgs_online(). However, we don't check if the the block
in question was onlined and online this tail unconditionally. This is bad as
we avoid all online_pages() logic: these pages are not accounted, we don't
send notifications (and hv_balloon is not the only receiver of them),...
And, first of all, nobody asked as to online these pages. Solve the issue by
checking if the last previously backed page was onlined and onlining the tail
only in case it was.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Dexuan Cui aadc3780f3 hv: remove the per-channel workqueue
It's not necessary any longer, since we can safely run the blocking
message handlers in vmbus_connection.work_queue now.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Dexuan Cui d43e2fe7da hv: don't schedule new works in vmbus_onoffer()/vmbus_onoffer_rescind()
Since the 2 fucntions can safely run in vmbus_connection.work_queue without
hang, we don't need to schedule new work items into the per-channel workqueue.

Actally we can even remove the per-channel workqueue now -- we'll do it
in the next patch.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:02 +02:00
Dexuan Cui 652594c7df hv: run non-blocking message handlers in the dispatch tasklet
A work item in vmbus_connection.work_queue can sleep, waiting for a new
host message (usually it is some kind of "completion" message). Currently
the new message will be handled in the same workqueue, but since work items
in the workqueue is serialized, we actually have no chance to handle
the new message if the current work item is sleeping -- as as result, the
current work item will hang forever.

K. Y. has posted the below fix to resolve the issue:
Drivers: hv: vmbus: Perform device register in the per-channel work element

Actually we can simplify the fix by directly running non-blocking message
handlers in the dispatch tasklet (inspired by K. Y.).

This patch is the fundamental change. The following 2 patches will simplify
the message offering and rescind-offering handling a lot.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-03 16:18:01 +02:00
K. Y. Srinivasan 73cffdb65e Drivers: hv: vmbus: Don't wait after requesting offers
Don't wait after sending request for offers to the host. This wait is
unnecessary and simply adds 5 seconds to the boot time.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26 23:51:14 +01:00
K. Y. Srinivasan 5f5cc81733 Drivers: hv: vmbus: Fix a siganlling host signalling issue
Handle the case when the write to the ringbuffer fails. In this case,
unconditionally signal the host. Since we may have deferred signalling
the host based on the kick_q parameter, signalling the host
unconditionally in this case deals with the issue.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:55 +01:00
K. Y. Srinivasan b3a19b36ad Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()
Export the vmbus_sendpacket_pagebuffer_ctl() interface. This export will be
used by the netvsc driver.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:55 +01:00
K. Y. Srinivasan f2eddbc9f1 Drivers: hv: vmbus: Fix a bug in rescind processing in vmbus_close_internal()
When a channel has been rescinded, the close operation is a noop.
Restructure the code so we deal with the rescind condition after
we properly cleanup the channel. I would like to thank
Dexuan Cui <decui@microsoft.com> for observing this problem.
The current code leaks memory when the channel is rescinded.

The current char-next branch is broken and this patch fixes
the bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:54 +01:00
Dan Carpenter 177757423f hv: vmbus: missing curly braces in vmbus_process_offer()
The indenting makes it clear that there were curly braces intended here.

Fixes: 2dd37cb815 ('Drivers: hv: vmbus: Handle both rescind and offer messages in the same context')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:54 +01:00
Nick Meier 5ef5b6927f Drivers: hv: vmbus: Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY
HV_CRASH_CTL_CRASH_NOTIFY is a 64 bit number.  Depending on the usage context,
the value may be truncated. This patch is in response from the following
email from Wu Fengguang <fengguang.wu@intel.com>:

    From: Wu Fengguang <fengguang.wu@intel.com>
    Subject:  [char-misc:char-misc-testing 25/45] drivers/hv/vmbus_drv.c:67:9: sparse:
              constant 0x8000000000000000 is so big it is unsigned long

    tree:   git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git char-misc-testing
    head:   b3de8e3719
    commit: 96c1d0581d [25/45] Drivers: hv: vmbus: Add support for VMBus panic notifier handler
    reproduce:
      # apt-get install sparse
      git checkout 96c1d0581d
      make ARCH=x86_64 allmodconfig
      make C=1 CF=-D__CHECK_ENDIAN__

    sparse warnings: (new ones prefixed by >>)

    drivers/hv/vmbus_drv.c:67:9: sparse: constant 0x8000000000000000 is so big it is unsigned long
    ...

Signed-off-by: Nick Meier <nmeier@microsoft.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:54 +01:00
Vitaly Kuznetsov 5abbbb75d7 Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural
Memory blocks can be onlined in random order. When this order is not natural
some memory pages are not onlined because of the redundant check in
hv_online_page().

Here is a real world scenario:
1) Host tries to hot-add the following (process_hot_add):
  pg_start=rg_start=0x48000, pfn_cnt=111616, rg_size=262144

2) This results in adding 4 memory blocks:
[  109.057866] init_memory_mapping: [mem 0x48000000-0x4fffffff]
[  114.102698] init_memory_mapping: [mem 0x50000000-0x57ffffff]
[  119.168039] init_memory_mapping: [mem 0x58000000-0x5fffffff]
[  124.233053] init_memory_mapping: [mem 0x60000000-0x67ffffff]
The last one is incomplete but we have special has->covered_end_pfn counter to
avoid onlining non-backed frames and hv_bring_pgs_online() function to bring
them online later on.

3) Now we have 4 offline memory blocks: /sys/devices/system/memory/memory9-12
$ for f in /sys/devices/system/memory/memory*/state; do echo $f `cat $f`; done | grep -v onlin
/sys/devices/system/memory/memory10/state offline
/sys/devices/system/memory/memory11/state offline
/sys/devices/system/memory/memory12/state offline
/sys/devices/system/memory/memory9/state offline

4) We bring them online in non-natural order:
$grep MemTotal /proc/meminfo
MemTotal:         966348 kB
$echo online > /sys/devices/system/memory/memory12/state && grep MemTotal /proc/meminfo
MemTotal:        1019596 kB
$echo online > /sys/devices/system/memory/memory11/state && grep MemTotal /proc/meminfo
MemTotal:        1150668 kB
$echo online > /sys/devices/system/memory/memory9/state && grep MemTotal /proc/meminfo
MemTotal:        1150668 kB

As you can see memory9 block gives us zero additional memory. We can also
observe a huge discrepancy between host- and guest-reported memory sizes.

The root cause of the issue is the redundant pg >= covered_start_pfn check (and
covered_start_pfn advancing) in hv_online_page(). When upper memory block in
being onlined before the lower one (memory12 and memory11 in the above case) we
advance the covered_start_pfn pointer and all memory9 pages do not pass the
check. If the assumption that host always gives us requests in sequential order
and pg_start always equals rg_start when the first request for the new HA
region is received (that's the case in my testing) is correct than we can get
rid of covered_start_pfn and pg >= start_pfn check in hv_online_page() is
sufficient.

The current char-next branch is broken and this patch fixes
the bug.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:54 +01:00
Vitaly Kuznetsov f3f6eb8087 Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure
When add_memory() fails the following BUG is observed:
[  743.646107] hv_balloon: hot_add memory failed error is -17
[  743.679973]
[  743.680930] =====================================
[  743.680930] [ BUG: bad unlock balance detected! ]
[  743.680930] 3.19.0-rc5_bug1131426+ #552 Not tainted
[  743.680930] -------------------------------------
[  743.680930] kworker/0:2/255 is trying to release lock (&dm_device.ha_region_mutex) at:
[  743.680930] [<ffffffff81aae5fe>] mutex_unlock+0xe/0x10
[  743.680930] but there are no more locks to release!

This happens as we don't acquire ha_region_mutex and hot_add_req() expects us
to as it does unconditional mutex_unlock(). Acquire the lock on the error path.

The current char-next branch is broken and this patch fixes
the bug.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:54 +01:00
K. Y. Srinivasan fde25d25db Drivers: hv: vmbus: Perform device register in the per-channel work element
This patch is a continuation of the rescind handling cleanup work. We cannot
block in the global message handling work context especially if we are blocking
waiting for the host to wake us up. I would like to thank
Dexuan Cui <decui@microsoft.com> for observing this problem.

The current char-next branch is broken and this patch fixes
the bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-25 11:53:53 +01:00
kbuild test robot ed99d846f1 mei: bus: () can be static
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 21:43:37 -08:00
K. Y. Srinivasan e9395e3f89 Drivers: hv: vmbus: Suport an API to send packet with additional control
Implement an API that gives additional control on the what VMBUS flags will be
set as well as if the host needs to be signalled. This API will be
useful for clients that want to batch up requests to the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
K. Y. Srinivasan 87e93d6170 Drivers: hv: vmbus: Suport an API to send pagebuffers with additional control
Implement an API for sending pagebuffers that gives more control to the client
in terms of setting the vmbus flags as well as deciding when to
notify the host. This will be useful for enabling batch processing.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
K. Y. Srinivasan a13e8bbe85 Drivers: hv: vmbus: Use a round-robin algorithm for picking the outgoing channel
The current algorithm for picking an outgoing channel was not distributing
the load well. Implement a simple round-robin scheme to ensure good
distribution of the outgoing traffic.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
Nick Meier 96c1d0581d Drivers: hv: vmbus: Add support for VMBus panic notifier handler
Hyper-V allows a guest to notify the Hyper-V host that a panic
condition occured.  This notification can include up to five 64
bit values.  These 64 bit values are written into crash MSRs.
Once the data has been written into the crash MSRs, the host is
then notified by writing into a Crash Control MSR.  On the Hyper-V
host, the panic notification data is captured in the Windows Event
log as a 18590 event.

Crash MSRs are defined in appendix H of the Hypervisor Top Level
Functional Specification.  At the time of this patch, v4.0 is the
current functional spec.  The URL for the v4.0 document is:

http://download.microsoft.com/download/A/B/4/AB43A34E-BDD0-4FA6-BDEF-79EEF16E880B/Hypervisor Top Level Functional Specification v4.0.docx

Signed-off-by: Nick Meier <nmeier@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
Vitaly Kuznetsov 530d15b907 Drivers: hv: hv_balloon: refuse to balloon below the floor
When host asks us to balloon up we need to be sure we're not committing suicide
by overballooning. Use already existent 'floor' metric as our lowest possible
value for free ram.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
Vitaly Kuznetsov 549fd280b1 Drivers: hv: hv_balloon: report offline pages as being used
When hot-added memory pages are not brought online or when some memory blocks
are sent offline the subsequent ballooning process kills the guest with OOM
killer. This happens as we don't report these pages as neither used nor free
and apparently host algorithm considers them as being unused. Keep track of
all online/offline operations and report all currently offline pages as being
used so host won't try to balloon them out.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
Vitaly Kuznetsov b05d8d9ef5 Drivers: hv: hv_balloon: eliminate the trylock path in acquire/release_region_mutex
When many memory regions are being added and automatically onlined the
following lockup is sometimes observed:

INFO: task udevd:1872 blocked for more than 120 seconds.
...
Call Trace:
 [<ffffffff816ec0bc>] schedule_timeout+0x22c/0x350
 [<ffffffff816eb98f>] wait_for_common+0x10f/0x160
 [<ffffffff81067650>] ? default_wake_function+0x0/0x20
 [<ffffffff816eb9fd>] wait_for_completion+0x1d/0x20
 [<ffffffff8144cb9c>] hv_memory_notifier+0xdc/0x120
 [<ffffffff816f298c>] notifier_call_chain+0x4c/0x70
...

When several memory blocks are going online simultaneously we got several
hv_memory_notifier() trying to acquire the ha_region_mutex. When this mutex is
being held by hot_add_req() all these competing acquire_region_mutex() do
mutex_trylock, fail, and queue themselves into wait_for_completion(..). However
when we do complete() from release_region_mutex() only one of them wakes up.
This could be solved by changing complete() -> complete_all() memory onlining
can be delayed as well, in that case we can still get several
hv_memory_notifier() runners at the same time trying to grab the mutex.
Only one of them will succeed and the others will hang for forever as
complete() is not being called. We don't see this issue often because we have
5sec onlining timeout in hv_mem_hot_add() and usually all udev events arrive
in this time frame.

Get rid of the trylock path, waiting on the mutex is supposed to provide the
required serialization.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:47 -08:00
K. Y. Srinivasan 37f492ce81 Drivers: hv: vmbus: Get rid of some unnecessary messages
Currently we log messages when either we are not able to map an ID to a
channel or when the channel does not have a callback associated
(in the channel interrupt handling path). These messages don't add
any value, get rid of them.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan 5380b383ba Drivers: hv: util: On device remove, close the channel after de-initializing the service
When the offer is rescinded, vmbus_close() can free up the channel;
deinitialize the service before closing the channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan 5b1e5b5307 Drivers: hv: vmbus: Remove the channel from the channel list(s) on failure
Properly rollback state in vmbus_pocess_offer() in the failure paths.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan 2dd37cb815 Drivers: hv: vmbus: Handle both rescind and offer messages in the same context
Execute both ressind and offer messages in the same work context. This serializes these
operations and naturally addresses the various corner cases.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan ed6cfcc5fd Drivers: hv: vmbus: Introduce a function to remove a rescinded offer
In response to a rescind message, we need to remove the channel and the
corresponding device. Cleanup this code path by factoring out the code
to remove a channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan d15a0301c4 Drivers: hv: vmbus: Properly handle child device remove
Handle the case when the device may be removed when the device has no driver
attached to it.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:31:02 -08:00
K. Y. Srinivasan 04653a009a Drivers: hv: vmbus: Add support for the NetworkDirect GUID
NetworkDirect is a service that supports guest RDMA.
Define the GUID for this service.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:08 -08:00
K. Y. Srinivasan 40384e4bbe Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open()
Correctly rollback state if the failure occurs after we have handed over
the ownership of the buffer to the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:08 -08:00
Nicholas Mc Guire b057b3ad16 hv: hv_balloon: match var type to return type of wait_for_completion
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:08 -08:00
Nicholas Mc Guire 51e5181d20 hv: channel_mgmt: match var type to return type of wait_for_completion
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:08 -08:00
Nicholas Mc Guire 08a9513f74 hv: channel: match var type to return type of wait_for_completion
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:07 -08:00
Dexuan Cui ac0d12b7ce hv: vmbus_open(): reset the channel state on ENOMEM
Without this patch, the state is put to CHANNEL_OPENING_STATE, and when
the driver is loaded next time, vmbus_open() will fail immediately due to
newchannel->state != CHANNEL_OPEN_STATE.

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:07 -08:00
Dexuan Cui 89f9f6796d hv: vmbus_post_msg: retry the hypercall on some transient errors
I got HV_STATUS_INVALID_CONNECTION_ID on Hyper-V 2008 R2 when keeping running
"rmmod hv_netvsc; modprobe hv_netvsc; rmmod hv_utils; modprobe hv_utils"
in a Linux guest. Looks the host has some kind of throttling mechanism if
some kinds of hypercalls are sent too frequently.
Without the patch, the driver can occasionally fail to load.

Also let's retry HV_STATUS_INSUFFICIENT_MEMORY, though we didn't get it
before.

Removed 'case -ENOMEM', since the hypervisor doesn't return this.

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:07 -08:00
Dexuan Cui 1896566372 hv: hv_util: move vmbus_open() to a later place
Before the line vmbus_open() returns, srv->util_cb can be already running
and the variables, like util_fw_version, are needed by the srv->util_cb.

So we have to make sure the variables are initialized before the vmbus_open().

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:07 -08:00
Vitaly Kuznetsov e086748c65 Drivers: hv: vmbus: Teardown clockevent devices on module unload
Newly introduced clockevent devices made it impossible to unload hv_vmbus
module as clockevents_config_and_register() takes additional reverence to
the module. To make it possible again we do the following:
- avoid setting dev->owner for clockevent devices;
- implement hv_synic_clockevents_cleanup() doing clockevents_unbind_device();
- call it from vmbus_exit().

In theory hv_synic_clockevents_cleanup() can be merged with hv_synic_cleanup(),
however, we call hv_synic_cleanup() from smp_call_function_single() and this
doesn't work for clockevents_unbind_device() as it does such call on its own. I
opted for a separate function.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:30:07 -08:00
Vitaly Kuznetsov e72e7ac583 drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload
SynIC has to be switched off when we unload the module, otherwise registered
memory pages can get corrupted after (as Hyper-V host still writes there) and
we see the following crashes for random processes:

[   89.116774] BUG: Bad page map in process sh  pte:4989c716 pmd:36f81067
[   89.159454] addr:0000000000437000 vm_flags:00000875 anon_vma:          (null) mapping:ffff88007bba55a0 index:37
[   89.226146] vma->vm_ops->fault: filemap_fault+0x0/0x410
[   89.257776] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[   89.297570] CPU: 0 PID: 215 Comm: sh Tainted: G    B          3.19.0-rc5_bug923184+ #488
[   89.353738] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   89.409138]  0000000000000000 000000004e083d7b ffff880036e9fa18 ffffffff81a68d31
[   89.468724]  0000000000000000 0000000000437000 ffff880036e9fa68 ffffffff811a1e3a
[   89.519233]  000000004989c716 0000000000000037 ffffea0001edc340 0000000000437000
[   89.575751] Call Trace:
[   89.591060]  [<ffffffff81a68d31>] dump_stack+0x45/0x57
[   89.625164]  [<ffffffff811a1e3a>] print_bad_pte+0x1aa/0x250
[   89.667234]  [<ffffffff811a2c95>] vm_normal_page+0x55/0xa0
[   89.703818]  [<ffffffff811a3105>] unmap_page_range+0x425/0x8a0
[   89.737982]  [<ffffffff811a3601>] unmap_single_vma+0x81/0xf0
[   89.780385]  [<ffffffff81184320>] ? lru_deactivate_fn+0x190/0x190
[   89.820130]  [<ffffffff811a4131>] unmap_vmas+0x51/0xa0
[   89.860168]  [<ffffffff811ad12c>] exit_mmap+0xac/0x1a0
[   89.890588]  [<ffffffff810763c3>] mmput+0x63/0x100
[   89.919205]  [<ffffffff811eba48>] flush_old_exec+0x3f8/0x8b0
[   89.962135]  [<ffffffff8123b5bb>] load_elf_binary+0x32b/0x1260
[   89.998581]  [<ffffffff811a14f2>] ? get_user_pages+0x52/0x60

hv_synic_cleanup() function exists but noone calls it now. Do the following:
- call hv_synic_cleanup() on each cpu from vmbus_exit();
- write global disable bit through MSR;
- use hv_synic_free_cpu() to avoid memory leask and code duplication.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:29:05 -08:00
Vitaly Kuznetsov 09a196288e Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work was done before we remove the queue, do that with
drain_workqueue(). We also need to avoid handling messages  which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:29:05 -08:00
Vitaly Kuznetsov adcde069a8 Drivers: hv: vmbus: avoid double kfree for device_obj
On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
   vmbus_device_unregister().
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload the similar one is expected if
we try to load hv_vmbus again.

Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:29:05 -08:00
Vitaly Kuznetsov bc63b6f634 Drivers: hv: vmbus: rename channel work queues
All channel work queues are named 'hv_vmbus_ctl', this makes them
indistinguishable in ps output and makes it hard to link to the corresponding
vmbus device. Rename them to hv_vmbus_ctl/N and make vmbus device names match,
e.g. now vmbus_1 device is served by hv_vmbus_ctl/1 work queue.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:29:05 -08:00
Vitaly Kuznetsov e513229b4c Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
the system freeze is observed. This happens due to the fact that on newer
hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
when vmbus is loaded until the issue is fixed host-side.

This patch also disables hibernation but it is OK as it is also broken (MCE
error is hit on resume). Suspend still works.

Tested with WS2008R2 and WS2012R2.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-01 19:29:05 -08:00
Linus Torvalds 4ba63072b9 Char / Misc patches for 3.20-rc1
Here's the big char/misc driver update for 3.20-rc1.
 
 Lots of little things in here, all described in the changelog.  Nothing
 major or unusual, except maybe the binder selinux stuff, which was all
 acked by the proper selinux people and they thought it best to come
 through this tree.
 
 All of this has been in linux-next with no reported issues for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlTgs80ACgkQMUfUDdst+yn86gCeMLbxANGExVLd+PR46GNsAUQb
 SJ4AmgIqrkIz+5LCwZWM02ldbYhPeBVf
 =lfmM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc patches from Greg KH:
 "Here's the big char/misc driver update for 3.20-rc1.

  Lots of little things in here, all described in the changelog.
  Nothing major or unusual, except maybe the binder selinux stuff, which
  was all acked by the proper selinux people and they thought it best to
  come through this tree.

  All of this has been in linux-next with no reported issues for a while"

* tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (90 commits)
  coresight: fix function etm_writel_cp14() parameter order
  coresight-etm: remove check for unknown Kconfig macro
  coresight: fixing CPU hwid lookup in device tree
  coresight: remove the unnecessary function coresight_is_bit_set()
  coresight: fix the debug AMBA bus name
  coresight: remove the extra spaces
  coresight: fix the link between orphan connection and newly added device
  coresight: remove the unnecessary replicator property
  coresight: fix the replicator subtype value
  pdfdocs: Fix 'make pdfdocs' failure for 'uio-howto.tmpl'
  mcb: Fix error path of mcb_pci_probe
  virtio/console: verify device has config space
  ti-st: clean up data types (fix harmless memory corruption)
  mei: me: release hw from reset only during the reset flow
  mei: mask interrupt set bit on clean reset bit
  extcon: max77693: Constify struct regmap_config
  extcon: adc-jack: Release IIO channel on driver remove
  extcon: Remove duplicated include from extcon-class.c
  Drivers: hv: vmbus: hv_process_timer_expiration() can be static
  Drivers: hv: vmbus: serialize Offer and Rescind offer
  ...
2015-02-15 10:48:44 -08:00
Lv Zheng a45de93eb1 ACPICA: Resources: Provide common part for struct acpi_resource_address structures.
struct acpi_resource_address and struct acpi_resource_extended_address64 share substracts
just at different offsets. To unify the parsing functions, OSPMs like Linux
need a new ACPI_ADDRESS64_ATTRIBUTE as their substructs, so they can
extract the shared data.

This patch also synchronizes the structure changes to the Linux kernel.
The usages are searched by matching the following keywords:
1. acpi_resource_address
2. acpi_resource_extended_address
3. ACPI_RESOURCE_TYPE_ADDRESS
4. ACPI_RESOURCE_TYPE_EXTENDED_ADDRESS
And we found and fixed the usages in the following files:
 arch/ia64/kernel/acpi-ext.c
 arch/ia64/pci/pci.c
 arch/x86/pci/acpi.c
 arch/x86/pci/mmconfig-shared.c
 drivers/xen/xen-acpi-memhotplug.c
 drivers/acpi/acpi_memhotplug.c
 drivers/acpi/pci_root.c
 drivers/acpi/resource.c
 drivers/char/hpet.c
 drivers/pnp/pnpacpi/rsparser.c
 drivers/hv/vmbus_drv.c

Build tests are passed with defconfig/allnoconfig/allyesconfig and
defconfig+CONFIG_ACPI=n.

Original-by: Thomas Gleixner <tglx@linutronix.de>
Original-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-26 16:09:56 +01:00
kbuild test robot d8a60e000c Drivers: hv: vmbus: hv_process_timer_expiration() can be static
drivers/hv/vmbus_drv.c:582:6: sparse: symbol 'hv_process_timer_expiration' was not declared. Should it be static?

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:24:42 -08:00
Vitaly Kuznetsov d7f2fbafb4 Drivers: hv: vmbus: serialize Offer and Rescind offer
Commit 4b2f9abea5 ("staging: hv: convert channel_mgmt.c to not call
osd_schedule_callback")' was written under an assumption that we never receive
Rescind offer while we're still processing the initial Offer request. However,
the issue we fixed in 04a258c162 could be caused by this assumption not
always being true.

In particular, we need to protect against the following:
1) Receiving a Rescind offer after we do queue_work() for processing an Offer
   request and before we actually enter vmbus_process_offer(). work.func points
   to vmbus_process_offer() at this moment and in vmbus_onoffer_rescind() we do
   another queue_work() without a check so we'll enter vmbus_process_offer()
   twice.
2) Receiving a Rescind offer after we enter vmbus_process_offer() and
   especially after we set >state = CHANNEL_OPEN_STATE. Many things can go
   wrong in that case, e.g. we can call free_channel() while we're still using
   it.

Implement the required protection by changing work->func at the very end of
vmbus_process_offer() and checking work->func in vmbus_onoffer_rescind(). In
case we receive rescind offer during or before vmbus_process_offer() is done
we set rescind flag to true and we check it at the end of vmbus_process_offer()
so such offer will not get lost.

Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:18:01 -08:00
Vitaly Kuznetsov 67fae053bf Drivers: hv: rename sc_lock to the more generic lock
sc_lock spinlock in struct vmbus_channel is being used to not only protect the
sc_list field, e.g. vmbus_open() function uses it to implement test-and-set
access to the state field. Rename it to the more generic 'lock' and add the
description.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:18:00 -08:00
Vitaly Kuznetsov 9c3a6f7e47 Drivers: hv: check vmbus_device_create() return value in vmbus_process_offer()
vmbus_device_create() result is not being checked in vmbus_process_offer() and
it can fail if kzalloc() fails. Add the check and do minor cleanup to avoid
additional duplication of "free_channel(); return;" block.

Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:18:00 -08:00
Dexuan Cui d9b1652947 hv: hv_fcopy: drop the obsolete message on transfer failure
In the case the user-space daemon crashes, hangs or is killed, we
need to down the semaphore, otherwise, after the daemon starts next
time, the obsolete data in fcopy_transaction.message or
fcopy_transaction.fcopy_msg will be used immediately.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:58 -08:00
K. Y. Srinivasan d61031ee8d Drivers: hv: vmbus: Support a vmbus API for efficiently sending page arrays
Currently, the API for sending a multi-page buffer over VMBUS is limited to
a maximum pfn array of MAX_MULTIPAGE_BUFFER_COUNT. This limitation is
not imposed by the host and unnecessarily limits the maximum payload
that can be sent. Implement an API that does not have this restriction.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:58 -08:00
K. Y. Srinivasan 9f52a16309 Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl()
Correctly compute the local (gpadl) handle.
I would like to thank Michael Brown <mcb30@ipxe.org> for seeing this bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:58 -08:00
K. Y. Srinivasan 4061ed9e2a Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on
Hyper-V.
In this version of the patch I have addressed Jason's review comments.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:57 -08:00
K. Y. Srinivasan ab3de22bb4 Drivers: hv: hv_balloon: Don't post pressure status from interrupt context
We currently release memory (balloon down) in the interrupt context and we also
post memory status while releasing memory. Rather than posting the status
in the interrupt context, wakeup the status posting thread to post the status.
This will address the inconsistent lock state that Sitsofe Wheeler <sitsofe@gmail.com>
reported:

http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:57 -08:00
K. Y. Srinivasan 22f88475b6 Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver
We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately
sized and aligned region and controlling the on-lining of pages within that region
based on the pages that the host wants us to online. We do this because the
granularity and alignment requirements in Linux are different from what Windows
expects. The state to manage the onlining of pages needs to be correctly
protected. Fix this bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:57 -08:00
K. Y. Srinivasan 79208c57da Drivers: hv: hv_balloon: Make adjustments in computing the floor
Make adjustments in computing the balloon floor. The current computation
of the balloon floor was not appropriate for virtual machines with more than
10 GB of assigned memory - we would get into situations where the host would
agressively balloon down the guest and leave the guest in an unusable state.
This patch fixes the issue by raising the floor.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:57 -08:00
K. Y. Srinivasan 87712bf81d Drivers: hv: vmbus: Use get_cpu() to get the current CPU
Replace calls for smp_processor_id() to get_cpu() to get the CPU ID of
the current CPU. In these instances, there is no correctness issue with
regards to preemption, we just need the current CPU ID.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-12 05:04:10 -08:00
Linus Torvalds 6ae840e7cc Char/Misc driver patches for 3.19-rc1
Here's the big char/misc driver update for 3.19-rc1
 
 Lots of little things all over the place in different drivers, and a new
 subsystem, "coresight" has been added.  Full details are in the
 shortlog.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlSODosACgkQMUfUDdst+ykSNwCfcqx1Z3rQzbLwSrR2sa1fV3Zb
 yEAAniJoLZ4ZkoQK4/1ozsFc31q+gXNm
 =/epr
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here's the big char/misc driver update for 3.19-rc1

  Lots of little things all over the place in different drivers, and a
  new subsystem, "coresight" has been added.  Full details are in the
  shortlog"

* tag 'char-misc-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (73 commits)
  parport: parport_pc, do not remove parent devices early
  spmi: Remove shutdown/suspend/resume kernel-doc
  carma-fpga-program: drop videobuf dependency
  carma-fpga: drop videobuf dependency
  carma-fpga-program.c: fix compile errors
  i8k: Fix temperature bug handling in i8k_get_temp()
  cxl: Name interrupts in /proc/interrupt
  CXL: Return error to PSL if IRQ demultiplexing fails & print clearer warning
  coresight-replicator: remove .owner field for driver
  coresight: fixed comments in coresight.h
  coresight: fix typo in comment in coresight-priv.h
  coresight: bindings for coresight drivers
  coresight: Adding ABI documentation
  w1: support auto-load of w1_bq27000 module.
  w1: avoid potential u16 overflow
  cn: verify msg->len before making callback
  mei: export fw status registers through sysfs
  mei: read and print all six FW status registers
  mei: txe: add cherrytrail device id
  mei: kill cached host and me csr values
  ...
2014-12-14 16:43:47 -08:00
Haiyang Zhang c3582a2c4d hyperv: Add support for vNIC hot removal
This patch adds proper handling of the vNIC hot removal event, which includes
a rescind-channel-offer message from the host side that triggers vNIC close and
removal. In this case, the notices to the host during close and removal is not
necessary because the channel is rescinded. This patch blocks these unnecessary
messages, and lets vNIC removal process complete normally.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08 20:24:11 -05:00
Dexuan Cui f671223847 hv: hv_balloon: avoid memory leak on alloc_error of 2MB memory block
If num_ballooned is not 0, we shouldn't neglect the
already-partially-allocated 2MB memory block(s).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-26 19:03:04 -08:00
Vitaly Kuznetsov 8d9560ebcc Drivers: hv: kvp,vss: Fast propagation of userspace communication failure
If we fail to send a message to userspace daemon with cn_netlink_send()
there is no need to wait for userspace to reply as it is not going to
happen. This happens when kvp or vss daemon is stopped after a successful
handshake. Report HV_E_FAIL immediately and cancel the timeout job so
host won't receive two failures.
Use pr_warn() for VSS and pr_debug() for KVP deliberately as VSS request
are rare and result in a failed backup. KVP requests are much more frequent
after a successful handshake so avoid flooding logs. It would be nice to
have an ability to de-negotiate with the host in case userspace daemon gets
disconnected so we won't receive new requests. But I'm not sure it is
possible.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-26 19:00:32 -08:00
Vitaly Kuznetsov 649142074d Drivers: hv: vss: Introduce timeout for communication with userspace
In contrast with KVP there is no timeout when communicating with
userspace VSS daemon. In case it gets stuck performing freeze/thaw
operation no message will be sent to the host so it will take very
long (around 10 minutes) before backup fails. Introduce 10 second
timeout using schedule_delayed_work().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-26 19:00:32 -08:00
Vitaly Kuznetsov 04a258c162 Drivers: hv: vmbus: Fix a race condition when unregistering a device
When build with Debug the following crash is sometimes observed:
Call Trace:
 [<ffffffff812b9600>] string+0x40/0x100
 [<ffffffff812bb038>] vsnprintf+0x218/0x5e0
 [<ffffffff810baf7d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff812bb4c1>] vscnprintf+0x11/0x30
 [<ffffffff8107a2f0>] vprintk+0xd0/0x5c0
 [<ffffffffa0051ea0>] ? vmbus_process_rescind_offer+0x0/0x110 [hv_vmbus]
 [<ffffffff8155c71c>] printk+0x41/0x45
 [<ffffffffa004ebac>] vmbus_device_unregister+0x2c/0x40 [hv_vmbus]
 [<ffffffffa0051ecb>] vmbus_process_rescind_offer+0x2b/0x110 [hv_vmbus]
...

This happens due to the following race: between 'if (channel->device_obj)' check
in vmbus_process_rescind_offer() and pr_debug() in vmbus_device_unregister() the
device can disappear. Fix the issue by taking an additional reference to the
device before proceeding to vmbus_device_unregister().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-07 10:21:44 -08:00
K. Y. Srinivasan 046c7911b2 Drivers: hv: vmbus: Enable interrupt driven flow control
In win8 we have a feature that allows for interrupt driven flow management
for host/guest communication. For instance, if the host were blocked because
there was no space available in the ringbuffer, the host could request that the
guest send an interrupt when space becomes available in the ringbuffer (when
the guest drains the ringbuffer).

While this feature was implemented in the guest a while ago, we had not
advertised that the guest supported this feature. This patch advertises
the support to the host.

For pre-win8 hosts, this has no effect since the size of the ringbuffer
control structure has not changed and all changes have been backward
compatible - unused/reserved space has been used to implement this
feature.

In this version of the patch I have cleaned up the commit log based on
feedback from Greg KH.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:31:22 -07:00
K. Y. Srinivasan 2115b5617a Drivers: hv: vmbus: Properly protect calls to smp_processor_id()
Disable preemption when sampling current processor ID when preemption
is otherwise possible.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:31:21 -07:00
K. Y. Srinivasan b29ef3546a Drivers: hv: vmbus: Cleanup hv_post_message()
Minimize failures in this function by pre-allocating the buffer
for posting messages. The hypercall for posting the message can fail
for a number of reasons:

        1. Transient resource related issues
        2. Buffer alignment
        3. Buffer cannot span a page boundry

We address issues 2 and 3 by preallocating a per-cpu page for the buffer.
Transient resource related failures are handled by retrying by the callers
of this function.

This patch is based on the investigation
done by Dexuan Cui <decui@microsoft.com>.

I would like to thank Sitsofe Wheeler <sitsofe@yahoo.com>
for reporting the issue and helping in debuggging.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:31:21 -07:00
K. Y. Srinivasan 98d731bb06 Drivers: hv: vmbus: Cleanup vmbus_close_internal()
Eliminate calls to BUG_ON() in vmbus_close_internal().
We have chosen to potentially leak memory, than crash the guest
in case of failures.

In this version of the patch I have addressed comments from
Dan Carpenter (dan.carpenter@oracle.com).

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:30:38 -07:00
K. Y. Srinivasan 45d727cee9 Drivers: hv: vmbus: Fix a bug in vmbus_open()
Fix a bug in vmbus_open() and properly propagate the error. I would
like to thank Dexuan Cui <decui@microsoft.com> for identifying the
issue.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:30:38 -07:00
K. Y. Srinivasan 72c6b71c24 Drivers: hv: vmbus: Cleanup vmbus_establish_gpadl()
Eliminate the call to BUG_ON() by waiting for the host to respond. We are
trying to reclaim the ownership of memory that was given to the host and so
we will have to wait until the host responds.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:30:38 -07:00
K. Y. Srinivasan 66be653083 Drivers: hv: vmbus: Cleanup vmbus_teardown_gpadl()
Eliminate calls to BUG_ON() by properly handling errors. In cases where
rollback is possible, we will return the appropriate error to have the
calling code decide how to rollback state. In the case where we are
transferring ownership of the guest physical pages to the host,
we will wait for the host to respond.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:30:38 -07:00
K. Y. Srinivasan fdeebcc622 Drivers: hv: vmbus: Cleanup vmbus_post_msg()
Posting messages to the host can fail because of transient resource
related failures. Correctly deal with these failures and increase the
number of attempts to post the message before giving up.

In this version of the patch, I have normalized the error code to
Linux error code.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23 23:30:38 -07:00
Linus Torvalds 2521129a6d Char / Misc driver patches for 3.17-rc1
Here's the big driver misc / char pull request for 3.17-rc1.
 
 Lots of things in here, the thunderbolt support for Apple laptops, some
 other new drivers, testing fixes, and other good things.  All have been
 in linux-next for a long time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlPf1LcACgkQMUfUDdst+ymaVwCgqMrKFmpduBufOSFROhxlfB5Q
 ajsAoNDmIn3pgla+kj23Y5ib20aMi++s
 =IdIr
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc driver patches from Greg KH:
 "Here's the big driver misc / char pull request for 3.17-rc1.

  Lots of things in here, the thunderbolt support for Apple laptops,
  some other new drivers, testing fixes, and other good things.  All
  have been in linux-next for a long time"

* tag 'char-misc-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (119 commits)
  misc: bh1780: Introduce the use of devm_kzalloc
  Lattice ECP3 FPGA: Correct endianness
  drivers/misc/ti-st: Load firmware from ti-connectivity directory.
  dt-bindings: extcon: Add support for SM5502 MUIC device
  extcon: sm5502: Change internal hardware switch according to cable type
  extcon: sm5502: Detect cable state after completing platform booting
  extcon: sm5502: Add support new SM5502 extcon device driver
  extcon: arizona: Get MICVDD against extcon device
  extcon: Remove unnecessary OOM messages
  misc: vexpress: Fix sparse non static symbol warnings
  mei: drop unused hw dependent fw status functions
  misc: bh1770glc: Use managed functions
  pcmcia: remove DEFINE_PCI_DEVICE_TABLE usage
  misc: remove DEFINE_PCI_DEVICE_TABLE usage
  ipack: Replace DEFINE_PCI_DEVICE_TABLE macro use
  drivers/char/dsp56k.c: drop check for negativity of unsigned parameter
  mei: fix return value on disconnect timeout
  mei: don't schedule suspend in pm idle
  mei: start disconnect request timer consistently
  mei: reset client connection state on timeout
  ...
2014-08-04 17:32:24 -07:00
Dexuan Cui 2ef82d24f4 Drivers: hv: hv_fcopy: fix a race condition for SMP guest
We should schedule the 5s "timer work" before starting the data transfer,
otherwise, the data transfer code may finish so fast on another
virtual cpu that when the code(fcopy_write()) trying to cancel the 5s
"timer work" can occasionally fail because the "timer work" may haven't
been scheduled yet and as a result the fcopy process will be aborted
wrongly by fcopy_work_func() in 5s.

Thank Liz Zhang <lizzha@microsoft.com> for the initial investigation
on the bug.

This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1118123

Tested-by: Liz Zhang <lizzha@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-17 18:43:10 -07:00
Greg Kroah-Hartman 9f48c89862 Merge 3.16-rc5 into char-misc-next
This resolves a number of merge issues with changes in this tree and
Linus's tree at the same time.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-13 15:26:47 -07:00
K. Y. Srinivasan 9bd2d0dfe4 Drivers: hv: util: Fix a bug in the KVP code
Add code to poll the channel since we process only one message
at a time and the host may not interrupt us. Also increase the
receive buffer size since some KVP messages are close to 8K bytes in size.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09 14:34:35 -07:00
K. Y. Srinivasan affb1aff30 Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code
Starting with Win8, we have implemented several optimizations to improve the
scalability and performance of the VMBUS transport between the Host and the
Guest. Some of the non-performance critical services cannot leverage these
optimization since they only read and process one message at a time.
Make adjustments to the callback dispatch code to account for the way
non-performance critical drivers handle reading of the channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09 14:34:35 -07:00
Jason Wang 7a446d635d hyperv: remove meaningless pr_err() in vmbus_recvpacket_raw()
All its callers depends on the return value of -ENOBUFS to reallocate a
bigger buffer and retry the receiving. So there's no need to call
pr_err() here since it was not a real issue, otherwise syslog will be
flooded by this false warning.

Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09 14:21:27 -07:00
Linus Torvalds f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
stephen hemminger 1b9d48f2a5 hyper-v: make uuid_le const
The uuid structure could be managed as a const in several places.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03 18:18:38 -07:00
Radim Krčmář a100d88df1 hv: use correct order when freeing monitor_pages
We try to free two pages when only one has been allocated.
Cleanup path is unlikely, so I haven't found any trace that would fit,
but I hope that free_pages_prepare() does catch it.

Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-28 13:45:15 -07:00
K. Y. Srinivasan ae339336dc Drivers: hv: balloon: Ensure pressure reports are posted regularly
The current code posts periodic memory pressure status from a dedicated thread.
Under some conditions, especially when we are releasing a lot of memory into
the guest, we may not send timely pressure reports back to the host. Fix this
issue by reporting pressure in all contexts that can be active in this driver.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-03 19:25:17 -04:00
Tobias Klauser 24b8a406bf hv: Remove unnecessary comparison of unsigned against 0
pfncount is of type u32 and thus can never be smaller than 0.

Found by the coverity scanner, CID 143213.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-03 19:24:26 -04:00
K. Y. Srinivasan 3a28fa35d6 Drivers: hv: vmbus: Implement per-CPU mapping of relid to channel
Currently the mapping of the relID to channel is done under the protection of a
single spin lock. Starting with ws2012, each channel is bound to a specific VCPU
in the guest. Use this binding to eliminate the spin lock by setting up
per-cpu state for mapping relId to the channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-03 19:24:26 -04:00
K. Y. Srinivasan d3ba720dd5 Drivers: hv: Eliminate the channel spinlock in the callback path
By ensuring that we set the callback handler to NULL in the channel close
path on the same CPU that the channel is bound to, we can eliminate this lock
acquisition and release in a performance critical path.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-03 19:24:26 -04:00
K. Y. Srinivasan 03367ef5ea Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
Only ws2012r2 hosts support the ability to reconnect to the host on VMBUS. This functionality
is needed by kexec in Linux. To use this functionality we need to negotiate version 3.0 of the
VMBUS protocol.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>        [3.9+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-16 14:14:07 -07:00
Linus Torvalds 675c354a95 Char/Misc driver patches for 3.15-rc1
Here's the big char/misc driver updates for 3.15-rc1.
 
 Lots of various things here, including the new mcb driver subsystem.
 
 All of these have been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlM7ArIACgkQMUfUDdst+ylS+gCfcJr0Zo2v5aWnqD7rFtFETmFI
 LhcAoNTQ4cvlVdxnI0driWCWFYxLj6at
 =aj+L
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver patches from Greg KH:
 "Here's the big char/misc driver updates for 3.15-rc1.

  Lots of various things here, including the new mcb driver subsystem.

  All of these have been in linux-next for a while"

* tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (118 commits)
  extcon: Move OF helper function to extcon core and change function name
  extcon: of: Remove unnecessary function call by using the name of device_node
  extcon: gpio: Use SIMPLE_DEV_PM_OPS macro
  extcon: palmas: Use SIMPLE_DEV_PM_OPS macro
  mei: don't use deprecated DEFINE_PCI_DEVICE_TABLE macro
  mei: amthif: fix checkpatch error
  mei: client.h fix checkpatch errors
  mei: use cl_dbg where appropriate
  mei: fix Unnecessary space after function pointer name
  mei: report consistently copy_from/to_user failures
  mei: drop pr_fmt macros
  mei: make me hw headers private to me hw.
  mei: fix memory leak of pending write cb objects
  mei: me: do not reset when less than expected data is received
  drivers: mcb: Fix build error discovered by 0-day bot
  cs5535-mfgpt: Simplify dependencies
  spmi: pm: drop bus-level PM suspend/resume routines
  spmi: pmic_arb: make selectable on ARCH_QCOM
  Drivers: hv: vmbus: Increase the limit on the number of pfns we can handle
  pch_phub: Report error writing MAC back to user
  ...
2014-04-01 16:13:21 -07:00
Thomas Gleixner 76d388cd72 x86: hyperv: Fixup the (brain) damage caused by the irq cleanup
Compiling last minute changes without setting the proper config
options is not really clever.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-03-05 13:42:14 +01:00
Thomas Gleixner 1aec169673 x86: Hyperv: Cleanup the irq mess
The vmbus/hyperv interrupt handling is another complete trainwreck and
probably the worst of all currently in tree.

If CONFIG_HYPERV=y then the interrupt delivery to the vmbus happens
via the direct HYPERVISOR_CALLBACK_VECTOR. So far so good, but:

  The driver requests first a normal device interrupt. The only reason
  to do so is to increment the interrupt stats of that device
  interrupt. For no reason it also installs a private flow handler.

  We have proper accounting mechanisms for direct vectors, but of
  course it's too much effort to add that 5 lines of code.

  Aside of that the alloc_intr_gate() is not protected against
  reallocation which makes module reload impossible.

Solution to the problem is simple to rip out the whole mess and
implement it correctly.

First of all move all that code to arch/x86/kernel/cpu/mshyperv.c and
merily install the HYPERVISOR_CALLBACK_VECTOR with proper reallocation
protection and use the proper direct vector accounting mechanism.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linuxdrivers <devel@linuxdriverproject.org>
Cc: x86 <x86@kernel.org>
Link: http://lkml.kernel.org/r/20140223212739.028307673@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-03-04 17:37:54 +01:00
Gerd Hoffmann 90eedf0cbe vmbus: use resource for hyperv mmio region
Use a resource for the hyperv mmio region instead of start/size
variables.  Register the region properly so it shows up in
/proc/iomem.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-24 16:15:05 -08:00
Gerd Hoffmann 4eb923f878 vmbus: add missing breaks
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-24 16:15:05 -08:00
Fengguang Wu c765a6dfad Drivers: hv: fcopy_open() can be static
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-18 12:24:55 -08:00
K. Y. Srinivasan 01325476d6 Drivers: hv: Implement the file copy service
Implement the file copy service for Linux guests on Hyper-V. This permits the
host to copy a file (over VMBUS) into the guest. This facility is part of
"guest integration services" supported on the Windows platform.
Here is a link that provides additional details on this functionality:

http://technet.microsoft.com/en-us/library/dn464282.aspx

In V1 version of the patch I have addressed comments from
Olaf Hering <olaf@aepfle.de> and Dan Carpenter <dan.carpenter@oracle.com>

In V2 version of this patch I did some minor cleanup (making some globals
static). In V4 version of the patch I have addressed all of Olaf's
most recent set of comments/concerns.

In V5 version of the patch I had addressed Greg's most recent comments.
I would like to thank Greg for suggesting that I use misc device; it has
significantly simplified the code.

In V6 version of the patch I have cleaned up error message based on Olaf's
comments. I have also rebased the patch based on the current tip.

In this version of the patch, I have addressed the latest comments from Greg.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-18 10:53:48 -08:00
Greg Kroah-Hartman ba4b60e85d Merge 3.14-rc3 into char-misc-next
We need the fixes here for future mei and other patches.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-18 08:09:40 -08:00
K. Y. Srinivasan 5dba4c56df Drivers: hv: Ballon: Make pressure posting thread sleep interruptibly
The non-interruptible sleep of the memory pressure posting thread
results in higher reported load average. Make this sleep interruptible.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-15 12:08:38 -08:00
David Fries ac8f73305e connector: add portid to unicast in addition to broadcasting
This allows replying only to the requestor portid while still
supporting broadcasting.  Pass 0 to portid for the previous behavior.

Signed-off-by: David Fries <David@Fries.net>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 15:40:17 -08:00
K. Y. Srinivasan 011a7c3cc3 Drivers: hv: vmbus: Cleanup the packet send path
The current channel code is using scatterlist abstraction to pass data to the
ringbuffer API on the send path. This causes unnecessary translations
between virtual and physical addresses. Fix this.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 15:22:40 -08:00
K. Y. Srinivasan 90f3453585 Drivers: hv: vmbus: Extract the mmio information from DSDT
On Gen2 firmware, Hyper-V does not emulate the PCI bus. However, the MMIO
information is packaged up in DSDT. Extract this information and export it
for use by the synthetic framebuffer driver. This is the only driver that
needs this currently.

In this version of the patch mmio, I have updated the hyperv header file
(linux/hyperv.h) with mmio definitions.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 15:21:48 -08:00
K. Y. Srinivasan 269f979467 Drivers: hv: vmbus: Don't timeout during the initial connection with host
When the guest attempts to connect with the host when there may already be a
connection with the host (as would be the case during the kdump/kexec path),
it is difficult to guarantee timely response from the host. Starting with
WS2012 R2, the host supports this ability to re-connect with the host
(explicitly to support kexec). Prior to responding to the guest, the host
needs to ensure that device states based on the previous connection to
the host have been properly torn down. This may introduce unbounded delays.
To deal with this issue, don't do a timed wait during the initial connect
with the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 08:27:34 -08:00
K. Y. Srinivasan e28bab4828 Drivers: hv: vmbus: Specify the target CPU that should receive notification
During the initial VMBUS connect phase, starting with WS2012 R2, we should
specify the VPCU in the guest that should receive the notification. Fix this
issue. This fix is required to properly connect to the host in the kexeced
kernel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>        [3.9+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 08:27:34 -08:00
Haiyang Zhang b679ef73ed hyperv: Add support for physically discontinuous receive buffer
This will allow us to use bigger receive buffer, and prevent allocation failure
due to fragmented memory.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27 16:40:45 -08:00
Linus Torvalds 09da8dfa98 ACPI and power management updates for 3.14-rc1
- ACPI core changes to make it create a struct acpi_device object for every
    device represented in the ACPI tables during all namespace scans regardless
    of the current status of that device.  In accordance with this, ACPI hotplug
    operations will not delete those objects, unless the underlying ACPI tables
    go away.
 
  - On top of the above, new sysfs attribute for ACPI device objects allowing
    user space to check device status by triggering the execution of _STA for
    its ACPI object.  From Srinivas Pandruvada.
 
  - ACPI core hotplug changes reducing code duplication, integrating the
    PCI root hotplug with the core and reworking container hotplug.
 
  - ACPI core simplifications making it use ACPI_COMPANION() in the code
    "glueing" ACPI device objects to "physical" devices.
 
  - ACPICA update to upstream version 20131218.  This adds support for the
    DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug
    facilities.  From Bob Moore, Lv Zheng and Betty Dall.
 
  - Init code change to carry out the early ACPI initialization earlier.
    That should allow us to use ACPI during the timekeeping initialization
    and possibly to simplify the EFI initialization too.  From Chun-Yi Lee.
 
  - Clenups of the inclusions of ACPI headers in many places all over from
    Lv Zheng and Rashika Kheria (work in progress).
 
  - New helper for ACPI _DSM execution and rework of the code in drivers
    that uses _DSM to execute it via the new helper.  From Jiang Liu.
 
  - New Win8 OSI blacklist entries from Takashi Iwai.
 
  - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo,
    Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria,
    Tang Chen, Zhang Rui.
 
  - intel_pstate driver updates, including proper Baytrail support, from
    Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra.
 
  - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski.
 
  - powernow-k6 cpufreq driver fixes from Mikulas Patocka.
 
  - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown.
 
  - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias,
    Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar.
 
  - cpuidle cleanups from Bartlomiej Zolnierkiewicz.
 
  - Support for hibernation APM events from Bin Shi.
 
  - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled
    during thaw transitions from Bjørn Mork.
 
  - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson.
 
  - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa,
    Rashika Kheria.
 
  - New tool for profiling system suspend from Todd E Brandt and a cpupower
    tool cleanup from One Thousand Gnomes.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ
 zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a
 TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0
 cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn
 DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs
 ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs
 2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/
 uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC
 uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT
 NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj
 32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo
 IYvE6WhQZU6L6fptGHFC
 =dRf6
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:
 "As far as the number of commits goes, the top spot belongs to ACPI
  this time with cpufreq in the second position and a handful of PM
  core, PNP and cpuidle updates.  They are fixes and cleanups mostly, as
  usual, with a couple of new features in the mix.

  The most visible change is probably that we will create struct
  acpi_device objects (visible in sysfs) for all devices represented in
  the ACPI tables regardless of their status and there will be a new
  sysfs attribute under those objects allowing user space to check that
  status via _STA.

  Consequently, ACPI device eject or generally hot-removal will not
  delete those objects, unless the table containing the corresponding
  namespace nodes is unloaded, which is extremely rare.  Also ACPI
  container hotplug will be handled quite a bit differently and cpufreq
  will support CPU boost ("turbo") generically and not only in the
  acpi-cpufreq driver.

  Specifics:

   - ACPI core changes to make it create a struct acpi_device object for
     every device represented in the ACPI tables during all namespace
     scans regardless of the current status of that device.  In
     accordance with this, ACPI hotplug operations will not delete those
     objects, unless the underlying ACPI tables go away.

   - On top of the above, new sysfs attribute for ACPI device objects
     allowing user space to check device status by triggering the
     execution of _STA for its ACPI object.  From Srinivas Pandruvada.

   - ACPI core hotplug changes reducing code duplication, integrating
     the PCI root hotplug with the core and reworking container hotplug.

   - ACPI core simplifications making it use ACPI_COMPANION() in the
     code "glueing" ACPI device objects to "physical" devices.

   - ACPICA update to upstream version 20131218.  This adds support for
     the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves
     debug facilities.  From Bob Moore, Lv Zheng and Betty Dall.

   - Init code change to carry out the early ACPI initialization
     earlier.  That should allow us to use ACPI during the timekeeping
     initialization and possibly to simplify the EFI initialization too.
     From Chun-Yi Lee.

   - Clenups of the inclusions of ACPI headers in many places all over
     from Lv Zheng and Rashika Kheria (work in progress).

   - New helper for ACPI _DSM execution and rework of the code in
     drivers that uses _DSM to execute it via the new helper.  From
     Jiang Liu.

   - New Win8 OSI blacklist entries from Takashi Iwai.

   - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun
     Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava,
     Rashika Kheria, Tang Chen, Zhang Rui.

   - intel_pstate driver updates, including proper Baytrail support,
     from Dirk Brandewie and intel_pstate documentation from Ramkumar
     Ramachandra.

   - Generic CPU boost ("turbo") support for cpufreq from Lukasz
     Majewski.

   - powernow-k6 cpufreq driver fixes from Mikulas Patocka.

   - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark
     Brown.

   - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John
     Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh
     Kumar.

   - cpuidle cleanups from Bartlomiej Zolnierkiewicz.

   - Support for hibernation APM events from Bin Shi.

   - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC
     disabled during thaw transitions from Bjørn Mork.

   - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf
     Hansson.

   - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente
     Kurusa, Rashika Kheria.

   - New tool for profiling system suspend from Todd E Brandt and a
     cpupower tool cleanup from One Thousand Gnomes"

* tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits)
  thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412)
  cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
  Documentation: cpufreq / boost: Update BOOST documentation
  cpufreq: exynos: Extend Exynos cpufreq driver to support boost
  cpufreq / boost: Kconfig: Support for software-managed BOOST
  acpi-cpufreq: Adjust the code to use the common boost attribute
  cpufreq: Add boost frequency support in core
  intel_pstate: Add trace point to report internal state.
  cpufreq: introduce cpufreq_generic_get() routine
  ARM: SA1100: Create dummy clk_get_rate() to avoid build failures
  cpufreq: stats: create sysfs entries when cpufreq_stats is a module
  cpufreq: stats: free table and remove sysfs entry in a single routine
  cpufreq: stats: remove hotplug notifiers
  cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
  cpufreq: speedstep: remove unused speedstep_get_state
  platform: introduce OF style 'modalias' support for platform bus
  PM / tools: new tool for suspend/resume performance optimization
  ACPI: fix module autoloading for ACPI enumerated devices
  ACPI: add module autoloading support for ACPI enumerated devices
  ACPI: fix create_modalias() return value handling
  ...
2014-01-24 15:51:02 -08:00
Rashika Kheria 8712954d15 drivers: hv: Mark the function hv_synic_free_cpu() as static in hv.c
This patch marks the function hv_synic_free_cpu() as static in hv.c
because it is not used outside this file.

Thus, it also eliminates the following warning in hv.c:
drivers/hv/hv.c:304:6: warning: no previous prototype for ‘hv_synic_free_cpu’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-18 16:41:52 -08:00
Lv Zheng 8b48463f89 ACPI: Clean up inclusions of ACPI header files
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
<acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
inclusions and remove some inclusions of those files that aren't
necessary.

First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
should not be included directly from any files that are built for
CONFIG_ACPI unset, because that generally leads to build warnings about
undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
<linux/acpi.h> includes those files and for CONFIG_ACPI unset it
provides stub ACPI symbols to be used in that case.

Second, there are ordering dependencies between those files that always
have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
latter depends on are always there.  And <acpi/acpi.h> which provides
basic ACPICA type declarations should always be included prior to any other
ACPI headers in CONFIG_ACPI builds.  That also is taken care of including
<linux/acpi.h> as appropriate.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-07 01:03:14 +01:00
K. Y. Srinivasan 565ce6422f Drivers: hv: vmbus: Fix a bug in channel rescind code
Rescind of subchannels were not being correctly handled. Fix the bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>        [3.11+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-19 19:53:46 -07:00
Felipe Pena fdf91dae6f drivers: hv: Fix wrong check for synic_event_page
The check for calling free_page() on hv_context.synic_event_page[cpu] is the
same for hv_context.synic_message_page[cpu], like a copy-paste error.

Signed-off-by: Felipe Pena <felipensp@gmail.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-16 14:00:09 -07:00
Greg Kroah-Hartman d717349368 Merge 3.12-rc3 into char-misc-next
We need/want the mei fixes in here so we can apply other updates that
are depending on them.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-29 18:27:03 -07:00
Dan Carpenter 33b06938cf hv: vmbus: fix vmbus_recvpacket_raw() return code
Don't return success if the buffer has not been initialized.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 16:34:55 -07:00