Commit Graph

18660 Commits

Author SHA1 Message Date
Jan Kiszka 6dfacadd58 KVM: nVMX: Add support for activity state HLT
We can easily emulate the HLT activity state for L1: If it decides that
L2 shall be halted on entry, just invoke the normal emulation of halt
after switching to L2. We do not depend on specific host features to
provide this, so we can expose the capability unconditionally.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-12 10:49:56 +01:00
Gleb Natapov 2961e8764f KVM: VMX: shadow VM_(ENTRY|EXIT)_CONTROLS vmcs field
VM_(ENTRY|EXIT)_CONTROLS vmcs fields are read/written on each guest
entry but most times it can be avoided since values do not changes.
Keep fields copy in memory to avoid unnecessary reads from vmcs.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-12 10:49:49 +01:00
Peter Zijlstra be5e610c0f math64: Add mul_u64_u32_shr()
Introduce mul_u64_u32_shr() as proposed by Andy a while back; it
allows using 64x64->128 muls on 64bit archs and recent GCC
which defines __SIZEOF_INT128__ and __int128.

(This new method will be used by the scheduler.)

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: fweisbec@gmail.com
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-hxjoeuzmrcaumR0uZwjpe2pv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-11 15:52:34 +01:00
Peter Zijlstra ba1f14fbe7 sched: Remove PREEMPT_NEED_RESCHED from generic code
While hunting a preemption issue with Alexander, Ben noticed that the
currently generic PREEMPT_NEED_RESCHED stuff is horribly broken for
load-store architectures.

We currently rely on the IPI to fold TIF_NEED_RESCHED into
PREEMPT_NEED_RESCHED, but when this IPI lands while we already have
a load for the preempt-count but before the store, the store will erase
the PREEMPT_NEED_RESCHED change.

The current preempt-count only works on load-store archs because
interrupts are assumed to be completely balanced wrt their preempt_count
fiddling; the previous preempt_count load will match the preempt_count
state after the interrupt and therefore nothing gets lost.

This patch removes the PREEMPT_NEED_RESCHED usage from generic code and
pushes it into x86 arch code; the generic code goes back to relying on
TIF_NEED_RESCHED.

Boot tested on x86_64 and compile tested on ppc64.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-and-Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131128132641.GP10022@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-11 15:52:32 +01:00
Matthew Garrett 04bf9ba720 x86, efi: Don't use (U)EFI time services on 32 bit
UEFI time services are often broken once we're in virtual mode. We were
already refusing to use them on 64-bit systems, but it turns out that
they're also broken on some 32-bit firmware, including the Dell Venue.
Disable them for now, we can revisit once we have the 1:1 mappings code
incorporated.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Link: http://lkml.kernel.org/r/1385754283-2464-1-git-send-email-matthew.garrett@nebula.com
Cc: <stable@vger.kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-12-10 15:02:34 -08:00
Tejun Heo 13ccb93f41 Merge branch 'driver-core-linus' into driver-core-next
a8b1474442 ("sysfs: give different locking key to regular and bin
files") in driver-core-linus modifies sysfs_open_file() so that it
gives out different locking classes to sysfs_open_files depending on
whether the file is bin or not.  Due to the massive kernfs
reorganization in driver-core-next, this naturally causes merge
conflict in fs/sysfs/file.c.

Due to the way things are split between kernfs and sysfs in
driver-core-next, the same fix can't easily be applied to
driver-core-next.  This merge simply ignores the offending commit.  A
following patch will implement a separate fix for the issue.

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-12-10 08:44:37 -05:00
cpw 3eae49ca89 x86/UV: Fix NULL pointer dereference in uv_flush_tlb_others() if the 'nobau' boot option is used
The SGI UV tlb shootdown code panics the system with a NULL
pointer deference if 'nobau' is specified on the boot
commandline.

uv_flush_tlb_other() gets called for every flush, whether the
BAU is disabled or not.  It should not be keeping the s_enters
statistic while the BAU is disabled.

The panic occurs because during initialization
init_per_cpu_tunables() does not set the bcp->statp pointer if
'nobau' was specified.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@vger.kernel.org> # 3.12.x
Link: http://lkml.kernel.org/r/E1VnzBi-0005yF-MU@eag09.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-10 10:06:00 +01:00
H. Peter Anvin 8b3b005d67 x86, build: Pass in additional -mno-mmx, -mno-sse options
In checkin

    5551a34e5a x86-64, build: Always pass in -mno-sse

we unconditionally added -mno-sse to the main build, to keep newer
compilers from generating SSE instructions from autovectorization.
However, this did not extend to the special environments
(arch/x86/boot, arch/x86/boot/compressed, and arch/x86/realmode/rm).
Add -mno-sse to the compiler command line for these environments, and
add -mno-mmx to all the environments as well, as we don't want a
compiler to generate MMX code either.

This patch also removes a $(cc-option) call for -m32, since we have
long since stopped supporting compilers too old for the -m32 option,
and in fact hardcode it in other places in the Makefiles.

Reported-by: Kevin B. Smith <kevin.b.smith@intel.com>
Cc: Sunil K. Pandey <sunil.k.pandey@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: H. J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/n/tip-j21wzqv790q834n7yc6g80j1@git.kernel.org
Cc: <stable@vger.kernel.org> # build fix only
2013-12-09 15:52:39 -08:00
Yijing Wang 894d334378 x86/PCI: Use dev_is_pci() to identify PCI devices
Use dev_is_pci() instead of checking bus type directly.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-12-09 16:50:12 -07:00
Takashi Iwai 75da02b29f microcode: Use request_firmware_direct()
Use the new helper, request_firmware_direct(), for avoiding the
lengthy timeout of non-existing firmware loads.  Especially the Intel
microcode driver suffers from this problem because each CPU triggers
the f/w loading, thus it ends up taking (literally) hours with many
cores.

Tested-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08 18:22:32 -08:00
Qiaowei Ren e7d820a5e5 x86, xsave: Support eager-only xsave features, add MPX support
Some features, like Intel MPX, work only if the kernel uses eagerfpu
model.  So we should force eagerfpu on unless the user has explicitly
disabled it.

Add definitions for Intel MPX and add it to the supported list.

[ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ]

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115@SHSMSX102.ccr.corp.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-12-06 17:17:42 -08:00
Lv Zheng c099eacbca SFI / ACPI: Fix warnings reported during builds with W=1
The following warnings can be seen in W=1 builds, because the original
sfi_acpi.[ch] header inclusions are incorrect:

include/linux/sfi_acpi.h:72:2: error: implicit declaration of function 'acpi_table_parse' [-Werror=implicit-function-declaration]
drivers/sfi/sfi_acpi.c:154:5: warning: no previous prototype for 'sfi_acpi_table_parse' [-Wmissing-prototypes]

Fix linux/sfi_acpi.h and modify drivers/sfi/sfi_acpi.c accordingly.

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[rjw: Subject and changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-07 01:24:33 +01:00
Lv Zheng 8b48463f89 ACPI: Clean up inclusions of ACPI header files
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
<acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
inclusions and remove some inclusions of those files that aren't
necessary.

First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
should not be included directly from any files that are built for
CONFIG_ACPI unset, because that generally leads to build warnings about
undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
<linux/acpi.h> includes those files and for CONFIG_ACPI unset it
provides stub ACPI symbols to be used in that case.

Second, there are ordering dependencies between those files that always
have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
latter depends on are always there.  And <acpi/acpi.h> which provides
basic ACPICA type declarations should always be included prior to any other
ACPI headers in CONFIG_ACPI builds.  That also is taken care of including
<linux/acpi.h> as appropriate.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-07 01:03:14 +01:00
Qiaowei Ren 191f57c137 x86, cpufeature: Define the Intel MPX feature flag
Define the Intel MPX (Memory Protection Extensions) CPU feature flag
in the cpufeature list.

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei.ren@intel.com
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-12-06 10:21:44 -08:00
Maria Dimakopoulou cf30d52e2d perf/x86: Fix constraint table end marker bug
The EVENT_CONSTRAINT_END() macro defines the end marker as
a constraint with a weight of zero. This was all fine
until we blacklisted the corrupting memory events on
Intel IvyBridge. These events are blacklisted by using
a counter bitmask of zero. Thus, they also get a constraint
weight of zero.

The iteration macro: for_each_constraint tests the weight==0.
Therefore, it was stopping at the first blacklisted event, i.e.,
0xd0. The corrupting events were therefore considered as
unconstrained and were scheduled on any of the generic counters.

This patch fixes the end marker to have a weight of -1. With
this, the blacklisted events get an empty constraint and cannot
be scheduled which is what we want for now.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/20131204232437.GA10689@starlight
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-05 10:02:30 +01:00
Linus Torvalds 53c6de5026 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 and EFI fixes from Peter Anvin:
 "Half of these are EFI-related:

  The by far biggest change is the change to hold off the deletion of a
  sysfs entry while a backend scan is in progress.  This is to avoid
  calling kmemdup() while under a spinlock.

  The other major change is for each entry in the EFI pstore backend to
  get a unique identifier, as required by the pstore filesystem proper.

  The other changes are:

  A fix to the recent consolidation and optimization of using "asm goto"
  with read-modify-write operation, which broke the bitops; specifically
  in such a way that we could end up generating invalid code.

  A build hack to make sure we compile with -mno-sse.  icc, and most
  likely future versions of gcc, can generate SSE instructions unless we
  tell it not to.

  A comment-only patch to a change the was due in part to an unpublished
  erratum; now when the erratum is published we want to add a comment
  explaining why"

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic, doc: Justification for disabling IO APIC before Local APIC
  x86, bitops: Correct the assembly constraints to testing bitops
  x86-64, build: Always pass in -mno-sse
  efi-pstore: Make efi-pstore return a unique id
  x86/efi: Fix earlyprintk off-by-one bug
  efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed
2013-12-04 21:45:21 -08:00
Fenghua Yu 2885432aaf x86/apic, doc: Justification for disabling IO APIC before Local APIC
Since erratum AVR31 in "Intel Atom Processor C2000 Product Family
Specification Update" is now published, I added a justification
comment for disabling IO APIC before Local APIC, as changed in commit:

522e664644 x86/apic: Disable I/O APIC before shutdown of the local APIC

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1386202069-51515-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-12-04 19:33:21 -08:00
H. Peter Anvin e0f6dec35f x86, bitops: Correct the assembly constraints to testing bitops
In checkin:

0c44c2d0f4 x86: Use asm goto to implement better modify_and_test() functions

the various functions which do modify and test were unified and
optimized using "asm goto".  However, this change missed the detail
that the bitops require an "Ir" constraint rather than an "er"
constraint ("I" = integer constant from 0-31, "e" = signed 32-bit
integer constant).  This would cause code to miscompile if these
functions were used on constant bit positions 32-255 and the build to
fail if used on constant bit positions above 255.

Add the constraints as a parameter to the GEN_BINARY_RMWcc() macro to
avoid this problem.

Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/529E8719.4070202@zytor.com
2013-12-04 14:31:28 -08:00
H. Peter Anvin 5551a34e5a x86-64, build: Always pass in -mno-sse
Always pass in the -mno-sse argument, regardless if
-preferred-stack-boundary is supported.  We never want to generate SSE
instructions in the kernel unless we *really* know what we're doing.

According to H. J. Lu, any version of gcc new enough that we support
it at all should handle the -mno-sse option, so just add it
unconditionally.

Reported-by: Kevin B. Smith <kevin.b.smith@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: H. J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/n/tip-j21wzqv790q834n7yc6g80j1@git.kernel.org
Cc: <stable@vger.kernel.org> # build fix only
2013-12-03 17:40:22 -08:00
Linus Torvalds e321ae4c20 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc kernel and tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools lib traceevent: Fix conversion of pointer to integer of different size
  perf/trace: Properly use u64 to hold event_id
  perf: Remove fragile swevent hlist optimization
  ftrace, perf: Avoid infinite event generation loop
  tools lib traceevent: Fix use of multiple options in processing field
  perf header: Fix possible memory leaks in process_group_desc()
  perf header: Fix bogus group name
  perf tools: Tag thread comm as overriden
2013-12-02 10:13:09 -08:00
Masanari Iida 5065a706c1 treewide: Fix typo in Kconfig
Correct spelling typo in Kconfig.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-12-02 14:54:57 +01:00
Levente Kurusa 853d9b18f1 x86, mce: Call put_device on device_register failure
This patch adds a call to put_device() when the device_register() call
has failed. This is required so that the last reference to the device is
given up.

Signed-off-by: Levente Kurusa <levex@linux.com>
Link: http://lkml.kernel.org/r/5298F900.9000208@linux.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2013-11-30 12:14:51 +01:00
Matt Fleming 1f3a8bae21 x86/efi: Fix earlyprintk off-by-one bug
Dave reported seeing the following incorrect output on his Thinkpad T420
when using earlyprintk=efi,

[    0.000000] efi: EFI v2.00 by Lenovo
                    ACPI=0xdabfe000  ACPI 2.0=0xdabfe014 SMBIOS=0xdaa9e000

The output should be on one line, not split over two. The cause is an
off-by-one error when checking that the efi_y coordinate hasn't been
incremented out of bounds.

Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-28 20:16:56 +00:00
Stephane Eranian 65661f96d3 perf/x86: Add RAPL hrtimer support
The RAPL PMU counters do not interrupt on overflow.
Therefore, the kernel needs to poll the counters
to avoid missing an overflow. This patch adds
the hrtimer code to do this.

The timer interval is calculated at boot time
based on the power unit used by the HW.

There is one hrtimer per-cpu to handle the case
of multiple simultaneous use across cores on
the same package + hotplug CPU.

Thanks to Maria Dimakopoulou for her contributions
to this patch especially on the math aspects.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
[ Applied 32-bit build fix. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1384275531-10892-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-27 15:31:23 +01:00
Stephane Eranian 4788e5b4b2 perf/x86: Add Intel RAPL PMU support
This patch adds a new uncore PMU to expose the Intel
RAPL energy consumption counters. Up to 3 counters,
each counting a particular RAPL event are exposed.

The RAPL counters are available on Intel SandyBridge,
IvyBridge, Haswell. The server skus add a 3rd counter.

The following events are available and exposed in sysfs:

  - power/energy-cores: power consumption of all cores on socket
  - power/energy-pkg: power consumption of all cores + LLc cache
  - power/energy-dram: power consumption of DRAM (servers only)

For each event both the unit (Joules) and scale (2^-32 J)
is exposed in sysfs for use by perf stat and other tools.
The files are:

	/sys/devices/power/events/energy-*.unit
	/sys/devices/power/events/energy-*.scale

The RAPL PMU is uncore by nature and is implemented such
that it only works in system-wide mode. Measuring only
one CPU per socket is sufficient. The /sys/devices/power/cpumask
file can be used by tools to figure out which CPUs to monitor
by default. For instance, on a 2-socket system, 2 CPUs
(one on each socket) will be shown.

All the counters measure in the same unit (exposed via sysfs).
The perf_events API exposes all RAPL counters as 64-bit integers
counting in unit of 1/2^32 Joules (about 0.23 nJ). User level tools
must convert the counts by multiplying them by 2^-32 to obtain
Joules. The reason for this is that the kernel avoids
doing floating point math whenever possible because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point usage. There is no loss of
precision. Thanks to PeterZ for suggesting this approach.

To convert the raw count in Watt:
   W = C * 2.3 / (1e10 * time)
or ldexp(C, -32).

RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/device/power/type.

Sampling is not supported by the RAPL PMU. There is no
privilege level filtering either.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1384275531-10892-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-27 11:16:40 +01:00
Ingo Molnar 61d0669775 * New static EFI runtime services virtual mapping layout which is
groundwork for kexec support on EFI - Borislav Petkov
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJSfMDjAAoJEC84WcCNIz1V0LIP/2WyTbJR6bL0HXwnGLpxmxag
 v0VgnKRhypNboA3WEu4a9as6AdExqB7qsiWIipHuDSMj/vkfZgHAKTd2f1iRxmsJ
 RZxzwV6YzvsWkdXjvpCoLWKSsvQDug++BAIti1PitW6RXRjo01t3ymo/Ho1CQrpI
 hNJbB3bbihMF+uqFvdSpO0KZtZE6EtnylrfBeuo0GzqqJdTGe1MmqlWmyUlEy5JW
 ZiHV8E/xTjh3N675tWPcT9hGVfCyOXXu/kPrXsJTXrdYyZL9qgA9b8SVRLs6DctX
 wVgL9lNv4wobsmZJ5DxkYl9+TaF7rbshUeIJbzrQyMVJjb3TpXk/ZpspDMAEjL7e
 bb76c1bAx4xZuUatR/f1ykkWKAEryAhXHkvwcbIBjebW33if1MgGJLk5udJQQv6H
 j+J9ROH38MDr0Geg+pM2RnCyTz8l+q+8Mfu4Yh9TSte+ttB6fr9phs3/G+fdSUn1
 0vI627v1HWzDcBh4eZjjslzJviR8PldsGVT3EsIaOnHGtk/9FPz/7n4efph4v7+9
 yqTkLvQHxsAx7f0tR/qRpkEIQ9WTMXO0IO79OC13QTSATJSl+WSPTJM7ccqOgn+P
 h89ssBnzlwmFHkuvTi599KVHdzOZrWTsB2zROn+NnSchJ+YAY6TXwznk3/MNo9rB
 d+euVcffL4yfWZ7Bzj8Z
 =w6ff
 -----END PGP SIGNATURE-----

Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi

Pull EFI virtual mapping changes from Matt Fleming:

  * New static EFI runtime services virtual mapping layout which is
    groundwork for kexec support on EFI. (Borislav Petkov)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-26 12:23:04 +01:00
Linus Torvalds 26b265cd29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 - Made x86 ablk_helper generic for ARM
 - Phase out chainiv in favour of eseqiv (affects IPsec)
 - Fixed aes-cbc IV corruption on s390
 - Added constant-time crypto_memneq which replaces memcmp
 - Fixed aes-ctr in omap-aes
 - Added OMAP3 ROM RNG support
 - Add PRNG support for MSM SoC's
 - Add and use Job Ring API in caam
 - Misc fixes

[ NOTE! This pull request was sent within the merge window, but Herbert
  has some questionable email sending setup that makes him public enemy
  #1 as far as gmail is concerned.  So most of his emails seem to be
  trapped by gmail as spam, resulting in me not seeing them.  - Linus ]

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (49 commits)
  crypto: s390 - Fix aes-cbc IV corruption
  crypto: omap-aes - Fix CTR mode counter length
  crypto: omap-sham - Add missing modalias
  padata: make the sequence counter an atomic_t
  crypto: caam - Modify the interface layers to use JR API's
  crypto: caam - Add API's to allocate/free Job Rings
  crypto: caam - Add Platform driver for Job Ring
  hwrng: msm - Add PRNG support for MSM SoC's
  ARM: DT: msm: Add Qualcomm's PRNG driver binding document
  crypto: skcipher - Use eseqiv even on UP machines
  crypto: talitos - Simplify key parsing
  crypto: picoxcell - Simplify and harden key parsing
  crypto: ixp4xx - Simplify and harden key parsing
  crypto: authencesn - Simplify key parsing
  crypto: authenc - Export key parsing helper function
  crypto: mv_cesa: remove deprecated IRQF_DISABLED
  hwrng: OMAP3 ROM Random Number Generator support
  crypto: sha256_ssse3 - also test for BMI2
  crypto: mv_cesa - Remove redundant of_match_ptr
  crypto: sahara - Remove redundant of_match_ptr
  ...
2013-11-23 16:18:25 -08:00
Linus Torvalds aecde27c4f Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull DRM fixes from Dave Airlie:
 "I was going to leave this until post -rc1 but sysfs fixes broke
  hotplug in userspace, so I had to fix it harder, otherwise a set of
  pulls from intel, radeon and vmware,

  The vmware/ttm changes are bit larger but since its early and they are
  unlikely to break anything else I put them in, it lets vmware work
  with dri3"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (36 commits)
  drm/sysfs: fix hotplug regression since lifetime changes
  drm/exynos: g2d: fix memory leak to userptr
  drm/i915: Fix gen3 self-refresh watermarks
  drm/ttm: Remove set_need_resched from the ttm fault handler
  drm/ttm: Don't move non-existing data
  drm/radeon: hook up backlight functions for CI and KV family.
  drm/i915: Replicate BIOS eDP bpp clamping hack for hsw
  drm/i915: Do not enable package C8 on unsupported hardware
  drm/i915: Hold pc8 lock around toggling pc8.gpu_idle
  drm/i915: encoder->get_config is no longer optional
  drm/i915/tv: add ->get_config callback
  drm/radeon/cik: Add macrotile mode array query
  drm/radeon/cik: Return backend map information to userspace
  drm/vmwgfx: Make vmwgfx dma buffers prime aware
  drm/vmwgfx: Make surfaces prime-aware
  drm/vmwgfx: Hook up the prime ioctls
  drm/ttm: Add a minimal prime implementation for ttm base objects
  drm/vmwgfx: Fix false lockdep warning
  drm/ttm: Allow execbuf util reserves without ticket
  drm/i915: restore the early forcewake cleanup
  ...
2013-11-22 10:56:11 -08:00
Linus Torvalds c874e6fc35 Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Gleb Natapov.

* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: kvm_clear_guest_page(): fix empty_zero_page usage
  kvm: mmu: delay mmu audit activation
  arm/arm64: KVM: Fix hyp mappings of vmalloc regions
2013-11-22 09:56:07 -08:00
Kirill A. Shutemov c283610e44 x86, mm: do not leak page->ptl for pmd page tables
There are two code paths how page with pmd page table can be freed:
pmd_free() and pmd_free_tlb().

I've missed the second one and didn't add page table destructor call
there.  It leads to leak of page->ptl for pmd page tables, if
dynamically allocated page->ptl is in use.

The patch adds the missed destructor and modifies documentation
accordingly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrey Vagin <avagin@openvz.org>
Tested-by: Andrey Vagin <avagin@openvz.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-21 16:42:28 -08:00
Dave Airlie cf96967794 Merge tag 'drm-intel-fixes-2013-11-20' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Just a small pile of fixes for bugs and a few regressions. I'm still
trying to track down a driver load hang on my g33 (which infuriatingly
doesn't happen when loading the module manually after boot), somehow
bisecting loves to go astray on this one :( And there's a (harmless)
locking WARN in the suspend code due to one of Jesse's vlv backlight
rework patches. Otherwise nothing outstanding afaik.

* tag 'drm-intel-fixes-2013-11-20' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: Fix gen3 self-refresh watermarks
  drm/i915: Replicate BIOS eDP bpp clamping hack for hsw
  drm/i915: Do not enable package C8 on unsupported hardware
  drm/i915: Hold pc8 lock around toggling pc8.gpu_idle
  drm/i915: encoder->get_config is no longer optional
  drm/i915/tv: add ->get_config callback
  drm/i915: restore the early forcewake cleanup
  Partially revert "drm/i915: tune the RC6 threshold for stability"
  drm/i915: flush cursors harder
  i915: Use 120MHz LVDS SSC clock for gen5/gen6/gen7
  x86/early quirk: use gen6 stolen detection for VLV
  drm/i915/dp: set sink to power down mode on dp disable
2013-11-21 18:45:51 +10:00
Al Viro 2a46eed54a Wrong page freed on preallocate_pmds() failure exit
Note that pmds[i] is simply uninitialized at that point...

Granted, it's very hard to hit (you need split page locks *and*
kmalloc(sizeof(spinlock_t), GFP_KERNEL) failing), but the code is
obviously bogus.

Introduced by commit 09ef493985 ("x86: add missed
pgtable_pmd_page_ctor/dtor calls for preallocated pmds")

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-20 14:22:14 -08:00
H. Peter Anvin 661c80192d x86-64, copy_user: Use leal to produce 32-bit results
When we are using lea to produce a 32-bit result, we can use the leal
form, rather than using leaq and worry about truncation elsewhere.

Make the leal explicit, both to be more obvious and since that is what
gcc generates and thus is less likely to trigger obscure gas bugs.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-20 13:57:07 -08:00
Linus Torvalds 82023bb7f7 More ACPI and power management updates for 3.13-rc1
- ACPI-based device hotplug fixes for issues introduced recently and
   a fix for an older error code path bug in the ACPI PCI host bridge
   driver.
 
 - Fix for recently broken OMAP cpufreq build from Viresh Kumar.
 
 - Fix for a recent hibernation regression related to s2disk.
 
 - Fix for a locking-related regression in the ACPI EC driver from
   Puneet Kumar.
 
 - System suspend error code path fix related to runtime PM and
   runtime PM documentation update from Ulf Hansson.
 
 - cpufreq's conservative governor fix from Xiaoguang Chen.
 
 - New processor IDs for intel_idle and turbostat and removal of
   an obsolete Kconfig option from Len Brown.
 
 - New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and
   ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg.
 
 - Removal of several ACPI video DMI blacklist entries that are not
   necessary any more from Aaron Lu.
 
 - Rework of the ACPI companion representation in struct device and
   code cleanup related to that change from Rafael J Wysocki,
   Lan Tianyu and Jarkko Nikula.
 
 - Fixes for assigning names to ACPI-enumerated I2C and SPI devices
   from Jarkko Nikula.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABCAAGBQJSjLYNAAoJEILEb/54YlRxkEQP/1pmFWNwSsxLtTHd+PEs0Xbo
 QccqvjQrnw/c8GcmK4eZrz6/xyuepmmjy9kfRKj2ENZniy0NEsSFqkTdSO3vYlva
 8HKWUj7MV3evhFERXAF6Tu0b4Enx4jOP7VMtmYxJo3qrSnKRUcUzc6DGv/ACsUT1
 Nkj0Lhdsg053Z+YzIXrl50w0tCDEMhVmWlMHBtYgr+dMNVnkfPBGkqMblMkKCXT2
 w/yHvauZlxQHtI+8bVqTuGgNN0CPzdlpFGiuUF+5mDf6dRX8zlSn56Ia+Wyw1k9X
 dQp4jYQOgPRo03rNKqQPDiPxUdc7T0RAHRvDB51Ncweuh5PfZGguQe71p6/LKY2W
 i6zblZ0f/vc13hTiMrP+qzKcwZvgPB5DH7SfnHr61JKV7GNFCdYAqoceS5hYMzR9
 d2Fd+txgm763IHWewXfDS/G2cU492R5qr4jpmUIACBQKWDZcqmSRDwRj83t56Ltb
 jgFBMbg4vZxG7IARhind74xsALxdhsgmFjPmx+0qPWjYxcU8otQZpXbgGNI9iOuW
 pxIQv5WPQW0tTmwO4HSuVCOwDPLPz5R0jkev7SvSj3Ek3TeD7He4LmnK055CATiC
 puq+6dp1FISPOPJYk+0DI61qN/CB/qNwRp8LU3ctZwudPVhznIE9FFQ3iN1FdBg2
 X8VDcT9t7VvVuxSBjgkj
 =QMp+
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI and power management updates from Rafael Wysocki:

 - ACPI-based device hotplug fixes for issues introduced recently and a
   fix for an older error code path bug in the ACPI PCI host bridge
   driver

 - Fix for recently broken OMAP cpufreq build from Viresh Kumar

 - Fix for a recent hibernation regression related to s2disk

 - Fix for a locking-related regression in the ACPI EC driver from
   Puneet Kumar

 - System suspend error code path fix related to runtime PM and runtime
   PM documentation update from Ulf Hansson

 - cpufreq's conservative governor fix from Xiaoguang Chen

 - New processor IDs for intel_idle and turbostat and removal of an
   obsolete Kconfig option from Len Brown

 - New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and
   ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg

 - Removal of several ACPI video DMI blacklist entries that are not
   necessary any more from Aaron Lu

 - Rework of the ACPI companion representation in struct device and code
   cleanup related to that change from Rafael J Wysocki, Lan Tianyu and
   Jarkko Nikula

 - Fixes for assigning names to ACPI-enumerated I2C and SPI devices from
   Jarkko Nikula

* tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
  PCI / hotplug / ACPI: Drop unused acpiphp_debug declaration
  ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed()
  ACPI / PCI root: Clear driver_data before failing enumeration
  ACPI / hotplug: Fix PCI host bridge hot removal
  ACPI / hotplug: Fix acpi_bus_get_device() return value check
  cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs()
  ACPI / video: clean up DMI table for initial black screen problem
  ACPI / EC: Ensure lock is acquired before accessing ec struct members
  PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps()
  ACPI / AC: Remove struct acpi_device pointer from struct acpi_ac
  spi: Use stable dev_name for ACPI enumerated SPI slaves
  i2c: Use stable dev_name for ACPI enumerated I2C slaves
  ACPI: Provide acpi_dev_name accessor for struct acpi_device device name
  ACPI / bind: Use (put|get)_device() on ACPI device objects too
  ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro
  ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
  cpufreq: OMAP: Fix compilation error 'r & ret undeclared'
  PM / Runtime: Fix error path for prepare
  PM / Runtime: Update documentation around probe|remove|suspend
  cpufreq: conservative: set requested_freq to policy max when it is over policy max
  ...
2013-11-20 13:25:04 -08:00
Sasha Levin 521ee0cfb8 kvm: mmu: delay mmu audit activation
We should not be using jump labels before they were initialized. Push back
the callback to until after jump label initialization.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-20 11:12:56 +02:00
Linus Torvalds cdc7ef8981 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML changes from Richard Weinberger:
 "This pile contains a nice defconfig cleanup, a rewritten stack
  unwinder and various cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Remove unused declarations from <as-layout.h>
  um: remove used STDIO_CONSOLE Kconfig param
  um/vdso: add .gitignore for a couple of targets
  arch/um: make it work with defconfig and x86_64
  um: Make kstack_depth_to_print conform to arch/x86
  um: Get rid of thread_struct->saved_task
  um: Make stack trace reliable against kernel mode faults
  um: Rewrite show_stack()
2013-11-19 11:42:32 -08:00
Linus Torvalds 9066d9b250 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "A modular build fix for certain .config's"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Export 'boot_cpu_physical_apicid' to modules
2013-11-19 10:48:19 -08:00
Linus Torvalds 4007162647 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq cleanups from Ingo Molnar:
 "This is a multi-arch cleanup series from Thomas Gleixner, which we
  kept to near the end of the merge window, to not interfere with
  architecture updates.

  This series (motivated by the -rt kernel) unifies more aspects of IRQ
  handling and generalizes PREEMPT_ACTIVE"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt: Make PREEMPT_ACTIVE generic
  sparc: Use preempt_schedule_irq
  ia64: Use preempt_schedule_irq
  m32r: Use preempt_schedule_irq
  hardirq: Make hardirq bits generic
  m68k: Simplify low level interrupt handling code
  genirq: Prevent spurious detection for unconditionally polled interrupts
2013-11-19 10:40:00 -08:00
Peter Zijlstra d5b5f391d4 ftrace, perf: Avoid infinite event generation loop
Vince's perf-trinity fuzzer found yet another 'interesting' problem.

When we sample the irq_work_exit tracepoint with period==1 (or
PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an
infinite event generation loop:

  ,-> <IPI>
  |     irq_work_exit() ->
  |       trace_irq_work_exit() ->
  |         ...
  |           __perf_event_overflow() -> (due to fasync)
  |             irq_work_queue() -> (irq_work_list must be empty)
  '---------      arch_irq_work_raise()

Similar things can happen due to regular poll() wakeups if we exceed
the ring-buffer wakeup watermark, or have an event_limit.

To avoid this, dis-allow sampling this particular tracepoint.

In order to achieve this, create a special perf_perm function pointer
for each event and call this (when set) on trying to create a
tracepoint perf event.

[ roasted: use expr... to allow for ',' in your expression ]

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-19 16:57:40 +01:00
Kirill A. Shutemov fd8526ad14 x86/mm: Implement ASLR for hugetlb mappings
Matthew noticed that hugetlb mappings don't participate in ASLR on x86-64:

  %  for i in `seq 3`; do
  > tools/testing/selftests/vm/map_hugetlb | grep address
  > done
  Returned address is 0x2aaaaac00000
  Returned address is 0x2aaaaac00000
  Returned address is 0x2aaaaac00000

/proc/PID/maps entries for the mapping are always the same
(except inode number):

  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 8200              /anon_hugepage (deleted)
  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 256               /anon_hugepage (deleted)
  2aaaaac00000-2aaabac00000 rw-p 00000000 00:0c 7180              /anon_hugepage (deleted)

The reason is the generic hugetlb_get_unmapped_area() function
which is used on x86-64.  It doesn't support randomization and
use bottom-up unmapped area lookup, instead of usual top-down
on x86-64.

x86 has arch-specific hugetlb_get_unmapped_area(), but it's used
only on x86-32.

Let's use arch-specific hugetlb_get_unmapped_area() on x86-64
too. That adds ASLR and switches hugetlb mappings to use top-down
unmapped area lookup:

  %  for i in `seq 3`; do
  > tools/testing/selftests/vm/map_hugetlb | grep address
  > done
  Returned address is 0x7f4f08a00000
  Returned address is 0x7fdda4200000
  Returned address is 0x7febe0000000

/proc/PID/maps entries:

  7f4f08a00000-7f4f18a00000 rw-p 00000000 00:0c 1168              /anon_hugepage (deleted)
  7fdda4200000-7fddb4200000 rw-p 00000000 00:0c 7092              /anon_hugepage (deleted)
  7febe0000000-7febf0000000 rw-p 00000000 00:0c 7183              /anon_hugepage (deleted)

Unmapped area lookup policy for hugetlb mappings is consistent
with normal mappings now -- the only difference is alignment
requirements for huge pages.

libhugetlbfs test-suite didn't detect any regressions with the
patch applied (although it shows few failures on my machine
regardless the patch).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/20131119131750.EA45CE0090@blue.fi.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-19 14:24:50 +01:00
Cyrill Gorcunov 5305ca10e7 x86/mm: Unify pte_to_pgoff() and pgoff_to_pte() helpers
Use unified pte_bitop() helper to manipulate bits in pte/pgoff
bitfields and convert pte_to_pgoff()/pgoff_to_pte() to inlines.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-19 14:24:34 +01:00
Rafael J. Wysocki 6431b43097 Merge branch 'pm-tools'
* pm-tools:
  tools / power turbostat: Support Silvermont
2013-11-19 01:06:38 +01:00
Ramkumar Ramachandra b13a9bfc79 um/vdso: add .gitignore for a couple of targets
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:32:07 +01:00
Ramkumar Ramachandra e40f04d040 arch/um: make it work with defconfig and x86_64
arch/um/defconfig only lists one default configuration, and that applies
only to the i386 architecture.  Replace it with two minimal
configuration files generated using `make savedefconfig`:

  i386_defconfig and x86_64_defconfig

The build scripts now require two updates:

1. um's Kconfig (arch/x86/um/Kconfig) should specify an ARCH_DEFCONFIG
   section explicitly pointing to these scripts if the required
   variables are set.  Take care to remove the DEFCONFIG_LIST section
   defined in the included file arch/um/Kconfig.common.

2. um's Makefile (arch/um/Makefile) should set KBUILD_DEFCONFIG properly
   for the top-level Makefile to pick up.  Copy the logic in
   arch/x86/Makefile to properly pick the defconfig file depending on
   the actual architecture; except we're working with $SUBARCH here,
   instead of $ARCH.

Now, you can do:

  $ ARCH=um make defconfig
  $ ARCH=um make

and successfully build User-Mode Linux on an x86_64 box in default
configuration.

Cc: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:31:48 +01:00
Richard Weinberger 9d1ee8ce92 um: Rewrite show_stack()
Currently on UML stack traces are not very reliable and both
x86 and x86_64 have their on implementations.
This patch unifies both and adds support to outline unreliable
functions calls.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:27:25 +01:00
Fenghua Yu f4cb1cc18f x86-64, copy_user: Remove zero byte check before copy user buffer.
Operation of rep movsb instruction handles zero byte copy. As pointed out by
Linus, there is no need to check zero size in kernel. Removing this redundant
check saves a few cycles in copy user functions.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-16 18:00:58 -08:00
Linus Torvalds b29c8306a3 This batch of changes is mostly clean ups and small bug fixes.
The only real feature that was added this release is from Namhyung Kim,
 who introduced "set_graph_notrace" filter that lets you run the function
 graph tracer and not trace particular functions and their call chain.
 
 Tom Zanussi added some updates to the ftrace multibuffer tracing that
 made it more consistent with the top level tracing.
 
 One of the fixes for perf function tracing required an API change in
 RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing
 that change in this release too, he gave me a branch that included
 all the changes to get that working, and I pulled that into my tree
 in order to complete the perf function tracing fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSgX5SAAoJEKQekfcNnQGulUAH/jORqJrKaNAulmZ314VsAqfa
 zMtF5UAAPf7kqc3AN/jtFrhJUNEfxWOo7A4r0FsM/rKdWJF+98GA6aqYVD+XoWFt
 +36fg1enxbXUjixQ96Uh+o1+BJUgYDqljuWzqSu/oiXWfWwl8+WL4kcbhb+V9WcF
 SpdzLCWVZRfhyDiN3+0zvyQ8RSG2Pd7CWn9zroI0e4sxGo0Ki6JUnIcXtZGOBDOQ
 IIZdjXvGSfpJ+3u3XvRPXJcltRCtOsVWxYzrmvRlmHDW5QMe1+WmmrlojTePrLaJ
 xn8+3WINqetAR+ZQnazbpt1XzJzKa8QtFgpiN0kT6qL7cg3N1Owc4vLGohl7wok=
 =Nesf
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing update from Steven Rostedt:
 "This batch of changes is mostly clean ups and small bug fixes.  The
  only real feature that was added this release is from Namhyung Kim,
  who introduced "set_graph_notrace" filter that lets you run the
  function graph tracer and not trace particular functions and their
  call chain.

  Tom Zanussi added some updates to the ftrace multibuffer tracing that
  made it more consistent with the top level tracing.

  One of the fixes for perf function tracing required an API change in
  RCU; the addition of "rcu_is_watching()".  As Paul McKenney is pushing
  that change in this release too, he gave me a branch that included all
  the changes to get that working, and I pulled that into my tree in
  order to complete the perf function tracing fix"

* tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Add rcu annotation for syscall trace descriptors
  tracing: Do not use signed enums with unsigned long long in fgragh output
  tracing: Remove unused function ftrace_off_permanent()
  tracing: Do not assign filp->private_data to freed memory
  tracing: Add helper function tracing_is_disabled()
  tracing: Open tracer when ftrace_dump_on_oops is used
  tracing: Add support for SOFT_DISABLE to syscall events
  tracing: Make register/unregister_ftrace_command __init
  tracing: Update event filters for multibuffer
  recordmcount.pl: Add support for __fentry__
  ftrace: Have control op function callback only trace when RCU is watching
  rcu: Do not trace rcu_is_watching() functions
  ftrace/x86: skip over the breakpoint for ftrace caller
  trace/trace_stat: use rbtree postorder iteration helper instead of opencoding
  ftrace: Add set_graph_notrace filter
  ftrace: Narrow down the protected area of graph_lock
  ftrace: Introduce struct ftrace_graph_data
  ftrace: Get rid of ftrace_graph_filter_enabled
  tracing: Fix potential out-of-bounds in trace_get_user()
  tracing: Show more exact help information about snapshot
2013-11-16 12:23:18 -08:00
Linus Torvalds 9073e1a804 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual earth-shaking, news-breaking, rocket science pile from
  trivial.git"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
  doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
  doc: add missing files to timers/00-INDEX
  timekeeping: Fix some trivial typos in comments
  mm: Fix some trivial typos in comments
  irq: Fix some trivial typos in comments
  NUMA: fix typos in Kconfig help text
  mm: update 00-INDEX
  doc: Documentation/DMA-attributes.txt fix typo
  DRM: comment: `halve' -> `half'
  Docs: Kconfig: `devlopers' -> `developers'
  doc: typo on word accounting in kprobes.c in mutliple architectures
  treewide: fix "usefull" typo
  treewide: fix "distingush" typo
  mm/Kconfig: Grammar s/an/a/
  kexec: Typo s/the/then/
  Documentation/kvm: Update cpuid documentation for steal time and pv eoi
  treewide: Fix common typo in "identify"
  __page_to_pfn: Fix typo in comment
  Correct some typos for word frequency
  clk: fixed-factor: Fix a trivial typo
  ...
2013-11-15 16:47:22 -08:00
Linus Torvalds f13399f033 Kconfig cleanups for v3.13
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABAgAGBQJSeP3kAAoJEOiN4VijXeFP0ykP/11/3M1cG9vB5OREJ590eFqR
 rsHT3EP0cnNuJSuVyDSoxsdjUYFv5496BoOe9M1oztb7MuJdY4AFINf8730T4E9Z
 xE6zufStyZej7vBc0RoSYBX+5VUHQFe4wdLq2jC5Fhd0ZVkS+JpyR7JE5ccUvoL4
 iWTwWS/DsI41pZ9WnY3RbomcL8vnY7+zpsy0i//1mSlktmFfP/zBz0Lh9xugg6dG
 Gk+/mjeJ/c6QakdbC5yDXZqXIAv1nSFMqnwmni1eGNCbOG6sAIXJ8gO/TX8Bb1bw
 mBdSdmNCga+U5CPXWEhy5xjLbOrzrogoixLMZ9wZRhbv6Wn5PYd+RZDDZO21/D4a
 ES2WZyvpSxyl/hoPC2QcPSw+JDqBiZzQUQESWbzt81pwGXHUVzXBvpB0gwZhW/6Y
 CQOIl8NAWcQa3cYpT0an/Jr98u3qR/rBrv+iSWn513cEGcX2K0x7LgAWHDPxYnzi
 ROhg0GDq5Sddk/7doIuwRTQuCezYQ6RW3PixcMTcd8gPyUtrCLipDC6riMNFpRrN
 lpRKGv9dJFx8sdRx0nYjb0L4385XRUQIRWLcP9bDo7rDkG9ApJ3uQl67H/16QBjB
 Q/PgHjWae/78ROs8jx3Ek/SNCZUUWchiPWjd2cVZGkSWUQgWskwVeXSHAgINTyJS
 6e/m3rW9c8eyxfje84Fl
 =pPWo
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull Kconfig cleanups from Mark Salter:
 "Remove some unused config options from C6X and clean up PC_PARPORT
  dependencies.  The latter was discussed here:

    https://lkml.org/lkml/2013/10/8/12"

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  c6x: remove unused COMMON_CLKDEV Kconfig parameter
  Kconfig cleanup (PARPORT_PC dependencies)
  x86: select ARCH_MIGHT_HAVE_PC_PARPORT
  unicore32: select ARCH_MIGHT_HAVE_PC_PARPORT
  sparc: select ARCH_MIGHT_HAVE_PC_PARPORT
  sh: select ARCH_MIGHT_HAVE_PC_PARPORT
  powerpc: select ARCH_MIGHT_HAVE_PC_PARPORT
  parisc: select ARCH_MIGHT_HAVE_PC_PARPORT
  mips: select ARCH_MIGHT_HAVE_PC_PARPORT
  microblaze: select ARCH_MIGHT_HAVE_PC_PARPORT
  m68k: select ARCH_MIGHT_HAVE_PC_PARPORT
  ia64: select ARCH_MIGHT_HAVE_PC_PARPORT
  arm: select ARCH_MIGHT_HAVE_PC_PARPORT
  alpha: select ARCH_MIGHT_HAVE_PC_PARPORT
  c6x: remove unused parameter in Kconfig
2013-11-15 14:05:15 -08:00
David Rientjes cc08e04c3f x86: Export 'boot_cpu_physical_apicid' to modules
Commit 9ebddac7ea "ACPI, x86: Fix extended error log driver to depend on
CONFIG_X86_LOCAL_APIC" fixed a build error when CONFIG_X86_LOCAL_APIC was not
selected and !CONFIG_SMP.

However, since CONFIG_ACPI_EXTLOG is tristate, there is a second build error:

  ERROR: "boot_cpu_physical_apicid" [drivers/acpi/acpi_extlog.ko] undefined!

The symbol needs to be exported for it to be available.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311141504080.30112@chino.kir.corp.google.com
[ Changed it to a _GPL() export. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-15 08:38:30 +01:00
Linus Torvalds 049ffa8ab3 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is a combo of -next and some -fixes that came in in the
  intervening time.

  Highlights:

  New drivers:
    ARM Armada driver for Marvell Armada 510 SOCs

  Intel:
    Broadwell initial support under a default off switch,
    Stereo/3D HDMI mode support
    Valleyview improvements
    Displayport improvements
    Haswell fixes
    initial mipi dsi panel support
    CRC support for debugging
    build with CONFIG_FB=n

  Radeon:
    enable DPM on a number of GPUs by default
    secondary GPU powerdown support
    enable HDMI audio by default
    Hawaii support

  Nouveau:
    dynamic pm code infrastructure reworked, does nothing major yet
    GK208 modesetting support
    MSI fixes, on by default again
    PMPEG improvements
    pageflipping fixes

  GMA500:
    minnowboard SDVO support

  VMware:
    misc fixes

  MSM:
    prime, plane and rendernodes support

  Tegra:
    rearchitected to put the drm driver into the drm subsystem.
    HDMI and gr2d support for tegra 114 SoC

  QXL:
    oops fix, and multi-head fixes

  DRM core:
    sysfs lifetime fixes
    client capability ioctl
    further cleanups to device midlayer
    more vblank timestamp fixes"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (789 commits)
  drm/nouveau: do not map evicted vram buffers in nouveau_bo_vma_add
  drm/nvc0-/gr: shift wrapping bug in nvc0_grctx_generate_r406800
  drm/nouveau/pwr: fix missing mutex unlock in a failure path
  drm/nv40/therm: fix slowing down fan when pstate undefined
  drm/nv11-: synchronise flips to vblank, unless async flip requested
  drm/nvc0-: remove nasty fifo swmthd hack for flip completion method
  drm/nv10-: we no longer need to create nvsw object on user channels
  drm/nouveau: always queue flips relative to kernel channel activity
  drm/nouveau: there is no need to reserve/fence the new fb when flipping
  drm/nouveau: when bailing out of a pushbuf ioctl, do not remove previous fence
  drm/nouveau: allow nouveau_fence_ref() to be a noop
  drm/nvc8/mc: msi rearm is via the nvc0 method
  drm/ttm: Fix vma page_prot bit manipulation
  drm/vmwgfx: Fix a couple of compile / sparse warnings and errors
  drm/vmwgfx: Resource evict fixes
  drm/edid: compare actual vrefresh for all modes for quirks
  drm: shmob_drm: Convert to clk_prepare/unprepare
  drm/nouveau: fix 32-bit build
  drm/i915/opregion: fix build error on CONFIG_ACPI=n
  Revert "drm/radeon/audio: don't set speaker allocation on DCE4+"
  ...
2013-11-15 14:19:54 +09:00
Linus Torvalds f080480488 Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
 is probably the most interesting change from a user point of view.
 On the x86 side there are nested virtualization improvements and a
 few bugfixes.  ARM got transparent huge page support, improved
 overcommit, and support for big endian guests.
 
 Finally, there is a new interface to connect KVM with VFIO.  This
 helps with devices that use NoSnoop PCI transactions, letting the
 driver in the guest execute WBINVD instructions.  This includes
 some nVidia cards on Windows, that fail to start without these
 patches and the corresponding userspace changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu
 L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq
 XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp
 db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu
 w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq
 vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc
 dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC
 t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc
 0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1
 7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R
 qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH
 NTmT/cfmYp2BRkiCfCiS
 =rWNf
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM changes from Paolo Bonzini:
 "Here are the 3.13 KVM changes.  There was a lot of work on the PPC
  side: the HV and emulation flavors can now coexist in a single kernel
  is probably the most interesting change from a user point of view.

  On the x86 side there are nested virtualization improvements and a few
  bugfixes.

  ARM got transparent huge page support, improved overcommit, and
  support for big endian guests.

  Finally, there is a new interface to connect KVM with VFIO.  This
  helps with devices that use NoSnoop PCI transactions, letting the
  driver in the guest execute WBINVD instructions.  This includes some
  nVidia cards on Windows, that fail to start without these patches and
  the corresponding userspace changes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
  kvm, vmx: Fix lazy FPU on nested guest
  arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
  arm/arm64: KVM: MMIO support for BE guest
  kvm, cpuid: Fix sparse warning
  kvm: Delete prototype for non-existent function kvm_check_iopl
  kvm: Delete prototype for non-existent function complete_pio
  hung_task: add method to reset detector
  pvclock: detect watchdog reset at pvclock read
  kvm: optimize out smp_mb after srcu_read_unlock
  srcu: API for barrier after srcu read unlock
  KVM: remove vm mmap method
  KVM: IOMMU: hva align mapping page size
  KVM: x86: trace cpuid emulation when called from emulator
  KVM: emulator: cleanup decode_register_operand() a bit
  KVM: emulator: check rex prefix inside decode_register()
  KVM: x86: fix emulation of "movzbl %bpl, %eax"
  kvm_host: typo fix
  KVM: x86: emulate SAHF instruction
  MAINTAINERS: add tree for kvm.git
  Documentation/kvm: add a 00-INDEX file
  ...
2013-11-15 13:51:36 +09:00
Linus Torvalds eda670c626 Features:
- SWIOTLB has tracing added when doing bounce buffer.
  - Xen ARM/ARM64 can use Xen-SWIOTLB. This work allows Linux to
    safely program real devices for DMA operations when running as
    a guest on Xen on ARM, without IOMMU support.*1
  - xen_raw_printk works with PVHVM guests if needed.
 Bug-fixes:
  - Make memory ballooning work under HVM with large MMIO region.
  - Inform hypervisor of MCFG regions found in ACPI DSDT.
  - Remove deprecated IRQF_DISABLED.
  - Remove deprecated __cpuinit.
 
 [*1]:
 "On arm and arm64 all Xen guests, including dom0, run with second stage
 translation enabled. As a consequence when dom0 programs a device for a
 DMA operation is going to use (pseudo) physical addresses instead
 machine addresses. This work introduces two trees to track physical to
 machine and machine to physical mappings of foreign pages. Local pages
 are assumed mapped 1:1 (physical address == machine address).  It
 enables the SWIOTLB-Xen driver on ARM and ARM64, so that Linux can
 translate physical addresses to machine addresses for dma operations
 when necessary. " (Stefano).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSgS86AAoJEFjIrFwIi8fJpY4H/R2gke1A1p9UvTwbkaDhgPs/
 u/mkI6aH+ktgvu5QZNprki660uydtc4Ck7y8leeLGYw+ed1Ys559SJhRc/x8jBYZ
 Hh2chnplld0LAjSpdIDTTePArE1xBo4Gz+fT0zc5cVh0leJwOXn92Kx8N5AWD/T3
 gwH4Ok4K1dzZBIls7imM2AM/L1xcApcx3Dl/QpNcoePQtR4yLuPWMUbb3LM8pbUY
 0B6ZVN4GOhtJ84z8HRKnh4uMnBYmhmky6laTlHVa6L+j1fv7aAPCdNbePjIt/Pvj
 HVYB1O/ht73yHw0zGfK6lhoGG8zlu+Q7sgiut9UsGZZfh34+BRKzNTypqJ3ezQo=
 =xc43
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen updates from Konrad Rzeszutek Wilk:
 "This has tons of fixes and two major features which are concentrated
  around the Xen SWIOTLB library.

  The short <blurb> is that the tracing facility (just one function) has
  been added to SWIOTLB to make it easier to track I/O progress.
  Additionally under Xen and ARM (32 & 64) the Xen-SWIOTLB driver
  "is used to translate physical to machine and machine to physical
  addresses of foreign[guest] pages for DMA operations" (Stefano) when
  booting under hardware without proper IOMMU.

  There are also bug-fixes, cleanups, compile warning fixes, etc.

  The commit times for some of the commits is a bit fresh - that is b/c
  we wanted to make sure we have the Ack's from the ARM folks - which
  with the string of back-to-back conferences took a bit of time.  Rest
  assured - the code has been stewing in #linux-next for some time.

  Features:
   - SWIOTLB has tracing added when doing bounce buffer.
   - Xen ARM/ARM64 can use Xen-SWIOTLB.  This work allows Linux to
     safely program real devices for DMA operations when running as a
     guest on Xen on ARM, without IOMMU support. [*1]
   - xen_raw_printk works with PVHVM guests if needed.

  Bug-fixes:
   - Make memory ballooning work under HVM with large MMIO region.
   - Inform hypervisor of MCFG regions found in ACPI DSDT.
   - Remove deprecated IRQF_DISABLED.
   - Remove deprecated __cpuinit.

  [*1]:
  "On arm and arm64 all Xen guests, including dom0, run with second
   stage translation enabled.  As a consequence when dom0 programs a
   device for a DMA operation is going to use (pseudo) physical
   addresses instead machine addresses.  This work introduces two trees
   to track physical to machine and machine to physical mappings of
   foreign pages.  Local pages are assumed mapped 1:1 (physical address
   == machine address).  It enables the SWIOTLB-Xen driver on ARM and
   ARM64, so that Linux can translate physical addresses to machine
   addresses for dma operations when necessary.  " (Stefano)"

* tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (32 commits)
  xen/arm: pfn_to_mfn and mfn_to_pfn return the argument if nothing is in the p2m
  arm,arm64/include/asm/io.h: define struct bio_vec
  swiotlb-xen: missing include dma-direction.h
  pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI
  arm: make SWIOTLB available
  xen: delete new instances of added __cpuinit
  xen/balloon: Set balloon's initial state to number of existing RAM pages
  xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas.
  xen: remove deprecated IRQF_DISABLED
  x86/xen: remove deprecated IRQF_DISABLED
  swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs
  swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary
  grant-table: call set_phys_to_machine after mapping grant refs
  arm,arm64: do not always merge biovec if we are running on Xen
  swiotlb: print a warning when the swiotlb is full
  swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device
  xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
  tracing/events: Fix swiotlb tracepoint creation
  swiotlb-xen: use xen_alloc/free_coherent_pages
  xen: introduce xen_alloc/free_coherent_pages
  ...
2013-11-15 13:34:37 +09:00
Christoph Hellwig 0a06ff068f kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS
We've switched over every architecture that supports SMP to it, so
remove the new useless config variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:22 +09:00
Kirill A. Shutemov 49076ec2cc mm: dynamically allocate page->ptl if it cannot be embedded to struct page
If split page table lock is in use, we embed the lock into struct page
of table's page.  We have to disable split lock, if spinlock_t is too
big be to be embedded, like when DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC
enabled.

This patch add support for dynamic allocation of split page table lock
if we can't embed it to struct page.

page->ptl is unsigned long now and we use it as spinlock_t if
sizeof(spinlock_t) <= sizeof(long), otherwise it's pointer to spinlock_t.

The spinlock_t allocated in pgtable_page_ctor() for PTE table and in
pgtable_pmd_page_ctor() for PMD table.  All other helpers converted to
support dynamically allocated page->ptl.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:20 +09:00
Kirill A. Shutemov cecbd1b5af x86: handle pgtable_page_ctor() fail
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:19 +09:00
Kirill A. Shutemov 09ef493985 x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
In split page table lock case, we embed spinlock_t into struct page.
For obvious reason, we don't want to increase size of struct page if
spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or
on -rt kernel.  So we disable split page table lock, if spinlock_t is
too big.

This patchset allows to allocate the lock dynamically if spinlock_t is
big.  In this page->ptl is used to store pointer to spinlock instead of
spinlock itself.  It costs additional cache line for indirect access,
but fix page fault scalability for multi-threaded applications.

LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling
LOCK_STAT to analyse scalability issues breaks scalability.  ;)

The patchset mostly fixes this.  Results for ./thp_memscale -c 80 -b 512M
on 4-socket machine:

baseline, no CONFIG_LOCK_STAT:	9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y:	53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT:	8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y:	11.069770759 seconds time elapsed

Patch count is scary, but most of them trivial. Overview:

 Patches 1-4	Few bug fixes. No dependencies to other patches.
		Probably should applied as soon as possible.

 Patch 5	Changes signature of pgtable_page_ctor(). We will use it
		for dynamic lock allocation, so it can fail.

 Patches 6-8	Add missing constructor/destructor calls on few archs.
		It's fixes NR_PAGETABLE accounting and prepare to use
		split ptl.

 Patches 9-33	Add pgtable_page_ctor() fail handling to all archs.

 Patches 34	Finally adds support of dynamically-allocated page->pte.
		Also contains documentation for split page table lock.

This patch (of 34):

I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE
enabled.  Let's add missed constructor/destructor calls.

I haven't noticed it during testing since prep_new_page() clears
page->mapping and therefore page->ptl.  It's effectively equal to
spin_lock_init(&page->ptl).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:15 +09:00
Kirill A. Shutemov 9491846fca x86, mm: enable split page table lock for PMD level
Enable PMD split page table lock for X86_64 and PAE.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:15 +09:00
Kirill A. Shutemov 57c1ffcefb mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS
We're going to introduce split page table lock for PMD level.  Let's
rename existing split ptlock for PTE level to avoid confusion.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:14 +09:00
Rafael J. Wysocki 7b1998116b ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
Modify struct acpi_dev_node to contain a pointer to struct acpi_device
associated with the given device object (that is, its ACPI companion
device) instead of an ACPI handle corresponding to it.  Introduce two
new macros for manipulating that pointer in a CONFIG_ACPI-safe way,
ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
ACPI_HANDLE() macro to take the above changes into account.
Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
use ACPI_COMPANION_SET() instead.  For some of them who used to
pass the result of acpi_get_child() directly to ACPI_HANDLE_SET()
introduce a helper routine acpi_preset_companion() doing an
equivalent thing.

The main motivation for doing this is that there are things
represented by struct acpi_device objects that don't have valid
ACPI handles (so called fixed ACPI hardware features, such as
power and sleep buttons) and we would like to create platform
device objects for them and "glue" them to their ACPI companions
in the usual way (which currently is impossible due to the
lack of valid ACPI handles).  However, there are more reasons
why it may be useful.

First, struct acpi_device pointers allow of much better type checking
than void pointers which are ACPI handles, so it should be more
difficult to write buggy code using modified struct acpi_dev_node
and the new macros.  Second, the change should help to reduce (over
time) the number of places in which the result of ACPI_HANDLE() is
passed to acpi_bus_get_device() in order to obtain a pointer to the
struct acpi_device associated with the given "physical" device,
because now that pointer is returned by ACPI_COMPANION() directly.
Finally, the change should make it easier to write generic code that
will build both for CONFIG_ACPI set and unset without adding explicit
compiler directives to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> # on Haswell
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com> # for ATA and SDIO part
2013-11-14 23:14:43 +01:00
Jesse Barnes 7bd40c16cc x86/early quirk: use gen6 stolen detection for VLV
We've always been able to use either method on VLV, but it appears more
recent BIOSes only support the gen6 method, so switch over to that.

References: https://bugs.freedesktop.org/show_bug.cgi?id=71370
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:32:11 +01:00
Linus Torvalds d320e203ba Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two x86 fixes from Ingo Molnar.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error
  x86/dumpstack: Fix printk_address for direct addresses
2013-11-14 16:55:56 +09:00
Linus Torvalds 5e30025a31 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
 "The biggest changes:

   - add lockdep support for seqcount/seqlocks structures, this
     unearthed both bugs and required extra annotation.

   - move the various kernel locking primitives to the new
     kernel/locking/ directory"

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  block: Use u64_stats_init() to initialize seqcounts
  locking/lockdep: Mark __lockdep_count_forward_deps() as static
  lockdep/proc: Fix lock-time avg computation
  locking/doc: Update references to kernel/mutex.c
  ipv6: Fix possible ipv6 seqlock deadlock
  cpuset: Fix potential deadlock w/ set_mems_allowed
  seqcount: Add lockdep functionality to seqcount/seqlock structures
  net: Explicitly initialize u64_stats_sync structures for lockdep
  locking: Move the percpu-rwsem code to kernel/locking/
  locking: Move the lglocks code to kernel/locking/
  locking: Move the rwsem code to kernel/locking/
  locking: Move the rtmutex code to kernel/locking/
  locking: Move the semaphore core to kernel/locking/
  locking: Move the spinlock code to kernel/locking/
  locking: Move the lockdep code to kernel/locking/
  locking: Move the mutex code to kernel/locking/
  hung_task debugging: Add tracepoint to report the hang
  x86/locking/kconfig: Update paravirt spinlock Kconfig description
  lockstat: Report avg wait and hold times
  lockdep, x86/alternatives: Drop ancient lockdep fixup message
  ...
2013-11-14 16:30:30 +09:00
Linus Torvalds 7971e23a66 Merge branch 'x86-trace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/trace changes from Ingo Molnar:
 "This adds page fault tracepoints which have zero runtime cost in the
  disabled case via IDT trickery (no NOPs in the page fault hotpath)"

* 'x86-trace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, trace: Change user|kernel_page_fault to page_fault_user|kernel
  x86, trace: Add page fault tracepoints
  x86, trace: Delete __trace_alloc_intr_gate()
  x86, trace: Register exception handler to trace IDT
  x86, trace: Remove __alloc_intr_gate()
2013-11-14 16:25:10 +09:00
Linus Torvalds 2f466d33f5 PCI changes for the v3.13 merge window:
Resource management
     - Fix host bridge window coalescing (Alexey Neyman)
     - Pass type, width, and prefetchability for window alignment (Wei Yang)
 
   PCI device hotplug
     - Convert acpiphp, acpiphp_ibm to dynamic debug (Lan Tianyu)
 
   Power management
     - Remove pci_pm_complete() (Liu Chuansheng)
 
   MSI
     - Fail initialization if device is not in PCI_D0 (Yijing Wang)
 
   MPS (Max Payload Size)
     - Use pcie_get_mps() and pcie_set_mps() to simplify code (Yijing Wang)
     - Use pcie_set_readrq() to simplify code (Yijing Wang)
     - Use cached pci_dev->pcie_mpss to simplify code (Yijing Wang)
 
   SR-IOV
     - Enable upstream bridges even for VFs on virtual buses (Bjorn Helgaas)
     - Use pci_is_root_bus() to avoid catching virtual buses (Wei Yang)
 
   Virtualization
     - Add x86 MSI masking ops (Konrad Rzeszutek Wilk)
 
   Freescale i.MX6
     - Support i.MX6 PCIe controller (Sean Cross)
     - Increase link startup timeout (Marek Vasut)
     - Probe PCIe in fs_initcall() (Marek Vasut)
     - Fix imprecise abort handler (Tim Harvey)
     - Remove redundant of_match_ptr (Sachin Kamat)
 
   Renesas R-Car
     - Support Gen2 internal PCIe controller (Valentine Barshak)
 
   Samsung Exynos
     - Add MSI support (Jingoo Han)
     - Turn off power when link fails (Jingoo Han)
     - Add Jingoo Han as maintainer (Jingoo Han)
     - Add clk_disable_unprepare() on error path (Wei Yongjun)
     - Remove redundant of_match_ptr (Sachin Kamat)
 
   Synopsys DesignWare
     - Add irq_create_mapping() (Pratyush Anand)
     - Add header guards (Seungwon Jeon)
 
   Miscellaneous
     - Enable native PCIe services by default on non-ACPI (Andrew Murray)
     - Cleanup _OSC usage and messages (Bjorn Helgaas)
     - Remove pcibios_last_bus boot option on non-x86 (Bjorn Helgaas)
     - Convert bus code to use bus_, drv_, and dev_groups (Greg Kroah-Hartman)
     - Remove unused pci_mem_start (Myron Stowe)
     - Make sysfs functions static (Sachin Kamat)
     - Warn on invalid return from driver probe (Stephen M. Cameron)
     - Remove Intel Haswell D3 delays (Todd E Brandt)
     - Call pci_set_master() in core if driver doesn't do it (Yinghai Lu)
     - Use pci_is_pcie() to simplify code (Yijing Wang)
     - Use PCIe capability accessors to simplify code (Yijing Wang)
     - Use cached pci_dev->pcie_cap to simplify code (Yijing Wang)
     - Removed unused "is_pcie" from struct pci_dev (Yijing Wang)
     - Simplify sysfs CPU affinity implementation (Yijing Wang))
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJSgUzsAAoJEFmIoMA60/r8wmsQAJhwmtkUYR2L4T1g9smAyjJz
 bLm5zoC6WdywFcbTpTBfsTrS1CHIQG5akRgkEXGdr99epiho5F2lwmagWsUR4ijL
 39Qn3knAUMgtNjoVXXI106h/DfTyxSmkZBfih2AQFyWobJq+0kg7hjQQA3+836b4
 8ssWr1+NSl6JJTqYQ0Paw1kSqvvYoXsu5rWFEfCHk8D0s/1bvr5ldAUpk2jTg93I
 uo9/5+O264yt1YoKZOMqAMZLUfd5DaWY1mV3yeF0Uauy1pBmol5csE8ckqJPDrES
 PRdJT1+PhBeLYWcgXANOBZsW58ddxA0pQ5jQV6VJHQWsm5cE82OBpYJf6xUZ2moV
 o6DZ0KRnCPVA3NllYYR16H+wbMfADwwO83QoA+QTIZJy/WgpDH3Cst+m8KePGqbL
 uFgDdXSws9Bs1BCFs7bfYzAM3OdkBFnn+ac7JoPXKP5ibgAp9nDlurgK2r90zRnp
 j15vHMx0mV+e8B8/iwiW5eRtg7NoCHYiNfFy7JalOlsPmYr2KFazBVKclp13Hng7
 fe/Jy6X4UhWoQPdqsy4ftvSQb0gm1MClxFJeZ3VAt6LY9j8OP6S/Vdf6lpAL85KR
 lAQoQzB+lOhTPdXxFY2xgGkITkqPDOQMjPfowYUYFwybqBuG6BHXZPJobL+niBlb
 Nh+M2WlUUA9Z3V6rWJB6
 =CTPk
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI changes from Bjorn Helgaas:
 "Resource management
    - Fix host bridge window coalescing (Alexey Neyman)
    - Pass type, width, and prefetchability for window alignment (Wei Yang)

  PCI device hotplug
    - Convert acpiphp, acpiphp_ibm to dynamic debug (Lan Tianyu)

  Power management
    - Remove pci_pm_complete() (Liu Chuansheng)

  MSI
    - Fail initialization if device is not in PCI_D0 (Yijing Wang)

  MPS (Max Payload Size)
    - Use pcie_get_mps() and pcie_set_mps() to simplify code (Yijing Wang)
    - Use pcie_set_readrq() to simplify code (Yijing Wang)
    - Use cached pci_dev->pcie_mpss to simplify code (Yijing Wang)

  SR-IOV
    - Enable upstream bridges even for VFs on virtual buses (Bjorn Helgaas)
    - Use pci_is_root_bus() to avoid catching virtual buses (Wei Yang)

  Virtualization
    - Add x86 MSI masking ops (Konrad Rzeszutek Wilk)

  Freescale i.MX6
    - Support i.MX6 PCIe controller (Sean Cross)
    - Increase link startup timeout (Marek Vasut)
    - Probe PCIe in fs_initcall() (Marek Vasut)
    - Fix imprecise abort handler (Tim Harvey)
    - Remove redundant of_match_ptr (Sachin Kamat)

  Renesas R-Car
    - Support Gen2 internal PCIe controller (Valentine Barshak)

  Samsung Exynos
    - Add MSI support (Jingoo Han)
    - Turn off power when link fails (Jingoo Han)
    - Add Jingoo Han as maintainer (Jingoo Han)
    - Add clk_disable_unprepare() on error path (Wei Yongjun)
    - Remove redundant of_match_ptr (Sachin Kamat)

  Synopsys DesignWare
    - Add irq_create_mapping() (Pratyush Anand)
    - Add header guards (Seungwon Jeon)

  Miscellaneous
    - Enable native PCIe services by default on non-ACPI (Andrew Murray)
    - Cleanup _OSC usage and messages (Bjorn Helgaas)
    - Remove pcibios_last_bus boot option on non-x86 (Bjorn Helgaas)
    - Convert bus code to use bus_, drv_, and dev_groups (Greg Kroah-Hartman)
    - Remove unused pci_mem_start (Myron Stowe)
    - Make sysfs functions static (Sachin Kamat)
    - Warn on invalid return from driver probe (Stephen M. Cameron)
    - Remove Intel Haswell D3 delays (Todd E Brandt)
    - Call pci_set_master() in core if driver doesn't do it (Yinghai Lu)
    - Use pci_is_pcie() to simplify code (Yijing Wang)
    - Use PCIe capability accessors to simplify code (Yijing Wang)
    - Use cached pci_dev->pcie_cap to simplify code (Yijing Wang)
    - Removed unused "is_pcie" from struct pci_dev (Yijing Wang)
    - Simplify sysfs CPU affinity implementation (Yijing Wang)"

* tag 'pci-v3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (79 commits)
  PCI: Enable upstream bridges even for VFs on virtual buses
  PCI: Add pci_upstream_bridge()
  PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
  PCI: Warn on driver probe return value greater than zero
  PCI: Drop warning about drivers that don't use pci_set_master()
  PCI: Workaround missing pci_set_master in pci drivers
  powerpc/pci: Use pci_is_pcie() to simplify code [fix]
  PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms
  PCI: imx6: Probe the PCIe in fs_initcall()
  PCI: Add R-Car Gen2 internal PCI support
  PCI: imx6: Remove redundant of_match_ptr
  PCI: Report pci_pme_active() kmalloc failure
  mn10300/PCI: Remove useless pcibios_last_bus
  frv/PCI: Remove pcibios_last_bus
  PCI: imx6: Increase link startup timeout
  PCI: exynos: Remove redundant of_match_ptr
  PCI: imx6: Fix imprecise abort handler
  PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
  PCI: imx6: Remove redundant dev_err() in imx6_pcie_probe()
  x86/PCI: Coalesce multiple overlapping host bridge windows
  ...
2013-11-14 14:02:00 +09:00
Linus Torvalds f9300eaaac ACPI and power management updates for 3.13-rc1
- New power capping framework and the the Intel Running Average Power
    Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.
 
  - Addition of the in-kernel switching feature to the arm_big_little
    cpufreq driver from Viresh Kumar and Nicolas Pitre.
 
  - cpufreq support for iMac G5 from Aaro Koskinen.
 
  - Baytrail processors support for intel_pstate from Dirk Brandewie.
 
  - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.
 
  - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.
 
  - ACPI power management support for the I2C and SPI bus types from
    Mika Westerberg and Lv Zheng.
 
  - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
    Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.
 
  - cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar,
    Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski,
    Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.
 
  - intel_pstate updates from Dirk Brandewie and Adrian Huang.
 
  - ACPICA update to version 20130927 includig fixes and cleanups and
    some reduction of divergences between the ACPICA code in the kernel
    and ACPICA upstream in order to improve the automatic ACPICA patch
    generation process.  From Bob Moore, Lv Zheng, Tomasz Nowicki,
    Naresh Bhat, Bjorn Helgaas, David E Box.
 
  - ACPI IPMI driver fixes and cleanups from Lv Zheng.
 
  - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani,
    Zhang Yanfei, Rafael J Wysocki.
 
  - Conversion of the ACPI AC driver to the platform bus type and
    multiple driver fixes and cleanups related to ACPI from Zhang Rui.
 
  - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
    Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.
 
  - Fixes and cleanups and new blacklist entries related to the ACPI
    video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
    Kirill Tkhai.
 
  - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.
 
  - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
    Bartlomiej Zolnierkiewicz, Prarit Bhargava.
 
  - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.
 
  - Operation Performance Points (OPP) core updates from Nishanth Menon.
 
  - Runtime power management core fix from Rafael J Wysocki and update
    from Ulf Hansson.
 
  - Hibernation fixes from Aaron Lu and Rafael J Wysocki.
 
  - Device suspend/resume lockup detection mechanism from Benoit Goby.
 
  - Removal of unused proc directories created for various ACPI drivers
    from Lan Tianyu.
 
  - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
    handler from Heikki Krogerus and Jarkko Nikula.
 
  - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.
 
  - Assorted fixes and cleanups related to ACPI from Andy Shevchenko,
    Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
    Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
    Liu Chuansheng.
 
  - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
    Jean-Christophe Plagniol-Villard.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABCAAGBQJSfPKLAAoJEILEb/54YlRxH6YQAJwDKi25RCZziFSIenXuqzC/
 c6JxoH/tSnDHJHhcTgqh7H7Raa+zmatMDf0m2oEv2Wjfx4Lt4BQK4iefhe/zY4lX
 yJ8uXDg+U8DYhDX2XwbwnFpd1M1k/A+s2gIHDTHHGnE0kDngXdd8RAFFktBmooTZ
 l5LBQvOrTlgX/ZfqI/MNmQ6lfY6kbCABGSHV1tUUsDA6Kkvk/LAUTOMSmptv1q22
 hcs6k55vR34qADPkUX5GghjmcYJv+gNtvbDEJUjcmCwVoPWouF415m7R5lJ8w3/M
 49Q8Tbu5HELWLwca64OorS8qh/P7sgUOf1BX5IDzHnJT+TGeDfvcYbMv2Z275/WZ
 /bqhuLuKBpsHQ2wvEeT+lYV3FlifKeTf1FBxER3ApjzI3GfpmVVQ+dpEu8e9hcTh
 ZTPGzziGtoIsHQ0unxb+zQOyt1PmIk+cU4IsKazs5U20zsVDMcKzPrb19Od49vMX
 gCHvRzNyOTqKWpE83Ss4NGOVPAG02AXiXi/BpuYBHKDy6fTH/liKiCw5xlCDEtmt
 lQrEbupKpc/dhCLo5ws6w7MZzjWJs2eSEQcNR4DlR++pxIpYOOeoPTXXrghgZt2X
 mmxZI2qsJ7GAvPzII8OBeF3CRO3fabZ6Nez+M+oEZjGe05ZtpB3ccw410HwieqBn
 dYpJFt/BHK189odhV9CM
 =JCxk
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael J Wysocki:

 - New power capping framework and the the Intel Running Average Power
   Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.

 - Addition of the in-kernel switching feature to the arm_big_little
   cpufreq driver from Viresh Kumar and Nicolas Pitre.

 - cpufreq support for iMac G5 from Aaro Koskinen.

 - Baytrail processors support for intel_pstate from Dirk Brandewie.

 - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.

 - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.

 - ACPI power management support for the I2C and SPI bus types from Mika
   Westerberg and Lv Zheng.

 - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
   Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.

 - cpufreq drivers updates (mostly fixes and cleanups) from Viresh
   Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz
   Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.

 - intel_pstate updates from Dirk Brandewie and Adrian Huang.

 - ACPICA update to version 20130927 includig fixes and cleanups and
   some reduction of divergences between the ACPICA code in the kernel
   and ACPICA upstream in order to improve the automatic ACPICA patch
   generation process.  From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh
   Bhat, Bjorn Helgaas, David E Box.

 - ACPI IPMI driver fixes and cleanups from Lv Zheng.

 - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang
   Yanfei, Rafael J Wysocki.

 - Conversion of the ACPI AC driver to the platform bus type and
   multiple driver fixes and cleanups related to ACPI from Zhang Rui.

 - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
   Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.

 - Fixes and cleanups and new blacklist entries related to the ACPI
   video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
   Kirill Tkhai.

 - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.

 - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
   Bartlomiej Zolnierkiewicz, Prarit Bhargava.

 - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.

 - Operation Performance Points (OPP) core updates from Nishanth Menon.

 - Runtime power management core fix from Rafael J Wysocki and update
   from Ulf Hansson.

 - Hibernation fixes from Aaron Lu and Rafael J Wysocki.

 - Device suspend/resume lockup detection mechanism from Benoit Goby.

 - Removal of unused proc directories created for various ACPI drivers
   from Lan Tianyu.

 - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
   handler from Heikki Krogerus and Jarkko Nikula.

 - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.

 - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al
   Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
   Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
   Liu Chuansheng.

 - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
   Jean-Christophe Plagniol-Villard.

* tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits)
  cpufreq: conservative: fix requested_freq reduction issue
  ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines
  PM / runtime: Use pm_runtime_put_sync() in __device_release_driver()
  ACPI / event: remove unneeded NULL pointer check
  Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1"
  ACPI / video: Quirk initial backlight level 0
  ACPI / video: Fix initial level validity test
  intel_pstate: skip the driver if ACPI has power mgmt option
  PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
  ACPI / hotplug: Do not execute "insert in progress" _OST
  ACPI / hotplug: Carry out PCI root eject directly
  ACPI / hotplug: Merge device hot-removal routines
  ACPI / hotplug: Make acpi_bus_hot_remove_device() internal
  ACPI / hotplug: Simplify device ejection routines
  ACPI / hotplug: Fix handle_root_bridge_removal()
  ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug
  ACPI / scan: Start matching drivers after trying scan handlers
  ACPI: Remove acpi_pci_slot_init() headers from internal.h
  ACPI / blacklist: fix name of ThinkPad Edge E530
  PowerCap: Fix build error with option -Werror=format-security
  ...

Conflicts:
	arch/arm/mach-omap2/opp.c
	drivers/Kconfig
	drivers/spi/spi.c
2013-11-14 13:41:48 +09:00
Thomas Gleixner 00d1a39e69 preempt: Make PREEMPT_ACTIVE generic
No point in having this bit defined by architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130917183629.090698799@linutronix.de
2013-11-13 20:21:47 +01:00
Anthoine Bourgeois e504c9098e kvm, vmx: Fix lazy FPU on nested guest
If a nested guest does a NM fault but its CR0 doesn't contain the TS
flag (because it was already cleared by the guest with L1 aid) then we
have to activate FPU ourselves in L0 and then continue to L2. If TS flag
is set then we fallback on the previous behavior, forward the fault to
L1 if it asked for.

Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-11-13 18:46:54 +01:00
Linus Torvalds 42a2d923cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) The addition of nftables.  No longer will we need protocol aware
    firewall filtering modules, it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual
    machine that executes byte codes to inspect packet or metadata
    (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the
    interpreter supports lookups in various datastructures as
    fundamental operations.  For example sets are supports, and
    therefore one could create a set of whitelist IP address entries
    which have ACCEPT verdicts attached to them, and use the appropriate
    byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can
    do things like optimize things before giving it to the kernel.

    Another major improvement is the capability of atomically updating
    portions of the ruleset.  In the existing netfilter implementation,
    one has to update the entire rule set in order to make a change and
    this is very expensive.

    Userspace tools exist to create nftables rules using existing
    netfilter rule sets, but both kernel implementations will need to
    co-exist for quite some time as we transition from the old to the
    new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
    worked so hard on this.

 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
    to our pseudo-random number generator, mostly used for things like
    UDP port randomization and netfitler, amongst other things.

    In particular the taus88 generater is updated to taus113, and test
    cases are added.

 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
    and Yang Yingliang.

 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
    Sujir.

 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
    Neal Cardwell, and Yuchung Cheng.

 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
    control message data, much like other socket option attributes.
    From Francesco Fusco.

 7) Allow applications to specify a cap on the rate computed
    automatically by the kernel for pacing flows, via a new
    SO_MAX_PACING_RATE socket option.  From Eric Dumazet.

 8) Make the initial autotuned send buffer sizing in TCP more closely
    reflect actual needs, from Eric Dumazet.

 9) Currently early socket demux only happens for TCP sockets, but we
    can do it for connected UDP sockets too.  Implementation from Shawn
    Bohrer.

10) Refactor inet socket demux with the goal of improving hash demux
    performance for listening sockets.  With the main goals being able
    to use RCU lookups on even request sockets, and eliminating the
    listening lock contention.  From Eric Dumazet.

11) The bonding layer has many demuxes in it's fast path, and an RCU
    conversion was started back in 3.11, several changes here extend the
    RCU usage to even more locations.  From Ding Tianhong and Wang
    Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
    Falico.

12) Allow stackability of segmentation offloads to, in particular, allow
    segmentation offloading over tunnels.  From Eric Dumazet.

13) Significantly improve the handling of secret keys we input into the
    various hash functions in the inet hashtables, TCP fast open, as
    well as syncookies.  From Hannes Frederic Sowa.  The key fundamental
    operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and
    our generic flow dissector.

14) The generic driver layer takes care now to set the driver data to
    NULL on device removal, so it's no longer necessary for drivers to
    explicitly set it to NULL any more.  Many drivers have been cleaned
    up in this way, from Jingoo Han.

15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

16) Improve CRC32 interfaces and generic SKB checksum iterators so that
    SCTP's checksumming can more cleanly be handled.  Also from Daniel
    Borkmann.

17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
    using the interface MTU value.  This helps avoid PMTU attacks,
    particularly on DNS servers.  From Hannes Frederic Sowa.

18) Use generic XPS for transmit queue steering rather than internal
    (re-)implementation in virtio-net.  From Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  random32: add test cases for taus113 implementation
  random32: upgrade taus88 generator to taus113 from errata paper
  random32: move rnd_state to linux/random.h
  random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
  random32: add periodic reseeding
  random32: fix off-by-one in seeding requirement
  PHY: Add RTL8201CP phy_driver to realtek
  xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
  macmace: add missing platform_set_drvdata() in mace_probe()
  ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
  ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
  vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
  ixgbe: add warning when max_vfs is out of range.
  igb: Update link modes display in ethtool
  netfilter: push reasm skb through instead of original frag skbs
  ip6_output: fragment outgoing reassembled skb properly
  MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
  net_sched: tbf: support of 64bit rates
  ixgbe: deleting dfwd stations out of order can cause null ptr deref
  ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
  ...
2013-11-13 17:40:34 +09:00
Linus Torvalds 5cbb3d216e Merge branch 'akpm' (patches from Andrew Morton)
Merge first patch-bomb from Andrew Morton:
 "Quite a lot of other stuff is banked up awaiting further
  next->mainline merging, but this batch contains:

   - Lots of random misc patches
   - OCFS2
   - Most of MM
   - backlight updates
   - lib/ updates
   - printk updates
   - checkpatch updates
   - epoll tweaking
   - rtc updates
   - hfs
   - hfsplus
   - documentation
   - procfs
   - update gcov to gcc-4.7 format
   - IPC"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (269 commits)
  ipc, msg: fix message length check for negative values
  ipc/util.c: remove unnecessary work pending test
  devpts: plug the memory leak in kill_sb
  ./Makefile: export initial ramdisk compression config option
  init/Kconfig: add option to disable kernel compression
  drivers: w1: make w1_slave::flags long to avoid memory corruption
  drivers/w1/masters/ds1wm.cuse dev_get_platdata()
  drivers/memstick/core/ms_block.c: fix unreachable state in h_msb_read_page()
  drivers/memstick/core/mspro_block.c: fix attributes array allocation
  drivers/pps/clients/pps-gpio.c: remove redundant of_match_ptr
  kernel/panic.c: reduce 1 byte usage for print tainted buffer
  gcov: reuse kbasename helper
  kernel/gcov/fs.c: use pr_warn()
  kernel/module.c: use pr_foo()
  gcov: compile specific gcov implementation based on gcc version
  gcov: add support for gcc 4.7 gcov format
  gcov: move gcov structs definitions to a gcc version specific file
  kernel/taskstats.c: return -ENOMEM when alloc memory fails in add_del_listener()
  kernel/taskstats.c: add nla_nest_cancel() for failure processing between nla_nest_start() and nla_nest_end()
  kernel/sysctl_binary.c: use scnprintf() instead of snprintf()
  ...
2013-11-13 15:45:43 +09:00
Linus Torvalds 9bc9ccd7db Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "All kinds of stuff this time around; some more notable parts:

   - RCU'd vfsmounts handling
   - new primitives for coredump handling
   - files_lock is gone
   - Bruce's delegations handling series
   - exportfs fixes

  plus misc stuff all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
  ecryptfs: ->f_op is never NULL
  locks: break delegations on any attribute modification
  locks: break delegations on link
  locks: break delegations on rename
  locks: helper functions for delegation breaking
  locks: break delegations on unlink
  namei: minor vfs_unlink cleanup
  locks: implement delegations
  locks: introduce new FL_DELEG lock flag
  vfs: take i_mutex on renamed file
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: pull ext4's double-i_mutex-locking into common code
  exportfs: fix quadratic behavior in filehandle lookup
  exportfs: better variable name
  exportfs: move most of reconnect_path to helper function
  exportfs: eliminate unused "noprogress" counter
  exportfs: stop retrying once we race with rename/remove
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: more detailed comment for path_reconnect
  ...
2013-11-13 15:34:18 +09:00
Linus Torvalds c08acff054 Merge branch 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu changes from Tejun Heo:
 "Two smallish changes for percpu.  Two patches to remove unused
  this_cpu_xor() and one to fix a bug in percpu init failure path so
  that it can reach the proper BUG() instead of oopsing earlier"

* 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  x86: remove this_cpu_xor() implementation
  percpu: remove this_cpu_xor() implementation
  percpu: fix bootmem error handling in pcpu_page_first_chunk()
2013-11-13 15:17:16 +09:00
Vineet Gupta c375f15a43 x86: move fpu_counter into ARCH specific thread_struct
Only a couple of arches (sh/x86) use fpu_counter in task_struct so it can
be moved out into ARCH specific thread_struct, reducing the size of
task_struct for other arches.

Compile tested i386_defconfig + gcc 4.7.3

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:13 +09:00
Zhi Yong Wu d4dd100f2e arch/x86/mm/init.c: fix incorrect function name in alloc_low_pages()
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:11 +09:00
Tang Chen c5320926e3 mem-hotplug: introduce movable_node boot option
The hot-Pluggable field in SRAT specifies which memory is hotpluggable.
As we mentioned before, if hotpluggable memory is used by the kernel, it
cannot be hot-removed.  So memory hotplug users may want to set all
hotpluggable memory in ZONE_MOVABLE so that the kernel won't use it.

Memory hotplug users may also set a node as movable node, which has
ZONE_MOVABLE only, so that the whole node can be hot-removed.

But the kernel cannot use memory in ZONE_MOVABLE.  By doing this, the
kernel cannot use memory in movable nodes.  This will cause NUMA
performance down.  And other users may be unhappy.

So we need a way to allow users to enable and disable this functionality.
In this patch, we introduce movable_node boot option to allow users to
choose to not to consume hotpluggable memory at early boot time and later
we can set it as ZONE_MOVABLE.

To achieve this, the movable_node boot option will control the memblock
allocation direction.  That said, after memblock is ready, before SRAT is
parsed, we should allocate memory near the kernel image as we explained in
the previous patches.  So if movable_node boot option is set, the kernel
does the following:

1. After memblock is ready, make memblock allocate memory bottom up.
2. After SRAT is parsed, make memblock behave as default, allocate memory
   top down.

Users can specify "movable_node" in kernel commandline to enable this
functionality.  For those who don't use memory hotplug or who don't want
to lose their NUMA performance, just don't specify anything.  The kernel
will work as before.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Suggested-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:09 +09:00
Tang Chen fa591c4ae7 x86, acpi, crash, kdump: do reserve_crashkernel() after SRAT is parsed.
Memory reserved for crashkernel could be large.  So we should not allocate
this memory bottom up from the end of kernel image.

When SRAT is parsed, we will be able to know which memory is hotpluggable,
and we can avoid allocating this memory for the kernel.  So reorder
reserve_crashkernel() after SRAT is parsed.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:08 +09:00
Tang Chen b959ed6c73 x86/mem-hotplug: support initialize page tables in bottom-up
The Linux kernel cannot migrate pages used by the kernel.  As a result,
kernel pages cannot be hot-removed.  So we cannot allocate hotpluggable
memory for the kernel.

In a memory hotplug system, any numa node the kernel resides in should be
unhotpluggable.  And for a modern server, each node could have at least
16GB memory.  So memory around the kernel image is highly likely
unhotpluggable.

ACPI SRAT (System Resource Affinity Table) contains the memory hotplug
info.  But before SRAT is parsed, memblock has already started to allocate
memory for the kernel.  So we need to prevent memblock from doing this.

So direct memory mapping page tables setup is the case.
init_mem_mapping() is called before SRAT is parsed.  To prevent page
tables being allocated within hotpluggable memory, we will use bottom-up
direction to allocate page tables from the end of kernel image to the
higher memory.

Note:
As for allocating page tables in lower memory, TJ said:

: This is an optional behavior which is triggered by a very specific kernel
: boot param, which I suspect is gonna need to stick around to support
: memory hotplug in the current setup unless we add another layer of address
: translation to support memory hotplug.

As for page tables may occupy too much lower memory if using 4K mapping
(CONFIG_DEBUG_PAGEALLOC and CONFIG_KMEMCHECK both disable using >4k
pages), TJ said:

: But as I said in the same paragraph, parsing SRAT earlier doesn't solve
: the problem in itself either.  Ignoring the option if 4k mapping is
: required and memory consumption would be prohibitive should work, no?
: Something like that would be necessary if we're gonna worry about cases
: like this no matter how we implement it, but, frankly, I'm not sure this
: is something worth worrying about.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:08 +09:00
Tang Chen 0167d7d8b0 x86/mm: factor out of top-down direct mapping setup
Create a new function memory_map_top_down to factor out of the top-down
direct memory mapping pagetable setup.  This is also a preparation for the
following patch, which will introduce the bottom-up memory mapping.  That
said, we will put the two ways of pagetable setup into separate functions,
and choose to use which way in init_mem_mapping, which makes the code more
clear.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:08 +09:00
Jianguo Wu 40c3baa7c6 mm/arch: use NUMA_NO_NODE
Use more appropriate NUMA_NO_NODE instead of -1 in all archs' module_alloc()

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:05 +09:00
Len Brown 144b44b135 tools / power turbostat: Support Silvermont
Support the next generation Intel Atom processor
mirco-architecture, formerly called Silvermont.

The server version, formerly called "Avoton",
is named the "Intel(R) Atom(TM) Processor C2000 Product Family".

The client version, formerly called "Bay Trail",
is named the "Intel Atom Processor Z3000 Series",
as well as various "Intel Pentium Processor"
and "Intel Celeron Processor" brands, depending
on form-factor.

Silvermont has a set of MSRs not far off from NHM,
but the RAPL register set is a sub-set of those previously supported.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-12 23:16:02 +01:00
Thomas Renninger 11f918d3e2 x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error
Do it the same way as done in microcode_intel.c: use pr_debug()
for missing firmware files.

There seem to be CPUs out there for which no microcode update
has been submitted to kernel-firmware repo yet resulting in
scary sounding error messages in dmesg:

  microcode: failed to load file amd-ucode/microcode_amd_fam16h.bin

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1384274383-43510-1-git-send-email-trenn@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-12 22:03:49 +01:00
Jiri Slaby 5f01c98859 x86/dumpstack: Fix printk_address for direct addresses
Consider a kernel crash in a module, simulated the following way:

 static int my_init(void)
 {
         char *map = (void *)0x5;
         *map = 3;
         return 0;
 }
 module_init(my_init);

When we turn off FRAME_POINTERs, the very first instruction in
that function causes a BUG. The problem is that we print IP in
the BUG report using %pB (from printk_address). And %pB
decrements the pointer by one to fix printing addresses of
functions with tail calls.

This was added in commit 71f9e59800 ("x86, dumpstack: Use
%pB format specifier for stack trace") to fix the call stack
printouts.

So instead of correct output:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
  IP: [<ffffffffa01ac000>] my_init+0x0/0x10 [pb173]

We get:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
  IP: [<ffffffffa0152000>] 0xffffffffa0151fff

To fix that, we use %pS only for stack addresses printouts (via
newly added printk_stack_address) and %pB for regs->ip (via
printk_address). I.e. we revert to the old behaviour for all
except call stacks. And since from all those reliable is 1, we
remove that parameter from printk_address.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: joe@perches.com
Cc: jirislaby@gmail.com
Link: http://lkml.kernel.org/r/1382706418-8435-1-git-send-email-jslaby@suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-12 21:06:06 +01:00
Kees Cook 327f7d7245 x86, kaslr: Use char array to gain sizeof sanity
The build_str needs to be char [] not char * for the sizeof() to report
the string length.

Reported-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20131112165607.GA5921@www.outflux.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-11-12 08:58:35 -08:00
Linus Torvalds 10d0c9705e DeviceTree updates for 3.13. This is a bit larger pull request than
usual for this cycle with lots of clean-up.
 
 - Cross arch clean-up and consolidation of early DT scanning code.
 - Clean-up and removal of arch prom.h headers. Makes arch specific
   prom.h optional on all but Sparc.
 - Addition of interrupts-extended property for devices connected to
   multiple interrupt controllers.
 - Refactoring of DT interrupt parsing code in preparation for deferred
   probe of interrupts.
 - ARM cpu and cpu topology bindings documentation.
 - Various DT vendor binding documentation updates.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJSgPQ4AAoJEMhvYp4jgsXif28H/1WkrXq5+lCFQZF8nbYdE2h0
 R8PsfiJJmAl6/wFgQTsRel+ScMk2hiP08uTyqf2RLnB1v87gCF7MKVaLOdONfUDi
 huXbcQGWCmZv0tbBIklxJe3+X3FIJch4gnyUvPudD1m8a0R0LxWXH/NhdTSFyB20
 PNjhN/IzoN40X1PSAhfB5ndWnoxXBoehV/IVHVDU42vkPVbVTyGAw5qJzHW8CLyN
 2oGTOalOO4ffQ7dIkBEQfj0mrgGcODToPdDvUQyyGZjYK2FY2sGrjyquir6SDcNa
 Q4gwatHTu0ygXpyphjtQf5tc3ZCejJ/F0s3olOAS1ahKGfe01fehtwPRROQnCK8=
 =GCbY
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "DeviceTree updates for 3.13.  This is a bit larger pull request than
  usual for this cycle with lots of clean-up.

   - Cross arch clean-up and consolidation of early DT scanning code.
   - Clean-up and removal of arch prom.h headers.  Makes arch specific
     prom.h optional on all but Sparc.
   - Addition of interrupts-extended property for devices connected to
     multiple interrupt controllers.
   - Refactoring of DT interrupt parsing code in preparation for
     deferred probe of interrupts.
   - ARM cpu and cpu topology bindings documentation.
   - Various DT vendor binding documentation updates"

* tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
  powerpc: add missing explicit OF includes for ppc
  dt/irq: add empty of_irq_count for !OF_IRQ
  dt: disable self-tests for !OF_IRQ
  of: irq: Fix interrupt-map entry matching
  MIPS: Netlogic: replace early_init_devtree() call
  of: Add Panasonic Corporation vendor prefix
  of: Add Chunghwa Picture Tubes Ltd. vendor prefix
  of: Add AU Optronics Corporation vendor prefix
  of/irq: Fix potential buffer overflow
  of/irq: Fix bug in interrupt parsing refactor.
  of: set dma_mask to point to coherent_dma_mask
  of: add vendor prefix for PHYTEC Messtechnik GmbH
  DT: sort vendor-prefixes.txt
  of: Add vendor prefix for Cadence
  of: Add empty for_each_available_child_of_node() macro definition
  arm/versatile: Fix versatile irq specifications.
  of/irq: create interrupts-extended property
  microblaze/pci: Drop PowerPC-ism from irq parsing
  of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
  of/irq: Use irq_of_parse_and_map()
  ...
2013-11-12 16:52:17 +09:00
H. Peter Anvin e8236c4d93 x86, kaslr: Add a circular multiply for better bit diffusion
If we don't have RDRAND (in which case nothing else *should* matter),
most sources have a highly biased entropy distribution.  Use a
circular multiply to diffuse the entropic bits.  A circular multiply
is a good operation for this: it is cheap on standard hardware and
because it is symmetric (unlike an ordinary multiply) it doesn't
introduce its own bias.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20131111222839.GA28616@www.outflux.net
2013-11-11 23:05:49 -08:00
Kees Cook a653f3563c x86, kaslr: Mix entropy sources together as needed
Depending on availability, mix the RDRAND and RDTSC entropy together with
XOR. Only when neither is available should the i8254 be used. Update
the Kconfig documentation to reflect this. Additionally, since bits
used for entropy is masked elsewhere, drop the needless masking in
the get_random_long(). Similarly, use the entire TSC, not just the low
32 bits.

Finally, to improve the starting entropy, do a simple hashing of a
build-time versions string and the boot-time boot_params structure for
some additional level of unpredictability.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20131111222839.GA28616@www.outflux.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-11-11 22:29:44 -08:00
Linus Torvalds 9b66bfb280 Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 UV debug changes from Ingo Molnar:
 "Various SGI UV debuggability improvements, amongst them KDB support,
  with related core KDB enabling patches changing kernel/debug/kdb/"

* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "x86/UV: Add uvtrace support"
  x86/UV: Add call to KGDB/KDB from NMI handler
  kdb: Add support for external NMI handler to call KGDB/KDB
  x86/UV: Check for alloc_cpumask_var() failures properly in uv_nmi_setup()
  x86/UV: Add uvtrace support
  x86/UV: Add kdump to UV NMI handler
  x86/UV: Add summary of cpu activity to UV NMI handler
  x86/UV: Update UV support for external NMI signals
  x86/UV: Move NMI support
2013-11-12 12:01:14 +09:00
Linus Torvalds c2136301e4 Merge branch 'x86-uaccess-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 uaccess changes from Ingo Molnar:
 "A single change that micro-optimizes __copy_*_user_inatomic(), used by
  the futex code"

* 'x86-uaccess-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic
2013-11-12 11:46:06 +09:00
Linus Torvalds 986189f9ec Merge branch 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 reboot changes from Ingo Molnar:
 "Misc changes - the only one with functional impact should be commit
  16c21ae5ca ("reboot: Allow specifying warm/cold reset for CF9 boot
  type") which extends cold/warm reboot handling to the 0xCF9 reboot
  method"

* 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/reboot: Correct pr_info() log message in the set_bios/pci/kbd_reboot()
  x86/reboot: Sort reboot DMI quirks by vendor
  x86/reboot: Remove the duplicate C6100 entry in the reboot quirks list
  reboot: Allow specifying warm/cold reset for CF9 boot type
2013-11-12 11:33:38 +09:00
Linus Torvalds 2934578e28 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform fixlet from Ingo Molnar:
 "A single __initdata fix"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/geode: Fix incorrect placement of __initdata tag
2013-11-12 11:23:44 +09:00
Linus Torvalds 2612c49db3 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm fixlet from Ingo Molnar:
 "One cleanup that documents a particular detail in init_mem_mapping()"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Add 'step_size' comments to init_mem_mapping()
2013-11-12 11:20:52 +09:00
Linus Torvalds 340286cd4e Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS changes from Ingo Molnar:
 "The biggest change adds support for Intel 'CPER' (UEFI Common Platform
  Error Record) error logging, which builds upon an enhanced error
  logging mechanism available on Xeon processors.

  Full description is here:

    http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html

  This change provides a module (and support code) to check for an
  extended error log and prints extra details about the error on the
  console"

* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ACPI, x86: Fix extended error log driver to depend on CONFIG_X86_LOCAL_APIC
  dmi: Avoid unaligned memory access in save_mem_devices()
  Move cper.c from drivers/acpi/apei to drivers/firmware/efi
  EDAC, GHES: Update ghes error record info
  ACPI, APEI, CPER: Cleanup CPER memory error output format
  ACPI, APEI, CPER: Enhance memory reporting capability
  ACPI, APEI, CPER: Add UEFI 2.4 support for memory error
  DMI: Parse memory device (type 17) in SMBIOS
  ACPI, x86: Extended error log driver for x86 platform
  bitops: Introduce a more generic BITMASK macro
  ACPI, CPER: Update cper info
  ACPI, APEI, CPER: Fix status check during error printing
2013-11-12 11:16:44 +09:00
Linus Torvalds 339a4b72c8 Merge branch 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 iommu changes from Ingo Molnar:
 "Make it easier to turn off the old AMD GART code"

* 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/iommu: Clean up the CONFIG_GART_IOMMU config option a bit
  x86/iommu: Don't make AMD_GART depend on EXPERT and default y
2013-11-12 11:15:12 +09:00
Linus Torvalds dba538ff56 Merge branch 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/intel-mid changes from Ingo Molnar:
 "Update the 'intel mid' (mobile internet device) platform code as Intel
  is rolling out more SoC designs.

  This gets rid of most of the 'MRST' platform code in the process,
  mostly by renaming and shuffling code around into their respective
  'intel-mid' platform drivers"

* 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, intel-mid: Do not re-introduce usage of obsolete __cpuinit
  intel_mid: Move platform device setups to their own platform_<device>.* files
  x86: intel-mid: Add section for sfi device table
  intel-mid: sfi: Allow struct devs_id.get_platform_data to be NULL
  intel_mid: Moved SFI related code to sfi.c
  intel_mid: Added custom handler for ipc devices
  intel_mid: Added custom device_handler support
  intel_mid: Refactored sfi_parse_devs() function
  intel_mid: Renamed *mrst* to *intel_mid*
  pci: intel_mid: Return true/false in function returning bool
  intel_mid: Renamed *mrst* to *intel_mid*
  mrst: Fixed indentation issues
  mrst: Fixed printk/pr_* related issues
2013-11-12 11:12:22 +09:00
Linus Torvalds 2dc1733fd4 Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/hyperv changes from Ingo Molnar:
 "These changes enable Linux guests to boot as 'Modern VM' guest kernels
  on MS-Hyperv hosts"

* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, hyperv: Move a variable to avoid an unused variable warning
  x86, hyperv: Fix build error due to missing <asm/apic.h> include
  x86, hyperv: Correctly guard the local APIC calibration code
  x86, hyperv: Get the local APIC timer frequency from the hypervisor
2013-11-12 11:07:49 +09:00
Linus Torvalds 69019d77c7 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI changes from Ingo Molnar:
 "Main changes:

   - Add support for earlyprintk=efi which uses the EFI framebuffer.
     Very useful for debugging boot problems.

   - EFI stub support for large memory maps (more than 128 entries)

   - EFI ARM support - this was mostly done by generalizing x86 <-> ARM
     platform differences, such as by moving x86 EFI code into
     drivers/firmware/efi/ and sharing it with ARM.

   - Documentation updates

   - misc fixes"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  x86/efi: Add EFI framebuffer earlyprintk support
  boot, efi: Remove redundant memset()
  x86/efi: Fix config_table_type array termination
  x86 efi: bugfix interrupt disabling sequence
  x86: EFI stub support for large memory maps
  efi: resolve warnings found on ARM compile
  efi: Fix types in EFI calls to match EFI function definitions.
  efi: Renames in handle_cmdline_files() to complete generalization.
  efi: Generalize handle_ramdisks() and rename to handle_cmdline_files().
  efi: Allow efi_free() to be called with size of 0
  efi: use efi_get_memory_map() to get final map for x86
  efi: generalize efi_get_memory_map()
  efi: Rename __get_map() to efi_get_memory_map()
  efi: Move unicode to ASCII conversion to shared function.
  efi: Generalize relocate_kernel() for use by other architectures.
  efi: Move relocate_kernel() to shared file.
  efi: Enforce minimum alignment of 1 page on allocations.
  efi: Rename memory allocation/free functions
  efi: Add system table pointer argument to shared functions.
  efi: Move common EFI stub code from x86 arch code to common location
  ...
2013-11-12 10:48:30 +09:00
Linus Torvalds 6df1e7f2e9 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu changes from Ingo Molnar:
 "The biggest change that stands out is the increase of the
  CONFIG_NR_CPUS range from 4096 to 8192 - as real hardware out there
  already went beyond 4k CPUs ...

  We only allow more than 512 CPUs if offstack cpumasks are enabled.

  CONFIG_MAXSMP=y remains to be the 'you are nuts!' extreme testcase,
  which now means a max of 8192 CPUs"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Increase max CPU count to 8192
  x86/cpu: Allow higher NR_CPUS values
  x86/cpu: Always print SMP information in /proc/cpuinfo
  x86/cpu: Track legacy CPU model data only on 32-bit kernels
2013-11-12 10:46:43 +09:00
Linus Torvalds d96d8aa261 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Two small cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, msr: Use file_inode(), not f_mapping->host
  x86: mkpiggy.c: Explicitly close the output file
2013-11-12 10:45:01 +09:00
Linus Torvalds 54e11304e8 Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build changes from Ingo Molnar:
 "Two small changes"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to *86*_defconfig
  x86, build: move build output statistics away from stderr
2013-11-12 10:43:32 +09:00
Linus Torvalds 014d595c23 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  #1 #2 #3 OK
     smpboot: Booting Node   1, Processors  #4 #5 #6 #7 OK
     smpboot: Booting Node   2, Processors  #8 #9 #10 #11 OK
     smpboot: Booting Node   3, Processors  #12 #13 #14 #15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        #1  #2  #3
     .... node  #1, CPUs:    #4  #5  #6  #7
     .... node  #2, CPUs:    #8  #9 #10 #11
     .... node  #3, CPUs:   #12 #13 #14 #15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
2013-11-12 10:41:10 +09:00
Linus Torvalds ae795fe760 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 user access changes from Ingo Molnar:
 "This tree contains two copy_[from/to]_user() build time checking
  changes/enhancements from Jan Beulich.

  The desired outcome is to get better compiler warnings with
  CONFIG_DEBUG_STRICT_USER_COPY_CHECKS=y, to keep people from
  introducing bugs such as overflows and information leaks"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Unify copy_to_user() and add size checking to it
  x86: Unify copy_from_user() size checking
2013-11-12 10:39:08 +09:00
Linus Torvalds b7ab6e3d21 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic fix from Ingo Molnar:
 "A single fix to the IO-APIC / local-APIC shutdown sequence"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Disable I/O APIC before shutdown of the local APIC
2013-11-12 10:37:53 +09:00
Linus Torvalds 87093826aa Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer changes from Ingo Molnar:
 "Main changes in this cycle were:

   - Updated full dynticks support.

   - Event stream support for architected (ARM) timers.

   - ARM clocksource driver updates.

   - Move arm64 to using the generic sched_clock framework & resulting
     cleanup in the generic sched_clock code.

   - Misc fixes and cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  x86/time: Honor ACPI FADT flag indicating absence of a CMOS RTC
  clocksource: sun4i: remove IRQF_DISABLED
  clocksource: sun4i: Report the minimum tick that we can program
  clocksource: sun4i: Select CLKSRC_MMIO
  clocksource: Provide timekeeping for efm32 SoCs
  clocksource: em_sti: convert to clk_prepare/unprepare
  time: Fix signedness bug in sysfs_get_uname() and its callers
  timekeeping: Fix some trivial typos in comments
  alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist
  clocksource: arch_timer: Do not register arch_sys_counter twice
  timer stats: Add a 'Collection: active/inactive' line to timer usage statistics
  sched_clock: Remove sched_clock_func() hook
  arch_timer: Move to generic sched_clock framework
  clocksource: tcb_clksrc: Remove IRQF_DISABLED
  clocksource: tcb_clksrc: Improve driver robustness
  clocksource: tcb_clksrc: Replace clk_enable/disable with clk_prepare_enable/disable_unprepare
  clocksource: arm_arch_timer: Use clocksource for suspend timekeeping
  clocksource: dw_apb_timer_of: Mark a few more functions as __init
  clocksource: Put nodes passed to CLOCKSOURCE_OF_DECLARE callbacks centrally
  arm: zynq: Enable arm_global_timer
  ...
2013-11-12 10:36:00 +09:00
Linus Torvalds 39cf275a1a Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "The main changes in this cycle are:

   - (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
     van Riel, Peter Zijlstra et al.  Yay!

   - optimize preemption counter handling: merge the NEED_RESCHED flag
     into the preempt_count variable, by Peter Zijlstra.

   - wait.h fixes and code reorganization from Peter Zijlstra

   - cfs_bandwidth fixes from Ben Segall

   - SMP load-balancer cleanups from Peter Zijstra

   - idle balancer improvements from Jason Low

   - other fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
  ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
  stop_machine: Fix race between stop_two_cpus() and stop_cpus()
  sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
  sched: Fix asymmetric scheduling for POWER7
  sched: Move completion code from core.c to completion.c
  sched: Move wait code from core.c to wait.c
  sched: Move wait.c into kernel/sched/
  sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
  sched: Avoid throttle_cfs_rq() racing with period_timer stopping
  sched: Guarantee new group-entities always have weight
  sched: Fix hrtimer_cancel()/rq->lock deadlock
  sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
  sched: Fix race on toggling cfs_bandwidth_used
  sched: Remove extra put_online_cpus() inside sched_setaffinity()
  sched/rt: Fix task_tick_rt() comment
  sched/wait: Fix build breakage
  sched/wait: Introduce prepare_to_wait_event()
  sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
  sched: Remove get_online_cpus() usage
  sched: Fix race in migrate_swap_stop()
  ...
2013-11-12 10:20:12 +09:00
Linus Torvalds ad5d69899e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "As a first remark I'd like to note that the way to build perf tooling
  has been simplified and sped up, in the future it should be enough for
  you to build perf via:

        cd tools/perf/
        make install

  (ie without the -j option.) The build system will figure out the
  number of CPUs and will do a parallel build+install.

  The various build system inefficiencies and breakages Linus reported
  against the v3.12 pull request should now be resolved - please
  (re-)report any remaining annoyances or bugs.

  Main changes on the perf kernel side:

   * Performance optimizations:
      . perf ring-buffer code optimizations,          by Peter Zijlstra
      . perf ring-buffer code optimizations,          by Oleg Nesterov
      . x86 NMI call-stack processing optimizations,  by Peter Zijlstra
      . perf context-switch optimizations,            by Peter Zijlstra
      . perf sampling speedups,                       by Peter Zijlstra
      . x86 Intel PEBS processing speedups,           by Peter Zijlstra

   * Enhanced hardware support:
      . for Intel Ivy Bridge-EP uncore PMUs,          by Zheng Yan
      . for Haswell transactions,                     by Andi Kleen, Peter Zijlstra

   * Core perf events code enhancements and fixes by Oleg Nesterov:
      . for uprobes, if fork() is called with pending ret-probes
      . for uprobes platform support code

   * New ABI details by Andi Kleen:
      . Report x86 Haswell TSX transaction abort cost as weight

  Main changes on the perf tooling side (some of these tooling changes
  utilize the above kernel side changes):

   * 'perf report/top' enhancements:

      . Convert callchain children list to rbtree, greatly reducing the
        time taken for callchain processing, from Namhyung Kim.

      . Add new COMM infrastructure, further improving histogram
        processing, from Frédéric Weisbecker, one fix from Namhyung Kim.

      . Add /proc/kcore based live-annotation improvements, including
        build-id cache support, multi map 'call' instruction navigation
        fixes, kcore address validation, objdump workarounds.  From
        Adrian Hunter.

      . Show progress on histogram collapsing, that can take a long
        time, from Namhyung Kim.

      . Add --max-stack option to limit callchain stack scan in 'top'
        and 'report', improving callchain processing when reducing the
        stack depth is an option, from Waiman Long.

      . Add new option --ignore-vmlinux for perf top, from Willy
        Tarreau.

   * 'perf trace' enhancements:

      . 'perf trace' now can can use a 'perf probe' dynamic tracepoints
        to hook into the userspace -> kernel pathname copy so that it
        can map fds to pathnames without reading /proc/pid/fd/ symlinks.
        From Arnaldo Carvalho de Melo.

      . Show VFS path associated with fd in live sessions, using a
        'vfs_getname' 'perf probe' created dynamic tracepoint or by
        looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.

      . Add 'trace' beautifiers for lots of syscall arguments, from
        Arnaldo Carvalho de Melo.

      . Implement more compact 'trace' output by suppressing zeroed
        args, from Arnaldo Carvalho de Melo.

      . Show thread COMM by default in 'trace', from Arnaldo Carvalho de
        Melo.

      . Add option to show full timestamp in 'trace', from David Ahern.

      . Add 'record' command in 'trace', to record raw_syscalls:*, from
        David Ahern.

      . Add summary option to dump syscall statistics in 'trace', from
        David Ahern.

      . Improve error messages in 'trace', providing hints about system
        configuration steps needed for using it, from Ramkumar
        Ramachandra.

      . 'perf trace' now emits hints as to why tracing is not possible,
        helping the user to setup the system to allow tracing in the
        desired permission granularity, telling if the problem is due to
        debugfs not being mounted or with not enough permission for
        !root, /proc/sys/kernel/perf_event_paranoit value, etc.  From
        Arnaldo Carvalho de Melo.

   * 'perf record' enhancements:

      . Check maximum frequency rate for record/top, emitting better
        error messages, from Jiri Olsa.

      . 'perf record' code cleanups, from David Ahern.

      . Improve write_output error message in 'perf record', from Adrian
        Hunter.

      . Allow specifying B/K/M/G unit to the --mmap-pages arguments,
        from Jiri Olsa.

      . Fix command line callchain attribute tests to handle the new
        -g/--call-chain semantics, from Arnaldo Carvalho de Melo.

   * 'perf kvm' enhancements:

      . Disable live kvm command if timerfd is not supported, from David
        Ahern.

      . Fix detection of non-core features, from David Ahern.

   * 'perf list' enhancements:

      . Add usage to 'perf list', from David Ahern.

      . Show error in 'perf list' if tracepoints not available, from
        Pekka Enberg.

   * 'perf probe' enhancements:

      . Support "$vars" meta argument syntax for local variables,
        allowing asking for all possible variables at a given probe
        point to be collected when it hits, from Masami Hiramatsu.

   * 'perf sched' enhancements:

      . Address the root cause of that 'perf sched' stack initialization
        build slowdown, by programmatically setting a big array after
        moving the global variable back to the stack.  Fix from Adrian
        Hunter.

   * 'perf script' enhancements:

      . Set up output options for in-stream attributes, from Adrian
        Hunter.

      . Print addr by default for BTS in 'perf script', from Adrian
        Juntmer

   * 'perf stat' enhancements:

      . Improved messages when doing profiling in all or a subset of
        CPUs using a workload as the session delimitator, as in:

         'perf stat --cpu 0,2 sleep 10s'

        from Arnaldo Carvalho de Melo.

      . Add units to nanosec-based counters in 'perf stat', from David
        Ahern.

      . Remove bogus info when using 'perf stat' -e cycles/instructions,
        from Ramkumar Ramachandra.

   * 'perf lock' enhancements:

      . 'perf lock' fixes and cleanups, from Davidlohr Bueso.

   * 'perf test' enhancements:

      . Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing
        and 'perf test', from Adrian Hunter.

      . Clarify the "sample parsing" test entry, from Arnaldo Carvalho
        de Melo.

      . Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test,
        from Arnaldo Carvalho de Melo.

      . Memory leak fixes in 'perf test', from Felipe Pena.

   * 'perf bench' enhancements:

      . Change the procps visible command-name of invididual benchmark
        tests plus cleanups, from Ingo Molnar.

   * Generic perf tooling infrastructure/plumbing changes:

      . Separating data file properties from session, code
        reorganization from Jiri Olsa.

      . Fix version when building out of tree, as when using one of
        these:

        $ make help | grep perf
          perf-tar-src-pkg    - Build perf-3.12.0.tar source tarball
          perf-targz-src-pkg  - Build perf-3.12.0.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
          perf-tarxz-src-pkg  - Build perf-3.12.0.tar.xz source tarball
        $

        from David Ahern.

      . Enhance option parse error message, showing just the help lines
        of the options affected, from Namhyung Kim.

      . libtraceevent updates from upstream trace-cmd repo, from Steven
        Rostedt.

      . Always use perf_evsel__set_sample_bit to set sample_type, from
        Adrian Hunter.

      . Memory and mmap leak fixes from Chenggang Qin.

      . Assorted build fixes for from David Ahern and Jiri Olsa.

      . Speed up and prettify the build system, from Ingo Molnar.

      . Implement addr2line directly using libbfd, from Roberto Vitillo.

      . Separate the GTK support in a separate libperf-gtk.so DSO, that
        is only loaded when --gtk is specified, from Namhyung Kim.

      . perf bash completion fixes and improvements from Ramkumar
        Ramachandra.

      . Support for Openembedded/Yocto -dbg packages, from Ricardo
        Ribalda Delgado.

  And lots and lots of other fixes and code reorganizations that did not
  make it into the list, see the shortlog, diffstat and the Git log for
  details!"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits)
  uprobes: Fix the memory out of bound overwrite in copy_insn()
  uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()
  perf tools: Remove unneeded include
  perf record: Remove post_processing_offset variable
  perf record: Remove advance_output function
  perf record: Refactor feature handling into a separate function
  perf trace: Don't relookup fields by name in each sample
  perf tools: Fix version when building out of tree
  perf evsel: Ditch evsel->handler.data field
  uprobes: Export write_opcode() as uprobe_write_opcode()
  uprobes: Introduce arch_uprobe->ixol
  uprobes: Kill module_init() and module_exit()
  uprobes: Move function declarations out of arch
  perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
  perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
  perf: Factor out strncpy() in perf_event_mmap_event()
  tools/perf: Add required memory barriers
  perf: Fix arch_perf_out_copy_user default
  perf: Update a stale comment
  perf: Optimize perf_output_begin() -- address calculation
  ...
2013-11-12 10:06:34 +09:00
Linus Torvalds 1006fae359 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ changes from Ingo Molnar:
 "The biggest change this cycle are the softirq/hardirq stack
  interaction and nesting fixes, cleanups and reorganizations from
  Frederic.  This is the longer followup story to the softirq nesting
  fix that is already upstream (commit ded797547548: "irq: Force hardirq
  exit's softirq processing on its own stack")"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: bcm2835: Convert to use IRQCHIP_DECLARE macro
  powerpc: Tell about irq stack coverage
  x86: Tell about irq stack coverage
  irq: Optimize softirq stack selection in irq exit
  irq: Justify the various softirq stack choices
  irq: Improve a bit softirq debugging
  irq: Optimize call to softirq on hardirq exit
  irq: Consolidate do_softirq() arch overriden implementations
  x86/irq: Correct comment about i8259 initialization
2013-11-12 10:02:59 +09:00
Ingo Molnar b5dfcb09de Revert "x86/UV: Add uvtrace support"
This reverts commit 8eba18428a.

uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor
uv_trace_func.

That's not how we do instrumentation code in the kernel: we add
tracepoints, printk()s, etc. so that everyone not just those with
magic kernel modules can debug a system.

So remove this unused (and misguied) piece of code.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org
2013-11-11 19:53:42 +01:00
H. Peter Anvin a4f61dec55 x86, trace: Change user|kernel_page_fault to page_fault_user|kernel
Tracepoints are named hierachially, and it makes more sense to keep a
general flow of information level from general to specific from left
to right, i.e.

	x86_exceptions.page_fault_user|kernel

rather than

	x86_exceptions.user|kernel_page_fault

Suggested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20131111082955.GB12405@gmail.com
2013-11-11 08:15:40 -08:00
Al Viro ce39596048 constify copy_siginfo_to_user{,32}()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:29 -05:00
Al Viro 9b56d54380 dump_skip(): dump_seek() replacement taking coredump_params
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:26 -05:00
Al Viro 43a5d548eb aout: switch to dump_emit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:25 -05:00
Al Viro aa3e7eaf0a switch elf_core_write_extra_data() to dump_emit()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:23 -05:00
Al Viro 506f21c556 switch elf_core_write_extra_phdrs() to dump_emit()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:23 -05:00
Al Viro 7d2f551f6d restore 32bit aout coredump
just getting rid of bitrot

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:22 -05:00
Seiji Aguchi d34603b07c x86, trace: Add page fault tracepoints
This patch introduces page fault tracepoints to x86 architecture
by switching IDT.

  Two events, for user and kernel spaces, are introduced at the beginning
  of page fault handler for tracing.

  - User space event
    There is a request of page fault event for user space as below.

    https://lkml.kernel.org/r/1368079520-11015-2-git-send-email-fdeslaur+()+gmail+!+com
    https://lkml.kernel.org/r/1368079520-11015-1-git-send-email-fdeslaur+()+gmail+!+com

  - Kernel space event:
    When we measure an overhead in kernel space for investigating performance
    issues, we can check if it comes from the page fault events.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716E67.6090705@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-08 14:15:49 -08:00
Seiji Aguchi ac7956e269 x86, trace: Delete __trace_alloc_intr_gate()
Currently irq vector handlers for tracing are registered in both set_intr_gate()
 and __trace_alloc_intr_gate() in alloc_intr_gate().
But, we don't need to do that twice.
So, let's delete __trace_alloc_intr_gate().

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716E1B.7090205@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-08 14:15:47 -08:00
Seiji Aguchi 25c74b10ba x86, trace: Register exception handler to trace IDT
This patch registers exception handlers for tracing to a trace IDT.

To implemented it in set_intr_gate(), this patch does followings.
 - Register the exception handlers to
   the trace IDT by prepending "trace_" to the handler's names.
 - Also, newly introduce trace_page_fault() to add tracepoints
   in a subsequent patch.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716DEC.5050204@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-08 14:15:45 -08:00
Seiji Aguchi 959c071f09 x86, trace: Remove __alloc_intr_gate()
Prepare to move set_intr_gate() into a macro by removing
__alloc_intr_gate().

The purpose is to avoid failing a kernel build after applying a
subsequent patch which changes set_intr_gate() into a macro.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/52716DB8.1080702@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-08 14:15:44 -08:00
Konrad Rzeszutek Wilk e1d8f62ad4 Merge remote-tracking branch 'stefano/swiotlb-xen-9.1' into stable/for-linus-3.13
* stefano/swiotlb-xen-9.1:
  swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs
  swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary
  grant-table: call set_phys_to_machine after mapping grant refs
  arm,arm64: do not always merge biovec if we are running on Xen
  swiotlb: print a warning when the swiotlb is full
  swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device
  xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
  swiotlb-xen: use xen_alloc/free_coherent_pages
  xen: introduce xen_alloc/free_coherent_pages
  arm64/xen: get_dma_ops: return xen_dma_ops if we are running as xen_initial_domain
  arm/xen: get_dma_ops: return xen_dma_ops if we are running as xen_initial_domain
  swiotlb-xen: introduce xen_swiotlb_set_dma_mask
  xen/arm,arm64: enable SWIOTLB_XEN
  xen: make xen_create_contiguous_region return the dma address
  xen/x86: allow __set_phys_to_machine for autotranslate guests
  arm/xen,arm64/xen: introduce p2m
  arm64: define DMA_ERROR_CODE
  arm: make SWIOTLB available

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Conflicts:
	arch/arm/include/asm/dma-mapping.h
	drivers/xen/swiotlb-xen.c

[Conflicts arose b/c "arm: make SWIOTLB available" v8 was in Stefano's
branch, while I had v9 + Ack from Russel. I also fixed up white-space
issues]
2013-11-08 16:10:48 -05:00
Konrad Rzeszutek Wilk bad97817de Linux 3.12-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSWyGgAAoJEHm+PkMAQRiGA8MH/35upHXImoRCsI5uC1qvHJtI
 QvQAhDFxoEXbFUKeaYgTfcM8q9FgnqnfjhLf8eYa4Q7tDZeqLXOE8bkI807mSZMl
 yECr3jcwlV+zyhV2MP/HdwTjzy25bwxLM3Zy43S7QROrYoMHZYznil/QPfyMATCJ
 XLPuXZC1FtuUen89n4BoDIuL8QaVrIR/zLqFklAQcdTcGpLHSOwFtH8gb2WaRLhv
 +4IikFRFgTNZiMR5tP0GPc6UH6TVTvRb4QKSqqa7J8OmfAIvOzAUdhqWSPOIwWwt
 Z/+JFxFDczAcNmpv4gE6jkgc2vR8CVeHsvh0j61RDSFObBWspwk337CSyUZxYSA=
 =w4VQ
 -----END PGP SIGNATURE-----

Merge tag 'v3.12-rc5' into stable/for-linus-3.13

Linux 3.12-rc5

Because the Stefano branch (for SWIOTLB ARM changes) is based on that.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* tag 'v3.12-rc5': (550 commits)
  Linux 3.12-rc5
  watchdog: sunxi: Fix section mismatch
  watchdog: kempld_wdt: Fix bit mask definition
  watchdog: ts72xx_wdt: locking bug in ioctl
  ARM: exynos: dts: Update 5250 arch timer node with clock frequency
  parisc: let probe_kernel_read() capture access to page zero
  parisc: optimize variable initialization in do_page_fault
  parisc: fix interruption handler to respect pagefault_disable()
  parisc: mark parisc_terminate() noreturn and cold.
  parisc: remove unused syscall_ipi() function.
  parisc: kill SMP single function call interrupt
  parisc: Export flush_cache_page() (needed by lustre)
  vfs: allow O_PATH file descriptors for fstatfs()
  ext4: fix memory leak in xattr
  ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc"
  ALSA: hda - Sony VAIO Pro 13 (haswell) now has a working headset jack
  ALSA: hda - Add a headset mic model for ALC269 and friends
  ALSA: hda - Fix microphone for Sony VAIO Pro 13 (Haswell model)
  compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
  Revert "i915: Update VGA arbiter support for newer devices"
  ...
2013-11-08 15:28:05 -05:00
Stefano Stabellini 92c0fd17c0 pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-11-08 15:21:44 -05:00
Paul Gortmaker 3b284bde70 xen: delete new instances of added __cpuinit
commit 6efa20e49b
("xen: Support 64-bit PV guest receiving NMIs") and
commit cd9151e26d
( "xen/balloon: set a mapping for ballooned out pages")
added new instances of __cpuinit usage.

We removed this a couple versions ago; we now want to remove
the compat no-op stubs.  Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.

Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-11-08 15:13:16 -05:00
Ben Widawsky 9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Andrey Vagin 98bbc06aab net: x86: bpf: don't forget to free sk_filter (v2)
sk_filter isn't freed if bpf_func is equal to sk_run_filter.

This memory leak was introduced by v3.12-rc3-224-gd45ed4a4
"net: fix unsafe set_memory_rw from softirq".

Before this patch sk_filter was freed in sk_filter_release_rcu,
now it should be freed in bpf_jit_free.

Here is output of kmemleak:
unreferenced object 0xffff8800b774eab0 (size 128):
  comm "systemd", pid 1, jiffies 4294669014 (age 124.062s)
  hex dump (first 32 bytes):
    00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff  ........ c......
    60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff  `.U.....0.U.....
  backtrace:
    [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811845af>] __kmalloc+0xef/0x260
    [<ffffffff81534028>] sock_kmalloc+0x38/0x60
    [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190
    [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0
    [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0
    [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

v2: add extra { } after else

Fixes: d45ed4a4e3 ("net: fix unsafe set_memory_rw from softirq")
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:06:52 -05:00
Bjorn Helgaas eaaeb1cb33 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Enable upstream bridges even for VFs on virtual buses
  PCI: Add pci_upstream_bridge()
  PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
2013-11-07 15:02:04 -07:00
Paul Gortmaker aeeca40426 x86, intel-mid: Do not re-introduce usage of obsolete __cpuinit
The commit 712b6aa873 [Nov7 linux-next
via tip/auto-latest] ("intel_mid: Renamed *mrst* to *intel_mid*")
adds a __cpuinit.

We removed this a couple versions ago; we now want to remove
the compat no-op stubs.  Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.

Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1383849290-11250-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-07 10:44:03 -08:00
Rafael J. Wysocki 0faf996f4d Merge branch 'acpica'
* acpica: (35 commits)
  ACPICA: Add __init for ACPICA initializers/finalizers.
  ACPICA: Cleanup asmlinkage for ACPICA APIs.
  ACPICA: Update acpidump related header file changes.
  ACPICA: Update compilation environment settings.
  ACPICA: Fix cached object deletion code.
  ACPICA: Remove dead AOPOBJ_INVALID check.
  ACPICA: Cleanup useless memset invocations.
  ACPICA: Fix an ACPI_ALLOCATE_ZEROED() reversal.
  ACPICA: Fix wrong object length returned by acpi_ut_get_simple_object_size().
  ACPICA: Add new statistics interface.
  ACPICA: Update DMAR table definitions.
  ACPICA: Update RSDP table definitions.
  ACPICA: Update namespace dump code.
  ACPICA: Update check for setting the ANOBJ_IS_EXTERNAL flag.
  ACPICA: Update default space handlers.
  ACPICA: Update version to 20130927.
  ACPICA: Update aclinux.h for new OSL override mechanism.
  ACPICA: Add support to allow host OS to redefine individual OSL prototypes.
  ACPICA: Simplify configuration of global ACPI_REDUCED_HARDWARE macro.
  ACPICA: Fix indentation issues for macro invocations.
  ...
2013-11-07 19:21:11 +01:00
Rob Herring b5480950c6 Merge remote-tracking branch 'grant/devicetree/next' into for-next 2013-11-07 10:34:46 -06:00
Borislav Petkov 1b2ca42267 kvm, cpuid: Fix sparse warning
We need to copy padding to kernel space first before looking at it.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-07 12:27:46 +02:00
Fenghua Yu 522e664644 x86/apic: Disable I/O APIC before shutdown of the local APIC
In reboot and crash path, when we shut down the local APIC, the I/O APIC is
still active. This may cause issues because external interrupts
can still come in and disturb the local APIC during shutdown process.

To quiet external interrupts, disable I/O APIC before shutdown local APIC.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@intel.com
Cc: <stable@kernel.org>
[ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-07 10:12:37 +01:00
Konrad Rzeszutek Wilk 0e4ccb1505 PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
Certain platforms do not allow writes in the MSI-X BARs to setup or tear
down vector values.  To combat against the generic code trying to write to
that and either silently being ignored or crashing due to the pagetables
being marked R/O this patch introduces a platform override.

Note that we keep two separate, non-weak, functions default_mask_msi_irqs()
and default_mask_msix_irqs() for the behavior of the arch_mask_msi_irqs()
and arch_mask_msix_irqs(), as the default behavior is needed by x86 PCI
code.

For Xen, which does not allow the guest to write to MSI-X tables - as the
hypervisor is solely responsible for setting the vector values - we
implement two nops.

This fixes a Xen guest crash when passing a PCI device with MSI-X to the
guest.  See the bugzilla for more details.

[bhelgaas: add bugzilla info]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=64581
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
2013-11-06 16:32:19 -07:00
Michael Opdenacker 9d71cee667 x86/xen: remove deprecated IRQF_DISABLED
This patch proposes to remove the IRQF_DISABLED flag from x86/xen
code. It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-11-06 15:31:01 -05:00
Oleg Nesterov 8a8de66c4f uprobes: Introduce arch_uprobe->ixol
Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.

This is not true for arm which needs to create another insn to
execute it out-of-line.

So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 20:00:05 +01:00
David A. Long 3820b4d278 uprobes: Move function declarations out of arch
Move the function declarations from the arch headers to the common
header, since only the function bodies are architecture-specific.
These changes are from Vincent Rabin's uprobes patch.

[ oleg: update arch/powerpc/include/asm/uprobes.h ]

Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 19:59:37 +01:00
H. Peter Anvin 4c08edd305 x86, hyperv: Move a variable to avoid an unused variable warning
The variable hv_lapic_frequency causes an unused variable warning if
CONFIG_X86_LOCAL_APIC is disabled.  Since the variable is only used
inside a small if statement, move the declaration of that variable
into the if statement itself.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-06 10:02:05 -08:00
John Stultz 1ca7d67cf5 seqcount: Add lockdep functionality to seqcount/seqlock structures
Currently seqlocks and seqcounts don't support lockdep.

After running across a seqcount related deadlock in the timekeeping
code, I used a less-refined and more focused variant of this patch
to narrow down the cause of the issue.

This is a first-pass attempt to properly enable lockdep functionality
on seqlocks and seqcounts.

Since seqcounts are used in the vdso gettimeofday code, I've provided
non-lockdep accessors for those needs.

I've also handled one case where there were nested seqlock writers
and there may be more edge cases.

Comments and feedback would be appreciated!

Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:40:26 +01:00
Yan, Zheng f891d8cfb8 perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.

Besides the counter/control registers in IRP boxes are not properly
aligned.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:31 +01:00
Yan, Zheng d1e8f4a836 perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:29 +01:00
Peter Zijlstra 0a196848ca perf: Fix arch_perf_out_copy_user default
The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:25 +01:00
Josh Triplett a890b6fefd kvm: Delete prototype for non-existent function kvm_check_iopl
The prototype for kvm_check_iopl appeared in commit
f850e2e603 ("KVM: x86 emulator: Check IOPL
level during io instruction emulation"), but the function never actually
existed.  Remove the prototype.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 11:07:09 +02:00
Josh Triplett 35a5121b58 kvm: Delete prototype for non-existent function complete_pio
complete_pio ceased to exist in commit
7972995b0c ("KVM: x86 emulator: Move
string pio emulation into emulator.c"), but the prototype remained.
Remove its prototype.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 11:06:32 +02:00
Marcelo Tosatti 8b414521bc hung_task: add method to reset detector
In certain occasions it is possible for a hung task detector
positive to be false: continuation from a paused VM, for example.

Add a method to reset detection, similar as is done
with other kernel watchdogs.

Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 09:49:02 +02:00
Marcelo Tosatti d63285e94a pvclock: detect watchdog reset at pvclock read
Implement reset of kernel watchdogs at pvclock read time. This avoids
adding special code to every watchdog.

This is possible for watchdogs which measure time based on sched_clock() or
ktime_get() variants.

Suggested by Don Zickus.

Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 09:48:43 +02:00
Michael S. Tsirkin 01b71917b5 kvm: optimize out smp_mb after srcu_read_unlock
I noticed that srcu_read_lock/unlock both have a memory barrier,
so just by moving srcu_read_unlock earlier we can get rid of
one call to smp_mb() using smp_mb__after_srcu_read_unlock instead.

Unsurprisingly, the gain is small but measureable using the unit test
microbenchmark:
before
        vmcall in the ballpark of 1410 cycles
after
        vmcall in the ballpark of 1360 cycles

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 09:32:31 +02:00
Josh Boyer b53b5eda81 x86/cpu: Increase max CPU count to 8192
The MAXSMP option is intended to enable silly large numbers of
CPUs for testing purposes.  The current value of 4096 isn't very
silly any longer as there are actual SGI machines that approach
6096 CPUs when taking HT into account.

Increase the value to a nice round 8192 to account for this and
allow for short term future increases.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143816.GK9944@hansolo.jdub.homelinux.org
[ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:16:34 +01:00
Josh Boyer bb61ccc7db x86/cpu: Allow higher NR_CPUS values
The current range for SMP configs is 2 - 512 CPUs, or a full
4096 in the case of MAXSMP.  There are machines that have 1024
CPUs in them today and configuring a kernel for that means you
are forced to set MAXSMP.  This adds additional unnecessary
overhead.  While that overhead might be considered tiny for
large machines, it isn't necessarily so if you are building a
kernel that runs across a wide variety of machines.

To cover the range of more common machines today, we allow
NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143728.GJ9944@hansolo.jdub.homelinux.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:16:28 +01:00
HATAYAMA Daisuke a477c8594b x86/cpu: Always print SMP information in /proc/cpuinfo
Currently show_cpuinfo_core() displays cpu core information only if
the number of threads per a whole cores is 2 or larger.

However, this condition doesn't care about the number of
sockets. For example, this condition doesn't hold on systems
with two logical cpus consisting of two sockets and a single
core on each socket - yet the topology information would be
interesting to see in that case as well.

I don't know whether or not there are processors in real world
by which such configurations are possible, but at least on
vitual machine environments, such configuration can occur,
typically when no explicit SMP information is provided in
advance.

For example, on qemu/KVM, SMP information is specified via -smp
command-line option, more specifically, its syntax is:

  -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus]

If this is not specified, qemu tells configuration with
n-sockets, 1-core and 1-thread to the guest machine, on which
guest, MP information is not displayed in /proc/cpuinfo.

I saw this situation on VMWare guest environment, too.

To fix this issue, this patch simply removes the condition
because this information is useful even if there's only 1
thread.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5277D644.4090707@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:13:56 +01:00
Ingo Molnar c90423d1de Merge branch 'sched/core' into core/locking, to prepare the kernel/locking/ file move
Conflicts:
	kernel/Makefile

There are conflicts in kernel/Makefile due to file moving in the
scheduler tree - resolve them.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:50:37 +01:00
Ingo Molnar 164777530b Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 06:50:21 +01:00
Ingo Molnar 97c53b402f Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12' into core/locking to pick up mutex upates

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 06:39:45 +01:00
Kevin Hao ab4ead02ec ftrace/x86: skip over the breakpoint for ftrace caller
In commit 8a4d0a687a "ftrace: Use breakpoint method to update ftrace
caller", we choose to use breakpoint method to update the ftrace
caller. But we also need to skip over the breakpoint in function
ftrace_int3_handler() for them. Otherwise weird things would happen.

Cc: stable@vger.kernel.org # 3.5+
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05 16:01:47 -05:00
Gleb Natapov a9d4e4393b KVM: x86: trace cpuid emulation when called from emulator
Currently cpuid emulation is traced only when executed by intercept.
Move trace point so that emulator invocation is traced too.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-05 09:11:40 +02:00
Gleb Natapov 6d4d85ec56 KVM: emulator: cleanup decode_register_operand() a bit
Make code shorter.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-05 09:11:30 +02:00
Gleb Natapov aa9ac1a632 KVM: emulator: check rex prefix inside decode_register()
All decode_register() callers check if instruction has rex prefix
to properly decode one byte operand. It make sense to move the check
inside.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-05 09:11:18 +02:00
Chen Gang 7f71be4c9f x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to *86*_defconfig
The defconfig kernel can not run under neither fedora16 x86_64 laptop
nor fedora17 x86_64 pc. After enable DEVTMPFS* in x86_64_defconfig, it
will be OK.

DEVTMPFS* is only related with software, so for i386_defconfig may also
need them (at least, it has no negative effect for defconfig).

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Link: http://lkml.kernel.org/r/52784DFF.8040004@asianux.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-11-04 20:01:55 -08:00
David S. Miller 394efd19d5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be.h
	drivers/net/netconsole.c
	net/bridge/br_private.h

Three mostly trivial conflicts.

The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.

In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".

Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 13:48:30 -05:00
Gleb Natapov 95f328d3ad Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue
Conflicts:
	arch/powerpc/include/asm/processor.h
2013-11-04 10:20:57 +02:00
Ingo Molnar 2a3ede8cb2 Merge branch 'perf/urgent' into perf/core to fix conflicts
Conflicts:
	tools/perf/bench/numa.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-04 07:49:35 +01:00
Paolo Bonzini daf727225b KVM: x86: fix emulation of "movzbl %bpl, %eax"
When I was looking at RHEL5.9's failure to start with
unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a
slightly older tree than kvm.git.  I now debugged the remaining failure,
which was introduced by commit 660696d1 (KVM: X86 emulator: fix
source operand decoding for 8bit mov[zs]x instructions, 2013-04-24)
introduced a similar mis-emulation to the one in commit 8acb4207 (KVM:
fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30).  The incorrect
decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand
is sil/dil/bpl/spl.

Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression
prolog, just a handful of instructions before finally giving control to
the decompressed vmlinux and getting out of the invalid guest state.

Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix
must be applied to OpMem8.

Reported-by: Michele Baldessari <michele@redhat.com>
Cc: stable@vger.kernel.org
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-03 16:50:37 +02:00
Borislav Petkov ee41143027 x86/efi: Check krealloc return value
Check it just in case. We might just as well panic there because runtime
won't be functioning anyway.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:36 +00:00
Borislav Petkov d2f7cbe7b2 x86/efi: Runtime services virtual mapping
We map the EFI regions needed for runtime services non-contiguously,
with preserved alignment on virtual addresses starting from -4G down
for a total max space of 64G. This way, we provide for stable runtime
services addresses across kernels so that a kexec'd kernel can still use
them.

Thus, they're mapped in a separate pagetable so that we don't pollute
the kernel namespace.

Add an efi= kernel command line parameter for passing miscellaneous
options and chicken bits from the command line.

While at it, add a chicken bit called "efi=old_map" which can be used as
a fallback to the old runtime services mapping method in case there's
some b0rkage with a particular EFI implementation (haha, it is hard to
hold up the sarcasm here...).

Also, add the UEFI RT VA space to Documentation/x86/x86_64/mm.txt.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:36 +00:00
Borislav Petkov 82f0712ca0 x86/mm/cpa: Map in an arbitrary pgd
Add the ability to map pages in an arbitrary pgd. This wires in the
remaining stuff so that there's a new interface with which you can map a
region into an arbitrary PGD.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:35 +00:00
Borislav Petkov 52a628fb45 x86/mm/pageattr: Add last levels of error path
We try to free the pagetable pages once we've unmapped our portion.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:31 +00:00
Borislav Petkov 0bb8aeee7b x86/mm/pageattr: Add a PUD error unwinding path
In case we encounter an error during the mapping of a region, we want to
unwind what we've established so far exactly the way we did the mapping.
This is the PUD part kept deliberately small for easier review.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:28 +00:00
Borislav Petkov c6b6f363f7 x86/mm/pageattr: Add a PTE pagetable populating function
Handle last level by unconditionally writing the PTEs into the PTE page
while paying attention to the NX bit.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:24 +00:00
Borislav Petkov f900a4b8ab x86/mm/pageattr: Add a PMD pagetable populating function
Handle PMD-level mappings the same as PUD ones.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:20 +00:00
Borislav Petkov 4b23538d88 x86/mm/pageattr: Add a PUD pagetable populating function
Add the next level of the pagetable populating function, we handle
chunks around a 1G boundary by mapping them with the lower level
functions - otherwise we use 1G pages for the mappings, thus using as
less amount of pagetable pages as possible.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:17 +00:00
Borislav Petkov f3f729661e x86/mm/pageattr: Add a PGD pagetable populating function
This allocates, if necessary, and populates the corresponding PGD entry
with a PUD page. The next population level is a dummy macro which will
be removed by the next patch and it is added here to keep the patch
small and easily reviewable but not break bisection, at the same time.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:12 +00:00
Borislav Petkov 0fd64c23fd x86/mm/pageattr: Lookup address in an arbitrary PGD
This is preparatory work in order to be able to map pages into a
specified PGD and not implicitly and only into init_mm.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:08 +00:00
Borislav Petkov f4fccac05f x86/efi: Simplify EFI_DEBUG
... and lose one #ifdef .. #endif sandwich.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-02 11:09:03 +00:00
Linus Torvalds 9581b7d268 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two fixes:

   - Fix 'NMI handler took too long to run' false positives

     [ Genuine NMI overhead speedups will come for v3.13, this commit
       only fixes a measurement bug ]

   - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data
     corruption on ppc64"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix NMI measurements
  perf: Fix perf ring buffer memory ordering
2013-11-01 12:54:51 -07:00
Ingo Molnar fb10d5b7ef Merge branch 'linus' into sched/core
Resolve cherry-picking conflicts:

Conflicts:
	mm/huge_memory.c
	mm/memory.c
	mm/mprotect.c

See this upstream merge commit for more details:

  52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-01 08:24:41 +01:00
Paolo Bonzini 98f73630f9 KVM: x86: emulate SAHF instruction
Yet another instruction that we fail to emulate, this time found
in Windows 2008R2 32-bit.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-31 22:14:10 +01:00
Bjorn Helgaas 33de1b8bf6 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Report pci_pme_active() kmalloc failure
  mn10300/PCI: Remove useless pcibios_last_bus
  frv/PCI: Remove pcibios_last_bus
  PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
  x86/PCI: Coalesce multiple overlapping host bridge windows
  MAINTAINERS: Add arch/x86/pci to PCI file patterns
  PCI/PM: Remove pci_pm_complete()
  PCI: Add pci_dev_show_local_cpu() to simplify code
  mn10300/PCI: Remove unused pci_mem_start
  cris/PCI: Remove unused pci_mem_start
  PCI: Make pci_dev_pm_ops static

Conflicts:
	drivers/pci/pci-sysfs.c
2013-10-31 14:12:40 -06:00
Lv Zheng 40bce100ca ACPICA: Cleanup asmlinkage for ACPICA APIs.
Add an asmlinkage wrapper around acpi_enter_sleep_state() to prevent
an empty stub from being called by assmebly code for ACPI_REDUCED_HARDWARE
set.

As arch/x86/kernel/acpi/wakeup_xx.S is only compiled when CONFIG_ACPI=y
and there are no users of ACPI_HARDWARE_REDUCED, currently this is in
fact not a real issue, but a cleanup to reduce source code differences
between Linux and ACPICA upstream.

[rjw: Changelog]
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-31 14:37:35 +01:00
Michael S. Tsirkin 6026620475 kvm/vmx: error message typo fix
mst can't be blamed for lack of switch entries: the
issue is with msrs actually.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-31 11:55:15 +01:00
Paolo Bonzini c67a04cb9a KVM: x86: fix KVM_SET_XCRS loop
The loop was always using 0 as the index.  This means that
any rubbish after the first element of the array went undetected.
It seems reasonable to assume that no KVM userspace did that.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-31 11:31:19 +01:00
Paolo Bonzini 46c34cb059 KVM: x86: fix KVM_SET_XCRS for CPUs that do not support XSAVE
The KVM_SET_XCRS ioctl must accept anything that KVM_GET_XCRS
could return.  XCR0's bit 0 is always 1 in real processors with
XSAVE, and KVM_GET_XCRS will always leave bit 0 set even if the
emulated processor does not have XSAVE.  So, KVM_SET_XCRS must
ignore that bit when checking for attempts to enable unsupported
save states.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-31 11:30:46 +01:00
David Rientjes d68ce0177c x86, hyperv: Fix build error due to missing <asm/apic.h> include
9e7827b5ea ("x86, hyperv: Get the local APIC timer frequency from the
hypervisor") breaks the build with some configs because apic.h isn't
directly included:

arch/x86/kernel/cpu/mshyperv.c: In function 'ms_hyperv_init_platform':
arch/x86/kernel/cpu/mshyperv.c:90:3: error: 'lapic_timer_frequency' undeclared (first use in this function)
arch/x86/kernel/cpu/mshyperv.c:90:3: note: each undeclared identifier is reported only once for each function it appears in

Fix it by including asm/apic.h.

Signed-off-by: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1310111604160.31170@chino.kir.corp.google.com
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-30 15:37:47 -07:00
Linus Torvalds 12aee278b5 Merge branch 'akpm' (fixes from Andrew Morton)
Merge three fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting
  percpu: fix this_cpu_sub() subtrahend casting for unsigneds
  mm/pagewalk.c: fix walk_page_range() access of wrong PTEs
2013-10-30 14:27:10 -07:00
Greg Thelen bd09d9a351 percpu: fix this_cpu_sub() subtrahend casting for unsigneds
this_cpu_sub() is implemented as negation and addition.

This patch casts the adjustment to the counter type before negation to
sign extend the adjustment.  This helps in cases where the counter type
is wider than an unsigned adjustment.  An alternative to this patch is
to declare such operations unsupported, but it seemed useful to avoid
surprises.

This patch specifically helps the following example:
  unsigned int delta = 1
  preempt_disable()
  this_cpu_write(long_counter, 0)
  this_cpu_sub(long_counter, delta)
  preempt_enable()

Before this change long_counter on a 64 bit machine ends with value
0xffffffff, rather than 0xffffffffffffffff.  This is because
this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
which is basically:
  long_counter = 0 + 0xffffffff

Also apply the same cast to:
  __this_cpu_sub()
  __this_cpu_sub_return()
  this_cpu_sub_return()

All percpu_test.ko passes, especially the following cases which
previously failed:

  l -= ui_one;
  __this_cpu_sub(long_counter, ui_one);
  CHECK(l, long_counter, -1);

  l -= ui_one;
  this_cpu_sub(long_counter, ui_one);
  CHECK(l, long_counter, -1);
  CHECK(l, long_counter, 0xffffffffffffffff);

  ul -= ui_one;
  __this_cpu_sub(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, -1);
  CHECK(ul, ulong_counter, 0xffffffffffffffff);

  ul = this_cpu_sub_return(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, 2);

  ul = __this_cpu_sub_return(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, 1);

Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-30 14:27:03 -07:00
Alex Williamson e0f0bbc527 kvm: Create non-coherent DMA registeration
We currently use some ad-hoc arch variables tied to legacy KVM device
assignment to manage emulation of instructions that depend on whether
non-coherent DMA is present.  Create an interface for this, adapting
legacy KVM device assignment and adding VFIO via the KVM-VFIO device.
For now we assume that non-coherent DMA is possible any time we have a
VFIO group.  Eventually an interface can be developed as part of the
VFIO external user interface to query the coherency of a group.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:23 +01:00
Alex Williamson d96eb2c6f4 kvm/x86: Convert iommu_flags to iommu_noncoherent
Default to operating in coherent mode.  This simplifies the logic when
we switch to a model of registering and unregistering noncoherent I/O
with KVM.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:13 +01:00
Alex Williamson ec53500fae kvm: Add VFIO device
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made.  This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction.  The user creates
the device and can add and remove VFIO groups to it via file
descriptors.  When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:03 +01:00
Borislav Petkov 84cffe499b kvm: Emulate MOVBE
This basically came from the need to be able to boot 32-bit Atom SMP
guests on an AMD host, i.e. a host which doesn't support MOVBE. As a
matter of fact, qemu has since recently received MOVBE support but we
cannot share that with kvm emulation and thus we have to do this in the
host. We're waay faster in kvm anyway. :-)

So, we piggyback on the #UD path and emulate the MOVBE functionality.
With it, an 8-core SMP guest boots in under 6 seconds.

Also, requesting MOVBE emulation needs to happen explicitly to work,
i.e. qemu -cpu n270,+movbe...

Just FYI, a fairly straight-forward boot of a MOVBE-enabled 3.9-rc6+
kernel in kvm executes MOVBE ~60K times.

Signed-off-by: Andre Przywara <andre@andrep.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:41 +01:00
Borislav Petkov 0bc5eedb82 kvm, emulator: Add initial three-byte insns support
Add initial support for handling three-byte instructions in the
emulator.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:41 +01:00
Borislav Petkov b51e974fcd kvm, emulator: Rename VendorSpecific flag
Call it EmulateOnUD which is exactly what we're trying to do with
vendor-specific instructions.

Rename ->only_vendor_specific_insn to something shorter, while at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:40 +01:00
Borislav Petkov 1ce19dc16c kvm, emulator: Use opcode length
Add a field to the current emulation context which contains the
instruction opcode length. This will streamline handling of opcodes of
different length.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:39 +01:00
Borislav Petkov 9c15bb1d0a kvm: Add KVM_GET_EMULATED_CPUID
Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.

Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.

s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:39 +01:00
Tim Gardner d780a31271 KVM: Fix modprobe failure for kvm_intel/kvm_amd
The x86 specific kvm init creates a new conflicting
debugfs directory which causes modprobe issues
with kvm_intel and kvm_amd. For example,

sudo modprobe kvm_amd
modprobe: ERROR: could not insert 'kvm_amd': Bad address

The simplest fix is to just rename the directory. The following
KVM config options are set:

CONFIG_KVM_GUEST=y
CONFIG_KVM_DEBUG_FS=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_DEVICE_ASSIGNMENT=y

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[Change debugfs directory name. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 12:10:42 +01:00
Peter Zijlstra e00b12e64b perf/x86: Further optimize copy_from_user_nmi()
Now that we can deal with nested NMI due to IRET re-enabling NMIs and
can deal with faults from NMI by making sure we preserve CR2 over NMIs
we can in fact simply access user-space memory from NMI context.

So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and
rework the fault path to do the minimal required work before taking
the in_atomic() fault handler.

In particular avoid perf_sw_event() which would make perf recurse on
itself (it should be harmless as our recursion protections should be
able to deal with this -- but why tempt fate).

Also rename notify_page_fault() to kprobes_fault() as that is a much
better name; there is no notifier in it and its specific to kprobes.

Don measured that his worst case NMI path shrunk from ~300K cycles to
~150K cycles.

Cc: Stephane Eranian <eranian@google.com>
Cc: jmario@redhat.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: dave.hansen@linux.intel.com
Tested-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 12:02:54 +01:00
Peter Zijlstra e8a923cc1f perf/x86: Fix NMI measurements
OK, so what I'm actually seeing on my WSM is that sched/clock.c is
'broken' for the purpose we're using it for.

What triggered it is that my WSM-EP is broken :-(

  [    0.001000] tsc: Fast TSC calibration using PIT
  [    0.002000] tsc: Detected 2533.715 MHz processor
  [    0.500180] TSC synchronization [CPU#0 -> CPU#6]:
  [    0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock.
  [    0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed

For some reason it consistently detects TSC skew, even though NHM+
should have a single clock domain for 'reasonable' systems.

This marks sched_clock_stable=0, which means that we do fancy stuff to
try and get a 'sane' clock. Part of this fancy stuff relies on the tick,
clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until
it either wakes up or gets kicked by another cpu.

While this is perfectly fine for the scheduler -- it only cares about
actually running stuff, and when we're running stuff we're obviously not
idle. This does somewhat break down for perf which can trigger events
just fine on an otherwise idle cpu.

So I've got NMIs get get 'measured' as taking ~1ms, which actually
don't last nearly that long:

          <idle>-0     [013] d.h.   886.311970: rcu_nmi_enter <-do_nmi
  ...
          <idle>-0     [013] d.h.   886.311997: perf_sample_event_took: HERE!!! : 1040990

So ftrace (which uses sched_clock(), not the fancy bits) only sees
~27us, but we measure ~1ms !!

Now since all this measurement stuff lives in x86 code, we can actually
fix it.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 12:01:20 +01:00
Ingo Molnar aac898548d Merge branch 'perf/urgent' into perf/core
Conflicts:
	tools/perf/builtin-record.c
	tools/perf/builtin-top.c
	tools/perf/util/hist.h
2013-10-29 11:23:32 +01:00
Matt Fleming 72548e836b x86/efi: Add EFI framebuffer earlyprintk support
It's incredibly difficult to diagnose early EFI boot issues without
special hardware because earlyprintk=vga doesn't work on EFI systems.

Add support for writing to the EFI framebuffer, via earlyprintk=efi,
which will actually give users a chance of providing debug output.

Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Jones <pjones@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-10-28 18:09:58 +00:00
Jan Kiszka a294c9bbd0 nVMX: Report CPU_BASED_VIRTUAL_NMI_PENDING as supported
If the host supports it, we can and should expose it to the guest as
well, just like we already do with PIN_BASED_VIRTUAL_NMIS.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-28 13:14:38 +01:00
Jan Kiszka cd2633c59b nVMX: Fix pick-up of uninjected NMIs
__vmx_complete_interrupts stored uninjected NMIs in arch.nmi_injected,
not arch.nmi_pending. So we actually need to check the former field in
vmcs12_save_pending_event. This fixes the eventinj unit test when run
in nested KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-28 13:14:24 +01:00
Jan Kiszka d3134dbf20 KVM: nVMX: Report 2MB EPT pages as supported
As long as the hardware provides us 2MB EPT pages, we can also expose
them to the guest because our shadow EPT code already supports this
feature.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-28 13:14:12 +01:00
Rafael J. Wysocki ce6bceabae Merge branch 'powercap'
* powercap:
  PowerCap: Convert class code to use dev_groups
  PowerCap: Introduce Intel RAPL power capping driver
  bitops: Introduce BIT_ULL
  x86 / msr: add 64bit _on_cpu access functions
  PowerCap: Add to drivers Kconfig and Makefile
  PowerCap: Add class driver
  PowerCap: Documentation
2013-10-28 01:21:49 +01:00
Rafael J. Wysocki 5171f4fa74 Merge branch 'acpi-assorted'
* acpi-assorted:
  ACPI: Add Toshiba NB100 to Vista _OSI blacklist
  ACPI / osl: remove an unneeded NULL check
  ACPI / platform: add ACPI ID for a Broadcom GPS chip
  ACPI: improve acpi_extract_package() utility
  ACPI / LPSS: fix UART Auto Flow Control
  ACPI / platform: Add ACPI IDs for Intel SST audio device
  x86 / ACPI: fix incorrect placement of __initdata tag
  ACPI / thermal: convert printk(LEVEL...) to pr_<lvl>
  ACPI / sysfs: make GPE sysfs attributes only accept correct values
  ACPI / EC: Convert all printk() calls to dynamic debug function
  ACPI / button: Using input_set_capability() to mark device's event capability
  ACPI / osl: implement acpi_os_sleep() with msleep()
2013-10-28 01:20:24 +01:00
Rafael J. Wysocki 3f66c315f5 Merge branch 'acpi-tables'
* acpi-tables:
  ACPI / x86: Increase override tables number limit
2013-10-28 01:13:29 +01:00
Rafael J. Wysocki 5c2aae8355 Merge branch 'acpi-hotplug'
* acpi-hotplug:
  ACPI / memhotplug: Use defined marco METHOD_NAME__STA
  ACPI / hotplug: Use kobject_init_and_add() instead of _init() and _add()
  ACPI / hotplug: Don't set kobject parent pointer explicitly
  ACPI / hotplug: Set kobject name via kobject_add(), not kobject_set_name()
  hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()
  hotplug / x86: Disable ARCH_CPU_PROBE_RELEASE on x86
  hotplug / x86: Add hotplug lock to missing places
  hotplug / x86: Fix online state in cpu0 debug interface
2013-10-28 01:12:41 +01:00
Rafael J. Wysocki 3fbc4d6374 Merge branch 'acpi-processor'
* acpi-processor:
  ACPI / processor: fixed a brace coding style issue
  ACPI / processor: Remove outdated comments
  ACPI / processor: remove unnecessary if (!pr) check
  ACPI / processor: remove some dead code in acpi_processor_get_info()
  x86 / ACPI: simplify _acpi_map_lsapic()
  ACPI / processor: use apic_id and remove duplicated _MAT evaluation
  ACPI / processor: Introduce apic_id in struct processor to save parsed APIC id
2013-10-28 01:11:24 +01:00
Rafael J. Wysocki 8e32e47dbb Merge branch 'acpi-cleanup'
* acpi-cleanup: (34 commits)
  ACPI / proc: Remove alarm proc file
  ACPI: Remove CONFIG_ACPI_PROCFS_POWER and cm_sbsc.c
  ACPI / SBS: Remove SBS's proc directory
  ACPI / Battery: Remove battery's proc directory
  ACP / fan: trivial style cleanup
  ACPI / processor: remove superfluous pr == NULL checks
  ACPI / mm: use NUMA_NO_NODE
  toshiba_acpi: convert acpi_evaluate_object() to acpi_evaluate_integer()
  intel-smartconnect: convert acpi_evaluate_object() to acpi_evaluate_integer()
  intel-rst: convert acpi_evaluate_object() to acpi_evaluate_integer()
  fujitsu-laptop: convert acpi_evaluate_object() to acpi_evaluate_integer()
  i2c-hid: convert acpi_evaluate_object() to acpi_evaluate_integer()
  ACPI: dock: convert acpi_evaluate_object() to acpi_evaluate_integer()
  acpi_processor: convert acpi_evaluate_object() to acpi_evaluate_integer()
  pnpacpi: convert acpi_get_handle() to acpi_has_method()
  wmi: convert acpi_get_handle() to acpi_has_method()
  toshiba_acpi: convert acpi_get_handle() to acpi_has_method()
  sony-laptop: convert acpi_get_handle() to acpi_has_method()
  intel_menlow: convert acpi_get_handle() to acpi_has_method()
  fujitsu-laptop: convert acpi_get_handle() to acpi_has_method()
  ...
2013-10-28 01:10:20 +01:00
Heiko Carstens 90f2492cf9 x86: remove this_cpu_xor() implementation
Remove the unused x86 implementation of this_cpu_xor().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-10-27 09:03:46 -04:00
Jan Beulich ee5872befc x86/time: Honor ACPI FADT flag indicating absence of a CMOS RTC
Even though the omission was found only during code review
(originally in the Xen hypervisor, looking through ACPI v5 flags
and their meanings and uses), we shouldn't be creating a
corresponding platform device in that case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/5265029D02000078000FC4D2@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 13:36:00 +02:00
Jan Beulich 09dc68d958 x86/cpu: Track legacy CPU model data only on 32-bit kernels
struct cpu_dev's c_models is only ever set inside CONFIG_X86_32
conditionals (or code that's being built for 32-bit only), so
there's no use of reserving the (empty) space for the model
names in a 64-bit kernel.

Similarly, c_size_cache is only used in the #else of a
CONFIG_X86_64 conditional, so reserving space for (and in one
case even initializing) that field is pointless for 64-bit
kernels too.

While moving both fields to the end of the structure, I also
noticed that:

 - the c_models array size was one too small, potentially causing
   table_lookup_model() to return garbage on Intel CPUs (intel.c's
   instance was lacking the sentinel with family being zero), so the
   patch bumps that by one,

 - c_models' vendor sub-field was unused (and anyway redundant
   with the base structure's c_x86_vendor field), so the patch deletes it.

Also rename the legacy fields so that their legacy nature stands out
and comment their declarations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5265036802000078000FC4DB@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 13:34:39 +02:00
Jan Beulich 7a3d9b0f3a x86: Unify copy_to_user() and add size checking to it
Similarly to copy_from_user(), where the range check is to
protect against kernel memory corruption, copy_to_user() can
benefit from such checking too: Here it protects against kernel
information leaks.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5265059502000078000FC4F6@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
2013-10-26 12:27:37 +02:00
Jan Beulich 3df7b41aa5 x86: Unify copy_from_user() size checking
Commits 4a31276930 ("x86: Turn the
copy_from_user check into an (optional) compile time warning")
and 63312b6a6f ("x86: Add a
Kconfig option to turn the copy_from_user warnings into errors")
touched only the 32-bit variant of copy_from_user(), whereas the
original commit 9f0cf4adb6 ("x86:
Use __builtin_object_size() to validate the buffer size for
copy_from_user()") also added the same code to the 64-bit one.

Further the earlier conversion from an inline WARN() to the call
to copy_from_user_overflow() went a little too far: When the
number of bytes to be copied is not a constant (e.g. [looking at
3.11] in drivers/net/tun.c:__tun_chr_ioctl() or
drivers/pci/pcie/aer/aer_inject.c:aer_inject_write()), the
compiler will always have to keep the funtion call, and hence
there will always be a warning. By using __builtin_constant_p()
we can avoid this.

And then this slightly extends the effect of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS in that apart from
converting warnings to errors in the constant size case, it
retains the (possibly wrong) warnings in the non-constant size
case, such that if someone is prepared to get a few false
positives, (s)he'll be able to recover the current behavior
(except that these diagnostics now will never be converted to
errors).

Since the 32-bit variant (intentionally) didn't call
might_fault(), the unification results in this being called
twice now. Adding a suitable #ifdef would be the alternative if
that's a problem.

I'd like to point out though that with
__compiletime_object_size() being restricted to gcc before 4.6,
the whole construct is going to become more and more pointless
going forward. I would question however that commit
2fb0815c9e ("gcc4: disable
__compiletime_object_size for GCC 4.6+") was really necessary,
and instead this should have been dealt with as is done here
from the beginning.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5265056D02000078000FC4F3@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 12:27:36 +02:00
Raghavendra K T 4c4e4f6136 x86/locking/kconfig: Update paravirt spinlock Kconfig description
Since the paravirt spinlock optimizations went into the v3.12 kernel
we have a very good performance benefit for paravirtualized KVM / Xen
kernels. Also we no longer suffer from 5% side effect on native
kernel that is mentioned in the Kconfig entry.

So update the Kconfig entry accordingly.

pvspinlock benefit on KVM link:

  https://lkml.org/lkml/2013/8/6/178

Attilio's tests on native kernel impact:

  http://blog.xen.org/index.php/2012/05/11/benchmarking-the-new-pv-ticketlock-implementation/

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: <konrad.wilk@oracle.com>
Cc: <linux@eikelenboom.it>
Cc: <gleb@redhat.com>
Cc: <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382371508-3843-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com
[ Updated the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 12:23:22 +02:00
Ingo Molnar d69525789e Merge tag 'please-pull-eMCA-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce
Pull MCE updates from Tony Luck:

 "There is a enhanced error logging mechanism for Xeon processors.

  Full description is here:

    http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html

  This patch series provides a module (and support code) to
  check for an extended error log and print extra details about
  the error on the console.
  "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 12:08:29 +02:00
Rafael J. Wysocki 17e8d03c48 Merge back earlier 'acpi-assorted' material. 2013-10-25 23:39:25 +02:00
Lan Tianyu 6d9153bbce x86/reboot: Correct pr_info() log message in the set_bios/pci/kbd_reboot()
commit:

  c767a54ba0 x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>

broke the log messages in the set_bios/pci/kbd_reboot() functions, by
putting the reboot method string and quirk entry's ident string in the
wrong order. This patch fixes it.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: holt@sgi.com
Cc: davej@fedoraproject.org
Cc: lenb@kernel.org
Cc: rjw@rjwysocki.net
Cc: awilliam@redhat.com
Link: http://lkml.kernel.org/r/1382598693-29334-1-git-send-email-tianyu.lan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-25 12:45:04 +02:00
Stefano Stabellini 7100b077ab xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
Introduce xen_dma_map_page, xen_dma_unmap_page,
xen_dma_sync_single_for_cpu and xen_dma_sync_single_for_device.
They have empty implementations on x86 and ia64 but they call the
corresponding platform dma_ops function on arm and arm64.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Changes in v9:
- xen_dma_map_page return void, avoid page_to_phys.
2013-10-25 10:39:49 +00:00
Grant Likely 16b84e5a50 of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
Several architectures open code effectively the same code block for
finding and mapping PCI irqs. This patch consolidates it down to a
single function.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-24 11:50:36 +01:00
Grant Likely e6d30ab1e7 of/irq: simplify args to irq_create_of_mapping
All the callers of irq_create_of_mapping() pass the contents of a struct
of_phandle_args structure to the function. Since all the callers already
have an of_phandle_args pointer, why not pass it directly to
irq_create_of_mapping()?

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-24 11:42:57 +01:00
Grant Likely 530210c781 of/irq: Replace of_irq with of_phandle_args
struct of_irq and struct of_phandle_args are exactly the same structure.
This patch makes the kernel use of_phandle_args everywhere. This in
itself isn't a big deal, but it makes some follow-on patches simpler.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-24 11:42:51 +01:00
Grant Likely 0c02c8007e of/irq: Rename of_irq_map_* functions to of_irq_parse_*
The OF irq handling code has been overloading the term 'map' to refer to
both parsing the data in the device tree and mapping it to the internal
linux irq system. This is probably because the device tree does have the
concept of an 'interrupt-map' function for translating interrupt
references from one node to another, but 'map' is still confusing when
the primary purpose of some of the functions are to parse the DT data.

This patch renames all the of_irq_map_* functions to of_irq_parse_*
which makes it clear that there is a difference between the parsing
phase and the mapping phase. Kernel code can make use of just the
parsing or just the mapping support as needed by the subsystem.

The patch was generated mechanically with a handful of sed commands.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-24 11:40:59 +01:00
David S. Miller c3fa32b976 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c
	include/net/dst.h

Trivial merge conflicts, both were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 16:49:34 -04:00
Mark Salter 77fbbc8112 x86: select ARCH_MIGHT_HAVE_PC_PARPORT
Architectures which support CONFIG_PARPORT_PC should select
ARCH_MIGHT_HAVE_PC_PARPORT.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: x86@kernel.org
2013-10-23 16:00:16 -04:00
Chen, Gong 147de14772 ACPI, APEI, CPER: Add UEFI 2.4 support for memory error
In latest UEFI spec(by now it is 2.4) memory error definition
for CPER (UEFI 2.4 Appendix N Common Platform Error Record)
adds some new fields. These fields help people to locate
memory error to an actual DIMM location.

Original-author: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-10-23 10:10:20 -07:00
Chen, Gong dd6dad4288 DMI: Parse memory device (type 17) in SMBIOS
This patch adds a new interface to decode memory device (type 17)
to help error reporting on DIMMs.

Original-author: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-10-23 10:10:12 -07:00
Chen, Gong 4b3db708b1 ACPI, x86: Extended error log driver for x86 platform
This H/W error log driver (a.k.a eMCA driver) is implemented based on
http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html

After errors are captured, more detailed platform specific information
can be got via this new enhanced H/W error log driver. Most notably we
can track memory errors back to the DIMM slot silk screen label.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-10-23 10:09:07 -07:00
Linus Torvalds db10accfd2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Sorry I let so much accumulate, I was in Buffalo and wanted a few
  things to cook in my tree for a while before sending to you.  Anyways,
  it's a lot of little things as usual at this stage in the game"

 1) Make bonding MAINTAINERS entry reflect reality, from Andy
    Gospodarek.

 2) Fix accidental sock_put() on timewait mini sockets, from Eric
    Dumazet.

 3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6
    addresses, from François CACHEREUL.

 4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
    Carpenter.

 5) tcp_shifted_skb() doesn't take handle FINs properly, from Eric
    Dumazet.

 6) SFC driver bug fixes from Ben Hutchings.

 7) Fix TX packet scheduling wedge after channel change in ath9k driver,
    from Felix Fietkau.

 8) Fix user after free in BPF JIT code, from Alexei Starovoitov.

 9) Source address selection test is reversed in
    __ip_route_output_key(), fix from Jiri Benc.

10) VLAN and CAN layer mis-size netlink attributes, from Marc
    Kleine-Budde.

11) Fix permission checks in sysctls to use current_euid() instead of
    current_uid().  From Eric W Biederman.

12) IPSEC policies can go away while a timer is still pending for them,
    add appropriate ref-counting to fix, from Steffen Klassert.

13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
    chips, from Nguyen Hong Ky and Simon Horman.

14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.

15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
    Ghorbel.

16) Fix typo in fq_change(), we were assigning "initial quantum" to
    "quantum".  From Eric Dumazet.

17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
    FQ packet scheduler does not pace those flows properly.  Also from
    Eric Dumazet.

18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.

19) l2tp_xmit_skb() can be called from process context, not just softirq
    context, so we must always make sure to BH disable around it.  From
    Eric Dumazet.

20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
    packet scheduler.  From Stephen Hemminger.

21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
    Carpenter and Salva Peiró.

22) Fix PHY reset and other issues in dm9000 driver, from Nikita
    Kiryanov and Michael Abbott.

23) When hardware can do SCTP crc32 checksums, we accidently don't
    disable the csum offload when IPSEC transformations have been
    applied.  From Fan Du and Vlad Yasevich.

24) Tail loss probing in TCP leaves the socket in the wrong congestion
    avoidance state.  From Yuchung Cheng.

25) In CPSW driver, enable NAPI before interrupts are turned on, from
    Markus Pargmann.

26) Integer underflow and dual-assignment in YAM hamradio driver, from
    Dan Carpenter.

27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
    unclone it.  This fixes various hard to track down crashes in
    drivers where the SKBs ->gso_segs was changing right from underneath
    the driver during TX queueing.  From Eric Dumazet.

28) Fix the handling of VLAN IDs, and in particular the special IDs 0
    and 4095, in the bridging layer.  From Toshiaki Makita.

29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.

30) Fix race in socket credential passing, from Daniel Borkmann.

31) WHen NETLABEL is disabled, we don't validate CIPSO packets properly,
    from Seif Mazareeb.

32) Fix identification of fragmented frames in ipv4/ipv6 UDP
    Fragmentation Offload output paths, from Jiri Pirko.

33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
    Elior.

34) When we removed the explicit neighbour pointer from ipv6 routes a
    slight regression was introduced for users such as IPVS, xt_TEE, and
    raw sockets.  We mix up the users requested destination address with
    the routes assigned nexthop/gateway.  From Julian Anastasov and
    Simon Horman.

35) Fix stack overruns in rt6_probe(), the issue is that can end up
    doing two full packet xmit paths at the same time when emitting
    neighbour discovery messages.  From Hannes Frederic Sowa.

36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
    Mariusz Ceier.

37) Make sure to set TCP sk_pacing_rate after the first legitimate RTT
    sample, from Neal Cardwell.

38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
    Steffen Klassert.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits)
  ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
  ax88179_178a: Correct the RX error definition in RX header
  Revert "bridge: only expire the mdb entry when query is received"
  tcp: initialize passive-side sk_pacing_rate after 3WHS
  davinci_emac.c: Fix IFF_ALLMULTI setup
  mac802154: correct a typo in ieee802154_alloc_device() prototype
  ipv6: probe routes asynchronous in rt6_probe
  netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
  ipv6: fill rt6i_gateway with nexthop address
  ipv6: always prefer rt6i_gateway if present
  bnx2x: Set NETIF_F_HIGHDMA unconditionally
  bnx2x: Don't pretend during register dump
  bnx2x: Lock DMAE when used by statistic flow
  bnx2x: Prevent null pointer dereference on error flow
  bnx2x: Fix config when SR-IOV and iSCSI are enabled
  bnx2x: Fix Coalescing configuration
  bnx2x: Unlock VF-PF channel on MAC/VLAN config error
  bnx2x: Prevent an illegal pointer dereference during panic
  bnx2x: Fix Maximum CoS estimation for VFs
  drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing
  ...
2013-10-23 07:47:42 +01:00
Hannes Frederic Sowa a8fab07445 x86/jump_label: expect default_nop if static_key gets enabled on boot-up
net_get_random_once(intrduced in the next patch) uses static_keys in
a way that they get enabled on boot-up instead of replaced with an
ideal_nop. So check for default_nop on initial enabling.

Other architectures don't check for this.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: x86@kernel.org
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19 19:45:35 -04:00
Linus Torvalds 9219cec5f2 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixlets:

   - fix a (rare-config) build bug
   - fix a next-gen SGI/UV hw/firmware enumeration bug"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Update UV3 hub revision ID
  x86/microcode: Correct Kconfig dependencies
2013-10-18 12:25:11 -07:00
Kees Cook aec58bafaf x86/relocs: Add percpu fixup for GNU ld 2.23
The GNU linker tries to put __per_cpu_load into the percpu area,
resulting in a lack of its relocation. Force this symbol to be
relocated. Seen starting with GNU ld 2.23 and later.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Cc: Cong Ding <dinggnu@gmail.com>
Link: http://lkml.kernel.org/r/20131016064314.GA2739@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-18 08:45:09 +02:00
David Cohen 40a96d54ee intel_mid: Move platform device setups to their own platform_<device>.* files
As Intel rolling out more SoC's after Moorestown, we need to
re-structure the code in a way that is backward compatible and easy to
expand. This patch implements a flexible way to support multiple boards
and devices.

This patch does not add any new functional support. It just refactors
the existing code to increase the modularity and decrease the code
duplication for supporting multiple soc's and boards.

Currently intel-mid.c has both board and soc related code in one file.
This patch moves the board related code to new files and let linker
script to create SFI devite table following this:

1. Move the SFI device specific code to
   arch/x86/platform/intel-mid/device-libs/platform_<device>.*
   A new device file is added for every supported device. This code will
   get conditionally compiled by using corresponding device driver
   CONFIG option.

2. Move the device_ids location to .x86_intel_mid_dev.init section by
   using new sfi_device() macro.

This patch was based on previous code from Sathyanarayanan Kuppuswamy.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-13-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:41:50 -07:00
David Cohen 66ac501370 x86: intel-mid: Add section for sfi device table
When Intel mid uses SFI table to enumerate devices, it requires an extra
device table with further information about how to probe such devices.

This patch creates a section where the device table will stay if
CONFIG_X86_INTEL_MID is selected.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-12-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:41:04 -07:00
David Cohen 0e6fdb5f03 intel-mid: sfi: Allow struct devs_id.get_platform_data to be NULL
Intel mid sfi code doesn't need struct devs_id.get_platform_data != NULL.
If the callback is not set, just assume there is no platform_data.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-11-git-send-email-david.a.cohen@linux.intel.com
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:59 -07:00
Kuppuswamy Sathyanarayanan aeedb370e7 intel_mid: Moved SFI related code to sfi.c
Moved SFI specific parsing/handling code to sfi.c. This will enable us
to reuse our intel-mid code for platforms that supports firmware
interfaces other than SFI (like ACPI).

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-10-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:51 -07:00
Kuppuswamy Sathyanarayanan 49c72a0a8a intel_mid: Added custom handler for ipc devices
Added a custom handler for medfield based ipc devices and
moved devs_id structure defintion to header file.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-9-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:50 -07:00
Kuppuswamy Sathyanarayanan 3fd79ae427 intel_mid: Added custom device_handler support
This patch provides a means to add custom handler for
SFI devices. If you set device_handler as NULL in
device_id table standard SFI device handler will be used.
If its not NULL custom handler will be called.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-8-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:49 -07:00
Kuppuswamy Sathyanarayanan 661b010765 intel_mid: Refactored sfi_parse_devs() function
SFI device_id[] table parsing code is duplicated in every SFI
device handler. This patch removes this code duplication, by
adding a seperate function get_device_id() to parse through the
device table. Also this patch moves the SPI, I2C, IPC info code from
sfi_parse_devs() to respective device handlers.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-7-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:48 -07:00
Kuppuswamy Sathyanarayanan 712b6aa873 intel_mid: Renamed *mrst* to *intel_mid*
mrst is used as common name to represent all intel_mid type
soc's. But moorsetwon is just one of the intel_mid soc. So
renamed them to use intel_mid.

This patch mainly renames the variables and related
functions that uses *mrst* prefix with *intel_mid*.

To ensure that there are no functional changes, I have compared
the objdump of related files before and after rename and found
the only difference is symbol and name changes.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-6-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:47 -07:00
Fengguang Wu 6c21b176a9 pci: intel_mid: Return true/false in function returning bool
Function 'type1_access_ok' should return bool value, not 0/1.
This patch changes 'return 0/1' to 'return false/true'.

Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-5-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:45 -07:00
Kuppuswamy Sathyanarayanan 05454c26eb intel_mid: Renamed *mrst* to *intel_mid*
Following files contains code that is common to all intel mid
soc's. So renamed them as below.

mrst/mrst.c              -> intel-mid/intel-mid.c
mrst/vrtc.c              -> intel-mid/intel_mid_vrtc.c
mrst/early_printk_mrst.c -> intel-mid/intel_mid_vrtc.c
pci/mrst.c               -> pci/intel_mid_pci.c

Also, renamed the corresponding header files and made changes
to the driver files that included these header files.

To ensure that there are no functional changes, I have compared
the objdump of renamed files before and after rename and found
that the only difference is file name change.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-4-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:36 -07:00
Kuppuswamy Sathyanarayanan d8059302b3 mrst: Fixed indentation issues
Fixed indentation issues reported by checkpatch script in
mrst related files.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:35 -07:00
Kuppuswamy Sathyanarayanan 001d4c7aea mrst: Fixed printk/pr_* related issues
Fixed printk and pr_* related issues in mrst related files.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-17 16:40:33 -07:00
Gleb Natapov 13acfd5715 Powerpc KVM work is based on a commit after rc4.
Merging master into next to satisfy the dependencies.

Conflicts:
	arch/arm/kvm/reset.c
2013-10-17 17:41:49 +03:00
Aneesh Kumar K.V 5587027ce9 kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 15:49:23 +02:00
Jacob Pan 1a6b991a98 x86 / msr: add 64bit _on_cpu access functions
Having 64-bit MSR access methods on given CPU can avoid shifting and
simplify MSR content manipulation. We already have other combinations
of rdmsrl_xxx and wrmsrl_xxx but missing the _on_cpu version.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-17 00:36:06 +02:00
Peter Zijlstra 9536c8d2da perf/x86: Optimize intel_pmu_pebs_fixup_ip()
There's been reports of high NMI handler overhead, highlighted by
such kernel messages:

  [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
  [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs

Don Zickus analyzed the source of the overhead and reported:

 > While there are a few places that are causing latencies, for now I focused on
 > the longest one first.  It seems to be 'copy_user_from_nmi'
 >
 > intel_pmu_handle_irq ->
 >	intel_pmu_drain_pebs_nhm ->
 >		__intel_pmu_drain_pebs_nhm ->
 >			__intel_pmu_pebs_event ->
 >				intel_pmu_pebs_fixup_ip ->
 >					copy_from_user_nmi
 >
 > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
 > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
 > (there are some cases where only 10 iterations are needed to go that high
 > too, but in generall over 50 or so).  At this point copy_user_from_nmi
 > seems to account for over 90% of the nmi latency.

The solution to that is to avoid having to call copy_from_user_nmi() for
every instruction.

Since we already limit the max basic block size, we can easily
pre-allocate a piece of memory to copy the entire thing into in one
go.

Don reported this test result:

 > Your patch made a huge difference in improvement.  The
 > copy_from_user_nmi() no longer hits the million of cycles.  I still
 > have a batch of 100,000-300,000 cycles.  My longest NMI paths used
 > to be dominated by copy_from_user_nmi, now it is not (I have to dig
 > up the new hot path).

Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-16 15:44:00 +02:00
Linus Torvalds b83aea88d3 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Gleb Natapov.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Enable pvspinlock after jump_label_init() to avoid VM hang
2013-10-15 16:22:51 -07:00
Linus Torvalds 36704263f1 A small fix for Xen on x86_32 and a build fix for xen-tpmfront on arm64.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJSXSSbAAoJEIlPj0hw4a6QkIoQAMe69LtP8x4l7ZfgqT/QxwHK
 COGw+KpPXZPxASpCJO9aTXW/Mr54X9BpC4if6OmpRA6BXyU2sTrp2TIeuPbZCFvg
 DjFy4g6Snd9yvb0FPbIcUbMbUZUWmLA05PstUZV+FNub2l9ie7WWzfjyVNQv/izA
 AJ6J5GDSPxHegoygeBCCrn0Qc5rgN6soxhRDndmVGLxlBNRs71ORojRMGUrWY2X/
 OhxSj97hCXZHZyaQBI2JfuNzmFAVGvdLVN3NGzROKK1u8zKGn1I1kL39OU1QkCHZ
 iUmBeJt7jzetS85XcDxCzcErWs6K6HjFqiSF1tC6GGOtX73oqNqV4weIzNSdO+lc
 tLUGDfJSJdaVB6tNoGsywqx380CHyJi6WjUR21o2RSW8TtsNrzkpqLE0Aqv84tnl
 dyTgykzmpKhtmchjHHO4mGDioCRDYLcok3gol2IRc6P+BYXzHgwvWkH/TWMhvVjK
 IHGTd+cuUqrs7MmuxGZ/kQiqV6OegdFZYrW6E6vulITUMWNPxUaXFBu9njzrMxGt
 2bLeC+xDTi2BW9ihaln+zVUxmHZ2TLq6fHMZ/XTWrwGNNdUhKU/jNIWqC25MxSX8
 vOVnCLvT+AqqSU82ng3JL5LvnZFGlAd/cUdVEdo723xRNz6OgWLbrhi7Pq/OkjIJ
 9VKoA3zX8mgiMl1Wgp9t
 =wJOk
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen fixes from Stefano Stabellini:
 "A small fix for Xen on x86_32 and a build fix for xen-tpmfront on
  arm64"

* tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Fix possible user space selector corruption
  tpm: xen-tpmfront: fix missing declaration of xen_domain
2013-10-15 16:22:11 -07:00
Raghavendra K T 3dbef3e3bf KVM: Enable pvspinlock after jump_label_init() to avoid VM hang
We use jump label to enable pv-spinlock. With the changes in (442e0973e9
Merge branch 'x86/jumplabel'), the jump label behaviour has changed
that would result in eventual hang of the VM since we would end up in a
situation where slow path locks would halt the vcpus but we will not be
able to wakeup the vcpu by lock releaser using unlock kick.

Similar problem in Xen and more detailed description is available in
a945928ea2 (xen: Do not enable spinlocks before jump_label_init()
has executed)

This patch splits kvm_spinlock_init to separate jump label changes with
pvops patching and also make jump label enabling after jump_label_init().

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-15 14:15:54 +03:00
chai wen f2e106692d KVM: Drop FOLL_GET in GUP when doing async page fault
Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP.  Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-15 13:43:37 +03:00
Russ Anderson dd3c9c4b60 x86: Update UV3 hub revision ID
The UV3 hub revision ID is different than expected.  The first
revision was supposed to start at 1 but instead will start at 0.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: <stable@kernel.org> # v3.9, v3.10, v3.11
Link: http://lkml.kernel.org/r/20131014161733.GA6274@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-15 08:44:46 +02:00
Ingo Molnar 426ee9e3bb Linux 3.12-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSWyGgAAoJEHm+PkMAQRiGA8MH/35upHXImoRCsI5uC1qvHJtI
 QvQAhDFxoEXbFUKeaYgTfcM8q9FgnqnfjhLf8eYa4Q7tDZeqLXOE8bkI807mSZMl
 yECr3jcwlV+zyhV2MP/HdwTjzy25bwxLM3Zy43S7QROrYoMHZYznil/QPfyMATCJ
 XLPuXZC1FtuUen89n4BoDIuL8QaVrIR/zLqFklAQcdTcGpLHSOwFtH8gb2WaRLhv
 +4IikFRFgTNZiMR5tP0GPc6UH6TVTvRb4QKSqqa7J8OmfAIvOzAUdhqWSPOIwWwt
 Z/+JFxFDczAcNmpv4gE6jkgc2vR8CVeHsvh0j61RDSFObBWspwk337CSyUZxYSA=
 =w4VQ
 -----END PGP SIGNATURE-----

Merge tag 'v3.12-rc5' into perf/core

Merge Linux v3.12-rc5, to pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-15 07:05:18 +02:00
Michael Opdenacker aa5e5dc2a8 treewide: fix "distingush" typo
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-10-14 15:38:33 +02:00
Maxime Jayat 3f79410c7c treewide: Fix common typo in "identify"
Correct common misspelling of "identify" as "indentify" throughout
the kernel

Signed-off-by: Maxime Jayat <maxime@artisandeveloppeur.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-10-14 15:31:06 +02:00
Borislav Petkov 80030e3d8e x86/microcode: Correct Kconfig dependencies
I have a randconfig here which has enabled only

  CONFIG_MICROCODE=y
  CONFIG_MICROCODE_OLD_INTERFACE=y

with both

  # CONFIG_MICROCODE_INTEL is not set
  # CONFIG_MICROCODE_AMD is not set

off. Which makes building the microcode functionality a little
pointless. Don't do that in such cases then.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1381682189-14470-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-14 09:24:27 +02:00
Christoffer Dall 6d9d41e574 KVM: Move gfn_to_index to x86 specific code
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures.  Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:11:44 +03:00
H. Peter Anvin 6e6a4932b0 x86, boot: Rename get_flags() and check_flags() to *_cpuflags()
When a function is used in more than one file it may not be possible
to immediately tell from context what the intended meaning is.  As
such, it is more important that the naming be self-evident.  Thus,
change get_flags() to get_cpuflags().

For consistency, change check_flags() to check_cpuflags() even though
it is only used in cpucheck.c.

Link: http://lkml.kernel.org/r/1381450698-28710-2-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 04:08:56 -07:00
Kees Cook 6145cfe394 x86, kaslr: Raise the maximum virtual address to -1 GiB on x86_64
On 64-bit, this raises the maximum location to -1 GiB (from -1.5 GiB),
the upper limit currently, since the kernel fixmap page mappings need
to be moved to use the other 1 GiB (which would be the theoretical
limit when building with -mcmodel=kernel).

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-7-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:13:13 -07:00
Kees Cook f32360ef66 x86, kaslr: Report kernel offset on panic
When the system panics, include the kernel offset in the report to assist
in debugging.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-6-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:12:24 -07:00
Kees Cook 82fa9637a2 x86, kaslr: Select random position from e820 maps
Counts available alignment positions across all e820 maps, and chooses
one randomly for the new kernel base address, making sure not to collide
with unsafe memory areas.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-5-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:12:19 -07:00
Kees Cook 5bfce5ef55 x86, kaslr: Provide randomness functions
Adds potential sources of randomness: RDRAND, RDTSC, or the i8254.

This moves the pre-alternatives inline rdrand function into the header so
both pieces of code can use it. Availability of RDRAND is then controlled
by CONFIG_ARCH_RANDOM, if someone wants to disable it even for kASLR.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-4-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:12:12 -07:00
Kees Cook 8ab3820fd5 x86, kaslr: Return location from decompress_kernel
This allows decompress_kernel to return a new location for the kernel to
be relocated to. Additionally, enforces CONFIG_PHYSICAL_START as the
minimum relocation position when building with CONFIG_RELOCATABLE.

With CONFIG_RANDOMIZE_BASE set, the choose_kernel_location routine
will select a new location to decompress the kernel, though here it is
presently a no-op. The kernel command line option "nokaslr" is introduced
to bypass these routines.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-3-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:12:07 -07:00
Kees Cook dd78b97367 x86, boot: Move CPU flags out of cpucheck
Refactor the CPU flags handling out of the cpucheck routines so that
they can be reused by the future ASLR routines (in order to detect CPU
features like RDRAND and RDTSC).

This reworks has_eflag() and has_fpu() to be used on both 32-bit and
64-bit, and refactors the calls to cpuid to make them PIC-safe on 32-bit.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1381450698-28710-2-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:12:02 -07:00
Michael Davidson d751c169e9 x86, relocs: Add more per-cpu gold special cases
The "gold" linker doesn't seem to put some additional per-cpu cases in
the right place. Add these to the per-cpu check. Without this, the kASLR
patch series fails to correctly apply relocations, and fails to boot.

Signed-off-by: Michael Davidson <md@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20131011013954.GA28902@www.outflux.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-13 03:11:57 -07:00
Linus Torvalds c786e90bb2 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull gcc "asm goto" miscompilation workaround from Ingo Molnar:
 "This is the fix for the GCC miscompilation discussed in the following
  lkml thread:

    [x86] BUG: unable to handle kernel paging request at 00740060

  The bug in GCC has been fixed by Jakub and the fix will be part of the
  GCC 4.8.2 release expected to be released next week - so the quirk's
  version test checks for <= 4.8.1.

  The quirk is only added to compiler-gcc4.h and not to the higher level
  compiler.h because all asm goto uses are behind a feature check"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
2013-10-12 11:06:18 -07:00
Linus Torvalds 71ac3d1938 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "A build fix and a reboot quirk"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/reboot: Add reboot quirk for Dell Latitude E5410
  x86, build, pci: Fix PCI_MSI build on !SMP
2013-10-12 10:36:03 -07:00
Ingo Molnar 88f182dd77 x86: Apply the asm_volatile_goto() compiler quirk
Apply the asm_volatile_goto() compiler quirk to the new rmwcc.h
file as well, introduced in:

   c2daa3bed5 sched, x86: Provide a per-cpu preempt_count implementation

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 07:41:43 +02:00
Ingo Molnar ec0ad3d01f Merge branch 'core/urgent' into sched/core
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 07:39:37 +02:00
Ingo Molnar 3f0116c323 compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670

Implement a workaround suggested by Jakub Jelinek.

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 07:39:14 +02:00
K. Y. Srinivasan 90ab9d5510 x86, hyperv: Correctly guard the local APIC calibration code
The code that gets the local APIC timer frequency from the hypervisor
rather depends on there being a local APIC.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-10 15:21:38 -07:00
K. Y. Srinivasan 9e7827b5ea x86, hyperv: Get the local APIC timer frequency from the hypervisor
Hyper-V supports a mechanism for retrieving the local APIC frequency.
Use this and bypass the calibration code in the kernel . This would
allow us to boot the Linux kernel as a "modern VM" on Hyper-V where
many of the legacy devices (such as PIT) are not emulated.

I would like to thank Olaf Hering <olaf@aepfle.de>, Jan Beulich <JBeulich@suse.com> and
H. Peter Anvin <h.peter.anvin@intel.com> for their help in this effort.

In this version of the patch, I have addressed Jan's comments.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1380554932-9888-1-git-send-email-olaf@aepfle.de
Tested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-10 11:44:12 -07:00
Linus Torvalds 8273548c54 Fixes for 3.12-rc5. Two old PPC bugs and one new (3.12-rc2) x86 bug.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSVndRAAoJEBvWZb6bTYby/SIP/26s8gh/TkTa6eYVXAOIq2PE
 0T2rCN2KYT/huDuoH5SJ1dMUweWwmwaXsZVD8iNVuvbZSIUcC7j6hlPg+IIl7uD4
 EEKFPsCehCumleCsoeyYu14YPFApp8CTbx2nXre2fT3z82bEdFbdAF8i0nwCWsBF
 7yvQYMl/qDhWotNFnn9D9i3VaiJwn4BrQxWsGkSssOr1+YRD7o718qErmqwliECv
 m25QQCr+kX7O5Le9kaQ0LsMMzUWnR63VHsBArXsLP+bKozC2AgyTCku3QVYGk3K3
 uYeen0wbErhZB2JDZq4NUYVltvF0gNoOWwgfRGuOm4ZRfP5g1Paeu1xDDs3TYXkG
 9/fkeJRsFpPm+o783IhZaTwTh8EOIcpSyrAq/mC9fCRYf0u50aKOexjV+Sj0Lsya
 jC9MkGYB6sT3q+XmYKntVqmkemjgr6ku1TLMS7zouR0g8IsnJ9oEKzxE/0PXgIr0
 IkOD6rceSmG8KR5r57GYO1zdCyfaPgKAgAzeifBHnCaWQiX/Id9rO1nk9mzMIa70
 xQw7LcUC0rzN6t+zS236wrfCzDAh/trZep+WTU+98WC4fqxyo8vlPNhEu/In56jw
 mhNTb3UCsPyzUEXDdkeZN/Hy4RZ/WbMAeuh8ZcQzn+grK8h5Rura07B4zKluhjvx
 DLAjE/JmEzfBuhpoXsZn
 =O6an
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Fixes for 3.12-rc5: two old PPC bugs and one new (3.12-rc2) x86 bug"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: ppc: booke: check range page invalidation progress on page setup
  KVM: PPC: Book3S HV: Fix typo in saving DSCR
  KVM: nVMX: fix shadow on EPT
2013-10-10 11:33:48 -07:00
Arthur Chunqi Li 7854cbca81 KVM: nVMX: Fully support nested VMX preemption timer
This patch contains the following two changes:
1. Fix the bug in nested preemption timer support. If vmexit L2->L0
with some reasons not emulated by L1, preemption timer value should
be save in such exits.
2. Add support of "Save VMX-preemption timer value" VM-Exit controls
to nVMX.

With this patch, nested VMX preemption timer features are fully
supported.

Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-10 18:22:54 +02:00
Frediano Ziglio 7cde9b27e7 xen: Fix possible user space selector corruption
Due to the way kernel is initialized under Xen is possible that the
ring1 selector used by the kernel for the boot cpu end up to be copied
to userspace leading to segmentation fault in the userspace.

Xen code in the kernel initialize no-boot cpus with correct selectors (ds
and es set to __USER_DS) but the boot one keep the ring1 (passed by Xen).
On task context switch (switch_to) we assume that ds, es and cs already
point to __USER_DS and __KERNEL_CSso these selector are not changed.

If processor is an Intel that support sysenter instruction sysenter/sysexit
is used so ds and es are not restored switching back from kernel to
userspace. In the case the selectors point to a ring1 instead of __USER_DS
the userspace code will crash on first memory access attempt (to be
precise Xen on the emulated iret used to do sysexit will detect and set ds
and es to zero which lead to GPF anyway).

Now if an userspace process call kernel using sysenter and get rescheduled
(for me it happen on a specific init calling wait4) could happen that the
ring1 selector is set to ds and es.

This is quite hard to detect cause after a while these selectors are fixed
(__USER_DS seems sticky).

Bisecting the code commit 7076aada10 appears
to be the first one that have this issue.

Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2013-10-10 14:39:37 +00:00
Stefano Stabellini 1b65c4e5a9 swiotlb-xen: use xen_alloc/free_coherent_pages
Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
free coherent pages.

We need to be careful handling the pointer returned by
xen_alloc_coherent_pages, because on ARM the pointer is not equal to
phys_to_virt(*dma_handle). In fact virt_to_phys only works for kernel
direct mapped RAM memory.
In ARM case the pointer could be an ioremap address, therefore passing
it to virt_to_phys would give you another physical address that doesn't
correspond to it.

Make xen_create_contiguous_region take a phys_addr_t as start parameter to
avoid the virt_to_phys calls which would be incorrect.

Changes in v6:
- remove extra spaces.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-10-10 13:41:10 +00:00
Alexey Neyman 3ad674d6c6 x86/PCI: Coalesce multiple overlapping host bridge windows
Previously we coalesced windows by expanding the first overlapping one and
making the second invalid.  But we never look at the expanded first window
again, so we fail to notice other windows that overlap it.  For example, we
coalesced these:

  [io  0x0000-0x03af] // #0
  [io  0x03e0-0x0cf7] // #1
  [io  0x0000-0xdfff] // #2

into these, which still overlap:

  [io  0x0000-0xdfff] // #0
  [io  0x03e0-0x0cf7] // #1

The fix is to expand the *second* overlapping resource and ignore the
first, so we get this instead with no overlaps:

  [io  0x0000-0xdfff] // #2

[bhelgaas: changelog]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=62511
Signed-off-by: Alexey Neyman <stilor@att.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-10 07:11:59 -06:00
Gleb Natapov d0d538b9d1 KVM: nVMX: fix shadow on EPT
72f857950f broke shadow on EPT. This patch reverts it and fixes PAE
on nEPT (which reverted commit fixed) in other way.

Shadow on EPT is now broken because while L1 builds shadow page table
for L2 (which is PAE while L2 is in real mode) it never loads L2's
GUEST_PDPTR[0-3].  They do not need to be loaded because without nested
virtualization HW does this during guest entry if EPT is disabled,
but in our case L0 emulates L2's vmentry while EPT is enables, so we
cannot rely on vmcs12->guest_pdptr[0-3] to contain up-to-date values
and need to re-read PDPTEs from L2 memory. This is what kvm_set_cr3()
is doing, but by clearing cache bits during L2 vmentry we drop values
that kvm_set_cr3() read from memory.

So why the same code does not work for PAE on nEPT? kvm_set_cr3()
reads pdptes into vcpu->arch.walk_mmu->pdptrs[]. walk_mmu points to
vcpu->arch.nested_mmu while nested guest is running, but ept_load_pdptrs()
uses vcpu->arch.mmu which contain incorrect values. Fix that by using
walk_mmu in ept_(load|save)_pdptrs.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-10 11:39:57 +02:00
Rob Herring 32df8dca50 of: remove HAVE_ARCH_DEVTREE_FIXUPS
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc,
but it is only used for /proc/device-teee and sparc does not enable
/proc/device-tree. So this option is redundant. Remove the option and
always enable it. This has the side effect of fixing /proc/device-tree
on arches such as arm64 which failed to define this option.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
2013-10-09 20:04:08 -05:00
Rob Herring 25ff79443c of: implement pci_address_to_pio as weak function
Implement pci_address_to_pio as weak function to remove the dependency on
asm/prom.h. This is in preparation to make prom.h optional.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Grant Likely <grant.likely@linaro.org>
2013-10-09 20:04:06 -05:00
Rob Herring ba904f0649 x86: add necessary includes for prom.h
Once prom.h is no longer implicitly included, we need to include setup.h
to get COMMAND_LINE_SIZE.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
2013-10-09 20:04:06 -05:00
Stefano Stabellini d6fe76c58c xen: introduce xen_alloc/free_coherent_pages
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
and devices. On native x86 is sufficient to call __get_free_pages in
order to get a coherent buffer, while on ARM (and potentially ARM64) we
need to call the native dma_ops->alloc implementation.

Introduce xen_alloc_coherent_pages to abstract the arch specific buffer
allocation.

Similarly introduce xen_free_coherent_pages to free a coherent buffer:
on x86 is simply a call to free_pages while on ARM and ARM64 is
arm_dma_ops.free.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


Changes in v7:
- rename __get_dma_ops to __generic_dma_ops;
- call __generic_dma_ops(hwdev)->alloc/free on arm64 too.

Changes in v6:
- call __get_dma_ops to get the native dma_ops pointer on arm.
2013-10-09 17:18:14 +00:00
Stefano Stabellini 69908907b0 xen: make xen_create_contiguous_region return the dma address
Modify xen_create_contiguous_region to return the dma address of the
newly contiguous buffer.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>


Changes in v4:
- use virt_to_machine instead of virt_to_bus.
2013-10-09 16:56:32 +00:00
Stefano Stabellini 2f558d4091 xen/x86: allow __set_phys_to_machine for autotranslate guests
Allow __set_phys_to_machine to be called for autotranslate guests.
It can be used to keep track of phys_to_machine changes, however we
don't do anything with the information at the moment.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-10-09 20:39:01 +00:00
Rob Herring 29eb45a9ab of: remove early_init_dt_setup_initrd_arch
All arches do essentially the same thing now for
early_init_dt_setup_initrd_arch, so it can now be removed.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
2013-10-09 11:39:01 -05:00
Rob Herring 21c561bda1 x86: use unflatten_and_copy_device_tree
Use the common unflatten_and_copy_device_tree to copy the built-in FDT
out of init section.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
2013-10-09 11:38:05 -05:00
Ingo Molnar 37bf06375c Linux 3.12-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
 T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
 I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
 UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
 QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
 8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
 =xSFJ
 -----END PGP SIGNATURE-----

Merge tag 'v3.12-rc4' into sched/core

Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.

Conflicts:
	arch/avr32/include/asm/Kbuild

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 12:36:13 +02:00
Andre Richter 1224987384 x86, msr: Use file_inode(), not f_mapping->host
As discussed in [1], exchange f_mapping->host with file_inode().  This
is a bug, but happens to be non-manifest in this case.

[1] http://lkml.kernel.org/r/20131007190357.GA13318@ZenIV.linux.org.uk

Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Link: http://lkml.kernel.org/r/1381224142-3267-1-git-send-email-andre.o.richter@gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-08 12:01:29 -07:00
Geyslan G. Bem 49449c30c4 x86: mkpiggy.c: Explicitly close the output file
Even though the resource is released when the application is closed or
when returned from main function, modify the code to make it obvious,
and to keep static analysis tools from complaining.

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Link: http://lkml.kernel.org/r/1381184219-10985-1-git-send-email-geyslan@gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-08 11:36:09 -07:00
Linus Torvalds 0e7a3ed04f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Various fixlets:

  On the kernel side:

   - fix a race
   - fix a bug in the handling of the perf ring-buffer data page

  On the tooling side:

   - fix the handling of certain corrupted perf.data files
   - fix a bug in 'perf probe'
   - fix a bug in 'perf record + perf sched'
   - fix a bug in 'make install'
   - fix a bug in libaudit feature-detection on certain distros"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf session: Fix infinite loop on invalid perf.data file
  perf tools: Fix installation of libexec components
  perf probe: Fix to find line information for probe list
  perf tools: Fix libaudit test
  perf stat: Set child_pid after perf_evlist__prepare_workload()
  perf tools: Add default handler for mmap2 events
  perf/x86: Clean up cap_user_time* setting
  perf: Fix perf_pmu_migrate_context
2013-10-08 09:23:12 -07:00
Alexei Starovoitov d45ed4a4e3 net: fix unsafe set_memory_rw from softirq
on x86 system with net.core.bpf_jit_enable = 1

sudo tcpdump -i eth1 'tcp port 22'

causes the warning:
[   56.766097]  Possible unsafe locking scenario:
[   56.766097]
[   56.780146]        CPU0
[   56.786807]        ----
[   56.793188]   lock(&(&vb->lock)->rlock);
[   56.799593]   <Interrupt>
[   56.805889]     lock(&(&vb->lock)->rlock);
[   56.812266]
[   56.812266]  *** DEADLOCK ***
[   56.812266]
[   56.830670] 1 lock held by ksoftirqd/1/13:
[   56.836838]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
[   56.849757]
[   56.849757] stack backtrace:
[   56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
[   56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   56.882004]  ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
[   56.895630]  ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
[   56.909313]  ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
[   56.923006] Call Trace:
[   56.929532]  [<ffffffff8175a145>] dump_stack+0x55/0x76
[   56.936067]  [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
[   56.942445]  [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
[   56.948932]  [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
[   56.955470]  [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
[   56.961945]  [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
[   56.968474]  [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
[   56.975140]  [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
[   56.981942]  [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
[   56.988745]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[   56.995619]  [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
[   57.002493]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[   57.009447]  [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
[   57.016477]  [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
[   57.023607]  [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
[   57.030818]  [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
[   57.037896]  [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
[   57.044789]  [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
[   57.051720]  [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
[   57.058727]  [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
[   57.065577]  [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
[   57.072338]  [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
[   57.078962]  [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
[   57.085373]  [<ffffffff81058245>] run_ksoftirqd+0x35/0x70

cannot reuse jited filter memory, since it's readonly,
so use original bpf insns memory to hold work_struct

defer kfree of sk_filter until jit completed freeing

tested on x86_64 and i386

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-07 15:16:45 -04:00
Oliver Neukum 16c0c4e165 crypto: sha256_ssse3 - also test for BMI2
The AVX2 implementation also uses BMI2 instructions,
but doesn't test for their availability. The assumption
that AVX2 and BMI2 always go together is false. Some
Haswells have AVX2 but not BMI2.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-10-07 14:17:10 +08:00
Ingo Molnar ced3c42c9f x86/iommu: Clean up the CONFIG_GART_IOMMU config option a bit
Improve the explanation of this config option.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/n/tip-gUMmysvsbl3mccbyf6olmxqg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-06 15:58:41 +02:00
Ville Syrjälä 8412da7577 x86/reboot: Add reboot quirk for Dell Latitude E5410
Dell Latitude E5410 needs reboot=pci to actually reboot.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://lkml.kernel.org/r/1380888964-14517-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-06 11:49:07 +02:00
Andi Kleen 38901f1c1c x86/iommu: Don't make AMD_GART depend on EXPERT and default y
The AMD_GART driver was made EXPERT/EMBEDDED a long time
ago to avoid unbootable 64bit systems with 32bit only devices.

This was before swiotlb was there, which does the job
of this fallback today. SWIOTLB is always on, so systems
should always boot.

The drawback is that every system has to compile that
driver in (it cannot be a module).

Also:
 - Newer AMD CPUs (the APUs) don't seem to have AMD_GART support
   at all anymore.

 - Newer AMD platforms have a much better real IOMMU

 - The AMD GART driver was never very good (lots of overhead,
   e.g. in flushing due to some workarounds) and it's doubtful it's
   really better than SWIOTLB.

 - On older K8 systems it didn't even work with all chipsets.

 - The 32bit device bounce buffer case should be rare/
   non performance critical these days anyways.

 - On non AMD systems it is not needed at all.

So drop the EXPERT dependency on AMD_GART and remove the
default y. The driver can be still compiled in, just
it's an explicit decision now, and people who don't want
it can unselect it.

I also clarified the description a bit.

This allows to save ~8K text on most modern x86-64 systems.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1380922676-23007-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-06 11:47:04 +02:00
Linus Torvalds 95167aad67 PCI update for v3.12:
MMCONFIG
       Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJST3wYAAoJEFmIoMA60/r8GlIP/An+r2j9679vAH7Iuk0wkIRi
 0NLv0j7RdcdwbM7gXfeMc+EGzrLwDhbsnP60S5iKT2CfJ1n1buDe/na4fO9/MDaH
 prMDhDzRs9MrdLIhv0l/ovyoFiITWEvVoLngxDU9WjUIHWsFGQ1asIP+bpk8+t/m
 zZ5xHm6eczoaBtcPbeQUI/vXqKuC78koZMs979Nk7y8Py/br5xX9uqzyWMFWkx5u
 C1DvREEDFPHtde9e1jKr1G+Pzn1M0W8CItEKCfUNev70ob244m+0HsQN698Bniqp
 p+rzBBM3XXGvuOuemHrz8x0Fomkn8G6vL1D3PXL8hWvuwiKrO95HGKCCbn2VCT+g
 OgUZ9QkOCEJpCD3A7ET3sV1iFdoeMNXk/iwQpPvIrkr50m0C6LLOmO7yaTUcLhJu
 Aa3x4r+KG5rzKfNT8vPCEC2TY1IL2O6bZvmRubzSocxc8SUfQ+92vmnSd9G169hN
 0X5ljIlAFFhfNOvf6dRljD2zCji2WeteRyBO3k3Flk+L/EnHD1Vi5jYrLoVaFJIH
 NB4lx/szysElE6IzDOwtTlTR6v0fWdygHnnUlr6yELZyifNaYk1Z2lNASxi5JzBG
 LlcxavTgNs1a90aEuzVpc94XTdMZDV0bNb4H36I0aKXU/tlSt2fbKAqMoLvDISJL
 hQBRNKAOF2ep2Htg96Zs
 =LqLn
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fix from Bjorn Helgaas:
 "We merged what was intended to be an MMCONFIG cleanup, but in fact,
  for systems without _CBA (which is almost everything), it broke
  extended config space for domain 0 and it broke all config space for
  other domains.

  This reverts the change"

* tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
2013-10-04 20:48:20 -07:00
Bjorn Helgaas 67d470e0e1 Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
This reverts commit 07f9b61c39.

07f9b61c was intended to be a cleanup that didn't change anything, but in
fact, for systems without _CBA (which is almost everything), it broke
extended config space for domain 0 and all config space for other domains.

Reference: http://lkml.kernel.org/r/20131004011806.GE20450@dangermouse.emea.sgi.com
Reported-by: Hedi Berriche <hedi@sgi.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-04 16:15:29 -06:00
Leif Lindholm 722da9d20e x86/efi: Fix config_table_type array termination
Incorrect use of 0 in terminating entry of arch_tables[] causes the
following sparse warning,

  arch/x86/platform/efi/efi.c:74:27: sparse: Using plain integer as NULL pointer

Replace with NULL.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
[ Included sparse warning in commit message. ]
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-10-04 20:04:45 +01:00
Thomas Petazzoni 0dbc6078c0 x86, build, pci: Fix PCI_MSI build on !SMP
Commit ebd97be635 ('PCI: remove ARCH_SUPPORTS_MSI kconfig option')
removed the ARCH_SUPPORTS_MSI option which architectures could select
to indicate that they support MSI. Now, all architectures are supposed
to build fine when MSI support is enabled: instead of having the
architecture tell *when* MSI support can be used, it's up to the
architecture code to ensure that MSI support can be enabled.

On x86, commit ebd97be635 removed the following line:

  select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC)

Which meant that MSI support was only available when the local APIC
and I/O APIC were enabled. While this is always true on SMP or x86-64,
it is not necessarily the case on i386 !SMP.

The below patch makes sure that the local APIC and I/O APIC support is
always enabled when MSI support is enabled. To do so, it:

 * Ensures the X86_UP_APIC option is not visible when PCI_MSI is
   enabled. This is the option that allows, on UP machines, to enable
   or not the APIC support. It is already not visible on SMP systems,
   or x86-64 systems, for example. We're simply also making it
   invisible on i386 MSI systems.

 * Ensures that the X86_LOCAL_APIC and X86_IO_APIC options are 'y'
   when PCI_MSI is enabled.

Notice that this change requires a change in drivers/iommu/Kconfig to
avoid a recursive Kconfig dependencey. The AMD_IOMMU option selects
PCI_MSI, but was depending on X86_IO_APIC. This dependency is no
longer needed: as soon as PCI_MSI is selected, the presence of
X86_IO_APIC is guaranteed. Moreover, the AMD_IOMMU already depended on
X86_64, which already guaranteed that X86_IO_APIC was enabled, so this
dependency was anyway redundant.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Link: http://lkml.kernel.org/r/1380794354-9079-1-git-send-email-thomas.petazzoni@free-electrons.com
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-04 10:43:34 -07:00
Linus Torvalds 413df1cb43 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two simplefb fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/simplefb: Mark framebuffer mem-resources as IORESOURCE_BUSY to avoid bootup warning
  x86/simplefb: Fix overflow causing bogus fall-back
2013-10-04 09:03:07 -07:00
Andi Kleen b7af41a1bc perf/x86: Suppress duplicated abort LBR records
Haswell always give an extra LBR record after every TSX abort.
Suppress the extra record.

This only works when the abort is visible in the LBR
If the original abort has already left the 16 LBR entries
the extra entry will will stay.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-7-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:16 +02:00
Andi Kleen a405bad5ad perf/x86: Add Haswell specific transaction flag reporting
In the PEBS handler report the transaction flags using the new
generic transaction flags facility. Most of them come from
the "tsx_tuning" field in PEBSv2, but the abort code is derived
from the RAX register reported in the PEBS record.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:09 +02:00
Ingo Molnar fafd883f67 Merge branch 'perf/urgent' into perf/core
Pick up the latest fixes before applying new patches.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 09:59:13 +02:00
Peter Zijlstra d8b11a0cbd perf/x86: Clean up cap_user_time* setting
Currently the cap_user_time_zero capability has different tests than
cap_user_time; even though they expose the exact same data.

Switch from CONSTANT && NONSTOP to sched_clock_stable to also deal
with multi cabinet machines and drop the tsc_disabled() check.. non of
this will work sanely without tsc anyway.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-nmgn0j0muo1r4c94vlfh23xy@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 09:58:55 +02:00
Mike Travis e379ea82dd x86/UV: Add call to KGDB/KDB from NMI handler
This patch restores the capability to enter KDB (and KGDB) from
the UV NMI handler.  This is needed because the UV system
console is not capable of sending the 'break' signal to the
serial console port.  It is also useful when the kernel is hung
in such a way that it isn't responding to normal external I/O,
so sending 'g' to sysreq-trigger does not work either.

Another benefit of the external NMI command is that all the cpus
receive the NMI signal at roughly the same time so they are more
closely aligned timewise.

It utilizes the newly added kgdb_nmicallin function to gain
entry to KGDB/KDB by the master.  The slaves still enter via the
standard kgdb_nmicallback function.  It also uses the new
'send_ready' pointer to tell KGDB/KDB to signal the slaves when
to proceed into the KGDB slave loop.

It is enabled when the nmi action is set to "kdb" and the kernel
is built with CONFIG_KDB enabled.  Note that if kgdb is
connected that interface will be used instead.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20131002151418.089692683@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-03 18:48:09 +02:00