OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Jiri Slaby	5f01c98859	x86/dumpstack: Fix printk_address for direct addresses Consider a kernel crash in a module, simulated the following way: static int my_init(void) { char map = (void )0x5; *map = 3; return 0; } module_init(my_init); When we turn off FRAME_POINTERs, the very first instruction in that function causes a BUG. The problem is that we print IP in the BUG report using %pB (from printk_address). And %pB decrements the pointer by one to fix printing addresses of functions with tail calls. This was added in commit `71f9e59800` ("x86, dumpstack: Use %pB format specifier for stack trace") to fix the call stack printouts. So instead of correct output: BUG: unable to handle kernel NULL pointer dereference at 0000000000000005 IP: [<ffffffffa01ac000>] my_init+0x0/0x10 [pb173] We get: BUG: unable to handle kernel NULL pointer dereference at 0000000000000005 IP: [<ffffffffa0152000>] 0xffffffffa0151fff To fix that, we use %pS only for stack addresses printouts (via newly added printk_stack_address) and %pB for regs->ip (via printk_address). I.e. we revert to the old behaviour for all except call stacks. And since from all those reliable is 1, we remove that parameter from printk_address. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: joe@perches.com Cc: jirislaby@gmail.com Link: http://lkml.kernel.org/r/1382706418-8435-1-git-send-email-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-12 21:06:06 +01:00
Linus Torvalds	10d0c9705e	DeviceTree updates for 3.13. This is a bit larger pull request than usual for this cycle with lots of clean-up. - Cross arch clean-up and consolidation of early DT scanning code. - Clean-up and removal of arch prom.h headers. Makes arch specific prom.h optional on all but Sparc. - Addition of interrupts-extended property for devices connected to multiple interrupt controllers. - Refactoring of DT interrupt parsing code in preparation for deferred probe of interrupts. - ARM cpu and cpu topology bindings documentation. - Various DT vendor binding documentation updates. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJSgPQ4AAoJEMhvYp4jgsXif28H/1WkrXq5+lCFQZF8nbYdE2h0 R8PsfiJJmAl6/wFgQTsRel+ScMk2hiP08uTyqf2RLnB1v87gCF7MKVaLOdONfUDi huXbcQGWCmZv0tbBIklxJe3+X3FIJch4gnyUvPudD1m8a0R0LxWXH/NhdTSFyB20 PNjhN/IzoN40X1PSAhfB5ndWnoxXBoehV/IVHVDU42vkPVbVTyGAw5qJzHW8CLyN 2oGTOalOO4ffQ7dIkBEQfj0mrgGcODToPdDvUQyyGZjYK2FY2sGrjyquir6SDcNa Q4gwatHTu0ygXpyphjtQf5tc3ZCejJ/F0s3olOAS1ahKGfe01fehtwPRROQnCK8= =GCbY -----END PGP SIGNATURE----- Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "DeviceTree updates for 3.13. This is a bit larger pull request than usual for this cycle with lots of clean-up. - Cross arch clean-up and consolidation of early DT scanning code. - Clean-up and removal of arch prom.h headers. Makes arch specific prom.h optional on all but Sparc. - Addition of interrupts-extended property for devices connected to multiple interrupt controllers. - Refactoring of DT interrupt parsing code in preparation for deferred probe of interrupts. - ARM cpu and cpu topology bindings documentation. - Various DT vendor binding documentation updates" * tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits) powerpc: add missing explicit OF includes for ppc dt/irq: add empty of_irq_count for !OF_IRQ dt: disable self-tests for !OF_IRQ of: irq: Fix interrupt-map entry matching MIPS: Netlogic: replace early_init_devtree() call of: Add Panasonic Corporation vendor prefix of: Add Chunghwa Picture Tubes Ltd. vendor prefix of: Add AU Optronics Corporation vendor prefix of/irq: Fix potential buffer overflow of/irq: Fix bug in interrupt parsing refactor. of: set dma_mask to point to coherent_dma_mask of: add vendor prefix for PHYTEC Messtechnik GmbH DT: sort vendor-prefixes.txt of: Add vendor prefix for Cadence of: Add empty for_each_available_child_of_node() macro definition arm/versatile: Fix versatile irq specifications. of/irq: create interrupts-extended property microblaze/pci: Drop PowerPC-ism from irq parsing of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code. of/irq: Use irq_of_parse_and_map() ...	2013-11-12 16:52:17 +09:00
Linus Torvalds	9b66bfb280	Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV debug changes from Ingo Molnar: "Various SGI UV debuggability improvements, amongst them KDB support, with related core KDB enabling patches changing kernel/debug/kdb/" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86/UV: Add uvtrace support" x86/UV: Add call to KGDB/KDB from NMI handler kdb: Add support for external NMI handler to call KGDB/KDB x86/UV: Check for alloc_cpumask_var() failures properly in uv_nmi_setup() x86/UV: Add uvtrace support x86/UV: Add kdump to UV NMI handler x86/UV: Add summary of cpu activity to UV NMI handler x86/UV: Update UV support for external NMI signals x86/UV: Move NMI support	2013-11-12 12:01:14 +09:00
Linus Torvalds	c2136301e4	Merge branch 'x86-uaccess-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 uaccess changes from Ingo Molnar: "A single change that micro-optimizes __copy__user_inatomic(), used by the futex code" 'x86-uaccess-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic	2013-11-12 11:46:06 +09:00
Linus Torvalds	986189f9ec	Merge branch 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 reboot changes from Ingo Molnar: "Misc changes - the only one with functional impact should be commit `16c21ae5ca` ("reboot: Allow specifying warm/cold reset for CF9 boot type") which extends cold/warm reboot handling to the 0xCF9 reboot method" * 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/reboot: Correct pr_info() log message in the set_bios/pci/kbd_reboot() x86/reboot: Sort reboot DMI quirks by vendor x86/reboot: Remove the duplicate C6100 entry in the reboot quirks list reboot: Allow specifying warm/cold reset for CF9 boot type	2013-11-12 11:33:38 +09:00
Linus Torvalds	2934578e28	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform fixlet from Ingo Molnar: "A single __initdata fix" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/geode: Fix incorrect placement of __initdata tag	2013-11-12 11:23:44 +09:00
Linus Torvalds	2612c49db3	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm fixlet from Ingo Molnar: "One cleanup that documents a particular detail in init_mem_mapping()" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Add 'step_size' comments to init_mem_mapping()	2013-11-12 11:20:52 +09:00
Linus Torvalds	340286cd4e	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS changes from Ingo Molnar: "The biggest change adds support for Intel 'CPER' (UEFI Common Platform Error Record) error logging, which builds upon an enhanced error logging mechanism available on Xeon processors. Full description is here: http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html This change provides a module (and support code) to check for an extended error log and prints extra details about the error on the console" * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ACPI, x86: Fix extended error log driver to depend on CONFIG_X86_LOCAL_APIC dmi: Avoid unaligned memory access in save_mem_devices() Move cper.c from drivers/acpi/apei to drivers/firmware/efi EDAC, GHES: Update ghes error record info ACPI, APEI, CPER: Cleanup CPER memory error output format ACPI, APEI, CPER: Enhance memory reporting capability ACPI, APEI, CPER: Add UEFI 2.4 support for memory error DMI: Parse memory device (type 17) in SMBIOS ACPI, x86: Extended error log driver for x86 platform bitops: Introduce a more generic BITMASK macro ACPI, CPER: Update cper info ACPI, APEI, CPER: Fix status check during error printing	2013-11-12 11:16:44 +09:00
Linus Torvalds	339a4b72c8	Merge branch 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 iommu changes from Ingo Molnar: "Make it easier to turn off the old AMD GART code" * 'x86-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/iommu: Clean up the CONFIG_GART_IOMMU config option a bit x86/iommu: Don't make AMD_GART depend on EXPERT and default y	2013-11-12 11:15:12 +09:00
Linus Torvalds	dba538ff56	Merge branch 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/intel-mid changes from Ingo Molnar: "Update the 'intel mid' (mobile internet device) platform code as Intel is rolling out more SoC designs. This gets rid of most of the 'MRST' platform code in the process, mostly by renaming and shuffling code around into their respective 'intel-mid' platform drivers" * 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, intel-mid: Do not re-introduce usage of obsolete __cpuinit intel_mid: Move platform device setups to their own platform_<device>.* files x86: intel-mid: Add section for sfi device table intel-mid: sfi: Allow struct devs_id.get_platform_data to be NULL intel_mid: Moved SFI related code to sfi.c intel_mid: Added custom handler for ipc devices intel_mid: Added custom device_handler support intel_mid: Refactored sfi_parse_devs() function intel_mid: Renamed mrst to intel_mid pci: intel_mid: Return true/false in function returning bool intel_mid: Renamed mrst to intel_mid mrst: Fixed indentation issues mrst: Fixed printk/pr_* related issues	2013-11-12 11:12:22 +09:00
Linus Torvalds	2dc1733fd4	Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/hyperv changes from Ingo Molnar: "These changes enable Linux guests to boot as 'Modern VM' guest kernels on MS-Hyperv hosts" * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, hyperv: Move a variable to avoid an unused variable warning x86, hyperv: Fix build error due to missing <asm/apic.h> include x86, hyperv: Correctly guard the local APIC calibration code x86, hyperv: Get the local APIC timer frequency from the hypervisor	2013-11-12 11:07:49 +09:00
Linus Torvalds	69019d77c7	Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI changes from Ingo Molnar: "Main changes: - Add support for earlyprintk=efi which uses the EFI framebuffer. Very useful for debugging boot problems. - EFI stub support for large memory maps (more than 128 entries) - EFI ARM support - this was mostly done by generalizing x86 <-> ARM platform differences, such as by moving x86 EFI code into drivers/firmware/efi/ and sharing it with ARM. - Documentation updates - misc fixes" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/efi: Add EFI framebuffer earlyprintk support boot, efi: Remove redundant memset() x86/efi: Fix config_table_type array termination x86 efi: bugfix interrupt disabling sequence x86: EFI stub support for large memory maps efi: resolve warnings found on ARM compile efi: Fix types in EFI calls to match EFI function definitions. efi: Renames in handle_cmdline_files() to complete generalization. efi: Generalize handle_ramdisks() and rename to handle_cmdline_files(). efi: Allow efi_free() to be called with size of 0 efi: use efi_get_memory_map() to get final map for x86 efi: generalize efi_get_memory_map() efi: Rename __get_map() to efi_get_memory_map() efi: Move unicode to ASCII conversion to shared function. efi: Generalize relocate_kernel() for use by other architectures. efi: Move relocate_kernel() to shared file. efi: Enforce minimum alignment of 1 page on allocations. efi: Rename memory allocation/free functions efi: Add system table pointer argument to shared functions. efi: Move common EFI stub code from x86 arch code to common location ...	2013-11-12 10:48:30 +09:00
Linus Torvalds	6df1e7f2e9	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu changes from Ingo Molnar: "The biggest change that stands out is the increase of the CONFIG_NR_CPUS range from 4096 to 8192 - as real hardware out there already went beyond 4k CPUs ... We only allow more than 512 CPUs if offstack cpumasks are enabled. CONFIG_MAXSMP=y remains to be the 'you are nuts!' extreme testcase, which now means a max of 8192 CPUs" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Increase max CPU count to 8192 x86/cpu: Allow higher NR_CPUS values x86/cpu: Always print SMP information in /proc/cpuinfo x86/cpu: Track legacy CPU model data only on 32-bit kernels	2013-11-12 10:46:43 +09:00
Linus Torvalds	d96d8aa261	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Two small cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, msr: Use file_inode(), not f_mapping->host x86: mkpiggy.c: Explicitly close the output file	2013-11-12 10:45:01 +09:00
Linus Torvalds	54e11304e8	Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build changes from Ingo Molnar: "Two small changes" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to 86_defconfig x86, build: move build output statistics away from stderr	2013-11-12 10:43:32 +09:00
Linus Torvalds	014d595c23	Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot changes from Ingo Molnar: "Two changes that prettify and compactify the SMP bootup output from: smpboot: Booting Node 0, Processors #1 #2 #3 OK smpboot: Booting Node 1, Processors #4 #5 #6 #7 OK smpboot: Booting Node 2, Processors #8 #9 #10 #11 OK smpboot: Booting Node 3, Processors #12 #13 #14 #15 OK Brought up 16 CPUs to something like: x86: Booting SMP configuration: .... node #0, CPUs: #1 #2 #3 .... node #1, CPUs: #4 #5 #6 #7 .... node #2, CPUs: #8 #9 #10 #11 .... node #3, CPUs: #12 #13 #14 #15 x86: Booted up 4 nodes, 16 CPUs" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Further compress CPUs bootup message x86: Improve the printout of the SMP bootup CPU table	2013-11-12 10:41:10 +09:00
Linus Torvalds	ae795fe760	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 user access changes from Ingo Molnar: "This tree contains two copy_[from/to]_user() build time checking changes/enhancements from Jan Beulich. The desired outcome is to get better compiler warnings with CONFIG_DEBUG_STRICT_USER_COPY_CHECKS=y, to keep people from introducing bugs such as overflows and information leaks" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Unify copy_to_user() and add size checking to it x86: Unify copy_from_user() size checking	2013-11-12 10:39:08 +09:00
Linus Torvalds	b7ab6e3d21	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic fix from Ingo Molnar: "A single fix to the IO-APIC / local-APIC shutdown sequence" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Disable I/O APIC before shutdown of the local APIC	2013-11-12 10:37:53 +09:00
Linus Torvalds	87093826aa	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer changes from Ingo Molnar: "Main changes in this cycle were: - Updated full dynticks support. - Event stream support for architected (ARM) timers. - ARM clocksource driver updates. - Move arm64 to using the generic sched_clock framework & resulting cleanup in the generic sched_clock code. - Misc fixes and cleanups" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) x86/time: Honor ACPI FADT flag indicating absence of a CMOS RTC clocksource: sun4i: remove IRQF_DISABLED clocksource: sun4i: Report the minimum tick that we can program clocksource: sun4i: Select CLKSRC_MMIO clocksource: Provide timekeeping for efm32 SoCs clocksource: em_sti: convert to clk_prepare/unprepare time: Fix signedness bug in sysfs_get_uname() and its callers timekeeping: Fix some trivial typos in comments alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist clocksource: arch_timer: Do not register arch_sys_counter twice timer stats: Add a 'Collection: active/inactive' line to timer usage statistics sched_clock: Remove sched_clock_func() hook arch_timer: Move to generic sched_clock framework clocksource: tcb_clksrc: Remove IRQF_DISABLED clocksource: tcb_clksrc: Improve driver robustness clocksource: tcb_clksrc: Replace clk_enable/disable with clk_prepare_enable/disable_unprepare clocksource: arm_arch_timer: Use clocksource for suspend timekeeping clocksource: dw_apb_timer_of: Mark a few more functions as __init clocksource: Put nodes passed to CLOCKSOURCE_OF_DECLARE callbacks centrally arm: zynq: Enable arm_global_timer ...	2013-11-12 10:36:00 +09:00
Linus Torvalds	39cf275a1a	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The main changes in this cycle are: - (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik van Riel, Peter Zijlstra et al. Yay! - optimize preemption counter handling: merge the NEED_RESCHED flag into the preempt_count variable, by Peter Zijlstra. - wait.h fixes and code reorganization from Peter Zijlstra - cfs_bandwidth fixes from Ben Segall - SMP load-balancer cleanups from Peter Zijstra - idle balancer improvements from Jason Low - other fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED stop_machine: Fix race between stop_two_cpus() and stop_cpus() sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus sched: Fix asymmetric scheduling for POWER7 sched: Move completion code from core.c to completion.c sched: Move wait code from core.c to wait.c sched: Move wait.c into kernel/sched/ sched/wait: Fix __wait_event_interruptible_lock_irq_timeout() sched: Avoid throttle_cfs_rq() racing with period_timer stopping sched: Guarantee new group-entities always have weight sched: Fix hrtimer_cancel()/rq->lock deadlock sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining sched: Fix race on toggling cfs_bandwidth_used sched: Remove extra put_online_cpus() inside sched_setaffinity() sched/rt: Fix task_tick_rt() comment sched/wait: Fix build breakage sched/wait: Introduce prepare_to_wait_event() sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too sched: Remove get_online_cpus() usage sched: Fix race in migrate_swap_stop() ...	2013-11-12 10:20:12 +09:00
Linus Torvalds	ad5d69899e	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "As a first remark I'd like to note that the way to build perf tooling has been simplified and sped up, in the future it should be enough for you to build perf via: cd tools/perf/ make install (ie without the -j option.) The build system will figure out the number of CPUs and will do a parallel build+install. The various build system inefficiencies and breakages Linus reported against the v3.12 pull request should now be resolved - please (re-)report any remaining annoyances or bugs. Main changes on the perf kernel side: * Performance optimizations: . perf ring-buffer code optimizations, by Peter Zijlstra . perf ring-buffer code optimizations, by Oleg Nesterov . x86 NMI call-stack processing optimizations, by Peter Zijlstra . perf context-switch optimizations, by Peter Zijlstra . perf sampling speedups, by Peter Zijlstra . x86 Intel PEBS processing speedups, by Peter Zijlstra * Enhanced hardware support: . for Intel Ivy Bridge-EP uncore PMUs, by Zheng Yan . for Haswell transactions, by Andi Kleen, Peter Zijlstra * Core perf events code enhancements and fixes by Oleg Nesterov: . for uprobes, if fork() is called with pending ret-probes . for uprobes platform support code * New ABI details by Andi Kleen: . Report x86 Haswell TSX transaction abort cost as weight Main changes on the perf tooling side (some of these tooling changes utilize the above kernel side changes): * 'perf report/top' enhancements: . Convert callchain children list to rbtree, greatly reducing the time taken for callchain processing, from Namhyung Kim. . Add new COMM infrastructure, further improving histogram processing, from Frédéric Weisbecker, one fix from Namhyung Kim. . Add /proc/kcore based live-annotation improvements, including build-id cache support, multi map 'call' instruction navigation fixes, kcore address validation, objdump workarounds. From Adrian Hunter. . Show progress on histogram collapsing, that can take a long time, from Namhyung Kim. . Add --max-stack option to limit callchain stack scan in 'top' and 'report', improving callchain processing when reducing the stack depth is an option, from Waiman Long. . Add new option --ignore-vmlinux for perf top, from Willy Tarreau. * 'perf trace' enhancements: . 'perf trace' now can can use a 'perf probe' dynamic tracepoints to hook into the userspace -> kernel pathname copy so that it can map fds to pathnames without reading /proc/pid/fd/ symlinks. From Arnaldo Carvalho de Melo. . Show VFS path associated with fd in live sessions, using a 'vfs_getname' 'perf probe' created dynamic tracepoint or by looking at /proc/pid/fd, from Arnaldo Carvalho de Melo. . Add 'trace' beautifiers for lots of syscall arguments, from Arnaldo Carvalho de Melo. . Implement more compact 'trace' output by suppressing zeroed args, from Arnaldo Carvalho de Melo. . Show thread COMM by default in 'trace', from Arnaldo Carvalho de Melo. . Add option to show full timestamp in 'trace', from David Ahern. . Add 'record' command in 'trace', to record raw_syscalls:, from David Ahern. . Add summary option to dump syscall statistics in 'trace', from David Ahern. . Improve error messages in 'trace', providing hints about system configuration steps needed for using it, from Ramkumar Ramachandra. . 'perf trace' now emits hints as to why tracing is not possible, helping the user to setup the system to allow tracing in the desired permission granularity, telling if the problem is due to debugfs not being mounted or with not enough permission for !root, /proc/sys/kernel/perf_event_paranoit value, etc. From Arnaldo Carvalho de Melo. 'perf record' enhancements: . Check maximum frequency rate for record/top, emitting better error messages, from Jiri Olsa. . 'perf record' code cleanups, from David Ahern. . Improve write_output error message in 'perf record', from Adrian Hunter. . Allow specifying B/K/M/G unit to the --mmap-pages arguments, from Jiri Olsa. . Fix command line callchain attribute tests to handle the new -g/--call-chain semantics, from Arnaldo Carvalho de Melo. * 'perf kvm' enhancements: . Disable live kvm command if timerfd is not supported, from David Ahern. . Fix detection of non-core features, from David Ahern. * 'perf list' enhancements: . Add usage to 'perf list', from David Ahern. . Show error in 'perf list' if tracepoints not available, from Pekka Enberg. * 'perf probe' enhancements: . Support "$vars" meta argument syntax for local variables, allowing asking for all possible variables at a given probe point to be collected when it hits, from Masami Hiramatsu. * 'perf sched' enhancements: . Address the root cause of that 'perf sched' stack initialization build slowdown, by programmatically setting a big array after moving the global variable back to the stack. Fix from Adrian Hunter. * 'perf script' enhancements: . Set up output options for in-stream attributes, from Adrian Hunter. . Print addr by default for BTS in 'perf script', from Adrian Juntmer * 'perf stat' enhancements: . Improved messages when doing profiling in all or a subset of CPUs using a workload as the session delimitator, as in: 'perf stat --cpu 0,2 sleep 10s' from Arnaldo Carvalho de Melo. . Add units to nanosec-based counters in 'perf stat', from David Ahern. . Remove bogus info when using 'perf stat' -e cycles/instructions, from Ramkumar Ramachandra. * 'perf lock' enhancements: . 'perf lock' fixes and cleanups, from Davidlohr Bueso. * 'perf test' enhancements: . Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing and 'perf test', from Adrian Hunter. . Clarify the "sample parsing" test entry, from Arnaldo Carvalho de Melo. . Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test, from Arnaldo Carvalho de Melo. . Memory leak fixes in 'perf test', from Felipe Pena. * 'perf bench' enhancements: . Change the procps visible command-name of invididual benchmark tests plus cleanups, from Ingo Molnar. * Generic perf tooling infrastructure/plumbing changes: . Separating data file properties from session, code reorganization from Jiri Olsa. . Fix version when building out of tree, as when using one of these: $ make help \| grep perf perf-tar-src-pkg - Build perf-3.12.0.tar source tarball perf-targz-src-pkg - Build perf-3.12.0.tar.gz source tarball perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball perf-tarxz-src-pkg - Build perf-3.12.0.tar.xz source tarball $ from David Ahern. . Enhance option parse error message, showing just the help lines of the options affected, from Namhyung Kim. . libtraceevent updates from upstream trace-cmd repo, from Steven Rostedt. . Always use perf_evsel__set_sample_bit to set sample_type, from Adrian Hunter. . Memory and mmap leak fixes from Chenggang Qin. . Assorted build fixes for from David Ahern and Jiri Olsa. . Speed up and prettify the build system, from Ingo Molnar. . Implement addr2line directly using libbfd, from Roberto Vitillo. . Separate the GTK support in a separate libperf-gtk.so DSO, that is only loaded when --gtk is specified, from Namhyung Kim. . perf bash completion fixes and improvements from Ramkumar Ramachandra. . Support for Openembedded/Yocto -dbg packages, from Ricardo Ribalda Delgado. And lots and lots of other fixes and code reorganizations that did not make it into the list, see the shortlog, diffstat and the Git log for details!" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits) uprobes: Fix the memory out of bound overwrite in copy_insn() uprobes: Fix the wrong usage of current->utask in uprobe_copy_process() perf tools: Remove unneeded include perf record: Remove post_processing_offset variable perf record: Remove advance_output function perf record: Refactor feature handling into a separate function perf trace: Don't relookup fields by name in each sample perf tools: Fix version when building out of tree perf evsel: Ditch evsel->handler.data field uprobes: Export write_opcode() as uprobe_write_opcode() uprobes: Introduce arch_uprobe->ixol uprobes: Kill module_init() and module_exit() uprobes: Move function declarations out of arch perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes perf: Factor out strncpy() in perf_event_mmap_event() tools/perf: Add required memory barriers perf: Fix arch_perf_out_copy_user default perf: Update a stale comment perf: Optimize perf_output_begin() -- address calculation ...	2013-11-12 10:06:34 +09:00
Linus Torvalds	1006fae359	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull IRQ changes from Ingo Molnar: "The biggest change this cycle are the softirq/hardirq stack interaction and nesting fixes, cleanups and reorganizations from Frederic. This is the longer followup story to the softirq nesting fix that is already upstream (commit ded797547548: "irq: Force hardirq exit's softirq processing on its own stack")" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: bcm2835: Convert to use IRQCHIP_DECLARE macro powerpc: Tell about irq stack coverage x86: Tell about irq stack coverage irq: Optimize softirq stack selection in irq exit irq: Justify the various softirq stack choices irq: Improve a bit softirq debugging irq: Optimize call to softirq on hardirq exit irq: Consolidate do_softirq() arch overriden implementations x86/irq: Correct comment about i8259 initialization	2013-11-12 10:02:59 +09:00
Ingo Molnar	b5dfcb09de	Revert "x86/UV: Add uvtrace support" This reverts commit `8eba18428a`. uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor uv_trace_func. That's not how we do instrumentation code in the kernel: we add tracepoints, printk()s, etc. so that everyone not just those with magic kernel modules can debug a system. So remove this unused (and misguied) piece of code. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Mike Travis <travis@sgi.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org	2013-11-11 19:53:42 +01:00
H. Peter Anvin	a4f61dec55	x86, trace: Change user\|kernel_page_fault to page_fault_user\|kernel Tracepoints are named hierachially, and it makes more sense to keep a general flow of information level from general to specific from left to right, i.e. x86_exceptions.page_fault_user\|kernel rather than x86_exceptions.user\|kernel_page_fault Suggested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/20131111082955.GB12405@gmail.com	2013-11-11 08:15:40 -08:00
Al Viro	ce39596048	constify copy_siginfo_to_user{,32}() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:29 -05:00
Al Viro	9b56d54380	dump_skip(): dump_seek() replacement taking coredump_params Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:26 -05:00
Al Viro	43a5d548eb	aout: switch to dump_emit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:25 -05:00
Al Viro	aa3e7eaf0a	switch elf_core_write_extra_data() to dump_emit() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:23 -05:00
Al Viro	506f21c556	switch elf_core_write_extra_phdrs() to dump_emit() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:23 -05:00
Al Viro	7d2f551f6d	restore 32bit aout coredump just getting rid of bitrot Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:22 -05:00
Seiji Aguchi	d34603b07c	x86, trace: Add page fault tracepoints This patch introduces page fault tracepoints to x86 architecture by switching IDT. Two events, for user and kernel spaces, are introduced at the beginning of page fault handler for tracing. - User space event There is a request of page fault event for user space as below. https://lkml.kernel.org/r/1368079520-11015-2-git-send-email-fdeslaur+()+gmail+!+com https://lkml.kernel.org/r/1368079520-11015-1-git-send-email-fdeslaur+()+gmail+!+com - Kernel space event: When we measure an overhead in kernel space for investigating performance issues, we can check if it comes from the page fault events. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/52716E67.6090705@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-08 14:15:49 -08:00
Seiji Aguchi	ac7956e269	x86, trace: Delete __trace_alloc_intr_gate() Currently irq vector handlers for tracing are registered in both set_intr_gate() and __trace_alloc_intr_gate() in alloc_intr_gate(). But, we don't need to do that twice. So, let's delete __trace_alloc_intr_gate(). Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/52716E1B.7090205@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-08 14:15:47 -08:00
Seiji Aguchi	25c74b10ba	x86, trace: Register exception handler to trace IDT This patch registers exception handlers for tracing to a trace IDT. To implemented it in set_intr_gate(), this patch does followings. - Register the exception handlers to the trace IDT by prepending "trace_" to the handler's names. - Also, newly introduce trace_page_fault() to add tracepoints in a subsequent patch. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/52716DEC.5050204@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-08 14:15:45 -08:00
Seiji Aguchi	959c071f09	x86, trace: Remove __alloc_intr_gate() Prepare to move set_intr_gate() into a macro by removing __alloc_intr_gate(). The purpose is to avoid failing a kernel build after applying a subsequent patch which changes set_intr_gate() into a macro. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/52716DB8.1080702@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-08 14:15:44 -08:00
Konrad Rzeszutek Wilk	e1d8f62ad4	Merge remote-tracking branch 'stefano/swiotlb-xen-9.1' into stable/for-linus-3.13 * stefano/swiotlb-xen-9.1: swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary grant-table: call set_phys_to_machine after mapping grant refs arm,arm64: do not always merge biovec if we are running on Xen swiotlb: print a warning when the swiotlb is full swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device swiotlb-xen: use xen_alloc/free_coherent_pages xen: introduce xen_alloc/free_coherent_pages arm64/xen: get_dma_ops: return xen_dma_ops if we are running as xen_initial_domain arm/xen: get_dma_ops: return xen_dma_ops if we are running as xen_initial_domain swiotlb-xen: introduce xen_swiotlb_set_dma_mask xen/arm,arm64: enable SWIOTLB_XEN xen: make xen_create_contiguous_region return the dma address xen/x86: allow __set_phys_to_machine for autotranslate guests arm/xen,arm64/xen: introduce p2m arm64: define DMA_ERROR_CODE arm: make SWIOTLB available Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Conflicts: arch/arm/include/asm/dma-mapping.h drivers/xen/swiotlb-xen.c [Conflicts arose b/c "arm: make SWIOTLB available" v8 was in Stefano's branch, while I had v9 + Ack from Russel. I also fixed up white-space issues]	2013-11-08 16:10:48 -05:00
Konrad Rzeszutek Wilk	bad97817de	Linux 3.12-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSWyGgAAoJEHm+PkMAQRiGA8MH/35upHXImoRCsI5uC1qvHJtI QvQAhDFxoEXbFUKeaYgTfcM8q9FgnqnfjhLf8eYa4Q7tDZeqLXOE8bkI807mSZMl yECr3jcwlV+zyhV2MP/HdwTjzy25bwxLM3Zy43S7QROrYoMHZYznil/QPfyMATCJ XLPuXZC1FtuUen89n4BoDIuL8QaVrIR/zLqFklAQcdTcGpLHSOwFtH8gb2WaRLhv +4IikFRFgTNZiMR5tP0GPc6UH6TVTvRb4QKSqqa7J8OmfAIvOzAUdhqWSPOIwWwt Z/+JFxFDczAcNmpv4gE6jkgc2vR8CVeHsvh0j61RDSFObBWspwk337CSyUZxYSA= =w4VQ -----END PGP SIGNATURE----- Merge tag 'v3.12-rc5' into stable/for-linus-3.13 Linux 3.12-rc5 Because the Stefano branch (for SWIOTLB ARM changes) is based on that. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> * tag 'v3.12-rc5': (550 commits) Linux 3.12-rc5 watchdog: sunxi: Fix section mismatch watchdog: kempld_wdt: Fix bit mask definition watchdog: ts72xx_wdt: locking bug in ioctl ARM: exynos: dts: Update 5250 arch timer node with clock frequency parisc: let probe_kernel_read() capture access to page zero parisc: optimize variable initialization in do_page_fault parisc: fix interruption handler to respect pagefault_disable() parisc: mark parisc_terminate() noreturn and cold. parisc: remove unused syscall_ipi() function. parisc: kill SMP single function call interrupt parisc: Export flush_cache_page() (needed by lustre) vfs: allow O_PATH file descriptors for fstatfs() ext4: fix memory leak in xattr ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc" ALSA: hda - Sony VAIO Pro 13 (haswell) now has a working headset jack ALSA: hda - Add a headset mic model for ALC269 and friends ALSA: hda - Fix microphone for Sony VAIO Pro 13 (Haswell model) compiler/gcc4: Add quirk for 'asm goto' miscompilation bug Revert "i915: Update VGA arbiter support for newer devices" ...	2013-11-08 15:28:05 -05:00
Stefano Stabellini	92c0fd17c0	pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-11-08 15:21:44 -05:00
Paul Gortmaker	3b284bde70	xen: delete new instances of added __cpuinit commit `6efa20e49b` ("xen: Support 64-bit PV guest receiving NMIs") and commit `cd9151e26d` ( "xen/balloon: set a mapping for ballooned out pages") added new instances of __cpuinit usage. We removed this a couple versions ago; we now want to remove the compat no-op stubs. Introducing new users is not what we want to see at this point in time, as it will break once the stubs are gone. Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-11-08 15:13:16 -05:00
Ben Widawsky	9459d25237	drm/i915/bdw: support GMS and GGMS changes All the BARs have the ability to grow. v2: Pulled out the simulator workaround to a separate patch. Rebased. v3: Rebase onto latest vlv patches from Jesse. v4: Rebased on top of the early stolen quirk patch from Jesse. v5: Use the new macro names. s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS It's Jesse's fault for not following the convention I originally set. Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:39 +01:00
Andrey Vagin	98bbc06aab	net: x86: bpf: don't forget to free sk_filter (v2) sk_filter isn't freed if bpf_func is equal to sk_run_filter. This memory leak was introduced by v3.12-rc3-224-gd45ed4a4 "net: fix unsafe set_memory_rw from softirq". Before this patch sk_filter was freed in sk_filter_release_rcu, now it should be freed in bpf_jit_free. Here is output of kmemleak: unreferenced object 0xffff8800b774eab0 (size 128): comm "systemd", pid 1, jiffies 4294669014 (age 124.062s) hex dump (first 32 bytes): 00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff ........ c...... 60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff `.U.....0.U..... backtrace: [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811845af>] __kmalloc+0xef/0x260 [<ffffffff81534028>] sock_kmalloc+0x38/0x60 [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190 [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0 [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0 [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff v2: add extra { } after else Fixes: `d45ed4a4e3` ("net: fix unsafe set_memory_rw from softirq") Acked-by: Daniel Borkmann <dborkman@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-07 19:06:52 -05:00
Bjorn Helgaas	eaaeb1cb33	Merge branch 'pci/misc' into next * pci/misc: PCI: Enable upstream bridges even for VFs on virtual buses PCI: Add pci_upstream_bridge() PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()	2013-11-07 15:02:04 -07:00
Paul Gortmaker	aeeca40426	x86, intel-mid: Do not re-introduce usage of obsolete __cpuinit The commit `712b6aa873` [Nov7 linux-next via tip/auto-latest] ("intel_mid: Renamed mrst to intel_mid") adds a __cpuinit. We removed this a couple versions ago; we now want to remove the compat no-op stubs. Introducing new users is not what we want to see at this point in time, as it will break once the stubs are gone. Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: David Cohen <david.a.cohen@linux.intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1383849290-11250-1-git-send-email-paul.gortmaker@windriver.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-07 10:44:03 -08:00
Rafael J. Wysocki	0faf996f4d	Merge branch 'acpica' * acpica: (35 commits) ACPICA: Add __init for ACPICA initializers/finalizers. ACPICA: Cleanup asmlinkage for ACPICA APIs. ACPICA: Update acpidump related header file changes. ACPICA: Update compilation environment settings. ACPICA: Fix cached object deletion code. ACPICA: Remove dead AOPOBJ_INVALID check. ACPICA: Cleanup useless memset invocations. ACPICA: Fix an ACPI_ALLOCATE_ZEROED() reversal. ACPICA: Fix wrong object length returned by acpi_ut_get_simple_object_size(). ACPICA: Add new statistics interface. ACPICA: Update DMAR table definitions. ACPICA: Update RSDP table definitions. ACPICA: Update namespace dump code. ACPICA: Update check for setting the ANOBJ_IS_EXTERNAL flag. ACPICA: Update default space handlers. ACPICA: Update version to 20130927. ACPICA: Update aclinux.h for new OSL override mechanism. ACPICA: Add support to allow host OS to redefine individual OSL prototypes. ACPICA: Simplify configuration of global ACPI_REDUCED_HARDWARE macro. ACPICA: Fix indentation issues for macro invocations. ...	2013-11-07 19:21:11 +01:00
Rob Herring	b5480950c6	Merge remote-tracking branch 'grant/devicetree/next' into for-next	2013-11-07 10:34:46 -06:00
Borislav Petkov	1b2ca42267	kvm, cpuid: Fix sparse warning We need to copy padding to kernel space first before looking at it. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-07 12:27:46 +02:00
Fenghua Yu	522e664644	x86/apic: Disable I/O APIC before shutdown of the local APIC In reboot and crash path, when we shut down the local APIC, the I/O APIC is still active. This may cause issues because external interrupts can still come in and disturb the local APIC during shutdown process. To quiet external interrupts, disable I/O APIC before shutdown local APIC. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@intel.com Cc: <stable@kernel.org> [ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-07 10:12:37 +01:00
Konrad Rzeszutek Wilk	0e4ccb1505	PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq() Certain platforms do not allow writes in the MSI-X BARs to setup or tear down vector values. To combat against the generic code trying to write to that and either silently being ignored or crashing due to the pagetables being marked R/O this patch introduces a platform override. Note that we keep two separate, non-weak, functions default_mask_msi_irqs() and default_mask_msix_irqs() for the behavior of the arch_mask_msi_irqs() and arch_mask_msix_irqs(), as the default behavior is needed by x86 PCI code. For Xen, which does not allow the guest to write to MSI-X tables - as the hypervisor is solely responsible for setting the vector values - we implement two nops. This fixes a Xen guest crash when passing a PCI device with MSI-X to the guest. See the bugzilla for more details. [bhelgaas: add bugzilla info] Reference: https://bugzilla.kernel.org/show_bug.cgi?id=64581 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>	2013-11-06 16:32:19 -07:00
Michael Opdenacker	9d71cee667	x86/xen: remove deprecated IRQF_DISABLED This patch proposes to remove the IRQF_DISABLED flag from x86/xen code. It's a NOOP since 2.6.35 and it will be removed one day. Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-11-06 15:31:01 -05:00
Oleg Nesterov	8a8de66c4f	uprobes: Introduce arch_uprobe->ixol Currently xol_get_insn_slot() assumes that we should simply copy arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn) just the copy of the original insn. This is not true for arm which needs to create another insn to execute it out-of-line. So this patch simply adds the new member, ->ixol into the union. This doesn't make any difference for x86 and powerpc, but arm can divorce insn/ixol and initialize the correct xol insn in arch_uprobe_analyze_insn(). Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2013-11-06 20:00:05 +01:00
David A. Long	3820b4d278	uprobes: Move function declarations out of arch Move the function declarations from the arch headers to the common header, since only the function bodies are architecture-specific. These changes are from Vincent Rabin's uprobes patch. [ oleg: update arch/powerpc/include/asm/uprobes.h ] Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David A. Long <dave.long@linaro.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2013-11-06 19:59:37 +01:00
H. Peter Anvin	4c08edd305	x86, hyperv: Move a variable to avoid an unused variable warning The variable hv_lapic_frequency causes an unused variable warning if CONFIG_X86_LOCAL_APIC is disabled. Since the variable is only used inside a small if statement, move the declaration of that variable into the if statement itself. Cc: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-11-06 10:02:05 -08:00
John Stultz	1ca7d67cf5	seqcount: Add lockdep functionality to seqcount/seqlock structures Currently seqlocks and seqcounts don't support lockdep. After running across a seqcount related deadlock in the timekeeping code, I used a less-refined and more focused variant of this patch to narrow down the cause of the issue. This is a first-pass attempt to properly enable lockdep functionality on seqlocks and seqcounts. Since seqcounts are used in the vdso gettimeofday code, I've provided non-lockdep accessors for those needs. I've also handled one case where there were nested seqlock writers and there may be more edge cases. Comments and feedback would be appreciated! Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:40:26 +01:00
Yan, Zheng	f891d8cfb8	perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX device. For PCI bus without UBOX device, we find the next bus that has UBOX device and use its 'bus to socket' mapping. Besides the counter/control registers in IRP boxes are not properly aligned. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:31 +01:00
Yan, Zheng	d1e8f4a836	perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes The encoding for filter registers of IvyBridge-EP uncore QPI boxes is completely the same as SandyBridge-EP. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:29 +01:00
Peter Zijlstra	0a196848ca	perf: Fix arch_perf_out_copy_user default The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other copy_{to,from} functions return bytes not copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: will.deacon@arm.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 12:34:25 +01:00
Josh Triplett	a890b6fefd	kvm: Delete prototype for non-existent function kvm_check_iopl The prototype for kvm_check_iopl appeared in commit `f850e2e603` ("KVM: x86 emulator: Check IOPL level during io instruction emulation"), but the function never actually existed. Remove the prototype. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 11:07:09 +02:00
Josh Triplett	35a5121b58	kvm: Delete prototype for non-existent function complete_pio complete_pio ceased to exist in commit `7972995b0c` ("KVM: x86 emulator: Move string pio emulation into emulator.c"), but the prototype remained. Remove its prototype. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 11:06:32 +02:00
Marcelo Tosatti	8b414521bc	hung_task: add method to reset detector In certain occasions it is possible for a hung task detector positive to be false: continuation from a paused VM, for example. Add a method to reset detection, similar as is done with other kernel watchdogs. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:49:02 +02:00
Marcelo Tosatti	d63285e94a	pvclock: detect watchdog reset at pvclock read Implement reset of kernel watchdogs at pvclock read time. This avoids adding special code to every watchdog. This is possible for watchdogs which measure time based on sched_clock() or ktime_get() variants. Suggested by Don Zickus. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:48:43 +02:00
Michael S. Tsirkin	01b71917b5	kvm: optimize out smp_mb after srcu_read_unlock I noticed that srcu_read_lock/unlock both have a memory barrier, so just by moving srcu_read_unlock earlier we can get rid of one call to smp_mb() using smp_mb__after_srcu_read_unlock instead. Unsurprisingly, the gain is small but measureable using the unit test microbenchmark: before vmcall in the ballpark of 1410 cycles after vmcall in the ballpark of 1360 cycles Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-06 09:32:31 +02:00
Josh Boyer	b53b5eda81	x86/cpu: Increase max CPU count to 8192 The MAXSMP option is intended to enable silly large numbers of CPUs for testing purposes. The current value of 4096 isn't very silly any longer as there are actual SGI machines that approach 6096 CPUs when taking HT into account. Increase the value to a nice round 8192 to account for this and allow for short term future increases. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: prarit@redhat.com Cc: Russ Anderson <rja@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131105143816.GK9944@hansolo.jdub.homelinux.org [ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:16:34 +01:00
Josh Boyer	bb61ccc7db	x86/cpu: Allow higher NR_CPUS values The current range for SMP configs is 2 - 512 CPUs, or a full 4096 in the case of MAXSMP. There are machines that have 1024 CPUs in them today and configuring a kernel for that means you are forced to set MAXSMP. This adds additional unnecessary overhead. While that overhead might be considered tiny for large machines, it isn't necessarily so if you are building a kernel that runs across a wide variety of machines. To cover the range of more common machines today, we allow NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: prarit@redhat.com Cc: Russ Anderson <rja@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131105143728.GJ9944@hansolo.jdub.homelinux.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:16:28 +01:00
HATAYAMA Daisuke	a477c8594b	x86/cpu: Always print SMP information in /proc/cpuinfo Currently show_cpuinfo_core() displays cpu core information only if the number of threads per a whole cores is 2 or larger. However, this condition doesn't care about the number of sockets. For example, this condition doesn't hold on systems with two logical cpus consisting of two sockets and a single core on each socket - yet the topology information would be interesting to see in that case as well. I don't know whether or not there are processors in real world by which such configurations are possible, but at least on vitual machine environments, such configuration can occur, typically when no explicit SMP information is provided in advance. For example, on qemu/KVM, SMP information is specified via -smp command-line option, more specifically, its syntax is: -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus] If this is not specified, qemu tells configuration with n-sockets, 1-core and 1-thread to the guest machine, on which guest, MP information is not displayed in /proc/cpuinfo. I saw this situation on VMWare guest environment, too. To fix this issue, this patch simply removes the condition because this information is useful even if there's only 1 thread. Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/5277D644.4090707@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 08:13:56 +01:00
Ingo Molnar	c90423d1de	Merge branch 'sched/core' into core/locking, to prepare the kernel/locking/ file move Conflicts: kernel/Makefile There are conflicts in kernel/Makefile due to file moving in the scheduler tree - resolve them. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 07:50:37 +01:00
Ingo Molnar	164777530b	Linux 3.12 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8= =57MC -----END PGP SIGNATURE----- Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 06:50:21 +01:00
Ingo Molnar	97c53b402f	Linux 3.12 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8= =57MC -----END PGP SIGNATURE----- Merge tag 'v3.12' into core/locking to pick up mutex upates Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-06 06:39:45 +01:00
Kevin Hao	ab4ead02ec	ftrace/x86: skip over the breakpoint for ftrace caller In commit `8a4d0a687a` "ftrace: Use breakpoint method to update ftrace caller", we choose to use breakpoint method to update the ftrace caller. But we also need to skip over the breakpoint in function ftrace_int3_handler() for them. Otherwise weird things would happen. Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2013-11-05 16:01:47 -05:00
Gleb Natapov	a9d4e4393b	KVM: x86: trace cpuid emulation when called from emulator Currently cpuid emulation is traced only when executed by intercept. Move trace point so that emulator invocation is traced too. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:40 +02:00
Gleb Natapov	6d4d85ec56	KVM: emulator: cleanup decode_register_operand() a bit Make code shorter. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:30 +02:00
Gleb Natapov	aa9ac1a632	KVM: emulator: check rex prefix inside decode_register() All decode_register() callers check if instruction has rex prefix to properly decode one byte operand. It make sense to move the check inside. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-05 09:11:18 +02:00
Chen Gang	7f71be4c9f	x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to 86_defconfig The defconfig kernel can not run under neither fedora16 x86_64 laptop nor fedora17 x86_64 pc. After enable DEVTMPFS* in x86_64_defconfig, it will be OK. DEVTMPFS* is only related with software, so for i386_defconfig may also need them (at least, it has no negative effect for defconfig). Signed-off-by: Chen Gang <gang.chen@asianux.com> Link: http://lkml.kernel.org/r/52784DFF.8040004@asianux.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2013-11-04 20:01:55 -08:00
David S. Miller	394efd19d5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/netconsole.c net/bridge/br_private.h Three mostly trivial conflicts. The net/bridge/br_private.h conflict was a function signature (argument addition) change overlapping with the extern removals from Joe Perches. In drivers/net/netconsole.c we had one change adjusting a printk message whilst another changed "printk(KERN_INFO" into "pr_info(". Lastly, the emulex change was a new inline function addition overlapping with Joe Perches's extern removals. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-04 13:48:30 -05:00
Gleb Natapov	95f328d3ad	Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue Conflicts: arch/powerpc/include/asm/processor.h	2013-11-04 10:20:57 +02:00
Ingo Molnar	2a3ede8cb2	Merge branch 'perf/urgent' into perf/core to fix conflicts Conflicts: tools/perf/bench/numa.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-04 07:49:35 +01:00
Paolo Bonzini	daf727225b	KVM: x86: fix emulation of "movzbl %bpl, %eax" When I was looking at RHEL5.9's failure to start with unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a slightly older tree than kvm.git. I now debugged the remaining failure, which was introduced by commit `660696d1` (KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructions, 2013-04-24) introduced a similar mis-emulation to the one in commit `8acb4207` (KVM: fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30). The incorrect decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand is sil/dil/bpl/spl. Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression prolog, just a handful of instructions before finally giving control to the decompressed vmlinux and getting out of the invalid guest state. Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix must be applied to OpMem8. Reported-by: Michele Baldessari <michele@redhat.com> Cc: stable@vger.kernel.org Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-11-03 16:50:37 +02:00
Linus Torvalds	9581b7d268	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two fixes: - Fix 'NMI handler took too long to run' false positives [ Genuine NMI overhead speedups will come for v3.13, this commit only fixes a measurement bug ] - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data corruption on ppc64" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix NMI measurements perf: Fix perf ring buffer memory ordering	2013-11-01 12:54:51 -07:00
Ingo Molnar	fb10d5b7ef	Merge branch 'linus' into sched/core Resolve cherry-picking conflicts: Conflicts: mm/huge_memory.c mm/memory.c mm/mprotect.c See this upstream merge commit for more details: `52469b4fcd` Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-11-01 08:24:41 +01:00
Paolo Bonzini	98f73630f9	KVM: x86: emulate SAHF instruction Yet another instruction that we fail to emulate, this time found in Windows 2008R2 32-bit. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 22:14:10 +01:00
Bjorn Helgaas	33de1b8bf6	Merge branch 'pci/misc' into next * pci/misc: PCI: Report pci_pme_active() kmalloc failure mn10300/PCI: Remove useless pcibios_last_bus frv/PCI: Remove pcibios_last_bus PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0 x86/PCI: Coalesce multiple overlapping host bridge windows MAINTAINERS: Add arch/x86/pci to PCI file patterns PCI/PM: Remove pci_pm_complete() PCI: Add pci_dev_show_local_cpu() to simplify code mn10300/PCI: Remove unused pci_mem_start cris/PCI: Remove unused pci_mem_start PCI: Make pci_dev_pm_ops static Conflicts: drivers/pci/pci-sysfs.c	2013-10-31 14:12:40 -06:00
Lv Zheng	40bce100ca	ACPICA: Cleanup asmlinkage for ACPICA APIs. Add an asmlinkage wrapper around acpi_enter_sleep_state() to prevent an empty stub from being called by assmebly code for ACPI_REDUCED_HARDWARE set. As arch/x86/kernel/acpi/wakeup_xx.S is only compiled when CONFIG_ACPI=y and there are no users of ACPI_HARDWARE_REDUCED, currently this is in fact not a real issue, but a cleanup to reduce source code differences between Linux and ACPICA upstream. [rjw: Changelog] Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-10-31 14:37:35 +01:00
Michael S. Tsirkin	6026620475	kvm/vmx: error message typo fix mst can't be blamed for lack of switch entries: the issue is with msrs actually. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:55:15 +01:00
Paolo Bonzini	c67a04cb9a	KVM: x86: fix KVM_SET_XCRS loop The loop was always using 0 as the index. This means that any rubbish after the first element of the array went undetected. It seems reasonable to assume that no KVM userspace did that. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:31:19 +01:00
Paolo Bonzini	46c34cb059	KVM: x86: fix KVM_SET_XCRS for CPUs that do not support XSAVE The KVM_SET_XCRS ioctl must accept anything that KVM_GET_XCRS could return. XCR0's bit 0 is always 1 in real processors with XSAVE, and KVM_GET_XCRS will always leave bit 0 set even if the emulated processor does not have XSAVE. So, KVM_SET_XCRS must ignore that bit when checking for attempts to enable unsupported save states. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-31 11:30:46 +01:00
David Rientjes	d68ce0177c	x86, hyperv: Fix build error due to missing <asm/apic.h> include `9e7827b5ea` ("x86, hyperv: Get the local APIC timer frequency from the hypervisor") breaks the build with some configs because apic.h isn't directly included: arch/x86/kernel/cpu/mshyperv.c: In function 'ms_hyperv_init_platform': arch/x86/kernel/cpu/mshyperv.c:90:3: error: 'lapic_timer_frequency' undeclared (first use in this function) arch/x86/kernel/cpu/mshyperv.c:90:3: note: each undeclared identifier is reported only once for each function it appears in Fix it by including asm/apic.h. Signed-off-by: David Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1310111604160.31170@chino.kir.corp.google.com Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-10-30 15:37:47 -07:00
Linus Torvalds	12aee278b5	Merge branch 'akpm' (fixes from Andrew Morton) Merge three fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting percpu: fix this_cpu_sub() subtrahend casting for unsigneds mm/pagewalk.c: fix walk_page_range() access of wrong PTEs	2013-10-30 14:27:10 -07:00
Greg Thelen	bd09d9a351	percpu: fix this_cpu_sub() subtrahend casting for unsigneds this_cpu_sub() is implemented as negation and addition. This patch casts the adjustment to the counter type before negation to sign extend the adjustment. This helps in cases where the counter type is wider than an unsigned adjustment. An alternative to this patch is to declare such operations unsupported, but it seemed useful to avoid surprises. This patch specifically helps the following example: unsigned int delta = 1 preempt_disable() this_cpu_write(long_counter, 0) this_cpu_sub(long_counter, delta) preempt_enable() Before this change long_counter on a 64 bit machine ends with value 0xffffffff, rather than 0xffffffffffffffff. This is because this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta), which is basically: long_counter = 0 + 0xffffffff Also apply the same cast to: __this_cpu_sub() __this_cpu_sub_return() this_cpu_sub_return() All percpu_test.ko passes, especially the following cases which previously failed: l -= ui_one; __this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); l -= ui_one; this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); CHECK(l, long_counter, 0xffffffffffffffff); ul -= ui_one; __this_cpu_sub(ulong_counter, ui_one); CHECK(ul, ulong_counter, -1); CHECK(ul, ulong_counter, 0xffffffffffffffff); ul = this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 2); ul = __this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 1); Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-10-30 14:27:03 -07:00
Alex Williamson	e0f0bbc527	kvm: Create non-coherent DMA registeration We currently use some ad-hoc arch variables tied to legacy KVM device assignment to manage emulation of instructions that depend on whether non-coherent DMA is present. Create an interface for this, adapting legacy KVM device assignment and adding VFIO via the KVM-VFIO device. For now we assume that non-coherent DMA is possible any time we have a VFIO group. Eventually an interface can be developed as part of the VFIO external user interface to query the coherency of a group. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:23 +01:00
Alex Williamson	d96eb2c6f4	kvm/x86: Convert iommu_flags to iommu_noncoherent Default to operating in coherent mode. This simplifies the logic when we switch to a model of registering and unregistering noncoherent I/O with KVM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:13 +01:00
Alex Williamson	ec53500fae	kvm: Add VFIO device So far we've succeeded at making KVM and VFIO mostly unaware of each other, but areas are cropping up where a connection beyond eventfds and irqfds needs to be made. This patch introduces a KVM-VFIO device that is meant to be a gateway for such interaction. The user creates the device and can add and remove VFIO groups to it via file descriptors. When a group is added, KVM verifies the group is valid and gets a reference to it via the VFIO external user interface. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 19:02:03 +01:00
Borislav Petkov	84cffe499b	kvm: Emulate MOVBE This basically came from the need to be able to boot 32-bit Atom SMP guests on an AMD host, i.e. a host which doesn't support MOVBE. As a matter of fact, qemu has since recently received MOVBE support but we cannot share that with kvm emulation and thus we have to do this in the host. We're waay faster in kvm anyway. :-) So, we piggyback on the #UD path and emulate the MOVBE functionality. With it, an 8-core SMP guest boots in under 6 seconds. Also, requesting MOVBE emulation needs to happen explicitly to work, i.e. qemu -cpu n270,+movbe... Just FYI, a fairly straight-forward boot of a MOVBE-enabled 3.9-rc6+ kernel in kvm executes MOVBE ~60K times. Signed-off-by: Andre Przywara <andre@andrep.de> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:41 +01:00
Borislav Petkov	0bc5eedb82	kvm, emulator: Add initial three-byte insns support Add initial support for handling three-byte instructions in the emulator. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:41 +01:00
Borislav Petkov	b51e974fcd	kvm, emulator: Rename VendorSpecific flag Call it EmulateOnUD which is exactly what we're trying to do with vendor-specific instructions. Rename ->only_vendor_specific_insn to something shorter, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:40 +01:00
Borislav Petkov	1ce19dc16c	kvm, emulator: Use opcode length Add a field to the current emulation context which contains the instruction opcode length. This will streamline handling of opcodes of different length. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:39 +01:00
Borislav Petkov	9c15bb1d0a	kvm: Add KVM_GET_EMULATED_CPUID Add a kvm ioctl which states which system functionality kvm emulates. The format used is that of CPUID and we return the corresponding CPUID bits set for which we do emulate functionality. Make sure ->padding is being passed on clean from userspace so that we can use it for something in the future, after the ioctl gets cast in stone. s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 18:54:39 +01:00
Tim Gardner	d780a31271	KVM: Fix modprobe failure for kvm_intel/kvm_amd The x86 specific kvm init creates a new conflicting debugfs directory which causes modprobe issues with kvm_intel and kvm_amd. For example, sudo modprobe kvm_amd modprobe: ERROR: could not insert 'kvm_amd': Bad address The simplest fix is to just rename the directory. The following KVM config options are set: CONFIG_KVM_GUEST=y CONFIG_KVM_DEBUG_FS=y CONFIG_HAVE_KVM=y CONFIG_HAVE_KVM_IRQCHIP=y CONFIG_HAVE_KVM_IRQ_ROUTING=y CONFIG_HAVE_KVM_EVENTFD=y CONFIG_KVM_APIC_ARCHITECTURE=y CONFIG_KVM_MMIO=y CONFIG_KVM_ASYNC_PF=y CONFIG_HAVE_KVM_MSI=y CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y CONFIG_KVM=m CONFIG_KVM_INTEL=m CONFIG_KVM_AMD=m CONFIG_KVM_DEVICE_ASSIGNMENT=y Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> [Change debugfs directory name. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-30 12:10:42 +01:00
Peter Zijlstra	e00b12e64b	perf/x86: Further optimize copy_from_user_nmi() Now that we can deal with nested NMI due to IRET re-enabling NMIs and can deal with faults from NMI by making sure we preserve CR2 over NMIs we can in fact simply access user-space memory from NMI context. So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and rework the fault path to do the minimal required work before taking the in_atomic() fault handler. In particular avoid perf_sw_event() which would make perf recurse on itself (it should be harmless as our recursion protections should be able to deal with this -- but why tempt fate). Also rename notify_page_fault() to kprobes_fault() as that is a much better name; there is no notifier in it and its specific to kprobes. Don measured that his worst case NMI path shrunk from ~300K cycles to ~150K cycles. Cc: Stephane Eranian <eranian@google.com> Cc: jmario@redhat.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: dave.hansen@linux.intel.com Tested-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-29 12:02:54 +01:00
Peter Zijlstra	e8a923cc1f	perf/x86: Fix NMI measurements OK, so what I'm actually seeing on my WSM is that sched/clock.c is 'broken' for the purpose we're using it for. What triggered it is that my WSM-EP is broken :-( [ 0.001000] tsc: Fast TSC calibration using PIT [ 0.002000] tsc: Detected 2533.715 MHz processor [ 0.500180] TSC synchronization [CPU#0 -> CPU#6]: [ 0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock. [ 0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed For some reason it consistently detects TSC skew, even though NHM+ should have a single clock domain for 'reasonable' systems. This marks sched_clock_stable=0, which means that we do fancy stuff to try and get a 'sane' clock. Part of this fancy stuff relies on the tick, clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until it either wakes up or gets kicked by another cpu. While this is perfectly fine for the scheduler -- it only cares about actually running stuff, and when we're running stuff we're obviously not idle. This does somewhat break down for perf which can trigger events just fine on an otherwise idle cpu. So I've got NMIs get get 'measured' as taking ~1ms, which actually don't last nearly that long: <idle>-0 [013] d.h. 886.311970: rcu_nmi_enter <-do_nmi ... <idle>-0 [013] d.h. 886.311997: perf_sample_event_took: HERE!!! : 1040990 So ftrace (which uses sched_clock(), not the fancy bits) only sees ~27us, but we measure ~1ms !! Now since all this measurement stuff lives in x86 code, we can actually fix it. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mingo@kernel.org Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-29 12:01:20 +01:00
Ingo Molnar	aac898548d	Merge branch 'perf/urgent' into perf/core Conflicts: tools/perf/builtin-record.c tools/perf/builtin-top.c tools/perf/util/hist.h	2013-10-29 11:23:32 +01:00
Matt Fleming	72548e836b	x86/efi: Add EFI framebuffer earlyprintk support It's incredibly difficult to diagnose early EFI boot issues without special hardware because earlyprintk=vga doesn't work on EFI systems. Add support for writing to the EFI framebuffer, via earlyprintk=efi, which will actually give users a chance of providing debug output. Cc: H. Peter Anvin <hpa@zytor.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Jones <pjones@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-10-28 18:09:58 +00:00
Jan Kiszka	a294c9bbd0	nVMX: Report CPU_BASED_VIRTUAL_NMI_PENDING as supported If the host supports it, we can and should expose it to the guest as well, just like we already do with PIN_BASED_VIRTUAL_NMIS. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-10-28 13:14:38 +01:00

1 2 3 4 5 ...

18356 Commits