linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	ca520cab25	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and atomic updates from Ingo Molnar: "Main changes in this cycle are: - Extend atomic primitives with coherent logic op primitives (atomic_{or,and,xor}()) and deprecate the old partial APIs (atomic_{set,clear}_mask()) The old ops were incoherent with incompatible signatures across architectures and with incomplete support. Now every architecture supports the primitives consistently (by Peter Zijlstra) - Generic support for 'relaxed atomics': - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return() - atomic_read_acquire() - atomic_set_release() This came out of porting qrwlock code to arm64 (by Will Deacon) - Clean up the fragile static_key APIs that were causing repeat bugs, by introducing a new one: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. To be able to know the 'type' of the static key we encode it in the jump entry (by Peter Zijlstra) - Static key self-tests (by Jason Baron) - qrwlock optimizations (by Waiman Long) - small futex enhancements (by Davidlohr Bueso) - ... and misc other changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) jump_label/x86: Work around asm build bug on older/backported GCCs locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics locking/qrwlock: Implement queue_write_unlock() using smp_store_release() locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t' locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic locking/static_keys: Make verify_keys() static jump label, locking/static_keys: Update docs locking/static_keys: Provide a selftest jump_label: Provide a self-test s390/uaccess, locking/static_keys: employ static_branch_likely() x86, tsc, locking/static_keys: Employ static_branch_likely() locking/static_keys: Add selftest locking/static_keys: Add a new static_key interface locking/static_keys: Rework update logic locking/static_keys: Add static_key_{en,dis}able() helpers ...	2015-09-03 15:46:07 -07:00
Linus Torvalds	17e6b00ac4	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This updated pull request does not contain the last few GIC related patches which were reported to cause a regression. There is a fix available, but I let it breed for a couple of days first. The irq department provides: - new infrastructure to support non PCI based MSI interrupts - a couple of new irq chip drivers - the usual pile of fixlets and updates to irq chip drivers - preparatory changes for removal of the irq argument from interrupt flow handlers - preparatory changes to remove IRQF_VALID" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2 irqchip: Add documentation for the bcm2836 interrupt controller irqchip/bcm2835: Add support for being used as a second level controller irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ PCI: xilinx: Fix typo in function name irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance irqchip/gic: Only allow the primary GIC to set the CPU map PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove unicore32/irq: Prepare puv3_gpio_handler for irq argument removal tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal m68k/irq: Prepare irq handlers for irq argument removal C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal blackfin: Prepare irq handlers for irq argument removal arc/irq: Prepare idu_cascade_isr for irq argument removal sparc/irq: Use access helper irq_data_get_affinity_mask() sparc/irq: Use helper irq_data_get_irq_handler_data() parisc/irq: Use access helper irq_data_get_affinity_mask() mn10300/irq: Use access helper irq_data_get_affinity_mask() irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal ...	2015-09-01 14:33:35 -07:00
Vineet Gupta	3d5926599a	ARCv2: entry: Fix reserved handler Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 16:25:37 +05:30
Vineet Gupta	9b28829d6d	ARCv2: perf: Finally introduce HS perf unit With all features in place, the ARC HS pct block can now be effectively allowed to be probed/used Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:59:07 +05:30
Alexey Brodkin	e525c37f84	ARCv2: perf: SMP support * split off pmu info into singleton and per-cpu bits * setup PMU on all cores Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:58:42 +05:30
Alexey Brodkin	e6b1d126bb	ARCv2: perf: implement exclusion of event counting in user or kernel mode Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:58:14 +05:30
Alexey Brodkin	36481cf7fb	ARCv2: perf: Support sampling events using overflow interrupts In times of ARC 700 performance counters didn't have support of interrupts and so for ARC we only had support of non-sampling events. Put simply only "perf stat" was functional. Now with ARC HS we have support of interrupts in performance counters which this change introduces support of. ARC performance counters act in the following way in regard of interrupts generation. [1] A counter counts starting from value set in PCT_COUNT register pair [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised Basic setup looks like this: [1] PCT_COUNT = 0; [2] PCT_INT_CNT = __limit_value__; [3] Enable interrupts for that counter and let it run [4] Let counter reach its limit [5] Handle interrupt when it happens Note that PCT HW block is built into the CPU core and so its interrupt line (which is basically OR of all counters IRQs) is wired directly to top-level IRQC. That means to de-assert PCT interrupt it's required to reset IRQs from all counters that have reached their limit values. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:43 +05:30
Alexey Brodkin	1fe8bfa5ff	ARCv2: perf: implement "event_set_period" This generalization prepares for support of overflow interrupts. Hardware event counters on ARC work this way: Each counter counts from programmed start value (set in ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT) and once limit value is reached this timer generates an interrupt. Even though this hardware implementation allows for more flexibility, in Linux kernel we decided to mimic behavior of other architectures this way: [1] Set limit value as half of counter's max value (to allow counter to run after reaching its limit, see below for more explanation): ---------->8----------- arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL; ---------->8----------- [2] Set start value as "arc_pmu->max_period - sample_period" and then count up to the limit Our event counters don't stop on reaching max value (the one we set in ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly stops each of them. And setting a limit as half of counter capacity is done to allow capturing of additional events in between moment when interrupt was triggered until we're actually processing PMU interrupts. That way we're trying to be more precise. For example if we count CPU cycles we keep track of cycles while running through generic IRQ handling code: [1] We set counter period as say 100_000 events of type "crun" [2] Counter reaches that limit and raises its interrupt [3] Once we get in PMU IRQ handler we read current counter value from ARC_REG_PCT_SNAP and see there something like 105_000. If counters stop on reaching a limit value then we would miss additional 5000 cycles. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:29 +05:30
Vineet Gupta	fb7c572551	ARC: perf: cap the number of counters to hardware max of 32 The number of counters in PCT can never be more than 32 (while countable conditions could be 100+) for both ARCompact and ARCv2 And while at it update copyright dates. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:03 +05:30
Vineet Gupta	fd0881a24a	ARC: Eliminate some ARCv2 specific code for ARCompact build Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-21 15:06:43 +05:30
Vineet Gupta	090749502f	ARC: add/fix some comments in code - no functional change Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 19:05:49 +05:30
Yuriy Kolerov	6de6066c0d	ARC: change some branchs to jumps to resolve linkage errors When kernel's binary becomes large enough (32M and more) errors may occur during the final linkage stage. It happens because the build system uses short relocations for ARC by default. This problem may be easily resolved by passing -mlong-calls option to GCC to use long absolute jumps (j) instead of short relative branches (b). But there are fragments of pure assembler code which use branches in inappropriate places and cause a linkage error because of relocations overflow. First of these fragments is .fixup insertion in futex.h and unaligned.c. It inserts a code in the separate section (.fixup) with branch instruction. It leads to the linkage error when kernel becomes large. Second of these fragments is calling scheduler's functions (common kernel code) from entry.S of ARC's code. When kernel's binary becomes large it may lead to the linkage error because scheduler may occur far enough from ARC's code in the final binary. Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:53:15 +05:30
Vineet Gupta	eb2cd8b72b	ARC: ensure futex ops are atomic in !LLSC config W/o hardware assisted atomic r-m-w the best we can do is to disable preemption. Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:16:01 +05:30
Vineet Gupta	5e0574292a	ARC: Enable HAVE_FUTEX_CMPXCHG ARC doesn't need the runtime detection of futex cmpxchg op Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:16:01 +05:30
Vineet Gupta	882a95ae0a	ARC: make futex_atomic_cmpxchg_inatomic() return bimodal Callers of cmpxchg_futex_value_locked() in futex code expect bimodal return value: !0 (essentially -EFAULT as failure) 0 (success) Before this patch, the success return value was old value of futex, which could very well be non zero, causing caller to possibly take the failure path erroneously. Fix that by returning 0 for success (This fix was done back in 2011 for all upstream arches, which ARC obviously missed) Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:16:00 +05:30
Vineet Gupta	ed574e2bbd	ARC: futex cosmetics Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:16:00 +05:30
Vineet Gupta	31d30c8208	ARC: add barriers to futex code The atomic ops on futex need to provide the full barrier just like regular atomics in kernel. Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic() as core code already does that Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:15:59 +05:30
Alexey Brodkin	1648c70d30	ARCv2: IOC: Allow boot time disable Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:15:31 +05:30
Vineet Gupta	79335a2ca0	ARCv2: SLC: Allow boot time disable Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:11:52 +05:30
Alexey Brodkin	f2b0b25a37	ARCv2: Support IO Coherency and permutations involving L1 and L2 caches In case of ARCv2 CPU there could be the following configurations that affect cache handling for data exchanged with peripherals via DMA: [1] Only L1 cache exists [2] Both L1 and L2 exist, but no IO coherency unit [3] L1, L2 caches and IO coherency unit exist Current implementation takes care of [1] and [2]. Moreover support of [2] is implemented with run-time check for SLC existence which is not super optimal. This patch introduces support of [3] and rework of DMA ops usage. Instead of doing run-time check every time a particular DMA op is executed we'll have 3 different implementations of DMA ops and select appropriate one during init. As for IOC support for it we need: [a] Implement empty DMA ops because IOC takes care of cache coherency with DMAed data [b] Route dma_alloc_coherent() via dma_alloc_noncoherent() This is required to make IOC work in first place and also serves as optimization as LD/ST to coherent buffers can be serviced from caches w/o going all the way to memory Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> [vgupta: -Added some comments about IOC gains -Marked dma ops as static, -Massaged changelog a bit] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 18:11:17 +05:30
Vineet Gupta	2a4401687c	ARC: Enable optimistic spinning for LLSC config Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-11 14:51:09 +05:30
Vineet Gupta	1097163870	ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff The increment of delay counter was 2 instructions: Arithmetic Shift Left (ASL) + set to 1 on overflow This can be done in 1 using ROtate Left (ROL) Suggested-by: Nigel Topham <ntopham@synopsys.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-07 13:56:16 +05:30
Vineet Gupta	87ce62802f	ARC: Make pt_regs regs unsigned KGDB fails to build after `f51e2f1911` ("ARC: make sure instruction_pointer() returns unsigned value") The hack to force one specific reg to unsigned backfired. There's no reason to keep the regs signed after all. | CC arch/arc/kernel/kgdb.o |../arch/arc/kernel/kgdb.c: In function 'kgdb_trap': | ../arch/arc/kernel/kgdb.c:180:29: error: lvalue required as left operand of assignment | instruction_pointer(regs) -= BREAK_INSTR_SIZE; Reported-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com> Fixes: `f51e2f1911` ("ARC: make sure instruction_pointer() returns unsigned value") Cc: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-05 11:48:21 +05:30
Vineet Gupta	b89aa12c17	ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle The previous commit for delayed retry of SCOND needs some fine tuning for spin locks. The backoff from delayed retry in conjunction with spin looping of lock itself can potentially cause the delay counter to reach high values. So to provide fairness to any lock operation, after a lock "seems" available (i.e. just before first SCOND try), reset the delay counter back to starting value of 1 Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:35 +05:30
Vineet Gupta	e78fdfef84	ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff This is to workaround the llock/scond livelock HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping coherency transactions in the SCU. The exclusive line state keeps rotating among contending cores leading to a never ending cycle. So break the cycle by deferring the retry of failed exclusive access (SCOND). The actual delay needed is function of number of contending cores as well as the unrelated coherency traffic from other cores. To keep the code simple, start off with small delay of 1 which would suffice most cases and in case of contention double the delay. Eventually the delay is sufficient such that the coherency pipeline is drained, thus a subsequent exclusive access would succeed. Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:34 +05:30
Vineet Gupta	69cbe630f5	ARC: LLOCK/SCOND based rwlock With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need for a guarding spin lock. This in turn elides the EXchange instruction based spinning which causes the cacheline transition to exclusive state and concurrent spinning across cores would cause the line to keep bouncing around. LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps the cacheline in shared state. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:33 +05:30
Vineet Gupta	ae7eae9e03	ARC: LLOCK/SCOND based spin_lock Current spin_lock uses EXchange instruction to implement the atomic test and set of lock location (reads orig value and ST 1). This however forces the cacheline into exclusive state (because of the ST) and concurrent loops in multiple cores will bounce the line around between cores. Instead, use LLOCK/SCOND to implement the atomic test and set which is better as line is in shared state while lock is spinning on LLOCK The real motivation of this change however is to make way for future changes in atomics to implement delayed retry (with backoff). Initial experiment with delayed retry in atomics combined with orig EX based spinlock was a total disaster (broke even LMBench) as struct sock has a cache line sharing an atomic_t and spinlock. The tight spinning on lock, caused the atomic retry to keep backing off such that it would never finish. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:33 +05:30
Vineet Gupta	8ac0665fb6	ARC: refactor atomic inline asm operands with symbolic names This reduces the diff in forth-coming patches and also helps understand better the incremental changes to inline asm. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:32 +05:30
Vineet Gupta	f5959cb0c3	Revert "ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock" Extended testing of quad core configuration revealed that this fix was insufficient. Specifically LTP open posix shm_op/23-1 would cause the hardware livelock in llock/scond loop in update_cpu_load_active() So remove this and make way for a proper workaround This reverts commit `a5c8b52abe`. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:31 +05:30
Vineet Gupta	6de7abfbad	ARCv2: [axs103_smp] Reduce clk for Quad FPGA configs Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-04 09:26:30 +05:30
Vineet Gupta	e13c42ecbe	ARCv2: Fix the peripheral address space detection With HS 2.1 release, the peripheral space register no longer contains the uncached space specifics, causing the kernel to panic early on. So read the newer NON VOLATILE AUX register to get that info. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-03 19:34:07 +05:30
Thomas Gleixner	badae6bc94	arc/irq: Prepare idu_cascade_isr for irq argument removal The irq argument of most interrupt flow handlers is unused or merely used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Vineet Gupta <vgupta@synopsys.com>	2015-07-31 22:20:05 +02:00
Peter Zijlstra	de9e432cb5	atomic: Collapse all atomic_{set,clear}_mask definitions Move the now generic definitions of atomic_{set,clear}_mask() into linux/atomic.h to avoid endless and pointless repetition. Also, provide an atomic_andnot() wrapper for those few archs that can implement that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-27 14:06:24 +02:00
Peter Zijlstra	e6942b7de2	atomic: Provide atomic_{or,xor,and} Implement atomic logic ops -- atomic_{or,xor,and}. These will replace the atomic_{set,clear}_mask functions that are available on some archs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-27 14:06:24 +02:00
Peter Zijlstra	cda7e4137a	arc: Provide atomic_{or,xor,and} Implement atomic logic ops -- atomic_{or,xor,and}. These will replace the atomic_{set,clear}_mask functions that are available on some archs. Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-27 14:06:21 +02:00
Alexey Brodkin	450ed0db01	ARCv2: allow selection of page size for MMUv4 MMUv4 also supports the configurable page size as MMUv3. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-23 12:04:39 +03:00
Vineet Gupta	262137bca7	ARCv2: lib: memset: Don't assume 64-bit load/stores There are configurations which may not have LDD/STD Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-20 17:44:37 +03:00
Vineet Gupta	21481f2cfe	ARCv2: lib: memcpy: Missing PREFETCHW Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-20 17:27:35 +03:00
Alexey Brodkin	d05a76ab4d	ARCv2: add knob for DIV_REV in Kconfig Being highly configurable core ARC HS among other features might be configured with or without DIV_REM_OPTION (hardware divider). That option when enabled adds following instructions: div, divu, rem, remu. By default ARC HS38 has this option enabled. So we add here possibility to disable usage of hardware divider by compiler. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-20 13:33:30 +03:00
Viresh Kumar	aeec6cdad6	ARC/time: Migrate to new 'set-state' interface Migrate arc driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-20 13:30:31 +03:00
Laurent Dufour	f2abeef9fd	mm: clean up per architecture MM hook header files Commit `2ae416b142` ("mm: new mm hook framework") introduced an empty header file (mm-arch-hooks.h) for every architecture, even those which don't need to define mm hooks. As suggested by Geert Uytterhoeven, this could be cleaned through the use of a generic header file included via each per architecture asm/include/Kbuild file. The PowerPC architecture is not impacted here since this architecture has to define the arch_remap MM hook. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Vineet Gupta <vgupta@synopsys.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-07-17 16:39:53 -07:00
Vineet Gupta	624b71ee20	ARCv2: support HS38 releases Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-13 13:33:23 +05:30
Alexey Brodkin	f51e2f1911	ARC: make sure instruction_pointer() returns unsigned value Currently instruction_pointer() returns pt_regs->ret and so return value is of type "long", which implicitly stands for "signed long". While that's perfectly fine when dealing with 32-bit values if return value of instruction_pointer() gets assigned to 64-bit variable sign extension may happen. And at least in one real use-case it happens already. In perf_prepare_sample() return value of perf_instruction_pointer() (which is an alias to instruction_pointer() in case of ARC) is assigned to (struct perf_sample_data)->ip (which type is "u64"). And what we see if instruction pointer points to user-space application that in case of ARC lays below 0x8000_0000 "ip" gets set properly with leading 32 zeros. But if instruction pointer points to kernel address space that starts from 0x8000_0000 then "ip" is set with 32 leading "f"-s. I.e. if instruction_pointer() returns 0x8100_0000, "ip" will be assigned with 0xffff_ffff__8100_0000. Which is obviously wrong. In particular that issue broke output of perf, because perf was unable to associate addresses like 0xffff_ffff__8100_0000 with anything from /proc/kallsyms. That's what we used to see: ----------->8---------- 6.27% ls [unknown] [k] 0xffffffff8046c5cc 2.96% ls libuClibc-0.9.34-git.so [.] memcpy 2.25% ls libuClibc-0.9.34-git.so [.] memset 1.66% ls [unknown] [k] 0xffffffff80666536 1.54% ls libuClibc-0.9.34-git.so [.] 0x000224d6 1.18% ls libuClibc-0.9.34-git.so [.] 0x00022472 ----------->8---------- With that change perf output looks much better now: ----------->8---------- 8.21% ls [kernel.kallsyms] [k] memset 3.52% ls libuClibc-0.9.34-git.so [.] memcpy 2.11% ls libuClibc-0.9.34-git.so [.] malloc 1.88% ls libuClibc-0.9.34-git.so [.] memset 1.64% ls [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.41% ls [kernel.kallsyms] [k] __d_lookup_rcu ----------->8---------- Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: arc-linux-dev@synopsys.com Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-13 13:33:18 +05:30
Vineet Gupta	b631788ab4	ARC: slightly refactor macros for boot logging Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-09 17:36:33 +05:30
Vineet Gupta	9138d4138d	ARC: Add llock/scond to futex backend Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-09 17:36:33 +05:30
Joël Porquet	70d93d8941	arc:irqchip: prepare for drivers/irqchip/irqchip.h removal The IRQCHIP_DECLARE macro migrated to 'include/linux/irqchip.h'. See commit `91e20b5040` ("irqchip: Move IRQCHIP_DECLARE macro to include/linux/irqchip.h"). This patch removes the inclusions of private header 'drivers/irqchip/irqchip.h' and if necessary replaces them with inclusions of 'include/linux/irqchip.h'. Signed-off-by: Joel Porquet <joel@porquet.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-09 17:36:32 +05:30
Vineet Gupta	80f420842f	ARC: Make ARC bitops "safer" (add anti-optimization) ARCompact/ARCv2 ISA provide that any instruction which deals with bitpos/count operand ASL, LSL, BSET, BCLR, BMSK .... will only consider lower 5 bits. i.e. auto-clamp the pos to 0-31. ARC Linux bitops exploited this fact by NOT explicitly masking out upper bits for @nr operand in general, saving a bunch of AND/BMSK instructions in generated code around bitops. While this micro-optimization has worked well over years it is NOT safe as shifting a number with a value, greater than native size is "undefined" per "C" spec. So as it turns out, EZChip ran into this eventually, in their massive multi-core SMP build with 64 cpus. There was a test_bit() inside a loop from 63 to 0 and gcc was weirdly optimizing away the first iteration (so it was really adhering to standard by implementing undefined behaviour vs. removing all the iterations which were phony i.e. (1 << [63..32]) | for i = 63 to 0 | X = ( 1 << i ) | if X == 0 | continue So fix the code to do the explicit masking at the expense of generating additional instructions. Fortunately, this can be mitigated to a large extent as gcc has SHIFT_COUNT_TRUNCATED which allows combiner to fold masking into shift operation itself. It is currently not enabled in ARC gcc backend, but could be done after a bit of testing. Fixes STAR 9000866918 ("unsafe "undefined behavior" code in kernel") Reported-by: Noam Camus <noamc@ezchip.com> Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-09 17:36:32 +05:30
Alexey Brodkin	e2fc61f384	ARCv2: [axs103] bump CPU frequency from 75 to 90 MHZ With up-to-date FPGA builds ARC cores are supposed to correctly operate even with 90 MHz clock (which is a target frequency for AXS103 release). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: arc-linux-dev@synopsys.com	2015-07-09 17:36:31 +05:30
Vineet Gupta	6b12ec177c	ARCv2: intc: IDU: Fix potential race in installing a chained IRQ handler Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-06 11:09:06 +05:30
Vineet Gupta	83ce3e6fcc	ARCv2: intc: IDU: support irq affinity With this nsim standalone / OSCI have working irq affinity - AXS103 still needs some work as IDU is not visible in intc hierarchy yet ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-07-06 11:09:02 +05:30
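
The period arithmetic described in commit 1fe8bfa5ff (ARCv2: perf: implement "event_set_period") can be sketched in plain C. This is an illustration only, not the kernel's code: the helper names `pct_max_period`, `pct_start_value`, and `pct_events_since_start` are hypothetical, and the 48-bit counter width mentioned in the usage note is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Limit value that would be programmed into ARC_REG_PCT_INT_CNT:
 * half of the counter's range, following the formula quoted in the
 * commit message. Hypothetical helper; counter_size must be < 64.
 */
uint64_t pct_max_period(unsigned int counter_size)
{
    return (1ULL << counter_size) / 2 - 1ULL;
}

/*
 * Start value that would be programmed into ARC_REG_PCT_COUNT so that
 * the limit is reached after sample_period events. Hypothetical helper;
 * assumes sample_period <= pct_max_period(counter_size).
 */
uint64_t pct_start_value(unsigned int counter_size, uint64_t sample_period)
{
    return pct_max_period(counter_size) - sample_period;
}

/*
 * Events counted since the start value, given a snapshot such as the
 * one read from ARC_REG_PCT_SNAP. Because the counter keeps running
 * past the limit, this can exceed sample_period. Hypothetical helper.
 */
uint64_t pct_events_since_start(uint64_t snap, uint64_t start)
{
    return snap - start;
}
```

With an assumed 48-bit counter and a period of 100000 events, a snapshot taken 5000 events after the interrupt fired yields 105000 from `pct_events_since_start()`, which matches the "105_000" case in the commit message: the extra cycles are kept because the counter did not stop at the limit.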


590 Commits
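
The sign-extension problem described in commit f51e2f1911 (ARC: make sure instruction_pointer() returns unsigned value) can be reproduced outside the kernel. The sketch below is an illustration under stated assumptions: `widen_signed` and `widen_unsigned` are hypothetical helpers, `int32_t` stands in for the 32-bit "long" on ARC, and `uint64_t` stands in for the "u64" field `ip`.

```c
#include <stdint.h>

/*
 * Models the old behaviour: the 32-bit register value is treated as a
 * signed quantity before being widened to 64 bits, so addresses at or
 * above 0x80000000 are sign-extended. Hypothetical helper. The cast
 * from uint32_t to int32_t is implementation-defined for large values;
 * GCC and Clang wrap to two's complement, which is assumed here.
 */
uint64_t widen_signed(uint32_t raw)
{
    int32_t as_signed = (int32_t)raw;
    return (uint64_t)as_signed;
}

/*
 * Models the fixed behaviour: the value stays unsigned, so widening
 * zero-extends and the upper 32 bits are always zero. Hypothetical helper.
 */
uint64_t widen_unsigned(uint32_t raw)
{
    return (uint64_t)raw;
}
```

For the kernel address 0x81000000 used in the commit message, `widen_signed()` gives 0xffffffff81000000 while `widen_unsigned()` gives 0x0000000081000000. User-space addresses below 0x80000000 come out the same either way, which is why only kernel samples showed up as `[unknown]` in the perf output.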