Commit Graph

551 Commits

Author SHA1 Message Date
Peter Zijlstra cda7e4137a arc: Provide atomic_{or,xor,and}
Implement atomic logic ops -- atomic_{or,xor,and}.

These will replace the atomic_{set,clear}_mask functions that are
available on some archs.

Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-27 14:06:21 +02:00
Laurent Dufour f2abeef9fd mm: clean up per architecture MM hook header files
Commit 2ae416b142 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.

As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.

The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-17 16:39:53 -07:00
Vineet Gupta 624b71ee20 ARCv2: support HS38 releases
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-13 13:33:23 +05:30
Alexey Brodkin f51e2f1911 ARC: make sure instruction_pointer() returns unsigned value
Currently instruction_pointer() returns pt_regs->ret and so return value
is of type "long", which implicitly stands for "signed long".

While that's perfectly fine when dealing with 32-bit values if return
value of instruction_pointer() gets assigned to 64-bit variable sign
extension may happen.

And at least in one real use-case it happens already.
In perf_prepare_sample() return value of perf_instruction_pointer()
(which is an alias to instruction_pointer() in case of ARC) is assigned
to (struct perf_sample_data)->ip (which type is "u64").

And what we see if instuction pointer points to user-space application
that in case of ARC lays below 0x8000_0000 "ip" gets set properly with
leading 32 zeros. But if instruction pointer points to kernel address
space that starts from 0x8000_0000 then "ip" is set with 32 leadig
"f"-s. I.e. id instruction_pointer() returns 0x8100_0000, "ip" will be
assigned with 0xffff_ffff__8100_0000. Which is obviously wrong.

In particular that issuse broke output of perf, because perf was unable
to associate addresses like 0xffff_ffff__8100_0000 with anything from
/proc/kallsyms.

That's what we used to see:
 ----------->8----------
  6.27%  ls       [unknown]                [k] 0xffffffff8046c5cc
  2.96%  ls       libuClibc-0.9.34-git.so  [.] memcpy
  2.25%  ls       libuClibc-0.9.34-git.so  [.] memset
  1.66%  ls       [unknown]                [k] 0xffffffff80666536
  1.54%  ls       libuClibc-0.9.34-git.so  [.] 0x000224d6
  1.18%  ls       libuClibc-0.9.34-git.so  [.] 0x00022472
 ----------->8----------

With that change perf output looks much better now:
 ----------->8----------
  8.21%  ls       [kernel.kallsyms]        [k] memset
  3.52%  ls       libuClibc-0.9.34-git.so  [.] memcpy
  2.11%  ls       libuClibc-0.9.34-git.so  [.] malloc
  1.88%  ls       libuClibc-0.9.34-git.so  [.] memset
  1.64%  ls       [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
  1.41%  ls       [kernel.kallsyms]        [k] __d_lookup_rcu
 ----------->8----------

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-13 13:33:18 +05:30
Vineet Gupta b631788ab4 ARC: slightly refactor macros for boot logging
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:33 +05:30
Vineet Gupta 9138d4138d ARC: Add llock/scond to futex backend
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:33 +05:30
Joël Porquet 70d93d8941 arc:irqchip: prepare for drivers/irqchip/irqchip.h removal
The IRQCHIP_DECLARE macro migrated to 'include/linux/irqchip.h'.

See commit 91e20b5040
("irqchip: Move IRQCHIP_DECLARE macro to include/linux/irqchip.h").

This patch removes the inclusions of private header 'drivers/irqchip/irqchip.h'
and if necessary replaces them with inclusions of 'include/linux/irqchip.h'.

Signed-off-by: Joel Porquet <joel@porquet.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:32 +05:30
Vineet Gupta 80f420842f ARC: Make ARC bitops "safer" (add anti-optimization)
ARCompact/ARCv2 ISA provide that any instructions which deals with
bitpos/count operand ASL, LSL, BSET, BCLR, BMSK .... will only consider
lower 5 bits. i.e. auto-clamp the pos to 0-31.

ARC Linux bitops exploited this fact by NOT explicitly masking out upper
bits for @nr operand in general, saving a bunch of AND/BMSK instructions
in generated code around bitops.

While this micro-optimization has worked well over years it is NOT safe
as shifting a number with a value, greater than native size is
"undefined" per "C" spec.

So as it turns outm EZChip ran into this eventually, in their massive
muti-core SMP build with 64 cpus. There was a test_bit() inside a loop
from 63 to 0 and gcc was weirdly optimizing away the first iteration
(so it was really adhering to standard by implementing undefined behaviour
vs. removing all the iterations which were phony i.e. (1 << [63..32])

| for i = 63 to 0
|    X = ( 1 << i )
|    if X == 0
|       continue

So fix the code to do the explicit masking at the expense of generating
additional instructions. Fortunately, this can be mitigated to a large
extent as gcc has SHIFT_COUNT_TRUNCATED which allows combiner to fold
masking into shift operation itself. It is currently not enabled in ARC
gcc backend, but could be done after a bit of testing.

Fixes STAR 9000866918 ("unsafe "undefined behavior" code in kernel")

Reported-by: Noam Camus <noamc@ezchip.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:32 +05:30
Alexey Brodkin e2fc61f384 ARCv2: [axs103] bump CPU frequency from 75 to 90 MHZ
With up-to-date FPGA builds ARC cores are supposed to correctly operate
even with 90 MHz clock (which is a target frequency for AXS103 release).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
2015-07-09 17:36:31 +05:30
Vineet Gupta 6b12ec177c ARCv2: intc: IDU: Fix potential race in installing a chained IRQ handler
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 11:09:06 +05:30
Vineet Gupta 83ce3e6fcc ARCv2: intc: IDU: support irq affinity
With this nsim standlone / OSCI have working irq affinity - AXS103 still
needs some work as IDU is not visible in intc hierarchy yet !

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 11:09:02 +05:30
Vineet Gupta bccea41ec0 ARC: fix unused var wanring
Fixes: 9bf39ab2ad ("vfs: add file_path() helper")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 11:09:01 +05:30
Vineet Gupta f718c2efff ARC: Don't memzero twice in dma_alloc_coherent for __GFP_ZERO
alloc_pages_exact() get gfp flags and handle zero'ing already

And while it, fix the case where ioremap fails: return rightaway.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 11:09:01 +05:30
Vineet Gupta 9770906921 ARC: Override toplevel default -O2 with -O3
ARC kernels have historically been built with -O3, despite top level
Makefile defaulting to -O2. This was facilitated by implicitly ordering
of arch makefile include AFTER top level assigned -O2.

An upstream fix to top level a1c48bb160 ("Makefile: Fix unrecognized
cross-compiler command line options") changed the ordering, making ARC
-O3 defunct.

Fix that by NOT relying on any ordering whatsoever and use the proper
arch override facility now present in kbuild (ARCH_*FLAGS)

Depends-on: ("kbuild: Allow arch Makefiles to override {cpp,ld,c}flags")
Suggested-by: Michal Marek <mmarek@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 11:09:00 +05:30
Alexey Brodkin b607eddd71 ARCv2: guard SLC DMA ops with spinlock
SLC maintenance ops need to be serialized by software as there is no
inherent buffering / quequing of aux commands. It can silently ignore a
new aux operation if previous one is still ongoing (SLC_CTRL_BUSY)

So gaurd the SLC op using a spin lock

The spin lock doesn't seem to be contended even in heavy workloads such
as iperf. On FPGA @ 75 MHz.

 [1] Before this change:
 ============================================================
  # iperf -c 10.42.0.1
 ------------------------------------------------------------
 Client connecting to 10.42.0.1, TCP port 5001
 TCP window size: 43.8 KByte (default)
 ------------------------------------------------------------
 [  3] local 10.42.0.110 port 38935 connected with 10.42.0.1 port 5001
 [ ID] Interval       Transfer     Bandwidth
 [  3]  0.0-10.0 sec  48.4 MBytes  40.6 Mbits/sec
 ============================================================

 [2] After this change:
 ============================================================
 # iperf -c 10.42.0.1
 ------------------------------------------------------------
 Client connecting to 10.42.0.1, TCP port 5001
 TCP window size: 43.8 KByte (default)
 ------------------------------------------------------------
 [  3] local 10.42.0.243 port 60248 connected with 10.42.0.1 port 5001
 [ ID] Interval       Transfer     Bandwidth
 [  3]  0.0-10.0 sec  47.5 MBytes  39.8 Mbits/sec
 # iperf -c 10.42.0.1
 ------------------------------------------------------------
 Client connecting to 10.42.0.1, TCP port 5001
 TCP window size: 43.8 KByte (default)
 ------------------------------------------------------------
 [  3] local 10.42.0.243 port 60249 connected with 10.42.0.1 port 5001
 [ ID] Interval       Transfer     Bandwidth
 [  3]  0.0-10.0 sec  54.9 MBytes  46.0 Mbits/sec
 ============================================================

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 10:12:39 +05:30
Vineet Gupta 14a0abfc4a ARC: Kconfig: better way to disable ARC_HAS_LLSC for ARC_CPU_750D
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06 10:12:39 +05:30
Linus Torvalds 1dc51b8288 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files->file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops->inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
2015-07-04 19:36:06 -07:00
Linus Torvalds 2d01eedf1d Merge branch 'akpm' (patches from Andrew)
Merge third patchbomb from Andrew Morton:

 - the rest of MM

 - scripts/gdb updates

 - ipc/ updates

 - lib/ updates

 - MAINTAINERS updates

 - various other misc things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (67 commits)
  genalloc: rename of_get_named_gen_pool() to of_gen_pool_get()
  genalloc: rename dev_get_gen_pool() to gen_pool_get()
  x86: opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit
  MAINTAINERS: add zpool
  MAINTAINERS: BCACHE: Kent Overstreet has changed email address
  MAINTAINERS: move Jens Osterkamp to CREDITS
  MAINTAINERS: remove unused nbd.h pattern
  MAINTAINERS: update brcm gpio filename pattern
  MAINTAINERS: update brcm dts pattern
  MAINTAINERS: update sound soc intel patterns
  MAINTAINERS: remove website for paride
  MAINTAINERS: update Emulex ocrdma email addresses
  bcache: use kvfree() in various places
  libcxgbi: use kvfree() in cxgbi_free_big_mem()
  target: use kvfree() in session alloc and free
  IB/ehca: use kvfree() in ipz_queue_{cd}tor()
  drm/nouveau/gem: use kvfree() in u_free()
  drm: use kvfree() in drm_free_large()
  cxgb4: use kvfree() in t4_free_mem()
  cxgb3: use kvfree() in cxgb_free_mem()
  ...
2015-07-01 17:47:51 -07:00
Linus Torvalds 0890a26479 - Support for HS38 cores based on ARCv2 ISA
ARCv2 is the next generation ISA from Synopsys and basis for the
      HS3{4,6,8} families of processors which retain the traditional ARC mantra of
      low power and configurability and are now more performant and feature rich.
 
      HS38x is a 10 stage pipeline core which supports MMU (with huge pages) and
      SMP (upto 4 cores) among other features.
 
      + www.synopsys.com/dw/ipdir.php?ds=arc-hs38-processor
      + http://news.synopsys.com/2014-10-14-New-DesignWare-ARC-HS38-Processor-Doubles-Performance-for-Embedded-Linux-Applications
      + http://www.embedded.com/electronics-news/4435975/Synopsys-ARC-HS38-core-gives-2X-boost-to-Linux-based-apps
 
  - Support for ARC SDP (Software Development platform): Main Board + CPU Cards
     = AXS101: CPU Card with ARC700 in silicon @ 700 MHz
     = AXS103: CPU Card with HS38x in FPGA
 
  - Refactoring of ARCompact port to accomodate new ARCv2 ISA
  - Miscll updates/cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVk0g8AAoJEGnX8d3iisJecqsQAI6gvBC4GSNYDrmgGJJK1uLQ
 uf6ZXQRLBtyxwa6VMvaNFe91i5XV5WvEXDuNBQX4FdYbp7Fs+Jz5VK79xFtbVEdU
 H6mgKcs9HBwQvrHBxl54XxxXfX7kD1kxrlV7cL4b7bXTEX0XyH5ROUj600/YP+B4
 8t+XdYcfgFK0HpeFGXVP+Xmv/e+hBbzCpOjOd2ZFqEwymvSpZDc4KZ2yDvV2+Ybn
 JNZ421urQOrxR27njvvPvtpeN7uuJKfRYq7IuIR8+Ad72S19EDdw+DZHp2XoUMXA
 wgydWrrOaX2Dr2CmXHGA1C4nWEG7+Yo9I1WitjJct0tkOQyDR2OIDGmvKGBd1uoS
 QsihtoKBRvns+2gpXBEOmOHmF6ggpHNN0ppIwCp+AK5kX3fmxBtyUekyYmVpg8oQ
 xgFIuJgmiAvW7QB7xIO6SFFt18De2ifDRrKWJwVauvfW/PvUIwuUBEcbh0OHAn54
 ebUUWu2ZdVNe0XCsZOAQGwYHZRWBk8Bn3bhFpNnOliRiF77e9GsKeGYeIswYFy7I
 42Gp35ftEj1pLLFZ1vIsAo72N6ErmHwPOcJkaBYaTbPGPcTEO2aR6b8WOcCjsPxK
 DUeUV3H2HV+6V4jw/96lnsaRqsaj4TsJxEAFRR3wT1DLoRudCIDubaXTdvvDie77
 RgKn4ZdxgmXD97+deBqc
 =KwNo
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC architecture updates from Vineet Gupta:

 - support for HS38 cores based on ARCv2 ISA

     ARCv2 is the next generation ISA from Synopsys and basis for the
     HS3{4,6,8} families of processors which retain the traditional ARC mantra of
     low power and configurability and are now more performant and feature rich.

     HS38x is a 10 stage pipeline core which supports MMU (with huge pages) and
     SMP (upto 4 cores) among other features.

     + www.synopsys.com/dw/ipdir.php?ds=arc-hs38-processor
     + http://news.synopsys.com/2014-10-14-New-DesignWare-ARC-HS38-Processor-Doubles-Performance-for-Embedded-Linux-Applications
     + http://www.embedded.com/electronics-news/4435975/Synopsys-ARC-HS38-core-gives-2X-boost-to-Linux-based-apps

 - support for ARC SDP (Software Development platform): Main Board + CPU Cards
    = AXS101: CPU Card with ARC700 in silicon @ 700 MHz
    = AXS103: CPU Card with HS38x in FPGA

 - refactoring of ARCompact port to accomodate new ARCv2 ISA

 - misc updates/cleanups

* tag 'arc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (72 commits)
  ARC: Fix build failures for ARCompact in linux-next after ARCv2 support
  ARCv2: Allow older gcc to cope with new regime of ARCv2/ARCompact support
  ARCv2: [vdk] dts files and defconfig for HS38 VDK
  ARCv2: [axs103] Support ARC SDP FPGA platform for HS38x cores
  ARC: [axs101] Prepare for AXS103
  ARCv2: [nsim*hs*] Support simulation platforms for HS38x cores
  ARCv2: All bits in place, allow ARCv2 builds
  ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency)
  ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock
  ARC: Reduce bitops lines of code using macros
  ARCv2: barriers
  arch: conditionally define smp_{mb,rmb,wmb}
  ARC: add smp barriers around atomics per Documentation/atomic_ops.txt
  ARC: add compiler barrier to LLSC based cmpxchg
  ARCv2: SMP: intc: IDU 2nd level intc for dynamic IRQ distribution
  ARCv2: SMP: clocksource: Enable Global Real Time counter
  ARCv2: SMP: ARConnect debug/robustness
  ARCv2: SMP: Support ARConnect (MCIP) for Inter-Core-Interrupts et al
  ARC: make plat_smp_ops weak to allow over-rides
  ARCv2: clocksource: Introduce 64bit local RTC counter
  ...
2015-07-01 09:24:26 -07:00
Akinobu Mita 92ed932d30 arc: use for_each_sg()
This replaces the plain loop over the sglist array with for_each_sg()
macro which consists of sg_next() function calls.  Since arc doesn't
select ARCH_HAS_SG_CHAIN, it is not necessary to use for_each_sg() in
order to loop over each sg element.  But this can help find problems with
drivers that do not properly initialize their sg tables when
CONFIG_DEBUG_SG is enabled.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:58 -07:00
Vineet Gupta 40b8ad8f76 ARC: Fix build failures for ARCompact in linux-next after ARCv2 support
Reported-by: Guenter Roeck <private@roeck-us.net>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-28 20:30:13 +05:30
Vineet Gupta d1c6c2fbcd ARCv2: Allow older gcc to cope with new regime of ARCv2/ARCompact support
-no-ll64 is specific to ARCv2 ISA, and is obviously not supported by
older ARC gcc - in this case the one hosted by linux-next sanity build
service.

Ensure that it doesn't get included for ISA_ARCOMPACT

Reported-by: Guenter Roeck <private@roeck-us.net>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-28 19:47:22 +05:30
Linus Torvalds ad90fb9751 Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block
Pull asm/scatterlist.h removal from Jens Axboe:
 "We don't have any specific arch scatterlist anymore, since parisc
  finally switched over.  Kill the include"

* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
  remove scatterlist.h generation from arch Kbuild files
  remove <asm/scatterlist.h>
2015-06-25 15:22:36 -07:00
Laurent Dufour 2ae416b142 mm: new mm hook framework
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu).  This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service.  So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold.  The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.

This patch (of 3):

This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)

The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.

The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.

In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Ruud Derwig 2924cd18c4 ARCv2: [vdk] dts files and defconfig for HS38 VDK
- CONFIG_ARC_UBOOT_SUPPORT to handle arguments passed in r0, r1, r2
 - CONFIG_DEVTMPFS_MOUNT for mouting rootfs since it uses external cpio
   for rootfs

Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Ruud Derwig <rderwig@synopsys.com>
[vgupta: folded the Main baord DT files for smp/up into one]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:21 +05:30
Vineet Gupta 5fa2daaa8d ARCv2: [axs103] Support ARC SDP FPGA platform for HS38x cores
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:20 +05:30
Alexey Brodkin e0183f5230 ARC: [axs101] Prepare for AXS103
To avoid duplicating the MB DTS file, move the MB intc entry into cpu
card specific file

Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:20 +05:30
Vineet Gupta a12ebe16a5 ARCv2: [nsim*hs*] Support simulation platforms for HS38x cores
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:19 +05:30
Vineet Gupta 65bfbcdfde ARCv2: All bits in place, allow ARCv2 builds
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:19 +05:30
Vineet Gupta 795f455856 ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency)
L2 cache on ARCHS processors is called SLC (System Level Cache)
For working DMA (in absence of hardware assisted IO Coherency) we need
to manage SLC explicitly when buffers transition between cpu and
controllers.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:19 +05:30
Vineet Gupta a5c8b52abe ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock
A quad core SMP build could get into hardware livelock with concurrent
LLOCK/SCOND. Workaround that by adding a PREFETCHW which is serialized by
SCU (System Coherency Unit). It brings the cache line in Exclusive state
and makes others invalidate their lines. This gives enough time for
winner to complete the LLOCK/SCOND, before others can get the line back.

The prefetchw in the ll/sc loop is not nice but this is the only
software workaround for current version of RTL.

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:18 +05:30
Vineet Gupta 04e2eee4b0 ARC: Reduce bitops lines of code using macros
No semantical changes !

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:18 +05:30
Vineet Gupta b8a0330239 ARCv2: barriers
ARCv2 based HS38 cores are weakly ordered and thus explicit barriers for
kernel proper.

SMP barrier is provided by DMB instruction which also guarantees local
barrier hence used as backend of smp_*mb() as well as *mb() APIs

Also hookup barriers into MMIO accessors to avoid ordering issues in IO

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:17 +05:30
Vineet Gupta 2576c28e3f ARC: add smp barriers around atomics per Documentation/atomic_ops.txt
- arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers
   Since ARCv2 only provides load/load, store/store and all/all, we need
   the full barrier

 - LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified
   values were lacking the explicit smp barriers.

 - Non LLOCK/SCOND varaints don't need the explicit barriers since that
   is implicity provided by the spin locks used to implement the
   critical section (the spin lock barriers in turn are also fixed in
   this commit as explained above

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:16 +05:30
Vineet Gupta d57f727264 ARC: add compiler barrier to LLSC based cmpxchg
When auditing cmpxchg call sites, Chuck noted that gcc was optimizing
away some of the desired LDs.

|	do {
|		new = old = *ipi_data_ptr;
|		new |= 1U << msg;
|	} while (cmpxchg(ipi_data_ptr, old, new) != old);

was generating to below

| 8015cef8:	ld         r2,[r4,0]  <-- First LD
| 8015cefc:	bset       r1,r2,r1
|
| 8015cf00:	llock      r3,[r4]  <-- atomic op
| 8015cf04:	brne       r3,r2,8015cf10
| 8015cf08:	scond      r1,[r4]
| 8015cf0c:	bnz        8015cf00
|
| 8015cf10:	brne       r3,r2,8015cf00  <-- Branch doesn't go to orig LD

Although this was fixed by adding a ACCESS_ONCE in this call site, it
seems safer (for now at least) to add compiler barrier to LLSC based
cmpxchg

Reported-by: Chuck Jordan <cjordan@synopsys,com>
Cc: <stable@vger.kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 05:59:23 +05:30
Miklos Szeredi 9bf39ab2ad vfs: add file_path() helper
Turn
	d_path(&file->f_path, ...);
into
	file_path(file, ...);

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-06-23 18:00:05 -04:00
Linus Torvalds d70b3ef54c Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar:
 "There were so many changes in the x86/asm, x86/apic and x86/mm topics
  in this cycle that the topical separation of -tip broke down somewhat -
  so the result is a more traditional architecture pull request,
  collected into the 'x86/core' topic.

  The topics were still maintained separately as far as possible, so
  bisectability and conceptual separation should still be pretty good -
  but there were a handful of merge points to avoid excessive
  dependencies (and conflicts) that would have been poorly tested in the
  end.

  The next cycle will hopefully be much more quiet (or at least will
  have fewer dependencies).

  The main changes in this cycle were:

   * x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
     Gleixner)

     - This is the second and most intrusive part of changes to the x86
       interrupt handling - full conversion to hierarchical interrupt
       domains:

          [IOAPIC domain]   -----
                                 |
          [MSI domain]      --------[Remapping domain] ----- [ Vector domain ]
                                 |   (optional)          |
          [HPET MSI domain] -----                        |
                                                         |
          [DMAR domain]     -----------------------------
                                                         |
          [Legacy domain]   -----------------------------

       This now reflects the actual hardware and allowed us to distangle
       the domain specific code from the underlying parent domain, which
       can be optional in the case of interrupt remapping.  It's a clear
       separation of functionality and removes quite some duct tape
       constructs which plugged the remap code between ioapic/msi/hpet
       and the vector management.

     - Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
       injection into guests (Feng Wu)

   * x86/asm changes:

     - Tons of cleanups and small speedups, micro-optimizations.  This
       is in preparation to move a good chunk of the low level entry
       code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
       Brian Gerst)

     - Moved all system entry related code to a new home under
       arch/x86/entry/ (Ingo Molnar)

     - Removal of the fragile and ugly CFI dwarf debuginfo annotations.
       Conversion to C will reintroduce many of them - but meanwhile
       they are only getting in the way, and the upstream kernel does
       not rely on them (Ingo Molnar)

     - NOP handling refinements. (Borislav Petkov)

   * x86/mm changes:

     - Big PAT and MTRR rework: making the code more robust and
       preparing to phase out exposing direct MTRR interfaces to drivers -
       in favor of using PAT driven interfaces (Toshi Kani, Luis R
       Rodriguez, Borislav Petkov)

     - New ioremap_wt()/set_memory_wt() interfaces to support
       Write-Through cached memory mappings.  This is especially
       important for good performance on NVDIMM hardware (Toshi Kani)

   * x86/ras changes:

     - Add support for deferred errors on AMD (Aravind Gopalakrishnan)

       This is an important RAS feature which adds hardware support for
       poisoned data.  That means roughly that the hardware marks data
       which it has detected as corrupted but wasn't able to correct, as
       poisoned data and raises an APIC interrupt to signal that in the
       form of a deferred error.  It is the OS's responsibility then to
       take proper recovery action and thus prolonge system lifetime as
       far as possible.

     - Add support for Intel "Local MCE"s: upcoming CPUs will support
       CPU-local MCE interrupts, as opposed to the traditional system-
       wide broadcasted MCE interrupts (Ashok Raj)

     - Misc cleanups (Borislav Petkov)

   * x86/platform changes:

     - Intel Atom SoC updates

  ... and lots of other cleanups, fixlets and other changes - see the
  shortlog and the Git log for details"

* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
  x86/hpet: Use proper hpet device number for MSI allocation
  x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
  x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
  x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
  x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
  genirq: Prevent crash in irq_move_irq()
  genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
  iommu, x86: Properly handle posted interrupts for IOMMU hotplug
  iommu, x86: Provide irq_remapping_cap() interface
  iommu, x86: Setup Posted-Interrupts capability for Intel iommu
  iommu, x86: Add cap_pi_support() to detect VT-d PI capability
  iommu, x86: Avoid migrating VT-d posted interrupts
  iommu, x86: Save the mode (posted or remapped) of an IRTE
  iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
  iommu: dmar: Provide helper to copy shared irte fields
  iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
  iommu: Add new member capability to struct irq_remap_ops
  x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
  x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
  x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
  ...
2015-06-22 17:59:09 -07:00
Vineet Gupta eaf0ecc33f ARCv2: SMP: intc: IDU 2nd level intc for dynamic IRQ distribution
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:57 +05:30
Vineet Gupta 72d7288061 ARCv2: SMP: clocksource: Enable Global Real Time counter
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:57 +05:30
Vineet Gupta aa6083ed50 ARCv2: SMP: ARConnect debug/robustness
- Handle possible interrupt coalescing from MCIP
- chk if prev IPI ack before sending new

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:57 +05:30
Vineet Gupta 82fea5a1bb ARCv2: SMP: Support ARConnect (MCIP) for Inter-Core-Interrupts et al
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Vineet Gupta 173eaafaed ARC: make plat_smp_ops weak to allow over-rides
This allows platforms to provide their own cpu wakeup routines
as well as IPI send / clear backends, while allowing a SMP kernel w/o
any such backend to build/boot

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Vineet Gupta aa93e8ef98 ARCv2: clocksource: Introduce 64bit local RTC counter
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Vineet Gupta 8922bc3058 ARCv2: Adhere to Zero Delay loop restriction
Branch insn can't be scheduled as last insn of Zero Overhead loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Claudiu Zissulescu 1f7e3dc0ba ARCv2: optimised string/mem lib routines
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Vineet Gupta bcc4d65abe ARCv2: MMUv4: support aliasing icache config
This is also default for AXS103 release

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Vineet Gupta d1f317d825 ARCv2: MMUv4: cache programming model changes
Caveats about cache flush on ARCv2 based cores

- dcache is PIPT so paddr is sufficient for cache maintenance ops (no
  need to setup PTAG reg

- icache is still VIPT but only aliasing configs need PTAG setup

So basically this is departure from MMU-v3 which always need vaddr in
line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG,
IC_PTAG respectively.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:55 +05:30
Vineet Gupta d7a512bfe0 ARCv2: MMUv4: TLB programming Model changes
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:55 +05:30
Vineet Gupta 4de0e52867 ARCv2: STAR 9000814690: Really Re-enable interrupts to avoid deadlocks
The issue was, on HS when interrupt is taken, IRQ_ACT is set and that is
NOT cleared unless we do RTIE (or manually clear it). Linux interrupt
handling has top and bottom halves. Latter lead to softirqs (which can
reschedule) AND expect interrupts to be REALLY re-enabled which was NOT
happening for us since we only SETI, dont clear IRQ_ACT

So we can have a state when both cores have taken interrupt (IRQ_ACT set),
get rescheduled, both send IPI and wait in CSD lock which will never be
cleared as cores can't take the pending IPI IRQ due to existing IRQ_ACT
set.

So local_irq_enable() now drops the IRQ_ACT.act bit to re-enable IRQs.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:55 +05:30
Vineet Gupta 0d7b8855a0 ARCv2: STAR 9000808988: signals involving Delay Slot
Reported by Anton as LTP:munmap01 failing with Illegal Instruction
Exception.

   --------------------->8--------------------------------------
   mmap2(NULL, 24576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x200d2000
   munmap(0x200d2000, 24576)               = 0
   --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x200d2000}
   ---
   potentially unexpected fatal signal 4.
   Path: /munmap01
   CPU: 0 PID: 61 Comm: munmap01 Not tainted 3.13.0-g5d5c46d9a556 #8
   task: 9f1a8000 ti: 9f154000 task.ti: 9f154000

   [ECR   ]: 0x00020100 => Illegal Insn
   [EFA   ]: 0x0001354c
   [BLINK ]: 0x200515d4
   [ERET  ]: 0x1354c
       @off 0x1354c in [/munmap01]
       VMA: 0x00010000 to 0x00018000
   [STAT32]: 0x800802c0
   ...
   --------------------->8--------------------------------------

The issue was
1. munmap01 accessed unmapped memory (on purpose) with signal handler
   installed for SIGSEGV

2. The faulting instruction happened to be in Delay Slot
   00011864 <main>:
      11908:	bl.d       13284 <tst_resm>
      1190c:	stb        r16,[r2]

3. kernel sets up the reg file for signal handler and correctly clears
   the DE bit in pt_regs->status32 placeholder

4. However RESTORE_CALLEE_SAVED_USER macro is not adjusted for ARCv2,
   and it over-writes the above with orig/stale value of status32

5. After RTIE, userspace signal handler executes a non branch
   instruction with DE bit set, triggering Illegal Instruction Exception.

Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:55 +05:30