If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed
and use them directly.
If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the progloue.
This also only selectively enables all of the input registers
which are really required instead of always enabling them.
llvm-svn: 254331
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.
The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.
Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.
The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.
llvm-svn: 254329
Acc==MA implies Acc->getAccessInstruction() == MA->getAccessInstruction().
Suggested as post-commit review for 254305 by Michael Kruse.
llvm-svn: 254327
Function types can be extracted from member pointer types.
However, the type is not appropriate without first adjusting the calling
convention.
This fixes PR25661.
llvm-svn: 254323
Fix for crash in the teams construct in case user sets OMP_THREAD_LIMIT to a
number less than the number of processors. Now the number of threads will be
silently reduced if the user didn't specify teams parameters or with a
warning if the user specified teams parameters conflicting with
OMP_THREAD_LIMIT.
Differential Revision: http://reviews.llvm.org/D14732
llvm-svn: 254322
The task_team pointer is dereferenced unconditionally which causes a SEGFAULT
when it is NULL (e.g. for serialized parallel, that can happen for "teams"
construct or for "target nowait"). The solution is to skip second task team
setup for single thread team.
Differential Revision: http://reviews.llvm.org/D14729
llvm-svn: 254321
These changes allow libhwloc to be used as the topology discovery/affinity
mechanism for libomp. It is supported on Unices. The code additions:
* Canonicalize KMP_CPU_* interface macros so bitmask operations are
implementation independent and work with both hwloc bitmaps and libomp
bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and
the like. These are all in kmp.h and appropriately placed.
* Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc
interface to create a libomp address2os object which the rest of libomp knows
how to handle already.
* To build, use -DLIBOMP_USE_HWLOC=on and
-DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake
can't find the library or hwloc.h, then it will tell you and exit.
Differential Revision: http://reviews.llvm.org/D13991
llvm-svn: 254320
This patch complete removed SANITIZER_AARCH64_VMA definition and usage.
AArch64 ports now supports runtime VMA detection and instrumentation
for 39 and 42-bit VMA.
It also Rewrite print_address to take a variadic argument list
(the addresses to print) and adjust the tests which uses it to the new
signature.
llvm-svn: 254319
The MachineVerifier wants to check that the register operands of an
instruction belong to the instruction's register class. RIP-relative
control flow instructions violated this by referencing RIP. While this
was fixed for SysV, it was never fixed for Win64.
llvm-svn: 254315
Re-enable shrink wrapping for PPC64 Little Endian.
One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly.
Tested with all LLVM test, clang tests, and the self-hosting build, with no problems found.
PHabricator: http://reviews.llvm.org/D14778
llvm-svn: 254314
Splitted writeTo to separate tls relocs handling stuff which is too long for one method now. NFC.
Differential revision: http://reviews.llvm.org/D15012
llvm-svn: 254309
The use of C++'s high-level iterator functionality instead of two while loops
and explicit iterator handling improves readability of this code.
Proposed-by: Michael Kruse <llvm@meinersbur.de>
Differential Revision: http://reviews.llvm.org/D15068
llvm-svn: 254305
On OS X, in AtosSymbolizer, if the answer from atos doesn't contain module name, let's bail and return false. There are some cases where this is beneficial, because we'll try DlAddrSymbolizer next (it's next in the symbolizer chain), which might be able to symbolize something that atos couldn't.
Differential Revision: http://reviews.llvm.org/D15071
llvm-svn: 254301
Changing comments that have references to code.google.com to point to GitHub instead, because the current links are not redirected properly (they instead redirect to different issues, mostly ASan). NFC.
Differential Revision: http://reviews.llvm.org/D15053
llvm-svn: 254300
1) There's a few wrongly defined things in tsan_interceptors.cc,
2) a typo in tsan_rtl_amd64.S which calls setjmp instead of sigsetjmp in the interceptor, and
3) on OS X, accessing an mprotected page results in a SIGBUS (and not SIGSEGV).
Differential Revision: http://reviews.llvm.org/D15052
llvm-svn: 254299
On OS X, for weak function (that user can override by providing their own implementation in the main binary), we need extern `"C" SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE NOINLINE`.
Fixes a broken test case on OS X, java_symbolization.cc, which uses a weak function __tsan_symbolize_external.
Differential Revision: http://reviews.llvm.org/D14907
llvm-svn: 254298
Value of offset operand for microMIPS BALC and BC instructions is currently shifted 2 bits, but it should be 1 bit.
Differential Revision: http://reviews.llvm.org/D14770
llvm-svn: 254296
This patch adds functionality for dumping allocations of struct elements. This involves:
+ Jitting the runtime for details on all the struct fields.
+ Finding the name of the struct type by looking for a global variable of the same type, which will have been reflected back to the java host code.
+ Using this struct type name to pass into expression evaluation for pretty printing the data for the dump command.
llvm-svn: 254294
valid-xfail.s is for instructions that should be valid in the given ISA but
incorrectly fail. MSA instructions are correct to fail since MSA is not enabled.
llvm-svn: 254293
build errors on ARM. Define it internally to avoid such errors.
Patch by Max Ostapenko.
Differential Revision: http://reviews.llvm.org/D14921
llvm-svn: 254292