Support for CUDA printf is exploited to support printf for
an NVPTX OpenMP device.
To reflect the support of both programming models, the file
CGCUDABuiltin.cpp has been renamed to CGGPUBuiltin.cpp, and
the call EmitCUDADevicePrintfCallExpr has been renamed to
EmitGPUDevicePrintfCallExpr.
Reviewers: jlebar
Differential Revision: https://reviews.llvm.org/D17890
llvm-svn: 293444
Add "OPTIONAL" comment to declaration of weak function in the internal
interface. This fix the tests `interface_symbols_linux.c` and
`interface_symbols_darwin.c` which were failing after r293423.
llvm-svn: 293442
The primary use of the dump() functions in LLVM is for use in a
debugger. Unfortunately lldb does not seem to handle default arguments
so using `p SomeMI.dump()` fails and you have to type the longer `p
SomeMI.dump(nullptr)`. Remove the paramter to make the most common use
easy. (You can always construct something like `p
SomeMI.print(dbgs(),MyTII)` if you need more features).
Differential Revision: https://reviews.llvm.org/D29241
llvm-svn: 293440
This causes unnecessary warnings when building with `cl`. Newer
versions of the C standard permit the redefinition of the macro to the
same value (which is the case here), unfortunately, `cl` does not yet
implement this. Add a check to prevent the redefinition.
llvm-svn: 293439
Replaces an xor+movd/movq with an xorps which will be shorter in codesize, avoid an int-fpu transfer, allow modern cores to fast path the result during decode and helps other combines recognise an all-zero vector.
The only reason I can think of that we'd want to keep scalar_to_vector in this case is to help recognise the upper elts are undef but this doesn't seem to be a problem.
Differential Revision: https://reviews.llvm.org/D29097
llvm-svn: 293438
While this probably should be considered a dump debugger utility, the C
API currently has no other ways to print a module to stderr for error
reporting purposes, so keep it even in release builds.
llvm-svn: 293436
Support lowering AEABI TLS access (__aeabi_read_tp) with long calls.
This requires adjusting the call sequence to use an indirect call to get
full addressability.
Resolves PR31769!
llvm-svn: 293433
PACKUSWB converts Signed word to Unsigned byte, (the same about DW) and it can't be used for umin+truncate pattern.
AVX-512 VPMOVUS* instructions fit the pattern since they convert Unsigned to Unsigned.
See https://llvm.org/bugs/show_bug.cgi?id=31773
Differential Revision: https://reviews.llvm.org/D29196
llvm-svn: 293431
Add a simple example to update the documentation on how the packing
transformation is implemented.
Reviewed-by: Tobias Grosser <tobias@grosser.es>,
Michael Kruse <llvm@meinersbur.de>
Differential Revision: https://reviews.llvm.org/D28021
llvm-svn: 293429
formatting that has evolved here over the past years prior to making
somewhat invasive changes to thread new PM support through the business
logic.
Differential Revision: https://reviews.llvm.org/D29248
llvm-svn: 293425
This arranges the static helpers in an order where they are defined
prior to their use to avoid the need of forward declarations, and
collect the core pass components at the bottom below their helpers.
This also folds one trivial function into the pass itself. Factoring
this 'runImpl' was an attempt to help porting to the new pass manager,
however in my attempt to begin this port in earnest it turned out to not
be a substantial help. I think it will be easier to factor things
without it.
This is an NFC change and does a minimal amount of edits over all.
Subsequent NFC cleanups will normalize the formatting with clang-format
and improve the basic doxygen commenting.
Differential Revision: https://reviews.llvm.org/D29247
llvm-svn: 293424
Update the headers, so we can change the dllexports to dllimport when
defining SANITIZER_IMPORT_INTERFACE.
Differential Revision: https://reviews.llvm.org/D29052
llvm-svn: 293422
I modify clang driver for windows to include:
"-wholearchive:asan_dynamic_runtime_thunk", so all object files in the
static library: asan_dynamic_runtime_thunk are considered by the linker.
This is necessary, because some object files only include linker pragmas,
and doesn't resolve any symbol. If we don't include that flag, the
linker will ignore them, and won't read the linker pragmas.
Differential Revision: https://reviews.llvm.org/D29159
llvm-svn: 293420
In this diff, I define a general macro for defining weak functions
with a default implementation: "SANITIZER_INTERFACE_WEAK_DEF()".
This way, we simplify the implementation for different platforms.
For example, we cannot define weak functions on Windows, but we can
use linker pragmas to create an alias to a default implementation.
All of these implementation details are hidden in the new macro.
Also, as I modify the name for exported weak symbols on Windows, I
needed to temporarily disable "dll_host" test for asan, which checks
the list of functions included in asan_win_dll_thunk.
Differential Revision: https://reviews.llvm.org/D28596
llvm-svn: 293419
Summary:
Adds the following instructions:
* mfpmr
* mtpmr
* icblc
* icblq
* icbtls
Fix the scheduling for mtspr on e5500, which uses CFX0, instead of
SFX0/SFX1 as on e500mc.
Addresses PR 31538.
Differential Revision: https://reviews.llvm.org/D29002
llvm-svn: 293417
Oops... r293393 started calling ReadSignature in
ModuleManager::addModule even when there was no ExpectedSignature.
Whether or not this would have a measurable performance impact (I
spotted this by inspection, and ReadSignature should be fairly fast), we
might as well get what we can. Add an extra check against
ExpectedSignature to avoid the hit.
llvm-svn: 293415
The type system already requires that the number of vector elements must fit in 32-bits so an index should as well. Even if the type of the index were larger all we care about is that the constant index can fit in 64-bits so that we can call getZExtValue.
llvm-svn: 293413
Most flags were already initialized by the TargetOptions constructor but
we missed out on one. Also, simplify the constructor by using field
initializers when possible.
llvm-svn: 293406
appendCallAsync, which all RPC call functions ultimately build on, will call
abandonAllPendingResponses on channel error. These extra calls are redundant.
llvm-svn: 293405
Zero-initialize ModuleFile members directly in the class definition, and
move the (now uninteresting) constructor into the class definition.
There were a few members that weren't being initialized at all. I
zero-initialized them for consistency, but it's likely that they were
somehow initialized before their first use; i.e., there's likely no
functionality change here.
llvm-svn: 293404
The matching code tries to canonicalize XOR to the left, but if there are two XORs and only one is a vnot, this canonicalization can prevent matching.
llvm-svn: 293402
Invert the main branch in ModuleManager::addModule to return early and
reduce indentation, and clean up a bunch of logic as a result. I split
out a function called updateModuleImports to avoid triggering code
duplication.
llvm-svn: 293400
I don't have a testcase for this (and I'm not sure if it's an observable
bug), but it seems obviously wrong that ModuleManager::removeModules is
failing to clean up deleted modules from ModuleFile::Imports. See the
code in ModuleManager::addModule that inserts into ModuleFile::Imports;
we need the inverse operation.
llvm-svn: 293399
ModuleManager::removeModules always deletes a tail of the
ModuleManager::Chain. Change the API to enforce that so that we can
simplify the code inside.
There's no real functionality change, although there's a slight
performance hack to loop to the First deleted module instead of the
final module in the chain (skipping the about-to-be-deleted tail).
Also document something suspicious: we fail to clean deleted modules out
of ModuleFile::Imports.
llvm-svn: 293398