rC352620 caused regressions because it copied the floating point format from
the aux target.
The floating point format decides whether extended long double is supported.
It is x86_fp80 on x86 but IEEE double on amdgcn.
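A minimal host-side sketch of why the format matters: the number of significand bits in `long double` differs between targets, so a value copied from another target can be wrong. The helper names below are hypothetical and only query the compiler that builds the snippet.

```cpp
#include <limits>

// Number of significand bits in this compiler's long double.
// Typically 64 where long double is the x87 80-bit format and 53 where it
// shares the IEEE double format; other targets may report other values.
constexpr int longDoubleSignificandBits() {
  return std::numeric_limits<long double>::digits;
}

// True when long double carries more precision than double on this target.
constexpr bool longDoubleIsWiderThanDouble() {
  return std::numeric_limits<long double>::digits >
         std::numeric_limits<double>::digits;
}
```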
Document usage of long double type in HIP programming guide
https://github.com/ROCm-Developer-Tools/HIP/pull/890
Differential Revision: https://reviews.llvm.org/D57527
llvm-svn: 352801
This reverts commit f47d6b38c7 (r352791).
Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.
llvm-svn: 352800
Instead of calling CUDA runtime to arrange function arguments,
the new API constructs arguments in a local array and the kernels
are launched with __cudaLaunchKernel().
The old API has been deprecated and is expected to go away
in the next CUDA release.
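A host-only sketch of the argument-array idea, not the code clang emits: the caller collects pointers to each argument in a local array and hands that array to a single launch entry point. `fakeLaunch`, `addKernel` and `launchAdd` are hypothetical stand-ins; no CUDA runtime is involved.

```cpp
// Hypothetical kernel signature: receives an array of pointers to arguments.
using KernelFn = void (*)(void **args);

// Hypothetical stand-in for a launch entry point; it simply calls the kernel.
void fakeLaunch(KernelFn kernel, void **args) { kernel(args); }

// Emulated kernel: unpacks its three arguments from the pointer array.
void addKernel(void **args) {
  int a = *static_cast<int *>(args[0]);
  int b = *static_cast<int *>(args[1]);
  int *out = *static_cast<int **>(args[2]);
  *out = a + b;
}

// Emulated host stub: builds the local argument array, then launches.
int launchAdd(int a, int b) {
  int result = 0;
  int *out = &result;
  void *args[] = {&a, &b, &out};
  fakeLaunch(addKernel, args);
  return result;
}
```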
Differential Revision: https://reviews.llvm.org/D57488
llvm-svn: 352799
..and use it to control the parts of CUDA compilation
that depend on the specific version of the CUDA SDK.
This patch has a placeholder for 'new launch API' support,
which is in a separate patch. The list will be further
extended in the upcoming patch to support CUDA-10.1.
Differential Revision: https://reviews.llvm.org/D57487
llvm-svn: 352798
This patch fixes PR39098.
For the attached test case, CombineZExtLogicopShiftLoad can optimize it to
t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
t58: i64 = srl t57, Constant:i8<1>
t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64
But later visitANDLike transforms it to
t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
t61: i32 = truncate t57
t63: i32 = srl t61, Constant:i8<1>
t64: i32 = and t63, Constant:i32<524287>
t65: i64 = zero_extend t64
t58: i64 = srl t57, Constant:i8<1>
t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64
This triggers CombineZExtLogicopShiftLoad again, causing an infinite loop.
Both forms should generate the same instructions, and the IR generated by CombineZExtLogicopShiftLoad looks cleaner. However, it looks more difficult to prevent visitANDLike from doing the transform, so I prevent CombineZExtLogicopShiftLoad from doing the transform if the ZExt is free.
Differential Revision: https://reviews.llvm.org/D57491
llvm-svn: 352792
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaration with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352791
CMake 3.6 introduced CMAKE_TRY_COMPILE_PLATFORM_VARIABLES, which solves
precisely the problem that necessitated init_user_prop, so we can switch
over whenever we bump our minimum CMake requirement.
llvm-svn: 352790
Summary:
Use RawPrint instead of Printf for the instrumentation warning because
Printf doesn't work on Windows when instrumentation is being
initialized (since OutputFile is not yet initialized).
Reviewers: kcc
Reviewed By: kcc
Differential Revision: https://reviews.llvm.org/D57531
llvm-svn: 352789
Preferred types are used by code completion for ranking. This commit
considerably increases the number of points in code where those types
are propagated.
In order to avoid complicating signatures of Parser's methods, a
preferred type is kept as a member variable in the parser and updated
during parsing.
Differential revision: https://reviews.llvm.org/D56723
llvm-svn: 352788
Summary:
EarlyCSE needs to optimize MemoryPhis after an access is removed and has
special handling for it. This should be handled by MemorySSA instead.
The default remains that MemoryPhis are *not* optimized after an access
is removed.
Reviewers: george.burgess.iv
Subscribers: sanjoy, jlebar, llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D57199
llvm-svn: 352787
Summary:
Like with X86, this allows better DAG-level alias analysis and
alignment inference for wrapped addresses.
Reviewers: jonpa, uweigand
Reviewed By: uweigand
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D57407
llvm-svn: 352786
Summary:
Previously, llvm-nm would report symbols for .debug and .note sections as: '?' with an empty section name:
```
00000000 ?
00000000 ?
...
```
With this patch the output more closely resembles GNU nm:
```
00000000 N .debug_abbrev
00000000 n .note.GNU-stack
...
```
This patch calls `getSectionName` for sections that belong to symbols of type `ELF::STT_SECTION`, which returns the name of the section from the section string table.
Reviewers: Bigcheese, davide, jhenderson
Reviewed By: davide, jhenderson
Subscribers: rupprecht, jhenderson, llvm-commits
Differential Revision: https://reviews.llvm.org/D57105
llvm-svn: 352785
While dangling nodes will eventually be pruned when they are
considered, leaving them disables combines requiring single-use.
Reviewers: Carrot, spatel, craig.topper, RKSimon, efriedma
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D57520
llvm-svn: 352784
For zero scale SMULFIX, expand into MUL, which produces better code for X86.
For vector arguments, expand into MUL if SMULFIX is provided with a zero scale.
Otherwise, expand into MULH[US] or [US]MUL_LOHI.
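A rough scalar sketch of why a zero scale reduces to a plain multiply: a fixed-point multiply computes the wide product and shifts it right by the scale, so a scale of zero leaves the product untouched. `smulFixSketch` is a hypothetical helper, and the arithmetic right shift for negative values is an assumption about the compiler.

```cpp
#include <cstdint>

// Signed fixed-point multiply: both operands carry `scale` fractional bits,
// so the 64-bit product is shifted right by `scale` to restore the format.
// Assumes an arithmetic right shift for negative products.
int32_t smulFixSketch(int32_t a, int32_t b, unsigned scale) {
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  return static_cast<int32_t>(wide >> scale);
}
```

With a scale of 4, the values 48 and 32 represent 3.0 and 2.0, and the result 96 represents 6.0.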
Differential Revision: https://reviews.llvm.org/D56987
llvm-svn: 352783
The test was using ASSERT_EQ instead of ASSERT_STREQ, which meant we were
comparing string addresses instead of the actual strings. This caused the
test to fail with the sanitizers enabled.
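The difference between the two comparisons can be sketched without a test framework. The helpers below are hypothetical: one compares `const char *` values the way a plain equality check does, the other compares the characters.

```cpp
#include <cstring>

// Pointer comparison: true only when both point at the same storage.
bool sameAddress(const char *a, const char *b) { return a == b; }

// Content comparison: true when both C strings hold the same characters.
bool sameContents(const char *a, const char *b) {
  return std::strcmp(a, b) == 0;
}
```

Two separate buffers that both hold "foo" have equal contents but different addresses, so only the second check passes reliably.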
llvm-svn: 352780
This ensures that if we make it to the backend without lowering widenable_conditions first, we generate correct code. Doing it in CGP, instead of isel, lets us fold control flow before hitting block-local instruction selection.
Differential Revision: https://reviews.llvm.org/D57473
llvm-svn: 352779
The original commit of this function (r129800 in 2011) had a typo where
part of the "Micro" version check was actually comparing against the "Minor"
version number.
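A sketch of this class of bug, using a hypothetical `Version` struct rather than the actual function: the buggy variant compares the wrong component in the last step, while the fixed variant compares component by component.

```cpp
struct Version {
  unsigned Major, Minor, Micro;
};

// Correct lexicographic "less than" over (Major, Minor, Micro).
bool isVersionLT(const Version &V, unsigned Major, unsigned Minor,
                 unsigned Micro) {
  if (V.Major != Major) return V.Major < Major;
  if (V.Minor != Minor) return V.Minor < Minor;
  return V.Micro < Micro;
}

// Buggy shape: the final step compares the minor component against Micro.
bool isVersionLTBuggy(const Version &V, unsigned Major, unsigned Minor,
                      unsigned Micro) {
  if (V.Major != Major) return V.Major < Major;
  if (V.Minor != Minor) return V.Minor < Minor;
  if (V.Micro != Micro) return V.Minor < Micro; // wrong field on the left
  return false;
}
```

For version 10.4.2 checked against 10.4.3, the correct variant reports "older" while the buggy one compares 4 with 3 and reports "not older".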
llvm-svn: 352776
Similar to what we already do in DAGCombiner, but this version also handles bitcasts from types with different scalar sizes, which x86 is better at handling.
Differential Revision: https://reviews.llvm.org/D57514
llvm-svn: 352773
Summary:
The method find_matching_slice(self) uses uuid_str on one of the paths but the variable does not exist and so this results in a NameError exception if we take that path.
Differential Revision: https://reviews.llvm.org/D57467
llvm-svn: 352772
Summary:
Include the symbol being defined in the list of requirements for using --localize-symbol.
This is used, for example, when someone is depending on two different projects that have the same (or close enough) method defined in each library, and using "-L sym" for a conflicting symbol in one of the libraries so that the definition from the other one is used. However, the library may have internal references to the symbol, which cause program crashes when those are used, e.g.:
```
$ cat foo.c
int foo() { return 5; }
$ cat bar.c
int foo();
int bar() { return 2 * foo(); }
$ cat foo2.c
int foo() { /* Safer implementation */ return 42; }
$ cat main.c
int bar();
int main() {
__builtin_printf("bar = %d\n", bar());
return 0;
}
$ ar rcs libfoo.a foo.o bar.o
$ ar rcs libfoo2.a foo2.o
# Picks the wrong foo() impl
$ clang main.o -lfoo -lfoo2 -L. -o main
# Picks the right foo() impl
$ objcopy -L foo libfoo.a && clang main.o -lfoo -lfoo2 -L. -o main
# Links somehow, but crashes at runtime
$ llvm-objcopy -L foo libfoo.a && clang main.o -lfoo -lfoo2 -L. -o main
```
Reviewers: jhenderson, alexshap, jakehehrlich, espindola
Subscribers: emaste, arichardson
Differential Revision: https://reviews.llvm.org/D57417
llvm-svn: 352767
This is the most important uaddo problem mentioned in PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
We were failing to match the canonicalized pattern when it's an 'add 1' operation.
Pattern matching, however, shouldn't assume that we have canonicalized IR, so we
match 4 commuted variants of uaddo.
There's also a test with a crazy type to show that the existing CGP transform
based on this matcher is not limited by target legality checks, but that's a
different problem.
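A scalar sketch of the patterns involved, under the assumption that the canonical 'add 1' form is an equality compare of the incremented value against zero; the message does not spell out the four variants, so the spellings below are illustrative only.

```cpp
#include <cstdint>

// General unsigned-add overflow check: the wrapped sum is below an operand.
bool addOverflows(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b) < a;
}

// Increment special case: a + 1 wraps exactly when the result is zero.
// Four commuted spellings of the same check.
bool incOverflows1(uint32_t a) { return static_cast<uint32_t>(a + 1u) == 0u; }
bool incOverflows2(uint32_t a) { return static_cast<uint32_t>(1u + a) == 0u; }
bool incOverflows3(uint32_t a) { return 0u == static_cast<uint32_t>(a + 1u); }
bool incOverflows4(uint32_t a) { return 0u == static_cast<uint32_t>(1u + a); }
```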
Differential Revision: https://reviews.llvm.org/D57516
llvm-svn: 352766
cl.exe and clang-cl.exe put vftables in a 'discard' comdat when building with
RTTI disabled (/GR-) but in a 'largest' comdat when building with RTTI enabled.
To be able to link /GR- code with /GR code, lld-link needs to accept comdats
that have this type of comdat selection conflict.
For example, static libraries in the Visual Studio standard library are built
with /GR, and without this it's impossible to build client code with /GR- and
still link to the standard library.
link.exe also accepts merging 'discard' with 'largest', and it accepts merging
'largest' with any other selection type. lld-link is still a bit stricter since
it only allows merging 'largest' with 'discard' for symmetry.
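The merge rule described above can be modelled roughly as follows. The enum and function are hypothetical and are not lld's data structures; treating identical selection types as compatible is an assumption of this sketch.

```cpp
// Hypothetical model of comdat selection types.
enum class Selection { Discard, Largest, Other };

// Identical selections are assumed compatible; the only mismatch accepted
// is 'largest' paired with 'discard', in either order.
bool canMergeSketch(Selection a, Selection b) {
  if (a == b) return true;
  return (a == Selection::Largest && b == Selection::Discard) ||
         (a == Selection::Discard && b == Selection::Largest);
}
```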
Differential Revision: https://reviews.llvm.org/D57515
llvm-svn: 352765
Summary:
This would make diagnostic fixits more discoverable, especially for
plugins like YCM.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D57509
llvm-svn: 352764
Summary:
COFF requires that the COMDAT name match that of the leader. When we promote
and rename an internal leader in ThinLTO due to an import, ensure we
subsequently rename the associated COMDAT. Similar to D31963 which did
this during ThinLTO module splitting.
Fixes PR40414.
Reviewers: pcc, inglorion
Subscribers: mehdi_amini, dexonsmith, dmajor, llvm-commits
Differential Revision: https://reviews.llvm.org/D57395
llvm-svn: 352763
This is the fourth (and final for now) of a series of patches
simplifying llvm-symbolizer tests. See r352752, r352753 and r352754 for
the previous ones. This patch splits out several more distinct test
cases from llvm-symbolizer.test into separate tests, and simplifies them
in various ways including:
1) Building a test case for spaces in path from source, rather than
using a pre-canned binary. This allows deleting said binary and the
source it was built from.
2) Switching to specifying addresses and objects directly on the
command-line rather than via stdin.
This also adds an explicit test for the ability to specify a file and
address as a line in stdin, since the majority of the tests have been
migrated away from this approach, leaving this largely untested.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D57446
llvm-svn: 352756
This is the third of a series of patches simplifying llvm-symbolizer
tests. See r352752 and r352753 for the previous two. This patch splits
out a number of distinct test cases from llvm-symbolizer.test into
separate tests, and simplifies them in various ways including:
1) using --obj/positional arguments for the input file and addresses
instead of stdin,
2) using runtime-generated inputs rather than a pre-canned binary, and
3) testing more specifically (i.e. checking only what is interesting to
the behaviour changed in the original commit for that test case).
This patch also removes the test case for using --obj. The
tools/llvm-symbolizer/basic.s test already tests this case. Finally,
this patch adds a simple test case to the demangle switch test case to
show that demangling happens by default.
See https://bugs.llvm.org/show_bug.cgi?id=40070#c1 for the motivation.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D57446
llvm-svn: 352754
This is the second of a series of patches simplifying llvm-symbolizer
tests. See r352752 for the first. This one splits out 5 distinct test
cases from llvm-symbolizer.test into separate tests, and simplifies them
slightly by using --obj/positional arguments for the input file and
addresses instead of stdin.
See https://bugs.llvm.org/show_bug.cgi?id=40070#c1 for the motivation.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D57443
llvm-svn: 352753
This change migrates most llvm-symbolizer tests away from reading input
via stdin and instead uses --obj + positional arguments for the file
and addresses respectively, which makes the tests easier to read.
One exception is the test test/tools/llvm-symbolizer/pdb/pdb.test, which
was doing some manipulation on the input addresses. This patch
simplifies this somewhat, but it still reads from stdin.
More changes to follow to simplify/break-up other tests.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D57441
llvm-svn: 352752