We have no tests for OS/ABI values specific to
EM_TI_C6000, ELFOSABI_AMDGPU_MESA3D and ELFOSABI_ARM machines.
Also, related arrays in the code are not grouped together.
(That is why such testing was missed I guess).
The patch fixes that all.
Differential revision: https://reviews.llvm.org/D86341
If the basic block of the instruction passed to getUniqueReachingMIDef
is a transitive predecessor of itself and has a definition of the
register, the function will return that definition even if it is after
the instruction given to the function. This patch stops the function
from scanning the instruction's basic block to prevent this.
Differential Revision: https://reviews.llvm.org/D86607
Currently, the assembly functions for restoring register state have
been direct implementations of the Registers_*::jumpto() method
(contrary to the functions for saving register state, which are
implementations of the extern C function __unw_getcontext). This has
included having the assembly function name match the C++ mangling of
that method name (and having the function match the C++ member
function calling convention). To simplify the interface of the assembly
implementations, make the functions have C calling conventions and
name mangling.
This fixes building the library in with a MSVC C++ ABI with clang-cl,
which uses a significantly different method name mangling scheme.
(The library might not be of much use as C++ exception unwinder in such
an environment, but the libunwind.h interface for stepwise unwinding
still is usable, as is the _Unwind_Backtrace function.)
Differential Revision: https://reviews.llvm.org/D86041
This removes Error.cpp/.h files from obj2yaml.
These files are not needed because we are
using `Error`s instead of error codes widely and do
not need a logic related to obj2yaml specific
error codes anymore.
I had to adjust just a few lines of tool's code
to remove remaining dependencies.
Differential revision: https://reviews.llvm.org/D86536
This fixes several issues in handling of DW_AT_const_value attributes:
- the first is that the size of the data given by data forms does not
need to match the size of the underlying variable. We already had the
case to handle this for DW_FORM_(us)data -- this extends the handling
to other data forms. The main reason this was not picked up is because
clang uses leb forms in these cases while gcc prefers the fixed-size
ones.
- The handling of DW_AT_strp form was completely broken -- we would end
up using the pointer value as the result. I've reorganized this code
so that it handles all string forms uniformly.
- In case of a completely bogus form we would crash due to
strlen(nullptr).
Depends on D86311.
Differential Revision: https://reviews.llvm.org/D86348
llvm-readobj crashes when `-S --section-symbols` is used
on an object that has no symbol table.
The patch fixes it.
Differential revision: https://reviews.llvm.org/D86520
The `sections-ext.test` is a test for ELF that is used
to test `--st`, `--sr` and `--sd` extension options for `-S`.
There are 2 problems with it:
1) It is broken, because for CHECK lines it contains there is
no corresponding `FileCheck` call.
2) It uses the precompiled object: `trivial.obj.elf-i386`.
This is the last ELF test where `trivial.obj.elf-i386` is used so we can get
rid of the binary and use an YAML description.
Also, there is a `Inputs/trivial.ll` file that describes how `trivial*` objects
in `Inputs` folders are created. I've removed it from `ELF`, because it is not
actual anymore (we have no more input binaries created with the use of trivial.ll there)
and copied the refined versions of it to `COFF`, `MachO` and `wasm` Input folders.
Differential revision: https://reviews.llvm.org/D86462
This change extend the CMake files with the necessary additions
to build LLVM for z/OS.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D83866
EXP_MSG generates a message to show on assert
failure. Currently it looks like:
AssertionError: False is not True : '<cmd>'
returns expected result, got '<actual output>'
Which seems to say that the test failed but
also got the expected result.
It should say:
AssertionError: False is not True : '<cmd>'
returned unexpected result, got '<actual output>'
Reviewed By: teemperor, #lldb
Differential Revision: https://reviews.llvm.org/D86603
pointer.
mwaitx uses EBX as one of its argument.
Using this instruction clobbers RBX as it is defined to hold one of the
input. When the backend uses dynamically allocated stack, RBX is used as
a reserved register for the base pointer.
This patch is adapted from @qcolombet patch for cmpxchg at r263325.
This fixes PR43528.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D73475
The documentation was missing a '*/' in '/*<2x32-bit> vadd {0, 64, VPR}',
and the example code are now aligned to improve readability.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86201
When optimizing the table, PointerToAnyOperandMatchers would be
incorrectly reported as identical even though they have different
SizeInBits values. This bug was due to failing to overload the
isIdentical() method, which this patch addresses.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86199
Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt-out by
setting the DisableDefaultAttributes flag to true.
Differential Revision: https://reviews.llvm.org/D70365
The switch in AArch64Operand::print was changed in D45688 so the shift
can be printed after printing the register. This is implemented with
LLVM_FALLTHROUGH and was broken in D52485 when BTIHint was put between
the register and shift operands.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D86535
This fixes an issue where the restore point of callee-saves in the
function epilogues was incorrectly calculated when the basic block
consisted of only a RET instruction. This caused dealloc instructions
to be inserted in between the block of callee-save restore instructions,
rather than before it.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D86099
Currently `strace llvm-dwarfdump x.debug >/tmp/file`:
ioctl(1, TCGETS, 0x7ffd64d7f340) = -1 ENOTTY (Inappropriate ioctl for device)
write(1, " DW_AT_decl_line\t(89)\n"..., 4096) = 4096
ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(1, TCGETS, 0x7ffd64d7f410) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device)
After this patch:
write(1, "0000000000001102 \"strlen\")\n "..., 4096) = 4096
write(1, "site\n DW_AT_low"..., 4096) = 4096
write(1, "d53)\n\n0x000e4d4d: DW_TAG_G"..., 4096) = 4096
The same speedup can be achieved by `--color=0` but that is not much convenient.
This implementation has been suggested by Joerg Sonnenberger.
Differential Revision: https://reviews.llvm.org/D86406
This test appears to have never worked on Linux but it seems none of the current
bots ever ran this test as it required enabling compiler-rt (otherwise it
would have just been skipped).
This just copies over the XFAIL decorator that are already on all other sanitizer
tests.
This patch produces an edge-based interface in AAIsDead.
By this, we can query a set of basic blocks that are directly reachable from a given basic block.
This is specifically useful for implementation of AAReachability.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85547
* Generate `CallExpression` syntax node for all semantic nodes inheriting from
`CallExpr` with call-expression syntax - except `CUDAKernelCallExpr`.
* Implement all the accessors
* Arguments of `CallExpression` have their own syntax node which is based on
the `List` base API
Differential Revision: https://reviews.llvm.org/D86544
While since D86306 we do it's sibling fold for `insertvalue`,
we should also do this for `extractvalue`'s.
And unlike that one, the results here are, quite honestly, shocking,
as it can be observed here on vanilla llvm test-suite + RawSpeed results:
```
| statistic name | baseline | proposed | Δ | % | |%| |
|----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
| asm-printer.EmittedInsts | 7945095 | 7942507 | -2588 | -0.03% | 0.03% |
| assembler.ObjectBytes | 273209920 | 273069800 | -140120 | -0.05% | 0.05% |
| early-cse.NumCSE | 2183363 | 2183398 | 35 | 0.00% | 0.00% |
| early-cse.NumSimplify | 541847 | 550017 | 8170 | 1.51% | 1.51% |
| instcombine.NumAggregateReconstructionsSimplified | 2139 | 108 | -2031 | -94.95% | 94.95% |
| instcombine.NumCombined | 3601364 | 3635448 | 34084 | 0.95% | 0.95% |
| instcombine.NumConstProp | 27153 | 27157 | 4 | 0.01% | 0.01% |
| instcombine.NumDeadInst | 1694521 | 1765022 | 70501 | 4.16% | 4.16% |
| instcombine.NumPHIsOfExtractValues | 0 | 37546 | 37546 | 0.00% | 0.00% |
| instcombine.NumSunkInst | 63158 | 63686 | 528 | 0.84% | 0.84% |
| instcount.NumBrInst | 874304 | 871857 | -2447 | -0.28% | 0.28% |
| instcount.NumCallInst | 1757657 | 1758402 | 745 | 0.04% | 0.04% |
| instcount.NumExtractValueInst | 45623 | 11483 | -34140 | -74.83% | 74.83% |
| instcount.NumInsertValueInst | 4983 | 580 | -4403 | -88.36% | 88.36% |
| instcount.NumInvokeInst | 61018 | 59478 | -1540 | -2.52% | 2.52% |
| instcount.NumLandingPadInst | 35334 | 34215 | -1119 | -3.17% | 3.17% |
| instcount.NumPHIInst | 344428 | 331116 | -13312 | -3.86% | 3.86% |
| instcount.NumRetInst | 100773 | 100772 | -1 | 0.00% | 0.00% |
| instcount.TotalBlocks | 1081154 | 1077166 | -3988 | -0.37% | 0.37% |
| instcount.TotalFuncs | 101443 | 101442 | -1 | 0.00% | 0.00% |
| instcount.TotalInsts | 8890201 | 8833747 | -56454 | -0.64% | 0.64% |
| instsimplify.NumSimplified | 75822 | 75707 | -115 | -0.15% | 0.15% |
| simplifycfg.NumHoistCommonCode | 24203 | 24197 | -6 | -0.02% | 0.02% |
| simplifycfg.NumHoistCommonInstrs | 48201 | 48195 | -6 | -0.01% | 0.01% |
| simplifycfg.NumInvokes | 2785 | 4298 | 1513 | 54.33% | 54.33% |
| simplifycfg.NumSimpl | 997332 | 1018189 | 20857 | 2.09% | 2.09% |
| simplifycfg.NumSinkCommonCode | 7088 | 6464 | -624 | -8.80% | 8.80% |
| simplifycfg.NumSinkCommonInstrs | 15117 | 14021 | -1096 | -7.25% | 7.25% |
```
... which tells us that this new fold fires whopping 38k times,
increasing the amount of SimplifyCFG's `invoke`->`call` transforms by +54% (+1513) (again, D85787 did that last time),
decreasing total instruction count by -0.64% (-56454),
and sharply decreasing count of `insertvalue`'s (-88.36%, i.e. 9 times less)
and `extractvalue`'s (-74.83%, i.e. four times less).
This causes geomean -0.01% binary size decrease
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=size-text
and, ignoring `O0-g`, is a geomean -0.01%..-0.05% compile-time improvement
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=instructions
The other thing that tells is, is that while this is a massive win for `invoke`->`call` transform
`InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold,
which is supposed to be dealing with such aggregate reconstructions,
fires a lot less now. There are two reasons why:
1. After this fold, as it can be seen in tests, we may (will) end up with trivially redundant PHI nodes.
We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHI's,
it also sees that the extract-insert chain recreates the original aggregate,
and replaces it with the original aggregate.
The take-aways are
1. We maybe should do most trivial, same-BB PHI CSE in InstCombine
2. I need to check if what other patterns remain, and how they can be resolved.
(i.e. i wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away)
This is a reland of the original commit fcb51d8c24,
because originally i forgot to ensure that the base aggregate types match.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D86530
This assertion does not achieve what it meant to do originally, as it
would fire only when applied to an unregistered operation, which is a
fairly rare circumstance (it needs a dialect or context allowing
unregistered operation in the input in the first place).
Instead we relax it to only fire when it should have matched but didn't
because of the misconfiguration.
Differential Revision: https://reviews.llvm.org/D86588
Update the comment stating the aim of the test - this is currently
only checking that these assembler directives doesn't cause the
assembler to fail, but the results of the testcase aren't particularly
correct yet.
Remove bits of the testcase that are even less likely to be found in
the wild (the .seh_startchained/.seh_endchained block), where the
testcase currently doesn't really generate anything interesting
anyway.
Differential Revision: https://reviews.llvm.org/D86524
Instead of using the TypeConverter infer the value of the alloca created based
on the init value. This will allow some ambiguous types like multidimensional
vectors to be converted correctly.
Differential Revision: https://reviews.llvm.org/D86582
This reverts commit fcb51d8c24.
As buildbots report, there's apparently some missing check to ensure
that the types of incoming values match the type of PHI.
Let's revert for a moment.
While since D86306 we do it's sibling fold for `insertvalue`,
we should also do this for `extractvalue`'s.
And unlike that one, the results here are, quite honestly, shocking,
as it can be observed here on vanilla llvm test-suite + RawSpeed results:
```
| statistic name | baseline | proposed | Δ | % | |%| |
|----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
| asm-printer.EmittedInsts | 7945095 | 7942507 | -2588 | -0.03% | 0.03% |
| assembler.ObjectBytes | 273209920 | 273069800 | -140120 | -0.05% | 0.05% |
| early-cse.NumCSE | 2183363 | 2183398 | 35 | 0.00% | 0.00% |
| early-cse.NumSimplify | 541847 | 550017 | 8170 | 1.51% | 1.51% |
| instcombine.NumAggregateReconstructionsSimplified | 2139 | 108 | -2031 | -94.95% | 94.95% |
| instcombine.NumCombined | 3601364 | 3635448 | 34084 | 0.95% | 0.95% |
| instcombine.NumConstProp | 27153 | 27157 | 4 | 0.01% | 0.01% |
| instcombine.NumDeadInst | 1694521 | 1765022 | 70501 | 4.16% | 4.16% |
| instcombine.NumPHIsOfExtractValues | 0 | 37546 | 37546 | 0.00% | 0.00% |
| instcombine.NumSunkInst | 63158 | 63686 | 528 | 0.84% | 0.84% |
| instcount.NumBrInst | 874304 | 871857 | -2447 | -0.28% | 0.28% |
| instcount.NumCallInst | 1757657 | 1758402 | 745 | 0.04% | 0.04% |
| instcount.NumExtractValueInst | 45623 | 11483 | -34140 | -74.83% | 74.83% |
| instcount.NumInsertValueInst | 4983 | 580 | -4403 | -88.36% | 88.36% |
| instcount.NumInvokeInst | 61018 | 59478 | -1540 | -2.52% | 2.52% |
| instcount.NumLandingPadInst | 35334 | 34215 | -1119 | -3.17% | 3.17% |
| instcount.NumPHIInst | 344428 | 331116 | -13312 | -3.86% | 3.86% |
| instcount.NumRetInst | 100773 | 100772 | -1 | 0.00% | 0.00% |
| instcount.TotalBlocks | 1081154 | 1077166 | -3988 | -0.37% | 0.37% |
| instcount.TotalFuncs | 101443 | 101442 | -1 | 0.00% | 0.00% |
| instcount.TotalInsts | 8890201 | 8833747 | -56454 | -0.64% | 0.64% |
| instsimplify.NumSimplified | 75822 | 75707 | -115 | -0.15% | 0.15% |
| simplifycfg.NumHoistCommonCode | 24203 | 24197 | -6 | -0.02% | 0.02% |
| simplifycfg.NumHoistCommonInstrs | 48201 | 48195 | -6 | -0.01% | 0.01% |
| simplifycfg.NumInvokes | 2785 | 4298 | 1513 | 54.33% | 54.33% |
| simplifycfg.NumSimpl | 997332 | 1018189 | 20857 | 2.09% | 2.09% |
| simplifycfg.NumSinkCommonCode | 7088 | 6464 | -624 | -8.80% | 8.80% |
| simplifycfg.NumSinkCommonInstrs | 15117 | 14021 | -1096 | -7.25% | 7.25% |
```
... which tells us that this new fold fires whopping 38k times,
increasing the amount of SimplifyCFG's `invoke`->`call` transforms by +54% (+1513) (again, D85787 did that last time),
decreasing total instruction count by -0.64% (-56454),
and sharply decreasing count of `insertvalue`'s (-88.36%, i.e. 9 times less)
and `extractvalue`'s (-74.83%, i.e. four times less).
This causes geomean -0.01% binary size decrease
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=size-text
and, ignoring `O0-g`, is a geomean -0.01%..-0.05% compile-time improvement
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=instructions
The other thing that tells is, is that while this is a massive win for `invoke`->`call` transform
`InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold,
which is supposed to be dealing with such aggregate reconstructions,
fires a lot less now. There are two reasons why:
1. After this fold, as it can be seen in tests, we may (will) end up with trivially redundant PHI nodes.
We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHI's,
it also sees that the extract-insert chain recreates the original aggregate,
and replaces it with the original aggregate.
The take-aways are
1. We maybe should do most trivial, same-BB PHI CSE in InstCombine
2. I need to check if what other patterns remain, and how they can be resolved.
(i.e. i wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away)
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D86530
This happens when using -flto and -Wl,--plugin-opt=emit-llvm to create a linked LTO bitcode file, and the bitcode file has a strtab with size > 2^29.
All the issues relate to a pattern like this
size_t x64 = y64 + z32 * C
When z32 is >= (2^32)/C, z32 * C overflows.
Reviewed-by: MaskRay
Differential Revision: https://reviews.llvm.org/D86500
Tests for frexp[f|l] now use the new capability. Not all input-output
combinations have been addressed by this change. Support for newer combinations
can be added in future as needed.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D86506
A Mach-O universal binary may contain bitcode as a slice.
This diff adds proper handling of such binaries to llvm-lipo.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D85740
Update the "image show-unwind" command output to show if the function
being shown is listed as a user-setting or platform trap handler.
Update the individual UnwindPlan dumps to show whether the unwind plan
is registered as a trap handler.
Since we can only copy to GR32 we had to EXTRACT from GR32, but
we would first go to GR16 and then the truncate would extra again
to GR8. This adds a special case to go directly from GR32 to GR8.
This would eventually get cleaned up, but though maybe we should
avoid doing it in the first place. Our k-register handling is weird
and we could probably stand to have some more special ISD nodes
for the conversions so the i32 type would be explicit.