Without this patch, we hit the following a lot:
"llvm.dbg.declare intrinsic requires a !dbg attachment"
"DICompileUnit not listed in llvm.dbg.cu"
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D135492
There are intrinsics for most scalar instructions and almost all HVX
instructions. What's somewhat painful is that there are two intrinsics
for each HVX instruction: one for 64- and one for 128-byte mode.
Instead of checking the current codegen settings every time, this
function would simply return the right intrinsic.
The first two parameters of addcarry are commutative. We may face a situation where both variant are present in the DAG, in which case we benefit from using just one.
Depends on D57302 and D33587
Reviewed By: RKSimon, chfast
Differential Revision: https://reviews.llvm.org/D57317
Typically when you match something, you want to check the result.
Fix a couple warnings in the AMDGPUPostLegalizerCombiner which appear as a
result of this.
Differential Revision: https://reviews.llvm.org/D135491
.bss section emitted by llvm-bolt (e.g. with instrumentation) is not a
real BSS section, i.e. it takes space in the output file. Hence the
order with respect to .data is not defined. Remove .bss from the test
and fix the buildbot failure.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D135475
Clear all dispositions if there are any dead blocks (which will get
removed later) and also clear dispositions for removed instructions.
Clearing all dispositions in case there are dead blocks happens first,
which should avoid traversing SCEV use-lists for invalidating
dispositions for individual values.
Fixes#58179.
Apparently StackColoring depends on SlotIndexes, but not
LiveIntervals. If regalloc fast were manually requested, LiveIntervals
would be dropped before SILowerSGPRSpills but not SlotIndexes.
SILowerSGPRSpills preserved SlotIndexes, but only through
LiveIntervals. As a result, SILowerSGPRSpills was incorrectly
reporting it preserved SlotIndexes. Start updating these directly,
instead of depending on LiveIntervals also being available.
Phis have a quirk where the same predecessor block may appear multiple times
if the same block branches to it multiple ways. All the values need to match,
but this was replacing each operand independently. If an operand can be simplified,
make sure to replace every instance of the incoming block's value.
This allows more precise control over which patterns to pick to
expand arithmetic ops. Previously ceil/floor division epxansion
is only available together with various min/max op expansion.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135479
This patch was added way back in the beginning of the work which became the statepoint infrastructure. The idea was that safepoints could be inserted late in the optimization pipeline. This is true if the only concern is garbage collection, but this approach turned out to be incompatible with the requirement to also support deoptimization at safepoints.
In theory, this pass would still be quite useful for an AOT compiled language which wants to support garbage collection, but we have no known users, and haven't for over 5 years. Time to remove unused code. If someone wants to use this, restoring it would not be hard. The immediate motivation for removal is that this is one of the last passes remaining which hasn't been ported to the new pass manager and the (straight forward) work to do so is not justified for unused code.
Differential Revision: https://reviews.llvm.org/D135371
In https://reviews.llvm.org/D135127 we created the show flag
`--output-format` which was confusing because it behaved differently
than the same flag in the merge command. So, rename the flag to
`--show-format`. This also allows us to add the `text` option to mean
"normal text output" rather than "text-encoded profiles" like it does
for the merge command.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D135467
Add a new testcase that shows a bug in BOLT when writing out
dynamic relocations. This is currently marked as XFAIL as we work on
solving it. This bug happens when the current strategy fails to
recognize that the original dynamic relocation in the input should
reference the original .bolt.org.rodata section instead of the new one
.rodata created by BOLT after moving jump tables. This bug started
happening after 729d29e167.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D125941
This patch adds support for converting doubles to string in the %f/F
format specifier. It does not yet support long doubles outside of the
double range. This implementation is based on the work of Ulf Adams,
specifically the Ryu Printf algorithm.
See:
Ulf Adams. 2019. Ryū revisited: printf floating point conversion.
Proc. ACM Program. Lang. 3, OOPSLA, Article 169 (October 2019), 23 pages.
https://doi.org/10.1145/3360595
Differential Revision: https://reviews.llvm.org/D131023
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.
Other than the order of sections, this should be NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135235
When BOLT updates .eh_frame section, it concatenates newly-generated
contents (from CFI directives) with the original .eh_frame that has
relocations applied to it. However, if no new content is generated,
the original .eh_frame has to be left intact. In that case, BOLT was
still writing out the relocatable copy of the original .eh_frame section
to the new segment, even though this copy was never used and was not
even marked in the section header table.
Detect the scenario above and skip allocating extra space for .eh_frame.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135223
To properly set the "_end" symbol, we need to track the last allocatable
address. Simply emitting "_end" at the end of some section is not
sufficient since the order of section allocation is unknown during the
emission step.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135121
We can emit a binary without a new text section. Hence, the text section
assertion is not needed.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135120
The logic for strsignal and strerror is very similar, so I've moved them
both to use a shared utility (MessageMapper) for the basic
functionality.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D135322
I've implemente the gnu variant of strerror_r since that seems to be the
one more relevant to what we're trying to do.
Differential Revision: https://reviews.llvm.org/D135227
PCH supported for HLSL is added when compile in -cc1 mode using -include-pch for test AST.
This change add some notes about the support of PCH for HLSL.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D134330
Two another atomic compare capture forms, `{ v = x; expr-stmt }` and `{ expr-stmt; v = x; }`
where `expr-stmt` could be `cond-expr-stmt` are missing.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135236
When the work on the Flang driver started, we created 2 test
directories:
* flang/test/Frontend/
* flang/test/Driver/
That was mostly done to model what Clang was doing. In practice, we
stopped using "flang/test/Frontend/" early on and most Flang driver
tests are currently located in "flang/test/Driver/". This patch moves
the remaining tests from the latter into the former directory.
This change also means that we can re-use test input files, i.e.
flang/test/Frontend/Inputs/hello-world.f90 can be replaced with
flang/test/Driver/Inputs/hello.f90. To this end, the affected test is
updated (multiple-input-files.f90).
Differential Revision: https://reviews.llvm.org/D130633