This can happen when the frontend knows the debug info will be emitted
somewhere else. Usually this happens for dynamic classes with out of
line constructors or key functions, but it can also happen when modules
are enabled.
llvm-svn: 281060
Summary:
Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals.
E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.
With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction).
Here are rules how we convert literals:
- We parsed fp literal:
- Instruction expects 64-bit operand:
- If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
- then we do nothing this literal
- Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
- report error
- Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
- If instruction expect fp operand type (f64)
- Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
- If so then do nothing
- Else (e.g. v_fract_f64 v[0:1], 3.1415)
- report warning that low 32 bits will be set to zeroes and precision will be lost
- set low 32 bits of literal to zeroes
- Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
- report error as it is unclear how to encode this literal
- Instruction expects 32-bit operand:
- Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow
- Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
- do nothing
- Else report error
- Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
- Parsed binary literal:
- Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
- do nothing
- Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
- report error
- Else, literal is not-inlinable and we are not required to inline it
- Are high 32 bit of literal zeroes or same as sign bit (32 bit)
- do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
- Else
- report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
'''
enum OperandType {
OPERAND_REG_IMM32_INT,
OPERAND_REG_IMM32_FP,
OPERAND_REG_INLINE_C_INT,
OPERAND_REG_INLINE_C_FP,
}
'''
This is not working yet:
- Several tests are failing
- Problems with predicate methods for inline immediates
- LLVM generated assembler parts try to select e64 encoding before e32.
More changes are required for several AsmOperands.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, artem.tamazov
Differential Revision: https://reviews.llvm.org/D22922
llvm-svn: 281050
The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2.
llvm-svn: 281040
These instructions were only necessary when type information was stored in the
MachineInstr (because only generic MachineInstrs possessed a type). Now that
it's in MachineRegisterInfo, COPY and PHI work fine.
llvm-svn: 281037
We want each register to have a canonical type, which means the best place to
store this is in MachineRegisterInfo rather than on every MachineInstr that
happens to use or define that register.
Most changes following from this are pretty simple (you need an MRI anyway if
you're going to be doing any transformations, so just check the type there).
But legalization doesn't really want to check redundant operands (when, for
example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's
operand type field to encode these constraints and limit legalization's work.
As an added bonus, more validation is possible, both in MachineVerifier and
MachineIRBuilder (coming soon).
llvm-svn: 281035
Summary:
Also removed duplicate code from AMDGPUTargetAsmStreamer.
This change only change how amd_kernel_code_t is parsed and printed. No variable names are changed.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D24296
llvm-svn: 281028
This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags.
Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean.
llvm-svn: 281027
As part of this effort, remove MipsFCmp nodes and use tablegen
patterns rather than custom lowering through C++.
Unexpectedly, this improves codesize for microMIPS as previous floating
point setcc expansions would materialize 0 and 1 into GPRs before using
the relevant mov[tf].[sd] instruction. Now $zero is used directly.
Reviewers: dsanders, vkalintiris, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D23118
llvm-svn: 281022
Summary:
If one of the uses of the value is a single edge PHINode, handle it.
Original:
%val = something
<suspend>
%p = PHINode [%val]
After Spill + Part13:
%val = something
%slot = gep val.spill.slot
store %val, %slot
<suspend>
%p = load %slot
Plus tiny fixes/changes:
* use correct index for coro.free in CoroCleanup
* fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins
Reviewers: majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24242
llvm-svn: 281020
Summary:
While woring on mapping attributes in the C API, it clearly appeared that the recent changes in the API on the C++ side left Function and Call/Invoke with an attribute API that grew in an ad hoc manner. This makes it difficult to work with it, because one doesn't know which overloads exists and which do not.
Make sure that getter/setter function exists for both enum and string version. Remove inconsistent getter/setter, unless they have many callsites.
This should make it easier to work with attributes in the future.
This doesn't change how attribute works.
Reviewers: bkramer, whitequark, mehdi_amini, void
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21514
llvm-svn: 281019
The x64 ABI has two major function types:
- frame functions
- leaf functions
A frame function is one which requires a stack frame. A leaf function
is one which does not. A frame function may or may not have a frame
pointer.
A leaf function does not require a stack frame and may never modify SP
except via a return (RET, tail call via JMP).
A frame function which has a frame pointer is permitted to use the LEA
instruction in the epilogue, a frame function without which doesn't
establish a frame pointer must use ADD to adjust the stack pointer epilogue.
Fun fact: Leaf functions don't require a function table entry
(associated PDATA/XDATA).
llvm-svn: 281006
The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.
Differential Revision: https://reviews.llvm.org/D24359
llvm-svn: 281003
Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience as user would see gdb go into code that may not be expected to execute. It will also affect sample profile accuracy by assigning incorrect frequency to source within then/else branch.
Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo
Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D24164
llvm-svn: 280995
The test case included in r280979 wasn't checking what it was supposed to be
checking for the predicated store case. Fixing the test revealed that the
multi-use case (when a pointer is used by both vectorized and scalarized memory
accesses) wasn't being handled properly. We can't skip over
non-consecutive-like pointers since they may have looked consecutive-like with
a different memory access.
llvm-svn: 280992
Previously, all consecutive pointers were marked uniform after vectorization.
However, if a consecutive pointer is used by a memory access that is eventually
scalarized, the pointer won't remain uniform after all. An example is
predicated stores. Even though a predicated store may be consecutive, it will
still be scalarized, making it's pointer operand non-uniform.
This patch updates the logic in collectLoopUniforms to consider the cases where
a memory access may be scalarized. If a memory access may be scalarized, its
pointer operand is not marked uniform. The determination of whether a given
memory instruction will be scalarized or not has been moved into a common
function that is used by the vectorizer, cost model, and legality analysis.
Differential Revision: https://reviews.llvm.org/D24271
llvm-svn: 280979
mapping a yaml field to an object in code has always been
a stateless operation. You could still pass state by using the
`setContext` function of the YAMLIO object, but this represented
global state for the entire yaml input. In order to have
context-sensitive state, it is necessary to pass this state in
at the granularity of an individual mapping.
This patch adds support for this type of context-sensitive state.
You simply pass an additional argument of type T to the
`mapRequired` or `mapOptional` functions, and provided you have
specialized a `MappingContextTraits<U, T>` class with the
appropriate mapping function, you can pass this context into
the mapping function.
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D24162
llvm-svn: 280977
Fix the .arch asm parser to use the full set of features for the architecture
and any extensions on the command line. Add and update testcases accordingly
as well as add an extension that was used but not supported.
llvm-svn: 280971
And associated commits, as they broke the Thumb bots.
This reverts commit r280935.
This reverts commit r280891.
This reverts commit r280888.
llvm-svn: 280967
I introduced this potential bug by missing this diff in:
https://reviews.llvm.org/rL280873
...however, I'm not sure how to reach this code path with a regression test.
We may be able to remove this code and assume that the transform to a constant
is always handled by InstSimplify?
llvm-svn: 280964
Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself.
Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking.
llvm-svn: 280949
I mised the check that it had to support ARM to work. This commit tries
to fix that, to make sure we don't emit ARM code in Thumb-only mode.
llvm-svn: 280935
Materializing something like "-3" can be done as 2 instructions:
MOV r0, #3
MVN r0, r0
This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256.
There were no tests failing after changing this... :/
llvm-svn: 280928
Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element.
This is an initial step towards determining the sign bits of a vector (PR29079).
Differential Revision: https://reviews.llvm.org/D24253
llvm-svn: 280927
This reverts commit r280808.
It is possible that this change results in an infinite loop. This
is causing timeouts in some tests on ARM, and a Chromebook bot is
failing.
llvm-svn: 280918
Summary:
C allows to jump over variables declaration so lifetime.start can be
avoid before variable usage. To avoid false-positives on such rare cases
we detect them and remove from lifetime analysis.
PR27453
PR28267
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24321
llvm-svn: 280907
Summary:
When cloning blocks for prologue/epilogue we need to replicate the loop
structure from the original loop. It wasn't a problem for the innermost
loops, but it led to an incorrect loop info when we unrolled a loop with
a child loop - in this case created prologue-loop had a child loop, but
loop info didn't reflect that.
This fixes PR28888.
Reviewers: chandlerc, sanjoy, hfinkel
Subscribers: llvm-commits, silvas
Differential Revision: https://reviews.llvm.org/D24203
llvm-svn: 280901
CGP tail-duplicates rets into blocks that end with a call that feed the ret.
This puts the call in tail position, potentially allowing the DAG builder to
lower it as a tail call. To avoid tail duplication in cases where we won't
form the tail call, CGP tried to predict whether this is going to be possible,
and avoids doing it when lowering as a tail call will definitely fail.
However, it was being too conservative by always throwing away calls to
functions with a signext/zeroext attribute on the return type.
Instead, we can use the same logic the builder uses to determine whether the
attributes work out.
Differential Revision: https://reviews.llvm.org/D24315
llvm-svn: 280894