This adds 2-operand assembly aliases for these instructions:
add r0, r1 => add r0, r0, r1
sub r0, r1 => sub r0, r0, r1
Previously this syntax was only accepted for Thumb2 targets, where the
wide versions of the instructions were used.
This patch allows the 2-operand syntax to be used for Thumb1 targets,
and selects the narrow encoding when it is used for Thumb2 targets.
Differential revision: https://reviews.llvm.org/D37377
llvm-svn: 312321
Test constants as well in the PIC tests. These are also represented as
G_GLOBAL_VALUE, and although they are treated just like other globals
for PIC, they won't be for ROPI, so it's good to have this coverage.
llvm-svn: 312319
This patch provides such debug information for integer
variables whose type is shrinked to bool by providing
dwarf expression which returns either constant initial
value or other value.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D35994
llvm-svn: 312318
comparisons into memcmp.
Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().
For now this is disabled by default until:
- https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
- Benchmarks show that this is always useful.
Differential Revision:
https://reviews.llvm.org/D33987
llvm-svn: 312315
Previously we generated a register only pattern for each of the 3 instruction forms, but they are all identical as far as isel is concerned. So drop the others and just keep the 213 version.
This removes 2968 bytes from the isel table.
llvm-svn: 312313
Summary:
- `project` is required when `runtime/CMakeList.txt` is the top-level `CMakeList.txt` file. This will establish version and policy settings.
- `-D_FILE_OFFSET_BITS=64` should never be set for Android runtimes.
Reviewers: srhines, pirama, beanz
Subscribers: llvm-commits, srhines, mgorny
Differential Revision: https://reviews.llvm.org/D35648
llvm-svn: 312302
Not all of these will be able to be used by atomics because tablegen, but it
still seems like a good change by itself.
Differential Revision: https://reviews.llvm.org/D37345
llvm-svn: 312287
Prior to this patch we had a DAG combine that tried to bypass an X86ISD::ADD with -1 being added to the carry flag of some previous operation. We would then pass the carry flag directly to user.
But this is only safe if the user is looking for the carry flag and not the zero flag.
So we need to only do this combine in a context where we know what flag the consumer is using.
Fixes PR34381.
Differential Revision: https://reviews.llvm.org/D37317
llvm-svn: 312285
build_vector is a more useful canonical form when
pattern matching packed operations, so turn shift
into high element into a build_vector.
Should show no change for now.
llvm-svn: 312282
Before, Key was a StringRef to avoid unnecessary copies. This commit changes
that to a std::string.
This was okay previously because when people called emit for remarks before,
they would create the remark *within* the call to emit. However, if you build
the remark up and call emit *afterward*, it's possible to end up freeing the
memory assigned to the StringRef before the call to emit.
This caused a test failure with https://reviews.llvm.org/D37085 on Linux.
Since building remarks before a call to emit is a valid use-case, it makes
sense to replace this with a std::string.
llvm-svn: 312277
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records. These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage. This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type. The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.
llvm-svn: 312276
This patch completes the work done by Frederic Riss to addresses
dsymutil incorrectly considering forward declaration as canonical during
uniquing. This resulted in references to the forward declaration even
after the definition was encountered.
In addition to the test provided by Alexander Shaposhnikov in D29609, I
added another test to cover several scenarios that were mentioned in his
conversation with Fred. We now also check that uniquing still occurs
after the definition was encountered.
For more context please refer to D29609
Differential revision: https://reviews.llvm.org/D37127
llvm-svn: 312274
The BasicBlock passed to FindPredecessorRetainWithSafePath should be the
parent block of Autorelease. This fixes a crash that occurs in
FindDependencies when StartInst is not in StartBB.
rdar://problem/33866381
llvm-svn: 312266
This patch completes the work done by Frederic Riss to addresses
dsymutil incorrectly considering forward declaration as canonical during
uniquing. This resulted in references to the forward declaration even
after the definition was encountered.
In addition to the test provided by Alexander Shaposhnikov in D29609, I
added another test to cover several scenarios that were mentioned in his
conversation with Fred. We now also check that uniquing still occurs
after the definition was encountered.
For more context please refer to D29609
Differential revision: https://reviews.llvm.org/D37127
llvm-svn: 312264
Use os.path.normpath instead of realpath to collapse '..' and '.' path
components. Use realpath when caching search results about a path for
good measure.
I considered rigging up a test involving symlinks for this, but I doubt
I can check a symlink into SVN. The test would have to conditionally
create a symlink at runtime if the host OS supports it. This sounds too
fragile and complicated to me to be worth it.
llvm-svn: 312254
This patch changes the default behavior in brief mode to only show the
debug_info section. This is undoubtedly the most popular and likely the
one you'd want in brief mode.
Non-brief mode behavior is not affected and still defaults to all.
Differential revision: https://reviews.llvm.org/D37334
llvm-svn: 312252
This preserves symlinks in paths, so that someone can symlink more tests
into a larger test suite. For example, debuginfo-tests is currently
designed to be checked out into clang/test. With this change, it can be
symlinked into place instead, which works better with the monorepo.
llvm-svn: 312250
Recurse instead of returning on the first found optimization. Also, return early in the caller
instead of continuing because that allows another round of simplification before we might
potentially lose undef information from a shuffle mask by eliminating the shuffle.
As noted in the review, we could probably do better and be more efficient by moving all of
demanded elements into a separate pass, but this is yet another quick fix to instcombine.
Differential Revision: https://reviews.llvm.org/D37236
llvm-svn: 312248
Summary:
Hopefully this also clarifies exactly when and why we're rewriting
certiain S_LOCALs using reference types: We're using the reference type
to stand in for a zero-offset load.
Reviewers: inglorion
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D37309
llvm-svn: 312247
The buildbots have shown that -Wstrict-prototypes behaves differently in GCC
and Clang so we should keep it disabled until Clang follows GCC's behaviour
llvm-svn: 312246
Clang 5 supports -Wstrict-prototypes. We should use it to catch any C
declarations that declare a non-prototype function.
rdar://33705313
Differential Revision: https://reviews.llvm.org/D36669
llvm-svn: 312240
Summary:
This patch enables the following:
1) Regex based Instruction itineraries for integer instructions.
2) The instructions are grouped as per the nature of the instructions
(move, arithmetic, logic, Misc, Control Transfer).
3) FP instructions and their itineraries are added which includes values
for SSE4A, BMI, BMI2 and SHA instructions.
Patch by Ganesh Gopalasubramanian
Reviewers: RKSimon, craig.topper
Subscribers: vprasad, shivaram, ddibyend, andreadb, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D36617
llvm-svn: 312237
The CodingStandards section on avoiding the re-evaluation of end() hasn't been
updated since range-based for loops were adopted in the LLVM codebase. This
patch adds a very brief section that documents how range-based for loops
should be used wherever possible. It also moves example code in
CodingStandards to use range-based for loops and auto when appropriate.
Differential Revision: https://reviews.llvm.org/D37264
llvm-svn: 312236
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers,
where the complex numbers are packed in a vector register as a
pair of elements. The Imaginary part of the number is placed in the
more significant element, and the Real part of the number is placed
in the less significant element.
Differential Revision: https://reviews.llvm.org/D36792
llvm-svn: 312228
Summary: Add a -name-whitelist option, which behaves in the same way as -name, but it reads in multiple function names from the given input file(s).
Reviewers: vsk
Reviewed By: vsk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37111
llvm-svn: 312227
Replace the UsePostRAScheduler SubtargetFeature with
DisablePostRAScheduler, which is then used by Swift and Cyclone.
This patch maintains enabling PostRA scheduling for other Thumb2
capable cores and/or for functions which are being compiled in Arm
mode.
Differential Revision: https://reviews.llvm.org/D37055
llvm-svn: 312226
The IDSAR6 system register has been introduced to identify the
v8.3-a Javascript data type conversion and v8.2-a dot product
support.
Differential Revision: https://reviews.llvm.org/D37068
llvm-svn: 312225
This is similar to what was done for ARM in SVN r269574; the code
and the test are straight copypaste to the corresponding AArch64
code and test directory.
Differential revision: https://reviews.llvm.org/D37204
llvm-svn: 312223
Current implementation of parseLoopStructure interprets the latch comparison as a
comarison against `iv.next`. If the actual comparison is made against the `iv` current value
then the loop may be rejected, because this misinterpretation leads to incorrect evaluation
of the latch start value.
This patch teaches the IRCE to distinguish this kind of loops and perform the optimization
for them. Now we use `IndVarBase` variable which can be either next or current value of the
induction variable (previously we used `IndVarNext` which was always the value on next iteration).
Differential Revision: https://reviews.llvm.org/D36215
llvm-svn: 312221
Renaming as a preparation step to generalizing IRCE for comparison not only against
the next value of an indvar, but also against the current.
Differential Revision: https://reviews.llvm.org/D36509
llvm-svn: 312215
Summary:
xmlDoc needs to be released with xmlFreeDoc.
Reset root element before release to avoid release of CombinedRoot owned
by CombinedDoc,
Reviewers: ecbeckmann, rnk, zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37321
llvm-svn: 312207
The majority of the time spent in the pass checking
for the register reads. Rather than searching all of
the defined registers for uses in each instruction,
use a set of defined registers and check the operands
of the instruction.
This process still is algorithmically not great,
but with the additional trick of skipping the analysis
for addresses with one use, this brings one slow
testcase into a reasonable range.
llvm-svn: 312206
Summary:
Before this patch, llvm-xray account will assume that thread stacks will
not be empty. Unfortunately there are cases where an instrumented
function will see a call to `fork()` which will cause the child process
to not see the start of the function, but only see the end of the
function. The tooling cannot assume that threads will always have
perfect stacks, and so we change it to support this reality.
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31870
llvm-svn: 312204
This moves the cmake configuration for fuzzers in LLVM to a new macro,
add_llvm_fuzzer. This will make it easier to keep things consistent
while implementing llvm.org/pr34314.
I've also made a couple of minor functional changes here:
- the fuzzers now use add_llvm_executable rather than add_llvm_tool.
This means they won't create install targets and stuff like that,
because those made little sense for these fuzzers.
- I've grouped these under "Fuzzers" rather than in with "Tools" for
people who build with IDEs.
llvm-svn: 312200
All this does is forward declare the interface functions (and make
sure that they're `extern "C"`), but since we're using libFuzzer from
the toolchain it doesn't make sense to include the local copy of the
interface.
llvm-svn: 312195
This adds missed optimization remarks which report viable candidates that
were not outlined because they would increase code size.
Other remarks will come in separate commits.
This will help to diagnose code size regressions and changes in outliner
behaviour in projects using the outliner.
https://reviews.llvm.org/D37085
llvm-svn: 312194
Some kinds of relocations do not have symbols, like R_X86_64_RELATIVE
for instance. I would like to test this case in D36554 but currently
can't because symbols are required by yaml2obj. The other option is
using the empty symbol but that doesn't seem quite right to me.
This change makes the Symbol field of Relocation optional and in the
case where the user does not specify a symbol name the Symbol index is 0.
Patch by Jake Ehrlich
Differential Revision: https://reviews.llvm.org/D37276
llvm-svn: 312192
These aren't really packed instructions, so the default
op_sel_hi should be 0 since this indicates a conversion.
The operand types are scalar values that behave similar
to an f16 scalar that may be converted to f32.
Doesn't change the default printing for op_sel_hi, just
the parsing.
llvm-svn: 312179
It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")
> Issues identified by buildbots addressed since original review:
> - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
> - The pass no longer forwards COPYs to physical register uses, since
> doing so can break code that implicitly relies on the physical
> register number of the use.
> - The pass no longer forwards COPYs to undef uses, since doing so
> can break the machine verifier by creating LiveRanges that don't
> end on a use (since the undef operand is not considered a use).
>
> [MachineCopyPropagation] Extend pass to do COPY source forwarding
>
> This change extends MachineCopyPropagation to do COPY source forwarding.
>
> This change also extends the MachineCopyPropagation pass to be able to
> be run during register allocation, after physical registers have been
> assigned, but before the virtual registers have been re-written, which
> allows it to remove virtual register COPY LiveIntervals that become dead
> through the forwarding of all of their uses.
llvm-svn: 312178
writeArchive returned a pair, but the first element of the pair is always
its first argument on failure, so it doesn't make sense to return it from
the function. This patch change the return type so that it does't return it.
Differential Revision: https://reviews.llvm.org/D37313
llvm-svn: 312177
See D37236 for discussion. It seems unlikely that we actually want/need
to do this kind of folding in InstCombine in the long run, but moving
everything will be a bigger follow-up step.
llvm-svn: 312172
Previously we would just describe the first register and then call it
quits. This patch emits fragment expressions for each register.
<rdar://problem/34075307>
llvm-svn: 312169
Now that we print DIExpressions inline everywhere, we don't need to
print them once as an operand and again as a value. This is only really
visible when calling dump() or print() directly on a DIExpression during
debugging.
llvm-svn: 312168
Summary:
Remove a check for `ARMSubtarget::isTargetDarwin` when determining
whether to use Swift error registers, so that Swift errors work
properly on non-Darwin ARM32 targets (specifically Android).
Before this patch, generated code would save and restores ARM register r8 at
the entry and returns of a function that throws. As r8 is used as a virtual
return value for the object being thrown, this gets overwritten by the restore,
and calling code is unable to catch the error. In turn this caused Swift code
that used `do`/`try`/`catch` to work improperly on Android ARM32 targets.
Addresses Swift bug report https://bugs.swift.org/browse/SR-5438.
Patch by John Holdsworth.
Reviewers: manmanren, rjmccall, aschwaighofer
Reviewed By: aschwaighofer
Subscribers: srhines, aschwaighofer, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35835
llvm-svn: 312164
Added a combiner which can clean up truncs/extends that are created in
order to make the types work during legalization.
Also moved the combineMerges to the LegalizeCombiner.
https://reviews.llvm.org/D36880
llvm-svn: 312158
Issues identified by buildbots addressed since original review:
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
llvm-svn: 312154
This change simplifies code that has to deal with
DIGlobalVariableExpression and mirrors how we treat DIExpressions in
debug info intrinsics. Before this change there were two ways of
representing empty expressions on globals, a nullptr and an empty
!DIExpression().
If someone needs to upgrade out-of-tree testcases:
perl -pi -e 's/(!DIGlobalVariableExpression\(var: ![0-9]*)\)/\1, expr: !DIExpression())/g' <MYTEST.ll>
will catch 95%.
llvm-svn: 312144
Summary:
DbgVariableLocation::extractFromMachineInstruction originally
returned a boolean indicating success. This change makes it return
an Optional<DbgVariableLocation> so we cannot try to access the fields
of the struct if they aren't valid.
Reviewers: aprantl, rnk, zturner
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D37279
llvm-svn: 312143
Selecting 32-bit element logical ops without a select or broadcast requires matching a bitconvert on the inputs to the and. But that's a weird thing to rely on. It's entirely possible that one of the inputs doesn't have a bitcast and one does.
Since there's no functional difference, just remove the extra patterns and save some isel table size.
Differential Revision: https://reviews.llvm.org/D36854
llvm-svn: 312138
Summary:
Reverts r311008 to reinstate r310825 with a fix.
Refine alias checking for pseudo vs value to be conservative.
This fixes the original failure in builtbot unittest SingleSource/UnitTests/2003-07-09-SignedArgs.
Reviewers: hfinkel, nemanjai, efriedma
Reviewed By: efriedma
Subscribers: bjope, mcrosier, nhaehnle, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D36900
llvm-svn: 312126
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
shufflevector, so we really can't see this pattern.
llvm-svn: 312123
This patch supports one more pattern for bbit0 and bbit1
instructions, CBranchBitNum class is expanded so it can
take 32 bit immidate.
Differential Revision: https://reviews.llvm.org/D36222
llvm-svn: 312111
Summary:
If the first insertelement instruction has multiple users and inserts at
position 0, we can re-use this instruction when folding a chain of
insertelement instructions. As we need to generate the first
insertelement instruction anyways, this should be a strict improvement.
We could get rid of the restriction of inserting at position 0 by
creating a different shufflemask, but it is probably worth to keep the
first insertelement instruction with position 0, as this is easier to do
efficiently than at other positions I think.
Reviewers: grosser, mkuper, fpetrogalli, efriedma
Reviewed By: fpetrogalli
Subscribers: gareevroman, llvm-commits
Differential Revision: https://reviews.llvm.org/D37064
llvm-svn: 312110
Support for scalars was committed in r311154, this adds support for allowing
v4f16 vector types (thus avoiding conversions from/to single precision for
these types).
Differential Revision: https://reviews.llvm.org/D37145
llvm-svn: 312104
NFC.
Replaced duplicated HASWELL prefixes in run commands in the X86 Code Gen regression tests by the SKYLAKE prefix when the -mcpu is set to skylake.
The fix is needed in preparation of an upcoming patch containing the Skylake scheduling info.
Reviewers: zvi, RKSimon, aymanmus, igorb
Differential Revision: https://reviews.llvm.org/D37258
llvm-svn: 312103
Summary:
This patch adjusts the patterns to make the result type of the broadcast node vXf64/vXi64. Then adds a bitcast to vXi32 after that. Intrinsic lowering was also adjusted to generate this new pattern.
Fixes PR34357
We should probably just drop the intrinsic entirely and use native IR, but I'll leave that for a future patch.
Any idea what instruction we should be lowering the floating point 128-bit result version of this pattern to? There's a 128-bit v2i32 integer broadcast but not an fp one.
Reviewers: aymanmus, zvi, igorb
Reviewed By: aymanmus
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37286
llvm-svn: 312101
This enables the use of a smaller encoding by using a VEX instruction when possible.
Differential Revision: https://reviews.llvm.org/D37092
llvm-svn: 312100
Currently we start applying this on Haswell and newer. I don't believe anything changed in the Haswell architecture to make this the right cutoff point. The partial flag handling around this has been roughly the same since Sandybridge.
Differential Revision: https://reviews.llvm.org/D37250
llvm-svn: 312099
Summary:
Currently we determine if macro fusion is supported based on the AVX flag as a proxy for the processor being Sandy Bridge".
This is really strange as now AMD supports AVX. It also means if user explicitly disables AVX we disable macro fusion.
This patch adds an explicit macro fusion feature. I've also enabled for the generic 64-bit CPU (which doesn't have AVX)
This is probably another candidate for being in the MI layer, but for now I at least wanted to correct the overloading of the AVX feature.
Reviewers: spatel, chandlerc, RKSimon, zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37280
llvm-svn: 312097
The merge is only possible if the base address register is the
same for the two instructions. If there is only the one use,
there's no point in doing an expensive forward scan checking
for memory interference looking for a merge candidate.
This gives a signficant improvement in one extreme testcase.
The code to do the scan is still algorithmically terrible,
so this is still the slowest pass in that example.
llvm-svn: 312096
If denorms are not flushed we can use max instead of multiplication
by 1. For double that is simply faster, while for float and half
it is shorter, because mul uses constant bus and VOP3.
Differential Revision: https://reviews.llvm.org/D36856
llvm-svn: 312095
https://reviews.llvm.org/D36888
From that review description:
When an OrcMCJITReplacement object gets destructed, LazyEmitLayer may still
contain a shared_ptr of a module, which requires ShouldDelete in the deleter.
But ShouldDelete gets destructed before LazyEmitLayer due to the order of
declaration in OrcMCJITReplacement, which leads to a crash, when the destructor
of LazyEmitLayer is executed. Changing the order of declaration fixes this.
Patch by Moritz Kroll. Thanks Moritz!
llvm-svn: 312086
cantFail is the moral equivalent of an assertion that the wrapped call must
return a success value. This patch allows clients to include an associated
error message (the same way they would for an assertion for llvm_unreachable).
If the error message is not specified it will default to: "Failure value
returned from cantFail wrapped call".
llvm-svn: 312066
We don't have an intrinsic implemented for this instruction yet, but it looked odd that we were missing the accessor method from the subtarget.
llvm-svn: 312064
Summary:
When jumptable encoding does not match target code encoding (arm vs
thumb), a veneer is inserted by the linker. We can not avoid this
in all cases, because entries within one jumptable must have the same
encoding, but we can make it less common by selecting the jumptable
encoding to match the majority of its targets.
This change only covers FullLTO, and not ThinLTO.
Reviewers: pcc
Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D37171
llvm-svn: 312054
Summary:
Cross-DSO CFI needs all __cfi_check exports to use the same encoding
(ARM vs Thumb).
Reviewers: pcc
Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D37243
llvm-svn: 312052
This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition
fails to find a loop invariant condition. This is wrong and it will disable loop
unswitch for select. The patch fixes the bug.
Differential Revision: https://reviews.llvm.org/D36985
llvm-svn: 312045
Summary:
This reduces the number of build actions after a no-op commit from
thousands to about six, which should be acceptable. If six actions is
still too many, developers can disable the LLVM_APPEND_VC_REV cmake
option.
llvm-config.h is a widely included header that should rarely change.
Before this patch, it would change after every re-configure. Very few
users of llvm-config.h need to know the precise version, and those that
do can migrate to incorporating LLVM_REVISION as provided by
llvm/Support/VCSRevision.h.
This should bring LLVM back to the behavior that it had before r306858
from June 30 2017. Most LLVM tools will now print a version string like
"6.0.0svn" instead of "6.0.0-git-c40c2a23de4".
Fixes PR34308
Reviewers: pcc, rafael, hans
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D37272
llvm-svn: 312043
Summary:
Based on Fred's patch here: https://reviews.llvm.org/D6771
I can't seem to commandeer the old review, so I'm creating a new one.
With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:
DW_AT_location [DW_FORM_data4] (0x00000000
0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3
0x000000000000000b - 0x0000000000000012: DW_OP_consts +7
0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4
0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0)
And like this for debug_loc.dwo entries:
DW_AT_location [DW_FORM_sec_offset] (0x00000000
Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value
Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4)
Simple locations without ranges are printed inline:
DW_AT_location [DW_FORM_block1] (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0)
The debug_loc(.dwo) dumping in changed accordingly to factor the code.
Reviewers: dblaikie, aprantl, friss
Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D37123
llvm-svn: 312042
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.
This fixes PR34261.
The reland adds two extra checks to the original: It checks if the
DbgVariableLocation is valid before checking any of its fields, and
it only emits ranges with nonzero registers.
Reviewers: aprantl, rnk, zturner
Reviewed By: rnk
Subscribers: mgorny, llvm-commits, aprantl, hiraditya
Differential Revision: https://reviews.llvm.org/D36907
llvm-svn: 312034
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.
Reviewers: jmolloy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36841
llvm-svn: 312030
Summary:
QuarantineSizeMb is deprecated, and QuarantineChunksUpToSize has been added as a new tunable option.
Reviewers: cryptoad
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D37238
llvm-svn: 312025
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code.
One test didn't vectorize optimize properly due to another bug so a TODO has been added.
Differential Revision: https://reviews.llvm.org/D37253
llvm-svn: 312023
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).
Landing on behalf of Nirav Dave to unblock the 5.0.0 release.
Differential Revision: https://reviews.llvm.org/D37220
llvm-svn: 312022
Summary:
Remove some code that was no longer needed. The first FIXME is
stale since we long ago started using the index to drive importing,
rather than doing force importing based on linkage type. And
now with r309278, we no longer import any aliases.
Reviewers: dblaikie
Subscribers: inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D37266
llvm-svn: 312019
I forgot to specify -unroll-loop-peel, making this test not
really effective. While here, adjust some details (naming and
run line). Thanks to Sanjoy and Michael Z. for pointing out in
their post-commit reviews.
llvm-svn: 312015
Summary:
r296498 introduced a DenseSet to store function importing info.
Using this container causes a test failure in
test/Transform/SampleProfile/import.ll when in Reverse Iteration mode.
This patch orders IDs before iterating through this container.
Reviewers: danielcdh, mgrang
Reviewed By: danielcdh
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37246
llvm-svn: 312012
Since these aren't built by default unless building with coverage (and
even then they aren't built for the check target) they've managed to
bit rot a little.
This just fixes the build. See llvm.org/pr34314 for the plan on making
sure these don't bit rot again.
llvm-svn: 312011
This extends the set of resources parsed by llvm-rc by DIALOG and
DIALOGEX.
Additionally, three optional resource statements specific to these two
resources are added: CAPTION, FONT, and STYLE.
Thanks for Nico Weber for his original work in this area.
Differential Revision: https://reviews.llvm.org/D36905
llvm-svn: 312009
Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed.
Reviewers: tejohnson, xur, davidxl, djasper
Reviewed By: tejohnson
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D37252
llvm-svn: 312005
As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(),
but implemented using moveBefore() for symmetry/efficiency.
Differential Revision: https://reviews.llvm.org/D37239
llvm-svn: 312001
Support the selection of G_GLOBAL_VALUE in the PIC relocation model. For
simplicity we use the same pseudoinstructions for both Darwin and ELF:
(MOV|LDRLIT)_ga_pcrel(_ldr).
This is new for ELF, so it requires a small update to the ARM pseudo
expansion pass to make sure it adds the correct constant pool modifier
and add-current-address in the case of ELF.
Differential Revision: https://reviews.llvm.org/D36507
llvm-svn: 311992
When LSR processes code like
int accumulator = 0;
for (int i = 0; i < N; i++) {
accummulator += i;
use((double) accummulator);
}
It may decide to replace integer `accumulator` with a double Shadow IV to get rid
of casts. The problem with that is that the `accumulator`'s value may overflow.
Starting from this moment, the behavior of integer and double accumulators
will differ.
This patch strenghtens up the conditions of Shadow IV mechanism applicability.
We only allow it for IVs that are proved to be `AddRec`s with `nsw`/`nuw` flag.
Differential Revision: https://reviews.llvm.org/D37209
llvm-svn: 311986
Summary: Knights Landing, because it is Atom derived, has slow two memory operand instructions. Mark the Knights Landing CPU model accordingly.
Patch by David Zarzycki.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37224
llvm-svn: 311979
Summary:
Plugins can (and should) be enabled under mingw if we are building
libLLVM.dll, so this is just a missed case. This allows LLVMgold.dll to
be built now under mingw.
Reviewers: llvm-commits, pirama, beanz, chapuni
Reviewed By: chapuni
Subscribers: chapuni, mgorny
Differential Revision: https://reviews.llvm.org/D37116
llvm-svn: 311973
Summary:
Add support for autocompleting values of -std= by including
LangStandards.def. This patch relies on D36782, and is using two-stage
code generation.
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36820
llvm-svn: 311971
This fixes a problem introduced 311957, where the compiler would crash
with "fatal error: error in backend: unknown codeview register".
llvm-svn: 311969
This fixes an issue with the use of LLVM_PARALLEL_LINK_JOBS.
Original commit message:
macOS 10.13 added a new API (futimens). This API is only available on macOS 10.13
and later, but the cmake check we have in place only tests if the symbol is
present and ignores the availability attribute. Luckily we have new warning for
this and by making this warning an error the cmake check will return the correct
result.
See also rdar://problem/33992750.
Differential Revision: https://reviews.llvm.org/D37027
llvm-svn: 311965
This implements a fuzzer tool for instruction selection, as described
in my [EuroLLVM 2017 talk][1].
The fuzzer must be given both libFuzzer args and llc-like args to
configure the backend. For example, to fuzz AArch64 GlobalISel at -O0,
you could invoke like so:
llvm-isel-fuzzer <corpus dirs> -ignore_remaining_args=1 \
-mtriple arm64-apple-ios -global-isel -O0
If you would like to seed the fuzzer with an initial corpus, simply
provide a directory of valid LLVM bitcode (not textual IR) as one of
the corpus dirs.
[1]: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#2
llvm-svn: 311964
This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail.
Differential Revision: https://reviews.llvm.org/D37237
llvm-svn: 311960
In r311742 we marked the PCs array as used so it wouldn't be dead
stripped, but left the guard and 8-bit counters arrays alone since
these are referenced by the coverage instrumentation. This doesn't
quite work if we want the indices of the PCs array to match the other
arrays though, since elements can still end up being dead and
disappear.
Instead, we mark all three of these arrays as used so that they'll be
consistent with one another.
llvm-svn: 311959
This reverts commit 7c46b80c022e18d43c1fdafb117b0c409c5a6d1e.
r311552 broke lld buildbot because I've changed OptionInfos type from
ArrayRef to vector. However the bug is fixed, so I'll commit this again.
llvm-svn: 311958
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.
This fixes PR34261.
Reviewers: aprantl, rnk, zturner
Reviewed By: rnk
Subscribers: mgorny, llvm-commits, aprantl, hiraditya
Differential Revision: https://reviews.llvm.org/D36907
llvm-svn: 311957
This extends llvm-rc parsing tool by MENU resource
(msdn.microsoft.com/en-us/library/windows/desktop/aa381025(v=vs.85).aspx).
As for now, MENUEX
(msdn.microsoft.com/en-us/library/windows/desktop/aa381023(v=vs.85).aspx)
seems unnecessary.
Thanks for Nico Weber for his original work in this area.
Differential Revision: https://reviews.llvm.org/D36898
llvm-svn: 311956
Be more consistent with CreateFunctionLocalArrayInSection in the API
of CreatePCArray, and assign the member variable in the caller like we
do for the guard and 8-bit counter arrays.
This also tweaks the order of method declarations to match the order
of definitions in the file.
llvm-svn: 311955
macOS 10.13 added a new API (futimens). This API is only available on macOS 10.13
and later, but the cmake check we have in place only tests if the symbol is
present and ignores the availability attribute. Luckily we have new warning for
this and by making this warning an error the cmake check will return the correct
result.
See also rdar://problem/33992750.
Differential Revision: https://reviews.llvm.org/D37027
llvm-svn: 311949
This improves the current llvm-rc parser by the ability of parsing
ACCELERATORS statement.
Moreover, some small improvements to the original parsing commit
were made.
Thanks for Nico Weber for his original work in this area.
Differential Revision: https://reviews.llvm.org/D36894
llvm-svn: 311946
Add new predicate to more accurately model the scheduling around branches
and function calls and of loads and stores of pairs and integer
multiplications.
llvm-svn: 311944
Add new predicate to more accurately model the cost of arithmetic and
logical operations shifted left.
Differential revision: https://reviews.llvm.org/D37151
llvm-svn: 311943
We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars.
Differential Revision: https://reviews.llvm.org/D37232
llvm-svn: 311940
This extends the current llvm-rc parser by ICON and HTML resources.
Moreover, some tests have been slightly rewritten.
Thanks for Nico Weber for his original work in this area.
Differential Revision: https://reviews.llvm.org/D36891
llvm-svn: 311939
Summary:
STRQro* instructions are slower than the alternative ADD/STRQui expanded
instructions on Falkor, so avoid generating them unless we're optimizing
for code size.
Reviewers: t.p.northover, mcrosier
Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D37020
llvm-svn: 311931
When peeling kicks in, it updates the loop preheader.
Later, a successful full unroll of the loop needs to update a PHI
which i-th argument comes from the loop preheader, so it'd better look
at the correct block. Fixes PR33437.
Differential Revision: https://reviews.llvm.org/D37153
llvm-svn: 311922
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.
Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.
Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.
Differential Revision: https://reviews.llvm.org/D37030
llvm-svn: 311921
This fixes 2 problems in subregister hierarchies with multiple levels
and tuples:
1) For bigger tuples computing secondary subregs would miss 2nd order
effects. In the test case a register like `S10_S11_S12_S13_S14` with D5
= S10_S11, D6 = S12_S13 we would correctly compute sub0 = D5, sub1 = D6
but would miss the fact that we could now form ssub0_ssub1_ssub2_ssub3
(aka sub0_sub1) = D5_D6. This is fixed by changing
computeSecondarySubRegs() to compute a fixpoint.
2) Fixing 1) exposed a problem where TableGen would create multiple
names for effectively the same subregister index. In the test case
the subregister index sub0 is composed from ssub0 and ssub1, and sub1 is
composed from ssub2 and ssub3. TableGen should not create both sub0_sub1
and ssub0_ssub1_ssub2_ssub3 as infered subregister indexes. This changes
the code to build a transitive closure of the subregister components
before forming new concatenated subregister indexes.
This fix was developed for an out of tree target. For the in-tree
targets the only change is in the register information computed for ARM.
There is a slight chance this fixed/improved some register coalescing
around the QQQQ/QQ register classes there but I couldn't see/provoke any
code generation differences.
Differential Revision: https://reviews.llvm.org/D36913
llvm-svn: 311914
Adds a new --gen-register-info-debug-dump mode to tablegen that dumps various register related information:
- List of register classes with super and subclasses
- List of subregister indexes with lanemasks
- List of registers with subregisters
I will use this in an upcoming commit to create a test.
It may also be useful for target developers wanting to get an overview
of all the register related information, esp. the things inferred by
tablegen and not directly visible in the .td file.
Differential Revision: https://reviews.llvm.org/D36911
llvm-svn: 311913
Summary:
ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the
load destination registers to be split overlapped with the base register
if the base register was marked as killed. Since kill flags may not
always be present, this can lead to incorrect code.
This bug was exposed by my MachineCopyPropagation change D30751 breaking
the sanitizer-x86_64-linux-android buildbot.
Also clean up some dead code and add an assert that a register offset is
never encountered by this code, since it does not handle them correctly.
Reviewers: MatzeB, qcolombet, t.p.northover
Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37164
llvm-svn: 311907
Summary:
Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch makes this phi node to be created only when the return value has uses.
This patch is necessary to generate valid code, as compiler crashes with the attached test case without this patch. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in `cleanup` block.
I think existing implementation is good as far as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false.
Reviewers: xur, davidxl, danielcdh
Reviewed By: xur
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37176
llvm-svn: 311906