Commit Graph

752 Commits

Author SHA1 Message Date
Brendon Cahoon 57c3d4bed3 [Pipeliner] Fix incorrect loop carried dependence calculation
The isLoopCarriedDep function does not correctly compute loop
carried dependences when the array index offset is negative
or the stride is smaller than the access size.
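
The two failing cases named above can be illustrated with a brute-force model. This is only a sketch, not the logic of isLoopCarriedDep: the StridedAccess struct, the helper names, and the byte-range model are all hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical model: in iteration i an access touches the byte range
// [offset + i * stride, offset + i * stride + size).
struct StridedAccess {
  std::int64_t offset; // byte offset of the access in iteration 0
  std::int64_t stride; // bytes advanced per iteration
  std::int64_t size;   // bytes touched per access
};

// True if the half-open ranges [a, a + aSize) and [b, b + bSize) overlap.
bool rangesOverlap(std::int64_t a, std::int64_t aSize,
                   std::int64_t b, std::int64_t bSize) {
  return a < b + bSize && b < a + aSize;
}

// Brute-force check: does `later` in some iteration j > i touch bytes that
// `earlier` touched in iteration i, for iterations 0 .. tripCount - 1?
bool hasLoopCarriedOverlap(const StridedAccess &earlier,
                           const StridedAccess &later, int tripCount) {
  for (int i = 0; i < tripCount; ++i)
    for (int j = i + 1; j < tripCount; ++j)
      if (rangesOverlap(earlier.offset + i * earlier.stride, earlier.size,
                        later.offset + j * later.stride, later.size))
        return true;
  return false;
}
```

In this model, a second access at offset -4 with stride 4 and size 4 lands on the bytes the first access touched one iteration earlier, and two accesses with stride 2 and size 4 overlap from one iteration to the next.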

Patch by Denis Antrushin.

Differential Revision: https://reviews.llvm.org/D60135

llvm-svn: 358233
2019-04-11 21:57:51 +00:00
Simon Pilgrim d395bc1cc2 [Hexagon] Remove fcmp undef from reduced tests
Pre-commit for D60006 (Add fcmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @kparzysz (Krzysztof Parzyszek)

llvm-svn: 357301
2019-03-29 19:14:52 +00:00
Krzysztof Parzyszek 4719502941 Add more rotate tests, including ORs of rotates
This is a part of https://reviews.llvm.org/D47735.

llvm-svn: 356683
2019-03-21 17:14:22 +00:00
Simon Pilgrim 55e1330eda [Hexagon] Remove icmp undef from reduced tests
Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @kparzysz (Krzysztof Parzyszek)

llvm-svn: 356267
2019-03-15 15:07:44 +00:00
David Green ffc922ec35 [LSR] Attempt to increase the accuracy of LSR's setup cost
In some loops, we end up generating loop induction variables that look like:
  {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
As opposed to the simpler:
  {(zext i16 (%i0 * %i1) to i32),+,-1}
i.e. we count up from -limit to 0, not the simpler counting down from limit to
0. This is because the scores, as LSR calculates them, are the same and the
second is filtered in place of the first. We end up with a redundant SUB from 0
in the code.

This patch tries to calculate the setup cost a little more thoroughly,
recursing into the SCEV members to better approximate the setup
required. The cost function for comparing LSR costs is:

return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
       std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                C2.ScaleCost, C2.ImmCost, C2.SetupCost);
So this will only alter results if none of the other variables turn out to be
different.
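
A minimal sketch of why SetupCost acts only as a tie-breaker: std::tie builds tuples that compare lexicographically, so a later field is consulted only when every earlier field is equal. The Cost struct and isLess function below are hypothetical stand-ins that reuse the field names from the snippet above; they are not LLVM's actual classes.

```cpp
#include <tuple>

// Hypothetical stand-in for the cost record; only the field names are taken
// from the commit message.
struct Cost {
  unsigned NumRegs = 0, AddRecCost = 0, NumIVMuls = 0, NumBaseAdds = 0;
  unsigned ScaleCost = 0, ImmCost = 0, SetupCost = 0;
};

// Lexicographic comparison: fields are compared left to right, and the first
// field that differs decides the result.
bool isLess(const Cost &C1, const Cost &C2) {
  return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}
```

Two candidates that agree on every field except SetupCost are ordered by SetupCost, while a candidate with fewer registers wins regardless of how large its SetupCost is.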

Differential Revision: https://reviews.llvm.org/D58770

llvm-svn: 355597
2019-03-07 13:44:40 +00:00
Krzysztof Parzyszek 9c005bbdd4 [Hexagon] Avoid creating 5-instruction packets with vgather pseudos
Change the resource usage of the vgather pseudos from SLOT0+LD to
SLOT0+SLOT1.

llvm-svn: 355524
2019-03-06 17:43:50 +00:00
Krzysztof Parzyszek f6e875bacf [Hexagon] Use misaligned load instead of trap0(#0) for __builtin_trap
The trap instruction is intercepted by various runtime environments,
and instead of a crash it creates confusion.

This reapplies r354606 with a fix.

llvm-svn: 354611
2019-02-21 19:42:39 +00:00
Krzysztof Parzyszek 948c9f93c4 Revert r354606, it breaks asan tests
llvm-svn: 354609
2019-02-21 19:33:58 +00:00
Krzysztof Parzyszek 5f47fac3a2 [Hexagon] Use misaligned load instead of trap0(#0) for __builtin_trap
The trap instruction is intercepted by various runtime environments,
and instead of a crash it creates confusion.

llvm-svn: 354606
2019-02-21 18:39:22 +00:00
Krzysztof Parzyszek 6128ac5a8f [Hexagon] Split vector pairs for ISD::SIGN_EXTEND and ISD::ZERO_EXTEND
llvm-svn: 354473
2019-02-20 15:05:19 +00:00
Sanjay Patel 837552fe9f [PatternMatch] add special-case uaddo matching for increment-by-one (2nd try)
This is the most important uaddo problem mentioned in PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
...but that was overcome in x86 codegen with D57637.

That patch also corrects the inc vs. add regressions seen with the previous attempt at this.

Still, we want to make this matcher complete, so we can potentially canonicalize the pattern 
even if it's an 'add 1' operation.
Pattern matching, however, shouldn't assume that we have canonicalized IR, so we match 4 
commuted variants of uaddo.
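
As a rough illustration of the increment special case, the sketch below compares a general unsigned-add overflow check with two equivalent forms for adding one. The function names are hypothetical, and these source-level checks are only an analogy for the IR patterns; the exact set of commuted variants the matcher accepts is not reproduced here.

```cpp
#include <cstdint>

// General unsigned add-with-overflow check: the sum wrapped around if and
// only if it is smaller than an operand.
bool uaddOverflows(std::uint32_t a, std::uint32_t b) {
  std::uint32_t sum = a + b;
  return sum < a;
}

// Increment special case, phrased on the result: x + 1 wraps exactly when
// the result is zero.
bool incOverflowsViaResult(std::uint32_t x) {
  return static_cast<std::uint32_t>(x + 1u) == 0u;
}

// Increment special case, phrased on the operand: x + 1 wraps exactly when
// x is all ones.
bool incOverflowsViaOperand(std::uint32_t x) {
  return x == UINT32_MAX;
}
```

All three functions agree for every x when the second operand of the general form is 1, which is why an increment-by-one check can be treated as an unsigned add with overflow.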

There's also a test with a crazy type to show that the existing CGP transform based on this 
matcher is not limited by target legality checks.

I'm not sure if the Hexagon diff means the test is no longer testing what it intended to
test, but that should be solvable in a follow-up.

Differential Revision: https://reviews.llvm.org/D57516

llvm-svn: 352998
2019-02-03 16:16:48 +00:00
Brendon Cahoon 59d9973146 [Pipeliner] Add two pragmas to control software pipelining optimization
#pragma clang loop pipeline(disable)
  
    Disable SWP optimization for the next loop.
    “disable” is the only possible value.
  
#pragma clang loop pipeline_initiation_interval(number)
  
    Set the initiation interval for SWP optimization
    of the next loop to the specified number.
    The number must be a positive value
    greater than 0.
  
These pragmas can be used for debugging or for reducing
compile time. SWP can be disabled for specific loops to
save compilation time or to isolate bugs by not pipelining
those loops. The initiation interval can be set to a
specific value to save compilation time by skipping extra
pipeliner passes, or to check the schedule created for
that initiation interval.
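
A minimal usage sketch of the two pragmas follows. The function names and the interval value 10 are made up for illustration, and the preprocessor guard (Clang 9 or newer) is an assumption about which compilers accept these pragma options; other compilers simply build the plain loops.

```cpp
// Sum an array while asking the software pipeliner to skip the loop.
int sumNoPipeline(const int *data, int n) {
  int total = 0;
#if defined(__clang__) && __clang_major__ >= 9
#pragma clang loop pipeline(disable)
#endif
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// Sum an array while requesting a specific initiation interval (10 is an
// arbitrary example value) for the next loop.
int sumWithInterval(const int *data, int n) {
  int total = 0;
#if defined(__clang__) && __clang_major__ >= 9
#pragma clang loop pipeline_initiation_interval(10)
#endif
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```

The pragmas are hints about scheduling only, so both functions return the same sum as an unannotated loop.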

This is the LLVM part of the fix.

Clang part of fix: https://reviews.llvm.org/D55710

Patch by Alexey Lapshin!

Differential Revision: https://reviews.llvm.org/D56403

llvm-svn: 351923
2019-01-23 03:26:10 +00:00
James Y Knight 693d39dd12 Remove irrelevant references to legacy git repositories from
compiler identification lines in test-cases.

(Doing so only because it's then easier to search for references which
are actually important and need fixing.)

llvm-svn: 351200
2019-01-15 16:18:52 +00:00
Sanjay Patel 4b537aaf6d [DAGCombiner] allow narrowing of add followed by truncate
trunc (add X, C ) --> add (trunc X), C'

If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type.
This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine).

This change used to show regressions for x86, but those are gone after D55494. 
This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) 
that does almost the same thing.
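
The identity behind this transform can be checked with ordinary fixed-width integers. The sketch below assumes a 32-bit source type and an 8-bit destination type purely for illustration; the function names are hypothetical, and the code models only the arithmetic, not the DAG combine.

```cpp
#include <cstdint>

// Wide form: add in 32 bits, then keep only the low 8 bits.
std::uint8_t truncOfAdd(std::uint32_t x, std::uint32_t c) {
  return static_cast<std::uint8_t>(x + c);
}

// Narrow form: truncate both operands first, then add in 8 bits.
std::uint8_t addOfTrunc(std::uint32_t x, std::uint32_t c) {
  std::uint8_t nx = static_cast<std::uint8_t>(x);
  std::uint8_t nc = static_cast<std::uint8_t>(c); // plays the role of C'
  return static_cast<std::uint8_t>(nx + nc);
}
```

The low bits of a sum depend only on the low bits of the operands, so both forms produce the same byte for any inputs.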

Differential Revision: https://reviews.llvm.org/D55866

llvm-svn: 350006
2018-12-22 17:10:31 +00:00
Krzysztof Parzyszek 30c42e2ab6 [Hexagon] Add patterns for funnel shifts
llvm-svn: 349770
2018-12-20 16:39:20 +00:00
Sanjay Patel f24900b934 [DAGCombiner] allow hoisting vector bitwise logic ahead of truncates
The transform performs a bitwise logic op in a wider type followed by
truncate when both inputs are truncated from the same source type:
logic_op (truncate x), (truncate y) --> truncate (logic_op x, y)

There are a bunch of other checks that should prevent doing this when 
it might be harmful.

We already do this transform for scalars in this spot. The vector 
limitation was shared with a check for the case when the operands are 
extended. I'm not sure if that limit is needed either, but that would 
be a separate patch.
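
The equivalence this transform relies on can be sketched with fixed-size arrays standing in for vectors. The element types (32-bit truncated to 16-bit), the four-lane width, the choice of XOR, and all names below are assumptions for illustration, not the DAGCombiner code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Wide = std::array<std::uint32_t, 4>;
using Narrow = std::array<std::uint16_t, 4>;

// Lane-wise truncation from 32 bits to 16 bits.
Narrow truncate(const Wide &v) {
  Narrow out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = static_cast<std::uint16_t>(v[i]);
  return out;
}

// logic_op (truncate x), (truncate y)
Narrow xorOfTruncates(const Wide &x, const Wide &y) {
  Narrow tx = truncate(x), ty = truncate(y), out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint16_t>(tx[i] ^ ty[i]);
  return out;
}

// truncate (logic_op x, y)
Narrow truncateOfXor(const Wide &x, const Wide &y) {
  Wide wide{};
  for (std::size_t i = 0; i < wide.size(); ++i)
    wide[i] = x[i] ^ y[i];
  return truncate(wide);
}
```

Bitwise operations act on each bit independently, so discarding the high bits before or after the operation gives the same lanes; the hoisted form performs one truncate instead of two.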

Differential Revision: https://reviews.llvm.org/D55448

llvm-svn: 349303
2018-12-16 14:57:04 +00:00
Krzysztof Parzyszek 26d994f56e [Hexagon] Add patterns for shifts of v2i16
This fixes https://llvm.org/PR39983.

llvm-svn: 349202
2018-12-14 22:33:48 +00:00
Sanjay Patel 25fc03c5c0 [Hexagon] make test immune to scalarization improvements; NFC
llvm-svn: 349163
2018-12-14 17:23:01 +00:00
Daniel Sanders 9f3cf55e63 [mir] Serialize DILocation inline when not possible to use a metadata reference
Summary:
Sometimes MIR-level passes create DILocations that were not present in the
LLVM-IR. For example, it may merge two DILocations together to produce a
DILocation that points to line 0.

Previously, the addresses of these DILocations were printed, which prevented the
MIR from being read back into LLVM. With this patch, DILocations will use
metadata references where possible and fall back on serializing them inline like so:
    MOV32mr %stack.0.x.addr, 1, _, 0, _, %0, debug-location !DILocation(line: 1, scope: !15)

Reviewers: aprantl, vsk, arphaman

Reviewed By: aprantl

Subscribers: probinson, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D55243

llvm-svn: 349035
2018-12-13 14:25:27 +00:00
Krzysztof Parzyszek 9f003f9262 [Hexagon] Couple of fixes in optimize addressing mode
- Check if an operand is an immediate before calling getImm. Some operands
  that take constant values can actually have global symbols or other
  constant expressions.
- When a load-constant instruction can be folded into users, make sure to
  only delete it when all users have been successfully converted.

llvm-svn: 348802
2018-12-10 21:56:04 +00:00
Krzysztof Parzyszek c1b2d5905a Revert "[Hexagon] Check if operand is an immediate before getImm"
This reverts r348787. The patch wasn't quite correct.

llvm-svn: 348792
2018-12-10 19:30:08 +00:00
Krzysztof Parzyszek c6e9380a56 [Hexagon] Check if operand is an immediate before getImm
llvm-svn: 348787
2018-12-10 18:39:47 +00:00
Krzysztof Parzyszek b754f7a2e0 [Hexagon] Fix post-ra expansion of PS_wselect
llvm-svn: 348655
2018-12-07 22:00:53 +00:00
Krzysztof Parzyszek 8eb394d764 [Hexagon] Add intrinsics for Hexagon V66
llvm-svn: 348413
2018-12-05 21:14:51 +00:00
Krzysztof Parzyszek 545a68ca4b [Hexagon] Add instruction definitions for Hexagon V66
llvm-svn: 348411
2018-12-05 21:01:07 +00:00
Krzysztof Parzyszek 44c1f81b27 [Hexagon] Switch to auto-generated intrinsic definitions and patterns
llvm-svn: 348206
2018-12-03 22:40:36 +00:00
Sanjay Patel 08c0a0ac58 [Hexagon] make test immune to improvements in undef simplification
llvm-svn: 347218
2018-11-19 15:34:09 +00:00
Sanjay Patel cb04e590d3 [Hexagon] make tests immune to improvements in undef simplification
llvm-svn: 347165
2018-11-18 16:50:16 +00:00
Stanislav Mekhanoshin 0ff7c8309d DAG combiner: fold (select, C, X, undef) -> X
Differential Revision: https://reviews.llvm.org/D54646

llvm-svn: 347110
2018-11-16 23:13:38 +00:00
Brendon Cahoon ac8fed68d5 [Hexagon] Implement noreturn optimization
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.

Differential Revision: https://reviews.llvm.org/D54210

llvm-svn: 346532
2018-11-09 18:16:24 +00:00
Krzysztof Parzyszek 8567de0871 [Hexagon] Place globals with explicit .sdata section in small data
Both -fPIC and -G0 disable placement of globals in small data section,
but if a global has an explicit section assignment placing it in small
data, it should go there anyway.

llvm-svn: 346523
2018-11-09 17:31:22 +00:00
Krzysztof Parzyszek f070544f8e [Hexagon] Do not reduce load size for globals in small-data
Small-data (i.e. GP-relative) loads and stores allow 16-bit scaled
offset. For a load of a value of type T, the small-data area is
equivalent to an array "T sdata[65536]". This implies that objects
of smaller sizes need to be closer to the beginning of sdata,
while larger objects may be farther away, or otherwise the offset
may be insufficient to reach it. Similarly, an object of a larger
size should not be accessed via a load of a smaller size.
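
The reach argument above can be made concrete with a little arithmetic. The sketch below takes the "T sdata[65536]" picture from the message literally, treating the offset as an index from 0 to 65535 scaled by the access size; the function names are hypothetical and this is not the Hexagon backend's code.

```cpp
#include <cstdint>

// Number of bytes reachable from the start of small data by an access of
// `accessSize` bytes, modelling the area as T sdata[65536].
std::uint64_t smallDataReach(std::uint64_t accessSize) {
  return 65536u * accessSize;
}

// An object at `byteOffset` can be addressed by an access of `accessSize`
// bytes only if the offset is a multiple of the access size and the scaled
// index fits in the 65536-entry range.
bool isReachable(std::uint64_t byteOffset, std::uint64_t accessSize) {
  return byteOffset % accessSize == 0 && byteOffset / accessSize < 65536u;
}
```

In this model a byte access reaches only the first 65536 bytes, while a 4-byte access reaches 262144 bytes. An object placed for 4-byte access at byte offset 65536 is therefore out of range for a byte-sized load, which is the case the commit avoids.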

llvm-svn: 345975
2018-11-02 14:17:47 +00:00
Daniel Sanders 29ca764492 [MC] Implement EmitRawText in MCNullStreamer
Summary:
This adds a dummy implementation of `EmitRawText` in `MCNullStreamer`.

This fixes the behavior of `AsmPrinter` with `MCNullStreamer` on targets
on which no integrated assembler is used. An attempt to emit inline asm
on such a target would previously lead to a crash, since `AsmPrinter` does not
check for `hasRawTextSupport` in `EmitInlineAsm` and calls `EmitRawText`
anyway if integrated assembler is disabled (the behavior has changed
in D2686).

Error message printed by MCStreamer:

> EmitRawText called on an MCStreamer that doesn't support it, something
> must not be fully mc'ized

Patch by Eugene Sharygin

Reviewers: dsanders, echristo

Reviewed By: dsanders

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53938

llvm-svn: 345841
2018-11-01 15:41:11 +00:00
Krzysztof Parzyszek 977a1fe507 [Hexagon] Make sure not to use GP-relative addressing with PIC
Make sure that -relocation-model=pic prevents use of GP-relative
addressing modes.

llvm-svn: 345731
2018-10-31 15:54:31 +00:00
Matthias Braun a83403892a MachineOperand/MIParser: Do not print debug-use flag, infer it
The debug-use flag must be set exactly for uses on DBG_VALUEs.  This is
so obvious that it can be trivially inferred while parsing. This will
reduce noise when printing while omitting information that has little
value to the user.

The parser will keep recognizing the flag for compatibility with old
`.mir` files.

Differential Revision: https://reviews.llvm.org/D53903

llvm-svn: 345671
2018-10-30 23:28:27 +00:00
Fangrui Song 5300c2e0ea [Pipeliner] Mark swp-art-deps-rec.ll as REQUIRES: asserts after rL345319
llvm-svn: 345359
2018-10-26 03:15:56 +00:00
Sumanth Gundapaneni ada0f511ba [Pipeliner] Ignore Artificial dependences while computing recurrences.
The artificial dependencies are not real dependencies. In some cases, they
form circuits with bigger MII. However, they are used to schedule instructions
better.

Differential Revision: https://reviews.llvm.org/D53450

llvm-svn: 345319
2018-10-25 21:27:08 +00:00
Justin Bogner 912adfba7e Reapply "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units"
Recommits r342942, which was reverted in r343189, with a fix for an
issue where we would propagate unsafely if we defined only the upper
part of a register.

Original message:

  Change the copy tracker to keep a single map of register units
  instead of 3 maps of registers. This gives a very significant
  compile time performance improvement to the pass. I measured a
  30-40% decrease in time spent in MCP on x86 and AArch64 and much
  more significant improvements on out of tree targets with more
  registers.

llvm-svn: 344942
2018-10-22 19:51:31 +00:00
Krzysztof Parzyszek 6bfc6577f2 [Hexagon] Remove support for V4
llvm-svn: 344791
2018-10-19 17:31:11 +00:00
Fangrui Song d243ac6c4b [pipeliner] Fix test added in rL344748 to require asserts
llvm-svn: 344775
2018-10-19 06:20:01 +00:00
Sumanth Gundapaneni 62ac69d45c [Pipeliner] copyToPhi DAG Mutation to improve scheduling.
In a loop, create artificial dependences from the source of a
COPY/REG_SEQUENCE to its use in the next iteration.

Eg:
SRC ----Data Dep--> COPY
COPY ---Anti Dep--> PHI (implies, to be used in next iteration)
PHI ----Data Dep--> USE

This patch creates
USE ----Artificial Dep---> SRC

This will effectively schedule the COPY late to eliminate additional copies.
Before this patch, the schedule can be
SRC, COPY, USE : The COPY is used in next iteration and it needs to be
preserved.

After this patch, the schedule can be
USE, SRC, COPY : The COPY is used in next iteration and the live interval is
reduced.

Differential Revision: https://reviews.llvm.org/D53303

llvm-svn: 344748
2018-10-18 15:51:16 +00:00
Bjorn Pettersson 064944352e [TwoAddressInstructionPass] Replace subregister uses when processing tied operands
Summary:
TwoAddressInstruction pass typically rewrites
  %1:short = foo %0.sub_lo:long
as
  %1:short = COPY %0.sub_lo:long
  %1:short = foo %1:short
when having tied operands.

If there are extra un-tied operands that use the same reg and
subreg, such as the second and third inputs to fie here:
  %1:short = fie %0.sub_lo:long, %0.sub_hi:long, %0.sub_lo:long
then there was a bug which replaced the register %0 also for
the un-tied operand, but without changing the subregister indices.
So we used to get:
  %1:short = COPY %0.sub_lo:long
  %1:short = fie %1, %1.sub_hi:short, %1.sub_lo:short
With this fix we instead get:
  %1:short = COPY %0.sub_lo:long
  %1:short = fie %1, %0.sub_hi:long, %1

Reviewers: arsenm, JesperAntonsson, kparzysz, MatzeB

Reviewed By: MatzeB

Subscribers: bjope, kparzysz, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D36224

llvm-svn: 344492
2018-10-15 08:36:03 +00:00
Sumanth Gundapaneni a4a9155e4f [Hexagon] Restrict compound instructions with constant value.
Having a constant value operand in the compound instruction
is not always profitable. This patch improves coremark by ~4% on
Hexagon.

Differential Revision: https://reviews.llvm.org/D53152

llvm-svn: 344284
2018-10-11 19:48:15 +00:00
Nirav Dave 07acc992dc [DAGCombine] Improve Load-Store Forwarding
Summary:
Extend analysis forwarding loads from preceding stores to work with
extended loads and truncated stores to the same address so long as the
load is fully subsumed by the store.

Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll tests are
deleted as they no longer seem to be relevant.

Reviewers: RKSimon, rnk, kparzysz, javed.absar

Subscribers: sdardis, nemanjai, hiraditya, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D49200

llvm-svn: 344142
2018-10-10 14:15:52 +00:00
Krzysztof Parzyszek 528aff3372 [Hexagon] Fix extracting subvectors of non-HVX vNi1
Patch by Brendon Cahoon.

llvm-svn: 343596
2018-10-02 15:05:43 +00:00
Krzysztof Parzyszek 6d569a2cc4 [Hexagon] Remove incorrect pattern for swiz
The pattern had a couple of problems:
- It was checking for loads of bytes in the reverse order to what it
  should have been looking for.
- It would replace loads of bytes with a load of a word without making
  sure that the alignment was correct.

Thanks to Eli Friedman for pointing it out.

llvm-svn: 343514
2018-10-01 18:24:40 +00:00
Bjorn Pettersson b2154af25f [MachineVerifier] Relax checkLivenessAtDef regarding dead subreg defs
Summary:
Consider an instruction that has multiple defs of the same
vreg, but defining different subregs:
  %7.sub1:rc, dead %7.sub2:rc = inst

Calling checkLivenessAtDef for the live interval associated
with %7 incorrectly reported "live range continues after a
dead def". The live range for %7 has a dead def at the slot
index for "inst" even if the live range continues (given that
there are later uses of %7.sub1).

This patch adjusts MachineVerifier::checkLivenessAtDef
to allow dead subregister definitions, unless we are checking
a subrange (when tracking subregister liveness).

A limitation is that we do not detect the situation when the
live range continues past an instruction that defines the
full virtual register by multiple dead subreg defines.

I also removed some dead code related to physical registers
in checkLivenessAtDef. We only call that method for virtual
registers, so I added an assertion instead.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52237

llvm-svn: 342618
2018-09-20 06:59:18 +00:00
Michael Berg 894c39f770 Copy utilities updated and added for MI flags
Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path.  Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw.

Reviewers: spatel, wristow, arsenm

Reviewed By: arsenm

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D52006

llvm-svn: 342576
2018-09-19 18:52:08 +00:00
Krzysztof Parzyszek c1e2f39b35 [PostRASink] Make sure to remove subregisters from live-ins as well
llvm-svn: 342492
2018-09-18 16:10:51 +00:00
Krzysztof Parzyszek a6d4fc0e29 [Hexagon] Use shuffles when lowering "gather" shufflevectors
Shufflevector instructions in LLVM IR that extract a subset of elements
of a longer input into a shorter vector can be done using VECTOR_SHUFFLEs.
This will avoid expanding them into costly extracts and inserts.

llvm-svn: 342091
2018-09-12 22:14:52 +00:00