Commit Graph

137241 Commits

Author SHA1 Message Date
Craig Topper abe80cc04d [AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have AVX512F/AVX512VL.
Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available.

Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though.

llvm-svn: 279929
2016-08-28 06:06:28 +00:00
Craig Topper 8046e2033e [AVX-512] Add tests to show that we don't select masked logic ops if there are bitcasts between the logic op and the select.
This is taken from optimized IR of clang test cases for masked logic ops.

llvm-svn: 279928
2016-08-28 06:06:24 +00:00
Craig Topper 8877a026e4 [X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX instructions instead of ending 128/256. NFC
llvm-svn: 279927
2016-08-28 06:06:21 +00:00
Jan Vesely 38814fa2fd AMDGPU/R600: Enable Load combine
Fix and improve tests

Differential Revision: https://reviews.llvm.org/D23899

llvm-svn: 279925
2016-08-27 19:09:43 +00:00
Craig Topper 6943aa306e [X86] Rename predicate function that detects if requires one of the REX.B, REX.X or REX.R bits. It's old name conflicted with a function in X8II namespace that doesnt' quite do the same thing. NFC
llvm-svn: 279924
2016-08-27 17:13:43 +00:00
Craig Topper 45793a1f7a [X86] Keep looping over operands looking for byte registers even if we already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL.
llvm-svn: 279923
2016-08-27 17:13:41 +00:00
Craig Topper 6acca80e17 [X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels more consistent with its name and simplifies assembler code.
llvm-svn: 279922
2016-08-27 17:13:37 +00:00
Craig Topper 06c60c067f [X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test for CR8-CR15.
llvm-svn: 279921
2016-08-27 17:13:34 +00:00
Craig Topper ed71f04abb [X86] Remove stale comment about FixupBWInsts pass being off by default. NFC
llvm-svn: 279915
2016-08-27 05:26:54 +00:00
Craig Topper 225da2cb84 [AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts.
llvm-svn: 279914
2016-08-27 05:22:15 +00:00
Craig Topper 144fdef66b [X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted.
llvm-svn: 279913
2016-08-27 05:22:12 +00:00
Craig Topper 4891c724aa [AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd.
llvm-svn: 279912
2016-08-27 05:22:08 +00:00
Teresa Johnson e2e621a36c [LTO] Don't create a new common unless merged has different size
Summary:
This addresses a regression in common handling from the new LTO
API in r278338. Only create a new common if the size is different.
The type comparison against an array type fails when the size is
different but not an array. GlobalMerge does not handle the
array types as well and we lose some global merging opportunities.

Reviewers: mehdi_amini

Subscribers: junbuml, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23955

llvm-svn: 279911
2016-08-27 04:41:22 +00:00
Matt Arsenault a15ea4e217 AMDGPU: Mark sched model complete
Fixes bug 26800

llvm-svn: 279910
2016-08-27 03:39:27 +00:00
Matt Arsenault 71ed8a67e8 AMDGPU: Remove unneeded implicit exec uses/defs
SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec.
SI_IF_BREAK and SI_ELSE_BREAK do not read it either.

llvm-svn: 279909
2016-08-27 03:00:51 +00:00
Lang Hames 60110f542f [Orc] Explicitly specify type for assignment.
This should fix the MSVC errors in
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15120

llvm-svn: 279908
2016-08-27 02:59:24 +00:00
Sebastian Pop 4660199a33 GVN-hoist: invalidate MD cache (PR29144)
Without invalidating the entries in the MD cache we would try to access instructions
that were removed in previous iterations of hoisting.

Differential Revision: https://reviews.llvm.org/D23927

llvm-svn: 279907
2016-08-27 02:48:41 +00:00
Quentin Colombet acb857b831 [RegBankSelect] Do not abort when the target wants to fall back.
llvm-svn: 279906
2016-08-27 02:38:27 +00:00
Quentin Colombet 948abf0a0f [InstructionSelect] Do not abort when the target wants to fall back.
llvm-svn: 279905
2016-08-27 02:38:24 +00:00
Quentin Colombet 5e60bcdeaf [MachineLegalize] Do not abort when the target wants to fall back.
llvm-svn: 279904
2016-08-27 02:38:21 +00:00
Matt Arsenault 2712d4a3d8 AMDGPU: Select mulhi 24-bit instructions
llvm-svn: 279902
2016-08-27 01:32:27 +00:00
Matt Arsenault 22e417956d AMDGPU: Move cndmask pseudo to be isel pseudo
There's only one use of this for the convenience
of a pattern. I think v_mov_b64_pseudo should also be
moved, but SIFoldOperands does currently make use of it.

llvm-svn: 279901
2016-08-27 01:00:37 +00:00
Matt Arsenault e949744474 AMDGPU: Fix sched type for branches
llvm-svn: 279900
2016-08-27 00:51:02 +00:00
Matt Arsenault f98a596954 AMDGPU: Remove register operand from si_mask_branch
It isn't used for anything, and is also misleading since
it could be spilled at the end of the block, so it can't be relied
on. There ends up being a verifier error about using an undefined
register since the spill kills the register.

llvm-svn: 279899
2016-08-27 00:42:21 +00:00
Matt Arsenault 00e102baf4 AMDGPU: Improve error reporting for maximum branch distance
Unfortunately this seems to only help the assembler diagnostic.

llvm-svn: 279895
2016-08-27 00:21:22 +00:00
Chris Bieneman bc3940e7ec [CMake] Only generate Components.cmake if components are specified
Generating the Components import file is useless if there are no components coming in from the runtimes configuration, so we should skip generation in that case.

This also should fix the configuration error that Renato reported on llvm-dev.

llvm-svn: 279893
2016-08-27 00:19:51 +00:00
Lang Hames 28fa3c519c [ORC] Fix typo in LogicalDylib, add unit test.
llvm-svn: 279892
2016-08-27 00:19:05 +00:00
Quentin Colombet 374796d678 [GlobalISel] Add a fallback path to SDISel.
When global-isel fails on a MachineFunction MF, MF will be cleaned up
and given to SDISel.
Thanks to this fallback, we can already perform correctness test even if
we support only a small portion of the functions in a test.

llvm-svn: 279891
2016-08-27 00:18:31 +00:00
Quentin Colombet a94caa5673 [AArch64][CallLowering] Do not assert for not implemented part.
When doing the ABI lowering, report a failure to the caller instead of
asserting. This gives a chance for the caller to recover.

llvm-svn: 279890
2016-08-27 00:18:28 +00:00
Quentin Colombet 6049524d37 [GlobalISel] Teach the core pipeline not to run if ISel failed.
llvm-svn: 279889
2016-08-27 00:18:24 +00:00
Michael Kuperstein aea50f8b84 [X86] Add baseline test for "odd" shuffles. NFC.
Adds a baseline test for lowering shuffles where the width of the output
vector is not twice the size of the input vectors. Many of those sequences
are suboptimal, and will hopefully be improved in follow-up patches.

llvm-svn: 279888
2016-08-27 00:10:24 +00:00
Quentin Colombet 3bb32cc79c [IRTranslator] Do not abort when the target wants to fall back.
Every pass in the GlobalISel pipeline will need to do something similar.

llvm-svn: 279886
2016-08-26 23:49:05 +00:00
Quentin Colombet e076d3094c [MFProperties] Introduce a FailedISel property.
This is used to communicate that the instruction selection pipeline
failed at some point.
Another way to achieve that would be to have some kind of conditional
scheduling in the PassManager, such that we only schedule a pass based
on the success/failure of another one. The property approach has the
advantage of being lightweight and solve the problem at stake.

llvm-svn: 279885
2016-08-26 23:49:01 +00:00
Teresa Johnson 26a462877b [ThinLTO] Move loading of cache entry to client
Summary:
Have the cache pass back the path to the cache entry when it
is ready to be loaded, instead of a buffer.

For gold-plugin we can simply pass this file back to gold directly,
which avoids expensive writing of a separate tmp file. Ensure
the cache entry is not deleted on cleanup by adjusting the setting
of the IsTemporary flags.

Moved the loading of the buffer into llvm-lto2 to maintain current
behavior.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23946

llvm-svn: 279883
2016-08-26 23:29:14 +00:00
Andrew Kaylor 3aeda4fcfb Adding document describing the use of the -opt-bisect-limit option.
llvm-svn: 279881
2016-08-26 23:11:48 +00:00
Quentin Colombet 0de43b225f [TargetPassConfig] Add a target hook to know what GlobalISel should do on error.
By default, this hook tells GlobalISel to abort (report a fatal error)
when it encounters an error. The alternative will be to fall back on
SDISel.
This fall back will be removed when the bring-up of GlobalISel is over.

llvm-svn: 279879
2016-08-26 22:32:59 +00:00
Quentin Colombet 1d0cb6f107 [IRTranslator][NFC] Use DEBUG_TYPE instead of repeating the name.
llvm-svn: 279878
2016-08-26 22:32:57 +00:00
Quentin Colombet e063e1f68a [SelectionDAG] Do not run the ISel process on already selected code.
Right now, this cannot happen, but with the fall back path of GlobalISel
it will show up eventually.

llvm-svn: 279877
2016-08-26 22:32:55 +00:00
Quentin Colombet 380cd3eb23 [MachineFunction] Introduce a reset method.
This method allows to reset the state of a MachineFunction as if it was
just created. This will be used during the bring-up of GlobalISel to
provide a way to fallback on SelectionDAG. That way, we can start doing
correctness testing even if we are not able to select all functions via
the global instruction selector.

llvm-svn: 279876
2016-08-26 22:32:53 +00:00
Justin Bogner 39b6b2f0b0 TableGen: Switch from a std::map to a DenseMap in CodeGenSubRegIndex. NFC
This mapping is between pointers, which DenseMap is particularly good
at. Most targets aren't really affected, but if there's a lot of
subregister composition this can shave off a good chunk of time from
generating registers.

llvm-svn: 279875
2016-08-26 22:29:36 +00:00
Quentin Colombet e609a9a80a [MFProperties] Introduce a reset method with no argument.
This method allows to reset all the properties in one go.

llvm-svn: 279874
2016-08-26 22:09:11 +00:00
Quentin Colombet c437aa9c26 [MFProperties][NFC] Rename clear into reset to match BitVector naming.
The name clear is used to reset all the bit in bitvectors and using it
to reset just properties was confusing.

llvm-svn: 279873
2016-08-26 22:09:08 +00:00
Tom Stellard e175d8aba5 AMDGPU/SI: Canonicalize offset order for merged DS instructions
Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to scheduler loads together
without out explicitly clustering them, which would give us non-sorted
offsets.

Also, we will want to do this if we move the load/store optimizer before
the scheduler.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23776

llvm-svn: 279870
2016-08-26 21:36:47 +00:00
Tom Stellard 4b5cd87ed3 XXX
llvm-svn: 279868
2016-08-26 21:16:40 +00:00
Tom Stellard 7c463c9168 AMDGPU/SI: Use a better method for determining the largest pressure sets
Summary:
There are a few different sgpr pressure sets, but we only care about
the one which covers all of the sgprs.  We were using hard-coded
register pressure set names to determine the reg set id for the
biggest sgpr set.  However, we were using the wrong name, and this
method is pretty fragile, since the reg pressure set names may
change.

The new method just looks for the pressure set that contains the most
reg units and sets that set as our SGPR pressure set.  We've also
adopted the same technique for determining our VGPR pressure set.

Reviewers: arsenm

Subscribers: MatzeB, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23687

llvm-svn: 279867
2016-08-26 21:16:37 +00:00
Chris Bieneman ef2ab69315 [CMake] Expose runtime component check targets
This will expose the check targets for runtime project components into the top-level build. It will enable exposing targets like check-asan.

llvm-svn: 279861
2016-08-26 20:34:11 +00:00
Adam Nemet cef3314156 [Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.

Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered.  I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off.  As before without -Rpass-with-hotness
there is no overhead.

Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'.  As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.

Reviewers: eraman, chandlerc, davidxl, hfinkel

Subscribers: tejohnson, Prazek, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D23415

llvm-svn: 279860
2016-08-26 20:21:05 +00:00
Rafael Espindola 7775c3310c Make writeToResolutionFile a static helper.
llvm-svn: 279859
2016-08-26 20:19:35 +00:00
Kyle Butt 723aa1327c TailDuplication: Record blocks that received the duplicated block. NFC.
This will allow tail duplication during layout to handle the cfg changes more
cleanly.

llvm-svn: 279858
2016-08-26 20:12:40 +00:00
Chris Bieneman c527a49b98 [CMake] Fixing LLVM_INCLUDE_TESTS for runtimes directory
We need to explicitly pass LLVM_INCLUDE_TESTS through from the top-level to the runtimes configuration because it isn't in LLVMConfig.cmake

llvm-svn: 279857
2016-08-26 20:08:57 +00:00