Commit Graph

47232 Commits

Author SHA1 Message Date
Craig Topper 924f20262b [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.

Differential Revision: https://reviews.llvm.org/D37158

llvm-svn: 312382
2017-09-01 21:27:34 +00:00
Sanjay Patel f4425e9a66 [x86] eliminate redundant shuffle of horizontal math ops when both inputs are the same
This is limited to a set of patterns based on the example in PR34111:
https://bugs.llvm.org/show_bug.cgi?id=34111
...but as I was investigating this, I see that horizontal patterns can go wrong in many, 
many other ways that would not be handled by this patch. Each data type may even go 
different in the DAG after starting with the same basic IR pattern, so even proper IR 
canonicalization won't fix it all.

Differential Revision: https://reviews.llvm.org/D37357

llvm-svn: 312379
2017-09-01 21:09:04 +00:00
Zachary Turner abb17cc084 [llvm-pdbutil] Support dumping CodeView from object files.
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB.  However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views.  Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match.  llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.

Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files.  In the future we could
perhaps rename this tool llvm-cvutil.

In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.

llvm-svn: 312358
2017-09-01 20:06:56 +00:00
Davide Italiano c36039f462 [TTI] Fix getGEPCost() for geps with a single operand.
Previously this would sporadically crash as TargetType
was never initialized. We special-case the single-operand
case returning earlier and trying to mimic the behaviour of
isLegalAddressingMode as closely as possible.

Differential Revision:  https://reviews.llvm.org/D37277

llvm-svn: 312357
2017-09-01 19:54:08 +00:00
Daniel Berlin 86932104db NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops
Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context.

Reviewers: davide, mcrosier

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D37174

llvm-svn: 312352
2017-09-01 19:20:18 +00:00
Matt Arsenault efa1d655d4 AMDGPU: Add ds_{read|write}_addtid_b32 definitions
llvm-svn: 312349
2017-09-01 18:38:02 +00:00
Matthias Braun cebdb17522 LiveIntervalAnalysis: Fix alias regunit reserved definition
A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.

LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased register
there. Looking at ARM:

FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved,
however given that we uses of the reserved FPSCR potentially violate
some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.

This patch:
- Defines a register unit as reserved when: At least for one root
  register, the root register and all its super registers are reserved.
- Adjust LiveIntervals::computeRegUnitRange() for new reserved
  definition.
- Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way
  of testing.
- Stop computing LiveRanges for reserved register units in HMEditor even
  with UpdateFlags enabled.
- Skip verification of uses of reserved reg units in the machine
  verifier (this usually didn't happen because there would be no cached
  liverange but there is no guarantee for that and I would run into this
  case before the HMEditor tweak, so may as well fix the verifier too).

Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are hexagons
P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved
registers in an alias.

Differential Revision: https://reviews.llvm.org/D37356

llvm-svn: 312348
2017-09-01 18:36:26 +00:00
Matt Arsenault ed6e8f0a90 AMDGPU: Add most d16 load/store instruction definitions
Doesn't include the tied operand necessary for the loads,
but is enough for the assembler to work.

llvm-svn: 312347
2017-09-01 18:36:06 +00:00
Sam Clegg 13a2e89926 [WebAssembly] Update relocation names to match spec
Summary: See https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md

Differential Revision: https://reviews.llvm.org/D37385

llvm-svn: 312342
2017-09-01 17:32:01 +00:00
Sam Clegg 4194fb7062 [WebAssembly] Fix getSymbolValue for exported globals
The code wasn't previously taking into account that the
global index space is not same as the into in the Globals
array since the latter does not include imported globals.

This fixes the WebAssembly waterfall failures.

Differential Revision: https://reviews.llvm.org/D37384

llvm-svn: 312340
2017-09-01 17:24:19 +00:00
Nicolai Haehnle 75c98c365b AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states
Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36193

llvm-svn: 312337
2017-09-01 16:56:32 +00:00
Craig Topper 2a75b6f26b [X86] Add test case I forgot to commit with r312285.
llvm-svn: 312335
2017-09-01 16:40:24 +00:00
Peter Collingbourne 5e8b94c137 ModuleSummaryAnalysis: Correctly handle refs from function inline asm to module inline asm.
If a function contains inline asm and the module-level inline asm
contains the definition of a local symbol, prevent the function from
being imported in case the function-level inline asm refers to a
symbol in the module-level inline asm.

Differential Revision: https://reviews.llvm.org/D37370

llvm-svn: 312332
2017-09-01 16:24:02 +00:00
Manoj Gupta 6b54c7e11b [LoopVectorizer] Use two step casting for float to pointer types.
Summary:
LoopVectorizer is creating casts between vec<ptr> and vec<float> types
on ARM when compiling OpenCV. Since, tIs is illegal to directly cast a
floating point type to a pointer type even if the types have same size
causing a crash. Fix the crash using a two-step casting by bitcasting
to integer and integer to pointer/float.
Fixes PR33804.

Reviewers: mkuper, Ayal, dlj, rengolin, srhines

Reviewed By: rengolin

Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35498

llvm-svn: 312331
2017-09-01 15:36:00 +00:00
Alexandre Isoard 405728fd47 [SCEV] Add URem support to SCEV
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented
with minimal effort using that relation:

    %r --> (-%b * (%t /u %b)) + %t

We implement two special cases:

  - if %b is 1, the result is always 0
  - if %b is a power-of-two, we produce a zext/trunc based expression instead

That is, the following code:

    %r = urem i32 %t, 65536

Produces:

    %r --> (zext i16 (trunc i32 %a to i16) to i32)

Note that while this helps get a tighter bound on the range analysis and the
known-bits analysis, this exposes some normalization shortcoming of SCEVs:

    %div = udim i32 %a, 65536
    %mul = mul i32 %div, 65536
    %rem = urem i32 %a, 65536
    %add = add i32 %mul, %rem

Will usually not be reduced.

llvm-svn: 312329
2017-09-01 14:59:59 +00:00
Geoff Berry 65528f2991 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 312328
2017-09-01 14:27:20 +00:00
Strahinja Petrovic d6ee877b23 Adding missing test case in rL312318
Adding test for debug info for integer 
variables whose type is shrinked to bool.

Patch by Nikola Prica.

llvm-svn: 312325
2017-09-01 11:37:53 +00:00
Diana Picus f959791189 [ARM] GlobalISel: Support ROPI global variables
In the ROPI relocation model, read-only variables are accessed relative
to the PC. We use the (MOV|LDRLIT)_ga_pcrel pseudoinstructions for this.

llvm-svn: 312323
2017-09-01 11:13:39 +00:00
Clement Courbet 65130e2d8d Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
Add missing header.

This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.

llvm-svn: 312322
2017-09-01 10:56:34 +00:00
Oliver Stannard d771f6cb16 [ARM] Add 2-operand assembly aliases for Thumb1 ADD/SUB
This adds 2-operand assembly aliases for these instructions:
  add r0, r1    =>   add r0, r0, r1
  sub r0, r1    =>   sub r0, r0, r1

Previously this syntax was only accepted for Thumb2 targets, where the
wide versions of the instructions were used.

This patch allows the 2-operand syntax to be used for Thumb1 targets,
and selects the narrow encoding when it is used for Thumb2 targets.

Differential revision: https://reviews.llvm.org/D37377

llvm-svn: 312321
2017-09-01 10:47:25 +00:00
Diana Picus b67264b182 [ARM] GlobalISel: More tests. NFC.
Test constants as well in the PIC tests. These are also represented as
G_GLOBAL_VALUE, and although they are treated just like other globals
for PIC, they won't be for ROPI, so it's good to have this coverage.

llvm-svn: 312319
2017-09-01 10:18:37 +00:00
Clement Courbet 316212575b Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer"
Break build

This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.

llvm-svn: 312317
2017-09-01 09:43:08 +00:00
Clement Courbet 9473c01e96 [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
comparisons into memcmp.

Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().

For now this is disabled by default until:
 - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
 - Benchmarks show that this is always useful.

Differential Revision:
https://reviews.llvm.org/D33987

llvm-svn: 312315
2017-09-01 09:07:05 +00:00
Craig Topper 70e581cdd6 [X86] Add isel patterns for memory forms of FMA3 intrinsic instructions
llvm-svn: 312309
2017-09-01 07:58:13 +00:00
Matt Arsenault ab4a5cd335 AMDGPU: Fold clamp modifier for packed instructions
llvm-svn: 312297
2017-08-31 23:53:50 +00:00
Sam Clegg b09cfa5104 [WebAssembly] Fix getSymbolValue() for data symbols
This is mostly a fix for the output of `llvm-nm`

See Bug 34392: https://bugs.llvm.org//show_bug.cgi?id=34392

Differential Revision: https://reviews.llvm.org/D37359

llvm-svn: 312294
2017-08-31 23:22:44 +00:00
Derek Schuff 0f3bc0f478 [WebAssembly] Refactor load ISel tablegen patterns into classes
Not all of these will be able to be used by atomics because tablegen, but it
still seems like a good change by itself.

Differential Revision: https://reviews.llvm.org/D37345

llvm-svn: 312287
2017-08-31 21:51:48 +00:00
Sam Clegg a3b9fe6acd [WebAssembly] Validate exports when parsing object files
Subscribers: jfb, dschuff, jgravelle-google, aheejin

Differential Revision: https://reviews.llvm.org/D37358

llvm-svn: 312286
2017-08-31 21:43:45 +00:00
Sam Clegg 32cfbeb4bd [llvm-nm] Fix output formatting of -f sysv for 64bit targets
Differential Revision: https://reviews.llvm.org/D37347

llvm-svn: 312284
2017-08-31 21:23:44 +00:00
Jessica Paquette ffe4abc51b [MachineOutliner] Recommit r312194, missed optimization remarks
Before, this commit caused a buildbot failure:

http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll

This was caused by the Key value in DiagnosticInfoOptimizationBase being
deallocated before emitting the remarks defined in MachineOutliner.cpp. As of
r312277 this should no longer be an issue.
 

llvm-svn: 312280
2017-08-31 21:02:45 +00:00
Sanjay Patel 841acbbca0 [x86] add more tests for horizontal ops; NFC
llvm-svn: 312279
2017-08-31 20:59:25 +00:00
Zachary Turner 99c6982bcd [llvm-pdbutil] Print detailed S_UDT stats.
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records.  These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage.  This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type.  The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.

llvm-svn: 312276
2017-08-31 20:43:22 +00:00
Jonas Devlieghere d9063658c8 [dsymutil] Don't mark forward declarations as canonical.
This patch completes the work done by Frederic Riss to addresses
dsymutil incorrectly considering forward declaration as canonical during
uniquing. This resulted in references to the forward declaration even
after the definition was encountered.

In addition to the test provided by Alexander Shaposhnikov in D29609, I
added another test to cover several scenarios that were mentioned in his
conversation with Fred. We now also check that uniquing still occurs
after the definition was encountered.

For more context please refer to D29609

Differential revision: https://reviews.llvm.org/D37127

llvm-svn: 312274
2017-08-31 20:22:31 +00:00
Jonas Devlieghere 3aefe872c5 Revert "[dsymutil] Don't mark forward declarations as canonical."
This reverts commit r312264.

llvm-svn: 312271
2017-08-31 19:36:26 +00:00
Akira Hatanaka 13d2beb14d [ObjCARC] Pass the correct BasicBlock to fix assertion failure.
The BasicBlock passed to FindPredecessorRetainWithSafePath should be the
parent block of Autorelease. This fixes a crash that occurs in
FindDependencies when StartInst is not in StartBB.

rdar://problem/33866381

llvm-svn: 312266
2017-08-31 18:27:47 +00:00
Jonas Devlieghere e200f9bae6 [dsymutil] Don't mark forward declarations as canonical.
This patch completes the work done by Frederic Riss to addresses
dsymutil incorrectly considering forward declaration as canonical during
uniquing. This resulted in references to the forward declaration even
after the definition was encountered.

In addition to the test provided by Alexander Shaposhnikov in D29609, I
added another test to cover several scenarios that were mentioned in his
conversation with Fred. We now also check that uniquing still occurs
after the definition was encountered.

For more context please refer to D29609

Differential revision: https://reviews.llvm.org/D37127

llvm-svn: 312264
2017-08-31 18:06:44 +00:00
Jonas Devlieghere 8ac8df03c1 [llvm-dwarfdump] Brief mode only dumps debug_info by default
This patch changes the default behavior in brief mode to only show the
debug_info section. This is undoubtedly the most popular and likely the
one you'd want in brief mode.

Non-brief mode behavior is not affected and still defaults to all.

Differential revision: https://reviews.llvm.org/D37334

llvm-svn: 312252
2017-08-31 16:44:47 +00:00
Sanjay Patel e6b48a1b02 [InstCombine] improve demanded vector elements analysis of insertelement
Recurse instead of returning on the first found optimization. Also, return early in the caller
instead of continuing because that allows another round of simplification before we might
potentially lose undef information from a shuffle mask by eliminating the shuffle.

As noted in the review, we could probably do better and be more efficient by moving all of
demanded elements into a separate pass, but this is yet another quick fix to instcombine.

Differential Revision: https://reviews.llvm.org/D37236

llvm-svn: 312248
2017-08-31 15:57:17 +00:00
Reid Kleckner 08f5fd51cc [codeview] Generalize DIExpression parsing to handle load chains
Summary:
Hopefully this also clarifies exactly when and why we're rewriting
certiain S_LOCALs using reference types: We're using the reference type
to stand in for a zero-offset load.

Reviewers: inglorion

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37309

llvm-svn: 312247
2017-08-31 15:56:49 +00:00
Konstantin Zhuravlyov fafad15b90 Update test:
- REQUIRES: x86_64-linux -> REQUIRES: shell

Differential Revision: https://reviews.llvm.org/D37316

llvm-svn: 312245
2017-08-31 15:35:33 +00:00
Daniel Jasper c0a976d417 Revert r311525: "[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text"
Breaks builds internally. Will forward repo instructions to author.

llvm-svn: 312243
2017-08-31 15:17:17 +00:00
Yael Tsafrir 185c81725e [X86] Added run line to intrinsics upgrade test. NFC.
llvm-svn: 312241
2017-08-31 13:56:22 +00:00
Ashutosh Nema bfcac0b480 AMD family 17h (znver1) scheduler model update.
Summary:
This patch enables the following:
1) Regex based Instruction itineraries for integer instructions.
2) The instructions are grouped as per the nature of the instructions
   (move, arithmetic, logic, Misc, Control Transfer). 
3) FP instructions and their itineraries are added which includes values
   for SSE4A, BMI, BMI2 and SHA instructions.

Patch by Ganesh Gopalasubramanian

Reviewers: RKSimon, craig.topper

Subscribers: vprasad, shivaram, ddibyend, andreadb, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D36617

llvm-svn: 312237
2017-08-31 12:38:35 +00:00
Benjamin Kramer cbc7ee45f9 [Object] Verify object sizes before handing out StringRefs pointing out
of bounds.

This can only happen on corrupt input. Found by OSS-FUZZ!
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3228

llvm-svn: 312235
2017-08-31 12:27:10 +00:00
Sam Parker 5f9346471c [AArch64] v8.3-a complex number support
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers,
where the complex numbers are packed in a vector register as a
pair of elements. The Imaginary part of the number is placed in the
more significant element, and the Real part of the number is placed
in the less significant element.

Differential Revision: https://reviews.llvm.org/D36792

llvm-svn: 312228
2017-08-31 09:27:04 +00:00
Sean Eveson e15300ecf5 [llvm-cov] Read in function names for filtering from a text file.
Summary: Add a -name-whitelist option, which behaves in the same way as -name, but it reads in multiple function names from the given input file(s).

Reviewers: vsk

Reviewed By: vsk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37111

llvm-svn: 312227
2017-08-31 09:11:31 +00:00
Sam Parker a42d8a9164 [AArch64] IDSAR6 register assembler support
The IDSAR6 system register has been introduced to identify the
v8.3-a Javascript data type conversion and v8.2-a dot product
support.

Differential Revision: https://reviews.llvm.org/D37068

llvm-svn: 312225
2017-08-31 08:36:45 +00:00
Martin Storsjo 865d01a3cf [AArch64] Support COFF linker directives
This is similar to what was done for ARM in SVN r269574; the code
and the test are straight copypaste to the corresponding AArch64
code and test directory.

Differential revision: https://reviews.llvm.org/D37204

llvm-svn: 312223
2017-08-31 08:28:48 +00:00
Max Kazantsev 0a9c1ef2eb [IRCE] Identify loops with latch comparison against current IV value
Current implementation of parseLoopStructure interprets the latch comparison as a
comarison against `iv.next`. If the actual comparison is made against the `iv` current value
then the loop may be rejected, because this misinterpretation leads to incorrect evaluation
of the latch start value.

This patch teaches the IRCE to distinguish this kind of loops and perform the optimization
for them. Now we use `IndVarBase` variable which can be either next or current value of the
induction variable (previously we used `IndVarNext` which was always the value on next iteration).

Differential Revision: https://reviews.llvm.org/D36215

llvm-svn: 312221
2017-08-31 07:04:20 +00:00
Daniel Jasper b8198f02e6 Revert r312194: "[MachineOutliner] Add missed optimization remarks for the outliner."
Breaks on buildbot:
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/6026/steps/test_llvm/logs/LLVM%20%3A%3A%20CodeGen__AArch64__machine-outliner-remarks.ll

llvm-svn: 312219
2017-08-31 06:22:35 +00:00
Eric Christopher e42ac21499 Temporarily revert "Update branch coalescing to be a PowerPC specific pass"
From comments and code review it wasn't intended to be enabled by default yet.

This reverts commit r311588.

llvm-svn: 312214
2017-08-31 05:56:16 +00:00
Matt Arsenault 376f1bd73c AMDGPU: Don't assert in TTI with fp32 denorms enabled
Also refine for f16 and rcp cases.

llvm-svn: 312213
2017-08-31 05:47:00 +00:00
Dean Michael Berris 415f15c5e2 [XRay][tools] Fix an accounting bug in llvm-xray account
Summary:
Before this patch, llvm-xray account will assume that thread stacks will
not be empty. Unfortunately there are cases where an instrumented
function will see a call to `fork()` which will cause the child process
to not see the start of the function, but only see the end of the
function. The tooling cannot assume that threads will always have
perfect stacks, and so we change it to support this reality.

Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31870

llvm-svn: 312204
2017-08-31 01:07:24 +00:00
Adrian Prantl 28454efc67 Revert "Revert r312139 "Verifier: Verify the correctness of fragment expressions attached to globals.""
This reverts commit r312182 after fixing PR34390.

llvm-svn: 312197
2017-08-31 00:07:33 +00:00
Adrian Prantl 504b82d44b Don't add a fragment expression when GlobalSRA splits up a single-member struct
Fixes PR34390.

https://bugs.llvm.org/show_bug.cgi?id=34390

llvm-svn: 312196
2017-08-31 00:06:18 +00:00
Jessica Paquette 65d953e0b1 [MachineOutliner] Add missed optimization remarks for the outliner.
This adds missed optimization remarks which report viable candidates that
were not outlined because they would increase code size.

Other remarks will come in separate commits.

This will help to diagnose code size regressions and changes in outliner
behaviour in projects using the outliner.

https://reviews.llvm.org/D37085

llvm-svn: 312194
2017-08-30 23:31:49 +00:00
Petr Hosek 5aa80f1663 [yaml2obj][ELF] Make symbols optional for relocations
Some kinds of relocations do not have symbols, like R_X86_64_RELATIVE
for instance. I would like to test this case in D36554 but currently
can't because symbols are required by yaml2obj. The other option is
using the empty symbol but that doesn't seem quite right to me.

This change makes the Symbol field of Relocation optional and in the
case where the user does not specify a symbol name the Symbol index is 0.

Patch by Jake Ehrlich

Differential Revision: https://reviews.llvm.org/D37276

llvm-svn: 312192
2017-08-30 23:13:31 +00:00
Matt Morehouse 034126e507 [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.
- Only enable on Linux.

Reviewers: vitalybuka, kcc, george.karpenkov

Reviewed By: kcc

Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37156

llvm-svn: 312185
2017-08-30 22:49:31 +00:00
Hans Wennborg 76794daf05 Revert r312139 "Verifier: Verify the correctness of fragment expressions attached to globals."
This caused PR34390.

llvm-svn: 312182
2017-08-30 22:41:27 +00:00
Matt Arsenault c8f8cda0cd AMDGPU: Correct operand types for v_mad_mix*
These aren't really packed instructions, so the default
op_sel_hi should be 0 since this indicates a conversion.
The operand types are scalar values that behave similar
to an f16 scalar that may be converted to f32.

Doesn't change the default printing for op_sel_hi, just
the parsing.

llvm-svn: 312179
2017-08-30 22:18:40 +00:00
Hans Wennborg 24775a0a6c Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")

> Issues identified by buildbots addressed since original review:
> - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
> - The pass no longer forwards COPYs to physical register uses, since
>   doing so can break code that implicitly relies on the physical
>   register number of the use.
> - The pass no longer forwards COPYs to undef uses, since doing so
>   can break the machine verifier by creating LiveRanges that don't
>   end on a use (since the undef operand is not considered a use).
>
>   [MachineCopyPropagation] Extend pass to do COPY source forwarding
>
>   This change extends MachineCopyPropagation to do COPY source forwarding.
>
>   This change also extends the MachineCopyPropagation pass to be able to
>   be run during register allocation, after physical registers have been
>   assigned, but before the virtual registers have been re-written, which
>   allows it to remove virtual register COPY LiveIntervals that become dead
>   through the forwarding of all of their uses.

llvm-svn: 312178
2017-08-30 22:11:37 +00:00
Konstantin Zhuravlyov fbe9041a37 Fix test after rL312144
llvm-svn: 312176
2017-08-30 21:59:14 +00:00
Sanjay Patel 4f44bf7f21 [InstCombine] add more vector demand examples; NFC
See D37236 for discussion. It seems unlikely that we actually want/need
to do this kind of folding in InstCombine in the long run, but moving
everything will be a bigger follow-up step.

llvm-svn: 312172
2017-08-30 21:11:46 +00:00
Adrian Prantl 608df027fd SelectionDAG: Emit correct debug info for multi-register function arguments.
Previously we would just describe the first register and then call it
quits. This patch emits fragment expressions for each register.

<rdar://problem/34075307>

llvm-svn: 312169
2017-08-30 20:51:20 +00:00
Brian Gesiak 3332976478 [ARM] Use Swift error registers on non-Darwin targets
Summary:
Remove a check for `ARMSubtarget::isTargetDarwin` when determining
whether to use Swift error registers, so that Swift errors work
properly on non-Darwin ARM32 targets (specifically Android).

Before this patch, generated code would save and restores ARM register r8 at
the entry and returns of a function that throws. As r8 is used as a virtual
return value for the object being thrown, this gets overwritten by the restore,
and calling code is unable to catch the error. In turn this caused Swift code
that used `do`/`try`/`catch` to work improperly on Android ARM32 targets.

Addresses Swift bug report https://bugs.swift.org/browse/SR-5438.

Patch by John Holdsworth.

Reviewers: manmanren, rjmccall, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: srhines, aschwaighofer, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35835

llvm-svn: 312164
2017-08-30 20:03:54 +00:00
Derek Schuff bd484b4fc3 [WebAssembly] Update debug info test after r312144
Add the expr field to the DIGlobalVariableExpression

llvm-svn: 312163
2017-08-30 19:54:08 +00:00
Aditya Nandakumar c6615f56f5 [GISel]: Add a clean up combiner during legalization.
Added a combiner which can clean up truncs/extends that are created in
order to make the types work during legalization.

Also moved the combineMerges to the LegalizeCombiner.

https://reviews.llvm.org/D36880

llvm-svn: 312158
2017-08-30 19:32:59 +00:00
Geoff Berry feffb0c8af Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues identified by buildbots addressed since original review:
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 312154
2017-08-30 18:41:07 +00:00
Derek Schuff 18ba192843 [WebAssembly] Add target feature for atomics
Summary:
This tracks the WebAssembly threads feature proposal at
https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

Differential Revision: https://reviews.llvm.org/D37300

llvm-svn: 312145
2017-08-30 18:07:45 +00:00
Adrian Prantl 05782218ab Canonicalize the representation of empty an expression in DIGlobalVariableExpression
This change simplifies code that has to deal with
DIGlobalVariableExpression and mirrors how we treat DIExpressions in
debug info intrinsics. Before this change there were two ways of
representing empty expressions on globals, a nullptr and an empty
!DIExpression().

If someone needs to upgrade out-of-tree testcases:
  perl -pi -e 's/(!DIGlobalVariableExpression\(var: ![0-9]*)\)/\1, expr: !DIExpression())/g' <MYTEST.ll>
will catch 95%.

llvm-svn: 312144
2017-08-30 18:06:51 +00:00
Adrian Prantl 8550e88eb7 Verifier: Verify the correctness of fragment expressions attached to globals.
llvm-svn: 312139
2017-08-30 16:49:21 +00:00
Craig Topper afce0baacd [AVX512] Don't use 32-bit elements version of AND/OR/XOR/ANDN during isel unless we're matching a masked op or broadcast
Selecting 32-bit element logical ops without a select or broadcast requires matching a bitconvert on the inputs to the and. But that's a weird thing to rely on. It's entirely possible that one of the inputs doesn't have a bitcast and one does.

Since there's no functional difference, just remove the extra patterns and save some isel table size.

Differential Revision: https://reviews.llvm.org/D36854

llvm-svn: 312138
2017-08-30 16:38:33 +00:00
Igor Breger 36d447d8a8 [GlobalISel][X86] Support variadic function call.
Summary: Support variadic function call. Port the implementation from X86FastISel.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: guyblank

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D37261

llvm-svn: 312130
2017-08-30 15:10:15 +00:00
Balaram Makam 42adadfca0 Re-land MachineInstr: Reason locally about some memory objects before going to AA.
Summary:
Reverts r311008 to reinstate r310825 with a fix.

Refine alias checking for pseudo vs value to be conservative.
This fixes the original failure in builtbot unittest SingleSource/UnitTests/2003-07-09-SignedArgs.

Reviewers: hfinkel, nemanjai, efriedma

Reviewed By: efriedma

Subscribers: bjope, mcrosier, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D36900

llvm-svn: 312126
2017-08-30 14:57:12 +00:00
Sanjay Patel 6f7ac7e402 [InstCombine] remove unnecessary vector select fold; NFCI
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
   I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
   shufflevector, so we really can't see this pattern.

llvm-svn: 312123
2017-08-30 14:04:57 +00:00
Strahinja Petrovic 89df797ee9 [MIPS] Add support to match more patterns for BBIT instruction
This patch supports one more pattern for bbit0 and bbit1
instructions, CBranchBitNum class is expanded  so it can
take 32 bit immidate.

Differential Revision: https://reviews.llvm.org/D36222

llvm-svn: 312111
2017-08-30 11:25:38 +00:00
Florian Hahn b992feee13 [InstCombine] Fold insert sequence if first ins has multiple users.
Summary:
If the first insertelement instruction has multiple users and inserts at
position 0, we can re-use this instruction when folding a chain of
insertelement instructions. As we need to generate the first
insertelement instruction anyways, this should be a strict improvement.

We could get rid of the restriction of inserting at position 0 by
creating a different shufflemask, but it is probably worth to keep the
first insertelement instruction with position 0, as this is easier to do
efficiently than at other positions I think.

Reviewers: grosser, mkuper, fpetrogalli, efriedma

Reviewed By: fpetrogalli

Subscribers: gareevroman, llvm-commits

Differential Revision: https://reviews.llvm.org/D37064

llvm-svn: 312110
2017-08-30 10:54:21 +00:00
Sjoerd Meijer be5b60f735 [AArch64] allow v4f16 types when FullFP16 is supported
Support for scalars was committed in r311154, this adds support for allowing
v4f16 vector types (thus avoiding conversions from/to single precision for
these types).

Differential Revision: https://reviews.llvm.org/D37145

llvm-svn: 312104
2017-08-30 08:38:13 +00:00
Gadi Haber 767d98bad8 [X86][Skylake] Fixing duplicated prefixes in the run command of Code Gen regression tests
NFC.
Replaced duplicated HASWELL prefixes in run commands in the X86 Code Gen regression tests by the SKYLAKE prefix when the -mcpu is set to skylake.
The fix is needed in preparation of an upcoming patch containing the Skylake scheduling info.

Reviewers: zvi, RKSimon, aymanmus, igorb

Differential Revision: https://reviews.llvm.org/D37258

llvm-svn: 312103
2017-08-30 08:08:50 +00:00
Craig Topper 17854ecf24 [AVX512] Correct isel patterns to support selecting masked vbroadcastf32x2/vbroadcasti32x2
Summary:
This patch adjusts the patterns to make the result type of the broadcast node vXf64/vXi64. Then adds a bitcast to vXi32 after that. Intrinsic lowering was also adjusted to generate this new pattern.

Fixes PR34357

We should probably just drop the intrinsic entirely and use native IR, but I'll leave that for a future patch.

Any idea what instruction we should be lowering the floating point 128-bit result version of this pattern to?  There's a 128-bit v2i32 integer broadcast but not an fp one.

Reviewers: aymanmus, zvi, igorb

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37286

llvm-svn: 312101
2017-08-30 07:48:39 +00:00
Craig Topper 48a7917079 [AVX512] Use 256-bit extract instructions for extracting bits [255:128] from a 512-bit register
This enables the use of a smaller encoding by using a VEX instruction when possible.

Differential Revision: https://reviews.llvm.org/D37092

llvm-svn: 312100
2017-08-30 07:26:12 +00:00
Craig Topper ef1f71669e [X86] Apply SlowIncDec feature to Sandybridge/Ivybridge CPUs as well
Currently we start applying this on Haswell and newer. I don't believe anything changed in the Haswell architecture to make this the right cutoff point. The partial flag handling around this has been roughly the same since Sandybridge.

Differential Revision: https://reviews.llvm.org/D37250

llvm-svn: 312099
2017-08-30 05:00:35 +00:00
Craig Topper 641e2af9e8 [X86] Provide a separate feature bit for macro fusion support instead of basing it on the AVX flag
Summary:
Currently we determine if macro fusion is supported based on the AVX flag as a proxy for the processor being Sandy Bridge".

This is really strange as now AMD supports AVX. It also means if user explicitly disables AVX we disable macro fusion.

This patch adds an explicit macro fusion feature. I've also enabled for the generic 64-bit CPU (which doesn't have AVX)

This is probably another candidate for being in the MI layer, but for now I at least wanted to correct the overloading of the AVX feature.

Reviewers: spatel, chandlerc, RKSimon, zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37280

llvm-svn: 312097
2017-08-30 04:34:48 +00:00
Stanislav Mekhanoshin 06cab79e50 [AMDGPU] Use v_max_f* for fcanonicalize
If denorms are not flushed we can use max instead of multiplication
by 1. For double that is simply faster, while for float and half
it is shorter, because mul uses constant bus and VOP3.

Differential Revision: https://reviews.llvm.org/D36856

llvm-svn: 312095
2017-08-30 03:03:38 +00:00
Matt Arsenault 6b114d2c50 AMDGPU: Select clamp pattern with v2f16
llvm-svn: 312087
2017-08-30 01:20:17 +00:00
Evgeniy Stepanov 7372b67063 [cfi] Avoid branch veneers in jump tables when possible.
Summary:
When jumptable encoding does not match target code encoding (arm vs
thumb), a veneer is inserted by the linker. We can not avoid this
in all cases, because entries within one jumptable must have the same
encoding, but we can make it less common by selecting the jumptable
encoding to match the majority of its targets.

This change only covers FullLTO, and not ThinLTO.

Reviewers: pcc

Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37171

llvm-svn: 312054
2017-08-29 22:40:19 +00:00
Evgeniy Stepanov 4731ad81c7 [cfi] Build __cfi_check as Thumb when applicable.
Summary:
Cross-DSO CFI needs all __cfi_check exports to use the same encoding
(ARM vs Thumb).

Reviewers: pcc

Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37243

llvm-svn: 312052
2017-08-29 22:29:15 +00:00
Reid Kleckner 2297fc89b4 Fix the dwarfdump test so that it passes in its new location
llvm-svn: 312051
2017-08-29 22:25:16 +00:00
Reid Kleckner 14469db602 Move dwarfdump test to DebugInfo/X86 now that it looks for x86 register names
Otherwise this test will fail on bots (like the hexagon bot) that don't
enable the x86 backend.

llvm-svn: 312050
2017-08-29 22:16:11 +00:00
Matt Morehouse ba2e61b357 Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer"
This reverts r312026 due to bot breakage.

llvm-svn: 312047
2017-08-29 21:56:56 +00:00
Wei Mi ebb9327759 [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement
This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition
fails to find a loop invariant condition. This is wrong and it will disable loop
unswitch for select. The patch fixes the bug.

Differential Revision: https://reviews.llvm.org/D36985

llvm-svn: 312045
2017-08-29 21:45:11 +00:00
Reid Kleckner a058736c9c [dwarfdump] Pretty print location expressions and location lists
Summary:
Based on Fred's patch here: https://reviews.llvm.org/D6771

I can't seem to commandeer the old review, so I'm creating a new one.

With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:

    DW_AT_location [DW_FORM_data4]        (0x00000000
       0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3
       0x000000000000000b - 0x0000000000000012: DW_OP_consts +7
       0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4
       0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0)

And like this for debug_loc.dwo entries:
    DW_AT_location [DW_FORM_sec_offset]   (0x00000000
      Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value
      Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4)

Simple locations without ranges are printed inline:

   DW_AT_location [DW_FORM_block1]       (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0)

The debug_loc(.dwo) dumping in changed accordingly to factor the code.

Reviewers: dblaikie, aprantl, friss

Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D37123

llvm-svn: 312042
2017-08-29 21:41:21 +00:00
Joerg Sonnenberger 0cb17cf2d8 Simplify test case, so that it works for both trunk and release-5.0.
llvm-svn: 312038
2017-08-29 21:18:07 +00:00
Bob Haarman 223303c5a3 Reland r311957 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

The reland adds two extra checks to the original: It checks if the
DbgVariableLocation is valid before checking any of its fields, and
it only emits ranges with nonzero registers.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 312034
2017-08-29 20:59:25 +00:00
Alexey Bataev 978e2e4760 [SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores.
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36841

llvm-svn: 312030
2017-08-29 20:06:24 +00:00
Matt Morehouse 2ad8d948b2 [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.
- Disable stack depth tracking on Mac.

Reviewers: vitalybuka, kcc, george.karpenkov

Reviewed By: kcc

Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37156

llvm-svn: 312026
2017-08-29 19:48:12 +00:00
Craig Topper 4431bfe88c [InstCombine] Support vector splats in transformZExtICmp
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code.

One test didn't vectorize optimize properly due to another bug so a TODO has been added.

Differential Revision: https://reviews.llvm.org/D37253

llvm-svn: 312023
2017-08-29 18:58:13 +00:00
Davide Italiano 16a426e9a9 [LoopUnroll] Make the test for PR33437 actually useful.
I forgot to specify -unroll-loop-peel, making this test not
really effective. While here, adjust some details (naming and
run line). Thanks to Sanjoy and Michael Z. for pointing out in
their post-commit reviews.

llvm-svn: 312015
2017-08-29 17:24:09 +00:00
Marek Sokolowski 4ac54d9302 [llvm-rc] Add DIALOG(EX) parsing ability (parser, pt 5/8).
This extends the set of resources parsed by llvm-rc by DIALOG and
DIALOGEX.

Additionally, three optional resource statements specific to these two
resources are added: CAPTION, FONT, and STYLE.

Thanks for Nico Weber for his original work in this area.

Differential Revision: https://reviews.llvm.org/D36905

llvm-svn: 312009
2017-08-29 16:49:59 +00:00
Dehao Chen efd007f6f4 Add null check for promoted direct call
Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed.

Reviewers: tejohnson, xur, davidxl, djasper

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37252

llvm-svn: 312005
2017-08-29 15:28:12 +00:00