Commit Graph

169375 Commits

Author SHA1 Message Date
Simon Pilgrim cffa206423 [X86][SSE] Always enable ISD::SRL -> ISD::MULHU for v8i16
For constant non-uniform cases we'll never introduce more and/andn/or selects than already occur in generic pre-SSE41 ISD::SRL lowering.

llvm-svn: 342352
2018-09-16 20:28:38 +00:00
Simon Pilgrim ea069ffd44 [X86][AVX] Enable ISD::SRL -> ISD::MULHU for v16i16
Now that rL340913 has landed with improved v16i16 selects as shuffles.

llvm-svn: 342349
2018-09-16 19:20:47 +00:00
Sanjay Patel 3eaf500a6d [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x)
This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040.

Copying the motivational statement by @evandro from that patch:
"This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 
447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. 
Otherwise, no regressions on x86-64 or A64."

I'm proposing to add only the minimum support for a DAG node here. Since we don't have an 
LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I 
don't think we need to worry about DAG builder, legalization, a strict variant, etc. We 
should be able to expand as needed when adding more functionality/transforms. For reference, 
these are transform suggestions currently listed in SimplifyLibCalls.cpp:

//   * cbrt(expN(X))  -> expN(x/3)
//   * cbrt(sqrt(x))  -> pow(x,1/6)
//   * cbrt(cbrt(x))  -> pow(x,1/9)

Also, given that we bail out on long double for now, there should not be any logical 
differences between platforms (unless there's some platform out there that has pow()
but not cbrt()).

Differential Revision: https://reviews.llvm.org/D51753

llvm-svn: 342348
2018-09-16 16:50:26 +00:00
Sanjay Patel bfee5a9b42 [x86] fix uses check in broadcast transform (PR38949)
https://bugs.llvm.org/show_bug.cgi?id=38949

It's not clear to me that we even need a one-use check in this fold.
Ie, 2 independent loads might be better than a load+dependent shuffle.

Note that the existing re-use tests are not affected. We actually do form a
broadcast node in those tests now because there's no extra use of the 
insert_subvector node in those cases. But something later in isel pattern 
matching decides that it is not worth using a broadcast for the full load in 
those tests:

Legalized selection DAG: %bb.0 'test_broadcast_2f64_4f64_reuse:'
  t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64
      t4: i64,ch = CopyFromReg t0, Register:i64 %1
    t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64
      t18: v4f64 = insert_subvector undef:v4f64, t7, Constant:i64<0>
    t20: v4f64 = insert_subvector t18, t7, Constant:i64<2>

Becomes:
  t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64
      t4: i64,ch = CopyFromReg t0, Register:i64 %1
    t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64
    t21: v4f64 = X86ISD::SUBV_BROADCAST t7

ISEL: Starting selection on root node: t21: v4f64 = X86ISD::SUBV_BROADCAST t7
...
  Created node: t27: v4f64 = INSERT_SUBREG IMPLICIT_DEF:v4f64, t7, TargetConstant:i32<7>
  Morphed node: t21: v4f64 = VINSERTF128rr t27, t7, TargetConstant:i8<1>

llvm-svn: 342347
2018-09-16 15:41:56 +00:00
Sanjay Patel 3e095174b0 [x86] add failure to splat test (PR38949); NFC
llvm-svn: 342346
2018-09-16 14:59:04 +00:00
Roman Lebedev 6356864e6d [NFC][InstCombine] One more test pattern for comparisons with low-bit-mask.
https://rise4fun.com/Alive/UGzE <- non-canonical, but has extra uses.

https://bugs.llvm.org/show_bug.cgi?id=38123

llvm-svn: 342345
2018-09-16 12:51:09 +00:00
Simon Pilgrim 5ea1b32631 Fix -Wdangling-else gcc warning. NFCI.
llvm-svn: 342344
2018-09-16 12:30:41 +00:00
Roman Lebedev 3fb9414d02 [NFC][InstCombine] Some more tests for comparisons with low-bit-mask.
https://bugs.llvm.org/show_bug.cgi?id=38123
https://bugs.llvm.org/show_bug.cgi?id=38708

llvm-svn: 342343
2018-09-16 08:05:06 +00:00
Fangrui Song 37a72098ae [llvm-readobj] Make some commonly used short options visibile in -help
For people who use llvm-readelf as a replacement of GNU readelf, they would like to see -d -r ... listed in llvm-readelf -help. It also helps understanding the confusing -s (which is unfortunately different in semantics).

Reviewers: phosek, ruiu, echristo

Reviewed By: ruiu, echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52129

llvm-svn: 342339
2018-09-15 21:27:46 +00:00
Nico Weber b09a8c9bd9 Revert r342148 (and follow-on fix attempts r342154, r342180, r342182, r342193)
Many bots buildling with make have been broken for several days, e.g.
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13

llvm-svn: 342336
2018-09-15 19:04:27 +00:00
Craig Topper 2da7381678 [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y))
Summary:
If the sub doesn't overflow in the original type we can move it above the sext/zext.

This is similar to what we do for add. The overflow checking for sub is currently weaker than add, so the test cases are constructed for what is supported.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52075

llvm-svn: 342335
2018-09-15 18:54:10 +00:00
Nico Weber 205ca68b8d Give InfoStreamBuilder an opt-in method to write a hash of the PDB as GUID.
Naively computing the hash after the PDB data has been generated is in practice
as fast as other approaches I tried. I also tried online-computing the hash as
parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also
where all the measuring data is) and computing the hash in parallel
(https://reviews.llvm.org/D51957). This approach here is simplest, without
being slower.

Differential Revision: https://reviews.llvm.org/D51956

llvm-svn: 342333
2018-09-15 18:35:51 +00:00
Nico Weber 1359d654e3 Update microsoftDemangle() to work more like itaniumDemangle().
* Use same method of initializing the output stream and its buffer
* Allow a nullptr Status pointer
* Don't print the mangled name on demangling error
* Write to N (if it is non-nullptr)

Differential Revision: https://reviews.llvm.org/D52104

llvm-svn: 342330
2018-09-15 18:24:20 +00:00
Simon Pilgrim fc4c26485c [X86][SSE] Fix insertps load combine test name
The existing test was called extract_lane_insertps_5123 but it was in fact doing a <6,1,2,3> shuffle. I've fixed the name and added the <5,1,2,3> test case as well.

llvm-svn: 342328
2018-09-15 16:57:04 +00:00
Craig Topper fe0b973fbf [X86] Remove an fp->int->fp domain crossing in LowerUINT_TO_FP_i64.
Summary: This unfortunately adds a move, but isn't that better than going to the int domain and back?

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52134

llvm-svn: 342327
2018-09-15 16:23:35 +00:00
Craig Topper 273f755da3 [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C))
Summary:
MOVMSK only care about the sign bit so we don't need the setcc to fill the whole element with 0s/1s. We can just shift the bit we're looking for into the sign bit. This saves a constant pool load.

Inspired by PR38840.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D52121

llvm-svn: 342326
2018-09-15 16:23:33 +00:00
Fedor Sergeev 751341905d [NFC] minor cleanup in PassManagerInternal.h
A few changes found necessary for upcoming PassInstrumentation patch:
  - name() methods made const
  - properly forward arguments in AnalysisPassModel::run

Separated out of the main D47858 patch.

llvm-svn: 342325
2018-09-15 14:56:12 +00:00
Sanjay Patel 296d35a5e9 [InstCombine][x86] try harder to convert blendv intrinsic to generic IR (PR38814)
Missing optimizations with blendv are shown in:
https://bugs.llvm.org/show_bug.cgi?id=38814

If this works, it's an easier and more powerful solution than adding pattern matching 
for a few special cases in the backend. The potential danger with this transform in IR
is that the condition value can get separated from the select, and the backend might 
not be able to make a blendv out of it again. I don't think that's too likely, but 
I've kept this patch minimal with a 'TODO', so we can test that theory in the wild 
before expanding the transform.

Differential Revision: https://reviews.llvm.org/D52059

llvm-svn: 342324
2018-09-15 14:25:44 +00:00
Simon Pilgrim 7bfe87181d Fix line endings. NFCI.
llvm-svn: 342323
2018-09-15 14:20:53 +00:00
Roman Lebedev 1b7fc87020 [InstCombine] Inefficient pattern for high-bits checking 3 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

The last (as far i know?) pattern, non-canonical due to the extra use.
https://godbolt.org/z/aCMsPk
https://rise4fun.com/Alive/I6f

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52062

llvm-svn: 342321
2018-09-15 12:04:13 +00:00
Vedant Kumar 1b02dad9f2 [CodeGenPrepare] Preserve debug locs in OptimizeExtractBits
CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it
easier for the backend to emit fancy extract-bits instructions (e.g UBFX).

Teach it to preserve debug locations and salvage debug values.

llvm-svn: 342319
2018-09-15 04:08:52 +00:00
Thomas Lively 66f3dc031d [WebAssembly][NFC] Generalize operand numbers in SIMD tests
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52130

llvm-svn: 342303
2018-09-15 01:12:48 +00:00
Thomas Lively f2550e0c44 [WebAssembly] SIMD shifts
Summary:
Implement shifts of vectors by i32. Since LLVM defines shifts as
binary operations between two vectors, this involves pattern matching
on splatted shift operands. For v2i64 shifts any i32 shift operands
have to be zero extended in the input and any i64 shift operands have
to be wrapped in the output. Depends on D52007.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51906

llvm-svn: 342302
2018-09-15 00:45:31 +00:00
Wei Mi 67f57c6795 Fix filesystem race issue in SampleProfTest introduced in rL342283.
Before this fix, multiple invocations of testRoundTrip will create multiple
writers which share the same file as output destination. That could introduce
filesystem race issue when multiple subtests are executed concurrently. This
patch assign writers with different files as their output destinations.

llvm-svn: 342301
2018-09-15 00:04:15 +00:00
Thomas Lively 88b7443f94 [WebAssembly] SIMD neg
Summary: Depends on D52007.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52009

llvm-svn: 342296
2018-09-14 22:35:12 +00:00
Zachary Turner a98ee586bf [PDB] Make the pretty dumper output modified types.
Currently if we got something like `const Foo` we'd ignore it and
just rely on printing the unmodified `Foo` later on.  However,
for testing the native reading code we really would like to be able
to see these so that we can verify that the native reader can
actually handle them.  Instead of printing out the full type though,
just print out the header.

llvm-svn: 342295
2018-09-14 22:29:19 +00:00
Craig Topper 5692ac6ce7 [BreakFalseDeps] Fix bad formatting. NFC
llvm-svn: 342293
2018-09-14 22:26:09 +00:00
Sanjay Patel 90a36346bc [InstCombine] refactor mul narrowing folds; NFCI
Similar to rL342278:
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

D52075 should be able to use this code too rather than
duplicating all of the logic.

llvm-svn: 342292
2018-09-14 22:23:35 +00:00
Adrian Prantl 8ccc7bc27c Attempt to unbreak the build after r342286.
llvm-svn: 342291
2018-09-14 21:43:45 +00:00
Sanjay Patel 46945b9e9d [InstCombine] add/use overflowing math helper functions; NFC
The mul case can already be refactored to use this similar to
rL342278.
The sub case is proposed in D52075.

llvm-svn: 342289
2018-09-14 21:30:07 +00:00
Lion Yang c68f78d5d8 [PowerPC] Fix the calling convention for i1 arguments on PPC32
Summary:
Integer types smaller than i32 must be extended to i32 by default.
The feature "crbits" introduced at r202451 handles i1 as a special case,
but it did not extend properly.
The caller was, therefore, passing i1 stack arguments by writing 0/1 to
the first byte of the 4-byte stack object and callee was
reading the first byte for the value.

"crbits" is enabled if the optimization level is greater than 1,
which is very common in "release builds".
Such discrepancies with ABI specification also introduces
potential incompatibility with programs or libraries
built with other compilers e.g. GCC.

Fixes PR38661

Reviewers: hfinkel, cuviper

Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D51108

llvm-svn: 342288
2018-09-14 21:26:05 +00:00
Thomas Lively a3937b231d [WebAssembly][NFC] Move SIMD encoding tests to dedicated file
Summary:
This change makes the tests more focused and avoids problematic
interactions between the testing modes and instruction encoding. This
change also allows the other tests to use less verbose output and
stricter checks.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52007

llvm-svn: 342287
2018-09-14 21:21:42 +00:00
Zachary Turner 08e522a0f5 Add missing include.
llvm-svn: 342286
2018-09-14 21:19:52 +00:00
Reid Kleckner b3d456a79e [codeview] Remove dead code
llvm-svn: 342285
2018-09-14 21:14:08 +00:00
Zachary Turner 4d68951e6d [PDB] Refactor a little of the Symbol creation code.
Eventually we need to be able to support nested types, which don't
have an associated CVType record.  To handle this, remove the
CVType from all of the record classes, and instead store the
deserialized record.  Then move the deserialization up to the thing
that creates the type.  This actually makes error handling better
anyway as we can return an invalid symbol instead of asserting false.

llvm-svn: 342284
2018-09-14 21:03:57 +00:00
Wei Mi 6a14325dff [SampleFDO] Add FunctionOffsetTable in compact binary format profile.
The patch saves a function offset table which maps function name index to the
offset of its function profile to the start of the binary profile. By using
the function offset table, for those function profiles which will not be used
when compiling a module, the profile reader does't have to read them. For
profile size around 10~20M, it saves ~10% compile time.

Differential Revision: https://reviews.llvm.org/D51863

llvm-svn: 342283
2018-09-14 20:52:59 +00:00
Fangrui Song d39d374e15 test/Other/can-execute.txt: delete %t after the test
This test constructs a non-readable file of mode 0111, which lingers in the test output directory and will cause EACCES to various tools (rg, rsync, ...)

llvm-svn: 342279
2018-09-14 20:41:42 +00:00
Sanjay Patel 2426eb46dd [InstCombine] refactor add narrowing folds; NFCI
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

llvm-svn: 342278
2018-09-14 20:40:46 +00:00
Sebastian Pop 0f30f08b02 HotColdSplit: fix invalid SSA due to outlining
The test used to fail with an invalid phi node: the two predecessors were outlined
and the SSA representation was left invalid. The patch adds the exit block to the
cold region.

llvm-svn: 342277
2018-09-14 20:36:19 +00:00
Sebastian Pop 3abcf69074 HotColdSplit: fix isSingleEntrySingleExit
remove duplicate entries from isSingleEntrySingleExit: the Entry block is
already added by the loop over the dominance frontier.

Remove the heuristic from isOutlineCandidate that a region is too small when it
only contains a basic block. With this change we now grow regions starting from
a block and we continue adding to the ValidColdRegion. Check the heuristic just
before code generation.

llvm-svn: 342276
2018-09-14 20:36:14 +00:00
Sebastian Pop 1217160bb3 HotColdSplit: add back propagation to extend cold regions
Also fix a problem in forward propagation:
  const TerminatorInst *TI = It->getTerminator();
was set outside the while loop that iterates over It.

llvm-svn: 342275
2018-09-14 20:36:10 +00:00
Sanjay Patel fcf8c7c908 [InstCombine] add more tests for add narrowing folds; NFC
llvm-svn: 342274
2018-09-14 20:33:40 +00:00
Thomas Lively e1f67a8bf7 [WebAssembly][NFC] Fix unconventional test names
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52110

llvm-svn: 342273
2018-09-14 20:22:45 +00:00
Reid Kleckner ba732f213d Remove unused DIASession field
llvm-svn: 342272
2018-09-14 20:16:31 +00:00
Konstantin Zhuravlyov e721b11c12 AMDGPU: Clear the bits before they are being set in program resource registers
Change by Tony Tye

llvm-svn: 342270
2018-09-14 20:00:36 +00:00
Alex Langford a250f90efe Fix lit/example/many-tests pickling issue
Summary:
The multiprocess module uses pickling to transfer
information between processes and does not know how to pickle
the class created in the lit.cfg file and thus the example
fails.

Implement ManyTests in a separate file and import for the
example test passes

Patch by Nathan Lanza <nathan@lanza.io>

Differential Revision: https://reviews.llvm.org/D51328

llvm-svn: 342269
2018-09-14 19:44:09 +00:00
Lion Yang 7bff00e841 Test commit access
Remove trailing spaces

llvm-svn: 342268
2018-09-14 19:43:11 +00:00
Reid Kleckner 4d1b75c6b7 Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."
Causes 'isVector() && "Invalid vector type!"' assertion when building
Skia in Chrome.

llvm-svn: 342265
2018-09-14 19:39:40 +00:00
Adrian Prantl 16f58d1850 Fix debug info for SelectionDAG legalization of DAG nodes with two results.
This patch fixes the debug info handling for SelectionDAG legalization
of DAG nodes with two results. When an replaced SDNode has more than
one result, transferDbgValues was always copying the SDDbgValue from
the first result and attaching them to all members. In reality
SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes
(though the type signature doesn't make this obvious (cf. the call
site code in ReplaceNode()).

rdar://problem/44162227

Differential Revision: https://reviews.llvm.org/D52112

llvm-svn: 342264
2018-09-14 19:38:45 +00:00
Steven Wu ec53c89fde [ThinLTOCodeGenerator] Avoid Rehash StringMap in ThreadPool
Summary:
During threaded thinLTO, it is possible that the entry for current
module doesn't exist in StringMaps (like ExportLists, ResolvedODR,
etc.). Using operator[] might trigger a rehash for the StringMap, which
might happen on multiple threads at the same time.

rdar://problem/43846199

Reviewers: tejohnson, mehdi_amini, kromanova, pcc

Reviewed By: tejohnson

Subscribers: dang, inglorion, eraman, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52049

llvm-svn: 342263
2018-09-14 19:38:21 +00:00
Reid Kleckner 00f0ee718f Revert r342210 "[ARM] bottom-top mul support in ARMParallelDSP"
It causes assertion failures while building Skia for Android in
Chromium:
https://ci.chromium.org/buildbot/chromium.clang/ToTAndroid/4550

Reduction forthcoming.

llvm-svn: 342260
2018-09-14 18:44:37 +00:00
Simon Pilgrim 4c30f3d4e6 Revert a line-endings change that somehow got included with rL342257
llvm-svn: 342258
2018-09-14 18:35:21 +00:00
Simon Pilgrim 32857c54d2 [X86][SSE] Lower shuffles to permute(unpack(x,y)) (PR31151)
Attempt to lower a shuffle as an unpack of elements from two inputs followed by a single-input (wider) permutation.

As long as the permutation is wider this is a win - there may be some circumstances where same size permutations would also be useful but I've left that for future work.

Differential Revision: https://reviews.llvm.org/D52043

llvm-svn: 342257
2018-09-14 18:33:31 +00:00
Craig Topper ac356cac0c [X86] Re-generate test checks using current version of the script. NFC
The regular expression used for stack accesses is different today.

llvm-svn: 342256
2018-09-14 18:27:09 +00:00
Sanjay Patel 003f452522 [InstCombine] rename test file to better describe the fold; NFC
The folds are not limited to zext, and the real goal is width
reduction of a math op. D52075 is proposing to extend this to
subtracts.

llvm-svn: 342254
2018-09-14 18:12:30 +00:00
Sanjay Patel 5a9462e42a [InstCombine] remove unnecessary target constraints for tests; NFC
These are universal folds.

llvm-svn: 342253
2018-09-14 18:06:36 +00:00
Sanjay Patel 7b9e1afd1f [InstCombine] move test next to related tests; NFC
llvm-svn: 342251
2018-09-14 18:05:14 +00:00
Sanjay Patel 1f4f26a2bb [InstCombine] remove stall comment from test file; NFC
llvm-svn: 342250
2018-09-14 18:02:17 +00:00
Sanjay Patel f7ba0ac0b5 [InstCombine] regenerate test checks; NFC
There was a bug in a check line regex that could cause the test to fail
with a naming difference. The auto-gen script seems to work as expected now.

llvm-svn: 342249
2018-09-14 17:53:44 +00:00
Nico Weber 1739dbf6a6 Introduce explicit add_unittest_with_input_files target for tests that use llvm::getInputFileDirectory()
Using llvm::getInputFileDirectory() in unit tests is discouraged, so require an explicit opt-in.
This way, cmake also writes ~60 fewer unused files to disk.

Differential Revision: https://reviews.llvm.org/D52095

llvm-svn: 342248
2018-09-14 17:34:46 +00:00
Adrian Prantl 66945cf6e3 fix noasserts build
llvm-svn: 342247
2018-09-14 17:32:52 +00:00
Adrian Prantl 55b8756b8a SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose output
llvm-svn: 342245
2018-09-14 17:08:02 +00:00
James Henderson 13f426304f Revert r342233.
This caused LLD test failures, which I've been unable to reproduce.

Reverting to allow for further investigation next week.

llvm-svn: 342244
2018-09-14 16:48:47 +00:00
Adrian Prantl 86497ad2af fix typos
llvm-svn: 342241
2018-09-14 16:12:14 +00:00
Sanjay Patel b437238e95 [InstCombine] add more tests for x86 blendv (PR38814); NFC
llvm-svn: 342237
2018-09-14 13:47:33 +00:00
Simon Pilgrim 1c1335a10d [X86][BMI1] Fix BLSI/BLSMSK/BLSR BMI1 scheduling on btver2
These have the same behaviour as tzcnt on btver2 - confirmed with AMD 16h SOG, Agner and instlatx64.

llvm-svn: 342235
2018-09-14 13:31:14 +00:00
Simon Pilgrim 6a47cdbdec [X86][BMI1] Add scheduler class for BLSI/BLSMSK/BLSR BMI1 instructions
llvm-svn: 342234
2018-09-14 13:09:56 +00:00
James Henderson 48c0688a36 [ThinLTO]Allow setting of maximum cache size with 64-bit number
Also added a C-interface function for large values, and updated
llvm-lto's --thinlto-cache-max-size-bytes switch to take a type larger
than int.

The maximum cache size in terms of bytes is a 64-bit number. However,
the methods to set it only took unsigned previously, which meant that
the maximum cache size could not be specified above 4GB. That's quite
small compared to the output of some projects, so it makes sense to
provide the ability to set larger values in that field.

We also needed a C-interface function that provides a greater range
than the existing thinlto_codegen_set_cache_size_bytes, which also only
takes an unsigned, so this change also adds
hinlto_codegen_set_cache_size_megabytes.

Reviewed by: mehdi_amini, tejohnson, steven_wu

Differential Revision: https://reviews.llvm.org/D52023

llvm-svn: 342233
2018-09-14 12:51:19 +00:00
David Stuttard 20de3e99b5 [AMDGPU] Ensure trig range reduction only used for subtargets that require it
Summary:
GFX9 and above support sin/cos instructions with a greater range and thus don't
require a fract instruction prior to invocation.

Added a subtarget feature to reflect this and added code to take advantage of
expanded range on GFX9+

Also updated the tests to check correct behaviour

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51933

Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0
llvm-svn: 342222
2018-09-14 10:27:19 +00:00
Wolfgang Pieb 55dbac9f07 [DWARF] reposting r342048, which was reverted in r342056 due to buildbot
errors.
Adjusted 2 test cases for ARM and darwin and fixed a bug with the original
change in dsymutil.

llvm-svn: 342218
2018-09-14 09:14:10 +00:00
Sam Parker 7b84fd7847 [ARM] bottom-top mul support in ARMParallelDSP
On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 342210
2018-09-14 08:09:09 +00:00
Florian Hahn 3afb974aa5 [LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops.
As preparation for LoopInterchange becoming a loop pass, it needs to
preserve ScalarEvolution. Even though interchanging should not change
the trip count of the loop, it modifies loop entry, latch and exit
blocks.

I added -verify-scev to some loop interchange tests, but the verification does
not catch problems caused by missing invalidation of SE in loop interchange, as
the trip counts themselves do not change. So there might be potential to
make the SE verification covering more stuff in the future.

Reviewers: mkazantsev, efriedma, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52026

llvm-svn: 342209
2018-09-14 07:50:20 +00:00
Jonas Paulsson 77df2f2f38 [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM
After recent improvements which makes better use of LOC instead of IPM, the
TTI cost functions also needs to be updated to reflect this.

This involves sext, zext and xor of i1.

The tests were updated so that for z13 the new costs are expected, while the
old costs are still checked for on zEC12.

Review: Ulrich Weigand
https://reviews.llvm.org/D51339

llvm-svn: 342207
2018-09-14 06:46:55 +00:00
Martin Storsjo f08a9c700b [Support] Treat null bytes as separator in windows command line strings
When reading directives from a .drectve section, the directives are
tokenized as a normal windows command line. However in these cases,
link.exe allows the directives to be separated by null bytes, not only by
spaces.

A test case for this change will be added in the lld repo.

Differential Revision: https://reviews.llvm.org/D52014

llvm-svn: 342204
2018-09-14 06:08:01 +00:00
Craig Topper e385365c40 [InstCombine] Add some test cases for (add (sext x), (sext y)) --> (sext (add int x, y)) and (mul (sext x), (sext y)) --> (sext (mul x, y)). NFC
llvm-svn: 342203
2018-09-14 05:16:58 +00:00
Max Kazantsev e9765ac275 [NFC] Remove meaningless code from GVN
llvm-svn: 342202
2018-09-14 04:50:38 +00:00
Hideki Saito d19851ac7e Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635
from the commit (r342197) of https://reviews.llvm.org/D50820.

llvm-svn: 342201
2018-09-14 02:02:57 +00:00
Hideki Saito ea7f3035a0 [VPlan] Implement initial vector code generation support for simple outer loops.
Summary:
[VPlan] Implement vector code generation support for simple outer loops.

Context: Patch Series #1 for outer loop vectorization support in LV  using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
                                                          
This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following:

  - force vector code generation using explicit vectorize_width

  - add conservative early returns in cost model and other places for VPlanNativePath

  - add code for setting up outer loop inductions 

  - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches

  - support for generating uniform inner branches

We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. 

Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal

Reviewed By: fhahn, hsaito

Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D50820

llvm-svn: 342197
2018-09-14 00:36:00 +00:00
Richard Diamond 975ff5a6d3 [NFC] Link LLVMCore into LLVMExegesisARMTests.
Fixes missing `llvm::LLVMContext::~LLVMContext()` symbols w/
`BUILD_SHARED_LIBS`.

llvm-svn: 342193
2018-09-13 23:18:33 +00:00
Tim Renouf c8af6a46fa [AMDGPU] Removed unused method
Summary:
I accidentally left this behind in D50306, and it causes a build warning
when I build with gcc7.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52022

Change-Id: I30f7a47047e9d9d841f652da66d2fea19e74842c
llvm-svn: 342189
2018-09-13 21:56:25 +00:00
Matt Morehouse 3bea25e554 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 342186
2018-09-13 21:45:55 +00:00
Roman Lebedev d2316d756e [NFC][InstCombine] PR38708 - inefficient pattern for high-bits checking 3.
The last, non-canonical variant:
https://godbolt.org/z/aCMsPk
https://rise4fun.com/Alive/I6f

It can only happen due to the extra use on the inner shift.
But here it is ok.

https://bugs.llvm.org/show_bug.cgi?id=38708

llvm-svn: 342184
2018-09-13 21:34:47 +00:00
Amara Emerson ef600cbd86 [DAGCombine] Fix crash when store merging created an extract_subvector with invalid index.
Differential Revision: https://reviews.llvm.org/D51831

llvm-svn: 342183
2018-09-13 21:28:58 +00:00
Roman Lebedev b6d188da08 LLVMExegesisX86Tests: link to LLVMCore, too.
Fixes build for me.
Refs. D52054

[215/217] Linking CXX executable unittests/tools/llvm-exegesis/X86/LLVMExegesisX86Tests
FAILED: unittests/tools/llvm-exegesis/X86/LLVMExegesisX86Tests
: && /usr/bin/g++  -pipe -O2 -g0 -UNDEBUG -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -pipe -O2 -g0 -UNDEBUG  -fuse-ld=lld -Wl,--color-diagnostics -Wl,-allow-shlib-undefined     -Wl,-O3 -Wl,--gc-sections unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AssemblerTest.cpp.o unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AnalysisTest.cpp.o unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/SnippetGeneratorTest.cpp.o unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/RegisterAliasingTest.cpp.o unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/TargetTest.cpp.o  -o unittests/tools/llvm-exegesis/X86/LLVMExegesisX86Tests  -Wl,-rpath,/build/llvm-build-GCC-release/lib lib/libLLVMMC.so.8svn lib/libLLVMMCParser.so.8svn lib/libLLVMObject.so.8svn lib/libLLVMSymbolize.so.8svn lib/libLLVMX86CodeGen.so.8svn lib/libLLVMX86AsmParser.so.8svn lib/libLLVMX86AsmPrinter.so.8svn lib/libLLVMX86Desc.so.8svn lib/libLLVMX86Disassembler.so.8svn lib/libLLVMX86Info.so.8svn lib/libLLVMX86Utils.so.8svn lib/libLLVMSupport.so.8svn -lpthread lib/libgtest_main.so.8svn lib/libgtest.so.8svn -lpthread lib/libLLVMExegesis.so.8svn lib/libLLVMExegesisX86.so.8svn && :
ld.lld: error: undefined symbol: llvm::LLVMContext::~LLVMContext()
>>> referenced by AssemblerTest.cpp
>>>               unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AssemblerTest.cpp.o:(exegesis::(anonymous namespace)::X86MachineFunctionGeneratorTest_DISABLED_JitFunction_Test::TestBody())

ld.lld: error: undefined symbol: llvm::LLVMContext::~LLVMContext()
>>> referenced by AssemblerTest.cpp
>>>               unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AssemblerTest.cpp.o:(exegesis::(anonymous namespace)::X86MachineFunctionGeneratorTest_DISABLED_JitFunctionXOR32rr_Default_Test::TestBody())

ld.lld: error: undefined symbol: llvm::LLVMContext::~LLVMContext()
>>> referenced by AssemblerTest.cpp
>>>               unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AssemblerTest.cpp.o:(void exegesis::MachineFunctionGeneratorBaseTest::Check<int, int, int, int, int, int, int, int>(exegesis::ExegesisTarget const&, llvm::ArrayRef<unsigned int>, llvm::MCInst, int, int, int, int, int, int, int, int))

ld.lld: error: undefined symbol: llvm::LLVMContext::~LLVMContext()
>>> referenced by AssemblerTest.cpp
>>>               unittests/tools/llvm-exegesis/X86/CMakeFiles/LLVMExegesisX86Tests.dir/AssemblerTest.cpp.o:(exegesis::(anonymous namespace)::X86MachineFunctionGeneratorTest_DISABLED_JitFunctionMOV32ri_Test::TestBody())
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

llvm-svn: 342182
2018-09-13 21:26:09 +00:00
Sam Clegg 9b3e7c365c [llvm-exegesis] Add missing MC dependency to CMakeLists.txt
See rL342148

This probably only shows up in BUILD_SHARED_LIBS=ON builds
which might explain how it crept in.

Differential Revision: https://reviews.llvm.org/D52054

llvm-svn: 342180
2018-09-13 21:17:16 +00:00
Peter Collingbourne 33d37b2e86 [bindings/go] Add DebugLoc parameter to InsertXXXAtEnd()
These functions previously passed nil for the location, which always resulted in a crash.

This is a signature breaking change, but I cannot see how they could have been used before.

Patch by Ben Clayton!

Differential Revision: https://reviews.llvm.org/D51970

llvm-svn: 342179
2018-09-13 21:16:39 +00:00
Richard Smith 3e32c815e9 Add dependency on new llvm-cxxmap tool to check-llvm.
llvm-svn: 342178
2018-09-13 21:15:34 +00:00
Craig Topper 2f88006ced [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand.
INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense.

This patch fixes this by preventing the removal when a flag operand is present.

The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands.

Differential Revision: https://reviews.llvm.org/D51829

llvm-svn: 342176
2018-09-13 20:51:27 +00:00
Nirav Dave 59ad1c8457 [X86] Fix register resizings for inline assembly register operands.
When replacing a named register input to the appropriately sized
sub/super-register. In the case of a 64-bit value being assigned to a
register in 32-bit mode, match GCC's assignment.

Reviewers: eli.friedman, craig.topper

Subscribers: nickdesaulniers, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51502

llvm-svn: 342175
2018-09-13 20:33:56 +00:00
Nirav Dave 2060a16dfd [X86] Cleanup pair returns. NFCI.
llvm-svn: 342174
2018-09-13 20:33:27 +00:00
Roman Lebedev 6dc87004fa [InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA

We do need to have two `switch()`'es like this,
to not mismatch the swappable predicates.

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52001

llvm-svn: 342173
2018-09-13 20:33:12 +00:00
George Burgess IV d565b0f017 [PartiallyInlineLibCalls] Add DebugCounter support
This adds DebugCounter support to the PartiallyInlineLibCalls pass,
which should make debugging/automated bisection easier in the future.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50093

llvm-svn: 342172
2018-09-13 20:33:04 +00:00
Roman Lebedev 083744852a [NFC][InstCombine] Test what happens if 'unefficient high bit check' pattern is on both sides.
Came up in https://reviews.llvm.org/D52001#1233827
While we don't do a good job here, we at least want to make
sure that we don't have any inf-loops.

llvm-svn: 342171
2018-09-13 20:33:02 +00:00
George Burgess IV 7a8dc3dafa [DCE] Add DebugCounter support
Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50092

llvm-svn: 342170
2018-09-13 20:29:50 +00:00
Volodymyr Sapsai 008982d04f Revert "[cmake] Fix a unittest when `LLVM_LINK_LLVM_DYLIB` is requested."
This reverts commit r342150 as it caused test failure

    LLVM-Unit :: Passes/./PluginsTests/PluginsTests.LoadPlugin

on multiple bots.

llvm-svn: 342169
2018-09-13 20:24:36 +00:00
Richard Smith 3164fcfd27 Add flag to llvm-profdata to allow symbols in profile data to be remapped, and
add a tool to generate symbol remapping files.

Summary:
The new tool llvm-cxxmap builds a symbol mapping table from a file containing
a description of partial equivalences to apply to mangled names and files
containing old and new symbol tables.

Reviewers: davidxl

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D51470

llvm-svn: 342168
2018-09-13 20:22:02 +00:00
Richard Smith 8d8c057eab Fix a couple of mangling canonicalizer corner case bugs.
Summary:
The hash computed for an ArrayType was different when first constructed
versus when later profiled due to the constructor default argument, and
we were not tracking constructor / destructor variant as part of the
mangled name AST, leading to incorrect equivalences.

Reviewers: erik.pilkington

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51463

llvm-svn: 342166
2018-09-13 20:00:21 +00:00
Craig Topper 8fc05ce340 [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
This allows the xor to be removed completely.

This might help with recomitting r341674, but seems good regardless.

Coincidentally fixes PR38915.

Differential Revision: https://reviews.llvm.org/D51964

llvm-svn: 342163
2018-09-13 18:52:58 +00:00
Craig Topper 3fc5e72d84 [InstCombine] Add test cases for D51964. NFC
llvm-svn: 342162
2018-09-13 18:52:56 +00:00
Richard Smith 327f05509f Common infrastructure for reading a profile remapping file and building
a mangling remapper from it.

Differential Revision: https://reviews.llvm.org/D51246

llvm-svn: 342161
2018-09-13 18:51:44 +00:00
Ana Pazos 065b088759 [RISCV][MC] Reject bare symbols for the simm6 and simm6nonzero operand types
Summary:
Fixed assertions due to invalid fixup when encoding compressed instructions
 (c.addi, c.addiw, c.li, c.andi) with bare symbols with/without modifiers.
  This matches GAS behavior as well.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D52005

llvm-svn: 342160
2018-09-13 18:37:23 +00:00
Ana Pazos b0799dda77 [RISCV] Fix decoding of invalid instruction with C extension enabled.
Summary:
The illegal instruction 0x00 0x00 is being wrongly decoded as
c.addi4spn with 0 immediate.

The invalid instruction 0x01 0x61 is being wrongly decoded as
c.addi16sp with 0 immediate.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51815

llvm-svn: 342159
2018-09-13 18:21:19 +00:00
Sam Clegg 79c054f6b8 [WebAssembly] Fix signature of `main` in FixFunctionBitcasts
Also, add a check to ensure that when main has the expected signature
we do not create a wrapper.

Differential Revision: https://reviews.llvm.org/D51562

llvm-svn: 342157
2018-09-13 17:13:10 +00:00
Simon Pilgrim dbdd46da18 [AArch64] Add integer abs testcases for D51873.
llvm-svn: 342156
2018-09-13 17:11:25 +00:00
Richard Diamond f29b36c76d [cmake] Fix missing DEPENDS.
Not sure how I didn't catch this.

llvm-svn: 342154
2018-09-13 17:10:44 +00:00
Richard Diamond 939468d4b2 [cmake] Fix a unittest when `LLVM_LINK_LLVM_DYLIB` is requested.
llvm-svn: 342150
2018-09-13 16:39:52 +00:00
Sanjay Patel 37e464876b [InstCombine] remove checks for IsFreeToInvert()
I accidentally committed this diff with rL342147 because
I had applied D51964. We probably do need those checks,
but D51964 has tests and more discussion/motivation,
so they should be re-added with that patch.

llvm-svn: 342149
2018-09-13 16:18:12 +00:00
Richard Diamond f3063baa6e Renovate CMake files in the `llvm-(cfi-verify|exegesis|mca)` tools.
llvm-svn: 342148
2018-09-13 16:15:03 +00:00
Sanjay Patel 6f00fc3317 [InstCombine] reorder folds to reduce chance of infinite loops
I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.

In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the 
foldSPFofSPF() patterns first.

llvm-svn: 342147
2018-09-13 16:04:06 +00:00
Sam Parker aaec3c6260 [ARM] Allow truncs as sources in ARM CGP
We previously only allowed truncs as sinks, but now allow them as
sources too. We do this by checking that the result type is the
narrow type that we're trying to optimise for.

Differential Revision: https://reviews.llvm.org/D51978

llvm-svn: 342141
2018-09-13 15:14:12 +00:00
Sam Parker 96f77f142b [ARM] Fix FixConst for ARMCodeGenPrepare
Part of FixConsts wrongly assumes either a 8- or 16-bit constant
which can result in the wrong constants being generated during
promotion.

Differential Revision: https://reviews.llvm.org/D52032

llvm-svn: 342140
2018-09-13 14:48:10 +00:00
Jonas Devlieghere 64c901d2b1 [MC/Dwarf] Unclamp DWARF linetables format on Darwin.
In r319995, we fixed the line table format to version 2 on Darwin
because dsymutil didn't yet understand the new format which caused test
failures for the LLDB bots. This has been resolved in the meantime so
there's no reason to keep this limitation.

rdar://problem/35968332

llvm-svn: 342136
2018-09-13 13:13:50 +00:00
Matt Arsenault ff987ac6ea AMDGPU: Fix not preserving alignent in call setups
If an argument was passed on the stack, this
was using the default alignment.

I'm not sure there's an observable change from this. This
was observable due to bugs in expansion of unaligned
loads and stores, but since that is fixed I don't think
this matters much.

llvm-svn: 342133
2018-09-13 12:14:31 +00:00
Matt Arsenault 842cda6312 DAG: Fix expansion of unaligned FP loads and stores
This was trying to scalarizing a scalar FP type,
resulting in an assert.

Fixes unaligned f64 stack stores for AMDGPU.

llvm-svn: 342132
2018-09-13 12:14:23 +00:00
Matt Arsenault 9de2fb58fa AMDGPU: Fix some outdated datalayouts in tests
llvm-svn: 342131
2018-09-13 11:56:28 +00:00
Simon Pilgrim 5b65e41a8f Fix unused variable warning. NFCI.
llvm-svn: 342128
2018-09-13 10:54:23 +00:00
Tim Northover c15d47bb01 ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.
The Technical Reference Manuals for these two CPUs state that branching
to an unaligned 32-bit instruction incurs an extra pipeline reload
penalty. That's bad.

This also enables the optimization at -Os since it costs on average one
byte per loop in return for 1 cycle per iteration, which is pretty good
going.

llvm-svn: 342127
2018-09-13 10:28:05 +00:00
Dean Michael Berris 90a46bdec2 [XRay] Bug fixes for FDR custom event and arg-logging
Summary:
This change has a number of fixes for FDR mode in compiler-rt along with
changes to the tooling handling the traces in llvm.

In the runtime, we do the following:

- Advance the "last record" pointer appropriately when writing the
  custom event data in the log.

- Add XRAY_NEVER_INSTRUMENT in the rewinding routine.

- When collecting the argument of functions appropriately marked, we
  should not attempt to rewind them (and reset the counts of functions
  that can be re-wound).

In the tooling, we do the following:

- Remove the state logic in BlockIndexer and instead rely on the
  presence/absence of records to indicate blocks.

- Move the verifier into a loop associated with each block.

Reviewers: mboerger, eizan

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51965

llvm-svn: 342122
2018-09-13 09:25:42 +00:00
Alexander Timofeev 4d302f6911 [AMDGPU] Load divergence predicate refactoring
Differential revision: https://reviews.llvm.org/D51931

    Reviewers: rampitec

llvm-svn: 342120
2018-09-13 09:06:56 +00:00
Simon Atanasyan c49da2e4ed [mips] Enable the mnemonic spell corrector
This implements suggesting alternative mnemonics when an invalid one is
specified. For example `addru $9, $6, 17767` leads to the following
error message:

error: unknown instruction, did you mean: add, addiu, addu, maddu?

Differential revision: https://reviews.llvm.org/D40646

llvm-svn: 342119
2018-09-13 08:38:03 +00:00
Clement Courbet 7958735e45 [llvm-exegesis][NFC] Remove dead parameter.
llvm-svn: 342118
2018-09-13 08:06:29 +00:00
Clement Courbet d939f6d013 [llvm-exegesis][NFC] Split BenchmarkRunner class
Summary:
The snippet-generation part goes to the SnippetGenerator class.

This will allow benchmarking arbitrary code (see PR38437).

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51979

llvm-svn: 342117
2018-09-13 07:40:53 +00:00
Alexander Timofeev 2fb44808b1 [AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed.
Differential revision: https://reviews.llvm.org/D51975

    Reviewers: rampitec

llvm-svn: 342115
2018-09-13 06:34:56 +00:00
Craig Topper f107123a88 [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting
Summary:
Previously we type legalized v2i32 div/rem by promoting to v2i64. But we don't support div/rem of vectors so op legalization would then scalarize it using i64 scalar ops since it doesn't know about the original promotion. 64-bit scalar divides on Intel hardware are known to be slow and in 32-bit mode they require a libcall.

This patch switches type legalization to do the scalarizing itself using i32.

It looks like the division by power of 2 optimization is still kicking in and leaving the code as a vector. The division by other constant optimization doesn't kick in pre type legalization since it ignores illegal types. And previously, after type legalization we scalarized the v2i64 since we don't have v2i64 MULHS/MULHU support.

Another option might be to widen v2i32 to v4i32 so we could do division by constant optimizations, but we'd have to be careful to only do that for constant divisors or we risk scalaring to 4 scalar divides.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51325

llvm-svn: 342114
2018-09-13 06:13:37 +00:00
Saleem Abdulrasool aaa72c547b ARM: correct the relocation type for `bl` on WoA
The `IMAGE_REL_ARM_BRANCH20T` applies only to a `b.w` instruction.  A
thumb-2 `bl` should be relocated using a `IMAGE_REL_ARM_BRANCH24T`.
Correct the relocation that we emit in such a case.

Resolves PR38620!  Based on the patch by Jordan Rhee!

llvm-svn: 342109
2018-09-13 04:55:08 +00:00
Max Kazantsev b2724d9af8 [NFC] Add Requires: asserts where needed
llvm-svn: 342108
2018-09-13 04:43:24 +00:00
Max Kazantsev 0e0e19c980 [NFC] Use expensive asserts in relevant LICM tests
llvm-svn: 342107
2018-09-13 04:00:39 +00:00
Thomas Lively 65825cd7c5 Remove isAsCheapAsAMove from v128.const
llvm-svn: 342106
2018-09-13 02:50:57 +00:00
Thomas Lively 17ba6becaa Remove isAsCheapAsAMove from mem ops
llvm-svn: 342105
2018-09-13 02:50:57 +00:00
Thomas Lively 56b34f6c51 [WebAssembly] Add missing SIMD instruction attributes
Summary:
These attributes are copied from equivalent instructions in
WebAssemblyInstrInfo.td.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51518

llvm-svn: 342104
2018-09-13 02:50:56 +00:00
David Blaikie 911907ca3c STLExtras: Add some more algorithm wrappers
llvm-svn: 342102
2018-09-13 00:02:03 +00:00
David Blaikie eee709f03c DebugInfo/PDB: Remove unused member
llvm-svn: 342101
2018-09-13 00:02:02 +00:00
David Blaikie da36f3f482 dwarfdump: Improve performance on large DWP files
llvm-svn: 342099
2018-09-12 23:39:51 +00:00
Sanjay Patel 8a478b79dc [DAGCombiner] improve formatting for select+setcc code; NFC
llvm-svn: 342095
2018-09-12 23:03:50 +00:00
Adrian Prantl 9a45452987 fix 80-column violation with clang-format
llvm-svn: 342094
2018-09-12 22:57:28 +00:00
Zachary Turner c43d55602f [PDB] Remove all clone() methods.
These are dead code and encourage poor usage patterns, so I'm
removing them.  They weren't called anywhere anyway.

llvm-svn: 342093
2018-09-12 22:57:03 +00:00
Krzysztof Parzyszek a6d4fc0e29 [Hexagon] Use shuffles when lowering "gather" shufflevectors
Shufflevector instructions in LLVM IR that extract a subset of elements
of a longer input into a shorter vector can be done using VECTOR_SHUFFLEs.
This will avoid expanding them into constly extracts and inserts.

llvm-svn: 342091
2018-09-12 22:14:52 +00:00
Krzysztof Parzyszek f853741142 [Hexagon] Improve the selection algorithm in scalarizeShuffle
Use topological ordering for newly generated nodes.

llvm-svn: 342090
2018-09-12 22:10:58 +00:00
Kristina Brooks 3a55d1ef27 [Support] sys::fs::directory_entry includes the file_type.
This is available on most platforms (Linux/Mac/Win/BSD) with no extra syscalls.
On other platforms (e.g. Solaris) we stat() if this information is requested.

This will allow switching clang's VFS to efficiently expose (path, type) when
traversing a directory. Currently it exposes an entire Status, but does so by
calling fs::status() on all platforms.
Almost all callers only need the path, and all callers only need (path, type).

Patch by sammccall (Sam McCall)

Differential Revision: https://reviews.llvm.org/D51918

llvm-svn: 342089
2018-09-12 22:08:10 +00:00
Vedant Kumar 2963c49087 [llvm-cov] Delete custom JSON serialization code (NFC)
Teach llvm-cov to use the new llvm JSON library, and remove some
redundant/brittle JSON serialization tests.

llvm-svn: 342088
2018-09-12 21:59:38 +00:00
Lang Hames 8be0d2e3c2 [ORC] Merge ExecutionSessionBase with ExecutionSession by moving a couple of
template methods in JITDylib out-of-line.

This also splits JITDylib::define into a pair of template methods, one taking an
lvalue reference and the other an rvalue reference. This simplifies the
templates at the cost of a small amount of code duplication.

llvm-svn: 342087
2018-09-12 21:49:02 +00:00
Lang Hames 13014d3ce3 [ORC] Add a special 'main' JITDylib that is created on ExecutionSession
construction, a new convenience lookup method, and add-to layer methods.

ExecutionSession now creates a special 'main' JITDylib upon construction. All
subsequently created JITDylibs are added to the main JITDylib's search order by
default (controlled by the AddToMainDylibSearchOrder parameter to
ExecutionSession::createDylib). The main JITDylib's search order will be used in
the future to properly handle cross-JITDylib weak symbols, with the first
definition in this search order selected.

This commit also adds a new ExecutionSession::lookup convenience method that
performs a blocking lookup using the main JITDylib's search order, as this will
be a very common operation for clients.

Finally, new convenience overloads of IRLayer and ObjectLayer's add methods are
introduced that add the given program representations to the main dylib, which
is likely to be the common case.

llvm-svn: 342086
2018-09-12 21:48:59 +00:00
Heejin Ahn 300f42fbce [WebAssembly] Make tied inline asm operands work again
Summary:
rL341389 broke code with tied register operands in inline assembly. For
example, `asm("" : "=r"(var) : "0"(var));`
The code above specifies the input operand to be in the same register
with the output operand, tying the two register. This patch makes this
kind of code work again.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, eraman, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51991

llvm-svn: 342084
2018-09-12 21:34:39 +00:00
Sanjay Patel d341988c86 revert r341288 - [Reassociate] swap binop operands to increase factoring potential
This causes or exposes indeterminism that is visible in the output of -reassociate.

llvm-svn: 342083
2018-09-12 21:29:11 +00:00
Sanjay Patel 31017cd10a [InstCombine] add tests for unsigned add overflow; NFC
llvm-svn: 342082
2018-09-12 21:13:37 +00:00
Michael Berg 22a53cbc7f Guard FMF context by excluding some FP operators from FPMathOperator
Summary:
Some FPMathOperators succeed and the retrieve FMF context when they never have it, we should omit these cases to keep from removing FMF context.

For instance when we visit some FPMathOperator mapped Instructions which never have FMF flags and a Node was associated which does have FMF flags, that Node today will have all its flags cleared via the intersect operation.  With this change, we exclude associating Nodes that never have FPMathOperator status under FMF.

Reviewers: spatel, wristow, arsenm, hfinkel, aemerson

Reviewed By: spatel

Subscribers: llvm-commits, wdng

Differential Revision: https://reviews.llvm.org/D51145

llvm-svn: 342081
2018-09-12 21:09:59 +00:00
Zachary Turner a1f85f8bdd [PDB] Emit old fpo data to the PDB file.
r342003 added support for emitting FPO data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB
file.  However, that is not the end of the story.  FPO can end
up in two different destinations in a PDB, each corresponding to
a different FPO data source.

The case handled by r342003 involves copying data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the
"New FPO" stream in the PDB, which is then referred to by the
DBI stream.  The case handled by this patch involves copying
records from the .debug$F section of an object file to the "FPO"
stream (or perhaps more aptly, the "Old FPO" stream) in the PDB
file, which is also referred to by the DBI stream.

The formats are largely similar, and the difference is mostly
only visible in masm generated object files, such as some of the
low-level CRT object files like memcpy.  MASM doesn't appear to
support writing the DEBUG_S_FRAMEDATA subsection, and instead
just writes these records to the .debug$F section.

Although clang-cl does not emit a .debug$F section ever, lld still
needs to support it so we have good debugging for CRT functions.

Differential Revision: https://reviews.llvm.org/D51958

llvm-svn: 342080
2018-09-12 21:02:01 +00:00
Krzysztof Parzyszek cd95e03cf0 [Hexagon] Use legalized type for extracted elements in scalarizeShuffle
Scalarization of a shuffle will break up the source vectors into individual
elements, and use them to assemble the resulting vector. An element type of
a legal vector type may not necessarily be a legal scalar type, so make
sure that the extracted values are extended to a legal scalar type.

llvm-svn: 342079
2018-09-12 20:58:48 +00:00
Konstantin Zhuravlyov 6e551e0e49 AMDGPU: Print all kernel descriptor directives (including the ones with default values)
Change by Tony Tye

Differential Revision: https://reviews.llvm.org/D51954

llvm-svn: 342077
2018-09-12 20:25:39 +00:00
Roman Lebedev e14b0282bb [NFC][InstCombine] Drop newly-added interference-tests-for-high-bit-check.ll
Now that i have actually double-checked, no,
there is no such interference possible...

llvm-svn: 342076
2018-09-12 20:06:46 +00:00
Roman Lebedev 91c668a276 [NFC][InstCombine] R38708 - inefficient pattern for high-bits checking.
More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA
https://godbolt.org/z/o4RB8D

Also, we need to be careful not to skip some patters...

https://bugs.llvm.org/show_bug.cgi?id=38708

llvm-svn: 342074
2018-09-12 19:44:26 +00:00
Konstantin Zhuravlyov 71e43ee47d AMDGPU: Re-apply r341982 after fixing the layering issue
Move isa version determination into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue when we
output wrong isa version in the object code when features
of a particular CPU are altered (i.e. gfx902 w/o xnack
used to result in gfx900).

llvm-svn: 342069
2018-09-12 18:50:47 +00:00
Roman Lebedev 75404fb9f8 [InstCombine] Inefficient pattern for high-bits checking (PR38708)
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only `n` bits wide (where `n` is a variable.)
There are **many** ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)

Let's handle the second variant first, since it is much simpler.
https://rise4fun.com/Alive/LYjY

https://bugs.llvm.org/show_bug.cgi?id=38708

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51985

llvm-svn: 342067
2018-09-12 18:19:43 +00:00
Julie Hockett 468722ee9f [objcopy] make objcopy follow program header standards
Submitted on behalf of Armando Montanez (amontanez@google.com).

Objects with unused program headers copied by objcopy would always have
nonzero values for program header offset and program header entry size.
While technically valid, this atypical behavior triggers warnings in some
tools. This change sets the two fields to zero when the program header is
unused, better fitting the general expectations for unused program header
data.

Section headers behaved somewhat similarly (though only with the entry size),
and are fixed in this revision as well.

Differential Revision: https://reviews.llvm.org/D51961

llvm-svn: 342065
2018-09-12 17:56:31 +00:00
Thomas Lively ebd4c906d8 [WebAssembly] SIMD comparisons
Summary:
Match the ordering semantics of non-vector comparisons. For
floating point comparisons that do not correspond to instructions, the
tests check that some vector comparison instruction was emitted but do
not care about the full implementation.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51765

llvm-svn: 342064
2018-09-12 17:56:00 +00:00
Diogo N. Sampaio 01b916e188 [ARM] Tighten f64<->f16 conversion requirements
Fix missing Requires fields.

Patch by Bernard Ogden (bogden)

Reviewers: SjoerdMeijer, javed.absar, t.p.northover	

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D51631

llvm-svn: 342061
2018-09-12 16:24:43 +00:00
Craig Topper 2262613532 [X86] Remove isel patterns for ADCX instruction
There's no advantage to this instruction unless you need to avoid touching other flag bits. It's encoding is longer, it can't fold an immediate, it doesn't write all the flags.

I don't think gcc will generate this instruction either.

Fixes PR38852.

Differential Revision: https://reviews.llvm.org/D51754

llvm-svn: 342059
2018-09-12 15:47:34 +00:00
Florian Hahn a3d8065045 [PatternMatch] Use generic One,Two,ThreeOps_match classes (NFC).
Currently we have a few duplicated matcher classes, which all do pretty
much the same thing. This patch introduces generic
One,Tow,ThreeOps_match classes which take the opcode the match as
template argument.

Reviewers: SjoerdMeijer, dneilson, spatel, arsenm

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D51044

llvm-svn: 342058
2018-09-12 14:52:38 +00:00
Wolfgang Pieb 233bc73047 Reverting r342048, which caused UBSan failures in dsymutil.
llvm-svn: 342056
2018-09-12 14:40:04 +00:00
Alexandros Lamprineas 54d56f274c [GVNHoist] computeInsertionPoints() miscalculates IDF
Fix for https://bugs.llvm.org/show_bug.cgi?id=38912.

In GVNHoist::computeInsertionPoints() we iterate over the Value
Numbers and calculate the Iterated Dominance Frontiers without
clearing the IDFBlocks vector first. IDFBlocks ends up accumulating
an insane number of basic blocks, which bloats the compilation time
of SemaChecking.cpp with ubsan enabled.

Differential Revision: https://reviews.llvm.org/D51980

llvm-svn: 342055
2018-09-12 14:28:23 +00:00
Roman Lebedev 99359f391e [NFC][InstCombine] R38708 - inefficient pattern for high-bits checking.
The simplest pattern for now:
https://rise4fun.com/Alive/LYjY
https://godbolt.org/z/o4RB8D

https://bugs.llvm.org/show_bug.cgi?id=38708

llvm-svn: 342054
2018-09-12 14:11:37 +00:00
Sander de Smalen 2d77e788f2 [AArch64] Implement aarch64_vector_pcs codegen support.
This patch adds codegen support for the saving/restoring
V8-V23 for functions specified with the aarch64_vector_pcs
calling convention attribute, as added in patch D51477.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51479

llvm-svn: 342049
2018-09-12 12:10:22 +00:00
Wolfgang Pieb 3a8781cf6c [DWARF] Refactoring range list dumping to fold DWARF v4 functionality into v5 handling
Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.

Reviewer: dblaikie, JDevlieghere, vleschuk

Differential revision: https://reviews.llvm.org/D51081

llvm-svn: 342048
2018-09-12 12:01:19 +00:00
Kristof Umann c5f0f2f13a [ADT] Made numerous methods of ImmutableList const
Also added ImmutableList<T>::iterator::operator->.

Differential Revision: https://reviews.llvm.org/D51881

llvm-svn: 342045
2018-09-12 11:20:15 +00:00
David Green e27e87cdcb [CGP] Ensure splitgep gives deterministic output
The output of splitLargeGEPOffsets does not appear to be deterministic because
of the way that we iterate over a DenseMap. I've changed it to a MapVector for
consistent output.

The test here isn't particularly great, only showing a consmetic difference in
output. The original reproducer is much larger but show a diffierence in
instruction ordering, leading to different codegen.

Differential Revision: https://reviews.llvm.org/D51851

llvm-svn: 342043
2018-09-12 10:19:10 +00:00
Sam Parker 1187911b0b [ARM] Follow-up to rL342033
Fixed typo which can cause segfault.

llvm-svn: 342040
2018-09-12 09:58:56 +00:00
David Green 2352b30c96 [SimplifyCFG] Put an alignment on generated switch tables
Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

llvm-svn: 342039
2018-09-12 09:54:17 +00:00
Sander de Smalen 7140363cd0 [AArch64] NFC: Refactoring to prepare for vector PCS.
This patch refactors several parts of AArch64FrameLowering
so that it can be easily extended to support saving/restoring
of FPR128 (Q) registers.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51478

llvm-svn: 342038
2018-09-12 09:44:46 +00:00
Clement Courbet 903667e956 [llvm-exegesis][NFC]Remove dead function parameter
llvm-svn: 342035
2018-09-12 09:26:32 +00:00
Sam Parker a023c7a9cb [ARM] Exchange MAC operands in ARMParallelDSP
SMLAD and SMLALD instructions also come in the form of SMLADX and
SMLALDX which perform an exchange on their second operand. To support
this, more of the loads in the MAC candidates are compared for
sequential access and a boolean value has been added to BinOpChain.

AddMACCandiate has been refactored into a small pattern matching
state machine to reduce the amount of duplicated code, but also to
enable the matching to be more flexible. CreateParallelMACPairs now
iterates through all the candidates to find parallel ones.

Differential Revision: https://reviews.llvm.org/D51424

llvm-svn: 342033
2018-09-12 09:17:44 +00:00
Sam Parker 569b24549e [ARM] Allow bitcasts in ARMCodeGenPrepare
Allow bitcasts in the use-def chains, treating them as sources.

Differential Revision: https://reviews.llvm.org/D50758

llvm-svn: 342032
2018-09-12 09:11:48 +00:00
Sander de Smalen 4dbc512676 [AArch64] Add parsing of aarch64_vector_pcs attribute.
This patch adds parsing support for the 'aarch64_vector_pcs'
calling convention attribute to calls and function declarations.

More information describing the vector ABI and procedure call standard
can be found here:

  https://developer.arm.com/products/software-development-tools/\
                            hpc/arm-compiler-for-hpc/vector-function-abi

Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D51477

llvm-svn: 342030
2018-09-12 08:54:06 +00:00
Florian Hahn 1086ce2397 [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)
Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488

llvm-svn: 342027
2018-09-12 08:01:57 +00:00
Ilya Biryukov 1ea75783fa Remove unused include from IVDescriptors.cpp.
This fixes a layering violation:

Analysis/IVDescrtors.cpp can't include Transforms/Utils/BasicBlockUtils.h,
since TransformUtils depends on Analysis.

llvm-svn: 342024
2018-09-12 07:22:46 +00:00
Ilya Biryukov 95066496d0 Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."
This reverts commit r341982.

The change introduced a layering violation. Reverting to unbreak
our integrate.

llvm-svn: 342023
2018-09-12 07:05:30 +00:00
Craig Topper 26a3799858 [SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles UpdateNodeOperands returning an existing node instead of updating.
I suspect this became unecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and the promote the mask of a separate gather node so they become the same. But that seems pretty unlikely.

llvm-svn: 342022
2018-09-12 05:25:41 +00:00
Vikram TV 7e98d69847 Break LoopUtils into an Analysis file.
Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.

The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).

Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.

Reviewers: dmgreen, llvm-commits, hfinkel

Reviewed By: dmgreen

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D51153

llvm-svn: 342016
2018-09-12 01:59:43 +00:00
Craig Topper dc32e91bc6 [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32
Summary:
In GNUX23, is64BitMode returns true, but pointers are 32-bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match.

Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger.

Reviewers: rnk, efriedma, MatzeB, echristo

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51893

llvm-svn: 342015
2018-09-12 01:57:22 +00:00
Lang Hames ca007b7bc9 [ORC] Remove some unused typedefs.
llvm-svn: 342013
2018-09-12 00:35:03 +00:00
Jessica Paquette 2386eab360 [MachineOutliner] Add codegen size remarks to the MachineOutliner
Since the outliner is a module pass, it doesn't get codegen size remarks like
the other codegen passes do. This adds size remarks *to* the outliner.

This is kind of a workaround, so it's peppered with FIXMEs; size remarks
really ought to not ever be handled by the pass itself. However, since the
outliner is the only "MachineModulePass", this works for now. Since the
entire purpose of the MachineOutliner is to produce code size savings, it
really ought to be included in codgen size remarks.

If we ever go ahead and make a MachineModulePass (say, something similar to
MachineFunctionPass), then all of this ought to be moved there.

llvm-svn: 342009
2018-09-11 23:05:34 +00:00
Sanjay Patel 1cf0734b2f [InstCombine] add folds for unsigned-overflow compares
Name: op_ugt_sum
  %a = add i8 %x, %y
  %r = icmp ugt i8 %x, %a
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

Name: sum_ult_op
  %a = add i8 %x, %y
  %r = icmp ult i8 %a, %x
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

https://rise4fun.com/Alive/ZRxI

AFAICT, this doesn't interfere with any add-saturation patterns
because those have >1 use for the 'add'. But this should be
better for IR analysis and codegen in the basic cases.

This is another fold inspired by PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 342004
2018-09-11 22:40:20 +00:00
Zachary Turner 42e7cc1b0f [PDB] Write FPO Data to the PDB.
llvm-svn: 342003
2018-09-11 22:35:01 +00:00
Reid Kleckner 697d6cb8ee [cmake] Speed up check-llvm 5x by delay loading shell32 and ole32
Summary:
Previously, check-llvm on my Windows 10 workstation took about 300s to
run, and it would lock up my mouse. Now, it takes just under 60s.
Previously running the tests only used about 15% of the available CPU
time, now it uses up to 60%.

Shell32.dll and ole32.dll have direct dependencies on user32.dll and
gdi32.dll. These DLLs are mostly used to for Windows GUI functionality,
which most LLVM tools don't need. It seems that these DLLs acquire and
release some system resources on startup and exit, and that requires
acquiring the same highly contended lock discussed in this post:
https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/

Most LLVM tools still have a transitive dependency on
SHGetKnownFolderPathW, which is used to look up the user home directory
and local AppData directory. We also use SHFileOperationW to recursively
delete a directory, but only LLDB and clang-doc use this today. At some
point, we might consider removing these last shell32.dll deps, but for
now, I would like to take this free performance.

Many thanks to Bruce Dawson for suggesting this fix. He looked at an ETW
trace of an LLVM test run, noticed these DLLs, and suggested avoiding
them.

Reviewers: zturner, pcc, thakis

Subscribers: mgorny, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51952

llvm-svn: 342002
2018-09-11 22:25:13 +00:00
Alexandros Lamprineas fe0512d575 Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

llvm-svn: 342001
2018-09-11 22:10:57 +00:00
Reid Kleckner 4a17780291 Apply local fixes intended to be part of r341999.'
llvm-svn: 342000
2018-09-11 22:02:31 +00:00
Reid Kleckner a6f64265ea [codeview] Decode and dump FP regs from S_FRAMEPROC records
Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:

  0: no register - Used when there are no variables.
  1: SP / standard - Variables are stored relative to the standard SP
     for the ISA.
  2: FP - Variables are addressed relative to the ISA frame
     pointer, i.e. EBP on x86. If realignment is required, parameters
     use this. If a dynamic alloca is used, locals will be EBP relative.
  3: Alternative - Variables are stored relative to some alternative
     third callee-saved register. This is required to address highly
     aligned locals when there are dynamic stack adjustments. In this
     case, both the incoming SP saved in the standard FP and the current
     SP are at some dynamic offset from the locals. LLVM uses ESI in
     this case, MSVC uses EBX.

Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.

Subscribers: hiraditya

Differential Revision: https://reviews.llvm.org/D51894

llvm-svn: 341999
2018-09-11 22:00:50 +00:00
Alexander Shaposhnikov 72b27a6a39 [object] Improve the performance of getSymbols used by ArchiveWriter
In this diff we adjust the code of getSymbols to avoid creating LLVMContext when it's not necessary.
Without this patch when the function getSymbols was called on a MachO object with a __bitcode section
it was parsing the embedded bitcode and then ignoring the result.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D51759

llvm-svn: 341998
2018-09-11 22:00:47 +00:00
Sanjay Patel 26725bdc50 [InstCombine] add folds for icmp with xor mask constant
These are the folds in Alive;
Name: xor_ult
Pre: isPowerOf2(-C1)
%xor = xor i8 %x, C1
%r = icmp ult i8 %xor, C1
=>
%r = icmp ugt i8 %x, ~C1

Name: xor_ugt
Pre: isPowerOf2(C1+1)
%xor = xor i8 %x, C1
%r = icmp ugt i8 %xor, C1
=>
%r = icmp ugt i8 %x, C1

https://rise4fun.com/Alive/Vty

The ugt case in its simplest form was already handled by DemandedBits,
but that's not ideal as shown in the multi-use test.

I'm not sure if these are all of the symmetrical folds, but I adjusted 
the existing code for one of the folds to try to show the similarities.

There's no obvious connection, but this is another preliminary step 
for PR14613...
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 341997
2018-09-11 22:00:15 +00:00
Michael Berg c72a7259be add IR flags to MI
Summary: Initial support for nsw, nuw and exact flags in MI

Reviewers: spatel, hfinkel, wristow

Reviewed By: spatel

Subscribers: nlopes

Differential Revision: https://reviews.llvm.org/D51738

llvm-svn: 341996
2018-09-11 21:35:32 +00:00
Matt Morehouse f0d7daa972 Revert "[SanitizerCoverage] Create comdat for global arrays."
This reverts r341987 since it will cause trouble when there's a module
ID collision.

llvm-svn: 341995
2018-09-11 21:15:41 +00:00
Sanjay Patel c79d964fdd [InstCombine] add tests for icmp with xor; NFC
llvm-svn: 341993
2018-09-11 21:13:20 +00:00
Reid Kleckner 45265d996b [Support] Quote arguments containing \n on Windows
Fixes at_file.c test failure caused by r341988. We may want to change
how we treat \n in our tokenizer, but this is probably a good fix
regardless, since we can invoke all kinds of programs with different
interpretations of the command line quoting rules.

llvm-svn: 341992
2018-09-11 21:02:03 +00:00
Reid Kleckner f6968d886a [Support] Avoid calling CommandLineToArgvW from shell32.dll
Summary:
Shell32.dll depends on gdi32.dll and user32.dll, which are mostly DLLs
for Windows GUI functionality. LLVM's utilities don't typically need GUI
functionality, and loading these DLLs seems to be slowing down startup.
Also, we already have an implementation of Windows command line
tokenization in cl::TokenizeWindowsCommandLine, so we can just use it.

The goal is to get the original argv in UTF-8, so that it can pass
through most LLVM string APIs. A Windows process starts life with a
UTF-16 string for its command line, and it can be retreived with
GetCommandLineW from kernel32.dll.

Previously, we would:
1. Get the wide command line
2. Call CommandLineToArgvW to handle quoting rules and separate it into
   arguments.
3. For each wide argument, expand wildcards (* and ?) using
   FindFirstFileW.
4. Convert each argument to UTF-8

Now we:
1. Get the wide command line, convert the whole thing to UTF-8
2. Tokenize the UTF-8 command line with cl::TokenizeWindowsCommandLine
3. For each argument, expand wildcards if present
   - This requires converting back to UTF-16 to call FindFirstFileW
   - Results of FindFirstFileW must be converted back to UTF-8

Reviewers: zturner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51941

llvm-svn: 341988
2018-09-11 20:22:39 +00:00
Matt Morehouse 7ce6032432 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 341987
2018-09-11 20:10:40 +00:00
Alina Sbirlea a496143c9e Update MemorySSA in LoopUnswitch.
Summary:
Update MemorySSA in old LoopUnswitch pass.
Actual dependency and update is disabled by default.

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45301

llvm-svn: 341984
2018-09-11 19:19:21 +00:00
Konstantin Zhuravlyov 941615e4c8 AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination
into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue when we
output wrong isa version in the object code when features
of a particular CPU are altered (i.e. gfx902 w/o xnack
used to result in gfx900).

Differential Revision: https://reviews.llvm.org/D51890

llvm-svn: 341982
2018-09-11 18:56:51 +00:00
Sanjay Patel 342c3bcf11 [InstCombine] enhance vector demanded elements to look at a vector select condition operand
I noticed that we were not back-propagating undef lanes to shuffle masks when we have a 
shuffle that reduces the vector width. This is part of investigating/solving PR38691:
https://bugs.llvm.org/show_bug.cgi?id=38691

The DAG equivalent was proposed with:
D51696

Differential Revision: https://reviews.llvm.org/D51433

llvm-svn: 341981
2018-09-11 18:49:00 +00:00
Matt Davis db834837c2 [llvm-mca] Delay calculation of Cycles per Resources, separate the cycles and resource quantities.
Summary:
This patch removes the storing of accumulated floating point data 
within the llvm-mca library.

This patch splits-up the two quantities: cycles and number of resource units.
By splitting-up these two quantities, we delay the calculation of "cycles per resource unit"
until that value is read, reducing the chance of accumulating floating point error. 

I considered using the APFloat, but after measuring performance, for a large (many iteration)
sample, I decided to go with this faster solution.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, javed.absar, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D51903

llvm-svn: 341980
2018-09-11 18:47:48 +00:00
Sanjay Patel 44c1b3a331 [InstCombine] add tests for add-with-overflow compares; NFC
llvm-svn: 341979
2018-09-11 18:45:28 +00:00
Vedant Kumar 727d89526e [gcov] Fix branch counters with switch statements (fix PR38821)
Right now, the counters are added in regards of the number of successors
for a given BasicBlock: it's good when we've only 1 or 2 successors (at
least with BranchInstr). But in the case of a switch statement, the
BasicBlock after switch has several predecessors and we need know from
which BB we're coming from.

So the idea is to revert what we're doing: add a PHINode in each block
which will select the counter according to the incoming BB.  They're
several pros for doing that:

- we fix the "switch" bug
- we remove the function call to "__llvm_gcov_indirect_counter_increment"
  and the lookup table stuff
- we replace by PHINodes, so the optimizer will probably makes a better
  job.

Patch by calixte!

Differential Revision: https://reviews.llvm.org/D51619

llvm-svn: 341977
2018-09-11 18:38:34 +00:00