Commit Graph

88370 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith b42fa2e5c6 BitcodeWriter: Reuse writeMetadataRecords, NFC
Change writeFunctionMetadata to call writeMetadataRecords.  For now
there's no functionality change, but makes it easy to serialize other
types of metadata in the function block in the future.

llvm-svn: 264557
2016-03-27 23:59:32 +00:00
Duncan P. N. Exon Smith cffd8cb9dc BitcodeWriter: Rename some functions for consistency, NFC
To match writeMetadataRecords, writeNamedMetadata and
writeMetadataStrings, change:

    WriteModuleMetadata        => writeModuleMetadata
    WriteFunctionLocalMetadata => writeFunctionMetadata
    Write##CLASS               => write##CLASS

The only major change is "FunctionLocal" => "Function".  The point is to
be less specific, in preparation for emitting normal metadata records
inside function metadata blocks (currently we only emit
`LocalAsMetadata` there).

llvm-svn: 264556
2016-03-27 23:56:04 +00:00
Duncan P. N. Exon Smith 80d153f6aa BitcodeWriter: Split out writeMetadataRecords, NFC
Besides being a nice cleanup, this is preparation for reusing the code
in function metadata blocks.

llvm-svn: 264555
2016-03-27 23:53:30 +00:00
Duncan P. N. Exon Smith 5465f0adc4 BitcodeWriter: Restructure WriteFunctionLocalMetadata, NFC
Use an early return to simplify logic.

llvm-svn: 264554
2016-03-27 23:38:36 +00:00
Duncan P. N. Exon Smith 2766e4d488 BitcodeWriter: Simplify tracking of function-local metadata, NFC
We don't really need a separate vector here; instead, point at a range
inside the main MDs array.  This matches how r264551 references the
ranges of strings and non-strings.

llvm-svn: 264552
2016-03-27 23:22:31 +00:00
Duncan P. N. Exon Smith 6565a0d4b2 Reapply ~"Bitcode: Collect all MDString records into a single blob"
Spiritually reapply commit r264409 (reverted in r264410), albeit with a
bit of a redesign.

Firstly, avoid splitting the big blob into multiple chunks of strings.

r264409 imposed an arbitrary limit to avoid a massive allocation on the
shared 'Record' SmallVector.  The bug with that commit only reproduced
when there were more than "chunk-size" strings.  A test for this would
have been useless long-term, since we're liable to adjust the chunk-size
in the future.

Thus, eliminate the motivation for chunk-ing by storing the string sizes
in the blob.  Here's the layout:

    vbr6: # of strings
    vbr6: offset-to-blob
    blob:
       [vbr6]: string lengths
       [char]: concatenated strings

Secondly, make the output of llvm-bcanalyzer readable.

I noticed when debugging r264409 that llvm-bcanalyzer was outputting a
massive blob all in one line.  Past a small number, the strings were
impossible to split in my head, and the lines were way too long.  This
version adds support in llvm-bcanalyzer for pretty-printing.

    <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 {
      'abc'
      'def'
      'ghi'
    }

From the original commit:

Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.

llvm-svn: 264551
2016-03-27 23:17:54 +00:00
Duncan P. N. Exon Smith 456c9968e5 Support: Implement StreamingMemoryObject::getPointer
The implementation is fairly obvious.  This is preparation for using
some blobs in bitcode.

For clarity (and perhaps future-proofing?), I moved the call to
JumpToBit in BitstreamCursor::readRecord ahead of calling
MemoryObject::getPointer, since JumpToBit can theoretically (a) read
bytes, which (b) invalidates the blob pointer.

This isn't strictly necessary the two memory objects we have:

  - The return of RawMemoryObject::getPointer is valid until the memory
    object is destroyed.

  - StreamingMemoryObject::getPointer is valid until the next chunk is
    read from the stream.  Since the JumpToBit call is only going ahead
    to a word boundary, we'll never load another chunk.

However, reordering makes it clear by inspection that the blob returned
by BitstreamCursor::readRecord will be valid.

I added some tests for StreamingMemoryObject::getPointer and
BitstreamCursor::readRecord.

llvm-svn: 264549
2016-03-27 23:00:59 +00:00
Duncan P. N. Exon Smith d3be62ddf2 Bitcode: Add SimpleBitstreamCursor::getPointerToByte, etc.
Add API to SimpleBitstreamCursor to allow users to translate between
byte addresses and pointers.

  - jumpToPointer: move the bit position to a particular pointer.
  - getPointerToByte: get the pointer for a particular byte.
  - getPointerToBit: get the pointer for the byte of the current bit.
  - getCurrentByteNo: convenience function for assertions and tests.

Mainly adds unit tests (getPointerToBit/Byte already has a use), but
also preparation for eventually using jumpToPointer.

llvm-svn: 264546
2016-03-27 22:45:25 +00:00
Duncan P. N. Exon Smith d766d136ce Bitcode: Split out SimpleBitstreamCursor
Split out SimpleBitstreamCursor from BitstreamCursor, which is a
lower-level cursor with no knowledge of bitcode blocks, abbreviations,
or records.  It just knows how to read bits and navigate the stream.

This is mainly organizational, to separate the API for manipulating raw
bits from that for bitcode concepts like Record and Block.

llvm-svn: 264545
2016-03-27 22:40:55 +00:00
Teresa Johnson d29478f70e [ThinLTO] Add optional import message and statistics
Summary:
Add a statistic to count the number of imported functions. Also, add a
new -print-imports option to emit a trace of imported functions, that
works even for an NDEBUG build.

Note that emitOptimizationRemark does not work for the above printing as
it expects a Function object and DebugLoc, neither of which we have
with summary-based importing.

This is part 2 of D18487, the first part was committed separately as
r264536.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18487

llvm-svn: 264537
2016-03-27 15:27:30 +00:00
Teresa Johnson 9aae395fa8 [ThinLTO] Don't try to import alias unless aliasee can be imported
With r264503, aliases are now being added to the GlobalsToImport set
even when their aliasees can't be imported due to their linkage type.
While the importing worked correctly (the aliases imported as
declarations) due to the logic in doImportAsDefinition, there is no
point to adding them to the GlobalsToImport set.

Additionally, with D18487 it was resulting in incorrectly printing a
message indicating that the alias was imported.

To avoid this, delay adding aliases to the GlobalsToImport set until
after the linkage type of the aliasee is checked.

This patch is part of D18487.

llvm-svn: 264536
2016-03-27 15:01:11 +00:00
Hal Finkel 0b37175ca6 [PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based loop legality
Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc
function calls (fmax[f], etc.) generally map to function calls after lowering.
For some vector types with QPX at least, however, we can legally lower these,
and we don't need to prohibit CTR-based loops on their account.

It turned out, however, that the logic that checked the opcodes associated with
intrinsics was broken (it would set the Opcode variable, but that variable was
later checked only if set for some otherwise-external function call.

This fixes the latter problem and adds the FMAX/MINNUM mappings.

llvm-svn: 264532
2016-03-27 05:40:56 +00:00
Michael Kruse ff379b69b2 [Verifier] Reject PHIs using defs from own block.
Reject the following IR as malformed (assuming that %entry, %next are
not in a loop):

    next:
      %y = phi i32 [ 0, %entry ]
      %x = phi i32 [ %y, %entry ]

Such PHI nodes came up in PR26718. While there was no consensus on
whether or not this is valid IR, most opinions on that bug and in a
discussion on the llvm-dev mailing list tended towards a
"strict interpretation" (term by Joseph Tremoulet) of PHI node uses.
Also, the language reference explicitly states that "the use of each
incoming value is deemed to occur on the edge from the corresponding
predecessor block to the current block" and
`DominatorTree::dominates(Instruction*, Use&)` uses this definition as
well.

For the code mentioned in PR15384, clang does not compile to such PHIs
(anymore?). The test case still hangs when replacing `%tmp6` with `%tmp`
in revisions before r176366 (where PR15384 has been fixed). The
occurrence of %tmp6 therefore was probably unintentional. Its value is
not used except in other PHIs.

Reviewers: majnemer, reames, JosephTremoulet, bkramer, grosser, jdoerfert, kparzysz, sanjoy

Differential Revision: http://reviews.llvm.org/D18443

llvm-svn: 264528
2016-03-26 23:32:57 +00:00
Sanjay Patel 796db35f62 [SimplifyCFG] propagate branch metadata when creating select (PR26636)
llvm-svn: 264527
2016-03-26 23:30:50 +00:00
Simon Pilgrim dcdf85033c [X86][AVX] Enabled SMUL_LOHI/UMUL_LOHI v8i32 vectors on AVX1 targets
Correct splitting of v8i32 vectors into v4i32 vectors to prevent scalarization

llvm-svn: 264517
2016-03-26 18:32:13 +00:00
JF Bastien a874d1a40d Revert "NFC: static_assert instead of comment"
This reverts commit fa36fcff16c7d4f78204d6296bf96c3558a4a672.

Causes the following Windows failure:

  C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\lib\CodeGen\MachineInstr.cpp(762):
  error C2338: must be trivially copyable to memmove

llvm-svn: 264516
2016-03-26 18:20:02 +00:00
JF Bastien d4ff3360ae NFC: static_assert instead of comment
Summary: isPodLike is as close as we have for is_trivially_copyable.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18483

llvm-svn: 264515
2016-03-26 18:14:27 +00:00
Simon Pilgrim e4dbeb40c6 [X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets
Correct splitting of v16i16 vectors into v8i16 vectors to prevent scalarization

Differential Revision: http://reviews.llvm.org/D18307

llvm-svn: 264512
2016-03-26 15:44:55 +00:00
Simon Pilgrim 3eef33a806 [X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors
Currently this is to mainly to prevent scalarization of integer division by constants.

Differential Revision: http://reviews.llvm.org/D18307

llvm-svn: 264511
2016-03-26 15:27:20 +00:00
Simon Pilgrim 7379a70677 [X86][AVX512BW] AVX512BW can sign-extend v32i8 to v32i16 for simpler v32i8 multiplies.
Only pre-AVX512BW targets need to split v32i8 vectors.

llvm-svn: 264509
2016-03-26 09:44:27 +00:00
David Majnemer b549ab02b4 [PowerPC] Disable the CTR optimization in the presence of {min,max}num
The minnum and maxnum intrinsics get lowered to libcalls which
invalidates the CTR optimization.

This fixes PR27083.

llvm-svn: 264508
2016-03-26 09:42:31 +00:00
Simon Pilgrim ff7b7141cd [X86][SSE] Don't duplicate Lower256IntArith functionality in LowerMul. NFC.
LowerMul v32i8 on AVX2 needs to split the 256-bit sources to allow sign-extension back to v16i16 to occur. Since this is basically the same as Lower256IntArith we simplify by using that here instead.

llvm-svn: 264506
2016-03-26 09:29:04 +00:00
Junmo Park a26e93bcec Minor code cleanup. NFC.
llvm-svn: 264505
2016-03-26 06:04:55 +00:00
Chuang-Yu Cheng 065969ec8e [Power9] Implement new altivec instructions: permute, count zero, extend sign, negate, parity, shift/rotate, mul10
This change implements the following vector operations:
- vclzlsbb vctzlsbb vctzb vctzd vctzh vctzw
- vextsb2w vextsh2w vextsb2d vextsh2d vextsw2d
- vnegd vnegw
- vprtybd vprtybq vprtybw
- vbpermd vpermr
- vrlwnm vrlwmi vrldnm vrldmi vslv vsrv
- vmul10cuq vmul10uq vmul10ecuq vmul10euq

28 instructions

Thanks Nemanja, Kit for invaluable hints and discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

Phabricator: http://reviews.llvm.org/D15887
llvm-svn: 264504
2016-03-26 05:46:11 +00:00
Mehdi Amini 01e321306b ThinLTO: use the callgraph from the combined index to drive the FunctionImporter
Summary:
Now that the summary contains the full reference/call graph, we can
replace the existing function importer that loads and inspect the IR
to iteratively walk the call graph by a traversal based purely on the
summary information. Decouple the actual importing decision from any
IR manipulation.

Reviewers: tejohnson

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18343

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264503
2016-03-26 05:40:34 +00:00
Mehdi Amini 385cf28829 Rename ModuleSummaryIndex::modPathStringEntries() into modulePaths()
It now return the map instead of an iterator.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264489
2016-03-26 03:35:38 +00:00
Lang Hames d1af8fce0f [Support] Switch to RAII helper for error-as-out-parameter idiom.
As discussed on the llvm-commits thread for r264467.

llvm-svn: 264479
2016-03-25 23:54:32 +00:00
Sunil Srivastava 34fce9377e Improve the reliability of file renaming in Windows by having the compiler retry
the rename operation on 3 error conditions of ReplaceFileW() that it was 
previously bailing out on.

Patch by Douglas Yung!

Differential Revision: http://reviews.llvm.org/D17903

llvm-svn: 264477
2016-03-25 23:41:28 +00:00
Lang Hames ff044b1f69 [Object] Make createMachOObjectFile return Expected<...> rather than
ErrorOr<...>.

llvm-svn: 264473
2016-03-25 23:11:52 +00:00
Philip Reames b5681138e4 Allow value forwarding past release fences in GVN
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.  

We choose to be much more conservative about stores.  In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceeding (previously fenced) store.  Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time.

The LangRef indicates only atomic loads and stores are effected by fences. This patch chooses to be far more conservative then that. 

This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now.

Differential Revision: http://reviews.llvm.org/D11436

llvm-svn: 264472
2016-03-25 22:40:35 +00:00
Lang Hames 8262764869 [Object] Make MachOObjectFile's constructor private, provide a static create
method instead.

This is not quite a named constructor: Construction may fail, and
MachOObjectFiles are usually passed by unique_ptr anyway, so create
returns an Expected<std::unique_ptr<MachOObjectFile>>.

llvm-svn: 264469
2016-03-25 21:59:14 +00:00
David Majnemer 020e890a19 [X86] Emit a proper ADJCALLSTACKDOWN in EmitLoweredTLSAddr
We forgot to add the second machine operand to our ADJCALLSTACKDOWN,
resulting in crashes in PEI.

This fixes PR27071.

llvm-svn: 264465
2016-03-25 21:49:11 +00:00
Jun Bum Lim 36c53fe147 [MachineCopyPropagation] Expose more dead copies across instructions with regmasks
When encountering instructions with regmasks, instead of cleaning up all the
elements in MaybeDeadCopies map, remove only the instructions erased. By keeping
more instruction in MaybeDeadCopies, this change will expose more dead copies
across instructions with regmasks.

llvm-svn: 264462
2016-03-25 21:15:35 +00:00
Nirav Dave fa250cad37 Prevent construction of cycle in DAG store merge
When merging stores in DAGCombiner, add check to ensure that no
dependenices exist that would cause the construction of a cycle in our
DAG.  This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.

This test is only done in cases where Antialias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.

Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight

Subscribers: llvm-commits, tberghammer, danalbert, srhines

Differential Revision: http://reviews.llvm.org/D18336

llvm-svn: 264461
2016-03-25 21:06:30 +00:00
Kostya Serebryany f3ab6d9e10 [libFuzzer] use fflush after every Printf
llvm-svn: 264459
2016-03-25 20:31:26 +00:00
Richard Smith 3a61a4d104 Remove useless and unused CrashRecoveryContext::getBacktrace(). This function always returned an empty string.
llvm-svn: 264458
2016-03-25 20:30:10 +00:00
Sanjoy Das d4c783335b [RS4GC] Lower calls to @llvm.experimental.deoptimize
This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize``
to gc.statepoints wrapping ``__llvm_deoptimize``, and changes
``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize``
as a non GC leaf function.

I've had to hard code the ``"__llvm_deoptimize"`` name in
RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only
during codegen.  This isn't without precedent in the codebase, so I'm
not overtly concerned.

llvm-svn: 264456
2016-03-25 20:12:13 +00:00
Justin Bogner ec0e7d2582 CodeGen: Don't iterate over operands after we've erased an MI
This fixes a use-after-free introduced 3 years ago, in r182872 ;)

The code more or less worked because the memory that CopyMI was
pointing to happened to still be valid, but lots of tests would crash
if you ran under ASAN with the recycling allocator changes from
llvm.org/PR26808

llvm-svn: 264455
2016-03-25 20:03:28 +00:00
Saleem Abdulrasool 750a90df6a ARM: maintain BB ordering when expanding WIN__DBZCHK
It is possible to have a fallthrough MBB prior to MBB placement.  The original
addition of the BB would result in reordering the BB as not preceding the
successor.  Because of the fallthrough nature of the BB, we could end up
executing incorrect code or even a constant pool island!  Insert the spliced BB
into the same location to avoid that.

Thanks to Tim Northover for invaluable hints and Fiora for the discussion on
what may have been occurring!

llvm-svn: 264454
2016-03-25 19:48:06 +00:00
Teresa Johnson aae2610042 [ThinLTO] Rename edges() to calls() for clarity (NFC)
Helps distinguish from refs() which iterates over non-call references.

llvm-svn: 264445
2016-03-25 18:59:13 +00:00
Justin Bogner ec5ea36891 CodeGen: Fix a use-after-free in TII
Found by ASAN with the recycling allocator changes from PR26808.

llvm-svn: 264443
2016-03-25 18:38:48 +00:00
Justin Bogner f2a0d349a6 AMDGPU: Fix a use-after free and a missing break
We're erasing MI here, but then immediately using it again inside the
`if`. This moves the erase after we're done using it.

Doing that reveals a second problem though - this case is missing a
break, so we fall through to the default and dereference MI again.
This is obviously a bug, though I don't know how to write a test that
triggers it - all we do in the error case is print some extra debug
output.

Both of these issue crash on lots of tests under ASAN with the
recycling allocator changes from PR26808 applied.

llvm-svn: 264442
2016-03-25 18:33:16 +00:00
Hans Wennborg 5f916d3df4 [X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsize
64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes,
respectively, whereas and/or with 8-bit immediate is only three bytes.

Since these instructions imply an additional memory read (which the CPU could
elide, but we don't think it does), restrict these patterns to minsize functions.

Differential Revision: http://reviews.llvm.org/D18374

llvm-svn: 264440
2016-03-25 18:11:31 +00:00
Reid Kleckner f6f04f8fc8 Consider regmasks when computing register-based DBG_VALUE live ranges
Now register parameters that aren't saved to the stack or CSRs are
considered dead after the first call. Previously the debugger would show
whatever was in the register.

Fixes PR26589

Reviewers: aprantl

Differential Revision: http://reviews.llvm.org/D17211

llvm-svn: 264429
2016-03-25 17:54:46 +00:00
Lang Hames 9e964f3728 [Object] Start threading Error through MachOObjectFile construction.
llvm-svn: 264425
2016-03-25 17:25:34 +00:00
Jonas Paulsson 5dd1e56de5 [SystemZ] Remove isBranch and isTerminator flags on BRCT and BRCTG.
The BranchUnaryRI instruction class already sets these flags.

Reviewed by Ulrich Weigand.

llvm-svn: 264411
2016-03-25 15:42:30 +00:00
Duncan P. N. Exon Smith fc8110041f Revert "Bitcode: Collect all MDString records into a single blob"
This reverts commit r264409 since it failed to bootstrap:
http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/8302/

llvm-svn: 264410
2016-03-25 15:22:27 +00:00
Duncan P. N. Exon Smith fdbf0a5af8 Bitcode: Collect all MDString records into a single blob
Optimize output of MDStrings in bitcode.  This emits them in big blocks
(currently 1024) in a pair of records:
  - BULK_STRING_SIZES: the sizes of the strings in the block, and
  - BULK_STRING_DATA: a single blob, which is the concatenation of all
    the strings.

Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.

I needed to add support for blobs to streaming input to get the test
suite passing.
  - StreamingMemoryObject::getPointer reads ahead and returns the
    address of the blob.
  - To avoid a possible reallocation of StreamingMemoryObject::Bytes,
    BitstreamCursor::readRecord needs to move the call to JumpToEnd
    forward so that getPointer is the last bitstream operation.

llvm-svn: 264409
2016-03-25 14:40:18 +00:00
Chad Rosier 59bcbba6b4 [AArch64] Fix typo. NFC.
llvm-svn: 264408
2016-03-25 14:37:43 +00:00
David L Kreitzer 8d441eb936 Enable non-power-of-2 #pragma unroll counts.
Patch by Evgeny Stupachenko.

Differential Revision: http://reviews.llvm.org/D18202

llvm-svn: 264407
2016-03-25 14:24:52 +00:00