Commit Graph

26313 Commits

Author SHA1 Message Date
Oliver Stannard 7e7d983a87 Refactor backend diagnostics for unsupported features
Re-commit of r258951 after fixing layering violation.

The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

llvm-svn: 259498
2016-02-02 13:52:43 +00:00
Chandler Carruth a4499e9f73 [LCG] Build an edge abstraction for the LazyCallGraph and use it to
differentiate between indirect references to functions an direct calls.

This doesn't do a whole lot yet other than change the print out produced
by the analysis, but it lays the groundwork for a very major change I'm
working on next: teaching the call graph to actually be a call graph,
modeling *both* the indirect reference graph and the call graph
simultaneously. More details on that in the next patch though.

The rest of this is essentially a bunch of over-engineering that won't
be interesting until the next patch. But this also isolates essentially
all of the churn necessary to introduce the edge abstraction from the
very important behavior change necessary in order to separately model
the two graphs. So it should make review of the subsequent patch a bit
easier at the cost of making this patch seem poorly motivated. ;]

Differential Revision: http://reviews.llvm.org/D16038

llvm-svn: 259463
2016-02-02 03:57:13 +00:00
Matthias Braun 3f88eabe93 SmallSet/SmallPtrSet: Refuse huge Small numbers
These sets do linear searching in small mode; It is not a good idea to
use huge numbers as the small value here, save people from themselves by
adding a static_assert.

Differential Revision: http://reviews.llvm.org/D16706

llvm-svn: 259419
2016-02-01 22:05:16 +00:00
Sanjoy Das 401e631c4b [SCEV] Rename isKnownPredicateWithRanges; NFC
Make it obvious that it uses constant ranges, and use `Via` instead of
`With`, like other similar functions in SCEV.

llvm-svn: 259400
2016-02-01 20:48:10 +00:00
Asaf Badouh 5a3a0231f4 [X86][AVX512VBMI] add encoding and intrinsics for Multishift
Differential Revision: http://reviews.llvm.org/D16399

llvm-svn: 259363
2016-02-01 15:48:21 +00:00
Amjad Aboud 8bbce8ad8e Improved macro emission in dwarf.
Changed emitting offset of macinfo entry into compiler unit DIE to use "addSectionLabel" method rather than explicitly calculating size/offset of macro entry.

Differential Revision: http://reviews.llvm.org/D16292

llvm-svn: 259358
2016-02-01 14:09:41 +00:00
Ewan Crawford 588980d69e DWARF RenderScript vendor extension
Patch adds a DWARF language vendor extension for RenderScript.
We are already using this identifier in LLDB with a hard coded value, so it's preferable to use a LLVM generated enum instead.
The language is intended to be added to the next version of the standard.
See http://www.dwarfstd.org/ShowIssue.php?issue=150331.1

Reviewers:  dexonsmith, echristo
Subscribers: probinson domipheus, srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D16409

llvm-svn: 259348
2016-02-01 10:39:24 +00:00
Craig Topper b9f42468ad Remove utostr_32 as it has no uses anymore.
llvm-svn: 259331
2016-01-31 20:00:26 +00:00
Craig Topper fa8b2317a6 Merge utohex_buffer into utohexstr, it's only caller. Also change utohexstr to use the std::string constructor that takes a start and end pointer. This saves a call to strlen. NFC
llvm-svn: 259329
2016-01-31 20:00:22 +00:00
Craig Topper ab3d2ace49 Use std::end instead of repeating buffer sizes.
llvm-svn: 259312
2016-01-31 01:12:35 +00:00
Matt Arsenault 43976df0da AMDGPU: Add new amdgcn workitem intrinsics
These use the correct prefix and follow the HSA naming convention
rather than the config register option names.

llvm-svn: 259293
2016-01-30 04:25:19 +00:00
Justin Bogner 4bc4b5f4b8 Remove references to *.h.in files and some autoconf hackery
Missed this stuff in r259291.

llvm-svn: 259292
2016-01-30 04:15:33 +00:00
Justin Bogner 0138037203 Remove *.h.in - these were only used by the autoconf build system
llvm-svn: 259291
2016-01-30 04:05:45 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
David Majnemer 8b68a6cabd [CodeView] Properly handle empty line tables
Don't crash when there are no appropriate line table entries for a given
function.

llvm-svn: 259277
2016-01-30 00:36:09 +00:00
Vedant Kumar 00dab22853 [Profiling] Add a -sparse mode to llvm-profdata merge
Add an option to llvm-profdata merge for writing out sparse indexed
profiles. These profiles omit InstrProfRecords for functions which are
never executed.

Differential Revision: http://reviews.llvm.org/D16727

llvm-svn: 259258
2016-01-29 22:54:45 +00:00
Fiona Glaser b417d464e6 Add LoopSimplifyCFG pass
Loop transformations can sometimes fail because the loop, while in
valid rotated LCSSA form, is not in a canonical CFG form. This is
an extremely simple pass that just merges obviously redundant
blocks, which can be used to fix some known failure cases. In the
future, it may be enhanced with more cases (and have code shared with
SimplifyCFG).

This allows us to run LoopSimplifyCFG -> LoopRotate -> LoopUnroll,
so that SimplifyCFG cleans up the loop before Rotate tries to run.

Not currently used in the pass manager, since this pass doesn't do
anything unless you can hook it up in an LPM with other loop passes.
It'll be added once Chandler cleans up things to allow this.

Tested in a custom pipeline out of tree to confirm it works in
practice (in addition to the included trivial test).

llvm-svn: 259256
2016-01-29 22:35:36 +00:00
Matthias Braun 3328281538 AttributeSetImpl: Summarize existing function attributes in a bitset.
The majority of attribute queries checks for the existence of an enum
attribute in the FunctionIndex slot. We only have 48 of those and can
therefore summarize them in an uint64_t bitset which measurably improves
compile time.

Differential Revision: http://reviews.llvm.org/D16618

llvm-svn: 259252
2016-01-29 22:25:19 +00:00
David Majnemer 4553723825 Unbreak windows buildbots
llvm-svn: 259231
2016-01-29 19:38:03 +00:00
David Majnemer 6fcbd7e909 [CodeView] Implement .cv_inline_linetable
This support is _very_ rudimentary, just enough to get some basic data
into the CodeView debug section.

Left to do is:
- Use the combined opcodes to save space.
- Do something about code offsets.

llvm-svn: 259230
2016-01-29 19:24:12 +00:00
Reid Kleckner f3b9ba4941 [codeview] Begin to add support for inlined call sites
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both

This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16333

llvm-svn: 259217
2016-01-29 18:16:43 +00:00
Jonas Paulsson 8c738635b1 Temporarily revert "[ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten."
Some buildbot failures needs to be debugged.

llvm-svn: 259213
2016-01-29 17:22:43 +00:00
Jonas Paulsson 23f12e5c02 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).

AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.

An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.

RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).

Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.

For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).

There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259201
2016-01-29 16:11:18 +00:00
Benjamin Kramer 34ac580164 [IR] Move definitions of users of Use::set to Value.h
Still ugly, but at least Use.h is self-contained again.

llvm-svn: 259191
2016-01-29 12:47:05 +00:00
Benjamin Kramer ccb7a86c59 [IR] Shuffle the code for getSequentialElementType to type.h to avoid circular header dependencies.
llvm-svn: 259190
2016-01-29 12:47:01 +00:00
George Burgess IV 5d095c91ee Minor bugfix in AAResults::getModRefInfo.
Also removed a few redundant `else`s.

Bug was found by a test I wrote for MemorySSA (in review at
http://reviews.llvm.org/D7864; shiny update coming soon). So, assuming
that lands at some point, this should be covered by that. If anyone
feels this deserves its own explicit test case, please let me know.
I'll write one.

llvm-svn: 259179
2016-01-29 07:51:15 +00:00
Akira Hatanaka 4f472a8867 [llvm-bcanalyzer] Dump bitcode wrapper header
This patch enables llvm-bcanalyzer to print the bitcode wrapper header
if the file has one, which is needed to test the changes made in
r258627 (bitcode-wrapper-header-armv7m.ll is the test case for r258627).

Differential Revision: http://reviews.llvm.org/D16642

llvm-svn: 259162
2016-01-29 05:55:09 +00:00
Reid Kleckner 2214ed8937 Reland "[CodeView] Use assembler directives for line tables"
This reverts commit r259126 and relands r259117.

This time with updated library dependencies.

llvm-svn: 259130
2016-01-29 00:49:42 +00:00
Reid Kleckner 00d9639c24 Revert "[CodeView] Use assembler directives for line tables"
This reverts commit r259117.

The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.

llvm-svn: 259126
2016-01-29 00:13:28 +00:00
Reid Kleckner 22d993877b Silence gcc warning about ternary and enumerations
llvm-svn: 259123
2016-01-28 23:59:35 +00:00
Reid Kleckner c62e379d22 [CodeView] Use assembler directives for line tables
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:

- .cv_file: Similar to DWARF .file directives

- .cv_loc: Similar to the DWARF .loc directive, but starts with a
  function id. CodeView line tables are emitted by function instead of
  by compilation unit, so we needed an extra field to communicate this.
  Rather than overloading the .loc direction further, we decided it was
  better to have our own directive.

- .cv_stringtable: Emits the codeview string table at the current
  position. Currently this just contains the filenames as
  null-terminated strings.

- .cv_filechecksums: Emits the file checksum table for all files used
  with .cv_file so far. There is currently no support for emitting
  actual checksums, just filenames.

This moves the line table emission code down into the assembler.  This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.

David Majnemer collaborated on this patch.

llvm-svn: 259117
2016-01-28 23:31:52 +00:00
Reid Kleckner 243d024a4b Remove unused MC includes from LTOModule.h
llvm-svn: 259115
2016-01-28 23:21:12 +00:00
Benjamin Kramer 77b3eb3c01 Make header self-contained.
llvm-svn: 259060
2016-01-28 17:48:29 +00:00
Oliver Stannard 02fa1c80c4 Revert r259035, it introduces a cyclic library dependency
llvm-svn: 259045
2016-01-28 13:19:47 +00:00
Oliver Stannard b4b092ea1b Add backend dignostic printer for unsupported features
Re-commit of r258951 after fixing layering violation.

The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.

In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 259035
2016-01-28 10:07:27 +00:00
Asaf Badouh 42852d99e7 [X86][AVX512] small fix in ptestm intrinsics
move ptestm{q|d} intrinsics from patterns form (in td file) to the intrinsics table

Differential Revision: http://reviews.llvm.org/D16633

llvm-svn: 259029
2016-01-28 08:33:22 +00:00
Matthias Braun cac5589ba9 SmallPtrSet: Add missing include
llvm-svn: 259021
2016-01-28 05:09:01 +00:00
Matthias Braun 569f207018 SmallPtrSet: Make destructor available for inlining
llvm-svn: 259019
2016-01-28 04:49:14 +00:00
Matthias Braun 924f080529 SmallPtrSet: Share some code between copy/move constructor/assignment operator
llvm-svn: 259018
2016-01-28 04:49:11 +00:00
Matthias Braun f67e5c79d7 SmallPtrSet: Remove trailing whitespace, fix indentation
llvm-svn: 259017
2016-01-28 04:49:07 +00:00
NAKAMURA Takumi 628a7a0aef Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"
It broke layering violation in LLVMIR.

clang r258950 "Add backend dignostic printer for unsupported features"
llvm  r258951 "Refactor backend diagnostics for unsupported features"

llvm-svn: 259016
2016-01-28 04:41:32 +00:00
Dan Gohman adf28177eb [WebAssembly] Enhanced register stackification
This patch revamps the RegStackifier pass with a new tree traversal mechanism,
enabling three major new features:

 - Stackification of values with multiple uses, using the result value of set_local
 - More aggressive stackification of instructions with side effects
 - Reordering operands in commutative instructions to enable more stackification.

llvm-svn: 259009
2016-01-28 01:22:44 +00:00
George Burgess IV 77351ba3ae Minor style cleanup of CFLAA. NFC.
llvm-svn: 259008
2016-01-28 00:54:01 +00:00
Adam Nemet dadfbb52f7 [TTI] Add getPrefetchDistance from PPCLoopDataPrefetch, NFC
This patch is part of the work to make PPCLoopDataPrefetch
target-independent
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758).

As it was discussed in the above thread, getPrefetchDistance is
currently using instruction count which may change in the future.

llvm-svn: 258995
2016-01-27 22:21:25 +00:00
Tim Northover 042a6c1fe1 ARMv7k: base ABI decision on v7k Arch rather than watchos OS.
Various bits we want to use the new ABI actually compile with "-arch armv7k
-miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how
slices work.

llvm-svn: 258975
2016-01-27 19:32:29 +00:00
Benjamin Kramer 391be792f2 One more batch of self-containing headers.
llvm-svn: 258974
2016-01-27 19:29:56 +00:00
John McCall 3fe604f89f Add support for objc_unsafeClaimAutoreleasedReturnValue to the
ObjC ARC Optimizer.

The main implication of this is:

1. Ensuring that we treat it conservatively in terms of optimization.
2. We put the ASM marker on it so that the runtime can recognize
objc_unsafeClaimAutoreleasedReturnValue from releaseRV.

<rdar://problem/21567064>

Patch by Michael Gottesman!

llvm-svn: 258970
2016-01-27 19:05:08 +00:00
Benjamin Kramer 45275a4d3c Make more headers self-contained.
A lot of this comes from the new complete type requirement of DenseMap.

llvm-svn: 258956
2016-01-27 18:03:37 +00:00
Oliver Stannard 1e67a9f196 Refactor backend diagnostics for unsupported features
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 258951
2016-01-27 17:30:33 +00:00
Benjamin Kramer 390c33cd18 Move SafeStack to CodeGen.
It depends on the target machinery, that's not available for
instrumentation passes.

llvm-svn: 258942
2016-01-27 16:53:42 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Benjamin Kramer c67cf31f5c Move passes that live in lib/CodeGen out of Scalar.h
llvm-svn: 258938
2016-01-27 16:05:42 +00:00
Benjamin Kramer 820f7548a1 Make some headers self-contained, remove unused includes that violate layering.
llvm-svn: 258937
2016-01-27 16:05:37 +00:00
Benjamin Kramer b3e8a6d2b8 Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs.
llvm-svn: 258917
2016-01-27 10:01:28 +00:00
Matthias Braun 4962bd737f SmallPtrSet: Inline the part of insert_imp in the small case
Most of the time we only hit the small case, so it is beneficial to pull
it out of the insert_imp() implementation. This improves compile time
at least for non-LTO builds.

Differential Revision: http://reviews.llvm.org/D16619

llvm-svn: 258908
2016-01-27 04:20:24 +00:00
Matthias Braun 2a60c4424c Function: Slightly simplify code by using existing hasFnAttribute() convenience function
llvm-svn: 258907
2016-01-27 03:45:25 +00:00
Reid Kleckner 5b4637141e [llvm-tblgen] Avoid StringMatcher for GCC and MS builtin names
This brings the compile time of Function.cpp from ~40s down to ~4s for
me locally. It also shaves off about 400KB of object file size in a
release+asserts build.

I also realized that the AMDGPU backend does not have any GCC builtin
names to match, so the extra lookup was a no-op. I removed it to silence
a zero-length string table array warning. There should be no functional
change here.

This change really ends the story of PR11951.

llvm-svn: 258897
2016-01-27 01:43:12 +00:00
Xinliang David Li ae89869747 [PGO] Make header portable for C /NFC
llvm-svn: 258889
2016-01-27 00:13:39 +00:00
Xinliang David Li 731637567e [PGO] allow pgo name collector to disable compression (for testing)/NFC
llvm-svn: 258876
2016-01-26 23:13:00 +00:00
Reid Kleckner c2752daa6b Handle more edge cases in intrinsic name binary search
I tried to make the AMDGPU intrinsic info table use this instead of
another StringMatcher, and some issues arose.

llvm-svn: 258871
2016-01-26 22:33:19 +00:00
Eugene Zelenko 6ac3f739ca Fix Clang-tidy modernize-use-nullptr and modernize-use-override warnings; other minor fixes.
Differential revision: reviews.llvm.org/D16568

llvm-svn: 258831
2016-01-26 18:48:36 +00:00
Sanjay Patel f1ac8ba45b don't repeat names in documentation comments; NFC
llvm-svn: 258820
2016-01-26 17:06:13 +00:00
Benjamin Kramer f57c1977c1 Reflect the MC/MCDisassembler split on the include/ level.
No functional change, just moving code around.

llvm-svn: 258818
2016-01-26 16:44:37 +00:00
Igor Laevsky 03a670c0ec Re-submit r256008 "Improve DWARFDebugFrame::parse to also handle __eh_frame."
Originally this change was causing failures on windows buildbots.
But those problems were fixed in r258806.

llvm-svn: 258811
2016-01-26 15:09:42 +00:00
Matt Arsenault 051d6f9fde AMDGPU: Add new amdgcn intrinsics for cube instructions
More cleanup to try to get all intrinsics using the correct
amdgcn prefix that are as close to the instruction as possible.

llvm-svn: 258786
2016-01-26 04:29:56 +00:00
Matt Arsenault 0c3e2338fe AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now
Also move into backend intrinsics to discourage use of the old name.

llvm-svn: 258783
2016-01-26 04:14:16 +00:00
Haicheng Wu f1c00a22be [LIR] Add support for structs and hand unrolled loops
This is a recommit of r258620 which causes PR26293.

The original message:

Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258777
2016-01-26 02:27:47 +00:00
Matthias Braun db320773e1 LiveIntervalAnalysis: Improve some comments
As recommended by Justin.

llvm-svn: 258771
2016-01-26 01:40:48 +00:00
Matthias Braun 242b8bb58d LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC
These two functions are hard to reason about. This commit makes the code
more comprehensible:

- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
  with a fixed value instead of a changing iterator I that points to
  different things during the function.
- Remove the early explanation before the function in favor of more
  detailed comments inside the function. Should have more/clearer comments now
  stating which conditions are tested and which invariants hold at
  different points in the functions.

The behaviour of the code was not changed.

I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.

Differential Revision: http://reviews.llvm.org/D16379

llvm-svn: 258756
2016-01-26 00:43:50 +00:00
Xinliang David Li 0c57c9acb7 Fix a typo
llvm-svn: 258716
2016-01-25 20:38:13 +00:00
Quentin Colombet a392810bea Speculatively revert r258620 as it is the likely culprid of PR26293.
llvm-svn: 258703
2016-01-25 19:12:49 +00:00
Sanjay Patel 22da9fafb7 don't repeat function names in documentation comments; NFC
llvm-svn: 258699
2016-01-25 18:38:38 +00:00
Michael Zuckerman 1bd7f993fc [AVX512] Adding PTESTNMB/D/W/Q instruction
Differential Revision: http://reviews.llvm.org/D16520

llvm-svn: 258688
2016-01-25 14:43:23 +00:00
Michael Zuckerman 19670d479a [AVX512] Adding PTESTMB/W/D/Q instruction
Differential Revision: http://reviews.llvm.org/D16519

llvm-svn: 258686
2016-01-25 13:27:32 +00:00
Bradley Smith d27a6a7072 [ARM] Add DSP build attribute and extension targeting
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258683
2016-01-25 11:26:11 +00:00
Asaf Badouh 655822ab7e [X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned 52bit integer
VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators
 VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators

Differential Revision: http://reviews.llvm.org/D16407

llvm-svn: 258680
2016-01-25 11:14:24 +00:00
Igor Breger 1e5bafbc82 AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16137

llvm-svn: 258657
2016-01-24 08:04:33 +00:00
David Majnemer 88542a0a69 [SCCP] Remove duplicate code
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.

No functionality change intended.

llvm-svn: 258654
2016-01-24 06:26:47 +00:00
David Majnemer 35c46d3e0b [InstCombine, SCCP] Consolidate code used to remove instructions
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.

No functionality change is intended.

llvm-svn: 258653
2016-01-24 05:26:18 +00:00
Manuel Jacob 25510fcf5c Update outdated method documention in Attributes.h. NFC.
Nowadays the alignment attribute is not the only integer attribute.

llvm-svn: 258647
2016-01-23 22:38:39 +00:00
Justin Lebar 561d5a1758 [CUDA] Add Target::isNVPTX().
Summary: Helper so we don't have to enumerate nvptx && nvptx64 everywhere.

Reviewers: echristo

Subscribers: llvm-commits, jhen, tra

Differential Revision: http://reviews.llvm.org/D16494

llvm-svn: 258639
2016-01-23 21:12:22 +00:00
Joseph Tremoulet 23d02f6149 [ORC] Update ObjectTransformLayer signature
Summary:
Update ObjectTransformLayer::addObjectSet to take the object set by
value rather than reference and pass it to the base layer with move
semantics rather than copy, to match r258185's changes to
ObjectLinkingLayer.

Update the unit test to verify that ObjectTransformLayer's signature stays
in sync with ObjectLinkingLayer's.


Reviewers: lhames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16414

llvm-svn: 258630
2016-01-23 18:36:01 +00:00
NAKAMURA Takumi 0933b4280a AlignOf.h: Satisfy both g++-4.7 and msc18.
llvm-svn: 258623
2016-01-23 13:52:09 +00:00
Haicheng Wu dd5e9d2159 [LIR] Add support for structs and hand unrolled loops
Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258620
2016-01-23 06:52:41 +00:00
NAKAMURA Takumi 7cfaab2bcc AlignOf.h: Appease g++-4.7 for now. Will fix later.
llvm-svn: 258600
2016-01-23 02:22:36 +00:00
Matt Arsenault 7766b951d6 AMDGPU: Remove GCCBuiltin from intrinsics that need mangling
If the intrinsic is overloaded and works on multiple types,
it cannot resolve to a single corresponding builtin and requires
handling in clang. This just causes crashes now.

llvm-svn: 258559
2016-01-22 21:30:46 +00:00
Matt Arsenault 10ca39ca8b AMDGPU: Add new name for barrier intrinsic
llvm-svn: 258558
2016-01-22 21:30:43 +00:00
Matt Arsenault bef34e21c7 AMDGPU: Rename intrinsics to use amdgcn prefix
The intrinsic target prefix should match the target name
as it appears in the triple.

This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatability for now.

llvm-svn: 258557
2016-01-22 21:30:34 +00:00
Nico Weber 7849ad0f72 Make InstProfWriter compile again after 258544 with MSVC.
\src\llvm-rw\include\llvm/Support/AlignOf.h(254) :
    error C2872: 'detail' : ambiguous symbol
        could be 'llvm::detail'
        or       'llvm::support::detail'

llvm-svn: 258553
2016-01-22 21:13:04 +00:00
Xinliang David Li e1574f0cdf [PGO] Remove use of static variable. /NFC
Make the variable a member of  the writer trait object owned
now by the writer. Also use a different generator interface
to pass the infoObject from the writer. 

llvm-svn: 258544
2016-01-22 20:25:56 +00:00
Rafael Espindola c0b103b133 Typo fix and simplification.
Thanks to Justin Bogner for the suggestion.

llvm-svn: 258540
2016-01-22 19:58:18 +00:00
Xinliang David Li 46ad363ba4 Revert 258486 -- for a better fix coming soon
llvm-svn: 258538
2016-01-22 19:53:31 +00:00
Rafael Espindola d5607d35d4 Add ArrayRef support to EndianStream.
Using an array instead of ArrayRef would allow type inference, but
(short of using C99) one would still need to write

    typedef uint16_t VT[];
    LE.write(VT{0x1234, 0x5678});

llvm-svn: 258535
2016-01-22 19:44:46 +00:00
Xinliang David Li 3865fdc4cc [PGO] add an interface needed by icall promotion
llvm-svn: 258509
2016-01-22 18:13:34 +00:00
Xinliang David Li 876c2024e2 [PGO] eliminate use of static variable
llvm-svn: 258486
2016-01-22 05:48:40 +00:00
Dan Gohman 0bf3ae84ca [SelectionDAG] Fold more offsets into GlobalAddresses
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.

llvm-svn: 258482
2016-01-22 03:57:34 +00:00
Eduard Burtescu 68e7f49f8e [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu e2a6917849 [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16422

llvm-svn: 258477
2016-01-22 01:51:51 +00:00
Eduard Burtescu 093ae49077 [opaque pointer types] [NFC] gep_type_{begin,end} now take source element type and address space.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16436

llvm-svn: 258474
2016-01-22 01:33:43 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Reid Kleckner b7ecfa5b09 Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
  #include <string>
  #include <iostream>
  static const char kData[] = "asdf jkl;";
  int main() {
    std::string s(kData + 3, sizeof(kData) - 3);
    std::cout << s << '\n';
  }

llvm-svn: 258465
2016-01-22 01:09:29 +00:00
Reid Kleckner 18ec96f0fc Avoid unnecessary stack realignment in musttail thunks with SSE2 enabled
The X86 musttail implementation finds register parameters to forward by
running the calling convention algorithm until a non-register location
is returned. However, assigning a vector memory location has the side
effect of increasing the function's stack alignment. We shouldn't
increase the stack alignment when we are only looking for register
parameters, so this change conditionalizes it.

llvm-svn: 258442
2016-01-21 22:23:22 +00:00
David L Kreitzer 4d7257dfa1 Fix for two constant propagation problems in GVN with the assume intrinsic
instruction.

Patch by Yuanrui Zhang.

Differential Revision: http://reviews.llvm.org/D16100

llvm-svn: 258435
2016-01-21 21:32:35 +00:00
Rong Xu 950af1558f Fix buildbot failure due to r258420
Include the needed headfile to fix the buildbot failure due to r258420 [PGO] Passmanagerbuilder change that enable IR level PGO instrumentation.

llvm-svn: 258423
2016-01-21 19:06:24 +00:00
Rong Xu 34abbfb78e [PGO] Passmanagerbuilder change that enable IR level PGO instrumentation
This patch includes the passmanagerbuilder change that enables IR level PGO instrumentation. It adds two passmanagerbuilder options: -profile-generate=<profile_filename> and -profile-use=<profile_filename>. The new options are primarily for debug purpose.

Reviewers: davidxl, silvas

Differential Revision: http://reviews.llvm.org/D15828

llvm-svn: 258420
2016-01-21 18:28:59 +00:00
Adam Nemet af761104ba [TTI] Add getCacheLineSize
Summary:
And use it in PPCLoopDataPrefetch.cpp.

@hfinkel, please let me know if your preference would be to preserve the
ppc-loop-prefetch-cache-line option in order to be able to override the
value of TTI::getCacheLineSize for PPC.

Reviewers: hfinkel

Subscribers: hulx2000, mcrosier, mssimpso, hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D16306

llvm-svn: 258419
2016-01-21 18:28:36 +00:00
Sanjay Patel 4e971da272 make helper functions static; NFCI
llvm-svn: 258416
2016-01-21 18:01:57 +00:00
Philip Reames 82e0f15f86 Fix a type in a comment
Thanks to Sean Silva for pointing it out.

llvm-svn: 258410
2016-01-21 17:32:12 +00:00
Scott Egerton 2455701117 [mips] Allowed dla instructions on 32-bit architectures.
Summary:
This is now the same as the behaviour of the GNU assembler. This was done
as it is required in order to build the Linux kernel with the integrated
assembler enabled.

Reviewers: dsanders, vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D13594

llvm-svn: 258400
2016-01-21 15:11:01 +00:00
Igor Breger 7a000f5bb2 AVX512: Masked move intrinsic implementation.
Implemented intrinsic for the follow instructions (reg move) : VMOVDQU8/16, VMOVDQA32/64, VMOVAPS/PD.

Differential Revision: http://reviews.llvm.org/D16316

llvm-svn: 258398
2016-01-21 14:18:11 +00:00
Michael Zuckerman 21a30a42a9 [AVX512] Adding VPERMT2B and VPERMI2B Intrinsics
Differential Revision: http://reviews.llvm.org/D16398

llvm-svn: 258397
2016-01-21 13:36:01 +00:00
Manuel Jacob e902459c4b Change ConstantFoldInstOperands to take Instruction instead of opcode and type. NFC.
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.

Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16383

llvm-svn: 258391
2016-01-21 06:33:22 +00:00
Manuel Jacob 925d029461 Introduce ConstantFoldCastOperand function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: zzheng, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16380

llvm-svn: 258390
2016-01-21 06:31:08 +00:00
Manuel Jacob a61ca37b6d Introduce ConstantFoldBinaryOpOperands function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16378

llvm-svn: 258389
2016-01-21 06:26:35 +00:00
David Majnemer 8cc30787b0 Rename MCLineEntry to MCDwarfLineEntry
MCLineEntry gives the impression that it is generic MC machinery.
However, it is specific to DWARF.

llvm-svn: 258381
2016-01-21 01:59:03 +00:00
Quentin Colombet 25d8e21949 [GlobalISel] Move generic opcodes description to their own file.
Differential Revision: http://reviews.llvm.org/D16384

llvm-svn: 258378
2016-01-21 01:37:18 +00:00
Manuel Jacob 4e3b446ae8 Run clang-format over ConstantFolding.h, fixing inconsistent indentation. NFC.
llvm-svn: 258361
2016-01-20 22:27:06 +00:00
Quentin Colombet 105cf2b179 [GlobalISel] Add the proper cmake plumbing.
This patch adds the necessary plumbing to cmake to build the sources related to
GlobalISel.

To build the sources related to GlobalISel, we need to add -DBUILD_GLOBAL_ISEL=ON.
By default, this is OFF, thus GlobalISel sources will not impact people that do
not explicitly opt-in.

Differential Revision: http://reviews.llvm.org/D15983

llvm-svn: 258344
2016-01-20 20:58:56 +00:00
Sanjoy Das a34ce95b60 Add a "gc-transition" operand bundle
Summary:
This adds a new kind of operand bundle to LLVM denoted by the
`"gc-transition"` tag.  Inputs to `"gc-transition"` operand bundle are
lowered into the "transition args" section of `gc.statepoint` by
`RewriteStatepointsForGC`.

This removes the last bit of functionality that was unsupported in the
deopt bundle based code path in `RewriteStatepointsForGC`.

Reviewers: pgavlin, JosephTremoulet, reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16342

llvm-svn: 258338
2016-01-20 19:50:25 +00:00
Quentin Colombet 2d7fa7065f [GlobalISel] Add a generic machine opcode for ADD.
The selection process being split into separate passes, we need generic opcodes
to translate the LLVM IR to target independent code.

This patch adds an opcode for addition: G_ADD.

Differential Revision: http://reviews.llvm.org/D15472

llvm-svn: 258333
2016-01-20 19:14:55 +00:00
Michael Zuckerman 65c40afb03 [AVX512] Adding VPERMB Intrinsics
Differential Revision: http://reviews.llvm.org/D16296

llvm-svn: 258316
2016-01-20 15:24:56 +00:00
Igor Breger d3341f5021 AVX512: Store (MOVNTPD, MOVNTPS, MOVNTDQ) using non-temporal hint intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16350

llvm-svn: 258309
2016-01-20 13:11:47 +00:00
Dylan McKay cc018c1713 [AVR] Defnined calling conventions. NFC.
llvm-svn: 258300
2016-01-20 09:30:01 +00:00
Dan Gohman edf98c5682 [SelectionDAG] Fold more offsets into GlobalAddresses
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.

This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.

Differential Revision: http://reviews.llvm.org/D16090

llvm-svn: 258296
2016-01-20 07:03:08 +00:00
Lang Hames 3c43dc27ab [Orc] 'this' qualify more lambda-captured members.
More workaround attempts for GCC ICEs.

llvm-svn: 258288
2016-01-20 05:10:59 +00:00
Lang Hames 5959df89e9 [Orc] More qualifications of lambda-captured member variables to fix GCC ICEs.
llvm-svn: 258286
2016-01-20 04:32:05 +00:00
Lang Hames efa5f6c170 [Orc] Qualify captured variable to work around GCC ICE.
llvm-svn: 258278
2016-01-20 03:12:40 +00:00
Xinliang David Li 59411db520 [PGO] Add a new interface to be used by Indirect Call Promotion
llvm-svn: 258271
2016-01-20 01:26:34 +00:00
Xinliang David Li 440cd7027b Function name change /NFC
llvm-svn: 258260
2016-01-20 00:24:36 +00:00
Matthias Braun d4f6409dff MachineScheduler: Allow independent scheduling of sub register defs
Note that this is disabled by default and still requires a patch to
handleMove() which is not upstreamed yet.

If the TrackLaneMasks policy/strategy is enabled the MachineScheduler
will build a schedule graph where definitions of independent
subregisters are no longer serialised.

Implementation comments:
- Without lane mask tracking a sub register def also counts as a use
  (except for the first one with the read-undef flag set), with lane
  mask tracking enabled this is no longer the case.
- Pressure Diffs where previously maintained per definition of a
  vreg with the help of the SSA information contained in the
  LiveIntervals.  With lanemask tracking enabled we cannot do this
  anymore and instead change the pressure diffs for all uses of the vreg
  as it becomes live/dead.  For this changed style to work correctly we
  ignore uses of instructions that define the same register again: They
  won't affect register pressure.
- With lanemask tracking we remove all read-undef flags from
  sub register defs when building the graph and re-add them later when
  all vreg lanes have become dead.

Differential Revision: http://reviews.llvm.org/D14969

llvm-svn: 258259
2016-01-20 00:23:32 +00:00
Matthias Braun 5d458617aa RegisterPressure: Make liveness tracking subregister aware
Differential Revision: http://reviews.llvm.org/D14968

llvm-svn: 258258
2016-01-20 00:23:26 +00:00
Matthias Braun 3907fded1b LiveInterval: Add utility class to rename independent subregister usage
This renaming is necessary to avoid a subregister aware scheduler
accidentally creating liveness "holes" which are rejected by the
MachineVerifier.

Explanation as found in this patch:

Helper class that can divide MachineOperands of a virtual register into
equivalence classes of connected components.
MachineOperands belong to the same equivalence class when they are part of
the same SubRange segment or adjacent segments (adjacent in control
flow); Different subranges affected by the same MachineOperand belong to
the same equivalence class.

Example:
  vreg0:sub0 = ...
  vreg0:sub1 = ...
  vreg0:sub2 = ...
  ...
  xxx        = op vreg0:sub1
  vreg0:sub1 = ...
  store vreg0:sub0_sub1

The example contains 3 different equivalence classes:
  - One for the (dead) vreg0:sub2 definition
  - One containing the first vreg0:sub1 definition and its use,
    but not the second definition!
  - The remaining class contains all other operands involving vreg0.

We provide a utility function here to rename disjunct classes to different
virtual registers.

Differential Revision: http://reviews.llvm.org/D16126

llvm-svn: 258257
2016-01-20 00:23:21 +00:00
Davide Italiano 648f4e32ba Reinstate the second part of a comment. NFC.
Reported by:  Filipe Cabecinhas
Pointy-hat to: me

llvm-svn: 258223
2016-01-19 23:39:28 +00:00
David Majnemer ce10842036 [MC, COFF] Add .reloc support for WinCOFF
This adds rudimentary support for a few relocations that we will use for
the CodeView debug format.

llvm-svn: 258216
2016-01-19 23:05:27 +00:00
Lang Hames 951f73a2de [Orc] Oops - lambda capture changed in r258206 was correct.
Fully qualify reference to Finalized in the body of the lambda instead to work
around GCC ICE.

llvm-svn: 258208
2016-01-19 22:32:58 +00:00
Quentin Colombet 2c49e2e664 [MachineFunction] Constify getter. NFC.
llvm-svn: 258207
2016-01-19 22:31:12 +00:00
Lang Hames 97ce2bcefe [Orc] Add missing capture to lambda.
llvm-svn: 258206
2016-01-19 22:31:01 +00:00
Lang Hames df1ce15ef2 [Orc] Qualify call to make_unique to avoid ambiguity with std::make_unique.
This should fix some of the bot failures associated with r258185.

llvm-svn: 258204
2016-01-19 22:22:43 +00:00
Lang Hames bf4e1981e6 [Orc] Fix a stale comment.
llvm-svn: 258187
2016-01-19 21:13:54 +00:00
Lang Hames 2fe7acb773 [Orc] Refactor ObjectLinkingLayer::addObjectSet to defer loading objects until
they're needed.

Prior to this patch objects were loaded (via RuntimeDyld::loadObject) when they
were added to the ObjectLinkingLayer, but were not relocated and finalized until
a symbol address was requested. In the interim, another object could be loaded
and finalized with the same memory manager, causing relocation/finalization of
the first object to fail (as the first finalization call may have marked the
allocated memory for the first object read-only).

By deferring the loadObject call (and subsequent memory allocations) until an
object file is needed we can avoid prematurely finalizing memory.

llvm-svn: 258185
2016-01-19 21:06:38 +00:00
Sanjoy Das 29a4b5dc0d [SCEV] Fix PR26207
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be).  In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression.  This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.

This change

 - Makes the above implicit invariant explicit by adding an assert to
   ExitLimit's constructor

 - Changes `computeExitLimitFromCond` to be more robust around
   conservative max backedge counts

llvm-svn: 258184
2016-01-19 20:53:51 +00:00
Sanjay Patel d3112a5bcc function names start with a lowercase letter; NFC
Note: There are no uses of these functions outside of
SimplifyLibCalls, so they could be static functions in
that file.

llvm-svn: 258172
2016-01-19 19:46:10 +00:00
Sanjay Patel 251cf1336a don't repeat function names in documentation comments; NFC
llvm-svn: 258164
2016-01-19 19:10:10 +00:00
Philip Reames 1a196f7daf Revert 258157
According the build bots, clang is using the Registry class somewhere as well. Will reapply with appropriate clang changes at a later point.

llvm-svn: 258159
2016-01-19 18:41:10 +00:00
Philip Reames 0f6650e8e8 [GC] Registry initialization and linkage interactions
The Registry class constructs a linked list of nodes whose storage is inside static variables and nodes are added via static initializers. The trick is that those static initializers are in both the LLVM code base, and some random plugin that might get loaded in at runtime. The existing code tries to use C++ templates and their ODR rules to get a single definition of the registry for each type, but, experimentally, this doesn't quite work as designed. (Well, the entire structure doesn't. It might not actually be an ODR problem.)

Previously, when I tried moving the GCStrategy class (along with it's registry) from CodeGen to IR, I ran into a problem where asking the GCStrategyRegistry a question would return inconsistent results depending on whether you asked from CodeGen (where the static initializers still were) or Transforms. My best guess is that this is a result of either a) an order of initialization error, or b) we ended up with two copies of the registry being created. I remember at the time having convinced myself it was probably (b), but I don't have any of my notes around from that investigation any more.

See http://reviews.llvm.org/rL226311 for the original patch in question.

This patch tries to remove the possibility of (b) above. (a) was already fixed in change 258109.

Differential Revision: http://reviews.llvm.org/D16170

llvm-svn: 258157
2016-01-19 18:34:27 +00:00
Philip Reames 1ec08ac7e4 Add clarifying comments defining what a Loop is
Our loop construct is not a way to identify cycles in the CFG.  This wasn't immediately obvious from the header, so clarify that fact.

The motivation for this was that I just fixed a out of tree bug due to a mistaken assumption (on my part) on what a Loop actually was.  While it was fresh in my mind, I wanted to document the key point.

llvm-svn: 258154
2016-01-19 18:26:01 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Rafael Espindola 1a7e8b4bc1 Simplify MCFillFragment.
The value size was always 1 or 0, so we don't need to store it.

In a no asserts build this takes the testcase of pr26208 from 11 to 10
seconds.

llvm-svn: 258141
2016-01-19 16:57:08 +00:00
Asaf Badouh d4a0d9a78c [X86][AVX512]fix dag & add intrinsics for fixupimm
cover all width and types (pd/ps/sd/ss) of fixupimm instruction and inrtinsics

Differential Revision: http://reviews.llvm.org/D16313

llvm-svn: 258124
2016-01-19 14:21:39 +00:00
Tobias Edler von Koch 8ecaf69291 [LTO] Restore original linkage of externals prior to splitting
Summary:
This is a companion patch for http://reviews.llvm.org/D16124.

Internalized symbols increase the size of strongly-connected components in
SCC-based module splitting and thus reduce the amount of parallelism. This
patch records the original linkage of non-local symbols prior to
internalization and then restores it just before splitting/CodeGen. This is
also useful for cases where the linker requires symbols to remain external, for
instance, so they can be placed according to linker script rules.

It's currently under its own flag (-restore-globals) but should eventually
share a common flag with D16124.

Reviewers: joker.eph, pcc

Subscribers: slarin, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D16229

llvm-svn: 258100
2016-01-18 23:24:54 +00:00
Davide Italiano f0caa3eaab [Support/ELF] Remove field erroneously added in r258025.
Although glibc defines it, this is currently of no use for my primary
use-case (dumping DT_* keys correctly). Its semantic is not described
anywhere I can find, so better leave it out for now.
Thanks to Rafael for pointing out in his post-commit review!

llvm-svn: 258089
2016-01-18 21:20:02 +00:00
Sergei Larin d19d4d30d8 Add to the split module utility an SCC based method which allows not to globalize any local variables.
Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
    This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
    Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    
    Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    
    Subscribers: llvm-commits, joker.eph
    
    Differential Revision: http://reviews.llvm.org/D16124

llvm-svn: 258083
2016-01-18 21:07:13 +00:00
Rafael Espindola df9e61b599 Delete dead code.
llvm-svn: 258082
2016-01-18 21:01:50 +00:00
Craig Topper 5e46adb09a [TableGen] Use FoldingSets instead of DenseMaps to unique UnOpInit, BinOpInit and TernOpInit. This remove the memory needed to store the key for the DenseMap. NFC
llvm-svn: 258071
2016-01-18 20:36:06 +00:00
Tom Stellard ccdc5391ea TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.

If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.

This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:

(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))

There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.

Reviewers: resistor, arsenm

Subscribers: jroelofs, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15034

llvm-svn: 258067
2016-01-18 19:55:21 +00:00
Craig Topper 0e41d0b963 [TableGen] Merge the SuperClass Record and SMRange vector into a single vector. This removes the state needed to manage the extra vector thus reducing the size of the Record class. NFC
llvm-svn: 258065
2016-01-18 19:52:37 +00:00
Craig Topper d4d3ebd937 [TableGen] Reorder fields in Record class to optimize memory usage. NFC
llvm-svn: 258064
2016-01-18 19:52:29 +00:00
Craig Topper fbfd578056 [TableGen] Allocate the Init pointer array for BitsInit/ListInit after the BitsInit/ListInit object itself. Saves a bit of memory. NFC
llvm-svn: 258063
2016-01-18 19:52:24 +00:00
Igor Breger 239fda676c AVX512: Masked store intrinsic implementation.
Implemented intrinsic for the follow instructions (store) : VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD.

Differential Revision: http://reviews.llvm.org/D16271

llvm-svn: 258047
2016-01-18 13:52:57 +00:00
Xinliang David Li 42a13308a1 [Coverage] move a local var to be BinaryCoverageReader's member
The symtab is logically referenced beyond the call to the create
method. This changes makes sure its lifetime matches that of
the reader.

llvm-svn: 258036
2016-01-18 06:48:01 +00:00
Amaury Sechet 1c39507772 Fix typo in the C API comments
llvm-svn: 258033
2016-01-18 01:06:52 +00:00
Xinliang David Li a3feba2e01 minor comment clean and add a method \NFC
llvm-svn: 258030
2016-01-18 00:26:33 +00:00
Davide Italiano 696f043bc2 [Support/ELF] Add Sun machine-independent extesions DT_* constants.
llvm-svn: 258025
2016-01-17 22:46:50 +00:00
Manuel Jacob 20c6d5bcb8 [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the source element type of the GEP as an argument.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16281

llvm-svn: 258024
2016-01-17 22:46:43 +00:00
Sanjoy Das ce6555f0be [SCEV] Use range for; NFC
llvm-svn: 258014
2016-01-17 18:12:45 +00:00
Michael Zuckerman ac1b238b0a [AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics
Differential Revision: http://reviews.llvm.org/D16189

llvm-svn: 258008
2016-01-17 11:33:29 +00:00
Michael Zuckerman ede597c753 [AVX512] Adding VPERMQ VPERMPD Intrinsics
Differential Revision: http://reviews.llvm.org/D16194

llvm-svn: 258006
2016-01-17 08:32:14 +00:00
Xinliang David Li 6ed987dffe [PGO] fix a bug in profile summary computation
Entry block count was not counted and is corrected. Also
introduce a new metric that is MaxInternalBlockCount which
show command shows (as before).

llvm-svn: 257987
2016-01-16 05:29:49 +00:00
Peter Collingbourne f0f5e87083 Introduce sanstats tool and llvm::CreateSanitizerStatReport function.
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.

Differential Revision: http://reviews.llvm.org/D16174

llvm-svn: 257970
2016-01-16 00:31:11 +00:00
Dan Gohman 2f301f3e92 [WebAssembly] Don't create a needless .note.GNU-stack section
WebAssembly's stack will never be executable by default, so it isn't
necessary to declare .note.GNU-stack sections to request a non-executable
stack.

Differential Revision: http://reviews.llvm.org/D15969

llvm-svn: 257962
2016-01-15 23:59:13 +00:00
David Blaikie ab105bbf0c [opaque pointer types] Remove an unnecessary extra explicit value type in Function
Now that this is up in GlobalValue, just use the value there.

llvm-svn: 257949
2016-01-15 23:07:58 +00:00
Dan Gohman 4e9b2a60ab [SelectionDAG] CSE nodes with differing SDNodeFlags
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.

Differential Revision: http://reviews.llvm.org/D15957

llvm-svn: 257940
2016-01-15 21:56:40 +00:00
Joseph Tremoulet 44b3f961e1 [WinEH] Rename CatchReturnInst::getParentPad, NFC
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is.  Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16222

llvm-svn: 257933
2016-01-15 21:16:19 +00:00
Lang Hames 2ba12953d2 [Orc] Remove some reinterpret casts in debugging output.
These casts were from function pointer to data pointer type, which some
compilers (including GCC) may warn about. In all cases where these casts were
used the original value was still available as a TargetAddress (uint64_t), so
we can just print a formatted version of that instead.

llvm-svn: 257932
2016-01-15 21:14:05 +00:00
Lang Hames 2f9773863f [Orc] Add a void cast to work around a GCC diagnostic bug.
llvm-svn: 257927
2016-01-15 19:37:14 +00:00
Xinliang David Li b606638526 [PGO] Commonize (more) index profile file and buffer writer.
The file and buffer writer code are mostly shared except for the
stream back-patching. This is because raw_string_ostream does not
support seek like interface. The result is that the data patching
code needs to be pushed to the caller which is not quite readable 
(passing around offset, value etc). This also makes future enhancement
(which needs more patching) more difficult (and can make impl messy).

In this patch, two types of streams needed by the writer are now
unified with same set of interfaces under ProfOStream class. The patch
method is added so that common implementation becomes cleaner. It
also enables future enhancement. Should be NFC.

llvm-svn: 257921
2016-01-15 19:01:04 +00:00
Rafael Espindola 257a35368f Bring back "Assert that we have all use/users in the getters."
This reverts commit r257751, bringing back r256105.

The problem the assert found was fixed in r257915.

Original commit message:

Assert that we have all use/users in the getters.

An error that is pretty easy to make is to use the lazy bitcode reader
and then do something like

if (V.use_empty())

The problem is that uses in unmaterialized functions are not accounted
for.

This patch adds asserts that all uses are known.

llvm-svn: 257920
2016-01-15 19:00:20 +00:00
Reid Kleckner 47f2452da8 # This is a combination of 2 commits.
# The first commit's message is:

Revert "[ARM] Add DSP build attribute and extension targeting"

This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.

# This is the 2nd commit message:

Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"

This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.

llvm-svn: 257916
2016-01-15 18:31:29 +00:00
George Rimar 05535bccbc [Support/ELF] - Added DT_TLSDESC_PLT and DT_TLSDESC_GOT constants.
Added 2 constants:

DT_TLSDESC_PLT = 0x6FFFFEF6, Location of PLT entry for TLS descriptor resolver calls.
DT_TLSDESC_GOT = 0x6FFFFEF7, Location of GOT entry used by TLS descriptor resolver PLT entry.

Constants were taken from "Thread-Local Storage Descriptors for IA32 and AMD64/EM64T Version 0.9.5" http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt


Differential revision: http://reviews.llvm.org/D16185

llvm-svn: 257911
2016-01-15 18:09:27 +00:00
Reid Kleckner c31f530cb7 [codeview] Dump the file checksum substream
llvm-svn: 257910
2016-01-15 18:06:25 +00:00
James Y Knight ac03dca412 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

llvm-svn: 257902
2016-01-15 16:33:06 +00:00
Artur Pilipenko 6dd6969cee Change isSafeToLoadUnconditionally arguments order. Separated from http://reviews.llvm.org/D10920.
llvm-svn: 257894
2016-01-15 15:27:46 +00:00
Bradley Smith 48b93e1f21 [ARM] Add DSP build attribute and extension targeting
llvm-svn: 257885
2016-01-15 10:28:25 +00:00
Bradley Smith e26f799422 [ARM] Add ARMv8-M Baseline/Mainline LLVM targeting
llvm-svn: 257878
2016-01-15 10:24:39 +00:00
James Molloy f01488e2bc [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Pete Cooper 835594e627 Delete MCRelocationInfo::createExprForRelocation.
This method has no callers.

Also remove X86ELFRelocationInfo.cpp and X86MachORelocationInfo.cpp
which only existed to provide an implementation of that method.

Ok'd by Rafael and Jim.

llvm-svn: 257859
2016-01-15 02:24:12 +00:00
David Blaikie edbe568573 Orc: Simplify some things with NSDMIs and some braced init.
llvm-svn: 257840
2016-01-14 23:33:43 +00:00
Easwaran Raman f4bb2f0dc3 Refactor threshold computation for inline cost analysis
Differential Revision: http://reviews.llvm.org/D15401

llvm-svn: 257832
2016-01-14 23:16:29 +00:00
Xinliang David Li 565b301380 [PGO] Move profile summary interface/impl into InstrProf.[*] /NFC
llvm-svn: 257819
2016-01-14 22:10:49 +00:00
Lang Hames 52c4724165 [Orc] Add support for EH-frame registration to the Orc Remote Target utility
classes.

OrcRemoteTargetClient::RCMemoryManager will now register EH frames with the
server automatically. This allows remote-execution of code that uses exceptions.

llvm-svn: 257816
2016-01-14 22:02:03 +00:00
Krzysztof Parzyszek c005e20d3b [Packetizer] Code cleanup, NFC
llvm-svn: 257805
2016-01-14 21:17:04 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Rui Ueyama c58a06d739 [Support] Rename RoundUpToAlignment -> alignTo.
Rounding up an integer m to a nearest multiple of n where n is a power
of 2 is used very often if you are writing code to emit binary files.
RoundUpToAlignment is a small function to do that. But we found that the
function has a small but annoying issue; the name is a bit too long.
Because it is used quite often, that hurts readability.

This patch is to rename the function. The original name is kept as a
forwarder, so that submitting this patch won't immediately break Clang
and other LLVM projects. Once I update all occurrences of RoundUpToAlignment,
I'll remove the old name entirely.

http://reviews.llvm.org/D16162

llvm-svn: 257799
2016-01-14 20:43:11 +00:00
Reid Kleckner 3e8d8c7d26 Include TypeIndex. Again, the "check" target is not enough to catch this currently
llvm-svn: 257793
2016-01-14 19:40:27 +00:00
Reid Kleckner e9ab3498f3 [codeview] Dump CodeView inlinee lines subsection
llvm-svn: 257790
2016-01-14 19:20:17 +00:00
James Y Knight 582f556251 Revert "Stop increasing alignment of externally-visible globals on ELF platforms."
This reverts commit r257719, due to PR26144.

llvm-svn: 257775
2016-01-14 16:33:21 +00:00
Michael Zolotukhin 65c0120193 Revert "Assert that we have all use/users in the getters."
This reverts commit fdb838f3f8a8b6896bbbd5285555874eb3b748eb.

llvm-svn: 257751
2016-01-14 09:02:45 +00:00
Igor Breger fc96331d88 AVX512: VMOVDQA32/64 (load) intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16142

llvm-svn: 257749
2016-01-14 07:56:04 +00:00
Xinliang David Li d5d8887d28 Cleanup: shorten prefix to consistent with other decls /NFC
llvm-svn: 257744
2016-01-14 06:21:25 +00:00
David Majnemer e21e90933c [CodeView] Add support for dumping binary annotations
Binary annotations are encoded along the lines of UTF-8 and ECI but with
a few minor differences.

The algorithm specified in "ECMA-335 CLI Section II.3.2 - Blobs and
Signatures" is used to compress binary annotations.  Signed binary
annotations are encoded like unsigned annotations except the sign bit is
rotated left to reduce the number of bits needed to be encoded.

llvm-svn: 257742
2016-01-14 06:12:30 +00:00