Commit Graph

183574 Commits

Author SHA1 Message Date
Richard Smith ffb650856d Enable both C and C++ modules with -fmodules, by switching -fcxx-modules to
being on by default. -fno-cxx-modules can still be used to enable C modules but
not C++ modules, but C++ modules is not significantly less stable than C
modules any more.

Also remove some of the scare words from the modules documentation. We're
certainly not going to remove modules support (though we might change the
interface), and it works well enough to bootstrap and build lots of
non-trivial code.

Note that this does not represent a commitment to the current interface nor
implementation, and we still intend to follow whatever direction the C and C++
committees take regarding modules support.

llvm-svn: 218717
2014-09-30 23:10:19 +00:00
Kuba Brecka 12dee62b02 [compiler-rt] Re-enable the use of -gmlt for ASan tests on Darwin
The optimization for -gmlt/-gline-tables-only introduced in r218129 happened to break on Darwin and produce no line number information due to
an incompatibility with dsymutil. ASan tests have been failing because of that and we disabled the use of -gmlt for the tests in r218545. This patch re-enables the use of -gmlt, because we have conditionally disabled the incompatible optimization in LLVM, so -gmlt now works on Darwin. Once Darwin's dsymutil is modified to allow this optimization, we can re-enable the optimization in LLVM.

llvm-svn: 218716
2014-09-30 23:07:45 +00:00
Richard Trieu 9f8509f70d Update -Wuninitialized to be stricter on CK_NoOp casts.
llvm-svn: 218715
2014-09-30 23:04:37 +00:00
Hal Finkel fd86317989 [BasicAA] Make better use of zext and sign information
Two related things:

 1. Fixes a bug when calculating the offset in GetLinearExpression. The code
    previously used zext to extend the offset, so negative offsets were converted
    to large positive ones.

 2. Enhance aliasGEP to deduce that, if the difference between two GEP
    allocations is positive and all the variables that govern the offset are also
    positive (i.e. the offset is strictly after the higher base pointer), then
    locations that fit in the gap between the two base pointers are NoAlias.

Patch by Nick White!

llvm-svn: 218714
2014-09-30 22:43:40 +00:00
David Blaikie 1cae849c04 DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot.
No functional change. Pre-emptive refactoring before I start pushing
some of this subprogram creation down into DWARFCompileUnit so I can
build different subprograms in the skeleton unit from the dwo unit for
adding -gmlt-like data to the skeleton.

llvm-svn: 218713
2014-09-30 22:32:49 +00:00
Hans Wennborg 437aa948b8 MSBuild integration: fix the loop in install.bat
It would previously not continue the platforms loop
unless it could find the latest toolset directory.

llvm-svn: 218712
2014-09-30 22:30:06 +00:00
Jingyue Wu fc0296704c [SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.

The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:

  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }

The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:

  sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.

We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!

Test Plan: Added one test case to check the threshold is in effect

Reviewers: nadav, eliben, meheff, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D5529

llvm-svn: 218711
2014-09-30 22:23:38 +00:00
Chandler Carruth bebedbaf36 [x86] Add AVX1 and AVX2 testing to all of the 128-bit shuffle test
cases.

While clearly we don't need the AVX vector width, these ISA extensions
often cause us to select different instructions and we should cover them
even with the narrow vector width.

Also, while here, nuke the stress_test2 contents. There is no reason to
try to FileCheck this entire body when it is mostly a test for
successfully surviving the code generator.

llvm-svn: 218710
2014-09-30 22:16:23 +00:00
Chandler Carruth a41dceb39b [x86] Update the exact FileCheck syntax of the 256-bit and 512-bit
shuffle tests to match that used in the script I posted and now used
consistently in 128-bit tests.

Nothing interesting changing here, just using the label name as the
FileCheck label and a slightly more general comment marker consumption
strategy.

llvm-svn: 218709
2014-09-30 22:04:45 +00:00
David Blaikie 515387569a Adjust test case addition in r218702 so as not to fail when the X86 target isn't built.
llvm-svn: 218708
2014-09-30 22:02:27 +00:00
Chandler Carruth 6a62cd3538 [x86] Rework all of the 128-bit vector shuffle tests with my handy test
updating script so that they are more thorough and consistent.

Specific fixes here include:
- Actually test VEX-encoded AVX mnemonics.
- Actually use an SSE 4.1 run to test SSE 4.1 features!
- Correctly check instructions sequences from the start of the function.
- Elide the shuffle operands and comment designator in a consistent way.
- Test all of the architectures instead of just the ones I was motivated
  to manually author.

I've gone back through and fixed up any egregious issues I spotted. Let
me know if I missed something you really dislike.

One downside to this is that we're now not as diligently using FileCheck
variables for registers. I would be much more concerned with this if we
had larger register usage, but there just aren't that interesting of
register choices here and most of the registers are constrained by the
ABI. Ultimately, I don't think this is likely to be the maintenance
burden for these tests and updating them again should be staright
forward.

llvm-svn: 218707
2014-09-30 21:44:34 +00:00
Rui Ueyama 07fae9691b [PECOFF] Fix /entry option.
This is yet another edge case of ambiguous name resolution.
When a symbol is specified with /entry:SYM, SYM may be resolved
to the C++ mangled function name (?SYM@@YAXXZ).

llvm-svn: 218706
2014-09-30 21:39:49 +00:00
Rui Ueyama dbddf11649 [PECOFF] Move helper function out of class
No functionality change intended.

llvm-svn: 218705
2014-09-30 21:39:46 +00:00
Tim Northover 3fc12bf860 [mach-o] add file comment to compact unwind pass
llvm-svn: 218704
2014-09-30 21:32:46 +00:00
Tim Northover cf78d37fd6 [mach-o] create __unwind_info section on x86_64
This is a minimally useful pass to construct the __unwind_info section in a
final object from the various __compact_unwind inputs. Currently it doesn't
produce any compressed pages, only works for x86_64 and will fail if any
function ends up without __compact_unwind.

rdar://problem/18208653

llvm-svn: 218703
2014-09-30 21:29:54 +00:00
David Blaikie e1c79749ca Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil.
r218129 omits DW_TAG_subprograms which have no inlined subroutines when
emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.

Darwin's dsymutil reasonably considers a CU empty if it has no
subprograms (which occurs with the above optimization in -O0 programs
without any force_inline function calls) and drops the line table, CU,
and everything in this situation, making backtraces impossible.

Until dsymutil is modified to account for this, disable this
optimization on Darwin to preserve the desired functionality.
(see r218545, which should be reverted after this patch, for other
discussion/details)

Footnote:
In the long term, it doesn't look like this scheme (of simplified debug
info to describe inlining to enable backtracing) is tenable, it is far
too size inefficient for optimized code (the DW_TAG_inlined_subprograms,
even once compressed, are nearly twice as large as the line table
itself (also compressed)) and we'll be considering things like Cary's
two level line table proposal to encode all this information directly in
the line table.

llvm-svn: 218702
2014-09-30 21:28:32 +00:00
Sanjay Patel ab7f460bca Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC.
llvm-svn: 218700
2014-09-30 20:44:23 +00:00
Jim Ingham 89fd66813b Not all processes have a Dynamic Loader. Be sure to check that it exists before using it.
<rdar://problem/18491391>

llvm-svn: 218699
2014-09-30 20:33:25 +00:00
Sanjay Patel 8fde95cb2b Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. Eg, x86 will prefer a different NR estimate
implementation. ARM will want to use it's 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...

llvm-svn: 218698
2014-09-30 20:28:48 +00:00
Justin Bogner 916cca728f InstrProf: Remove an unused member (NFC)
llvm-svn: 218697
2014-09-30 20:21:50 +00:00
Rui Ueyama fa67adc28d [PECOFF] Allow /export:<symbol>,PRTVATE.
PRIVATE option is also an undocumented feature.

llvm-svn: 218696
2014-09-30 20:09:31 +00:00
Rui Ueyama 3837e10002 [PECOFF] Fix /export option.
MSDN doesn't say about /export:foo=bar style option, but
it turned out MSVC link.exe actually accepts that. So we need that
too.

It also means that the export directive in the module definition
file and /export command line option are functionally equivalent.

llvm-svn: 218695
2014-09-30 20:03:11 +00:00
Ben Langmuir c28ce3aba6 Avoid a crash after loading an #undef'd macro in code completion
In code-completion, don't assume there is a MacroInfo for everything,
since we aren't serializing the def corresponding to a later #undef in
the same module.  Also setup the HadMacro bit correctly for undefs to
avoid an assertion failure.

rdar://18416901

llvm-svn: 218694
2014-09-30 20:00:18 +00:00
Juergen Ributzka c110c0b99a Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).

Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

llvm-svn: 218693
2014-09-30 19:59:35 +00:00
Matt Arsenault 9706978077 R600/SI: Fix printing of clamp and omod
No tests for omod since nothing uses it yet, but
this should get rid of the remaining annoying trailing
zeros after some instructions.

llvm-svn: 218692
2014-09-30 19:49:48 +00:00
Matt Arsenault 272c50a1fe R600/SI: Update VOP3b to not include obsolete operands
abs / neg are now part of the srcN_modifiers operands

llvm-svn: 218691
2014-09-30 19:49:43 +00:00
Rui Ueyama 3041443b5c [PECOFF] Fix __imp_ prefix on x64.
"__imp_" prefix always starts with double underscores.
When I was writing the original code I misunderstood
that it's "_imp_" on x64.

llvm-svn: 218690
2014-09-30 19:42:04 +00:00
Daniel Jasper 67f8ad258f clang-format: [JS] Support AllowShortFunctionsOnASingleLine.
Specifically, this also counts for stuff like (with style "inline"):
  var x = function() {
    return 1;
  };

llvm-svn: 218689
2014-09-30 17:57:06 +00:00
Eli Bendersky f2787a0dc0 CUDA: mark the target of implicit intrinsics properly
r218624 implemented target inference for implicit special members. However,
other entities can be implicit - for example intrinsics. These can not have
inference running on them, so they should be marked host device as before. This
is the safest and most flexible setting, since by construction these functions
don't invoke anything, and we'd like them to be invokable from both host and
device code. LLVM's intrinsics definitions (where these intrinsics come from in
the case of CUDA/NVPTX) have no notion of target, so both host and device
intrinsics can be supported this way.

llvm-svn: 218688
2014-09-30 17:38:34 +00:00
Jim Ingham 8d81bdf11f Add SBThreadPlan to this CMakeLists.txt as well.
llvm-svn: 218687
2014-09-30 17:11:53 +00:00
Todd Fiala 1f67ded0b2 thread state coordinator: add additional assert missing from previous test check-in.
llvm-svn: 218686
2014-09-30 17:00:52 +00:00
Zachary Turner 25cbf5aac6 Fix FreeBSD build.
llvm-svn: 218685
2014-09-30 16:56:56 +00:00
Zachary Turner c76a445279 Fixup some minor issues with HostProcess.
llvm-svn: 218684
2014-09-30 16:56:40 +00:00
Todd Fiala f8d929dc82 thread state coordinator: add test to be explicit about resume behavior in presence of deferred stop notification still pending.
There is a state transition that seems potentially buggy that I am capturing and
logging here, and including an explicit test to demonstrate expected behavior.  See new test
for detailed description.  Added logging around this area since, if we hit it, we
may have a usage bug, or a new state transition we really need to investigate.

This is around this scenario:
Thread C deferred stop notification awaiting thread A and thread B to stop.
Thread A stops.
Thread A requests resume.
Thread B stops.

Here we will explicitly signal the deferred stop notification after thread B
stops even though thread A is now resumed.  Copious logging happens here.

llvm-svn: 218683
2014-09-30 16:56:28 +00:00
Bradley Smith 7a77075530 Extend C disassembler API to allow specifying target features
llvm-svn: 218682
2014-09-30 16:31:40 +00:00
Reed Kotler 3ebdcc9ea7 Add numeric extend, trunctate to mips fast-isel
Summary:
 Add numeric extend, trunctate to mips fast-isel

 Reactivates D4827



Test Plan:
fpext.ll
loadstoreconv.ll

Reviewers: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D5251

llvm-svn: 218681
2014-09-30 16:30:13 +00:00
Tom Coxon 2c13e71728 [AArch64] Remove unnecessary whitespace. (Test commit)
llvm-svn: 218680
2014-09-30 16:23:16 +00:00
Todd Fiala 8a3716bfab Fix cmake build for new thread plan files.
llvm-svn: 218679
2014-09-30 15:58:56 +00:00
Andrea Di Biagio c7c524129b [DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle.
Currently, the DAG Combiner only tries to convert type-legal build_vector nodes
into shuffles. This patch simply moves the logic that checks if a
build_vector has a legal value type up before we even start analyzing the
operands. This allows to early exit immediately from method
'visitBUILD_VECTOR' if the node type is known to be illegal.

No functional change intended.

llvm-svn: 218677
2014-09-30 15:30:22 +00:00
Alex Lorenz 597eaf2a43 Revert r218673 'llvm-cov: add test for report's function & file association.'
Test causes buildbot failures.

llvm-svn: 218676
2014-09-30 14:48:12 +00:00
Alexander Potapenko d775f3b5f0 [UBsan] Disable summary.cpp on Darwin. The test requires ubsan-asan, which does not work yet.
llvm-svn: 218675
2014-09-30 13:55:44 +00:00
Evgeniy Stepanov a9d434918e [asan] XFAIL one test on Android.
And add a missing return in main, just in case.

llvm-svn: 218674
2014-09-30 12:54:32 +00:00
Alex Lorenz a891e6d44a llvm-cov: add test for report's function & file association.
This commit adds a test which checks that the functions defined in header files will get associated with the header files rather than the source files in the reports.

Differential Revision: http://reviews.llvm.org/D5489

llvm-svn: 218673
2014-09-30 12:52:31 +00:00
Alex Lorenz cb1702d45a llvm-cov: Use the number of executed functions for the function coverage metric.
This commit fixes llvm-cov's function coverage metric by using the number of executed functions instead of the number of fully covered functions.

Differential Revision: http://reviews.llvm.org/D5196

llvm-svn: 218672
2014-09-30 12:45:13 +00:00
Lorenzo Martignoni 40d3deeb7d Introduce support for custom wrappers for vararg functions.
Differential Revision: http://reviews.llvm.org/D5412

llvm-svn: 218671
2014-09-30 12:33:16 +00:00
Robert Khasanov 28a7df0b5f [AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}.
Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>

llvm-svn: 218670
2014-09-30 12:15:52 +00:00
Robert Khasanov 5aa4445bde [AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of this intrinsics in case when mask is v2i1 and v4i1.
Now cmp intrinsics lower in the following way:
 (i8 (int_x86_avx512_mask_pcmpeq_q_128
             (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
 (i8 (bitcast
   (v8i1 (insert_subvector undef,
           (v2i1 (and (PCMPEQM %a, %b),
                      (extract_subvector
                         (v8i1 (bitcast %mask)), 0))), 0))))

llvm-svn: 218669
2014-09-30 11:41:54 +00:00
Robert Khasanov b25e562d14 [AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW.
Added new operand type for intrinsics (IIT_V64)

llvm-svn: 218668
2014-09-30 11:32:22 +00:00
Robert Khasanov a27c8e0fd9 [AVX512] Enabled intrinsics for VPCMPEQD and VPCMPEQQ.
Added CMP_MASK intrinsic type

llvm-svn: 218667
2014-09-30 11:19:50 +00:00
Job Noorman ac95cd5c22 Make sure aggregates are properly alligned on MSP430.
llvm-svn: 218666
2014-09-30 11:19:13 +00:00