189673 Commits

Author SHA1 Message Date
Lang Hames 12b12e800b [APFloat][ADT] Fix sign handling logic for FMA results that truncate to zero.
This patch adds a check for underflow when truncating results back to lower
precision at the end of an FMA. The additional sign handling logic in
APFloat::fusedMultiplyAdd should only be performed when the result of the
addition step of the FMA (in full precision) is exactly zero, not when the
result underflows to zero.

Unit tests for this case and related signed zero FMA results are included.

Fixes <rdar://problem/18925551>.

llvm-svn: 225123
2015-01-04 01:20:55 +00:00
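The distinction r225123 draws between an exactly-zero sum and a result that underflows to zero can be illustrated with a minimal Python sketch. This is not the APFloat code: the helper name `fma_zero_sign` is hypothetical, round-to-nearest is assumed, and the caller is assumed to already know that the rounded FMA result is zero.

```python
from fractions import Fraction
from math import copysign


def fma_zero_sign(a: float, b: float, c: float) -> int:
    """Sign (+1 or -1) of a zero result of fma(a, b, c), round-to-nearest.

    Hypothetical helper: assumes all inputs are finite and that the
    rounded result is already known to be zero.
    """
    # Full-precision a*b + c, computed exactly with rationals.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    if exact != 0:
        # Underflow: the exact result is tiny but nonzero, so the zero
        # inherits its sign. No extra sign handling applies here.
        return 1 if exact > 0 else -1
    # Exactly zero: IEEE 754 round-to-nearest gives +0 unless both the
    # product and the addend are negative zeros.
    product_sign = copysign(1.0, a) * copysign(1.0, b)
    addend_sign = copysign(1.0, c)
    return -1 if product_sign < 0 and addend_sign < 0 else 1
```

For example, `fma_zero_sign(5e-324, -0.5, 0.0)` takes the underflow branch: the exact product is negative and nonzero, so the zero result is negative, whereas applying the exact-zero rule there would wrongly give a positive zero.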
Nico Weber 744cc5b5dd Wrap to 80 columns, no behavior change.
llvm-svn: 225122
2015-01-04 00:47:22 +00:00
David Majnemer 99b98f07d4 AST: Remove overzealous assertion from IsModifiable
It's reasonable to ask if an l-value with class type is modifiable.

llvm-svn: 225121
2015-01-04 00:44:32 +00:00
Saleem Abdulrasool ddd926441e llvm-readobj: add support to dump COFF export tables
This enhances llvm-readobj to print out the COFF export table, similar to the
-coff-import option.  This is useful for testing in lld.

llvm-svn: 225120
2015-01-03 21:35:09 +00:00
Saleem Abdulrasool 67f729933f ARM: permit tail calls to weak externals on COFF
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements.  It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call.  For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.

llvm-svn: 225119
2015-01-03 21:35:00 +00:00
David Majnemer 22fe771f78 Parse: __attribute__((keyword)) shouldn't error
Weird constructs like __attribute__((inline)) or
__attribute__((typename)) should result in warnings, not errors.

llvm-svn: 225118
2015-01-03 19:41:00 +00:00
Hal Finkel 5772566ed6 [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.

To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.

llvm-svn: 225117
2015-01-03 17:58:24 +00:00
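The per-loop query described in r225117 can be sketched in Python as an overridable method with an optional loop argument. This is only an illustration of the shape of the change: the class names, the `Loop` stand-in for MachineLoop, the use of byte counts for alignment, and the inclusive reading of "between 5 and 8 instructions" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Loop:
    """Stand-in for MachineLoop; only the instruction count is modelled."""
    num_instructions: int


class TargetLowering:
    """Base target: a single global preference, as before the change."""

    def __init__(self, pref_loop_alignment: int = 1):
        # Alignment in bytes (an assumption made for readability).
        self._pref_loop_alignment = pref_loop_alignment

    def get_pref_loop_alignment(self, loop: Optional[Loop] = None) -> int:
        # Default behavior: ignore the loop and return the configured value.
        return self._pref_loop_alignment


class PowerLikeTarget(TargetLowering):
    """Hypothetical target that inspects the loop before answering."""

    def get_pref_loop_alignment(self, loop: Optional[Loop] = None) -> int:
        # Small loops get 32-byte alignment so they fit in one i-cache line.
        if loop is not None and 5 <= loop.num_instructions <= 8:
            return 32
        return super().get_pref_loop_alignment(loop)
```

A block placement pass would then call `get_pref_loop_alignment(loop)` once per loop; targets that do not override the method keep returning the single configured value.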
Aaron Ballman 409af50858 Volatile reads are side-effecting operations, but in the general case of access through a volatile-qualified type, we're not certain of the underlying object's side-effects on access.
Treat volatile accesses as "maybe" instead of "definite" side effects for the purposes of warning on evaluations in an unevaluated context. No longer diagnose on idiomatic code like:

int * volatile v;
(void)sizeof(*v);

llvm-svn: 225116
2015-01-03 17:00:12 +00:00
Hal Finkel d73bfba7eb [PowerPC] Use 16-byte alignment for modern cores for functions/loops
Most modern PowerPC cores prefer that functions and loops start on
16-byte-aligned boundaries (*), so instruct block placement, etc. to make this
happen. The branch selector has also been adjusted to account for the extra
nops that might now be inserted before loop headers.

(*) Some cores actually prefer other alignments for small loops, but that will
    be addressed in a follow-up commit.

llvm-svn: 225115
2015-01-03 14:58:25 +00:00
Craig Topper 589ceee7f4 Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers.
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.

llvm-svn: 225114
2015-01-03 08:16:34 +00:00
Craig Topper a5754e6e82 Fix some formatting in tablegen output.
llvm-svn: 225113
2015-01-03 08:16:29 +00:00
Craig Topper 8c714d10bd Replace some 'unreachable' comments with llvm_unreachable.
llvm-svn: 225112
2015-01-03 08:16:14 +00:00
Alexey Samsonov df3aeb8e71 Remove TSAN_DEBUG in favor of SANITIZER_DEBUG.
llvm-svn: 225111
2015-01-03 04:29:12 +00:00
Alexey Samsonov 3b1885448a Replace DCHECK with DCHECK_LE where appropriate.
llvm-svn: 225110
2015-01-03 04:29:05 +00:00
David Majnemer 8df46c9289 ValueTracking: Make computeKnownBits for Arguments a little more clear
We would sometimes leave the out-param APInts untouched while going
through computeKnownBits.  While I don't know of a way to trigger a bug
involving this in practice, it goes against the overall design of
computeKnownBits.

Found via code inspection.

llvm-svn: 225109
2015-01-03 02:33:25 +00:00
Kostya Serebryany 0f53d9a2ee [asan/tracing] extend the test a bit more, simplify the tracing code, add a guard page to trace array, fix the trace IDs before dumping
llvm-svn: 225108
2015-01-03 02:07:58 +00:00
Kostya Serebryany 86ced092f4 [asan] extend coverage-tracing.cc test
llvm-svn: 225107
2015-01-03 01:41:11 +00:00
Hal Finkel 4edc66b8de [PowerPC] Add support for the CMPB instruction
Newer POWER cores, and the A2, support the cmpb instruction. This instruction
compares its operands, treating each of the 8 bytes in the GPRs separately,
returning a 'mask' result of 0 (for false) or -1 (for true) in each byte.

Code generation support is added, in the form of a PPCISelDAGToDAG
DAG-preprocessing routine, that recognizes patterns close to what the
instruction computes (either exactly, or related by a constant masking
operation), and generates the cmpb instruction (along with any necessary
constant masking operation). This can be expanded if use cases arise.

llvm-svn: 225106
2015-01-03 01:16:37 +00:00
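The byte-wise comparison that r225106 describes for cmpb can be modelled in a few lines of Python. This is a sketch of the instruction's semantics only, not of the PPCISelDAGToDAG pattern matching; the function name and the use of unsigned 64-bit integers for the GPR values are assumptions.

```python
def cmpb(rs: int, rb: int) -> int:
    """Model of cmpb on two 64-bit register values.

    Each of the 8 bytes is compared separately; the result byte is 0xFF
    (all ones, i.e. -1) where the bytes are equal and 0x00 otherwise.
    Inputs are assumed to be in the range 0 .. 2**64 - 1.
    """
    result = 0
    for byte_index in range(8):
        shift = 8 * byte_index
        lhs_byte = (rs >> shift) & 0xFF
        rhs_byte = (rb >> shift) & 0xFF
        if lhs_byte == rhs_byte:
            result |= 0xFF << shift
    return result
```

With `rs = 0x1122334455667788` and `rb = 0x1100330055007700`, the bytes agree in every other position, so the mask alternates between 0xFF and 0x00.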
Saleem Abdulrasool 4c059622d5 test: correct PE/COFF tests to build under MSVC mode
This adjusts the inputs to be compatible with armv7-windows-msvc as well as
armv7-windows-itanium.  NFC.

llvm-svn: 225105
2015-01-03 00:57:14 +00:00
Saleem Abdulrasool a09f872f58 ReaderWriter: adjust ARM target addresses for exec
ARM NT assumes a THUMB only environment.  As such, any address that is detected
as residing in an executable section is adjusted to have its bottom bit set to
indicate THUMB in case of a mode exchange.

Although the testing here seems insufficient (missing the negative cases) the
existing test cases for the IMAGE_REL_ARM_{ADDR32,MOV32T} are relevant as they
ensure that we do not incorrectly set the bit.

llvm-svn: 225104
2015-01-03 00:57:10 +00:00
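The address adjustment described in r225104 amounts to setting the low bit of any target that lives in an executable section. A minimal Python sketch follows; the function name and the boolean section flag are hypothetical and do not reflect the linker's actual interface.

```python
THUMB_BIT = 0x1  # low bit of a code address marks THUMB mode


def adjust_target_address(address: int, in_executable_section: bool) -> int:
    """Set the THUMB bit on addresses that point into executable sections.

    Data addresses are returned unchanged, which is the negative case the
    commit message relies on existing relocation tests to cover.
    """
    if in_executable_section:
        return address | THUMB_BIT
    return address
```

Because the bit is OR-ed in, applying the adjustment to an address that already has it set is harmless.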
Kostya Serebryany d421db05bb [asan] simplify the tracing code, make it use the same guard variables as coverage
llvm-svn: 225103
2015-01-03 00:54:43 +00:00
Rafael Espindola f733b422d0 Remove -Werror from test.
It is not needed since we FileCheck for the warning and -Werror itself can end up
unused.

llvm-svn: 225102
2015-01-03 00:28:47 +00:00
Rafael Espindola 577637a6af Really don't warn about -flto/-fno-lto :-(
This should fix the last bots.

llvm-svn: 225100
2015-01-03 00:06:04 +00:00
Craig Topper ae8e1b3831 [X86] Disassembler support for move to/from %rax with a 32-bit memory offset if REX.W and AdSize prefix are both present.
llvm-svn: 225099
2015-01-03 00:00:20 +00:00
Craig Topper 017b830564 [X86] Use 32-bit sign extended immediate for 64-bit LOCK_ArithBinOp with sign extended immediate.
llvm-svn: 225098
2015-01-03 00:00:14 +00:00
Chandler Carruth d3e2b4c0ea [PM] Add proper documentation for the ModulePassManager and
FunctionPassManager. These never got documented, likely due to the
clutter of this header file. This fixes another problem people noticed
when they started trying to use the new pass manager.

I've also used this to document the aspirational constraints I would
like to hold passes to. I don't really have a better place to document
such things at this point, but eventually will probably create a proper
.rst file and page for the LLVM pass infrastructure that carries such
high-level concerns.

llvm-svn: 225097
2015-01-02 23:34:39 +00:00
Chandler Carruth 4664057108 [PM] Actually include the correct file name. Sorry for the breakage.
llvm-svn: 225096
2015-01-02 23:25:16 +00:00
Rafael Espindola 16042fc2b9 Also avoid warning on -flto/-fno-lto on linux.
On OS X a .s file is preprocessed; on linux it is not, which is why the warning was still
showing up on linux but not OS X.

llvm-svn: 225095
2015-01-02 23:23:52 +00:00
Chandler Carruth 2ad2a8b943 [PM] Lift the majority of the template boilerplate used to implement the
concept-based polymorphism in the pass manager to a separate header.

I got feedback from someone reading the code and trying to use it that
this was really making it hard to dive in and start using these APIs and
that makes a lot of sense.

This only requires a moderate amount of gymnastics to separate in this
way, namely rinsing the PreservedAnalysis object through a template
argument in a few places so that it is dependent and we only examine it
on instantiation.

llvm-svn: 225094
2015-01-02 23:16:59 +00:00
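For readers unfamiliar with the concept-based polymorphism mentioned in r225094, the idea is that the manager stores passes behind a small abstract interface (the concept) and wraps each concrete pass in an adapter (the model), so pass types need no common base class. The Python sketch below only mirrors that shape; every name in it is hypothetical, and it passes a plain string through the pipeline instead of modelling IR units or preserved analyses.

```python
from abc import ABC, abstractmethod


class PassConcept(ABC):
    """Abstract interface the manager stores; passes never inherit from it."""

    @abstractmethod
    def run(self, ir):
        ...


class PassModel(PassConcept):
    """Adapter that wraps any object providing a run(ir) method."""

    def __init__(self, wrapped_pass):
        self._pass = wrapped_pass

    def run(self, ir):
        return self._pass.run(ir)


class PassManager:
    def __init__(self):
        self._passes = []

    def add_pass(self, any_pass):
        # Type erasure happens here: callers hand in unrelated pass types.
        self._passes.append(PassModel(any_pass))

    def run(self, ir):
        for wrapped in self._passes:
            ir = wrapped.run(ir)
        return ir


class StripPass:
    def run(self, text):
        return text.strip()


class UpperPass:
    def run(self, text):
        return text.upper()
```

`StripPass` and `UpperPass` share no base class, yet both can be added to the same `PassManager`; the adapter layer is the kind of boilerplate the commit moves to its own header.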
Rafael Espindola 5640ae48db Don't warn on unused -fno-lto.
It is somewhat common for CFLAGS to be used with .s files. We were
already ignoring -flto. This patch just does the same for -fno-lto.

llvm-svn: 225093
2015-01-02 22:56:15 +00:00
Chandler Carruth ce08983110 [PM] Fix some formatting where clang-format has improved recently.
llvm-svn: 225092
2015-01-02 22:51:44 +00:00
David Blaikie b9a23c9155 DebugInfo: Provide a less subtle way to set the debug location of simple ret instructions
un-XFAILing the test XFAIL'd in r225086 after it regressed in r225083.

llvm-svn: 225090
2015-01-02 22:07:26 +00:00
Saleem Abdulrasool 61770ab26f Driver: honour the clang-cl behaviour on ARM as well
Unfortunately, MSVC does not indicate to the driver what target is being used.
This means that we cannot correctly select the target architecture for the
clang_rt component.  This breaks down when targeting windows with the clang
driver as opposed to the clang-cl driver.  This should fix the native ARM
buildbot tests.

llvm-svn: 225089
2015-01-02 21:47:33 +00:00
Alexey Samsonov c426c337ed Revert "Revert r224736: "[Sanitizer] Make CommonFlags immutable after initialization.""
Fix test failures by introducing CommonFlags::CopyFrom() to make sure
compiler doesn't insert memcpy() calls into runtime code.

Original commit message:
Protect CommonFlags singleton by adding const qualifier to
common_flags() accessor. The only ways to modify the flags are
SetCommonFlagsDefaults(), ParseCommonFlagsFromString() and
OverrideCommonFlags() functions, which are only supposed to be
called during initialization.

llvm-svn: 225088
2015-01-02 21:28:37 +00:00
Saleem Abdulrasool 1d59f49f9c Driver: reuse getCompilerRT in place of addSanitizerRTWindows
The logic for addSanitizerRTWindows was performing the same logical operation as
getCompilerRT, which was previously fully generalised for Linux and Windows.
This avoids having a duplication of the logic for building up the name of a
clang_rt component.  This change does move the current limitation for Windows
into getArchNameForCompilerRTLib, where it is assumed that the architecture for
Windows is always i386.

llvm-svn: 225087
2015-01-02 20:00:55 +00:00
David Blaikie 5e9e13f54a Temporarily XFAIL fallout from r225083 while investigating.
Between this behavior and that fixed by r225083/r225000, I'll take the
latter over the former for now, but I'm immediately working on
understanding/addressing this behavior too.

(the fact that the code change in r225083 caused this change in behavior
is a bit troubling anyway - given that it looks & claims to be just a
performance thing)

llvm-svn: 225086
2015-01-02 19:49:28 +00:00
David Blaikie fcee870c17 DebugInfo: Remove some now-unnecessary location handling around function arguments.
r225000 generalized debug info line info handling for expressions such
that this code is no longer necessary.

This removes the last use of CGDebugInfo::getLocation, but not all the
uses of CGDebugInfo::CurLoc, which is still used internally in
CGDebugInfo. I'd like to do away with all of that & might succeed after
a few more patches.

llvm-svn: 225085
2015-01-02 19:49:10 +00:00
Philip Reames dfc238b45f Reformat statepoint documentation and fix a couple of typos
Patch by Ramkumar Ramachandra <artagnon@gmail.com>.

llvm-svn: 225084
2015-01-02 19:46:49 +00:00
David Blaikie ba90b04b7b DebugInfo: Fix cases where location failed to be updated after r225000
The optimization (that appears to have been here since the earliest
implementation (r50848) & has become more complicated over the years) to
avoid recreating the debugloc if it would be the same was out of date
because ApplyDebugLocation was not re-updating the CurLoc/PrevLoc. This
optimization doesn't look terribly beneficial/necessary, so I'm removing
it - if it turns up in benchmarks, I'm happy to reconsider/reimplement
this with justification, but for now it just seems to add
complexity/problems.

llvm-svn: 225083
2015-01-02 19:06:25 +00:00
Saleem Abdulrasool 434fedb8d8 ReaderWriter: teach the writer about IMAGE_REL_ARM_BRANCH24T
This adds support for IMAGE_REL_ARM_BRANCH24T relocations.  Similar to the
IMAGE_REL_ARM_BLX23T relocation, this relocation requires munging an
instruction.  The instruction encoding is quite similar, allowing us to reuse
the same munging implementation.  This is needed by the entry point stubs for
modules provided by MSVCRT.

llvm-svn: 225082
2015-01-02 18:51:59 +00:00
Saleem Abdulrasool f081873161 ReaderWriter: teach the writer about IMAGE_REL_ARM_BLX23T
This adds support for IMAGE_REL_ARM_BLX23T relocations.  Similar to the
IMAGE_REL_ARM_MOV32T relocation, this relocation requires munging an
instruction.  This inches us closer to supporting a basic hello world
application.

llvm-svn: 225081
2015-01-02 18:51:36 +00:00
Andrea Di Biagio 6477847ef4 Improved comments. No functional change intended.
llvm-svn: 225080
2015-01-02 10:47:46 +00:00
Chandler Carruth 6173e869eb Revert r224736: "[Sanitizer] Make CommonFlags immutable after initialization."
We've got some internal users that either aren't compatible with this or
have found a bug with it. Either way, this is an isolated cleanup and so
I'm reverting it to un-block folks while we investigate. Alexey and
I will be working on fixing everything up so this can be re-committed
soon. Sorry for the noise and any inconvenience.

llvm-svn: 225079
2015-01-02 09:59:38 +00:00
Craig Topper 4e5ab81a12 [X86] Bring some better consistency to the naming of the move to/from %al/ax/eax/rax with memory offset.
llvm-svn: 225078
2015-01-02 07:36:23 +00:00
David Majnemer c8a576b5c0 InstCombine: Detect when llvm.umul.with.overflow always overflows
We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero
and LHSKnownOne * RHSKnownOne overflow.

llvm-svn: 225077
2015-01-02 07:29:47 +00:00
David Majnemer 491331aca8 Analysis: Reformulate WillNotOverflowUnsignedMul for reusability
WillNotOverflowUnsignedMul's smarts will live in ValueTracking as
computeOverflowForUnsignedMul.  It now returns a tri-state result:
never overflows, always overflows and sometimes overflows.

llvm-svn: 225076
2015-01-02 07:29:43 +00:00
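The tri-state result introduced in r225076 can be sketched in Python using value ranges derived from known-zero and known-one bit masks. This is not the ValueTracking implementation: the function signature, the enum, and the explicit `bit_width` parameter are assumptions made to keep the example self-contained.

```python
from enum import Enum


class OverflowResult(Enum):
    NEVER_OVERFLOWS = "never"
    ALWAYS_OVERFLOWS = "always"
    MAY_OVERFLOW = "sometimes"


def compute_overflow_for_unsigned_mul(lhs_known_zero: int, lhs_known_one: int,
                                      rhs_known_zero: int, rhs_known_one: int,
                                      bit_width: int) -> OverflowResult:
    """Classify lhs * rhs for unsigned operands of the given width."""
    mask = (1 << bit_width) - 1
    # Largest possible values: every bit not known to be zero is set.
    lhs_max = ~lhs_known_zero & mask
    rhs_max = ~rhs_known_zero & mask
    if lhs_max * rhs_max <= mask:
        return OverflowResult.NEVER_OVERFLOWS
    # Smallest possible values: only the bits known to be one are set.
    lhs_min = lhs_known_one & mask
    rhs_min = rhs_known_one & mask
    if lhs_min * rhs_min > mask:
        return OverflowResult.ALWAYS_OVERFLOWS
    return OverflowResult.MAY_OVERFLOW
```

If even the largest operands cannot overflow, the answer is "never"; if even the smallest operands overflow, it is "always"; anything in between is "sometimes".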
Craig Topper 055845f5cb [X86] Make the instructions that use AdSize16/32/64 co-exist together without using mode predicates.
This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used.

Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction.

llvm-svn: 225075
2015-01-02 07:02:25 +00:00
Chandler Carruth 24ac830d7c [SROA] Teach SROA to be more aggressive in splitting now that we have
a pre-splitting pass over loads and stores.

Historically, splitting could cause enough problems that I hamstrung the
entire process with a requirement that splittable integer loads and
stores must cover the entire alloca. All smaller loads and stores were
unsplittable to prevent chaos from ensuing. With the new pre-splitting
logic that does load/store pair splitting I introduced in r225061, we
can now very nicely handle arbitrarily splittable loads and stores. In
order to fully benefit from these smarts, we need to mark all of the
integer loads and stores as splittable.

However, we don't actually want to rewrite partitions with all integer
loads and stores marked as splittable. This will fail to extract scalar
integers from aggregates, which is kind of the point of SROA. =] In
order to resolve this, what we really want to do is only do
pre-splitting on the alloca slices with integer loads and stores fully
splittable. This allows us to uncover all non-integer uses of the alloca
that would benefit from a split in an integer load or store (and where
introducing the split is safe because it is just memory transfer from
a load to a store). Once done, we make all the non-whole-alloca integer
loads and stores unsplittable just as they have historically been,
repartition and rewrite.

The result is that when there are integer loads and stores anywhere
within an alloca (such as from a memcpy of a sub-object of a larger
object), we can split them up if there are non-integer components to the
aggregate hiding beneath. I've added the challenging test cases to
demonstrate how this is able to promote to scalars even in a case where we
have *partially* overlapping loads and stores.

This restores the single-store behavior for small arrays of i8s which is
really nice. I've restored both the little endian testing and big endian
testing for these exactly as they were prior to r225061. It also forced
me to be more aggressive in an alignment test to actually defeat SROA.
=] Without the added volatiles there, we actually split up the weird i16
loads and produce nice double allocas with better alignment.

This also uncovered a number of bugs where we failed to handle
splittable load and store slices which didn't have a beginning offset of
zero. Those fixes are included, and without them the existing test cases
explode in glorious fireworks. =]

I've kept support for leaving whole-alloca integer loads and stores as
splittable even for the purpose of rewriting, but I think that's likely
no longer needed. With the new pre-splitting, we might be able to remove
all the splitting support for loads and stores from the rewriter. Not
doing that in this patch to try to isolate any performance regressions
that causes in an easy to find and revert chunk.

llvm-svn: 225074
2015-01-02 03:55:54 +00:00
Chandler Carruth 5986b541d4 [SROA] Make the computation of adjusted pointers not leak GEP
instructions.

I noticed this when working on dialing up how aggressively we can
pre-split loads and stores. My test case wasn't passing because dead
GEPs into the allocas persisted when they were built by this routine.
This isn't terribly harmful, we still rewrote and promoted the alloca
and I can't conceive of how to cause this to happen in a case where we
will keep the exact same alloca but rewrite and promote the uses of it.
If that ever happened, we'd get an assert out of mem2reg.

So I don't have a direct test case yet, but the subsequent commit's test
case wouldn't pass without this. There are other problems fixed by this
patch that I spotted purely by inspection such as the fact that
getAdjustedPtr could have actually deleted dead base pointers. I don't
know how to get a base pointer to go into getAdjustedPtr today, so
I think this bug could never have manifested (and I certainly can't
write a test case for it) but, it wasn't the intent of the code. The
code really just wanted to GC the new instructions built. That can be
done more directly by comparing with the base pointer which is the only
non-new instruction that this code can return.

llvm-svn: 225073
2015-01-02 02:47:38 +00:00
Saleem Abdulrasool 017822d81a ReaderWriter: teach the writer about IMAGE_REL_ARM_MOV32T
This adds support for the IMAGE_REL_ARM_MOV32T relocation.  This is one of the
most complicated relocations for the Windows on ARM target.  It involves
re-encoding an instruction to contain an immediate value which is the relocation
target.

llvm-svn: 225072
2015-01-02 02:32:05 +00:00
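To show what "re-encoding an instruction to contain an immediate" involves for r225072, here is a minimal Python sketch that writes a 32-bit target into a Thumb-2 MOVW/MOVT pair, whose 16-bit immediate is split into the imm4, i, imm3 and imm8 fields. It is not the linker's code: the function names are hypothetical, each instruction is represented as a pair of 16-bit halfwords, and any immediate already present is overwritten rather than treated as an addend.

```python
def encode_thumb2_imm16(hw1: int, hw2: int, imm16: int) -> tuple:
    """Place imm16 into a Thumb-2 MOVW/MOVT encoding (two halfwords)."""
    imm4 = (imm16 >> 12) & 0xF  # goes to bits 3:0 of the first halfword
    i = (imm16 >> 11) & 0x1     # goes to bit 10 of the first halfword
    imm3 = (imm16 >> 8) & 0x7   # goes to bits 14:12 of the second halfword
    imm8 = imm16 & 0xFF         # goes to bits 7:0 of the second halfword
    hw1 = (hw1 & ~0x040F & 0xFFFF) | (i << 10) | imm4
    hw2 = (hw2 & ~0x70FF & 0xFFFF) | (imm3 << 12) | imm8
    return hw1, hw2


def apply_mov32t(movw: tuple, movt: tuple, target: int) -> tuple:
    """Write the low half of target into MOVW and the high half into MOVT."""
    low = target & 0xFFFF
    high = (target >> 16) & 0xFFFF
    return encode_thumb2_imm16(*movw, low), encode_thumb2_imm16(*movt, high)
```

Starting from `movw r0, #0` (halfwords `0xF240, 0x0000`) and `movt r0, #0` (`0xF2C0, 0x0000`), applying the target `0x12345678` scatters `0x5678` across the first instruction and `0x1234` across the second while leaving the opcode and register bits untouched.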