Summary:
In RS4GC it is possible that a base pointer is contained in a vector that
has undergone a bitcast from one element-pointertype to another. We teach
RS4GC how to look through bitcasts of vector types when looking for a base
pointer.
Reviewers: anna
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38849
llvm-svn: 315694
Summary: We seem to inconsistently create CMOV nodes some with a Glue result and some without. But I can't find any cases that use the Glue result. So I've tried to remove all the place that did this.
Reviewers: RKSimon, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38664
llvm-svn: 315686
Summary:
Documentation says that user can specify sources for both "show" and
"report" commands. "Show" command respects specified sources, but "report" does
not. It is useful to have both "show" and "report" generated for specified
sources. Also added tests to for both commands with sources specified.
Reviewers: vsk, kcc
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D38860
llvm-svn: 315685
This patch adds timestamp verification for swiftmodule files. A new flag
is provided to allows us to disable this check in order to allow testing
of this feature.
Differential revision: https://reviews.llvm.org/D38686
llvm-svn: 315684
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.
This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.
This provides more information to users of SCEV and can be used to
improve identification of finite loops.
Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick
Reviewed by: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38825
llvm-svn: 315683
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)
I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.
llvm-svn: 315680
Minor doc update that the FileCheck matcher supports POSIX ERE.
It also fixes a minor issue in the regexp describing a variable
name: underscores are allowed too.
Differential Revision: https://reviews.llvm.org/D38787
llvm-svn: 315679
There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table.
llvm-svn: 315674
Summary:
Currently we do not correctly invalidate memoized results for add recurrences
that were created directly (i.e. they were not created from a `Value`). This
change fixes this by keeping loop use lists and using the loop use lists to
determine which SCEV expressions to invalidate.
Here are some statistics on the number of uses of in the use lists of all loops
on a clang bootstrap (config: release, no asserts):
Count: 731310
Min: 1
Mean: 8.555150
50th %time: 4
95th %tile: 25
99th %tile: 53
Max: 433
Reviewers: atrick, sunfish, mkazantsev
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38434
llvm-svn: 315672
I don't know if we ever hit this case or not. Turning it into an assert only fired on expanding some atomic operation in a SystemZ lit test.
llvm-svn: 315648
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.
This reverts commit r315633.
llvm-svn: 315637
The clang frontend already creates a DIExpression that replicates the
logic in addBlockByrefAddress() exactly, thus making this function
effectively unreachable. To guard against human error I'm hereby
marking the function with an assertion and let it hit the bots before
eventually removing it.
rdar://problem/31629055
llvm-svn: 315636
Summary:
As the first step to allow analysis and visualization of xray collected data,
allow using the llvm-xray stacks tool to emit a complete listing of stacks in
the format consumable by a flamegraph tool.
Possible follow up formats include chrome trace viewer format and sql load
files.
As a POC, I'm able to generate flamegraphs of an xray instrumented llc compiling
hello world.
Reviewers: dberris, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38650
llvm-svn: 315635
Summary:
This patch adds processing of binary operations when the def of operands are in
the same block (i.e. local processing).
Earlier we bailed out in such cases (the bail out was introduced in rL252032)
because LVI at that time was more precise about context at the end of basic
blocks, which implied local def and use analysis didn't benefit CVP.
Since then we've added support for LVI in presence of assumes and guards. The
test cases added show how local def processing in CVP helps adding more
information to the ashr, sdiv, srem and add operators.
Note: processCmp which suffers from the same problem will
be handled in a later patch.
Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38766
llvm-svn: 315634
Merge LLVMTargetMachine into TargetMachine.
- There is no in-tree target anymore that just implements TargetMachine
but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
interface.
Differential Revision: https://reviews.llvm.org/D38489
llvm-svn: 315633
This paves the way for other projects which might /use/ clang or
lld but not necessarily need to the full set of functionality
available to clang and lld tests to be able to have a basic set
of substitutions that allow a project to run the clang or lld
executables.
llvm-svn: 315627
The comparator passed to std::sort must provide a strict weak ordering;
otherwise, the behavior is undefined.
Fixes an assertion failure generating debug info for globals
split by GlobalOpt. I have a testcase, but not sure how to reduce it,
so not included here. (Someone else came up with a testcase, but I
can't reproduce the crash with it, presumably because my version of LLVM
ends up sorting the array differently.)
This isn't really a complete fix (see the FIXME in the patch), but at
least it doesn't have undefined behavior.
Differential Revision: https://reviews.llvm.org/D38830
llvm-svn: 315619
This is a follow up for the loop predication change 313981 to support ule, sle latch predicates.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D38177
llvm-svn: 315616
WMMA = "Warp Level Matrix Multiply-Accumulate".
These are the new instructions introduced in PTX6.0 and available
on sm_70 GPUs.
Differential Revision: https://reviews.llvm.org/D38645
llvm-svn: 315601
Building with BUILD_SHARED_LIBS makes it tricky to copy around
executables at will, since they won't be able to find the LLVM
libraries any more. This makes testing a feature that's based on the
executable name problematic, so we'll just disable these two tests in
that configuration.
We could potentially fix this by symlinking the lib directory into the
test directory, but that wouldn't work on windows, and losing testing
on windows would be far worse than losing testing on a configuration
that's barely even supported.
llvm-svn: 315599
In r315079 I added a check for the ERROR_CALL_NOT_IMPLEMENTED error
code, but it turns out earlier versions of Wine just returned false
without setting any error code.
This patch handles the unset error code case.
llvm-svn: 315597
Add profitability checks for modifying counted loops to use the mtctr instruction.
The latency of mtctr is only justified if there are more than 4 comparisons that
will be removed as a result. Usually counted loops are formed relatively early
and before unrolling, so most low trip count loops often don't survive. However
we want to ensure that if they do, we do not mistakenly update them to mtctr loops.
Use CodeMetrics to ensure we are only doing this for small loops with small trip counts.
Differential Revision: https://reviews.llvm.org/D38212
llvm-svn: 315592
Summary:
The interpolation mode workaround ensures that at least one
interpolation mode is enabled in PSInputAddr. It does not also check
PSInputEna on the basis that the user might enable bits in that
depending on run-time state.
However, for amdpal os type, the user does not enable some bits after
compilation based on run-time states; the register values being
generated here are the final ones set in the hardware. Therefore, apply
the workaround to PSInputAddr and PSInputEnable together. (The case
where a bit is set in PSInputAddr but not in PSInputEnable is where the
frontend set up an input arg for a particular interpolation mode, but
nothing uses that input arg. Really we should have an earlier pass that
removes such an arg.)
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D37758
llvm-svn: 315591
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.
Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.
Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.
Differential Revision: https://reviews.llvm.org/D38406
llvm-svn: 315590
MachineInstr::isIdenticalTo has a lot of logic for dealing with register
Defs (i.e. deciding whether to take them into account or ignore them).
This logic gets things wrong in some obscure cases, for instance if an
operand is not a Def for both the current MI and the one we are
comparing to.
I'm not sure if it's possible for this to happen for regular register
operands, but it may happen in the ARM backend for special operands
which use sentinel values for the register (i.e. 0, which is neither a
physical register nor a virtual one).
This causes MachineInstrExpressionTrait::isEqual (which uses
MachineInstr::isIdenticalTo) to return true for the following
instructions, which are the same except for the fact that one sets the
flags and the other one doesn't:
%1114 = ADDrsi %1113, %216, 17, 14, _, def _
%1115 = ADDrsi %1113, %216, 17, 14, _, _
OTOH, MachineInstrExpressionTrait::getHashValue returns different values
for the 2 instructions due to the different isDef on the last operand.
In practice this means that when trying to add those instructions to a
DenseMap, they will be considered different because of their different
hash values, but when growing the map we might get an assertion while
copying from the old buckets to the new buckets because isEqual
misleadingly returns true.
This patch makes sure that isEqual and getHashValue agree, by improving
the checks in MachineInstr::isIdenticalTo when we are ignoring virtual
register definitions (which is what the Trait uses). Firstly, instead of
checking isPhysicalRegister, we use !isVirtualRegister, so that we cover
both physical registers and sentinel values. Secondly, instead of
checking MachineOperand::isReg, we use MachineOperand::isIdenticalTo,
which checks isReg, isSubReg and isDef, which are the same values that
the hash function uses to compute the hash.
Note that the function is symmetric with this change, since if the
current operand is not a Def, we check MachineOperand::isIdenticalTo,
which returns false if the operands have different isDef's.
Differential Revision: https://reviews.llvm.org/D38789
llvm-svn: 315579
While this shouldn't be necessary anymore, we have cases where we run
into the assertion below, i.e. cases with two non-fragment entries for the
same variable at different frame indices.
This should be fixed, but for now, we should revert to a version that
does not trigger asserts.
llvm-svn: 315576
This patch fixes the bug introduced in https://reviews.llvm.org/D35907; the bug is reported by http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171002/491452.html.
Before D35907, when GetUnderlyingObjects fails to find an identifiable object, allMMOsOkay lambda in getUnderlyingObjectsForInstr returns false and Objects vector is cleared. This behavior is unintentionally changed by D35907.
This patch makes the behavior for such case same as the previous behavior.
Since D35907 introduced a wrapper function getUnderlyingObjectsForCodeGen around GetUnderlyingObjects, getUnderlyingObjectsForCodeGen is modified to return a boolean value to ask the caller to clear the Objects vector.
Differential Revision: https://reviews.llvm.org/D38735
llvm-svn: 315565
Summary:
The comments in the code said
// Remove <def,read-undef> flags. This def is now a partial redef.
but the code didn't just remove read-undef, it could introduce new ones which
could cause errors.
E.g. if we have something like
%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7
and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.
Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.
Reviewers: qcolombet, MatzeB
Reviewed By: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38616
llvm-svn: 315564
Here we add a secondary option parser to llvm-isel-fuzzer (and provide
it for use with other fuzzers). With this, you can copy the fuzzer to
a name like llvm-isel-fuzzer=aarch64-gisel for a fuzzer that fuzzer
AArch64 with GlobalISel enabled, or fuzzer=x86_64 to fuzz x86, with no
flags required. This should be useful for running these in OSS-Fuzz.
Note that this handrolls a subset of cl::opts to recognize, rather
than embedding a complete command parser for argv[0]. If we find we
really need the flexibility of handling arbitrary options at some
point we can rethink this.
This re-applies 315545 using "=" instead of ":" as a separator for
arguments.
llvm-svn: 315557
The llvm-cfi-verify unit tests fail if LLVM is built without the X86
target, disable the unit tests from being built unless X86 is enabled
for now.
llvm-svn: 315556
It broke some tests on Windows:
Failing Tests (4):
LLVM :: tools/llvm-isel-fuzzer/execname-options.ll
LLVM :: tools/llvm-isel-fuzzer/missing-triple.ll
LLVM :: tools/llvm-isel-fuzzer/x86-empty-bc.ll
LLVM :: tools/llvm-isel-fuzzer/x86-empty.ll
> llvm-isel-fuzzer: Handle a subset of backend flags in the executable name
>
> Here we add a secondary option parser to llvm-isel-fuzzer (and provide
> it for use with other fuzzers). With this, you can copy the fuzzer to
> a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer
> AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no
> flags required. This should be useful for running these in OSS-Fuzz.
>
> Note that this handrolls a subset of cl::opts to recognize, rather
> than embedding a complete command parser for argv[0]. If we find we
> really need the flexibility of handling arbitrary options at some
> point we can rethink this.
llvm-svn: 315554
Here we add a secondary option parser to llvm-isel-fuzzer (and provide
it for use with other fuzzers). With this, you can copy the fuzzer to
a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer
AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no
flags required. This should be useful for running these in OSS-Fuzz.
Note that this handrolls a subset of cl::opts to recognize, rather
than embedding a complete command parser for argv[0]. If we find we
really need the flexibility of handling arbitrary options at some
point we can rethink this.
llvm-svn: 315545
This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5.
This is failing due to some code that isn't built on MSVC
so I didn't catch. Not immediately obvious how to fix this
at first glance, so I'm reverting for now.
llvm-svn: 315536
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.
llvm-svn: 315531
There's a lot of misuse of Twine scattered around LLVM. This
ranges in severity from benign (returning a Twine from a function
by value that is just a string literal) to pretty sketchy (storing
a Twine by value in a class). While there are some uses for
copying Twines, most of the very compelling ones are confined
to the Twine class implementation itself, and other uses are
either dubious or easily worked around.
This patch makes Twine's copy constructor private, and fixes up
all callsites.
Differential Revision: https://reviews.llvm.org/D38767
llvm-svn: 315530
- Use HSA metadata streamer directly from AMDGPUAsmPrinter
- Make naming consistent with PAL metadata
Differential Revision: https://reviews.llvm.org/D38746
llvm-svn: 315526
- Move PAL metadata definitions to AMDGPUMetadata
- Make naming consistent with HSA metadata
Differential Revision: https://reviews.llvm.org/D38745
llvm-svn: 315523
- Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change)
- Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer
- Introduce HSAMD namespace
- Other minor name changes in function and test names
llvm-svn: 315522
In r315079, fs::rename was reimplemented in terms of CreateFile and
SetFileInformationByHandle. Unfortunately, the latter isn't supported by
Wine. This adds a fallback to MoveFileEx for that case.
Differential Revision: https://reviews.llvm.org/D38817
llvm-svn: 315520
Summary:
This adds a set of new directives that describe 32-bit x86 prologues.
The directives are limited and do not expose the full complexity of
codeview FPO data. They are merely a convenience for the compiler to
generate more readable assembly so we don't need to generate tons of
labels in CodeGen. If our prologue emission changes in the future, we
can change the set of available directives to suit our needs. These are
modelled after the .seh_ directives, which use a different format that
interacts with exception handling.
The directives are:
.cv_fpo_proc _foo
.cv_fpo_pushreg ebp/ebx/etc
.cv_fpo_setframe ebp/esi/etc
.cv_fpo_stackalloc 200
.cv_fpo_endprologue
.cv_fpo_endproc
.cv_fpo_data _foo
I tried to follow the implementation of ARM EHABI CFI directives by
sinking most directives out of MCStreamer and into X86TargetStreamer.
This helps avoid polluting non-X86 code with WinCOFF specific logic.
I used cdb to confirm that this can show locals in parent CSRs in a few
cases, most importantly the one where we use ESI as a frame pointer,
i.e. the one in http://crbug.com/756153#c28
Once we have cdb integration in debuginfo-tests, we can add integration
tests there.
Reviewers: majnemer, hans
Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D38776
llvm-svn: 315513
Summary: Move llvm-cfi-verify into a class in preparation for CFI analysis to come.
Reviewers: vlad.tsyrklevich
Reviewed By: vlad.tsyrklevich
Subscribers: mgorny, llvm-commits, pcc, kcc
Differential Revision: https://reviews.llvm.org/D38379
llvm-svn: 315504
Summary:
Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled.
Patch by Paul Walker.
Reviewers: fhahn, Gerolf, efriedma, MatzeB
Reviewed By: fhahn
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38734
llvm-svn: 315502
Currently we produce a bunch of unnecessary code when emitting the
prologue/epilogue for spills/restores. Namely, if the load from stack
slot/store to stack slot instruction is an X-Form instruction, we will
always produce an LIS/ORI sequence for the stack offset.
Furthermore, we have not exploited the P9 vector D-Form loads/stores for this
purpose.
This patch address both issues.
Specifying the D-Form load as the instruction to use for stack spills/reloads
should be safe because:
1. The stack should be aligned according to the ABI
2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will
check for the offset being a multiple of 16 and will convert it to an
X-Form instruction if it isn't.
Differential Revision : https://reviews.llvm.org/D38758
llvm-svn: 315500
Previously we would only look in the current directory for a
resource, which might not be the same as the directory of the
rc file. Furthermore, MSVC rc supports a /I option, and can
also look in the system environment. This patch adds support
for this search algorithm.
Differential Revision: https://reviews.llvm.org/D38740
llvm-svn: 315499
Summary:
This patch fixes an error in the patch to ScalarEvolution::createAddRecFromPHIWithCastsImpl
made in D37265. In that patch we handle the cases where the either the start or accum values can be
zero after truncation. But, we assume that the start value must be a constant if the accum is
zero. This is clearly an erroneous assumption. This change removes that assumption.
Reviewers: sanjoy, dorit, mkazantsev
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38814
llvm-svn: 315491
Legalization of fp128 assumes things that we should have asserts for,
so that's another potential improvement.
Differential Revision: https://reviews.llvm.org/D38771
llvm-svn: 315485
ubsan caught an issue I made where I was converting a null pointer to a
reference.
elf utils implements a particularly extreme form of stripping that I'd
like to support. eu-strip has an option called "strip-sections" that
removes all section headers and leaves only program headers and the
segment data. I have implemented this option partly as a test but mainly
because in Fuchsia we would like to use this option to minimize the size
of our executables. The other strip options that are on my list include
--strip-all and --strip-debug. This is a preliminary implementation that
I'd like to start using in Fuchsia builds if possible. This change
implements such a stripping option for llvm-objcopy
Differential Revision: https://reviews.llvm.org/D38335
llvm-svn: 315484
parameterized emit() calls
Summary: This is not functional change to adopt new emit() API added in r313691.
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38285
llvm-svn: 315476
The software pipeliner and the packetizer try to break dependence
between the post-increment instruction and the dependent memory
instructions by changing the base register and the offset value.
However, in some cases, the existing logic didn't work properly
and created incorrect offset value.
Patch by Jyotsna Verma.
llvm-svn: 315468
The pipeliner is generating a serial sequence that causes poor
register allocation when a post-increment instruction appears
prior to the use of the post-increment register. This occurs when
there is a circular set of dependences involved with a sequence
of instructions in the same cycle. In this case, there is no
serialization of the parallel semantics that will not cause an
additional register to be allocated.
This patch fixes the problem by changing the instructions so that
the post-increment instruction is used by the subsequent
instruction, which enables the register allocator to make a
better decision and not require another register.
Patch by Brendon Cahoon.
llvm-svn: 315466
Eg:
insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7}
This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector.
We may want to abandon that one if we can't find value in squashing the more specific pattern sooner.
We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles.
There may be room for improvement in the shuffle lowering here, but that would be follow-up work.
Differential Revision: https://reviews.llvm.org/D38388
llvm-svn: 315460
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).
Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.
If your backend uses the IsFixed property or directly accesses NumFixedArgs,
it is _possible_ this change could result in codegen changes (although the
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.
Differential Revision: https://reviews.llvm.org/D37898
llvm-svn: 315457
This patch adds timestamp verification for swiftmodule files.
- A new flag is provided to allows us to continue testing of the code
for embedding the__swift_ast. (git doesn't maintain timestamps)
- Adds a new test for fat (arm) binaries.
Differential revision: https://reviews.llvm.org/D38686
llvm-svn: 315456
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.
The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.
llvm-svn: 315445
Adding this test files now so after another commit that will add a new pattern for
TESTM and TESTNM instructions will show the improvemnts that have been done.
Change-Id: If3908b7f91897d764053312365a2bc1de78b291d
llvm-svn: 315443
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
int array[LEN];
...
guard(0 <= index && index < LEN);
use(array[index]);
Differential Revision: https://reviews.llvm.org/D37460
llvm-svn: 315440
Sinking of unordered atomic load into loop must be disallowed because it turns
a single load into multiple loads. The relevant section of the documentation
is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for
Optimizers section. Here is the full text of this section:
> Notes for optimizers
> In terms of the optimizer, this **prohibits any transformation that
> transforms a single load into multiple loads**, transforms a store into
> multiple stores, narrows a store, or stores a value which would not be
> stored otherwise. Some examples of unsafe optimizations are narrowing
> an assignment into a bitfield, rematerializing a load, and turning loads
> and stores into a memcpy call. Reordering unordered operations is safe,
> though, and optimizers should take advantage of that because unordered
> operations are common in languages that need them.
Patch by Daniil Suchkov!
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D38392
llvm-svn: 315438
IRCE should not apply when the safe iteration range is proved to be empty.
In this case we do unneeded job creating pre/post loops and then never
go to the main loop.
This patch makes IRCE not apply to empty safe ranges, adds test for this
situation and also modifies one of existing tests where it used to happen
slightly.
Reviewed By: anna
Differential Revision: https://reviews.llvm.org/D38577
llvm-svn: 315437
elf utils implements a particularly extreme form of stripping that I'd
like to support. eu-strip has an option called "strip-sections" that
removes all section headers and leaves only program headers and the
segment data. I have implemented this option partly as a test but mainly
because in Fuchsia we would like to use this option to minimize the size
of our executables. The other strip options that are on my list include
--strip-all and --strip-debug. This is a preliminary implementation that
I'd like to start using in Fuchsia builds if possible. This change
implements such a stripping option for llvm-objcopy
Differential Revision: https://reviews.llvm.org/D38335
llvm-svn: 315412
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.
llvm-svn: 315410
Of course, casting an unsigned value too large for 'int' is UB. So,
write out the ternary. LLVM folds it to ADD anyway.
Fixes the warning from r303693 a different way.
Thanks to Erich Keane for pointing this out!
llvm-svn: 315406
Similarly to how Instruction has getFunction, this adds a less verbose
way to write MI->getParent()->getParent(). I'll follow up shortly with
a change that changes a bunch of the uses.
llvm-svn: 315388
This change adds the ability to use the "-R"/"-remove-section" option
multiple times.
Differential Revision: https://reviews.llvm.org/D38332
llvm-svn: 315385
This allows clients to avoid an unnecessary fs::status() call on each
directory entry. Because the information returned by FindFirstFileEx
is a subset of the information returned by a regular status() call,
I needed to extract a base class from file_status that contains only
that information.
On my machine, this reduces the time required to enumerate a ThinLTO
cache directory containing 520k files from almost 4 minutes to less
than 2 seconds.
Differential Revision: https://reviews.llvm.org/D38716
llvm-svn: 315378