These symbols were previously not being marked as functions
so were appearing as globals instead, and with the incorrect
relocation type.
Without this fix, objects that take address of external
functions include them as global imports rather than function
imports which then fails at link time.
Differential Revision: https://reviews.llvm.org/D34068
llvm-svn: 305263
This revert was done so that my other patch to add test framework could
land separately. Now the revert can be reverted and this patch can
reland.
This reverts commit 18b3c75b2b0d32601fb60a06b9672c33d6f0dff9.
llvm-svn: 305259
Summary: Added test cases for multiple machine types, file merging, multiple languages, and more resource types. Also fixed new bugs these tests exposed.
Subscribers: javed.absar, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34047
llvm-svn: 305258
I accidentally combined this patch with one for adding more tests, they
should be separated.
This reverts commit 3da218a523be78df32e637d3446ecf97c9ea0465.
llvm-svn: 305257
Previously we were writing the value function index space
value but for these types of relocations we want to be
writing the table element index space value.
Add a test case for these relocation types that fails
without this change.
Differential Revision: https://reviews.llvm.org/D33962
llvm-svn: 305253
User has 3 signatures for operator new today. They take a single size, a size and a number of users, and a size, number of users, and descriptor size.
Historically there used to only be one signature that took size and a number of uses. Long ago derived classes implemented their own versions that took just a size and would call the size and use count version. Then they left an unimplemented signature for the size and use count signature from User. As we moved to C++11 this unimplemented signature because = delete.
Since then operator new has picked up two new signatures for operator new. But when the 3 argument version was added it was never added to the delete list in all of the derived classes where the 2 argument version is deleted. This makes things inconsistent.
I believe once one version of operator new is created in a derived class name hiding will take care of making all of the base class signatures unavailable. So I don't think the deleted lines are needed at all.
This patch removes all of the deletes in cases where there is an override or there is already a delete of another signature (that should trigger name hiding too).
Differential Revision: https://reviews.llvm.org/D34120
llvm-svn: 305251
When we get an unknown symbol type, we might as well at least
dump it. Same goes for round-tripping through YAML, we can
dump the record contents as raw bytes even if we don't know
how to interpret it semantically.
llvm-svn: 305248
This fixes PR33157.
https://bugs.llvm.org//show_bug.cgi?id=33157
We might also think about disallowing duplicate dbg.declare intrinsics
entirely, but this may complicate some passes needlessly.
llvm-svn: 305244
It doesn't seem relevant to set an address space limit - this isn't
important in any sense that I'm aware & it gets in the way of things
that use a lot of address space, like llvm-symbolizer.
This came up when I realized that bugpoint regression tests were much
slower with -gsplit-dwarf than plain -g. Turned out that bugpoint
subprocesses (opt, etc) were crashing and doing symbolization - but
bugpoint runs those subprocesses with a 400MB memory limit. So with
plain -g, mmaping the opt binary would exceed the memory limit, fail,
and thus be really fast - no symbolization occurred. Whereas with
-gsplit-dwarf, comically, having less to map in, it would succeed and
then spend lots of time symbolizing.
I've fixed at least the critical part of bugpoint's perf problem there
by adding an option to allow bugpoint to disable symbolization. Thus
improving the perfromance for -gsplit-dwarf and making the -g-esque
speed available without this quirk/accidental benefit.
llvm-svn: 305242
The last fix required the user to manually add the required
feature. This caused an LLD test to fail because I failed to
update LLD. In practice we can hide this logic so it can just
be transparently added when we write the PDB.
llvm-svn: 305236
Older PDBs don't have this. Its presence is detected by using
the various "feature" flags that come at the end of the PDB
Stream. Detect this, and don't try to dump the ID stream if the
features tells us it's not present.
llvm-svn: 305235
Summary:
After RS4GC, we should drop metadata that is no longer valid. These metadata
is used by optimizations scheduled after RS4GC, and can cause a miscompile.
One such metadata is invariant.load which is used by LICM sinking transform.
After rewriting statepoints, the address of a load maybe relocated. With
invariant.load metadata on a load instruction, LICM sinking assumes the
loaded value (from a dererenceable address) to be invariant, and
rematerializes the load operand and the load at the exit block.
This transforms the IR to have an unrelocated use of the
address after a statepoint, which is incorrect.
Other metadata we conservatively remove are related to
dereferenceability and noalias metadata.
This patch drops such metadata on store and load instructions after
rewriting statepoints.
Reviewers: reames, sanjoy, apilipenko
Reviewed by: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33756
llvm-svn: 305234
This is a precursor to another change (coming soon) that aims to make
FoldingSet's API more type-safe. Without this, the type-safety change
would just duplicate 4 more public methods between the already very
similar classes.
This renames FoldingSetImpl to FoldingSetBase so it's consistent with
the FooBase -> FooImpl<T> -> Foo<T> convention we seem to have with
other containers.
llvm-svn: 305231
The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr
rather than the SP, so we need to use different variants.
Situations where this actually comes up are rare enough (see test-case) that I
think falling back to DAG is fine.
llvm-svn: 305230
Static data members were causing a problem because I mistakenly
assumed all members would affect a class's layout and so the
Layout member would be non-null.
llvm-svn: 305229
Fix thinko/typo in subreg aware liverange splitting logic. I'm not sure
how to write a proper testcase for this. The original problem only
happens on an out-of-tree target. Forcing subreg enabled targets to
spill and split in a predictable way is near impossible.
llvm-svn: 305228
Summary:
Use the filepath used to open the archive member as the archive member
name instead of the file basename. This path might be absolute or
relative. This is important because the archive member name will show
up in the PDB, and we want our PDBs to look as much like MSVC's as
possible.
This also helps avoid an issue in our PDB module descriptor writing
code, which assumes that all module names are unique. Relative paths
still aren't guaranteed to be unique, but they're much better than
basenames, which definitely aren't unique.
Reviewers: ruiu, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33575
llvm-svn: 305223
Power9 has instructions that will reverse the bytes within an element for all
sizes (half-word, word, double-word and quad-word). These can be used for the
vec_revb builtins in altivec.h. However, we implement these to match vector
shuffle nodes as that will cover both the builtins and vector shuffles that
occur in the SDAG through other means.
Differential Revision: https://reviews.llvm.org/D33690
llvm-svn: 305214
Note that if we need the result of both the divide and the modulo then we
compute the modulo based on the result of the divide and not using the new
hardware instruction.
Commit on behalf of STEFAN PINTILIE.
Differential Revision: https://reviews.llvm.org/D33940
llvm-svn: 305210
Summary:
This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU
target triples. This optimization was being inhibited when -ffast-math
wasn't set because sincos in GLibC does not set errno, while sin and cos
do. However, this optimization will only run if the attributes on the
sin/cos calls include readnone, which is how clang represents the fact
that it doesn't care about the errno values set by these functions (via
the -fno-math-errno flag).
Reviewers: hfinkel, bogner
Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond
Differential Revision: https://reviews.llvm.org/D32921
llvm-svn: 305204
Summary:
The old check for slot overlap treated 2 slots `S` and `T` as
overlapping if there existed a CFG node in which both of the slots could
possibly be active. That is overly conservative and caused stack blowups
in Rust programs. Instead, check whether there is a single CFG node in
which both of the slots are possibly active *together*.
Fixes PR32488.
Patch by Ariel Ben-Yehuda <ariel.byd@gmail.com>
Reviewers: thanm, nagisa, llvm-commits, efriedma, rnk
Reviewed By: thanm
Subscribers: dotdash
Differential Revision: https://reviews.llvm.org/D31583
llvm-svn: 305193
This step is just intended to reduce code duplication rather than change any functionality.
A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper.
Differential Revision: https://reviews.llvm.org/D33649
llvm-svn: 305192
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.
Reviewers: chandlerc, rnk, reames
Reviewed By: reames
Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D33903
llvm-svn: 305189
First possible step towards merging SSE/AVX memory folding pattern fragments.
Also allows us to remove the duplicate non-temporal load logic.
Differential Revision: https://reviews.llvm.org/D33902
llvm-svn: 305184
I was looking closer at the x86 test diffs in D33866, and the first change seems like it
shouldn't happen in the first place. So this patch will resolve that.
Using Agner's tables and AMD docs, vperm2f128 and vinsertf128 have identical timing for
any given CPU model, so we should be able to interchange those without affecting perf.
But as we can see in some of the diffs here, using vperm2f128 allows load folding, so
we should take that opportunity to reduce code size and register pressure.
A secondary advantage is making AVX1 and AVX2 codegen more similar. Given that vperm2f128
was introduced with AVX1, we should be selecting it in all of the same situations that we
would with AVX2. If there's some reason that an AVX1 CPU would not want to use this
instruction, that should be fixed up in a later pass.
Differential Revision: https://reviews.llvm.org/D33938
llvm-svn: 305171
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.
Reviewers: joerg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34088
llvm-svn: 305162
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline. We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.
This is turned off by default. A follow-up change will turn it on to
allow for easier reversion in case it breaks something.
llvm-svn: 305146
We're currently passing endian-ness around as a param (and not uniformly),
so this eliminates the need for that. I'd like to add a constant fold
call too, and that requires a DL.
llvm-svn: 305129
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)
Reviewers: kristof.beyls, jmolloy, SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33734
llvm-svn: 305127
Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input.
llvm-svn: 305114
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.
Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.
This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.
Diffferential Revision: https://reviews.llvm.org/D33298
llvm-svn: 305107
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.
Reviewers: majnemer
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D33179
llvm-svn: 305102
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method. This led to quite cumbersome
implementation challenges and awkwardness of use. This patch
brings back support for stateful extractors, making the
implementation and usage simpler.
llvm-svn: 305093
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.
Reviewers: hiraditya!, zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34020
llvm-svn: 305092
Summary:
Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges.
This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE.
Reviewers: anna, reames, resistor, Farhana
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34000
llvm-svn: 305086
Summary:
During DAG legalization loop in SelectionDAG::Legalize(),
bookkeeping of the SDNodes that were already legalized is implemented
with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers
to objects, not the objects themselves. Unfortunately, if SDNode is
deleted during legalization for some reason, LegalizedNodes set is not
informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse
space deallocated after deletion of unused nodes, for creation of new
ones. Because of this, new nodes, created during legalization often can
have pointers identical to ones that have been previously legalized,
added to the LegalizedNodes set, and deleted afterwards. This in turn
causes, that newly created nodes, sharing the same pointer as deleted
old ones, are present in LegalizedNodes *already at the moment of
creation*, so we never call Legalize on them.
The fix facilitates the fact, that DAG notifies listeners about each
modification. I have registered DAGNodeDeletedListener inside
SelectionDAG::Legalize, with a callback function that removes any
pointer of any deleted SDNode from the LegalizedNodes set. With this
modification, LegalizeNodes set does not contain pointers to nodes that
were deleted, so newly created nodes can always be inserted to it, even
if they share pointers with old deleted nodes.
Patch by pawel.szczerbuk@intel.com
The issue this patch addresses causes failures in an out-of-tree target,
and i was not able to create a reproducer for an in-tree target, hence
there is no test-case.
Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33891
llvm-svn: 305084
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.
Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.
By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
The previous version of this patch had a "conditional move or jump depends on
uninitialized value".
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
llvm-svn: 305083
Summary:
Alloca promotion pass not dealing with non-canonical input
Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)
Also added some test cases in non-canonical form to check that it no longer crashes
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31710
llvm-svn: 305079
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka.
Differential Revision: https://reviews.llvm.org/D34039
llvm-svn: 305078
When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.
Patch by Alexandru Guduleasa!
Reviewers: niravd, grosbach, llvm-commits
Reviewed By: niravd
Differential Revision: https://reviews.llvm.org/D33993
llvm-svn: 305077
Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.
The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.
I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.
Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.
Reviewers: dberris, rengolin
Reviewed By: dberris
Subscribers: llvm-commits, iid_iunknown
Differential Revision: https://reviews.llvm.org/D34027
llvm-svn: 305072
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.
Reviewers: bkramer, spatel, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33731
llvm-svn: 305070
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.
Differential Revision: https://reviews.llvm.org/D34040
llvm-svn: 305064
Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).
Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.
I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?
- should be OK since it defaults to "pretty", so if the crash is very
early in opt parsing, etc, then crash reports will still be symbolized.
I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.
Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)
Reviewers: chandlerc, dannyb
Differential Revision: https://reviews.llvm.org/D33804
llvm-svn: 305056
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.
Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979
llvm-svn: 305055
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.
llvm-svn: 305052
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.
Reviewers: dberlin, hfinkel, spatel
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33604
llvm-svn: 305049
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.
Fixes PR33335.
Reviewers: zturner, hiraditya!
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33968
llvm-svn: 305043
This is a preparatory change to expose the debug compression style to
clang. It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.
Minor tweak to the ELF Object Writer to use a variable for re-used
values. Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.
llvm-svn: 305038
This adds support for Symbols, StringTable, and FrameData subsection
types. Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB. The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them. And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time. Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.
llvm-svn: 305037
This is the same change for the YAML Output style applied to the
raw output style. Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order. This was because some subsections need to be
read first in order to properly dump later subsections. This
patch allows them to be dumped in the order they appear.
Differential Revision: https://reviews.llvm.org/D34015
llvm-svn: 305034
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
This is the second attempt to land this change after fixing PR33316.
llvm-svn: 305031
No need in reinterpret_cast<StringTableOffset &> here, as struct coff_symbol Name is a unin
with the member StringTableOffset Offset. This union member could be accessed directly.
llvm-svn: 305029
These used to be virtual methods that would enable doing the right thing with only a TerminatorInst pointer. I believe they were also acting as vtable anchors in my cases. I think the fact that they had a separate name ending in V was to allow a version without V to be called without a virtual call in a pre-C++11 final keyword world.
Where possible the base methods in TerminatorInst dispatch directly to the public methods in the classes that have the same signature. For some classes this wasn't possible so I've left private method versions that match the name and signature of the version in TerminatorInst. All versions have been moved into the class definitions since we no longer need vtable anchors here.
Differential Revision: https://reviews.llvm.org/D34011
llvm-svn: 305028
This is to prepare to allow for dead stripping of globals in the
merged modules.
Differential Revision: https://reviews.llvm.org/D33921
llvm-svn: 305027
This check is a requirement of the irsymtab builder, not of any
particular caller.
Differential Revision: https://reviews.llvm.org/D33970
llvm-svn: 305023
This data type includes the contents of a bitcode file.
Right now a bitcode file can only contain modules, but
a later change will add a symbol table.
Differential Revision: https://reviews.llvm.org/D33969
llvm-svn: 305019
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.
(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.
(2) MachineVerifier: updated the test per MatzeB's review.
(3) +testcase
Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!
Differential Revision: https://reviews.llvm.org/D33947
llvm-svn: 305016
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.
Reviewers: davidxl, dnovillo, iteratee
Reviewed By: iteratee
Subscribers: iteratee, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D34017
llvm-svn: 305009
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.
This patch fixed PR32442.
Differential Revision: https://reviews.llvm.org/D31407
llvm-svn: 305001
The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.
Differential Revision: https://reviews.llvm.org/D33783
llvm-svn: 304998
The test diff for PowerPC shows we can better optimize if this case is one block.
For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed
cheap and SDAG can't optimize across blocks.
Instead of this:
_cmp_eq8:
movq (%rdi), %rax
cmpq (%rsi), %rax
je LBB23_1
## BB#2: ## %res_block
movl $1, %ecx
jmp LBB23_3
LBB23_1:
xorl %ecx, %ecx
LBB23_3: ## %endblock
xorl %eax, %eax
testl %ecx, %ecx
sete %al
retq
We get this:
cmp_eq8:
movq (%rdi), %rcx
xorl %eax, %eax
cmpq (%rsi), %rcx
sete %al
retq
And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall().
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we
can lower the IR produced here optimally.
Differential Revision: https://reviews.llvm.org/D34005
llvm-svn: 304987
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value. This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.
llvm-svn: 304982
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.
Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.
Differential Revision: https://reviews.llvm.org/D33934
llvm-svn: 304970
This code now lives in lib/Object. The idea is that it can now be reused by
IRObjectFile among other things.
Differential Revision: https://reviews.llvm.org/D31921
llvm-svn: 304958
This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.
http://rise4fun.com/Alive/rhh
Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7
=>
%r = zext i1 %x to i8
Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7
=>
%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8
Differential Revision: https://reviews.llvm.org/D33879
llvm-svn: 304939
When considering merging stores values are the results of loads only
consider stores whose values come from loads from the same base.
This fixes much of the longer compile times in PR33330.
llvm-svn: 304934
Summary:
Check that the first access before one being tested is valid.
Before this patch, if there was no definition prior to the Use being tested,
the first time Iter was deferenced, it hit the sentinel.
Reviewers: dberlin, gbiv
Subscribers: sanjoy, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D33950
llvm-svn: 304926
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.
The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:
_cmp_eq4_zexted_to_i64:
movl (%rdi), %ecx
movl (%rsi), %edx
xorl %eax, %eax
cmpq %rdx, %rcx
sete %al
_cmp_eq4_better:
movl (%rdi), %ecx
xorl %eax, %eax
cmpl (%rsi), %ecx
sete %al
llvm-svn: 304923
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.
Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.
Differential Revision: https://reviews.llvm.org/D33925
llvm-svn: 304921
Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33594
llvm-svn: 304918
If we know that both operands of an unsigned integer vector comparison are non-negative,
then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality
integer vector compare predicate provided by SSE/AVX).
We're intentionally not changing the condition code to signed in order to preserve the
existing transforms that use min/max/psubus below here.
This should solve PR33276:
https://bugs.llvm.org/show_bug.cgi?id=33276
Differential Revision: https://reviews.llvm.org/D33862
llvm-svn: 304909
In the special (but also the likely common) case, we can avoid
the multi-block complexity of the general algorithm, so moving
this part off on its own will make it re-usable.
llvm-svn: 304908
The clang compiler by default uses FastISel when invoked with -O0, which
is also the default. In that case, passing of -mxgot does not get honored,
i.e. the code path that is to deal with large got is not taken.
Clang produces same output regardless of -mxgot being present or not.
This change checks whether -mxgot is passed as an option, and turns off
FastISel if it is.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33593
llvm-svn: 304906
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.
This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.
There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.
llvm-svn: 304902
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)
llvm-svn: 304885
Summary:
This patch updates Triple::isCompatibleWith to make armxx and thumbxx
triples compatible, as long as the subarch, vendor, os, envorionment and
object format match. Thumb/ARM code generation should be controlled
using the thumb-mode per-function target feature rather than by the
triple to allow mixing Thumb and ARM functions.
D33448 updates Clang's codegen to add thumb-mode for all functions with
armxx or thumbxx triples.
Reviewers: echristo, t.p.northover, rafael, kristof.beyls, rengolin, tejohnson
Reviewed By: tejohnson
Subscribers: rinon, eugenis, pcc, srhines, aemerson, mehdi_amini, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33287
llvm-svn: 304884
Summary:
Relocations are required for unconditional branches to function symbols with
different execution mode. Without this patch, incorrect branches are
generated for tail calls between functions with different execution
mode.
Reviewers: peter.smith, rafael, echristo, kristof.beyls
Reviewed By: peter.smith
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33898
llvm-svn: 304882
I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used.
llvm-svn: 304875
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.
I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But
regardless of that, I think this is a win for the more general cases.
http://rise4fun.com/Alive/cbQ
Differential Revision: https://reviews.llvm.org/D33963
llvm-svn: 304849
This patch introduces a new command line option, called brief, to
llvm-dwarfdump. When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.
Patch by Spyridoula Gravani!
rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867
llvm-svn: 304844
Summary:
I would like to add printing of registered targets to clang's version
information. For this to work correctly, the VersionPrinter logic in
CommandLine.cpp should support printing to arbitrary raw_ostreams,
instead of always defaulting to outs().
Add a raw_ostream& parameter to the function pointer type used for
VersionPrinter, and while doing so, introduce a typedef for convenience.
Note that VersionPrinter::print() will still default to using outs(),
the clang part will necessarily go into a separate review.
Reviewers: beanz, chandlerc, dberris, mehdi_amini, zturner
Reviewed By: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33899
llvm-svn: 304835
Seems like at least one reasonable interpretation of optnone is that the
optimizer never "looks inside" a function. This fix is consistent with
that interpretation.
Specifically this came up in the situation:
f3 calls f2 calls f1
f2 is always_inline
f1 is optnone
The application of readnone to f1 (& thus to f2) caused the inliner to
kill the call to f2 as being trivially dead (without even checking the
cost function, as it happens - not sure if that's also a bug).
llvm-svn: 304833
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.
llvm-svn: 304825
Summary:
The patch makes instruction count the highest priority for
LSR solution for X86 (previously registers had highest priority).
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D30562
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
Summary:
LVIPrinter pass was previously relying on the LVICache. We now directly call the
the LVI functions which solves the value if the LVI information is not already
available in the cache. This has 2 benefits over the printing of LVI cache:
1. higher coverage (i.e. catches errors) in LVI code when cache value is
invalidated.
2. relies on the core functions, and not dependent on the LVI cache (which may
be scrapped at some point).
It would still catch any cache invalidation errors, since we first go through
the cache.
Reviewers: reames, dberlin, sanjoy
Reviewed by: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32135
llvm-svn: 304819
The change cleans up and unifies the handling of relocation
entries in WasmObjectWriter. Type index relocation no longer
need to be handled separately.
The only externally visible change should be that type
index relocations are no longer grouped at the end.
Differential Revision: https://reviews.llvm.org/D33918
llvm-svn: 304816
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.
This fixes https://llvm.org/PR33317
llvm-svn: 304815
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.
llvm-svn: 304807
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames, anna
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304806
These methods looks like they were originally came from
MCELFObjectTargetWriter but they are never called by the
WasmObjectWriter.
Remove these methods meant the declaration of WasmRelocationEntry
could also move into the cpp file.
Differential Revision: https://reviews.llvm.org/D33905
llvm-svn: 304804
Addition of a feature and a predicate used to control generation of madd.fmt
and similar instructions.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33400
llvm-svn: 304801
Summary:
We were canonizalizing the pre loop (into loop-simplify form) before
the post loop blocks were added into parent loop. This is incorrect when IRCE is
done on a subloop. The post-loop blocks are created, but not yet added to the
parent loop. So, loop-simplification on the pre-loop incorrectly updates
LoopInfo.
This patch corrects the ordering so that pre and post loop blocks are added to
parent loop (if any), and then the loops are canonicalized to LCSSA and
LoopSimplifyForm.
Reviewers: reames, sanjoy, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33846
llvm-svn: 304800
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
My previous commit r304702 introduced a new case into a switch statement.
This case defined a variable but I forgot to add the curly brackets around the
case to limit the scope.
This change puts the curly braces back in so that the next person that adds a
case doesn't get a build failure. Thanks to avieira for the spot.
Differential Revision: https://reviews.llvm.org/D33931
llvm-svn: 304785
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D32304
llvm-svn: 304779
isKnownNonEqual is called a little earlier in this function and can handle the case that we were checking here as well as more complex cases.
llvm-svn: 304775
This will be used by another commit to remove some code from InstSimplify that is redundant for scalars, but was needed for vectors due to this issue.
llvm-svn: 304774
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.
This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.
This is a preparation patch for https://reviews.llvm.org/D32720
Reviewers: dberlin, hfinkel, davide
Subscribers: davide, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33380
llvm-svn: 304764
In testing, we've found yet another miscompile caused by the new tables.
And this one is even less clear how to fix (we could teach it to fold
a 16-bit load instead of the 32-bit load it wants, or block folding
entirely).
Also, the approach to excluding instructions seems increasingly to not
scale well.
I have left a more detailed analysis on the review log for the original
patch (https://reviews.llvm.org/D32684) along with suggested path
forward. I will land an additional test case that I wrote which covers
the code that was miscompiling (folding into the output of `pextrw`) in
a subsequent commit to keep this a pure revert.
For each commit reverted here, I've restricted the revert to the
non-test code touching the x86 fold table emission until the last commit
where I did revert the test updates. This means the *new* test cases
added for `insertps` and `xchg` remain untouched (and continue to pass).
Reverted commits:
r304540: [X86] Don't fold into memory operands into insertps in the ...
r304347: [TableGen] Adapt more places to getValueAsString now ...
r304163: [X86] Don't fold away the memory operand of an xchg.
r304123: Don't capture a temporary std::string in a StringRef.
r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..."
Original commit was in r304088, and after a string of fixes was reverted
previously in r304121 to fix build bots, and then re-landed in r304122.
llvm-svn: 304762
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.
This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.
Differential Revision: https://reviews.llvm.org/D33809
llvm-svn: 304758
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function
Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.
llvm-svn: 304754
Althought it is not wrong to spill undef values, it is useless and harms
both code size and runtime. Before spilling a value, check that its
content actually matters.
http://www.llvm.org/PR33311
llvm-svn: 304752
If a tied source operand was undef, it would be replaced but not
update the other tied operand, which would end up using different
virtual registers.
llvm-svn: 304747
Summary:
The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option.
Currently even if the option set to false we calculate and print (in debug mode) instruction costs.
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D33914
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304746
Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.
i.e.
DominatorTree is not up to date!
Computed:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
Actual:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
[3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}
This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).
We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.
Differential Revision: https://reviews.llvm.org/D33800
llvm-svn: 304742
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.
This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.
llvm-svn: 304738
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well. The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.
llvm-svn: 304736
Truncate currently uses a udivrem call which is going to be slow particularly for larger than 64-bit widths.
As far as I can tell all we were trying to do was modulo LowerDiv by (MaxValue+1) and make sure whatever value was effectively subtracted from LowerDiv was also subtracted from UpperDiv.
This patch recognizes that MaxValue+1 is a power of 2 so we can just use a bitwise AND to accomplish a modulo operation or isolate the upper bits.
Differential Revision: https://reviews.llvm.org/D32672
llvm-svn: 304733
Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode.
Falling back to any extend in this case instead of failing outright seems correct to me.
No test case because:
The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload
TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked
illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);.
Patch by Jacob Young!
Differential Revision: https://reviews.llvm.org/D33633
llvm-svn: 304723
This removes a quadratic behavior in assert-enabled builds.
GVN propagates the equivalence from a condition into the blocks guarded by the
condition. E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7. It does this by replacing all the uses of 'a' that are dominated by
the true edge.
For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'. Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.
The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.
This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.
Differential Revision: https://reviews.llvm.org/D33584
llvm-svn: 304721
procedural optimizations to prevent dropping symbols and allow the linker
to process re-directs.
PR33145: --wrap doesn't work with lto.
Differential Revision: https://reviews.llvm.org/D33621
llvm-svn: 304719
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.
Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.
Fixes (part of) PR32146.
llvm-svn: 304712
The C functions added are LLVMGetNumContainedTypes and
LLVMGetSubtypes.
The OCaml function added is Llvm.subtypes.
Patch by Ekaterina Vaartis.
Differential Revision: https://reviews.llvm.org/D33677
llvm-svn: 304709
Summary:
The workaround added in rL301240 for stderr/out/in symbols being both
macros and globals is only necessary for glibc, and it does not compile
with musl libc. Alpine Linux has had the following fix for it:
https://git.alpinelinux.org/cgit/aports/plain/main/llvm4/llvm-fix-DynamicLibrary-to-build-with-musl-libc.patch
Adapt the fix in our DynamicLibrary.inc for Unix.
Reviewers: marsupial, chandlerc, krytarowski
Reviewed By: krytarowski
Subscribers: srhines, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D33883
llvm-svn: 304707
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""
Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413
llvm-svn: 304704
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
Fixes PR28647
Differential Revision: https://reviews.llvm.org/D33492
llvm-svn: 304702
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.
Patch by Karl Hylen.
Differential Revision: https://reviews.llvm.org/D33449
llvm-svn: 304701
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.
Differential Revision: https://reviews.llvm.org/D33884
llvm-svn: 304696
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:
e.g. for v4f32:
Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
: unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==> <3, 2, 1, 0>
The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.
Instead, this patch unpacks progressively larger sequential vector elements together:
e.g. for v4f32:
Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0>
: unpcklps 1, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==> <3, 2, 1, 0>
This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.
Differential Revision: https://reviews.llvm.org/D33864
llvm-svn: 304688
Following the request made in https://reviews.llvm.org/D32871,
scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is
hereby made non-virtual in InnerLoopVectorizer.
Should have been part of r297580 originally.
llvm-svn: 304685
This is actually NFC because the next case starts with the same if statement as this case did. So the result will be the same and it will fallthrough to the end of the switch. But there's no reason to rely on that so we should just break.
llvm-svn: 304680
With this, the two pipelines should be in sync again (modulo
LoopUnswitch, but Chandler is actively working on that).
Differential Revision: https://reviews.llvm.org/D33810
llvm-svn: 304671
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.
Also added handling to preserve original src modifiers.
Differential Revision: https://reviews.llvm.org/D33860
llvm-svn: 304665