In testing, we've found yet another miscompile caused by the new tables.
And this one is even less clear how to fix (we could teach it to fold
a 16-bit load instead of the 32-bit load it wants, or block folding
entirely).
Also, the approach to excluding instructions seems increasingly to not
scale well.
I have left a more detailed analysis on the review log for the original
patch (https://reviews.llvm.org/D32684) along with suggested path
forward. I will land an additional test case that I wrote which covers
the code that was miscompiling (folding into the output of `pextrw`) in
a subsequent commit to keep this a pure revert.
For each commit reverted here, I've restricted the revert to the
non-test code touching the x86 fold table emission until the last commit
where I did revert the test updates. This means the *new* test cases
added for `insertps` and `xchg` remain untouched (and continue to pass).
Reverted commits:
r304540: [X86] Don't fold into memory operands into insertps in the ...
r304347: [TableGen] Adapt more places to getValueAsString now ...
r304163: [X86] Don't fold away the memory operand of an xchg.
r304123: Don't capture a temporary std::string in a StringRef.
r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..."
Original commit was in r304088, and after a string of fixes was reverted
previously in r304121 to fix build bots, and then re-landed in r304122.
llvm-svn: 304762
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.
This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.
Differential Revision: https://reviews.llvm.org/D33809
llvm-svn: 304758
Summary:
Reverse iteration can be turned on, by default, by setting -DLLVM_REVERSE_ITERATION:BOOL=ON during cmake.
With this enabled, we can uncover lots of cases of non-determinism in codegen by simply running our tests (without any other change).
We can then setup a buildbot which will have this turned on by default. Initially, a lot of unit tests will fail in this configuration.
Once we start fixing non-determinism issues, we can gradually make this a blocker for patches.
Reviewers: davide, dblaikie, mehdi_amini, dberlin
Reviewed By: dblaikie
Subscribers: probinson, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33908
llvm-svn: 304757
Create a custom pass pipeline when loading .mir files even in
--start-after/--start-before cases.
This streamlines the mir handling code and prepares for an upcoming
commit.
llvm-svn: 304755
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function
Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.
llvm-svn: 304754
There's nothing darwin-specific in these tests, and using that
setting causes extra phantom diffs when the auto-generated check
lines are regenerated today.
llvm-svn: 304753
Althought it is not wrong to spill undef values, it is useless and harms
both code size and runtime. Before spilling a value, check that its
content actually matters.
http://www.llvm.org/PR33311
llvm-svn: 304752
If a tied source operand was undef, it would be replaced but not
update the other tied operand, which would end up using different
virtual registers.
llvm-svn: 304747
Summary:
The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option.
Currently even if the option set to false we calculate and print (in debug mode) instruction costs.
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D33914
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304746
Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.
i.e.
DominatorTree is not up to date!
Computed:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
Actual:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
[3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}
This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).
We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.
Differential Revision: https://reviews.llvm.org/D33800
llvm-svn: 304742
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.
This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.
llvm-svn: 304738
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well. The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.
llvm-svn: 304736
Truncate currently uses a udivrem call which is going to be slow particularly for larger than 64-bit widths.
As far as I can tell all we were trying to do was modulo LowerDiv by (MaxValue+1) and make sure whatever value was effectively subtracted from LowerDiv was also subtracted from UpperDiv.
This patch recognizes that MaxValue+1 is a power of 2 so we can just use a bitwise AND to accomplish a modulo operation or isolate the upper bits.
Differential Revision: https://reviews.llvm.org/D32672
llvm-svn: 304733
Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode.
Falling back to any extend in this case instead of failing outright seems correct to me.
No test case because:
The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload
TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked
illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);.
Patch by Jacob Young!
Differential Revision: https://reviews.llvm.org/D33633
llvm-svn: 304723
This removes a quadratic behavior in assert-enabled builds.
GVN propagates the equivalence from a condition into the blocks guarded by the
condition. E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7. It does this by replacing all the uses of 'a' that are dominated by
the true edge.
For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'. Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.
The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.
This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.
Differential Revision: https://reviews.llvm.org/D33584
llvm-svn: 304721
procedural optimizations to prevent dropping symbols and allow the linker
to process re-directs.
PR33145: --wrap doesn't work with lto.
Differential Revision: https://reviews.llvm.org/D33621
llvm-svn: 304719
Summary:
Since LLVM_NATIVE_ARCH, LLVM_NATIVE_ASMPARSER, LLVM_NATIVE_ASMPRINTER,
LLVM_NATIVE_DISASSEMBLER, LLVM_NATIVE_TARGET, LLVM_NATIVE_TARGETINFO and
LLVM_NATIVE_TARGETMC are already defined in llvm-config.h, there seems
to be no reason to also define them in config.h. Also, I can only find
usage of these macros in files that include llvm-config.h.
So let's remove the duplicated macros from config.h.
Reviewers: chandlerc, rnk, mehdi_amini, joerg
Reviewed By: rnk
Subscribers: chapuni, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33881
llvm-svn: 304714
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.
Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.
Fixes (part of) PR32146.
llvm-svn: 304712
The C functions added are LLVMGetNumContainedTypes and
LLVMGetSubtypes.
The OCaml function added is Llvm.subtypes.
Patch by Ekaterina Vaartis.
Differential Revision: https://reviews.llvm.org/D33677
llvm-svn: 304709
Summary:
The workaround added in rL301240 for stderr/out/in symbols being both
macros and globals is only necessary for glibc, and it does not compile
with musl libc. Alpine Linux has had the following fix for it:
https://git.alpinelinux.org/cgit/aports/plain/main/llvm4/llvm-fix-DynamicLibrary-to-build-with-musl-libc.patch
Adapt the fix in our DynamicLibrary.inc for Unix.
Reviewers: marsupial, chandlerc, krytarowski
Reviewed By: krytarowski
Subscribers: srhines, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D33883
llvm-svn: 304707
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""
Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413
llvm-svn: 304704
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
Fixes PR28647
Differential Revision: https://reviews.llvm.org/D33492
llvm-svn: 304702
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.
Patch by Karl Hylen.
Differential Revision: https://reviews.llvm.org/D33449
llvm-svn: 304701
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.
Differential Revision: https://reviews.llvm.org/D33884
llvm-svn: 304696
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:
e.g. for v4f32:
Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
: unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==> <3, 2, 1, 0>
The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.
Instead, this patch unpacks progressively larger sequential vector elements together:
e.g. for v4f32:
Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0>
: unpcklps 1, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==> <3, 2, 1, 0>
This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.
Differential Revision: https://reviews.llvm.org/D33864
llvm-svn: 304688
Following the request made in https://reviews.llvm.org/D32871,
scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is
hereby made non-virtual in InnerLoopVectorizer.
Should have been part of r297580 originally.
llvm-svn: 304685
This is actually NFC because the next case starts with the same if statement as this case did. So the result will be the same and it will fallthrough to the end of the switch. But there's no reason to rely on that so we should just break.
llvm-svn: 304680
Apparently ::NodeKind is sometimes part of the name in GDB.
Without this patch I get the following error message from GDB:
`Unhandled NodeKind llvm::Twine::NodeKind::EmptyKind`.
Patch by Alexander Richardson!
Differential Revision: https://reviews.llvm.org/D32795
llvm-svn: 304675
With this, the two pipelines should be in sync again (modulo
LoopUnswitch, but Chandler is actively working on that).
Differential Revision: https://reviews.llvm.org/D33810
llvm-svn: 304671
Remove dependency of SDWA pass on SIShrinkInstructions.
The goal is to move SDWA even higher in the stack to avoid second run
of MachineLICM, MachineCSE and SIFoldOperands.
Also added handling to preserve original src modifiers.
Differential Revision: https://reviews.llvm.org/D33860
llvm-svn: 304665
The size of this function was getting a little out of.
control. Split code for writing each section type into
seperate functions.
Differential Revision: https://reviews.llvm.org/D33792
llvm-svn: 304634
Summary:
These are mostly legal, but will probably need special lowering for some
cases.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D33791
llvm-svn: 304628
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.
Differential Revision: https://reviews.llvm.org/D33802
llvm-svn: 304625
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary. This
way it could still return the user a contiguous buffer of the
requested size. However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.
Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view. And this buffer would get destroyed
as soon as the strema was closed.
The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.
Differential Revision: https://reviews.llvm.org/D33858
llvm-svn: 304623
Adjust code to look more like the code in LivePhysRegs and port over the
fix for LivePhysRegs from r304001 and adapt to the new CSR management in
MachineRegisterInfo.
llvm-svn: 304622
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.
Differential revision: https://reviews.llvm.org/D33320
llvm-svn: 304616
There's nothing darwin-specific in these tests, and using
that setting causes extra phantom diffs when the auto-generated
check lines are regenerated today.
llvm-svn: 304614
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
llvm-svn: 304607
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.
Also adds some basic register scavenging tests.
llvm-svn: 304606
Prior to this patch we used to not touch the LiveRegMatrix while doing
live-range splitting. In other words, when live-range splitting was
occurring, the LiveRegMatrix was not reflecting the changes.
This is generally fine because it means the query to the LiveRegMatrix
will be conservately correct. However, when decisions are taken based on
what is going to happen on the interferences (e.g., when we spill a
register and know that it is going to be available for another one), we
might hit an assertion that the color used for the assignment is still
in use.
This patch makes sure the changes on the live-ranges are properly
reflected in the LiveRegMatrix, so the assertions don't break.
An alternative could have been to remove the assertion, but it would
make the invariants of the code and the general reasoning more
complicated in my opnion.
http://llvm.org/PR33057
llvm-svn: 304603
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.
llvm-svn: 304602
Minor optimization but mostly simplifies my debugging so I'm not dealing
with empty SCCNodeSets while investigating issues in this optimization.
llvm-svn: 304597
Summary:
This is to enable the new switch inline cost heuristic (r301649) by removing the
old heuristic as well as the flag itself.
In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and
8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64.
No significant code size / performance regression was found in O3/O2/Os. No
significant complain was reported from the llvm-dev thread.
Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo
Reviewed By: echristo
Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini
Differential Revision: https://reviews.llvm.org/D32653
llvm-svn: 304594
Summary:
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.
Reviewers: anemet
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33320
llvm-svn: 304593
Since r288804, we try to lower build_vectors on AVX using broadcasts of
float/double. However, when we broadcast integer values that happen to
have a NaN float bitpattern, we lose the NaN payload, thereby changing
the integer value being broadcast.
This is caused by ConstantFP::get, to which we pass the splat i32 as
a float (by bitcasting it using bitsToFloat). ConstantFP::get takes
a double parameter, so we end up lossily converting a single-precision
NaN to double-precision.
Instead, avoid any kinds of conversions by directly building an APFloat
from the splatted APInt.
Note that this also fixes another piece of code (broadcast of
subvectors), that currently isn't susceptible to the same problem.
Also note that we could really just use APInt and ConstantInt
throughout: the constant pool type doesn't matter much. Still, for
consistency, use the appropriate type.
llvm-svn: 304590
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Differential Revision: https://reviews.llvm.org/D33807
llvm-svn: 304588
This adds an install-builtins target to avoid having to list all
builtins targets explicitly.
Differential Revision: https://reviews.llvm.org/D32710
llvm-svn: 304587
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D32593
llvm-svn: 304585
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
llvm-svn: 304582
This way dead stripping results are recorded in combined summary and
can be used in regular LTO passes.
Differential Revision: https://reviews.llvm.org/D33615
llvm-svn: 304577
This reverts commit r304561 and re-lands r303490 & co.
The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:
@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
std::vector<COFFShortExport> Exports;
for (Export &E1 : Config->Exports) {
COFFShortExport E2;
- E2.Name = E1.Name;
+ // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+ E2.Name = E1.SymbolName;
E2.ExtName = E1.ExtName;
E2.Ordinal = E1.Ordinal;
E2.Noname = E1.Noname;
llvm-svn: 304573
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).
The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and
don't have it - so, workaround it for now rather than bothering with the
added complexity.
(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)
llvm-svn: 304566
While doing so, clarify the comments and update them to reflect current reality.
Note: I'm going to let this sit for a week or so before adding further verification. I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
llvm-svn: 304565
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.
For those curious, the more complicated version of this patch already found one very suspicious thing.
Differential Revision: https://reviews.llvm.org/D33819
llvm-svn: 304564
This reverts commits r303490, r303491, r303493, and r303494.
This caused http://crbug.com/728726. Essentially, exporting stdcall
functions doesn't appear to work after this change. Reduced test case
soon.
llvm-svn: 304561
Summary:
The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify.
This patch adds support to ConstantFolding to detect the reversed case.
Reviewers: spatel, dberlin, majnemer, davide, joey
Reviewed By: joey
Subscribers: joey, llvm-commits
Differential Revision: https://reviews.llvm.org/D33801
llvm-svn: 304559
-enable-si-insert-waitcnts=1 becomes the default
-enable-si-insert-waitcnts=0 to use old pass
Differential Revision: https://reviews.llvm.org/D33730
llvm-svn: 304551
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
LBU instruction is transformed into 16-bit instruction LBU16
LHU instruction is transformed into 16-bit instruction LHU16
SB instruction is transformed into 16-bit instruction SB16
SH instruction is transformed into 16-bit instruction SH16
Differential Revision: https://reviews.llvm.org/D33091
llvm-svn: 304550
on macOS
This function will be used to tie Clang's Integeration tests to a particular
SDK version. See https://reviews.llvm.org/D32178 for more context.
llvm-svn: 304541
insertps behaves differently, the register form selects from an input
register based on the immediate operand while the memory form just loads
the given address. We have custom code to change the immediate in cases
where that's legal, so completely remove insertps from the generated
tables.
llvm-svn: 304540
When a global may be preempted it needs to be accessed directly, instead of
indirectly through a MergedGlobals symbol, for the preemption to work.
This fixes PR33136.
Differential Revision: https://reviews.llvm.org/D33727
llvm-svn: 304537
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.
The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.
llvm-svn: 304536
The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D33775
llvm-svn: 304522
Summary:
Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned.
Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this:
```
llvm_unreachable("Unknown type!");
llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false)
llvm::EVT::getEVT
llvm::TargetLoweringBase::getValueType
llvm::ComputeValueVTs
llvm::SelectionDAGBuilder::visitTargetIntrinsic
```
Reviewers: GorNishanov
Subscribers: EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D33817
llvm-svn: 304518
builtin_expect applied on && or || expressions were not
handled properly before. With this patch, the problem is fixed.
Differential Revision: http://reviews.llvm.org/D33164
llvm-svn: 304517
Summary:
When writing the combined index, we are walking the entire module
path StringMap in the full index, and checking whether each one should be
included in the index being written. For distributed backends, where we
write an individual combined index for each file, each with only a few
module paths, this is incredibly inefficient. Add a method that takes
a callback and hides the details of whether we are writing the full
combined index, or just a slice, and in the latter case it walks the set
of modules to include instead of the entire index.
For a huge application with around 23K files (i.e. where we were iterating
through the 23K-entry modulePath StringMap 23K times), this change improved
the thin link time by a whopping 48%.
Reviewers: pcc
Subscribers: Prazek, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D33813
llvm-svn: 304516
Summary: Wasm object format has some functionality regressions from the ELF format, and doesn't play nicely with the rest of the toolchain. It should eventually be the default, but not yet.
Reviewers: sunfish, sbc100
Subscribers: jfb, dschuff, llvm-commits
Differential Revision: https://reviews.llvm.org/D33811
llvm-svn: 304512
Undefined externals don't need to have a size or an offset.
This was broken by r303915. Added a test for this case.
This fixes the "Compile LLVM Torture (o)" step on the wasm
waterfall.
Differential Revision: https://reviews.llvm.org/D33803
llvm-svn: 304505
Summary:
As we teach Clang to use ThinkLTO + new PM, it's good for the users to
inject through Config, instead of setting a flag in the LTOBackend
library. Move the flag to llvm-lto2.
As it moves to llvm-lto2, a new name -use-new-pm seems simpler and as
clear.
Reviewers: davide, tejohnson
Subscribers: mehdi_amini, Prazek, inglorion, eraman, chandlerc, llvm-commits
Differential Revision: https://reviews.llvm.org/D33799
llvm-svn: 304492
GVNHoist was moved as part of simplification passes for the current
pass manager (but not for the new), so they're out-of-sync.
Differential Revision: https://reviews.llvm.org/D33806
llvm-svn: 304490
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.
llvm-svn: 304488
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.
Differential Revision: https://reviews.llvm.org/D33785
llvm-svn: 304484
The AArch64 backend marks calls that involve aggregate function
arguments as having an implicit def of SP. We already have the same
workaround in LiveDebugValues and in DbgValueHistoryCalculator for SP
clobbers in register masks. This adds register defs to the list.
Fixes rdar://problem/30361929 and Swift SR-3851.
llvm-svn: 304471
Summary:
Reduce min percent required for indirect call promotion from 33% to 30%,
which matches gcc's threshold and catches the same hot opportunities.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33798
llvm-svn: 304469
It doesn't exist on Windows. The number we use here doesn't really matter,
the storage will expand automatically but 256 seems like a reasonable default.
Should fix windows buildbots that complained about rL304458.
llvm-svn: 304468
Summary:
Clang wants to clone a function before it is done building the entire
compilation unit. As of now, there is no good way to do that, because
CloneFunction doesn't like dealing with temporary metadata. However,
as long as clang doesn't want to add any variables to this SP, it
should be fine to just prematurely finalize it. Add an API to allow this.
This is done in preparation of a clang commit to fix the assertion that
necessitated the revert of D33655.
Reviewers: aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33704
llvm-svn: 304467
Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of
all the DeadSymbols sets. This is refactoring in order to make
liveness information available in the RegularLTO pipeline.
llvm-svn: 304466
Summary:
`LLVM_TOOLS_INSTALL_DIR` was introduced in r272200 in order to override the directory
name into which to install LLVM's executable. However, `llvm-config --bindir` still reported
`$PREFIX/bin` independent of what LLVM_TOOLS_INSTALL_DIR was set to.
This fixes the out-of-tree clang standalone build for me.
Reviewers: beanz, tstellar
Reviewed By: tstellar
Subscribers: chapuni, tstellar, llvm-commits
Differential Revision: https://reviews.llvm.org/D22499
llvm-svn: 304458
The added test case is to check whether the simplified value is passed to
getGEPCost().
Differential Revision: https://reviews.llvm.org/D33779
llvm-svn: 304454
The lowerer wrongly assumes the ICMP instruction
1) always has a constant operand;
2) the operand has value 0.
It also assumes the expected value can only be one, thus
other values other than one will be considered 'zero'.
This leads to wrong profile annotation when other integer values
are used other than 0, 1 in the comparison or in the expect intrinsic.
Also missing is handling of equal predicate.
This patch fixes all the above problems.
Differential Revision: http://reviews.llvm.org/D33757
llvm-svn: 304453
Summary:
Sort OpsToRename before iterating to make iteration order deterministic.
Thanks to Daniel Berlin for the sorting logic.
Reviewers: dberlin, RKSimon, efriedma, davide
Reviewed By: dberlin, davide
Subscribers: sanjoy, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D33265
llvm-svn: 304447
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.
Patch by Spyridoula Gravani!
Differential Revision: https://reviews.llvm.org/D33749
llvm-svn: 304446
The initial assumption was that the simplification would converge to a
fixed point relatvely quickly. Turns out that there are legitimate situa-
tions where the complexity of the code causes it to take a large number
of iterations.
Two main changes:
- Instead of aborting upon hitting the limit, simply return nullptr.
- Reduce the limit to 10,000 from 100,000.
llvm-svn: 304441
Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.
Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls
Reviewed By: peter.smith
Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33436
llvm-svn: 304413
Summary:
Solaris-specific implementation for llvm::sys::fs::is_local_impl.
FStype pattern matching might be a bit unreliable, but at least it fixes the build failure.
Reviewers: mgorny, nlopes, llvm-commits, krytarowski
Reviewed By: krytarowski
Subscribers: voskresensky.vladimir, krytarowski
Differential Revision: https://reviews.llvm.org/D33695
llvm-svn: 304412
Summary:
This is a problem uncovered by stage2 testing. ADDCARRY end up being generated on target that do not support it.
The patch that introduced the problem has other patches layed on top of it, so we want to fix the issue rather than revert it to avoid creating a lor of churn.
A regression test will be added shortly, but this is committed as this in order to get the build back to green promptly.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33770
llvm-svn: 304409
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.
This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.
I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:
1) I sunk the PGO indirect call promotion to only be run when we have
PGO enabled (or as part of the special ThinLTO pipeline).
2) I made the extra GlobalOpt run in ThinLTO just happen all the time
and at a slightly more powerful place (before we remove available
externaly functions). This seems like general goodness and not a big
compile time sink, so it didn't make sense to *only* use it in
ThinLTO. Fewer differences in the pipeline makes everything simpler
IMO.
3) I hoisted the ThinLTO stop point pre-link above the the RPO function
attr inference. The RPO inference won't infer anything terribly
meaningful pre-link (recursiveness?) so it didn't make a lot of
sense. But if the placement of RPO inference starts to matter, we
should move it to the canonicalization phase anyways which seems like
a better place for it (and there is a FIXME to this effect!). But
that seemed a bridge too far for this patch.
If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.
I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.
Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.
Differential Revision: https://reviews.llvm.org/D33540
llvm-svn: 304407
Summary:
Add an early combine to match patterns such as:
(i16 bitcast (v16i1 x))
->
(i16 movmsk (v16i8 sext (v16i1 x)))
This combine needs to happen early enough before
type-legalization scalarizes the result of the setcc.
Reviewers: igorb, craig.topper, RKSimon
Subscribers: delena, llvm-commits
Differential Revision: https://reviews.llvm.org/D33311
llvm-svn: 304406
Summary:
This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process.
This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33374
llvm-svn: 304404
Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32756
llvm-svn: 304402
They weren't used often enough to justify having two different interfaces. Push the responsiblity of creating a StringInit up to the caller.
llvm-svn: 304388
It tried to detect 9 letters (the length of anonymous) followed by a period. But anonymous classes start with "anonymous_" rather than "anonymous." these days.
llvm-svn: 304387
Summary: Also see D33429 for other ThinLTO + New PM related changes.
Reviewers: davide, chandlerc, tejohnson
Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D33525
llvm-svn: 304378
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
llvm-svn: 304371
We should have a single call site entry with no landing pad. This
indicates that no EH action should be taken and the unwinder should
unwind to the next frame.
We currently don't recognize __gxx_personality_seh0 as a known
personality, so we forcibly emit a table, and that table was wrong. This
was filed as PR33220. Now we emit a correct table for that personality.
The next step is to recognize that we can completely skip the table for
this personality.
llvm-svn: 304363
BIND_OPCODE_SET_DYLIB_SPECIAL_IMM(0) is a valid way to setp library
ordinal. MachOObject should set LibraryOrdinalSet even when IMM is zero.
llvm-svn: 304362
The intent of the test is to check that array lengths greater than
UINT_MAX work properly. Change the test to stress that scenario, without
triggering pointer overflow UB.
Caught by a WIP pointer overflow checker in clang.
Differential Revision: https://reviews.llvm.org/D33149
llvm-svn: 304353
It seems not all of our bots have a std::vector::erase() taking a
const_iterator (even though that seems to be part of C++11) attempt to
workaround.
llvm-svn: 304349
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
the FP registers still being present in the livein list.)
llvm-svn: 304342
Summary:
Fairly straightforward patch to fill in some of the holes in the
attributes API with respect to accessing parameter/argument attributes.
The patch aims to step further towards encapsulating the
idx+FirstArgIndex pattern to access these attributes to within the
AttributeList.
Patch by Daniel Neilson!
Reviewers: rnk, chandlerc, pete, javed.absar, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33355
llvm-svn: 304329
Internally both these methods just return the result of getValue on either a StringInit or a CodeInit object. In both cases this returns a StringRef pointing to a string allocated in the BumpPtrAllocator so its not going anywhere. So we can just pass that StringRef along.
This is a fairly naive patch that targets just the build failures caused by this change. There's additional work that can be done to avoid creating std::string at call sites that still think getValueAsString returns a std::string. I'll try to clean those up in future patches.
Differential Revision: https://reviews.llvm.org/D33710
llvm-svn: 304325
Summary:
Don't assign values to undefined references, simply don't emit those
reference edges as they are not useful (we were already not emitting
call edges to undefined refs).
Also, streamline the later lookup of value ids when writing the
summaries, by combining the check for value id existence with the access
of that value id.
Reviewers: pcc
Subscribers: Prazek, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D33634
llvm-svn: 304323
Summary:
If we attempt to unfold an SUnit in ScheduleDAG that results in
finding an already scheduled load, we must should abort the
unfold as it will not improve scheduling.
This fixes PR32610.
Reviewers: jmolloy, sunfish, bogner, spatel
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D32911
llvm-svn: 304321
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.
Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656
llvm-svn: 304317
This reverts commit r304310.
It caused build failures in polly and mingw
due to undefined reference to
llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC.
llvm-svn: 304315
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
- fix handling of PERMUTE ops
- fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass )
- add new test
Differential Revision: https://reviews.llvm.org/D33114
llvm-svn: 304311
Summary:
Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy.
The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304310
Correct references to alignment of store which may be deleted in a
previous iteration of merge. Instead use first store that would be
merged.
Corrects pr33172's use-after-poison caught by ASan.
Reviewers: spatel, hfinkel, RKSimon
Reviewed By: RKSimon
Subscribers: thegameg, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33686
llvm-svn: 304299
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI
Instruction, this patch recognizes them and does the selection to improve
the PPC performance.
Differential Revision: https://reviews.llvm.org/D33404
llvm-svn: 304298
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for bitwise logical operations in general purpose registers.
The idea is to keep the values in GPRs as long as possible - only
extracting them to a condition register bit when no further operations
are to be done.
Differential Revision: https://reviews.llvm.org/D31851
llvm-svn: 304282