This patch adds logic to deal with the following constructions:
%iv = phi i64 ...
%trunc = trunc i64 %iv to i32
%cmp = icmp <pred> i32 %trunc, %invariant
Replacing it with
%iv = phi i64 ...
%cmp = icmp <pred> i64 %iv, sext/zext(%invariant)
In case if it is legal. Specifically, if `%iv` has signed comparison users, it is
required that `sext(trunc(%iv)) == %iv`, and if it has unsigned comparison
uses then we require `zext(trunc(%iv)) == %iv`. The current implementation
bails if `%trunc` has other uses than `icmp`, but in theory we can handle more
cases here (e.g. if the user of trunc is bitcast).
Differential Revision: https://reviews.llvm.org/D47928
Reviewed By: reames
llvm-svn: 335020
This reverts r334428. It incorrectly marks some multiplications as nuw. Tim
Shen is working on a proper fix.
Original commit message:
[SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags where safe.
Summary:
Previously we would add them for adds, but not multiplies.
llvm-svn: 335016
Shows some missed optimizations for the -7929856 and -2166 testcases.
-7929856 is due to a bug in ARMTargetLowering::getARMCmp, I think;
the -2166 case is a missing pattern.
llvm-svn: 335004
Even if a comparison isn't legal, we should try to prefer constants
which can be materialized with a two-instruction sequence. (Thinking
about it a bit more, there might be some more clever sequence we could
generate for certain comparisons invoving powers of two, but I'm not
sure exactly what that would look like.)
llvm-svn: 335003
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar
Reviewed By: spatel
Subscribers: wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D47909
llvm-svn: 334996
This reverts commit f976cf4cca0794267f28b54e468007fd476d37d9.
I am reverting this because it causes break in a few bots and its going
to take me sometime to look at this.
llvm-svn: 334993
Summary:
Simplify blockaddress usage before giving up in MergeBlockIntoPredecessor
This is a missing small optimization in MergeBlockIntoPredecessor.
This helps with one simplifycfg test which expects this case to be handled.
Reviewers: davide, spatel, brzycki, asbirlea
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48284
llvm-svn: 334992
Summary:
One for register based, much like the existing definitions,
and one for stack based (suffix _S).
This allows us to use registers in most of LLVM (which works better),
and stack based in MC (which results in a simpler and more readable
assembler / disassembler).
Tried to keep this change as small as possible while passing tests,
follow-up commit will:
- Add reg->stack conversion in MI.
- Fix asm/disasm in MC to be stack based.
- Fix emitter to be stack based.
tests passing:
llvm-lit -v `find test -name WebAssembly`
test/CodeGen/WebAssembly
test/MC/WebAssembly
test/MC/Disassembler/WebAssembly
test/DebugInfo/WebAssembly
test/CodeGen/MIR/WebAssembly
test/tools/llvm-objdump/WebAssembly
Reviewers: dschuff, sbc100, jgravelle-google, sunfish
Subscribers: aheejin, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48183
llvm-svn: 334985
This patch uses the DiagnosticPredicate for SVE predicate patterns
to improve their diagnostics, now giving a 'invalid operand' diagnostic
if the type is not an immediate or one of the expected pattern
labels.
Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48220
llvm-svn: 334983
The variants added by this patch are:
- SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
- SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
- UQINC unsigned increment, e.g. uqinc w0, all, mul #4
- UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
This patch includes asmparser changes to parse a GPR64 as a GPR32 in
order to satisfy the constraint check:
x0 == GPR64(w0)
in:
sqinc x0, w0, all, mul #4
^___^ (must match)
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47716
llvm-svn: 334980
2 of these tests were clearly not doing what the comments
said they were doing.
The last test was added at rL177933 with no assertions
(presumably it used to crash). But either we don't have
that problem anymore, or this test is folded sooner,
so we don't hit the bug that was fixed by disabling late
FP constant creation. Looking at this as part of reviewing
D48289.
llvm-svn: 334977
When the destination register of a XOP instruction is an XMM register, bits
[255:128] of the corresponding YMM register are cleared.
When the destination register of a EVEX encoded instruction is an XMM/YMM
register, the upper bits of the corresponding ZMM are cleared.
On processors that feature AVX512, a write to an XMM registers always clears the
upper portion of the corresponding ZMM register if the instruction is VEX or
EVEX encoded.
These new tests show some interesting cases which aren't correctly analyzed by
llvm-mca. The lack of knowledge related to the implicit update on the
super-registers is addressed by D48225.
llvm-svn: 334945
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
llvm-svn: 334931
We already have these aliases for EVEX enocded instructions, but not for the GPR, MMX, SSE, and VEX versions.
Also remove the vpextrw.s EVEX alias. That's not something gas implements.
llvm-svn: 334922
The .s assembly strings allow the reversed forms to be targeted from assembly which matches gas behavior. But when printing the instructions we should print them without the .s to match other tooling like objdump. By using InstAliases we can use the normal string in the instruction and just hide it from the assembly parser.
Ideally we'd add the .s versions to the legacy SSE and VEX versions as well for full compatibility with gas. Not sure how we got to state where only EVEX was supported.
llvm-svn: 334920
Unlike CodeGenInstruction, CodeGenInstAlias was flatting asm strings in its constructor. For instructions it was the users responsibility to flatten the string.
AsmMatcherEmitter didn't know this and treated them the same. This caused double flattening of InstAliases. This is mostly harmless unless the desired assembly string contains curly braces. The second flattening wouldn't know to ignore these and would remove the curly braces. And for variant 1 it would remove the contents of them as well.
To mitigate this, this patch makes removes the flattening from the CodeGenIntAlias constructor and modifies AsmWriterEmitter to account for the flattening not having been done.
llvm-svn: 334919
Some of the calls to hasSingleUseFromRoot were passing the load itself. If the load's chain result has a user this would count against that. By getting the true parent of the match and ensuring any intermediate between the match and the load have a single use we can avoid this case. isLegalToFold will take care of checking users of the load's data output.
This fixed at least fma-scalar-memfold.ll to succed without the peephole pass.
llvm-svn: 334908
There are a lot of instructions to add under these ISAs (and the other AVX512 variants) but this should demonstrate how to test for the EVEX instructions with different maskings
llvm-svn: 334907
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
llvm-svn: 334905
Summary:
We only modify CFG in a couple of places, and we can preserve DT there
with a little effort.
Reviewers: davide, vsk
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48059
llvm-svn: 334895
This is the common case in the BE when we serialize condition and then
rematerialize it. Use either original or inverted condition.
Differential Revision: https://reviews.llvm.org/D48246
llvm-svn: 334882
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai
Reviewed By: rampitec, nhaehnle
Subscribers: tpr, nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47918
llvm-svn: 334876
So far, we've only handled special cases of PatFrag like ImmLeaf. This patch
adds support for the remaining cases using similar mechanisms.
Like most C++ code from SelectionDAG, GISel and DAGISel expect to operate on
different types and representations and as such the code is not compatible
between the two. It's therefore necessary to add an alternative implementation
in the GISelPredicateCode field.
The target test for this feature could easily be done with IntImmLeaf and this
would save on a little boilerplate. The reason I've chosen to implement this
using PatFrag.GISelPredicateCode and not IntImmLeaf is because I was unable to
find a rule that was blocked solely by lack of support for PatFrag predicates. I
found that the ones I investigated as being likely candidates for the test
were further blocked by other things.
llvm-svn: 334871
Modify ExpandStrictFPOp(...) to handle nodes that have scalar
operands.
Also, add a Strict FMA test and do some other light cleanup in the
Strict FP code.
Differential Revision: https://reviews.llvm.org/D48149
llvm-svn: 334863
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D47954
llvm-svn: 334862
Enables using the high and high-adjusted symbol modifiers on thread local
storage modifers in powerpc assembly. Needed to be able to support 64 bit
thread-pointer and dynamic-thread-pointer access sequences.
Differential Revision: https://reviews.llvm.org/D47754
llvm-svn: 334856
Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consiting of bits 16-31 of a
64-bit address/offset.
Differential Revision: https://reviews.llvm.org/D47729
llvm-svn: 334855
redundant-vf2-cost.ll is X86 specific. Moved from
test/Transforms/LoopVectorize/redundant-vf2-cost.ll to
test/Transforms/LoopVectorize/X86/redundant-vf2-cost.ll
llvm-svn: 334854
Added a Generic x86 cpu set of resource tests to allow us to check all ISAs.
We currently use SandyBridge as our generic CPU model, but it's better if we actually duplicate these tests for if/when we change the model, it also means we don't end up polluting the SandyBridge folder with tests for ISAs it doesn't support.
llvm-svn: 334853
An earlier commit prevented folds from the peephole pass by checking for IMPLICIT_DEF. But later in the pipeline IMPLICIT_DEF just becomes and Undef flag on the input register so we need to check for that case too.
llvm-svn: 334848
When coalescing a small register into a subregister of a larger register,
if the larger register is rematerialized, the function updateRegDefUses
can add an <undef> flag to the rematerialized definition (since it's
treating it as only definining the coalesced subregister). While with that
assumption doing so is not incorrect, make sure to remove the flag later
on after the call to updateRegDefUses.
llvm-svn: 334845
Summary:
When iterating users of a multiply in processUMulZExtIdiom, the
call to setOperand in the truncation case may replace the use
being visited; make sure the iterator has been advanced before
doing that replacement.
Reviewers: majnemer, davide
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48192
llvm-svn: 334844
This is a minor fix for LV cost model, where the cost for VF=2 was
computed twice when the vectorization of the loop was forced without
specifying a VF.
Reviewers: xusx595, hsaito, fhahn, mkuper
Reviewed By: hsaito, xusx595
Differential Revision: https://reviews.llvm.org/D48048
llvm-svn: 334840
Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47713
llvm-svn: 334838
Try to access pieces 4 bytes at a time. This helps
various hasOneUse extract_vector_elt combines, such
as load width reductions.
Avoids test regressions in a future commit.
llvm-svn: 334836
Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.
Declutters the output of D48190.
Reviewers: RKSimon, andreadb, courbet, craig.topper
Reviewed By: andreadb
Subscribers: javed.absar, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48209
llvm-svn: 334833
Summary:
There does not seem to be any other tests for this.
Split off from D47676.
Reviewers: RKSimon, craig.topper, courbet, andreadb
Reviewed By: andreadb
Subscribers: javed.absar, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48190
llvm-svn: 334832
This is r334704 (which was reverted in r334732) with a fix for
types like x86_fp80. We need to use getTypeAllocSizeInBits and
not getTypeStoreSizeInBits to avoid dropping debug info for
such types.
Original commit msg:
> Summary:
> Do not convert a DbgDeclare to DbgValue if the store
> instruction only refer to a fragment of the variable
> described by the DbgDeclare.
>
> Problem was seen when for example having an alloca for an
> array or struct, and there were stores to individual elements.
> In the past we inserted a DbgValue intrinsics for each store,
> just as if the store wrote the whole variable.
>
> When handling store instructions we insert a DbgValue that
> indicates that the variable is "undefined", as we do not know
> which part of the variable that is updated by the store.
>
> When ConvertDebugDeclareToDebugValue is used with a load/phi
> instruction we assert that the referenced value is large enough
> to cover the whole variable. Afaict this should be true for all
> scenarios where those methods are used on trunk. If the assert
> blows in the future I guess we could simply skip to insert a
> dbg.value instruction.
>
> In the future I think we should examine which part of the variable
> that is accessed, and add a DbgValue instrinsic with an appropriate
> DW_OP_LLVM_fragment expression.
>
> Reviewers: dblaikie, aprantl, rnk
>
> Reviewed By: aprantl
>
> Subscribers: JDevlieghere, llvm-commits
>
> Tags: #debug-info
>
> Differential Revision: https://reviews.llvm.org/D48024
llvm-svn: 334830
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.
The original commit was reverted because
it broke tests for amdgpu backend, which i didn't check.
Now, the backed was updated to recognize these new
patterns, so we are good.
https://bugs.llvm.org/show_bug.cgi?id=37603https://rise4fun.com/Alive/cplX
Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm
Reviewed By: spatel, rampitec, nhaehnle
Subscribers: wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D47980
llvm-svn: 334818
Summary: The same pattern as D48010, but this one is IR-canonical as of D47428.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48012
llvm-svn: 334817
Summary:
As a followup for D48007.
Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern,
which does not have ub for both the edge cases (`y == 0`, `y == bitwidth`),
i think also handling a pattern that is ub for `y == bitwidth` should be fine.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48010
llvm-svn: 334816
Summary:
D47980 will canonicalize the `x << (32 - y) >> (32 - y)`,
which is the pattern the AMDGPU expects to `x & (-1 >> (32 - y))`,
which is not recognized by AMDGPU.
Thus, it needs to be recognized, too.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48007
llvm-svn: 334815
I think this covers most of the unmasked vector instructions. We're still missing a lot of the masked instructions.
There are some test changes here because of the new folding support. I don't think these particular cases should be folded because it creates an undef register dependency. I think the changes introduced in r334175 are not handling stack folding. They're only blocking the peephole pass.
llvm-svn: 334800
IEEE 754 defines the expected result on overflow. As far as I know,
hardware implementations (of f16), and compiler-rt (__floatuntisf)
correctly return +-Inf on overflow. And I can't think of any useful
transform that would take advantage of overflow being undefined here.
Differential Revision: https://reviews.llvm.org/D47807
llvm-svn: 334777
Summary:
Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar
Reviewed By: spatel
Subscribers: efriedma, wdng, tpr
Differential Revision: https://reviews.llvm.org/D48057
llvm-svn: 334769
In particular, when asked to print a MemoryAccess, we'll now print where
defs are optimized to, and we'll print optimized access types.
This patch also introduces an operator<< to make printing AliasResults
easier.
Patch by Juneyoung Lee!
Differential Revision: https://reviews.llvm.org/D47860
llvm-svn: 334760
isVectorClearMaskLegal() is the TLI hook used by the generic
DAGCombiner::XformToShuffleWithZero().
We've grown to accomodate/expect this transform to shuffle
(disabling it more generally results in many regressions).
So I'm narrowly excluding the 256-bit types that clearly
are not worthwhile for AVX1.
I think in most cases we are able to recover by converting
the shuffle back into 'and' ops, but the cases in:
https://bugs.llvm.org/show_bug.cgi?id=37749
...show that there are cracks.
llvm-svn: 334759
This is r334750 (which was reverted in r334754) with a fix for an
uninitialized variable that was caught by msan.
Original commit message:
> If a copy bundle happens to involve overlapping registers, we can end
> up with emitting the copies in an order that ends up clobbering some
> of the subregisters. Since instructions in the copy bundle
> semantically happen at the same time, this is incorrect and we need to
> make sure we order the copies such that this doesn't happen.
llvm-svn: 334756
Summary: A FMF constraint is added to FADD with unsafe still available as the fallback
Reviewers: spatel, wristow, arsenm, hfinkel
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D48180
llvm-svn: 334753
WebAssembly doesn't support more than one function per section
and we rely on function sections being unique. This change ignores
the section provided by the function to avoid two functions being
in the same section.
Without this change the object writer produces the following
error for this test:
LLVM ERROR: section already has a defining function: baz
Differential Revision: https://reviews.llvm.org/D48178
llvm-svn: 334752
If a copy bundle happens to involve overlapping registers, we can end
up with emitting the copies in an order that ends up clobbering some
of the subregisters. Since instructions in the copy bundle
semantically happen at the same time, this is incorrect and we need to
make sure we order the copies such that this doesn't happen.
Differential Revision: https://reviews.llvm.org/D48154
llvm-svn: 334750
On x86-64, a write to register EAX implicitly clears the upper half or RAX.
128-bit AVX instructions clear the upper 128-bit of the YMM register that
aliases the XMM definition register.
llvm-mca doesn't know about register writes that implicitly clear the upper
portion of an aliasing super-register. This issue will be fixed in a future patch.
llvm-svn: 334742
Summary:
Specifically, we transform
zext(2^K * (trunc X to iN)) to iM ->
2^K * (zext(trunc X to i{N-K}) to iM)<nuw>
This is helpful because pulling the 2^K out of the zext allows further
optimizations.
Reviewers: sanjoy
Subscribers: hiraditya, llvm-commits, timshen
Differential Revision: https://reviews.llvm.org/D48158
llvm-svn: 334737
Summary:
Previously we would do this simplification only if it did not introduce
any new truncs (excepting new truncs which replace other cast ops).
This change weakens this condition: If the number of truncs stays the
same, but we're able to transform trunc(X + Y) to X + trunc(Y), that's
still simpler, and it may open up additional transformations.
While we're here, also clean up some duplicated code.
Reviewers: sanjoy
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48160
llvm-svn: 334736
This test checks that a physical register is correctly allocated for the partial
write to register BX.
The ADD instruction has to wait for the write to RBX (and BX) before being
executed.
llvm-svn: 334730
In some cases, for example when compiling a preprocessed file, the
front-end is not able to provide an MD5 checksum for all files. When
that happens, omit the MD5 checksums from the final DWARF, because
DWARF doesn't have a way to indicate that some but not all files have
a checksum.
When assembling a .s file, and some but not all .file directives
provide an MD5 checksum, issue a warning and don't emit MD5 into the
DWARF.
Fixes PR37623.
Differential Revision: https://reviews.llvm.org/D48135
llvm-svn: 334710
This patches teaches EarlyCSE to figure out that if `and i1 %x, %y` is true then both
`%x` and `%y` are true in the taken branch, and if `or i1 %x, %y` is false then both
`%x` and `%y` are false in non-taken branch. Fix for PR37635.
Differential Revision: https://reviews.llvm.org/D47574
Reviewed By: reames
llvm-svn: 334707
Summary:
Do not convert a DbgDeclare to DbgValue if the store
instruction only refer to a fragment of the variable
described by the DbgDeclare.
Problem was seen when for example having an alloca for an
array or struct, and there were stores to individual elements.
In the past we inserted a DbgValue intrinsics for each store,
just as if the store wrote the whole variable.
When handling store instructions we insert a DbgValue that
indicates that the variable is "undefined", as we do not know
which part of the variable that is updated by the store.
When ConvertDebugDeclareToDebugValue is used with a load/phi
instruction we assert that the referenced value is large enough
to cover the whole variable. Afaict this should be true for all
scenarios where those methods are used on trunk. If the assert
blows in the future I guess we could simply skip to insert a
dbg.value instruction.
In the future I think we should examine which part of the variable
that is accessed, and add a DbgValue instrinsic with an appropriate
DW_OP_LLVM_fragment expression.
Reviewers: dblaikie, aprantl, rnk
Reviewed By: aprantl
Subscribers: JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D48024
llvm-svn: 334704
Summary:
The tests in:
https://bugs.llvm.org/show_bug.cgi?id=37751
...show miscompiles because we wrongly mapped and folded x86-specific intrinsics into generic DAG nodes.
This patch corrects the mappings in X86IntrinsicsInfo.h and adds isel matching corresponding to the new patterns. The complete tests for the failure cases should be in avx-cvttp2si.ll and sse-cvttp2si.ll and avx512-cvttp2i.ll
Reviewers: RKSimon, gbedwell, spatel
Reviewed By: spatel
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D47993
llvm-svn: 334685
When using clang --save-stats -mllvm -time-passes, both timers and stats
end up in the same json file.
We could end up with things like:
{
"asm-printer.EmittedInsts": 1,
"time.pass.Virtual Register Map.wall": 2.9015541076660156e-04,
"time.pass.Virtual Register Map.user": 2.0500000000000379e-04,
"time.pass.Virtual Register Map.sys": 8.5000000000001741e-05,
}
This patch makes use of the pass argument name (if available) in the
JSON key to end up with things like:
{
"asm-printer.EmittedInsts": 1,
"time.pass.virtregmap.wall": 2.9015541076660156e-04,
"time.pass.virtregmap.user": 2.0500000000000379e-04,
"time.pass.virtregmap.sys": 8.5000000000001741e-05,
}
This also helps avoiding to write another JSON printer to handle all the
cases that we could have in our pass names.
Fixed test instead of adding a new one originally from r334649.
Differential Revision: https://reviews.llvm.org/D48109
llvm-svn: 334657
Such globals are very likely to be part of a sorted section array, such
the .CRT sections used for dynamic initialization. The uses its own
sorted sections called ATL$__a, ATL$__m, and ATL$__z. Instead of special
casing them, just look for the dollar sign, which is what invokes linker
section sorting for COFF.
Avoids issues with ASan and the ATL uncovered after we started
instrumenting comdat globals on COFF.
llvm-svn: 334653
When using clang --save-stats -mllvm -time-passes, both timers and stats
end up in the same json file.
We could end up with things like:
{
"asm-printer.EmittedInsts": 1,
"time.pass.Virtual Register Map.wall": 2.9015541076660156e-04,
"time.pass.Virtual Register Map.user": 2.0500000000000379e-04,
"time.pass.Virtual Register Map.sys": 8.5000000000001741e-05,
}
This patch makes use of the pass argument name (if available) in the
JSON key to end up with things like:
{
"asm-printer.EmittedInsts": 1,
"time.pass.virtregmap.wall": 2.9015541076660156e-04,
"time.pass.virtregmap.user": 2.0500000000000379e-04,
"time.pass.virtregmap.sys": 8.5000000000001741e-05,
}
This also helps avoiding to write another JSON printer to handle all the
cases that we could have in our pass names.
Differential Revision: https://reviews.llvm.org/D48109
llvm-svn: 334649
Fixes PR37790.
In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a
load (or store) as "ready to execute" effectively bypassing older memory barrier
instructions.
To reproduce this bug, the memory barrier must be the first instruction in the
input assembly sequence, and it doesn't have to perform any register writes.
llvm-svn: 334633
Currently the handle type is a global pointer which holds 8 bytes.
We need a larger type which hold 16 bytes, therefore change it
to [i64 x 2].
Differential Revision: https://reviews.llvm.org/D48094
llvm-svn: 334625