processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.
While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.
llvm-svn: 306177
Revert "[ORC] Remove redundant semicolons from DEFINE_SIMPLE_CONVERSION_FUNCTIONS uses."
Revert "[ORC] Move ORC IR layer interface from addModuleSet to addModule and fix the module type as std::shared_ptr<Module>."
They broke ExecutionEngine/OrcMCJIT/test-global-ctors.ll on linux.
llvm-svn: 306176
After fixing (r306173) a failing test in the lld test suite (r306173),
reland r306095.
Original commit message:
[mips] Fix register positions in the aui/daui instructions
Swapped the position of the rt and rs register in the aui/daui
instructions for mips32r6 and mips64r6. With this change, the format of
the generated instructions complies with specifications and GCC.
Patch by Milos Stojanovic.
llvm-svn: 306174
This reverts commit r306157.
It caused some timeouts in clang tests. Perhaps unreachable loops have
far too many phi nodes.
Reverting and investigating.
llvm-svn: 306162
Summary:
Without this patch some types have incorrect size and/or alignment
according to the MSP430 EABI.
Reviewers: asl, awygle
Reviewed By: asl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34561
llvm-svn: 306159
Currently, the implementation of delete dead loops has a special case
when the loop being deleted is never executed. This special case
(updating of exit block's incoming values for phis) can be
run as a prepass for non-executable loops before performing
the actual deletion.
llvm-svn: 306157
This patch dumps the raw bytes of the pdb name map which contains
the mapping of stream name to stream index for the string table
and other reserved streams.
llvm-svn: 306148
This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a
conditional branch (Bcc), when the NZCV flags can be set for "free". This is
preferred on targets that have more flexibility when scheduling Bcc
instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are
equal). This can reduce register pressure and is also the default behavior for
GCC.
A few examples:
add w8, w0, w1 -> cmn w0, w1 ; CMN is an alias of ADDS.
cbz w8, .LBB_2 -> b.eq .LBB0_2 ; single def/use of w8 removed.
add w8, w0, w1 -> adds w8, w0, w1 ; w8 has multiple uses.
cbz w8, .LBB1_2 -> b.eq .LBB1_2
sub w8, w0, w1 -> subs w8, w0, w1 ; w8 has multiple uses.
tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2
In looking at all current sub-target machine descriptions, this transformation
appears to be either positive or neutral.
Differential Revision: https://reviews.llvm.org/D34220.
llvm-svn: 306144
Commit r306010 adjusted the condition as follows:
- if (Is64Bit) {
+ if (!STI.isTargetWin32()) {
The intent was to preserve the behavior on all Windows platforms
but extend the behavior on 64-bit Windows platforms to every
other one. (Before r306010, emitStackProbeCall only ever executed
when emitting code for Windows triples.)
Unfortunately,
if (Is64Bit && STI.isOSWindows())
is not the same as
if (!STI.isTargetWin32())
because of the way isTargetWin32() is defined:
bool isTargetWin32() const {
return !In64BitMode && (isTargetCygMing() ||
isTargetKnownWindowsMSVC());
}
In practice this broke the JIT tests on 32-bit Windows, which did not
satisfy the new condition:
LLVM :: ExecutionEngine/MCJIT/2003-01-15-AlignmentTest.ll
LLVM :: ExecutionEngine/MCJIT/2003-08-15-AllocaAssertion.ll
LLVM :: ExecutionEngine/MCJIT/2003-08-23-RegisterAllocatePhysReg.ll
LLVM :: ExecutionEngine/MCJIT/test-loadstore.ll
LLVM :: ExecutionEngine/OrcMCJIT/2003-01-15-AlignmentTest.ll
LLVM :: ExecutionEngine/OrcMCJIT/2003-08-15-AllocaAssertion.ll
LLVM :: ExecutionEngine/OrcMCJIT/2003-08-23-RegisterAllocatePhysReg.ll
LLVM :: ExecutionEngine/OrcMCJIT/test-loadstore.ll
because %esp was not updated correctly. The failures are only visible
on a MSVC 2017 Debug build, for which we do not have bots.
llvm-svn: 306142
The goal here is to make it possible to display absolute
file offsets when dumping byets from an MSF. The problem is
that when dumping bytes from an MSF, often the bytes will
cross a block boundary and encounter a discontinuity. We
can't use the normal formatBinary() function for this because
this would just treat the sequence as entirely ascending, and
not account out-of-order blocks.
This patch adds a formatMsfData() function to our printer, and
then uses this function to improve the output of the -stream-data
command line option for dumping bytes from a particular stream.
Test coverage is also expanded to make sure to include all possible
scenarios of offsets, sizes, and crossing block boundaries.
llvm-svn: 306141
It causes an extra pass of the machine verifier to be added to the pass
manager, and causes test/CodeGen/Generic/llc-start-stop.ll to fail.
llvm-svn: 306140
This is useful when an upper limit on the cache size needs to be
controlled independently of the amount of the amount of free space.
One use case is a machine with a large number of cache directories
(e.g. a buildbot slave hosting a large number of independent build
jobs). By imposing an upper size limit on each cache directory,
users can more easily estimate the server's capacity.
Differential Revision: https://reviews.llvm.org/D34547
llvm-svn: 306126
This is essentially just a BinaryStreamRef packaged with an
offset and the logic for reading one is no different than the
logic for reading a BinaryStreamRef, except that we save the
current offset.
llvm-svn: 306122
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.
llvm-svn: 306120
G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to
be taught how to emulate it with alternatives. We use G_MERGE_VALUES where
possible, and a sequence of G_INSERTs if not.
llvm-svn: 306119
Summary: visitSwitchInst should not take INT_MAX when Cost is negative. Instead of INT_MAX , we also use a valid upperbound cost when overflow occurs in Cost.
Reviewers: hans, echristo, dmgreen
Reviewed By: dmgreen
Subscribers: mcrosier, javed.absar, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D34436
llvm-svn: 306118
This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad
introduced by r196905. Nothing in the semantics of the "volatile"
keyword or the definition of the z/Architecture actually requires
that volatile loads are preceded by a serialization operation, and
no other compiler on the platform actually implements this.
Since we've now seen a use case where this additional serialization
causes noticable performance degradation, this patch removes it.
The patch still leaves in the serialization before atomic loads,
which is now implemented directly in lowerATOMIC_LOAD. (This also
seems overkill, but that can be addressed separately.)
llvm-svn: 306117
The isBarrier/isTerminator flags have been removed from the SystemZ trap
instructions, so that tests do not fail with EXPENSIVE_CHECKS. This was just
an issue at -O0 and did not affect code output on benchmarks.
(Like Eli pointed out: "targets are split over whether they consider their
"trap" a terminator; x86, AArch64, and NVPTX don't, but ARM, MIPS, PPC, and
SystemZ do. We should probably try to be consistent here.". This is still the
case, although SystemZ has switched sides).
SystemZ now returns true in isMachineVerifierClean() :-)
These Generic tests have been modified so that they can be run with or without
EXPENSIVE_CHECKS: CodeGen/Generic/llc-start-stop.ll and
CodeGen/Generic/print-machineinstrs.ll
Review: Ulrich Weigand, Simon Pilgrim, Eli Friedman
https://bugs.llvm.org/show_bug.cgi?id=33047https://reviews.llvm.org/D34143
llvm-svn: 306106
Summary:
Many languages have a three way comparison idiom where comparing two values
produces not a boolean, but a tri-state value. Typical values (e.g. as used in
the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1
for greater than.
We actually do a great job already of converting three way comparisons into
binary comparisons when the result produced has one a single use. Unfortunately,
such values can have more than one use, and in that case, our existing
optimizations break down.
The patch adds a peephole which converts a three-way compare + test idiom into a
binary comparison on the original inputs. It focused on replacing the test on
the result of the three way compare and does nothing about removing the three
way compare itself. That's left to other optimizations (which do actually kick
in commonly.)
We currently recognize one idiom on signed integer compare. In the future, we
plan to recognize and simplify other comparison idioms on
other signed/unsigned datatypes such as floats, vectors etc.
This is a resurrection of Philip Reames' original patch:
https://reviews.llvm.org/D19452
Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev
Reviewed by: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34278
llvm-svn: 306100
ELF/mips-plt-r6.s in lld-test is failing. Reverting the change.
Original commit message:
[mips] Fix register positions in the aui/daui instructions
Swapped the position of the rt and rs register in the aut/daui
instructions for mips32r6 and mips64r6. With this change, the format of
the generated instructions complies with specifications and GCC.
Patch by Milos Stojanovic.
llvm-svn: 306099
Summary:
The function matches the interface of llvm::to_integer, but as we are
calling out to a C library function, I let it take a Twine argument, so
we can avoid a string copy at least in some cases.
I add a test and replace a couple of existing uses of strtod with this
function.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34518
llvm-svn: 306096
Swapped the position of the rt and rs register in the aut/daui instructions
for mips32r6 and mips64r6. With this change, the format of the generated
instructions complies with specifications and GCC.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D33988
llvm-svn: 306095
Before this change, it was always the first element of a vector that got splatted since the lower 6 bits of vshf.d $wd were always zero for little endian.
Additionally, masking has been performed for vshf via which splat.d is created.
Vshf has a property where if its first operand's elements have either bit 6 or 7 set, destination element is set to zero.
Initially masked with 63 to avoid this property, which would result in generation of and.v + vshf.d in all cases.
Masking with one results in generating a single splati.d instruction when possible.
Differential Revision: https://reviews.llvm.org/D32216
llvm-svn: 306090
Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition.
But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare.
This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it.
Differential Revision: https://reviews.llvm.org/D33262
llvm-svn: 306085
X86_64 COFF only has support for 32 bit pcrel relocations. Produce an
error on all others.
Note that gnu as has extended the relocation values to support
this. It is not clear if we should support the gnu extension.
llvm-svn: 306082
I want to use the same logic as LoopSimplify to form dedicated exits in
another pass (SimpleLoopUnswitch) so I wanted to factor it out here.
I also noticed that there is a pretty significantly more efficient way
to implement this than the way the code in LoopSimplify worked. We don't
need to actually retain the set of unique exit blocks, we can just
rewrite them as we find them and use only a set to deduplicate.
This did require changing one part of LoopSimplify to not re-use the
unique set of exits, but it only used it to check that there was
a single unique exit. That part of the code is about to walk the exiting
blocks anyways, so it seemed better to rewrite it to use those exiting
blocks to compute this property on-demand.
I also had to ditch a statistic, but it doesn't seem terribly valuable.
Differential Revision: https://reviews.llvm.org/D34049
llvm-svn: 306081
Summary: LVI can reason about an AND of icmps on the true dest of a branch. I believe we can do similar for the false dest of ORs. This allows us to get the same answer for the demorganed versions of some of the AND test cases as you can see.
Reviewers: anna, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34431
llvm-svn: 306076
Details: There was a use but it was in the assert which was not
exercised during product build.
Reviewers: Andrew Kaylor
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32658
llvm-svn: 306073
This is very similar to the transform in:
https://reviews.llvm.org/rL306040
...but in this case, we use cmp X, 1 to set the carry bit as needed.
Again, we can show that all of these are logically equivalent (although
InstCombine currently canonicalizes to a form not seen here), and if
we believe IACA, then this is the smallest/fastest code. Eg, with SNB:
| Num Of | Ports pressure in cycles | |
| Uops | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | |
---------------------------------------------------------------------
| 1 | 1.0 | | | | | | | cmp edi, 0x1
| 2 | | 1.0 | | | | 1.0 | CP | sbb eax, eax
The larger motivation is to clean up all select-of-constants combining/lowering
because we're missing some common cases.
llvm-svn: 306072
Also document the attribute, since "probe-stack" already is.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D34528
llvm-svn: 306069
For whatever reason, when processing
.globl foo
foo:
.data
bar:
.long foo-bar
llvm-mc creates a relocation with the section:
0x0 IMAGE_REL_I386_REL32 .text
This is different than when the relocation is relative from the
beginning. For example, a file with
call foo
produces
0x0 IMAGE_REL_I386_REL32 foo
I would like to refactor the logic for converting "foo - ." into a
relative relocation so that it is shared with ELF. This is the first
step and just changes the coff implementation to match what ELF (and
COFF in the case of calls) does.
llvm-svn: 306063
The feeder instruction will be moved to right before the compare, so
the updating code should not be looking for kills past the compare.
llvm-svn: 306059
move the ObjectCache from the IRCompileLayer to SimpleCompiler.
This is the first in a series of patches aimed at cleaning up and improving the
robustness and performance of the ORC APIs.
llvm-svn: 306058
There's nothing incorrect about emitting such relocations against
symbols defined in other objects. The code in EmitCOFFSec* was missing
the visitUsedExpr part of MCStreamer::EmitValueImpl, so these symbols
were not being registered with the object file assembler.
This will be used to make reduced test cases for LLD.
llvm-svn: 306057
It looks like that when this code was written recordRelocation could
be called with A-B where A and B are in the same section. The
expression evaluation logic these days makes sure those are folded, so
some of this code was dead.
llvm-svn: 306053
Summary:
Currently, we incorrectly update exit blocks of loops when there are multiple
edges from a single exiting block to the exit block. This can happen when we
have switches as the terminator of the exiting blocks.
The fix here is to correctly update the phi nodes in the exit block, and remove
all incoming values *except* for one which is from the preheader.
Note: Currently, this error can manifest only while deleting non-executed loops. However, it
is possible to trigger this error in invariant loops, once we enhance the logic
around the exit conditions for the loop check.
Reviewers: chandlerc, dberlin, sanjoy, efriedma
Reviewed by: efriedma
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D34516
llvm-svn: 306048
Summary:
These intrinsics aren't used by clang and haven't been for a while.
There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said these intrinsics aren't used by clang even before this patch so this codegen reflects our clang behavior today.
Reviewers: spatel, RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34389
llvm-svn: 306047
This matches the checks done at the beginning of isKnownNonEqual that this code is partially emulating.
Without this we can get assertion failures due to the bit widths of the KnownBits not matching.
llvm-svn: 306044
All NativeRawSymbols will have a unique symbol ID (retrievable via
getSymIndexId). For now, these are initialized to 0, but soon the
NativeSession will be responsible for creating the raw symbols, and it will
assign unique IDs.
The symbol cache in the NativeSession will also require the ability to clone
raw symbols, so I've provided implementations for that as well.
llvm-svn: 306042
There doesn't seem to be a compelling reason why this method should be const
other than it was possible with the DIA implementation. The native session
is going to act as a symbol factory and cache. This could be acheived with
mutable (and the existing const_cast), but it seems cleaner to accept that
this method affects the state of the session.
This change eliminates an existing const_cast.
llvm-svn: 306041
Our handling of select-of-constants is lumpy in IR (https://reviews.llvm.org/D24480),
lumpy in DAGCombiner, and lumpy in X86ISelLowering. That's why we only had the 'sbb'
codegen in 1 out of the 4 tests. This is a step towards smoothing that out.
First, show that all of these IR forms are equivalent:
http://rise4fun.com/Alive/mx
Second, show that the 'sbb' version is faster/smaller. IACA output for SandyBridge
(later Intel and AMD chips are similar based on Agner's tables):
This is the "obvious" x86 codegen (what gcc appears to produce currently):
| Num Of | Ports pressure in cycles | |
| Uops | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | |
---------------------------------------------------------------------
| 1* | | | | | | | | xor eax, eax
| 1 | 1.0 | | | | | | CP | test edi, edi
| 1 | | | | | | 1.0 | CP | setnz al
| 1 | | 1.0 | | | | | CP | neg eax
This is the adc version:
| 1* | | | | | | | | xor eax, eax
| 1 | 1.0 | | | | | | CP | cmp edi, 0x1
| 2 | | 1.0 | | | | 1.0 | CP | adc eax, 0xffffffff
And this is sbb:
| 1 | 1.0 | | | | | | | neg edi
| 2 | | 1.0 | | | | 1.0 | CP | sbb eax, eax
If IACA is trustworthy, then sbb became a single uop in Broadwell, so this will be
clearly better than the alternatives going forward.
llvm-svn: 306040
Without this cast the "char" overload of operator<< is
chosen and the values is output as an ascii rather than
an integer.
Differential Revision: https://reviews.llvm.org/D34486
llvm-svn: 306039
Intrinsic already existed for llvm.SI.tbuffer.store
Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.*
Added CodeGen tests for the 2 new variants added.
Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr
Differential Revision: https://reviews.llvm.org/D30687
llvm-svn: 306031
Summary:
InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).
This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used.
Reviewers: spatel, majnemer, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34184
llvm-svn: 306029
If the components of the and/or had multiple uses, this transform created an additional instruction.
This patch makes sure we remove one of the components.
Differential Revision: https://reviews.llvm.org/D34498
llvm-svn: 306027
There are 2 parts to this patch made simultaneously to avoid a regression.
We're reversing the canonicalization that moves bitwise vector ops before bitcasts.
We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks
of the patch. The motivation is that there's only one fold that currently depends on
the existing canonicalization (see next), but there are many folds that would
automatically benefit from the new canonicalization.
PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these
patterns in IR.
There's an or(and,andn) pattern that requires an adjustment in order to continue matching
to 'select' because the bitcast changes position. This match is unfortunately complicated
because it requires 4 logic ops with optional bitcast and sext ops.
Test diffs:
1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference -
bitcast comes before logic.
2. There are also tests with no diffs in bitcast.ll that verify that we're still doing
folds that were enabled by the previous canonicalization.
3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns
to look through bitcasts.
4. logical-select.ll contains several tests for the or(and,andn) --> select fold to
verify that we are still handling those cases. The lone diff shows the movement of
the bitcast from the new canonicalization rule.
Differential Revision: https://reviews.llvm.org/D33517
llvm-svn: 306011
Summary:
The ARM ELF ABI requires the linker to do interworking for wide
conditional branches from Thumb code to ARM code.
That was pointed out by @peter.smith in the comments for D33436.
Reviewers: rafael, peter.smith, echristo
Reviewed By: peter.smith
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith
Differential Revision: https://reviews.llvm.org/D34447
llvm-svn: 306009
This patch allows $AT to be used as a register name in assembly files.
Currently only $at is recognized as a valid register name.
Patch by Stanislav Ocovaj.
Differential Revision: https://reviews.llvm.org/D34348
llvm-svn: 306007
The fix in r306003 uncovered a pretty fundamental problem that libc++
implementation of std::result_of does not handle the prototype of
open(2) correctly (presumably because it contains ...). This makes the
whole function unusable in its current form, so I am also reverting the
original commit (r305892), which introduced the function, at least until
I figure out a way to solve the libc++ issue.
llvm-svn: 306005
Summary:
Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA.
There are no real instructions for them, but there are pseudo instructions.
Reviewers: arsenm, vpykhtin, cfang
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34403
llvm-svn: 305999
This has been deprecated since ARMARM v7-AR, release C.b, published back
in 2012.
This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was
introduced to check that conditionalization of Neon instructions did
happen when generating Thumb2. However, the test had evolved and was no
longer testing that. Rather than trying to adapt that test, this commit
introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can
now use the MIR framework to write nicer/more maintainable tests.
llvm-svn: 305998
After the N64 static relocation model support was added to llvm it is required to add its support in RuntimeDyld also because lldb uses ExecutionEngine for evaluating expressions.
Reviewed by sdardis
Differential: D31649
llvm-svn: 305997
Rather than creating a separate ".rdata" section distinct from the
customary ".rodata" in ELF, ".rdata" switches to the ".rodata" section.
This patch relands r305949 and r305950 with the correct commit message
and addresses nit raised during review.
Patch By: John Baldwin!
Differential Revision: https://reviews.llvm.org/D34452
llvm-svn: 305995
This patch makes a couple of changes to how we decide whether to use the narrow
or wide encoding of thumb2 instructions:
* Common out the detection of the .w qualifier
* Check for the CPSR operand in a consistent way
Differential Revision: https://reviews.llvm.org/D34460
llvm-svn: 305992
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.
Reviewers: evandro, javed.absar, rengolin, t.p.northover
Reviewed By: javed.absar
Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34142
llvm-svn: 305988
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.
In this patch I'm fixing <2 x i32>bug and optimizing <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.
Differential revision: https://reviews.llvm.org/D34343
llvm-svn: 305987
Summary:
Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
This includes the safe SEH tables and the control flow guard function
table. LLD will emit the guard table soon, and I need a tool that dumps
them for testing.
llvm-svn: 305979